
# **Air Quality Data Visualizer**


In this first part of our code, we import the libraries that the project needs. The main ones are folium (for the map), requests (for calling the air quality API), and zipcodes (for checking that a zip code is real). We also import pandas, numpy, and pgeocode.

In [1]:
!pip install -q folium
import folium
print('Folium installed and imported!')

Folium installed and imported!


In [2]:
import requests, json
import pandas as pp
import numpy as np
from folium.plugins import Search

In [3]:
pip install pgeocode

Collecting pgeocode
  Downloading https://files.pythonhosted.org/packages/86/44/519e3db3db84acdeb29e24f2e65991960f13464279b61bde5e9e96909c9d/pgeocode-0.2.1-py2.py3-none-any.whl
Installing collected packages: pgeocode
Successfully installed pgeocode-0.2.1


In this part of our code, we use the AirNow API, a government-run API that takes a zip code and returns the current air quality observations for that zip code. We wrapped this functionality in a function called findAQI, which averages the AQI values reported for the zip code and returns None when no data is available. The APIKEY in the URL is a placeholder for a personal AirNow API key.

In [4]:
# returns the average AQI reported for the given zip code for the current date,
# or None when the API returns no observations
def findAQI(zipCode):
    # replace APIKEY with your own AirNow API key
    website = f"https://www.airnowapi.org/aq/observation/zipCode/current/?format=application/json&zipCode={zipCode}&distance=25&API_KEY=APIKEY"
    webRequest = requests.get(website).json()  # empty list when no data is available
    if not webRequest:
        return None
    return sum(entry["AQI"] for entry in webRequest) / len(webRequest)
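
To make the averaging step concrete, here is a minimal offline sketch. The sample observations are made up for illustration and only mimic the `AQI` field that findAQI reads (the `ParameterName` field is an assumption about the response shape). `averageAQI` is a hypothetical helper, not part of the notebook.

```python
def averageAQI(observations):
    """Return the mean of the "AQI" fields, or None for an empty list."""
    if not observations:
        return None
    return sum(obs["AQI"] for obs in observations) / len(observations)

# made-up sample shaped like a response with one entry per pollutant
sampleResponse = [
    {"ParameterName": "O3", "AQI": 30},
    {"ParameterName": "PM2.5", "AQI": 40},
]

print(averageAQI(sampleResponse))  # 35.0
print(averageAQI([]))              # None
```

Returning None for an empty response lets later cells tell "no data" apart from a real reading.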

This is a base folium map centered on the United States. Later cells add markers to it that show the air quality index at different zip codes.

In [5]:
# creates baseMap to be expanded upon further below
latitude = 37.0902
longitude = -95.7129
baseMap = folium.Map(location=[latitude, longitude], zoom_start=4.2069)
baseMap

This function, formatZips, handles zip codes that begin with a 0. Python does not allow integer literals with leading zeros, so we cannot write 00501 as a number in our for loop. To get around this, we pass in 501 and let this function add the leading zeros back, which gives the five-character string 00501. The same applies to any zip code that begins with zero.


In [6]:
# pads a zip code with leading zeros until it is 5 characters long
def formatZips(number):
    # zfill leaves longer strings unchanged, so this cannot loop forever
    return str(number).zfill(5)
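
A short sketch of the padding behavior using Python's built-in `str.zfill`. The name `padZip` is hypothetical and used only in this example.

```python
def padZip(number):
    """Left-pad a zip code with zeros to five characters."""
    return str(number).zfill(5)

print(padZip(501))    # 00501
print(padZip(1501))   # 01501
print(padZip(90501))  # 90501
```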

In [7]:
pip install zipcodes

Collecting zipcodes
  Downloading https://files.pythonhosted.org/packages/df/33/da326d64915c1ac7ca8c3232b004ff6c8c60a326012a970199455c437f31/zipcodes-1.1.2-py2.py3-none-any.whl (717kB)
     |████████████████████████████████| 727kB 2.8MB/s 
Installing collected packages: zipcodes
Successfully installed zipcodes-1.1.2


In the cell below, we define a dictionary called AQIdict. Its keys are zip codes from across America, and its values are the air quality index at each zip code. The for loop starts at the lowest zip code in America (00501); it actually starts at 501, and the formatZips function described above adds the two leading zeros. The upper bound of the loop is 99950, the last valid zip code in America. Because the loop steps by 1000, the last value it actually checks is 99501. We step by 1000 so that the number of markers on our map does not become too overwhelming for the user. This still gives us a good amount of geographical diversity, so users from various regions can find the air quality near them.

In [8]:
#creates a dictionary of zipcodes with AQI values
import zipcodes
AQIdict = {}

for i in range(501, 99950, 1000):
    if len(str(i)) != 5:
        # zip codes below 10000 need leading zeros, so they are stored as string keys;
        # entries with no AQI (None) are removed in a later cell
        correctZip = formatZips(i)
        if zipcodes.is_real(correctZip):
            AQIdict[correctZip] = findAQI(correctZip)
    else:
        # check validity first so invalid zip codes do not cost an API request
        if zipcodes.is_real(str(i)):
            findAQIResult = findAQI(str(i))
            if findAQIResult:
                AQIdict[i] = findAQIResult

AQIdict

{'00501': None,
 '01501': 31.0,
 '05501': 36.0,
 '06501': 20.0,
 '07501': 31.0,
 '08501': None,
 '09501': None,
 10501: 17.333333333333332,
 11501: 23.5,
 12501: 19.0,
 13501: 22.0,
 16501: 24.5,
 17501: 29.5,
 18501: 26.5,
 19501: 29.5,
 20501: 27.0,
 23501: 19.5,
 25501: 27.0,
 26501: 35.0,
 27501: 26.0,
 28501: 24.0,
 29501: 27.0,
 30501: 21.5,
 32501: 42.0,
 37501: 39.0,
 39501: 48.0,
 41501: 26.5,
 44501: 31.0,
 45501: 29.0,
 48501: 23.0,
 49501: 24.5,
 51501: 18.0,
 53501: 23.0,
 54501: 31.0,
 56501: 12.0,
 58501: 14.333333333333334,
 60501: 30.0,
 62501: 30.0,
 64501: 26.0,
 68501: 24.0,
 70501: 54.0,
 73501: 27.0,
 76501: 41.0,
 77501: 47.0,
 78501: 36.0,
 80501: 23.333333333333332,
 81501: 21.5,
 82501: 23.5,
 83501: 18.0,
 84501: 33.0,
 87501: 21.0,
 89501: 25.333333333333332,
 90501: 56.0,
 91501: 85.5,
 92501: 61.666666666666664,
 94501: 47.0,
 95501: 54.0,
 96501: 47.0,
 97501: 26.0,
 98501: 9.0,
 99501: 13.0}
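
To see which zip codes the loop visits without calling the API, the sampling can be reproduced offline. This sketch only regenerates the candidate list; it does not check whether each candidate is a real zip code, and `candidates` is an illustrative name.

```python
# candidate zip codes: every 1000th value starting at 501, padded to five digits
candidates = [str(i).zfill(5) for i in range(501, 99950, 1000)]

print(len(candidates))  # 100
print(candidates[:3])   # ['00501', '01501', '02501']
print(candidates[-1])   # 99501
```

Only the candidates that pass the validity check and return data end up in the dictionary, which is why the output above has fewer than 100 entries.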

As we can see from the output of the cell above, some zip codes have no air quality index (shown as None). This often happens because some zip codes exist only for mailing purposes. Some are assigned to entities such as the US Army, which do not have a particular location associated with them and consequently have no Air Quality Index. The loop below removes these entries from the dictionary.


In [9]:
# removes entries whose AQI is None from the dictionary
for i in list(AQIdict):
    if AQIdict[i] is None:
        del AQIdict[i]

AQIdict

{'01501': 31.0,
 '05501': 36.0,
 '06501': 20.0,
 '07501': 31.0,
 10501: 17.333333333333332,
 11501: 23.5,
 12501: 19.0,
 13501: 22.0,
 16501: 24.5,
 17501: 29.5,
 18501: 26.5,
 19501: 29.5,
 20501: 27.0,
 23501: 19.5,
 25501: 27.0,
 26501: 35.0,
 27501: 26.0,
 28501: 24.0,
 29501: 27.0,
 30501: 21.5,
 32501: 42.0,
 37501: 39.0,
 39501: 48.0,
 41501: 26.5,
 44501: 31.0,
 45501: 29.0,
 48501: 23.0,
 49501: 24.5,
 51501: 18.0,
 53501: 23.0,
 54501: 31.0,
 56501: 12.0,
 58501: 14.333333333333334,
 60501: 30.0,
 62501: 30.0,
 64501: 26.0,
 68501: 24.0,
 70501: 54.0,
 73501: 27.0,
 76501: 41.0,
 77501: 47.0,
 78501: 36.0,
 80501: 23.333333333333332,
 81501: 21.5,
 82501: 23.5,
 83501: 18.0,
 84501: 33.0,
 87501: 21.0,
 89501: 25.333333333333332,
 90501: 56.0,
 91501: 85.5,
 92501: 61.666666666666664,
 94501: 47.0,
 95501: 54.0,
 96501: 47.0,
 97501: 26.0,
 98501: 9.0,
 99501: 13.0}
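
An equivalent way to express this filtering is a dictionary comprehension, which builds a new dictionary instead of deleting keys in place. The sketch uses a small sample of the entries shown earlier; `sample` and `cleaned` are illustrative names.

```python
sample = {'00501': None, '01501': 31.0, '08501': None, 10501: 17.333333333333332}

# keep only the entries that actually have an AQI value
cleaned = {zipCode: aqi for zipCode, aqi in sample.items() if aqi is not None}

print(cleaned)  # {'01501': 31.0, 10501: 17.333333333333332}
```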

This is a dataframe built from the dictionary above, with one row per zip code. A dataframe organizes the data into labeled columns, which makes it easier to read and to work with in the later cells.

In [10]:
# adding zipcode and AQI to dataFrame
df = pp.DataFrame(list(AQIdict.items()),columns = ['ZipCode','AQI'])
df

| | ZipCode | AQI |
|---|---|---|
| 0 | 1501 | 31.0 |
| 1 | 5501 | 36.0 |
| 2 | 6501 | 20.0 |
| 3 | 7501 | 31.0 |
| 4 | 10501 | 17.333333 |
| 5 | 11501 | 23.5 |
| 6 | 12501 | 19.0 |
| 7 | 13501 | 22.0 |
| 8 | 16501 | 24.5 |
| 9 | 17501 | 29.5 |


In the code segment below, we use the pgeocode library to look up the latitude and longitude of each zip code. We need these coordinates to place markers on the folium map, so we add them to our dataframe as two new columns.

In [11]:
# adding latitude and longitude columns to dataFrame
import pgeocode
nomi = pgeocode.Nominatim('us')
latVals, longVals = [], []
for k in AQIdict.keys():
    location = nomi.query_postal_code(str(k))  # one lookup per zip code
    latVals.append(location["latitude"])
    longVals.append(location["longitude"])
df.insert(2, "Longitude", longVals)
df.insert(3, "Latitude", latVals)
df

| | ZipCode | AQI | Longitude | Latitude |
|---|---|---|---|---|
| 0 | 1501 | 31.0 | -71.8391 | 42.2055 |
| 1 | 5501 | 36.0 | -71.1842 | 42.6472 |
| 2 | 6501 | 20.0 | -72.9282 | 41.3082 |
| 3 | 7501 | 31.0 | -74.1671 | 40.9143 |
| 4 | 10501 | 17.333333 | -73.7611 | 41.2946 |
| 5 | 11501 | 23.5 | -73.6398 | 40.7469 |
| 6 | 12501 | 19.0 | -73.5542 | 41.8447 |
| 7 | 13501 | 22.0 | -75.2315 | 43.0871 |
| 8 | 16501 | 24.5 | -80.086 | 42.126 |
| 9 | 17501 | 29.5 | -76.2042 | 40.1573 |


In the cell below, we remove any rows with NaN values in the longitude and latitude columns of our dataframe. These can occur for the same reason as the missing AQI values: some zip codes have no physical location associated with them.

In [12]:
# cleaning up dataFrame
df = df.dropna().reset_index(drop = True)
df

| | ZipCode | AQI | Longitude | Latitude |
|---|---|---|---|---|
| 0 | 1501 | 31.0 | -71.8391 | 42.2055 |
| 1 | 5501 | 36.0 | -71.1842 | 42.6472 |
| 2 | 6501 | 20.0 | -72.9282 | 41.3082 |
| 3 | 7501 | 31.0 | -74.1671 | 40.9143 |
| 4 | 10501 | 17.333333 | -73.7611 | 41.2946 |
| 5 | 11501 | 23.5 | -73.6398 | 40.7469 |
| 6 | 12501 | 19.0 | -73.5542 | 41.8447 |
| 7 | 13501 | 22.0 | -75.2315 | 43.0871 |
| 8 | 16501 | 24.5 | -80.086 | 42.126 |
| 9 | 17501 | 29.5 | -76.2042 | 40.1573 |


In the code cell below, we read the values from our dataframe and plot them on our folium map. The marker color represents the air quality at each zip code. A green marker means the AQI is 50 or below and the air is safe and healthy; a blue marker means the AQI is between 51 and 100 and the air quality is moderate. Thankfully, in our limited sample we did not encounter any AQI greater than 100, but if we did, the marker would be orange (101 to 150), red (151 to 200), or purple (201 to 300) depending on how bad the air quality is.

In [19]:
# circle markers for every zip code, grouped so they can be added to the map together
incidents = folium.map.FeatureGroup()
for lat, lng in zip(df.Latitude, df.Longitude):
    incidents.add_child(
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            color='yellow',
            fill=True,
            fill_color='orange',
            fill_opacity=0.6
        )
    )

# pin markers colored by AQI band, with a popup showing the zip code and AQI
for zipCode, aqi, lat, lng in zip(df.ZipCode, df.AQI, df.Latitude, df.Longitude):
    label = "Zip Code: " + str(zipCode) + "<br>" + " AQI: " + str(aqi)
    if aqi <= 50:
        markerColor = 'green'
    elif aqi <= 100:
        markerColor = 'cadetblue'
    elif aqi <= 150:
        markerColor = 'orange'
    elif aqi <= 200:
        markerColor = 'red'
    elif aqi <= 300:
        markerColor = 'purple'
    else:
        continue  # no color is defined for AQI above 300, so no marker is drawn
    folium.Marker([lat, lng], popup=label, icon=folium.Icon(color=markerColor)).add_to(baseMap)
baseMap.add_child(incidents)
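
The color bands can also be expressed as a small lookup function, which keeps the thresholds in one place. This is only a sketch: `aqiColor` and `AQI_BANDS` are hypothetical names, the colors follow the marker colors used in the cell above, and values above 300 return None because the notebook does not assign them a color.

```python
# upper bound of each AQI band paired with the marker color used in the notebook
AQI_BANDS = [(50, 'green'), (100, 'cadetblue'), (150, 'orange'), (200, 'red'), (300, 'purple')]

def aqiColor(aqi):
    """Return the marker color for an AQI value, or None above 300."""
    for upperBound, color in AQI_BANDS:
        if aqi <= upperBound:
            return color
    return None

print(aqiColor(31.0))  # green
print(aqiColor(85.5))  # cadetblue
print(aqiColor(50.5))  # cadetblue
print(aqiColor(301))   # None
```

Comparing only against upper bounds means averaged values such as 50.5 fall into the next band instead of slipping between two integer ranges.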


The cell below adds the lookup feature to our folium map. The user can enter a zip code in the search box, and the map will highlight that zip code so the user can see its air quality index.


In [18]:
geo_json = {
  "type": "FeatureCollection",
  "features": [],
}
for _, row in df.iterrows():
    temp_dict = {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        # GeoJSON expects [longitude, latitude] order
        "coordinates": [float(row["Longitude"]), float(row["Latitude"])],
      },
      # stored as a string so the search box can match it as text
      "properties": {"ZipCode": str(row["ZipCode"])}
    }
    geo_json["features"].append(temp_dict)
geojson_obj = folium.GeoJson(geo_json).add_to(baseMap)
servicesearch = Search(
    layer=geojson_obj,
    search_label="ZipCode",
    placeholder='Search for a zip code',
    collapsed=False,
).add_to(baseMap)
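
The structure handed to folium.GeoJson is plain GeoJSON, so it can be built and inspected without a map. Below is a minimal sketch using two rows from the dataframe shown earlier. `buildFeatureCollection` is a hypothetical helper, and converting the zip code to a string is an assumption made so that every feature carries the same property type.

```python
import json

def buildFeatureCollection(rows):
    """rows: iterable of (zipCode, longitude, latitude) tuples."""
    features = []
    for zipCode, lng, lat in rows:
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
            "properties": {"ZipCode": str(zipCode)},
        })
    return {"type": "FeatureCollection", "features": features}

collection = buildFeatureCollection([("01501", -71.8391, 42.2055), (10501, -73.7611, 41.2946)])
print(json.dumps(collection["features"][1]["properties"]))  # {"ZipCode": "10501"}
```

Note that GeoJSON lists longitude first, which is the reverse of the [latitude, longitude] order that folium markers take.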