
# **Air Quality Data Visualizer:**
By: Alex Tang, Jawad Alam, and Ranadheer Tripuraneni


In this first part of our code, we import the libraries that our project needs. The main ones are folium (maps), requests (API calls), and zipcodes (zip code validation); pandas, NumPy, and pgeocode are installed or imported in the cells that follow.

In [None]:
!pip install -q folium
import folium
print('Folium installed and imported!')

Folium installed and imported!


In [None]:
import requests, json
import pandas as pp
import numpy as np
from folium.plugins import Search

In [None]:
pip install pgeocode



In this part of our code, we use the AirNow API, a government-run API that takes a zip code and returns the current air quality observations for that location. We wrapped this functionality in a function called findAQI, which averages the AQI values returned for the zip code.

In [None]:
website1 = "https://www.airnowapi.org/aq/observation/zipCode/current/?format=application/json&zipCode="
website2 = "&distance=25&API_KEY=85AB1975-5F9F-466E-BF7A-7FAEBF27F2D5"

# returns the mean AQI over every observation the API reports for the given
# zip code on the current date, or None when the API returns no observations
def findAQI(zip):
    webFull = website1 + zip + website2
    webRequest = requests.get(webFull).json()  # an empty list when no data is available
    if webRequest:
        return sum(obs["AQI"] for obs in webRequest) / len(webRequest)
    return None

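# returns the (latitude, longitude) reported for the given zip code,
# or None when the API returns no observations
def getLatLong(zip):
    webFull = website1 + zip + website2
    webRequest = requests.get(webFull).json()
    if not webRequest:
        return None
    # take the coordinates from the last observation returned
    return webRequest[-1]["Latitude"], webRequest[-1]["Longitude"]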
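
The averaging step described above can be tried without calling the API. The sketch below assumes the response is a list of dictionaries that each carry an "AQI" field, as findAQI expects; the helper name mean_aqi and the sample readings are made up for illustration and are not real AirNow data.

```python
from statistics import mean

def mean_aqi(observations):
    """Return the mean AQI over a list of observation dicts, or None if the list is empty."""
    values = [obs["AQI"] for obs in observations]
    return mean(values) if values else None

# Hypothetical sample shaped like the response findAQI reads (not real data).
sample = [
    {"ParameterName": "O3", "AQI": 31},
    {"ParameterName": "PM2.5", "AQI": 42},
]

print(mean_aqi(sample))  # 36.5
print(mean_aqi([]))      # None
```

Returning None for an empty response keeps "no data" distinct from a genuine AQI of 0.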
    

This is a base folium map centered on the United States of America. Below, we add markers at different zip codes to show the air quality index at each one.

In [None]:
# creates baseMap to be expanded upon further below
latitude = 37.0902
longitude = -95.7129
baseMap = folium.Map(location=[latitude, longitude], zoom_start=4.2069)
baseMap

This function, formatZips, handles zip codes that begin with 0. Python does not allow an integer literal such as 00501 (leading zeros are a syntax error), and an integer cannot store leading zeros anyway, so our loop works with the number 501 and this function pads it back to the five-character string "00501". It is applied to every zip code that begins with zero.


In [None]:
# left-pads a numeric zip code with zeros to five characters (e.g. 501 -> "00501")
def formatZips(number):
    return str(number).zfill(5)
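
To make the leading-zero issue concrete, here is a small sketch using only built-in string tools. The variable names are ours; it shows that converting a zip code to an integer drops the zeros and that padding restores them.

```python
# Converting the text "00501" to an integer discards the leading zeros.
zip_number = int("00501")
print(zip_number)                # 501

# Two built-in ways to pad the number back to five characters.
padded_zfill = str(zip_number).zfill(5)
padded_format = f"{zip_number:05d}"
print(padded_zfill)              # 00501
print(padded_format)             # 00501
```

Either form gives the same result; keeping zip codes as strings throughout avoids the problem altogether.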

In [None]:
pip install zipcodes



In the cell below, we define a dictionary called AQIdict whose keys are zip codes from across America and whose values are the air quality index at each zip code. The for loop starts at the lowest zip code in America (00501); it actually starts at 501, and our formatZips function adds the two leading zeroes. The upper bound of the loop is 99950, the highest valid zip code in America, and because we step by 1000 the last value checked is 99501. We step by 1000 so that the number of markers on our map does not become too overwhelming for the user. This still gives us good geographical diversity, so users from various regions can find the air quality near them.

In [None]:
# creates a dictionary that maps zip codes to AQI values
import zipcodes
AQIdict = {}

for i in range(501, 99950, 1000):
    if len(str(i)) != 5:
        # zip codes below 10000 need leading zeros, so they are stored as strings
        correctZip = formatZips(i)
        if zipcodes.is_real(correctZip):
            AQIdict[correctZip] = findAQI(correctZip)
    else:
        # five-digit zip codes are stored only when the API returned a reading
        if zipcodes.is_real(str(i)):
            findAQIResult = findAQI(str(i))
            if findAQIResult is not None:
                AQIdict[i] = findAQIResult

AQIdict
        

{'00501': None,
 '01501': 36.5,
 '05501': 26.5,
 '06501': 28.666666666666668,
 '07501': None,
 '08501': None,
 '09501': None,
 10501: 31.0,
 11501: 34.5,
 12501: 37.5,
 13501: 17.5,
 16501: 20.0,
 17501: 36.0,
 18501: 33.0,
 19501: 36.0,
 20501: 27.0,
 23501: 18.0,
 27501: 18.5,
 28501: 20.5,
 29501: 31.0,
 30501: 17.5,
 32501: 31.5,
 37501: 19.0,
 39501: 32.5,
 41501: 14.5,
 44501: 37.5,
 45501: 32.0,
 48501: 8.0,
 49501: 16.0,
 51501: 83.33333333333333,
 53501: 16.0,
 54501: 13.5,
 56501: 16.0,
 57501: 56.0,
 58501: 49.0,
 60501: 28.5,
 62501: 42.0,
 64501: 36.333333333333336,
 68501: 36.0,
 70501: 26.5,
 73501: 19.333333333333332,
 76501: 32.0,
 77501: 41.5,
 78501: 39.0,
 80501: 43.666666666666664,
 81501: 30.0,
 82501: 36.0,
 83501: 11.0,
 84501: 40.0,
 87501: 23.0,
 89501: 20.333333333333332,
 90501: 28.0,
 91501: 38.0,
 92501: 43.0,
 94501: 17.0,
 95501: 35.0,
 96501: 17.0,
 97501: 34.0,
 98501: 8.0,
 99501: 59.0}
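
The sampling scheme described above can be checked on its own, without the API or the zipcodes library. This sketch only generates the candidate zip codes as five-character strings; the function name candidate_zips is ours, and it is one possible way to keep every key the same type.

```python
def candidate_zips(start=501, stop=99950, step=1000):
    """Return every zip code visited by the sampling loop as a zero-padded string."""
    return [str(n).zfill(5) for n in range(start, stop, step)]

zips = candidate_zips()
print(len(zips))          # 100
print(zips[0], zips[-1])  # 00501 99501
```

Stepping by 1000 yields 100 candidates, which is why the map stays readable.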

As we can see from the output of the cell above, some zip codes have None instead of an air quality index. This is often because certain zip codes exist only for routing mail. Some are assigned to entities such as the US Army and are not tied to a particular location, so they have no Air Quality Index. The loop below removes these None entries from the dictionary.


In [None]:
# This deletes None values from the dictionary
for i in list(AQIdict):
    if AQIdict[i] is None:
        del AQIdict[i]

AQIdict      

{'01501': 36.5,
 '05501': 26.5,
 '06501': 28.666666666666668,
 10501: 31.0,
 11501: 34.5,
 12501: 37.5,
 13501: 17.5,
 16501: 20.0,
 17501: 36.0,
 18501: 33.0,
 19501: 36.0,
 20501: 27.0,
 23501: 18.0,
 27501: 18.5,
 28501: 20.5,
 29501: 31.0,
 30501: 17.5,
 32501: 31.5,
 37501: 19.0,
 39501: 32.5,
 41501: 14.5,
 44501: 37.5,
 45501: 32.0,
 48501: 8.0,
 49501: 16.0,
 51501: 83.33333333333333,
 53501: 16.0,
 54501: 13.5,
 56501: 16.0,
 57501: 56.0,
 58501: 49.0,
 60501: 28.5,
 62501: 42.0,
 64501: 36.333333333333336,
 68501: 36.0,
 70501: 26.5,
 73501: 19.333333333333332,
 76501: 32.0,
 77501: 41.5,
 78501: 39.0,
 80501: 43.666666666666664,
 81501: 30.0,
 82501: 36.0,
 83501: 11.0,
 84501: 40.0,
 87501: 23.0,
 89501: 20.333333333333332,
 90501: 28.0,
 91501: 38.0,
 92501: 43.0,
 94501: 17.0,
 95501: 35.0,
 96501: 17.0,
 97501: 34.0,
 98501: 8.0,
 99501: 59.0}
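
The same cleanup can be written as a dictionary comprehension. The sketch below uses made-up readings (not output from our notebook) to show why testing for None explicitly matters: a plain truthiness test would also throw away a legitimate AQI of 0.

```python
# Hypothetical readings: one missing value and one genuine AQI of 0.
readings = {"00501": None, "01501": 36.5, "99999": 0.0}

# Keeps every entry that has a reading, including 0.0.
cleaned = {zip_code: aqi for zip_code, aqi in readings.items() if aqi is not None}

# A truthiness test drops the 0.0 entry as well.
truthy_only = {zip_code: aqi for zip_code, aqi in readings.items() if aqi}

print(cleaned)      # {'01501': 36.5, '99999': 0.0}
print(truthy_only)  # {'01501': 36.5}
```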

This is a dataframe of the dictionary data from above. Dataframes organize data into neat and readable parts that are easy to derive insights from.

In [None]:
# adding zipcode and AQI to dataFrame
df = pp.DataFrame(list(AQIdict.items()),columns = ['ZipCode','AQI'])
df

| | ZipCode | AQI |
|---|---|---|
| 0 | 1501 | 36.5 |
| 1 | 5501 | 26.5 |
| 2 | 6501 | 28.666667 |
| 3 | 10501 | 31.0 |
| 4 | 11501 | 34.5 |
| 5 | 12501 | 37.5 |
| 6 | 13501 | 17.5 |
| 7 | 16501 | 20.0 |
| 8 | 17501 | 36.0 |
| 9 | 18501 | 33.0 |


In the code segment below, we use the pgeocode library to look up the latitude and longitude of each zip code. Folium needs these coordinates to place a marker on the map, so we add them to our dataframe as two new columns.

In [None]:
# looks up the latitude and longitude of each zip code and adds them to the dataFrame
import pgeocode
nomi = pgeocode.Nominatim('us')
latVals, longVals = [], []
for k in AQIdict.keys():
    location = nomi.query_postal_code(str(k))  # one lookup per zip code
    latVals.append(location["latitude"])
    longVals.append(location["longitude"])
df.insert(2, "Longitude", longVals)
df.insert(3, "Latitude", latVals)
df

| | ZipCode | AQI | Longitude | Latitude |
|---|---|---|---|---|
| 0 | 1501 | 36.5 | -71.8391 | 42.2055 |
| 1 | 5501 | 26.5 | -71.1842 | 42.6472 |
| 2 | 6501 | 28.666667 | -72.9282 | 41.3082 |
| 3 | 10501 | 31.0 | -73.7611 | 41.2946 |
| 4 | 11501 | 34.5 | -73.6398 | 40.7469 |
| 5 | 12501 | 37.5 | -73.5542 | 41.8447 |
| 6 | 13501 | 17.5 | -75.2315 | 43.0871 |
| 7 | 16501 | 20.0 | -80.086 | 42.126 |
| 8 | 17501 | 36.0 | -76.2042 | 40.1573 |
| 9 | 18501 | 33.0 | -75.6376 | 41.4019 |


In the cell below, we drop any rows that have NaN in the longitude or latitude columns. These can occur for the same reason as before: some zip codes are not tied to a physical location.

In [None]:
# cleaning up dataFrame
df = df.dropna().reset_index(drop = True)
df

| | ZipCode | AQI | Longitude | Latitude |
|---|---|---|---|---|
| 0 | 1501 | 36.5 | -71.8391 | 42.2055 |
| 1 | 5501 | 26.5 | -71.1842 | 42.6472 |
| 2 | 6501 | 28.666667 | -72.9282 | 41.3082 |
| 3 | 10501 | 31.0 | -73.7611 | 41.2946 |
| 4 | 11501 | 34.5 | -73.6398 | 40.7469 |
| 5 | 12501 | 37.5 | -73.5542 | 41.8447 |
| 6 | 13501 | 17.5 | -75.2315 | 43.0871 |
| 7 | 16501 | 20.0 | -80.086 | 42.126 |
| 8 | 17501 | 36.0 | -76.2042 | 40.1573 |
| 9 | 18501 | 33.0 | -75.6376 | 41.4019 |


In the code cell below, we read the values from our dataframe and plot them on our folium map. The marker color represents the air quality at each zip code: green means the AQI is 50 or below (good), and blue means it is between 51 and 100 (moderate). In our limited sample we did not encounter any AQI greater than 100, but if we did, the marker would turn orange (101 to 150), red (151 to 200), or purple (201 to 300) depending on how bad the air quality is.

In [None]:
# draws a small circle at every zip code; folium expects [latitude, longitude]
incidents = folium.map.FeatureGroup()
for lat, lng in zip(df.Latitude, df.Longitude):
    incidents.add_child(
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            color='yellow',
            fill=True,
            fill_color='orange',
            fill_opacity=0.6
        )
    )

# adds a popup marker whose color depends on the AQI category
for lat, lng, zipCode, aqi in zip(df.Latitude, df.Longitude, df.ZipCode, df.AQI):
    label = "Zip Code: " + str(zipCode) + "<br>" + " AQI: " + str(aqi)
    if aqi <= 50:
        markerColor = 'green'
    elif aqi <= 100:
        markerColor = 'cadetblue'
    elif aqi <= 150:
        markerColor = 'orange'
    elif aqi <= 200:
        markerColor = 'red'
    elif aqi <= 300:
        markerColor = 'purple'
    else:
        continue  # no marker color is defined for AQI values above 300
    folium.Marker([lat, lng], popup=label, icon=folium.Icon(color=markerColor)).add_to(baseMap)
baseMap.add_child(incidents)
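
The color bands described above can also be written as a lookup table, which is easy to test without drawing a map. This is a minimal sketch; the function name aqi_color is ours. Because our AQI values are averages, they can fall between whole numbers (for example 50.5), so each band is checked only against its upper bound.

```python
# (upper bound, marker color) pairs, in ascending order of AQI.
AQI_BANDS = [
    (50, "green"),
    (100, "cadetblue"),
    (150, "orange"),
    (200, "red"),
    (300, "purple"),
]

def aqi_color(aqi):
    """Return the marker color for an AQI value, or None if it is above 300."""
    for upper_bound, color in AQI_BANDS:
        if aqi <= upper_bound:
            return color
    return None

print(aqi_color(36.5))  # green
print(aqi_color(50.5))  # cadetblue
print(aqi_color(301))   # None
```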


The cell below adds a search feature to our folium map. The user can enter a zip code, and the map will highlight that zip code so the user can see its air quality index.


In [None]:
# builds a GeoJSON layer with one point per zip code so the Search plugin can index it
geo_json = {
  "type": "FeatureCollection",
  "features": [],
}
for _, row in df.iterrows():
    temp_dict = {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        # GeoJSON lists coordinates as [longitude, latitude]
        "coordinates": [row["Longitude"], row["Latitude"]],
      },
      "properties": {"ZipCode": str(row["ZipCode"])}
    }
    geo_json["features"].append(temp_dict)
geojson_obj = folium.GeoJson(geo_json).add_to(baseMap)
servicesearch = Search(
    layer=geojson_obj,
    search_label="ZipCode",
    placeholder='Search for a zip code',
    collapsed=False,
).add_to(baseMap)
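
The GeoJSON structure that the search layer reads can be built and inspected with the standard library alone. In this sketch the helper name to_feature_collection is ours and the two rows are taken from the table above. Note that GeoJSON orders coordinates as [longitude, latitude], the reverse of what folium markers expect.

```python
import json

def to_feature_collection(rows):
    """Build a GeoJSON FeatureCollection from (zip_code, latitude, longitude) tuples."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"ZipCode": str(zip_code)},
        }
        for zip_code, lat, lon in rows
    ]
    return {"type": "FeatureCollection", "features": features}

sample_rows = [("01501", 42.2055, -71.8391), (10501, 41.2946, -73.7611)]
collection = to_feature_collection(sample_rows)
print(json.dumps(collection["features"][0], indent=2))
```

Storing every ZipCode property as a string keeps the searchable values consistent even when the dataframe mixes strings and integers.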

In the next few cells, we install the libraries that give our program its text message functionality.

In [None]:
pip install imapclient

Collecting imapclient
  Downloading https://files.pythonhosted.org/packages/dc/39/e1c2c2c6e2356ab6ea81fcfc0a74b044b311d6a91a45300811d9a6077ef7/IMAPClient-2.1.0-py2.py3-none-any.whl (73kB)
Installing collected packages: imapclient
Successfully installed imapclient-2.1.0


In [None]:
pip install imap_tools

Collecting imap_tools
  Downloading https://files.pythonhosted.org/packages/ea/c1/95d5e746830f467d94c4a6ffce279116976611c0a4c6af9a4870842423cf/imap_tools-0.26.0-py3-none-any.whl (60kB)
Installing collected packages: imap-tools
Successfully installed imap-tools-0.26.0


The code cell below sets up our text messaging system. It starts by authenticating with the sender's email account. To protect the sender's privacy, the email address and password have been replaced with the placeholders email and password. Python's smtplib library sends emails, and many mobile carriers provide an email gateway that delivers an email to a phone number as a text message, which is the capability we leverage here. This cell ultimately sends the message "What is your zip code" to the recipient's phone number, which has also been removed for privacy.


In [None]:
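import smtplib
from email.mime.text import MIMEText

# connect to Gmail's SMTP server and upgrade the connection to TLS before logging in
server = smtplib.SMTP("smtp.gmail.com", 587)
server.starttls()
server.login('email', 'password')  # placeholders for the redacted credentials

fromx = 'email@gmail.com'
to = 'recipient@tmomail.net'  # the carrier's email-to-SMS gateway address

# the question is sent as the subject line; the body is left empty
msg = MIMEText('')
msg['Subject'] = 'What is your zip code'
msg['From'] = fromx
msg['To'] = to
server.sendmail(fromx, to, msg.as_string())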

{}
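
The message itself can be built and inspected without connecting to any mail server. The sketch below is only an illustration: the helper name build_sms_email, the sender address, and the phone number are all made up, and tmomail.net is the gateway domain used in the cell above.

```python
from email.mime.text import MIMEText

def build_sms_email(sender, phone_number, gateway_domain, body):
    """Build an email addressed to a carrier's email-to-SMS gateway (nothing is sent)."""
    msg = MIMEText(body)
    msg["From"] = sender
    msg["To"] = f"{phone_number}@{gateway_domain}"
    return msg

# Hypothetical sender and phone number, for illustration only.
msg = build_sms_email("sender@example.com", "5555550100", "tmomail.net",
                      "What is your zip code?")
print(msg["To"])          # 5555550100@tmomail.net
print(msg.get_payload())  # What is your zip code?
```

Building the message separately from sending it makes the formatting easy to check before any credentials are involved.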

In the cell below we use the MailBox class from the imap_tools library, which lets us read the emails in the sender's inbox. This is how we see the recipient's reply to the text message. We store the most recent reply in the variable lastEmailData so we can check what the user entered. Again, the emails and phone numbers of the participants in this project were redacted for privacy and replaced with email, password, and recipient.

In [None]:
from imap_tools import MailBox, AND

# get every email sent from the recipient's number, then keep the most recent one
with MailBox('imap.gmail.com').login('email', 'password') as mailbox:
    emailData = [msg.text for msg in mailbox.fetch(AND(from_='recipient@tmomail.net'))]

# strip surrounding whitespace so the reply can be validated as a zip code
lastEmailData = emailData[-1].strip()

In this code segment we check whether the user sent a valid zip code. If they did, we make an API call with our previously defined findAQI function and text back the result. If the reply is malformed, zipcodes.is_real raises a ValueError, which we catch; in that case we tell the user that the value was invalid and prompt them again for a valid 5-digit zip code.


In [None]:
# is_real raises ValueError for malformed input, so treat that as invalid too
try:
    isValidZip = zipcodes.is_real(lastEmailData)
except ValueError:
    isValidZip = False

if isValidZip:
    msg = MIMEText('The AQI in your region is ' + str(findAQI(lastEmailData)))
    msg['Subject'] = 'Your AQI Result'
else:
    msg = MIMEText("Please send a valid 5-digit US ZipCode")
    msg['Subject'] = 'Incorrect Value'
msg['From'] = fromx
msg['To'] = to
server.sendmail(fromx, to, msg.as_string())
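
Before a reply reaches the zipcodes library, its format can be checked with a regular expression. This is a sketch of one possible pre-check, not part of our pipeline: the function name extract_zip is ours, and it only confirms that the reply is exactly five digits, not that the zip code actually exists.

```python
import re

def extract_zip(text):
    """Return the 5-digit zip code in a reply, ignoring surrounding whitespace, or None."""
    match = re.fullmatch(r"\s*(\d{5})\s*", text)
    return match.group(1) if match else None

print(extract_zip(" 94501\r\n"))  # 94501
print(extract_zip("hello"))       # None
print(extract_zip("9450"))        # None
```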
    