#### The Air Quality Index (AQI) data for ground-level ozone were scraped from AirNow (https://docs.airnowapi.org/) on 2017-10-17.

The AQI reported for ground-level ozone and fine particles (PM2.5) is based on an average of hourly data. For ozone, the AQI is based on the maximum observed 8-hour average from midnight to midnight. For PM2.5, the AQI is simply the 24-hour average. For AQI values reported in real time, before a full day's data are available, the AQI is based on a surrogate calculation. (See more at https://docs.airnowapi.org/aq101#aqiColors.)
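As a rough illustration of the two averaging rules described above, the sketch below computes the maximum 8-hour average and the 24-hour average from 24 hourly readings. The readings and the helper names `max_8hr_average` and `daily_average` are invented for illustration. The result is the averaged concentration from which an AQI would be derived, not the AQI itself, and AirNow's actual procedure (for example, how it treats missing hours) may differ.

```python
# Hypothetical hourly readings for one day (midnight to midnight); values are invented.
ozone_ppm = [0.020] * 8 + [0.060] * 8 + [0.030] * 8
pm25_ugm3 = [10.0] * 12 + [14.0] * 12


def max_8hr_average(hourly, window=8):
    """Return the largest mean over any run of `window` consecutive hourly values."""
    averages = [
        sum(hourly[start:start + window]) / window
        for start in range(len(hourly) - window + 1)
    ]
    return max(averages)


def daily_average(hourly):
    """Return the plain mean of all hourly values."""
    return sum(hourly) / len(hourly)


ozone_8hr_max = max_8hr_average(ozone_ppm)
pm25_24hr = daily_average(pm25_ugm3)
print(round(ozone_8hr_max, 3), round(pm25_24hr, 1))
```

The sliding window here is a plain list comprehension so the logic is easy to follow; with hourly data already in a pandas Series, a rolling mean would do the same job.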

#### Direct download link for "AQI_data.csv": http://yili.georgetown.domains/ANLY503/AQI_data.csv

* AQI range: 0 - 50, Good, Level 1, Lightgreen
* AQI range: 51 - 100, Moderate, Level 2, Yellow
* AQI range: 101 - 150, Unhealthy for Sensitive Groups, Level 3, Orange
* AQI range: 151 - 200, Unhealthy, Level 4, Red
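The level table above can be expressed as a small lookup. This is a minimal sketch, not code from the notebook: the function name `aqi_to_level` is hypothetical, and values outside 0 to 200 return `None` because the list above stops at Level 4.

```python
import bisect

# Upper bound of each AQI range listed above, with its category and color.
UPPER_BOUNDS = [50, 100, 150, 200]
CATEGORIES = ["Good", "Moderate", "Unhealthy for Sensitive Groups", "Unhealthy"]
COLORS = ["lightgreen", "yellow", "orange", "red"]


def aqi_to_level(aqi):
    """Map an integer AQI to (level, category, color), or None if outside 0-200."""
    if aqi < 0 or aqi > UPPER_BOUNDS[-1]:
        return None
    # bisect_left finds the first range whose upper bound is >= aqi.
    idx = bisect.bisect_left(UPPER_BOUNDS, aqi)
    return idx + 1, CATEGORIES[idx], COLORS[idx]


for value in (48, 61, 112, -999):
    print(value, aqi_to_level(value))
```

Returning `None` for out-of-range input makes placeholder values such as -999 easy to spot instead of silently assigning them a category.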

In [1]:
import pandas as pd
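## Note: this import targets plotly 3.x; in plotly >= 4 the online module
## moved to the separate chart_studio package (import chart_studio.plotly as py)
import plotly.plotly as py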
import plotly.tools as tls

In [2]:
## Load the data 
df = pd.read_csv('AQI_data.csv')
print(df.shape)
df.head()

(864, 5)


Unnamed: 0,lat,lon,date,AQI,level
0,39.145733,-123.202995,2017-10-17T23:00,61,2
1,38.403765,-122.818294,2017-10-17T23:00,48,1
2,45.39916,-122.7455,2017-10-17T23:00,10,1
3,37.9722,-122.5189,2017-10-17T23:00,112,3
4,40.6894,-122.4024,2017-10-17T23:00,64,2


In [3]:
## Create a column "text"
df['text'] = 'Ozone Value: ' + (df['AQI']).astype(str) + '<br>Level: ' + (df['level']).astype(str)

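## Create a column "color" (one color per AQI level, 1-4)
## Note: a level of 0 (the AQI = -999 row) wraps around to colors[-1], i.e. "red";
## that row is dropped before plotting
colors = ["lightgreen","yellow","orange","red"]
df['color'] = [colors[i-1] for i in df['level']]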

## Sort by the value of AQI
df.sort_values('AQI', ascending=True, inplace=True)
df.head()

Unnamed: 0,lat,lon,date,AQI,level,text,color
100,35.0506,-118.1464,2017-10-17T23:00,-999,0,Ozone Value: -999<br>Level: 0,red
817,38.895683,-76.958089,2017-10-17T23:00,3,1,Ozone Value: 3<br>Level: 1,lightgreen
448,38.890301,-90.148102,2017-10-17T23:00,5,1,Ozone Value: 5<br>Level: 1,lightgreen
2,45.39916,-122.7455,2017-10-17T23:00,10,1,Ozone Value: 10<br>Level: 1,lightgreen
9,40.4531,-122.2978,2017-10-17T23:00,10,1,Ozone Value: 10<br>Level: 1,lightgreen
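The first row of the output above shows that the AQI = -999 row (level 0) was assigned the color red, because `colors[0-1]` indexes the last element of the list. One way to avoid this, sketched below on a small made-up frame, is to map levels through a dictionary so that unlisted levels become missing values. The names `df_demo`, `level_colors`, and `df_valid` are illustrative and not part of the notebook.

```python
import pandas as pd

# Small made-up frame that mimics the notebook's columns, including the -999 row.
df_demo = pd.DataFrame({"AQI": [-999, 48, 61, 112], "level": [0, 1, 2, 3]})

# Explicit level -> color lookup; levels not listed here map to NaN.
level_colors = {1: "lightgreen", 2: "yellow", 3: "orange", 4: "red"}
df_demo["color"] = df_demo["level"].map(level_colors)

# Keep only rows that received a color, which drops the level-0 placeholder.
df_valid = df_demo.dropna(subset=["color"])
print(df_valid)
```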


In [13]:
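## Remove the invalid reading (AQI = -999, level 0)
## Note: this filter would also drop an AQI of exactly 0
df = df[df.AQI > 0]

## Build one scattergeo trace per AQI level so each level gets its own legend entry
data = []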
for i in range(1,5):
    df_sub = df[df.level==i]
    data_sub = dict(
        type = 'scattergeo',
        locationmode = 'USA-states',
        lon = df_sub['lon'],
        lat = df_sub['lat'],
        text = df_sub['text'],
        name = "Level " + str(i),
        marker = dict(
            size = df_sub['level']**2*15,
            color = df_sub['color'],
            opacity = 0.3,  # plotly markers take "opacity"; "alpha" is not a marker attribute
            line = dict(width=0.5, color='rgb(0,0,0)'),
            sizemode = 'area'
        )
    )
    data.append(data_sub)
    
layout = dict(
        title = 'U.S. AQI for ground-level ozone at 2017-10-17T23:00<br>(Click legend to toggle traces)',
        showlegend = True,
        geo = dict(
            scope='usa',
            projection=dict( type='albers usa' ),
            showland = True,
            landcolor = 'rgb(230, 230, 230)',
            subunitwidth=1,
            countrywidth=1,
            subunitcolor="rgb(255, 255, 255)",
            countrycolor="rgb(255, 255, 255)"
        )
    )

fig = dict(data=data, layout=layout)
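## Inline display in the notebook (disabled)
#py.iplot(fig, validate=False)

## Upload the figure to the Plotly account under the given filename
py.plot(fig, validate=False, filename='d3-bubble-map-AQI')
#tls.get_embed('https://plot.ly/~GULily/AQI')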
