# Observed Air Quality (PurpleAir)

This notebook retrieves readings from PurpleAir sensors in Minneapolis, cleans and range-checks the entries, and writes the results to a PostgreSQL database.

Documentation is available here: https://api.purpleair.com.
You can read this article for help getting started: https://community.purpleair.com/t/making-api-calls-with-the-purpleair-api/180.

From PurpleAir: 

"The data from individual sensors will update no less than every 30 seconds. As a courtesy, we ask that you limit the number of requests to no more than once every 1 to 10 minutes, assuming you are only using the API to obtain data from sensors. If retrieving data from multiple sensors at once, please send a single request rather than individual requests in succession.

The PurpleAir historical API is released as of July 18, 2022. For more information, view this post: https://community.purpleair.com/t/new-version-of-the-purpleair-api-on-july-18th/1251.

Please let us know if you have any questions or concerns, and have a great day!"

A paper on correcting PurpleAir PM2.5 data: https://doi.org/10.5194/amt-14-4617-2021 ([download](https://www.researchgate.net/publication/352663348_Development_and_application_of_a_United_States-wide_correction_for_PM25_data_collected_with_the_PurpleAir_sensor))

Discussion of which PM2.5 estimate to use: https://community.purpleair.com/t/pm2-5-algorithms/3972/6

In [24]:
import os
import requests 
import datetime as dt
import pandas as pd
import arcpy
import numpy as np
import io

In [4]:
cwd = os.getcwd() # Global variable for the notebook's location (must change if running in ArcGIS Pro)

# Set the arcpy workspace to the QAQC geodatabase

arcpy.env.workspace = os.path.join(cwd, '..', '..', 'data', 'QAQC.gdb')

arcpy.env.overwriteOutput = True # Overwriting existing layers is okay

## Setting MPLS Bounds

In [5]:
# Bounding box for the query (northwest and southeast corners)

mpls_8km = "mpls_8km"

bounds_strings = ['nwlng=-93.43083707299996',
                  'nwlat=45.12366876300007',
                  'selng=-93.09225748799997',
                  'selat=44.81791263300005']
bounds_string = '&'.join(bounds_strings)

print(bounds_string)

nwlng=-93.43083707299996&nwlat=45.12366876300007&selng=-93.09225748799997&selat=44.81791263300005
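The same query string can also be assembled with the standard library's `urllib.parse.urlencode`, which joins the `key=value` pairs and percent-encodes the comma in a multi-field `fields` list. This is only a sketch: the `build_query` helper and the `MPLS_BOUNDS` name are hypothetical, and the coordinates are kept as strings so they are passed through exactly as typed.

```python
from urllib.parse import urlencode

# Corner coordinates copied from the cell above, kept as strings (hypothetical name).
MPLS_BOUNDS = {
    'nwlng': '-93.43083707299996',
    'nwlat': '45.12366876300007',
    'selng': '-93.09225748799997',
    'selat': '44.81791263300005',
}

def build_query(fields, bounds):
    """Return a query string: comma-separated fields first, then the bounds."""
    params = {'fields': ','.join(fields)}
    params.update(bounds)
    return urlencode(params)  # a ',' in the fields value becomes %2C

query = build_query(['location_type'], MPLS_BOUNDS)
print(query)
```
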


## Get Station IDs

In [6]:
# This function collects data for multiple public PurpleAir sensors in a single request.
def getSensorsData(query='', api_read_key=''):

    # url is the address we are going to send our request to.
    url = 'https://api.purpleair.com/v1/sensors?' + query
    
    print('Here is the full url for the API call:\n\n', url)

    # my_headers holds the request headers. In this case we pass through
    # the API read key received as the api_read_key argument.
    my_headers = {'X-API-Key': api_read_key}

    # This line creates and sends the request and then assigns its response to the
    # variable, response.
    response = requests.get(url, headers=my_headers)

    # We then return the response we received.
    return response

In [7]:
# This is my personal API key... Please use responsibly!
# 51592903-B445-11ED-B6F4-42010A800007

api = input('Please enter your Purple Air api key')

Please enter your Purple Air api key 51592903-B445-11ED-B6F4-42010A800007


In [8]:
# Designating and formatting the fields to request

fields = ['location_type']

fields_string = 'fields=' + '%2C'.join(fields)

print(fields_string)

fields=location_type


In [9]:
# Put it all together

query_string = '&'.join([fields_string, bounds_string])

print(query_string)

fields=location_type&nwlng=-93.43083707299996&nwlat=45.12366876300007&selng=-93.09225748799997&selat=44.81791263300005


In [10]:
# Make the request

response = getSensorsData(query_string, api)

Here is the full url for the API call:

 https://api.purpleair.com/v1/sensors?fields=location_type&nwlng=-93.43083707299996&nwlat=45.12366876300007&selng=-93.09225748799997&selat=44.81791263300005


In [11]:
# Get response into Pandas DataFrame

response_dict = response.json() # Read response as a json (dictionary)

col_names = response_dict['fields']
data = np.array(response_dict['data'])

df = pd.DataFrame(data, columns = col_names)

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88 entries, 0 to 87
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   sensor_index   88 non-null     int32
 1   location_type  88 non-null     int32
dtypes: int32(2)
memory usage: 832.0 bytes


In [12]:
df.head()

   sensor_index  location_type
0          3088              0
1          5582              0
2        137876              1
3         11134              0
4        142718              0


In [13]:
# Only want outside sensors

outside_sensors = df[df['location_type']==0] # 0 = outside

len(outside_sensors)

80
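The code `0` can be decoded through the `location_types` list that the API returns alongside the data, rather than being hard-coded. A minimal sketch in plain Python, using a trimmed copy of the first five rows of the response; the `outside_sensor_ids` helper is hypothetical.

```python
# Trimmed copy of the sensors response (first five rows only).
sample_response = {
    'fields': ['sensor_index', 'location_type'],
    'location_types': ['outside', 'inside'],
    'data': [[3088, 0], [5582, 0], [137876, 1], [11134, 0], [142718, 0]],
}

def outside_sensor_ids(response_dict):
    """Return sensor_index values whose location_type decodes to 'outside'."""
    fields = response_dict['fields']
    idx_col = fields.index('sensor_index')
    loc_col = fields.index('location_type')
    labels = response_dict['location_types']
    return [row[idx_col] for row in response_dict['data']
            if labels[row[loc_col]] == 'outside']

print(outside_sensor_ids(sample_response))
```
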

In [14]:
response.json()

{'api_version': 'V1.0.11-0.0.42',
 'time_stamp': 1680562985,
 'data_time_stamp': 1680562925,
 'max_age': 604800,
 'firmware_default_version': '7.02',
 'fields': ['sensor_index', 'location_type'],
 'location_types': ['outside', 'inside'],
 'data': [[3088, 0],
  [5582, 0],
  [137876, 1],
  [11134, 0],
  [142718, 0],
  [142720, 0],
  [142726, 0],
  [142730, 0],
  [142728, 0],
  [142734, 0],
  [142732, 0],
  [142736, 0],
  [142744, 0],
  [142750, 0],
  [142748, 0],
  [142752, 0],
  [142756, 0],
  [142774, 0],
  [142772, 0],
  [142852, 0],
  [143214, 0],
  [143216, 0],
  [143222, 0],
  [143226, 0],
  [143224, 0],
  [143238, 0],
  [143242, 0],
  [143240, 0],
  [143246, 0],
  [143636, 0],
  [143648, 0],
  [143656, 0],
  [143666, 0],
  [143668, 0],
  [143916, 0],
  [145202, 0],
  [145204, 0],
  [145234, 0],
  [145242, 0],
  [145250, 0],
  [145454, 0],
  [145470, 0],
  [145498, 0],
  [145506, 0],
  [145604, 0],
  [145610, 0],
  [145616, 0],
  [147749, 1],
  [17189, 1],
  [21179, 0],
  [154751, 

In [15]:
#drop the location_type now that we have filtered for outdoor sensors only
df_historic = outside_sensors.drop('location_type', axis=1)

## Pulling Historic Sensor CSVs

### Setting time period

In [16]:
# Pulling from 9/1/22 to 4/2/23

# Start time

start_datetime = dt.datetime(2022,9,1) # September 1, 2022
start_timestamp = int(dt.datetime.timestamp(start_datetime))

# End time

end_datetime = dt.datetime(2023,4,2) # April 2, 2023
end_timestamp = int(dt.datetime.timestamp(end_datetime))

# Sensors

sensor_ids = outside_sensors.sensor_index.astype(int)
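Note that `dt.datetime(2022, 9, 1)` is a naive datetime, so `.timestamp()` interprets it in the local timezone of whichever machine runs the notebook. The daily rows returned for sensor 3088 fall on UTC midnights (for example, 1661990400 is 2022-09-01 00:00 UTC), so one option is to pin the bounds to UTC. This is a sketch under the assumption that UTC day boundaries are what is wanted; `utc_timestamp` is a hypothetical helper.

```python
import datetime as dt

def utc_timestamp(year, month, day):
    """Return the Unix timestamp for midnight UTC on the given date."""
    moment = dt.datetime(year, month, day, tzinfo=dt.timezone.utc)
    return int(moment.timestamp())

start_timestamp = utc_timestamp(2022, 9, 1)
end_timestamp = utc_timestamp(2023, 4, 2)
print(start_timestamp, end_timestamp)
```
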

### Creating the Query for the API

In [26]:
# Sensor id (first outside sensor; .iloc selects by position rather than by index label)

sensor_id = sensor_ids.iloc[0]

# Timestamp string

time_string = 'start_timestamp=' + str(start_timestamp) + '&end_timestamp=' + str(end_timestamp)

# Averaging period in minutes (1440 = one-day average)

avg_string = 'average=1440'

# Environmental fields

env_fields = ['humidity', 'temperature', 'pressure', 'pm2.5_cf_1']

env_fields_string = 'fields=' + '%2C%20'.join(env_fields)

# Base URL

base_url = f'https://api.purpleair.com/v1/sensors/{sensor_id}/history/csv?'

# Put it all together

query_url = base_url + '&'.join([time_string, avg_string, env_fields_string])

my_headers = {'X-API-Key':api}

# This line creates and sends the request and then assigns its response to the
# variable, r.
r = requests.get(query_url, headers=my_headers)

# Read response as CSV text
csv_data = r.content.decode('utf-8')

# Parse the CSV into a pandas DataFrame. The first row of the response is the header
# (time_stamp, sensor_index, then the requested fields), so let pandas read it as such.
df_historic = pd.read_csv(io.StringIO(csv_data))
df_historic = df_historic.rename(columns={'time_stamp': 'timestamp'})
df_historic

timestamp,sensor_index,humidity,temperature,pressure,pm2.5_cf_1
1674864000,3088,40.201,17.581,991.743,0.633
1671753600,3088,42.605,0.743,987.762,0.0785
1668729600,3088,48.795,29.08,990.701,54.419000000000004
1672531200,3088,53.942,42.17,980.359,8.882
...,...,...,...,...,...
1677628800,3088,52.806,44.6,977.058,3.9945
1679356800,3088,41.454,42.326,987.113,2.38
1676592000,3088,38.144,22.964,995.503,0.56
1679529600,3088,41.673,41.035,988.664,3.2915


In [22]:
# The /history/csv endpoint returns CSV text, not JSON, so calling r.json() on this
# response raises a JSONDecodeError. The response was already parsed with pd.read_csv
# in the previous cell, so just inspect the resulting DataFrame here.

df_historic.info()

### Creating a 'for' loop to parse through all sensor_ids

In [None]:
#need to add timestamp of reading
#call the df df_historic
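One possible shape for this loop is sketched below with the standard library only. It builds the history CSV URL for each sensor, passes it to a caller-supplied `fetch_csv` function, and collects the parsed rows, which already carry `time_stamp` and `sensor_index`. The helper names, the `pause_seconds` default, and the plain `%2C` field separator are assumptions, and the fake fetcher stands in for the real HTTP call.

```python
import csv
import io
import time

def history_url(sensor_id, start_ts, end_ts, fields, average=1440):
    """Build the history CSV URL for one sensor (same endpoint as the cell above)."""
    query = '&'.join([
        f'start_timestamp={start_ts}',
        f'end_timestamp={end_ts}',
        f'average={average}',
        'fields=' + '%2C'.join(fields),  # plain encoded comma; an assumption
    ])
    return f'https://api.purpleair.com/v1/sensors/{sensor_id}/history/csv?{query}'

def collect_history(sensor_ids, fetch_csv, start_ts, end_ts, fields, pause_seconds=1.0):
    """Call fetch_csv(url) once per sensor and return every CSV row as a dict."""
    rows = []
    for position, sensor_id in enumerate(sensor_ids):
        if position > 0:
            time.sleep(pause_seconds)  # spacing between requests (assumed value)
        text = fetch_csv(history_url(sensor_id, start_ts, end_ts, fields))
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows

def fake_fetch(url):
    """Offline stand-in for the HTTP request: echoes the sensor id found in the URL."""
    sensor_id = url.split('/sensors/')[1].split('/')[0]
    return f'time_stamp,sensor_index,humidity\n1674864000,{sensor_id},40.201\n'

rows = collect_history([3088, 5582], fake_fetch, 1661990400, 1680393600,
                       ['humidity'], pause_seconds=0)
print(rows)
```

In the notebook itself, `fetch_csv` could be something like `lambda url: requests.get(url, headers={'X-API-Key': api}).text`, and the collected rows could then be passed to `pd.DataFrame(rows)` to build `df_historic`. Values parsed this way are strings, so the numeric columns would still need converting.
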

## Cleaning Historic Data for Analysis

In [None]:
#rename pm2.5 column to pm2_5 for SQL
df_historic = df_historic.rename(columns={'pm2.5_cf_1' : 'pm2_5'})

In [None]:
#changing UNIX date to pd date
df_historic['timestamp'] = pd.to_datetime(df_historic['timestamp'], unit='s')
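To see what the conversion does to individual values, the sketch below converts the first three timestamps returned for sensor 3088 using only the standard library. It also sorts them first, since the API rows do not arrive in chronological order. `to_utc_datetime` is a hypothetical helper.

```python
import datetime as dt

def to_utc_datetime(unix_seconds):
    """Convert Unix seconds to a timezone-aware UTC datetime."""
    return dt.datetime.fromtimestamp(int(unix_seconds), tz=dt.timezone.utc)

# First three timestamps returned for sensor 3088, in the order received.
stamps = [1674864000, 1671753600, 1668729600]

for ts in sorted(stamps):
    print(ts, to_utc_datetime(ts).date().isoformat())
```
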

## QAQC

In [None]:
#create a blank dataframe to hold the errors

purpleair_historic_errors = pd.DataFrame(columns = ['humidity_error', 'temperature_error', 'pressure_error', 'pm2_5_error'])
purpleair_historic_errors['sensor_index'] = df_historic['sensor_index']
purpleair_historic_errors['timestamp'] = df_historic['timestamp']

### Humidity Check

In [None]:
# Ranges pulled from https://www.currentresults.com/Weather/Minnesota/humidity-annual.php
# The range is actually 40-90%, but that flagged tons of readings, so I widened it to 10-90%

def check_range(value):
    if pd.isna(value):
        return 'no value given'  # missing reading (None or NaN)
    elif value >= 10 and value <= 90:
        return None  # in range, so no error is recorded
    else:
        return 'out of range (10%-90%)'
    
purpleair_historic_errors['humidity_error'] = df_historic['humidity'].apply(check_range)

print(purpleair_historic_errors)
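The humidity, temperature, and pressure checks all follow the same pattern, so they could share one factory instead of redefining `check_range` in each cell. A minimal sketch in plain Python; `make_range_check` and `check_humidity` are hypothetical names, and missing values are detected with `math.isnan` here rather than pandas.

```python
import math

def make_range_check(low, high, label):
    """Return a checker giving None for in-range values and a message otherwise."""
    def check(value):
        # Treat both None and NaN as a missing reading.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return 'no value given'
        if low <= value <= high:
            return None
        return f'out of range ({label})'
    return check

check_humidity = make_range_check(10, 90, '10%-90%')
print(check_humidity(40.201), check_humidity(95.0), check_humidity(float('nan')))
```

Each checker can then be passed to `.apply()` in the same way as `check_range`.
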

### Temperature Check

In [None]:
#winter -4 - 28
#spring 22 - 57
#summer 48 - 81
#fall 29 - 59
#ref from https://www.dnr.state.mn.us/climate/summaries_and_publications/normalsportal.html

def check_range(value):
    if pd.isna(value):
        return 'no value given'  # missing reading (None or NaN)
    elif value >= -20 and value <= 100:
        return None  # in range, so no error is recorded
    else:
        return 'out of range (-20-100F)'
'''
#if we can get time stamp we should use this with a date check too
#this is not correct - we can do seasonal if we can relate it to date range
def check_range(value):
    if value is None:
        return -1
    if value >= -20 and value <=35:
        return 'winter (-20-35F)'
    if value >10 and value <=70:
        return 'spring (10-70F)'
    if value >30 and value <=100:
        return 'summer (30-100F)'
    if value >15 and value <=70:
        return 'fall (15-70F)'
    else:
        return 'out of range'
'''

purpleair_historic_errors['temperature_error'] = df_historic['temperature'].apply(check_range)

print(purpleair_historic_errors)
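Once each reading has a timestamp, the seasonal idea from the commented-out draft can be tied to the reading's date. The sketch below reuses the draft's seasonal bounds and picks the season from the month. The meteorological season split (December to February as winter, and so on) and the helper names are assumptions.

```python
import datetime as dt

# Seasonal bounds in degrees F, taken from the commented-out draft above.
SEASON_RANGES_F = {
    'winter': (-20, 35),
    'spring': (10, 70),
    'summer': (30, 100),
    'fall': (15, 70),
}

def season_for(moment):
    """Map a date to a season by month (assumed meteorological convention)."""
    month = moment.month
    if month in (12, 1, 2):
        return 'winter'
    if month in (3, 4, 5):
        return 'spring'
    if month in (6, 7, 8):
        return 'summer'
    return 'fall'

def check_seasonal_temperature(value, moment):
    """Return None if value fits the season's bounds, else a message."""
    season = season_for(moment)
    low, high = SEASON_RANGES_F[season]
    if low <= value <= high:
        return None
    return f'out of range for {season} ({low}-{high}F)'

print(check_seasonal_temperature(17.581, dt.date(2023, 1, 28)))
print(check_seasonal_temperature(86.69, dt.date(2022, 9, 1)))
```

With these bounds the 86.69 F daily average on September 1, 2022 would be flagged for fall, which suggests the draft ranges may need widening before they are used.
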

### Pressure Check

In [None]:
# Range is 25-35 inHg according to https://barometricpressure.app/minneapolis
# PurpleAir uses millibars, so I used https://www.weather.gov/epz/wxcalc_pressureconvert to convert
# Range is 846.6 - 1185.24 millibars

def check_range(value):
    if pd.isna(value):
        return 'no value given'  # missing reading (None or NaN)
    elif value >= 830 and value <= 1200:
        return None  # in range, so no error is recorded
    else:
        return 'out of range (830 - 1200 Millibars)'
    
purpleair_historic_errors['pressure_error'] = df_historic['pressure'].apply(check_range)

print(purpleair_historic_errors)
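The unit conversion behind those bounds can be reproduced directly. This sketch assumes the usual factor of roughly 33.8639 millibars per inch of mercury; the constant and function names are hypothetical.

```python
INHG_TO_MBAR = 33.8639  # millibars per inch of mercury (approximate factor)

def inhg_to_mbar(inches):
    """Convert a pressure from inches of mercury to millibars."""
    return inches * INHG_TO_MBAR

print(round(inhg_to_mbar(25), 2), round(inhg_to_mbar(35), 2))
```
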

### PM Check

In [None]:
#Average reading in MPLS is 30 ug/m3 per https://www.epa.gov/air-trends/air-quality-cities-and-counties

def check_range(value):
    if pd.isna(value):
        return 'no value given'
#    if value == 0:
 #       return '0'
#    if value >0.1 and value <=10:
#        return 'PM2.5 0.1-10'
#    if value >10 and value <=20:
#        return 'PM2.5 10-20'
#    if value >20 and value <=30:
#        return 'PM2.5 20-30'
#    if value >30 and value <=40:
#        return 'PM2.5 30-40'
#    if value >40 and value <=50:
#        return 'PM2.5 40-50'
#    if value >50 and value <=60:
#        return 'PM2.5 50-60'
#    if value >60 and value <=70:
#        return 'PM2.5 60-70'
    if value > 0.1 and value < 70:
        return None  # in range, so no error is recorded
    elif value <= 0.1:
        return 'at or below 0.1'
    else:
        return '70 or above'
    
purpleair_historic_errors['pm2_5_error'] = df_historic['pm2_5'].apply(check_range)

print(purpleair_historic_errors)

In [None]:
# Removing rows from the error table that don't have any errors

purpleair_historic_errors = purpleair_historic_errors.dropna(subset=purpleair_historic_errors.columns.difference(['sensor_index', 'timestamp']), how='all')
purpleair_historic_errors

## Connecting to the Server

In [None]:
import psycopg2
from psycopg2 import sql

In [None]:
connection = psycopg2.connect(host = '34.132.44.118',
                              database = 'lab1-2',
                              user = 'postgres',
                              password = 'password',
                              port = '5432')
connection.closed

## Insert Data into SQL Table

In [None]:
# open a cursor on the connection
cur = connection.cursor()

# iterate over the dataframe and insert each row into the database using a SQL INSERT statement
# (one %s placeholder per column: six columns, six placeholders)
for index, row in df_historic.iterrows():
    cur.execute('''
    INSERT INTO PURPLEAIR_HISTORIC (sensor_index, timestamp, humidity, temperature, pressure, pm2_5) 
    VALUES (%s, %s, %s, %s, %s, %s) 
    ''', (row['sensor_index'], row['timestamp'], row['humidity'], row['temperature'], row['pressure'], row['pm2_5']))
    
for i, r in purpleair_historic_errors.iterrows():
    cur.execute('''
    INSERT INTO PURPLEAIR_HISTORIC_ERRORS (sensor_index, timestamp, humidity_error, temperature_error, pressure_error, pm2_5_error) 
    VALUES (%s, %s, %s, %s, %s, %s) 
    ''', (r['sensor_index'], r['timestamp'], r['humidity_error'], r['temperature_error'], r['pressure_error'], r['pm2_5_error']))

# commit the changes to the database and close the cursor and connection
connection.commit()
cur.close()
connection.close()
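Each `INSERT` needs exactly one `%s` placeholder per column, and it is easy for the two to drift apart when they are typed by hand. One way to keep them in step is to generate the statement from the column list, as in the sketch below. `build_insert` and `HISTORIC_COLUMNS` are hypothetical names, and the table and column names are interpolated as plain text, so this is only suitable for fixed, trusted identifiers like the ones in this notebook.

```python
def build_insert(table, columns):
    """Return an INSERT statement with one %s placeholder per column."""
    column_list = ', '.join(columns)
    placeholders = ', '.join(['%s'] * len(columns))
    return f'INSERT INTO {table} ({column_list}) VALUES ({placeholders})'

HISTORIC_COLUMNS = ['sensor_index', 'timestamp', 'humidity',
                    'temperature', 'pressure', 'pm2_5']

statement = build_insert('PURPLEAIR_HISTORIC', HISTORIC_COLUMNS)
print(statement)
```

The resulting string can be passed to `cur.execute` together with a tuple of the row's values in the same column order.
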