# Areca - Data Analysis

This is a sample analysis of what the data from **Areca** could look like and how it could help administrations understand the main problem areas in a city. It also shows how a live data stream of the city could be presented. The datasets used were acquired from various sources:

#### 1. [pollution_data.csv](https://www.kaggle.com/sogun3/uspollution) - This dataset deals with pollution in the U.S. It covers four major pollutants (Nitrogen Dioxide, Sulphur Dioxide, Carbon Monoxide and Ozone) for every day from 2000 to 2016.

#### 2. [air_sensor_data.csv](https://archive.ics.uci.edu/ml/datasets/Air+quality) - This dataset contains the responses of a gas multisensor device deployed in the field in an Italian city. Hourly averaged responses are recorded along with reference gas concentrations from a certified analyzer.

#### 3. [noise_data](https://data.smartdublin.ie/dataset/ambient-sound-monitoring-network) - This dataset consists of spreadsheets and raw data taken from monitoring sites around Dublin City. The sound level meters store continuous 5-minute sound pressure levels, and the information from the individual daily files is then collated into a spreadsheet with a separate worksheet for each month of the year.

## Importing libraries

In [1]:
from folium.features import DivIcon
from IPython.display import clear_output, display

import folium
import pandas
import time



## Loading datasets

In [2]:
air_sensor_data = pandas.read_csv('datasets/air_sensor_data.csv')
air_sensor_data

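| Unnamed: 0 | Date | Time | CO(GT) | PT08.S1(CO) | NMHC(GT) | C6H6(GT) | PT08.S2(NMHC) | NOx(GT) | PT08.S3(NOx) | NO2(GT) | PT08.S4(NO2) | PT08.S5(O3) | T | RH | AH |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10/03/2004 | 18.00.00 | 2,6 | 1360 | 150 | 11,9 | 1046 | 166 | 1056 | 113 | 1692 | 1268 | 13,6 | 48,9 | 0,7578 |
| 1 | 10/03/2004 | 19.00.00 | 2 | 1292 | 112 | 9,4 | 955 | 103 | 1174 | 92 | 1559 | 972 | 13,3 | 47,7 | 0,7255 |
| 2 | 10/03/2004 | 20.00.00 | 2,2 | 1402 | 88 | 9,0 | 939 | 131 | 1140 | 114 | 1555 | 1074 | 11,9 | 54,0 | 0,7502 |
| 3 | 10/03/2004 | 21.00.00 | 2,2 | 1376 | 80 | 9,2 | 948 | 172 | 1092 | 122 | 1584 | 1203 | 11,0 | 60,0 | 0,7867 |
| 4 | 10/03/2004 | 22.00.00 | 1,6 | 1272 | 51 | 6,5 | 836 | 131 | 1205 | 116 | 1490 | 1110 | 11,2 | 59,6 | 0,7888 |
| 5 | 10/03/2004 | 23.00.00 | 1,2 | 1197 | 38 | 4,7 | 750 | 89 | 1337 | 96 | 1393 | 949 | 11,2 | 59,2 | 0,7848 |
| 6 | 11/03/2004 | 00.00.00 | 1,2 | 1185 | 31 | 3,6 | 690 | 62 | 1462 | 77 | 1333 | 733 | 11,3 | 56,8 | 0,7603 |
| 7 | 11/03/2004 | 01.00.00 | 1 | 1136 | 31 | 3,3 | 672 | 62 | 1453 | 76 | 1333 | 730 | 10,7 | 60,0 | 0,7702 |
| 8 | 11/03/2004 | 02.00.00 | 0,9 | 1094 | 24 | 2,3 | 609 | 45 | 1579 | 60 | 1276 | 620 | 10,7 | 59,7 | 0,7648 |
| 9 | 11/03/2004 | 03.00.00 | 0,6 | 1010 | 19 | 1,7 | 561 | -200 | 1705 | -200 | 1235 | 501 | 10,3 | 60,2 | 0,7517 |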
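The `Date` and `Time` columns are stored as text (day/month/year, and a time with dots as separators). A minimal sketch of how they could be combined into a real timestamp column is shown below. The three-row `sample` frame and the `Timestamp` column name are assumptions made for illustration; they are not part of the original notebook.

```python
import pandas

# Hypothetical three-row sample in the same Date/Time layout as the sensor data.
sample = pandas.DataFrame({
    'Date': ['10/03/2004', '10/03/2004', '11/03/2004'],
    'Time': ['18.00.00', '19.00.00', '00.00.00'],
})

# Join the two text columns and parse them as day/month/year
# followed by a dot-separated hour.minute.second.
sample['Timestamp'] = pandas.to_datetime(
    sample['Date'] + ' ' + sample['Time'],
    format='%d/%m/%Y %H.%M.%S',
)

print(sample['Timestamp'])
```

With a parsed timestamp the readings can be sorted, filtered by day, or resampled without string handling.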


In [3]:
#pollution_data_usa = pandas.read_csv('datasets/pollution_data.csv')
#pollution_data_usa

## Working with the datasets

### Analysis of the NO2 data

In [5]:
user_location = [12.970643, 79.159385]
NO2_data = air_sensor_data['NO2(GT)']

NO2_data.describe()

count    9357.000000
mean       58.148873
std       126.940455
min      -200.000000
25%        53.000000
50%        96.000000
75%       133.000000
max       340.000000
Name: NO2(GT), dtype: float64
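The minimum of -200 is not a real concentration: the UCI description of this dataset tags missing values with -200, and those entries pull the mean and the quartiles down. A minimal sketch of masking them before computing statistics is shown below, using a made-up `sample_no2` series rather than the full column (both names here are assumptions for illustration).

```python
import pandas

# Hypothetical readings; -200 stands in for a missing measurement.
sample_no2 = pandas.Series([113.0, 92.0, 114.0, -200.0, 122.0, -200.0], name='NO2(GT)')

# Replace the sentinel with NaN so describe(), mean() and quantile() skip it.
cleaned_no2 = sample_no2.replace(-200, float('nan'))

print(cleaned_no2.describe())
```

The figures in this notebook were computed with the -200 entries still included, so applying such a mask would change the quartile limits used further down.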

In [6]:
# Compute the summary once and read the three quartile cut-offs from it.
NO2_summary = NO2_data.describe()
limit_25 = NO2_summary['25%']
limit_50 = NO2_summary['50%']
limit_75 = NO2_summary['75%']
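The same three cut-offs can also be requested directly with `Series.quantile`. The sketch below uses a small made-up series (`sample_no2` is an assumption standing in for `NO2_data`) to show the call.

```python
import pandas

# Hypothetical readings standing in for NO2_data.
sample_no2 = pandas.Series([60.0, 76.0, 92.0, 113.0, 122.0])

# One call returns all three cut-offs, indexed by the requested fraction.
limits = sample_no2.quantile([0.25, 0.50, 0.75])
limit_25, limit_50, limit_75 = limits[0.25], limits[0.50], limits[0.75]

print(limit_25, limit_50, limit_75)
```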

### Animation loop of the daily sensor values

In [14]:
map_items = []
timestamp_data = []

# Pair each of the first 100 NO2 readings with its own Date and Time.
first_rows = air_sensor_data[:100]
for date, clock, value in zip(first_rows['Date'], first_rows['Time'], NO2_data[:100]):
    # Build the timestamp once so the stored list and the map label always match.
    timestamp = date + ' ' + clock
    timestamp_data.append(timestamp)

    air_quality_map = folium.Map(location=user_location,
                        zoom_start=25,
                        tiles="cartodbpositron")

    # Pick the marker colour from the quartile band the reading falls into.
    if value < limit_25:
        marker_color = '#2E7D32'
    elif value < limit_50:
        marker_color = '#9E9D24'
    elif value < limit_75:
        marker_color = '#FF8F00'
    else:
        marker_color = '#D84315'

    user_point = folium.CircleMarker(location=user_location, color=marker_color, radius=30, fill=True)
    user_point.add_to(air_quality_map)

    # Place a text label beside the marker showing the timestamp and the reading.
    label_html = '<div style="font-size: 10pt; color: grey;">' + timestamp + ' Value: ' + str(value) + '</div>'
    folium.map.Marker([user_location[0] + 0.00140, user_location[1] + 0.0005],
                      icon=DivIcon(icon_size=(150,36), icon_anchor=(0,0), html=label_html)).add_to(air_quality_map)

    map_items.append(air_quality_map)

In [15]:
for one_map in map_items:
    clear_output(wait=True)
    display(one_map)
    time.sleep(0.27)

**Animation legend** (quartile bands of the NO2(GT) readings):

1. <font color='#2E7D32'>Below the 25th percentile</font> <br>
2. <font color='#9E9D24'>From the 25th to the 50th percentile</font> <br>
3. <font color='#FF8F00'>From the 50th to the 75th percentile</font> <br>
4. <font color='#D84315'>At or above the 75th percentile</font>
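The mapping from a reading to its legend colour can be written as a small helper, which keeps the thresholds and colours in one place. This is only a sketch: `band_colour` is a hypothetical name that does not appear in the notebook, and the cut-offs in the usage lines are the quartiles reported by `NO2_data.describe()` earlier (53, 96 and 133).

```python
def band_colour(value, limit_25, limit_50, limit_75):
    """Return the legend colour for one reading, given the three quartile cut-offs."""
    if value < limit_25:
        return '#2E7D32'  # below the 25th percentile
    elif value < limit_50:
        return '#9E9D24'  # 25th to 50th percentile
    elif value < limit_75:
        return '#FF8F00'  # 50th to 75th percentile
    return '#D84315'      # at or above the 75th percentile


# Usage with the quartiles from the describe() output of NO2_data.
for reading in (40, 92, 113, 172):
    print(reading, band_colour(reading, 53.0, 96.0, 133.0))
```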

### Radar plot of a single day