#  Operating Stations in Mexico City

Using station data extracted from the Dirección de Monitoreo Atmosférico del Gobierno de CDMX, we identify the stations currently operating in Mexico City, along with their locations and station codes.


In [1]:
# Dependencies.
import pandas as pd
import csv

In [2]:
# Retrieve stations csv file.
file_estaciones = "Resources/cat_estacion.csv"

# Read our Data file with the pandas library
cat_estaciones_df = pd.read_csv(file_estaciones, encoding="ISO-8859-1")
cat_estaciones_df.head()

Unnamed: 0,cve_estac,nom_estac,longitud,latitud,alt,obs_estac,id_station,Ciudad,Municipio
0,ACO,Acolman,-98.912003,19.635501,2198.0,,484000000000.0,Estado de Mexico,Valle de Mexico
1,AJU,Ajusco,-99.162611,19.154286,2942.0,,484000000000.0,Ciudad de Mexico,Tlalpan
2,AJM,Ajusco Medio,-99.207744,19.272161,2548.0,,484000000000.0,Ciudad de Mexico,Tlalpan
3,ARA,Aragón,-99.074549,19.470218,2200.0,Finalizó operación en 2010,484000000000.0,Ciudad de Mexico,Gustavo A Madero
4,ATI,Atizapan,-99.254133,19.576963,2341.0,,484000000000.0,Estado de Mexico,Atizapan de Zaragoza


In [3]:
# Keep wanted columns
cat_estaciones_df = cat_estaciones_df[['cve_estac', 'nom_estac', 'longitud', 'latitud', 'obs_estac']]
cat_estaciones_df.head()

Unnamed: 0,cve_estac,nom_estac,longitud,latitud,obs_estac
0,ACO,Acolman,-98.912003,19.635501,
1,AJU,Ajusco,-99.162611,19.154286,
2,AJM,Ajusco Medio,-99.207744,19.272161,
3,ARA,Aragón,-99.074549,19.470218,Finalizó operación en 2010
4,ATI,Atizapan,-99.254133,19.576963,


The DataFrame shows that some stations stopped operating in 2010 or earlier. We need to keep only the stations that are still operating.

In [4]:
# First, get stations that have stopped operating
halt_stations = cat_estaciones_df.dropna(how='any')
halt_stations.head()

Unnamed: 0,cve_estac,nom_estac,longitud,latitud,obs_estac
3,ARA,Aragón,-99.074549,19.470218,Finalizó operación en 2010
5,AZC,Azcapotzalco,-99.198657,19.487728,Finalizó operación en 2010
6,BJU,Benito Juárez,-99.159596,19.370464,Finalizó operación en 2005
9,CES,Cerro de la Estrella,-99.074678,19.334731,Finalizó operación en 2010
10,CFE,Museo Tecnológico de la CFE,-99.194279,19.414393,Finalizó operación en 1996


In [5]:
# Halted stations count
halt_stations.count()

cve_estac    20
nom_estac    20
longitud     20
latitud      20
obs_estac    20
dtype: int64

In [6]:
# Retrieve operating stations, rename columns and keep most important.
cat_estaciones_df = cat_estaciones_df.fillna('OK')

active_stations_df = cat_estaciones_df.loc[cat_estaciones_df['obs_estac'] == 'OK']
active_stations_df = active_stations_df[['cve_estac', 'nom_estac', 'longitud', 'latitud']]
active_stations_df = active_stations_df.rename(columns={'cve_estac' : 'Key',
                                                        'nom_estac' : 'Name',
                                                        'longitud' : 'Lng',
                                                        'latitud' : 'Lat'})
active_stations_df.head()

Unnamed: 0,Key,Name,Lng,Lat
0,ACO,Acolman,-98.912003,19.635501
1,AJU,Ajusco,-99.162611,19.154286
2,AJM,Ajusco Medio,-99.207744,19.272161
4,ATI,Atizapan,-99.254133,19.576963
7,CAM,Camarones,-99.169794,19.468404


In [7]:
# Operating stations count
active_stations_df.count()

Key     49
Name    49
Lng     49
Lat     49
dtype: int64

In [8]:
# Extract the codes of the operating stations
active_stations_list = active_stations_df['Key'].tolist()
len(active_stations_list)

49

In [9]:
# Printing codes of operating stations.
print(active_stations_list)

['ACO', 'AJU', 'AJM', 'ATI', 'CAM', 'CCA', 'CHO', 'COR', 'COY', 'CUA', 'CUT', 'DIC', 'EAJ', 'EDL', 'FAC', 'GAM', 'HGM', 'IBM', 'INN', 'IZT', 'LAA', 'LLA', 'LOM', 'LPR', 'MCM', 'MER', 'MGH', 'MON', 'MPA', 'NEZ', 'PED', 'SAG', 'SFE', 'SHA', 'SJA', 'SNT', 'SUR', 'TAH', 'TEC', 'TLA', 'TLI', 'TPN', 'UAX', 'UIZ', 'UNM', 'VIF', 'XAL', 'FAR', 'SAC']


According to these results, we found that:
* 20 stations stopped operating in 2010 or earlier.
* 49 stations are still in operation; we will focus on these.
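
The same split can be sketched in one pass with a boolean mask on `obs_estac`, without the `'OK'` placeholder. This is only a minimal sketch: it uses the five catalogue rows shown in the first preview, the names `sample`, `halted` and `active` are illustrative, and it assumes (as the cells above do) that any non-empty note means the station has stopped operating.

```python
import pandas as pd

# Five catalogue rows copied from the first preview of the csv file.
sample = pd.DataFrame({
    "cve_estac": ["ACO", "AJU", "AJM", "ARA", "ATI"],
    "nom_estac": ["Acolman", "Ajusco", "Ajusco Medio", "Aragón", "Atizapan"],
    "obs_estac": [None, None, None, "Finalizó operación en 2010", None],
})

# Assumption: a station counts as halted when it carries an observation note.
is_halted = sample["obs_estac"].notna()
halted = sample[is_halted]
active = sample[~is_halted]

print(len(halted), len(active))
print(active["cve_estac"].tolist())
```

Keeping the mask in its own variable makes it easy to count both groups and to check that they add up to the full catalogue.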


The station list was built from the code of each operating station, so we can now separate the currently active stations from the rest of the stations in the csv file.

Next, we use an API to retrieve the measurements taken by each of these active stations.

***

# API: Air Quality from Mexico

Using the API described at the URL: https://datos.gob.mx/blog/ventilando-datos-abiertos-sobre-calidad-del-aire
we get data related to these stations and the parameters measured by each one.

* To make this work, install the module <font color=blue>datosgobmx</font>, which is needed to make calls to the API.
    * Run the command: **<font color=blue>pip install datosgobmx</font>**

    
* First, retrieve the parameters that are measured by the stations


To make a call, we use the function <font color=red>makeCall</font>, included in the module datosgobmx, and pass the desired endpoint as a parameter to the client. We then convert the returned JSON into a pandas DataFrame.
The returned information is the data provided through SINAICA from the meteorological stations.


In [10]:
# Dependencies
from datosgobmx import client 

- We first use <font color=blue>'sinaica-parametros'</font> as the endpoint to get the **parameters** measured by the stations.

In [11]:
# Perform the call to the client to get the PARAMETERS measured.
params_request = client.makeCall('sinaica-parametros')

CALL: https://api.datos.gob.mx/v2/sinaica-parametros
{'pagination': {'pageSize': 100, 'page': 1, 'total': 7}, 'results': [{'_id': '5c2aa3dfe2705c1932134299', 'parametro': 'CO', 'date-insert': '2018-12-31T23:18:55.923Z'}, {'_id': '5c2aa3dfe2705c193213429a', 'parametro': 'O3', 'date-insert': '2018-12-31T23:18:55.923Z'}, {'_id': '5c2aa3dfe2705c193213429b', 'parametro': 'PM10', 'date-insert': '2018-12-31T23:18:55.923Z'}, {'_id': '5c2aa3dfe2705c193213429c', 'parametro': 'SO2', 'date-insert': '2018-12-31T23:18:55.923Z'}, {'_id': '5c2aa3dfe2705c193213429d', 'parametro': 'NO2', 'date-insert': '2018-12-31T23:18:55.923Z'}, {'_id': '5c2aa3dfe2705c193213429e', 'parametro': 'PM2.5', 'date-insert': '2018-12-31T23:18:55.923Z'}, {'_id': '5c2aa3dfe2705c193213429f', 'parametro': 'TMP', 'date-insert': '2018-12-31T23:18:55.923Z'}]}


In [12]:
# Convert the obtained json into a DataFrame.
params_measured = []

for v in params_request['results']:
    aux = pd.DataFrame.from_dict(v, orient='index').T
    params_measured.append(aux)

params_measured = pd.concat(params_measured, ignore_index=True)
params_measured['date-insert'] = pd.to_datetime(params_measured['date-insert'])
params_measured

Unnamed: 0,_id,parametro,date-insert
0,5c2aa3dfe2705c1932134299,CO,2018-12-31 23:18:55.923
1,5c2aa3dfe2705c193213429a,O3,2018-12-31 23:18:55.923
2,5c2aa3dfe2705c193213429b,PM10,2018-12-31 23:18:55.923
3,5c2aa3dfe2705c193213429c,SO2,2018-12-31 23:18:55.923
4,5c2aa3dfe2705c193213429d,NO2,2018-12-31 23:18:55.923
5,5c2aa3dfe2705c193213429e,PM2.5,2018-12-31 23:18:55.923
6,5c2aa3dfe2705c193213429f,TMP,2018-12-31 23:18:55.923


This DataFrame shows the 7 parameters measured by the stations, according to the API.
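
Because `results` is a list of flat dictionaries, the conversion can also be sketched without the explicit loop by passing the list straight to the `pd.DataFrame` constructor. The snippet below uses the first three entries of the response printed above; `results` and `params_df` are illustrative names.

```python
import pandas as pd

# First three entries of the 'results' list printed by the call above.
results = [
    {"_id": "5c2aa3dfe2705c1932134299", "parametro": "CO",
     "date-insert": "2018-12-31T23:18:55.923Z"},
    {"_id": "5c2aa3dfe2705c193213429a", "parametro": "O3",
     "date-insert": "2018-12-31T23:18:55.923Z"},
    {"_id": "5c2aa3dfe2705c193213429b", "parametro": "PM10",
     "date-insert": "2018-12-31T23:18:55.923Z"},
]

# Each dictionary becomes one row; the keys become the columns.
params_df = pd.DataFrame(results)
params_df["date-insert"] = pd.to_datetime(params_df["date-insert"])

print(params_df["parametro"].tolist())
```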

- Now, we use <font color=blue>'sinaica-estaciones'</font> as the endpoint to get all the stations from the API. The parameter <font color=blue>'pageSize'</font> is the number of results we want the API to return. In this case, there are 185 stations in the database, so we pass a larger number to make sure it returns all of them.
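
The `pagination` block of each response reports `pageSize`, `page` and `total`, so it can be used to check whether a single call returned everything. A minimal sketch, assuming only that structure; the helper names `pages_needed` and `is_complete` are hypothetical and not part of datosgobmx:

```python
import math

def pages_needed(pagination):
    """Number of pages required to cover 'total' results at the given 'pageSize'."""
    return math.ceil(pagination["total"] / pagination["pageSize"])

def is_complete(response):
    """True when a single page already holds every result."""
    return pages_needed(response["pagination"]) <= 1

# 185 stations with pageSize=200 fit in one page; with pageSize=100 they need two.
print(pages_needed({"pageSize": 200, "page": 1, "total": 185}))
print(pages_needed({"pageSize": 100, "page": 1, "total": 185}))
```

Checking the pagination block avoids hard-coding the station count if the database grows.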

In [13]:
# Now, retrieve all stations and locations in the country by calling the air quality API.
# The dependency used is datosgobmx, imported earlier in the notebook.
params_estaciones = client.makeCall('sinaica-estaciones', {'pageSize':200})

CALL: https://api.datos.gob.mx/v2/sinaica-estaciones?pageSize=200
{'pagination': {'pageSize': 200, 'page': 1, 'total': 185}, 'results': [{'_id': '5ca7a94ee2705c1932937045', 'lat': 21.873311111111, 'long': -102.32080277778, 'id': 31, 'nombre': 'CBTIS\xa0', 'codigo': 'CBT', 'redesid': 30, 'date-insert': '2019-04-05T19:15:26.824Z'}, {'_id': '5ca7a94ee2705c1932937046', 'lat': 21.846391666667, 'long': -102.28843055556, 'id': 32, 'nombre': 'Secretaría de Medio Ambiente', 'codigo': 'SMA', 'redesid': 30, 'date-insert': '2019-04-05T19:15:26.828Z'}, {'_id': '5ca7a94ee2705c1932937047', 'lat': 21.883780555556, 'long': -102.295825, 'id': 33, 'nombre': 'Centro', 'codigo': 'CEN', 'redesid': 30, 'date-insert': '2019-04-05T19:15:26.828Z'}, {'_id': '5ca7a94ee2705c1932937048', 'lat': 32.639722222222, 'long': -115.50638888889, 'id': 39, 'nombre': 'COBACH', 'codigo': 'SPABC14', 'redesid': 32, 'date-insert': '2019-04-05T19:15:26.828Z'}, {'_id': '5ca7a94ee2705c1932937049', 'lat': 32.603638888889, 'long': -11

In [14]:
# Create a DataFrame of these stations.
stations_params = []
for v in params_estaciones['results']:
    aux = pd.DataFrame.from_dict(v, orient='index').T
    stations_params.append(aux)

stations_params = pd.concat(stations_params, ignore_index=True)
stations_params['date-insert'] = pd.to_datetime(stations_params['date-insert'])
stations_params.head()

Unnamed: 0,_id,lat,long,id,nombre,codigo,redesid,date-insert
0,5ca7a94ee2705c1932937045,21.8733,-102.321,31,CBTIS,CBT,30,2019-04-05 19:15:26.824
1,5ca7a94ee2705c1932937046,21.8464,-102.288,32,Secretaría de Medio Ambiente,SMA,30,2019-04-05 19:15:26.828
2,5ca7a94ee2705c1932937047,21.8838,-102.296,33,Centro,CEN,30,2019-04-05 19:15:26.828
3,5ca7a94ee2705c1932937048,32.6397,-115.506,39,COBACH,SPABC14,32,2019-04-05 19:15:26.828
4,5ca7a94ee2705c1932937049,32.6036,-115.486,41,CESPM,SPABC19,32,2019-04-05 19:15:26.828


This DataFrame shows every station in the country (185 in total). We need to filter it to get the stations that are still operating in Mexico City.

We do this using the list of operating stations created in the first part of the notebook.

In [15]:
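# Save the full station list, then select rows whose code is in the OPERATING STATIONS list.
stations_params.to_csv('Outputs/data_stations_sinaica.csv')
stations_params = stations_params.loc[stations_params['codigo'].isin(active_stations_list)]
# Skip the first 16 matches: they precede the block that starts at 'ACO' and
# presumably belong to other networks that reuse the same codes.
operating_stations_params_df = stations_params.iloc[16:]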

In [16]:
stations_codes_list = stations_params['codigo'].tolist()
print(stations_codes_list)

['SUR', 'TEC', 'CAM', 'SAG', 'TEC', 'ATI', 'SFE', 'TLA', 'TEC', 'TEC', 'TEC', 'COR', 'TEC', 'TEC', 'TEC', 'TEC', 'ACO', 'AJU', 'AJM', 'ATI', 'CAM', 'CCA', 'CHO', 'COY', 'CUA', 'CUT', 'FAC', 'HGM', 'IZT', 'LPR', 'LLA', 'MER', 'MON', 'NEZ', 'PED', 'SAG', 'SFE', 'MGH', 'TAH', 'TLA', 'TLI', 'UIZ', 'UAX', 'VIF', 'XAL', 'CHO', 'MPA', 'INN', 'GAM', 'SUR', 'SUR', 'FAR', 'SAC']
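
The printed list contains repeated codes ('TEC' alone appears ten times), so matching on the code by itself is ambiguous. One possible alternative to positional slicing, sketched here under assumptions, is to merge on the code and keep only rows whose coordinates agree with the CDMX catalogue. The `ACO` and `AJU` values are taken from the tables in this notebook; the extra `AJU` row with id 999 and the 0.01 degree tolerance are made up for illustration.

```python
import pandas as pd

# Catalogue rows (values from the CDMX station catalogue shown earlier).
catalogue = pd.DataFrame({
    "Key": ["ACO", "AJU"],
    "Lat": [19.635501, 19.154286],
    "Lng": [-98.912003, -99.162611],
})

# API rows: two real matches plus one invented station (id 999) that reuses
# the code "AJU" far away from Mexico City.
api = pd.DataFrame({
    "codigo": ["AJU", "ACO", "AJU"],
    "lat": [25.0, 19.6356, 19.1543],
    "long": [-100.0, -98.9122, -99.1628],
    "id": [999, 240, 241],
})

merged = api.merge(catalogue, left_on="codigo", right_on="Key")

tolerance = 0.01  # degrees; an assumed threshold, not a documented value
close = ((merged["lat"] - merged["Lat"]).abs() < tolerance) & \
        ((merged["long"] - merged["Lng"]).abs() < tolerance)

matched = merged.loc[close, ["id", "codigo", "lat", "long"]].reset_index(drop=True)
print(matched)
```

This keeps the selection tied to the data rather than to row positions, which may shift if the API adds or reorders stations.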


In [17]:
# Keep operating stations in the DataFrame, dropping rows 142 and 179 by index label.
operating_stations_df = operating_stations_params_df.drop([142,179])

# Keep and rename the wanted columns for better understanding.
operating_stations_df = operating_stations_df[['lat', 'long', 'id', 'nombre', 'codigo']]
operating_stations_df = operating_stations_df.rename(columns={'lat'  : 'Lat',
                                                              'long' : 'Lng',
                                                              'id' : 'id',
                                                              'nombre' : 'Name',
                                                              'codigo' : 'Key'
})
operating_stations_df.head()

Unnamed: 0,Lat,Lng,id,Name,Key
109,19.6356,-98.9122,240,Acolman,ACO
110,19.1543,-99.1628,241,Ajusco,AJU
111,19.2722,-99.2078,242,Ajusco Medio,AJM
112,19.5772,-99.2542,243,Atizapán,ATI
113,19.4686,-99.17,244,Camarones,CAM


In [18]:
# Summary: active stations in the API
operating_stations_df.count()

Lat     35
Lng     35
id      35
Name    35
Key     35
dtype: int64

* The next step in this notebook is to retrieve the parameters measured by each station. We have found that not every station measures all parameters, so we need to see which station measures which parameter. This will help us visualize the data and support a stronger analysis when we compare and study the rest of the data.

In [19]:
# First, store params measured in a list.
params_list = params_measured['parametro'].tolist()
params_list

['CO', 'O3', 'PM10', 'SO2', 'NO2', 'PM2.5', 'TMP']

In [20]:
# Store operating stations in a list.
stations_id = operating_stations_df['id'].tolist()
stations_names = operating_stations_df['Name'].tolist()

# Create a DataFrame with id and name of stations.
list_of_tuples = list(zip(stations_id, stations_names))
id_names_df = pd.DataFrame(list_of_tuples, columns = ['id', 'Name'])
id_names_df.head()

Unnamed: 0,id,Name
0,240,Acolman
1,241,Ajusco
2,242,Ajusco Medio
3,243,Atizapán
4,244,Camarones


***

Now we need to make calls to retrieve the parameters measured by each station. These calls are made in real time, so they return the latest measurements available in the data.

To make such a call, we use the following structure:
* client.makeCall('sinaica', {'pageSize':5000, 'parametro':'O3', 'estacionesid':259, 'page':1})

Here, **pageSize** is the number of results we want, **parametro** is the parameter whose measurements we want to see, **estacionesid** is the id of the station we want to query, and **page** is the number of the JSON page to retrieve, which is useful when retrieving historical data.

If no results are returned for a parameter, the station does not measure it, and it is not added to the DataFrame.

Retrieving full pages this way takes too much time. However, we can make calls that bring back just 1 result for each parameter and station. This is enough because we only want the types of parameters measured by each station.
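
The "empty result means the station does not measure it" rule can be sketched as a small helper. The function name `measures_parameter` is hypothetical, and `fake_call` is an offline stand-in for `client.makeCall` that mimics only the `results` key of the response; its two known pairs match the station/parameter table built later in this notebook.

```python
def measures_parameter(call, station_id, parameter):
    """Return True if the API returns at least one measurement for the pair."""
    query = {"pageSize": 1, "parametro": parameter,
             "estacionesid": station_id, "page": 1}
    response = call("sinaica", query)
    return len(response.get("results", [])) > 0

# Offline stand-in for client.makeCall (assumption: same call signature).
def fake_call(endpoint, query):
    known = {(240, "CO"), (241, "O3")}
    pair = (query["estacionesid"], query["parametro"])
    if pair in known:
        return {"results": [{"parametro": pair[1], "estacionesid": pair[0]}]}
    return {"results": []}

print(measures_parameter(fake_call, 240, "CO"))
print(measures_parameter(fake_call, 241, "CO"))
```

With the real client, `measures_parameter(client.makeCall, 259, 'O3')` would issue the same kind of request as the structure shown above, only with `pageSize` set to 1.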

## Retrieve one measure for each station and parameter.

In [21]:
# CODE to get stations and params.
guard = 0
if guard == 1:
    
    params_and_stations_o3 = []    
    for e in stations_id:
        data_api = client.makeCall('sinaica', {'pageSize':1, 'parametro':'O3', 'estacionesid':e, 'page':3})
        
        for v in data_api['results']:
            aux = pd.DataFrame.from_dict(v, orient='index').T
            params_and_stations_o3.append(aux)
    
    params_and_stations_o3 = pd.concat(params_and_stations_o3, ignore_index=True)


params_and_stations_co = []

for e in stations_id:
    
    data_api = client.makeCall('sinaica', {'pageSize':1, 'parametro':'XX', 'estacionesid':e, 'page':3})
    
    for v in data_api['results']:
        aux = pd.DataFrame.from_dict(v, orient='index').T
        params_and_stations_co.append(aux)
    
params_and_stations_XX = pd.concat(params_and_stations_co, ignore_index=True)

Note: We need to run this loop 7 times, each time changing XX to the corresponding parameter in ['CO', 'O3', 'PM10', 'SO2', 'NO2', 'PM2.5', 'TMP']. This is because the government API tends to fail when many consecutive calls are made.
Then save each DataFrame into a csv file:
params_and_stations_XX.to_csv('params_stations_index_XX.csv', index=False)

Once we have finished making the calls, we keep the results in csv files for later use.
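
Since the API tends to fail under many consecutive calls, one option is to pause between requests and retry failed ones, which lets a single loop cover all seven parameters. The sketch below is an assumption-laden outline, not the method used in this notebook: `collect`, `pause` and `retries` are hypothetical names, catching a generic `Exception` is a guess about how failures surface, and `flaky_call` is an offline stand-in for `client.makeCall`.

```python
import time
import pandas as pd

PARAMETERS = ["CO", "O3", "PM10", "SO2", "NO2", "PM2.5", "TMP"]

def collect(call, station_ids, parameters=PARAMETERS, pause=1.0, retries=3):
    """Request one measurement per (parameter, station) pair, pausing between calls."""
    rows = []
    for parameter in parameters:
        for station_id in station_ids:
            query = {"pageSize": 1, "parametro": parameter,
                     "estacionesid": station_id, "page": 1}
            for _ in range(retries):
                try:
                    response = call("sinaica", query)
                    break
                except Exception:  # assumed failure mode of the API client
                    time.sleep(pause)
            else:
                continue  # every retry failed; skip this pair
            rows.extend(response.get("results", []))
            time.sleep(pause)
    return pd.DataFrame(rows)

# Offline stand-in: fails on the first call, then answers only for CO.
calls = {"n": 0}
def flaky_call(endpoint, query):
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("simulated failure")
    if query["parametro"] == "CO":
        return {"results": [{"parametro": "CO",
                             "estacionesid": query["estacionesid"]}]}
    return {"results": []}

df = collect(flaky_call, [240, 242], parameters=["CO", "O3"], pause=0)
print(df)
```

With the real client, the call would be `collect(client.makeCall, stations_id)`, and the returned DataFrame could be written to a single csv file.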


In [22]:
# Save the files. Each file keeps its header row because the columns are read by name later.
if guard == 1:
    params_stations_index_co.to_csv('params_stations_index_co.csv', index=False)
    params_stations_index_no2.to_csv('params_stations_index_no2.csv', index=False)
    params_stations_index_o3.to_csv('params_stations_index_o3.csv', index=False)
    params_stations_index_pm2.to_csv('params_stations_index_pm2.csv', index=False)
    params_stations_index_pm10.to_csv('params_stations_index_pm10.csv', index=False)
    params_stations_index_so2.to_csv('params_stations_index_so2.csv', index=False)
    params_stations_index_tmp.to_csv('params_stations_index_tmp.csv', index=False)

***

## Setup Stations and Parameters measured.

In [23]:
# New Dependencies
from pathlib import Path
import json

In [24]:
for page in range(1,650):
    filename = 'page_%i_sinaica_mediciones.json'%page
    filename = '/mediciones_sinaica_json/'+filename

In [25]:
# Retrieve stations csv file.
co_measured = 'Resources/params_stations_index_co.csv'
no2_measured = 'Resources/params_stations_index_no2.csv'
o3_measured = 'Resources/params_stations_index_o3.csv'
pm2_measured = 'Resources/params_stations_index_pm2.csv'
pm10_measured = 'Resources/params_stations_index_pm10.csv'
so2_measured = 'Resources/params_stations_index_so2.csv'
tmp_measured = 'Resources/params_stations_index_tmp.csv'

co_measured_df = pd.read_csv(co_measured, encoding="ISO-8859-1")
no2_measured_df = pd.read_csv(no2_measured, encoding="ISO-8859-1")
o3_measured_df = pd.read_csv(o3_measured, encoding="ISO-8859-1")
pm2_measured_df = pd.read_csv(pm2_measured, encoding="ISO-8859-1")
pm10_measured_df = pd.read_csv(pm10_measured, encoding="ISO-8859-1")
so2_measured_df = pd.read_csv(so2_measured, encoding="ISO-8859-1")
tmp_measured_df = pd.read_csv(tmp_measured, encoding="ISO-8859-1")

co_measured_df = co_measured_df[['parametro', 'estacionesid']]
no2_measured_df = no2_measured_df[['parametro', 'estacionesid']]
o3_measured_df = o3_measured_df[['parametro', 'estacionesid']]
pm2_measured_df = pm2_measured_df[['parametro', 'estacionesid']]
pm10_measured_df = pm10_measured_df[['parametro', 'estacionesid']]
so2_measured_df = so2_measured_df[['parametro', 'estacionesid']]
tmp_measured_df = tmp_measured_df[['parametro', 'estacionesid']]

# Concatenate all data into a single DataFrame.
stations_df_full = pd.concat([co_measured_df, no2_measured_df, o3_measured_df, pm2_measured_df,
                              pm10_measured_df, so2_measured_df, tmp_measured_df], ignore_index=True)
stations_df_full.count()

parametro       178
estacionesid    178
dtype: int64

In [26]:
# Get the station id and the parameter from the csv files. There is one reading of each
# parameter for each station that measures that parameter.
stations_df_full = stations_df_full.rename(columns={'parametro' : 'Parameter',
                                                    'estacionesid' : 'id'
})
stations_df_full.head()

Unnamed: 0,Parameter,id
0,CO,240
1,CO,242
2,CO,243
3,CO,244
4,CO,245


In [27]:
stations_df_full.count()

Parameter    178
id           178
dtype: int64

In [41]:
params_estaciones = stations_df_full.groupby('id').Parameter.unique().reset_index()
params_estaciones['Parameters_str'] = params_estaciones.Parameter.map(lambda x: ', '.join(x))
params_estaciones.head()

Unnamed: 0,id,Parameter,Parameters_str
0,240,"[CO, NO2, O3, PM10, SO2]","CO, NO2, O3, PM10, SO2"
1,241,"[O3, PM2.5, TMP]","O3, PM2.5, TMP"
2,242,"[CO, NO2, O3, PM2.5, PM10, SO2, TMP]","CO, NO2, O3, PM2.5, PM10, SO2, TMP"
3,243,"[CO, NO2, O3, SO2]","CO, NO2, O3, SO2"
4,244,"[CO, NO2, O3, PM2.5, PM10, SO2]","CO, NO2, O3, PM2.5, PM10, SO2"


Recall that **id_names_df** looks like this:
<img src="Images/id_names_df.png" alt="Active Stations DF" title="ID & Names Stations DF" />

#### Merging DataFrames to get Names and Parameters

In [29]:
merging_id_df = id_names_df.merge(stations_df_full)
merging_id_df.head()
merging_id_df.count()

id           178
Name         178
Parameter    178
dtype: int64
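
`merge` performs an inner join by default, so a station with no rows in `stations_df_full` silently disappears from the result. This may explain why fewer stations remain later than the 35 found in the API. A minimal sketch of how to list such stations with a left join and `indicator=True`; ids 240 and 241 come from this notebook, while the station with id 999 is invented for illustration.

```python
import pandas as pd

id_names = pd.DataFrame({
    "id": [240, 241, 999],
    "Name": ["Acolman", "Ajusco", "Hypothetical station"],
})
station_params = pd.DataFrame({
    "Parameter": ["CO", "O3", "O3"],
    "id": [240, 240, 241],
})

# how="left" keeps every station; "_merge" marks rows without a match.
checked = id_names.merge(station_params, on="id", how="left", indicator=True)
missing = checked.loc[checked["_merge"] == "left_only", "Name"].tolist()

print(missing)
```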

#### Merging to get Locations
Recall that **operating_stations_df** looks like this:<br>


<img src="Images/operating_stations_df.png" alt="Active Stations DF" title="Active Stations DF" />


In [30]:
# Create new DF with Name, Lng, Lat, Parameter, id
merging_names_df = merging_id_df.merge(operating_stations_df)
merging_names_df = merging_names_df[['Name', 'Lng', 'Lat','Parameter', 'id']]
merging_names_df.head()
merging_names_df.count()

Name         178
Lng          178
Lat          178
Parameter    178
id           178
dtype: int64

In [31]:
# Set id as an index and Parameters as columns.
id_df = pd.crosstab(merging_names_df['id'], merging_names_df['Parameter'].fillna('n/a'))
id_df.head()

Parameter,CO,NO2,O3,PM10,PM2.5,SO2,TMP
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
240,1,1,1,1,0,1,0
241,0,0,1,0,1,0,1
242,1,1,1,1,1,1,1
243,1,1,1,0,0,1,0
244,1,1,1,1,1,1,0


In [32]:
# Join parameters for each id station. Reset index for id.
parameters_df = id_df.reset_index()
parameters_df.head()

Parameter,id,CO,NO2,O3,PM10,PM2.5,SO2,TMP
0,240,1,1,1,1,0,1,0
1,241,0,0,1,0,1,0,1
2,242,1,1,1,1,1,1,1
3,243,1,1,1,0,0,1,0
4,244,1,1,1,1,1,1,0


In [33]:
# Build the final form of the DataFrame by merging the parameter flags with the names and locations.
final_stations_params_df = parameters_df.merge(merging_names_df)
final_stations_params_df.head()

Unnamed: 0,id,CO,NO2,O3,PM10,PM2.5,SO2,TMP,Name,Lng,Lat,Parameter
0,240,1,1,1,1,0,1,0,Acolman,-98.9122,19.6356,CO
1,240,1,1,1,1,0,1,0,Acolman,-98.9122,19.6356,NO2
2,240,1,1,1,1,0,1,0,Acolman,-98.9122,19.6356,O3
3,240,1,1,1,1,0,1,0,Acolman,-98.9122,19.6356,PM10
4,240,1,1,1,1,0,1,0,Acolman,-98.9122,19.6356,SO2


In [34]:
# Delete repeated rows, keeping the first occurrence in column 'Name'
final_stations_params_df.drop_duplicates(subset ="Name", keep = 'first', inplace = True) 

# Delete column 'Parameter'
del final_stations_params_df['Parameter']

In [35]:
# Total number of rows is 31: these are all the stations with at least one parameter reading
final_stations_params_df.head()

Unnamed: 0,id,CO,NO2,O3,PM10,PM2.5,SO2,TMP,Name,Lng,Lat
0,240,1,1,1,1,0,1,0,Acolman,-98.9122,19.6356
5,241,0,0,1,0,1,0,1,Ajusco,-99.1628,19.1543
8,242,1,1,1,1,1,1,1,Ajusco Medio,-99.2078,19.2722
15,243,1,1,1,0,0,1,0,Atizapán,-99.2542,19.5772
19,244,1,1,1,1,1,1,0,Camarones,-99.17,19.4686


In [36]:
final_stations_params_df.count()

id       31
CO       31
NO2      31
O3       31
PM10     31
PM2.5    31
SO2      31
TMP      31
Name     31
Lng      31
Lat      31
dtype: int64

In [37]:
# Reorder columns for readability.
final_stations_params_df = final_stations_params_df[['id', 'Name', 'Lat', 'Lng', 'CO', 'NO2', 'O3', 'PM10', 'PM2.5', 'SO2', 'TMP']]
final_stations_params_df.head()

Unnamed: 0,id,Name,Lat,Lng,CO,NO2,O3,PM10,PM2.5,SO2,TMP
0,240,Acolman,19.6356,-98.9122,1,1,1,1,0,1,0
5,241,Ajusco,19.1543,-99.1628,0,0,1,0,1,0,1
8,242,Ajusco Medio,19.2722,-99.2078,1,1,1,1,1,1,1
15,243,Atizapán,19.5772,-99.2542,1,1,1,0,0,1,0
19,244,Camarones,19.4686,-99.17,1,1,1,1,1,1,0


The final stations-parameters DataFrame lists the id, name, latitude and longitude of each operating station.
It also shows which parameters each station reads: 1 means the station reads the parameter, 0 means it does not.
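
The same one-row-per-station table can be sketched more directly by merging the crosstab flags onto a table that already has one row per station, which avoids the later duplicate-dropping step. The rows for stations 240 and 241 are copied from the outputs above; `stations`, `long_form`, `flags` and `wide` are illustrative names.

```python
import pandas as pd

# One row per station (values from operating_stations_df).
stations = pd.DataFrame({
    "id": [240, 241],
    "Name": ["Acolman", "Ajusco"],
    "Lat": [19.6356, 19.1543],
    "Lng": [-98.9122, -99.1628],
})

# One row per (station, parameter) pair, as in stations_df_full.
long_form = pd.DataFrame({
    "id": [240, 240, 240, 240, 240, 241, 241, 241],
    "Parameter": ["CO", "NO2", "O3", "PM10", "SO2", "O3", "PM2.5", "TMP"],
})

# crosstab counts each pair: 1 if the station reads the parameter, 0 otherwise.
flags = pd.crosstab(long_form["id"], long_form["Parameter"]).reset_index()
wide = stations.merge(flags, on="id")

print(wide)
```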

***

## Mapping Stations: location and parameters

In [38]:
# Dependencies for folium module.
import folium
m = folium.Map(location=[19.3911668, -99.4238175])

In [39]:
centro_lat, centro_lon = 19.3911668, -99.4238175
folium_map = folium.Map(location=[centro_lat, centro_lon], zoom_start = 5, tiles='cartodb positron')

for i, row in final_stations_params_df.iterrows():
    # Collect the parameters this station reads (flag equal to 1).
    params = [p for p in ['CO', 'NO2', 'O3', 'PM10', 'PM2.5', 'SO2', 'TMP'] if row[p] == 1]
    
    
    popup_text = f'<b> Nombre: </b> {row.Name} <br> \
                  <b> Latitud: </b> {row.Lat:.5f} <br> \
                  <b> Longitud: </b> {row.Lng:.5f} <br> \
                  <b> Parametros: </b>   {params}'
    folium.CircleMarker(location=[row.Lat, row.Lng], radius=5,
                       tooltip=popup_text, fill=True, fill_opacity=0.4).add_to(folium_map)

folium_map.save('all_stations.html')

In [40]:
folium_map