<a href="https://colab.research.google.com/github/andonyns/air-quality/blob/main/main.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 01

## Group 04
- Jorge Ignacio Chavarría Herrera - B82073
- Antonio Badilla-Olivas - B80874
- Enrique Guillermo Vílchez Lizano - C18477
- Andony Nuñez Solano - B04539

## Objectives

1. Select the parameters and cities, and collect their data.
2. Clean and transform the data so that it can be compared.
3. Perform univariate and multivariate analysis: analyze indicator trends, make comparisons, and include possible correlations between variables.
4. Draw conclusions and recommendations in light of each country's environmental policies.

# Relevant concepts
These are the pollutants that [OpenAQ](https://openaq.org/) reports as measures of air pollution. The definitions are taken from the EPA's [criteria air pollutants](https://www.epa.gov/criteria-air-pollutants/information-pollutant) pages, which cover the pollutants regulated under the Clean Air Act:

1. PM (Particulate Matter): These particles come in many sizes and shapes and can be made up of hundreds of different chemicals. Some are emitted directly from a source, such as construction sites, unpaved roads, fields, smokestacks or fires. Most particles form in the atmosphere as a result of complex reactions of chemicals such as sulfur dioxide and nitrogen oxides, which are pollutants emitted from power plants, industries and automobiles.

  - PM₂.₅ (Particulate Matter 2.5 micrometers or smaller): fine inhalable particles, with diameters that are generally 2.5 micrometers and smaller.

  - PM₁₀ (Particulate Matter 10 micrometers or smaller): inhalable particles, with diameters that are generally 10 micrometers and smaller.

2. O₃ (Ozone):
Tropospheric, or ground-level, ozone is not emitted directly into the air, but is created by chemical reactions between oxides of nitrogen (NOx) and volatile organic compounds (VOC). This happens when pollutants emitted by cars, power plants, industrial boilers, refineries, chemical plants, and other sources chemically react in the presence of sunlight.

3. NO₂ (Nitrogen Dioxide):
NO₂ is one of a group of highly reactive gases known as oxides of nitrogen or nitrogen oxides (NOx). Other nitrogen oxides include nitrous acid and nitric acid. NO₂ is used as the indicator for the larger group of nitrogen oxides. It gets into the air primarily from the burning of fuel, forming from the emissions of cars, trucks and buses, power plants, and off-road equipment.

4. SO₂ (Sulfur Dioxide):
SO₂ is the component of greatest concern and is used as the indicator for the larger group of gaseous sulfur oxides (SOx). Other gaseous SOx (such as SO₃) are found in the atmosphere at concentrations much lower than SO₂. The largest source of SO₂ in the atmosphere is the burning of fossil fuels by power plants and other industrial facilities. Smaller sources of SO₂ emissions include industrial processes such as extracting metal from ore; natural sources such as volcanoes; and locomotives, ships and other vehicles and heavy equipment that burn fuel with a high sulfur content.

5. CO (Carbon Monoxide):
CO is a colorless, odorless gas that can be harmful when inhaled in large amounts. CO is released when something is burned. The greatest sources of CO in outdoor air are cars, trucks and other vehicles or machinery that burn fossil fuels. A variety of household items, such as unvented kerosene and gas space heaters, leaking chimneys and furnaces, and gas stoves, also release CO and can affect indoor air quality.

In [None]:
# For API requests
import requests
from urllib.parse import urljoin

# For environment
import os
from dotenv import load_dotenv

load_dotenv()

True

# 1. Dataset selection:

Download from [OpenAQ.org](https://openaq.org), through its API, the data available for Costa Rica over the last decade (or for whichever years have data), for every available air quality indicator.

Data must also be downloaded for at least five other cities, each with at least 5 air quality parameters.

Each analyzed parameter must be explained, and the cities chosen for comparison must share at least 3 parameters so that they can be compared with one another.
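The "at least 3 shared parameters" requirement can be checked with set intersections before any measurements are downloaded. The following is a minimal sketch, not the group's actual selection procedure: the city names and the availability table are hypothetical placeholders, and the parameter names other than `pm25` are assumed to follow the same OpenAQ naming style.

```python
from itertools import combinations

# Hypothetical availability table: city name -> parameters reported there.
# In the notebook this could be built from the `sensors` list of each location.
available = {
    "city_a": {"pm25", "pm10", "o3", "no2", "so2"},
    "city_b": {"pm25", "pm10", "no2", "co", "so2"},
    "city_c": {"pm25", "o3"},
}


def comparable_pairs(params_by_city, min_shared=3):
    """Return {(city1, city2): shared_parameters} for pairs sharing enough parameters."""
    pairs = {}
    for first, second in combinations(sorted(params_by_city), 2):
        shared = params_by_city[first] & params_by_city[second]
        if len(shared) >= min_shared:
            pairs[(first, second)] = shared
    return pairs


pairs = comparable_pairs(available)
for (first, second), shared in pairs.items():
    print(first, second, sorted(shared))
```

Only pairs that pass this check are worth carrying into the comparison step; cities that share fewer parameters can still be analyzed on their own.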


In [15]:
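# Base URLs
BASE_API_URL = "https://api.openaq.org/v3/"

# API_KEY is read from the environment (loaded from .env above); os.getenv returns None if it is unset
HEADERS = {"X-API-Key": os.getenv("API_KEY")}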

LOCATIONS_ENDPOINT = urljoin(BASE_API_URL, "locations/{location_id}")
MEASUREMENTS_ENDPOINT = urljoin(BASE_API_URL, "sensors/{sensor_id}/measurements")
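The endpoint constants above are URL templates: `urljoin` keeps the `{...}` placeholder untouched and `str.format` fills it in later. A small sketch of that behavior, using the sensor id that appears in the trial response of this notebook, also shows why the trailing slash in `BASE_API_URL` matters:

```python
from urllib.parse import urljoin

BASE_API_URL = "https://api.openaq.org/v3/"
template = urljoin(BASE_API_URL, "sensors/{sensor_id}/measurements")

# The placeholder survives urljoin and is filled in later with str.format.
measurements_url = template.format(sensor_id=10669679)
print(measurements_url)

# Without the trailing slash, urljoin replaces the last path segment ("v3").
broken = urljoin("https://api.openaq.org/v3", "locations/{location_id}")
print(broken)
```

The second URL silently loses the `v3` segment, so the base URL should always end with `/` when it is combined with relative paths.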

In [16]:
from typing import Any


def fetch_data(
    base_url: str,
    headers: dict[str, str] | None = None,
    parameters: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Fill the URL template with `parameters`, send a GET request, and return the JSON body."""
    if parameters is not None:
        base_url = base_url.format(**parameters)

    # requests treats headers=None as "no extra headers", so one call covers both cases.
    # The timeout keeps a stalled connection from hanging the notebook.
    response = requests.get(url=base_url, headers=headers, timeout=30)

    if response.status_code != 200:
        raise Exception(
            f"Request failed with status: {response.status_code}. Reason: {response.text}"
        )

    return response.json()
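A decade of measurements will not fit in a single response, so the requests probably need to be paginated. The sketch below is one possible approach, not part of the notebook: `iter_results` and `fake_fetch_page` are hypothetical helpers, and the stop condition assumes each response carries a `meta.limit` field and a `results` list like the ones in the trial output.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[int], dict], max_pages: int = 1000) -> Iterator[dict]:
    """Yield every item in `results`, requesting pages until a short page arrives."""
    for page in range(1, max_pages + 1):
        payload = fetch_page(page)
        results = payload.get("results", [])
        yield from results
        # A page shorter than `limit` (or an empty one) is taken to be the last.
        limit = payload.get("meta", {}).get("limit", 0)
        if not results or len(results) < limit:
            break


# Stand-in for the API: 5 fake records served 2 per page.
def fake_fetch_page(page: int, limit: int = 2) -> dict:
    records = [{"value": v} for v in range(5)]
    start = (page - 1) * limit
    return {"meta": {"page": page, "limit": limit}, "results": records[start:start + limit]}


all_values = [item["value"] for item in iter_results(fake_fetch_page)]
print(all_values)
```

In the notebook, `fetch_page` could wrap `fetch_data` with the page number placed in the URL; the exact query parameter names should be checked against the OpenAQ documentation before relying on them.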

In [None]:
# Trial request: fetch the metadata of one location in Costa Rica
cr_location_id = 3070644

fetch_data(
    base_url=LOCATIONS_ENDPOINT,
    headers=HEADERS,
    parameters={"location_id": cr_location_id},
)

{'meta': {'name': 'openaq-api',
  'website': '/',
  'page': 1,
  'limit': 100,
  'found': 1},
 'results': [{'id': 3070644,
   'name': 'NASA GSFC Rutgers Calib. N13',
   'locality': None,
   'timezone': 'America/Costa_Rica',
   'country': {'id': 29, 'code': 'CR', 'name': 'Costa Rica'},
   'owner': {'id': 9, 'name': 'Clarity'},
   'provider': {'id': 166, 'name': 'Clarity'},
   'isMobile': False,
   'isMonitor': False,
   'instruments': [{'id': 4, 'name': 'Clarity Sensor'}],
   'sensors': [{'id': 10669679,
     'name': 'pm25 µg/m³',
     'parameter': {'id': 2,
      'name': 'pm25',
      'units': 'µg/m³',
      'displayName': 'PM2.5'}}],
   'coordinates': {'latitude': 9.938, 'longitude': -84.0417},
   'licenses': [{'id': 38,
     'name': 'CC0 1.0',
     'attribution': {'name': 'Clarity', 'url': None},
     'dateFrom': '2021-10-20',
     'dateTo': None}],
   'bounds': [-84.0417, 9.938, -84.0417, 9.938],
   'distance': None,
   'datetimeFirst': {'utc': '2024-09-19T20:01:34Z',
    'local': '
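The measurements endpoint is keyed by sensor id, so the next step is to pull the sensor ids out of a location response. A minimal sketch follows, assuming the response shape shown in the output above; `list_sensors` is a hypothetical helper and the sample is a trimmed copy of that output.

```python
def list_sensors(location_response: dict) -> list[dict]:
    """Flatten a locations response into one row per sensor."""
    rows = []
    for location in location_response.get("results", []):
        for sensor in location.get("sensors", []):
            parameter = sensor.get("parameter", {})
            rows.append({
                "location_id": location.get("id"),
                "country": (location.get("country") or {}).get("code"),
                "sensor_id": sensor.get("id"),
                "parameter": parameter.get("name"),
                "units": parameter.get("units"),
            })
    return rows


# Trimmed copy of the trial response.
sample = {
    "results": [{
        "id": 3070644,
        "country": {"id": 29, "code": "CR", "name": "Costa Rica"},
        "sensors": [{
            "id": 10669679,
            "name": "pm25 µg/m³",
            "parameter": {"id": 2, "name": "pm25", "units": "µg/m³", "displayName": "PM2.5"},
        }],
    }]
}

sensors = list_sensors(sample)
print(sensors)
```

Each row's `sensor_id` can then be substituted into `MEASUREMENTS_ENDPOINT` through the `parameters` argument of `fetch_data`.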


# 2. Cleaning and transformation tasks:

Carry out the cleaning and transformation tasks needed to compare how the different air quality indicators have evolved in Costa Rica and in the other cities.
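One possible shape for this step is sketched below with pandas. It is an illustration under stated assumptions, not the notebook's implementation: the records and column names (`city`, `parameter`, `datetime_utc`, `value`) are hypothetical, and treating negative concentrations as sensor error codes is an assumption that should be checked against the real data.

```python
import pandas as pd

# Hypothetical raw records; the column names are illustrative, not the API's.
raw = pd.DataFrame({
    "city": ["city_a"] * 5,
    "parameter": ["pm25"] * 5,
    "datetime_utc": [
        "2024-09-19T20:00:00Z", "2024-09-19T21:00:00Z", "2024-09-19T21:00:00Z",
        "2024-09-20T01:00:00Z", "2024-09-20T02:00:00Z",
    ],
    "value": [10.0, 14.0, 14.0, -999.0, 8.0],
})


def clean_measurements(frame: pd.DataFrame) -> pd.DataFrame:
    """Parse timestamps, drop duplicates and invalid values, then average per day."""
    cleaned = frame.copy()
    cleaned["datetime_utc"] = pd.to_datetime(cleaned["datetime_utc"], utc=True)
    cleaned = cleaned.drop_duplicates(subset=["city", "parameter", "datetime_utc"])
    # Assumption: concentrations cannot be negative, so such values are discarded.
    cleaned = cleaned.loc[cleaned["value"] >= 0].copy()
    cleaned["date"] = cleaned["datetime_utc"].dt.date
    return cleaned.groupby(["city", "parameter", "date"], as_index=False)["value"].mean()


daily = clean_measurements(raw)
print(daily)
```

Aggregating to daily means puts sensors with different reporting intervals on a common time grid, which is what makes cities comparable.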



# 3. Implementation in Google Colab:

Implement the work in Google Colab. If performance problems arise, another environment may be used, which must be noted both in the notebook documentation and in the presentation.



# 4. Analysis and comparison:

Perform an exploratory data analysis (EDA) that includes univariate and multivariate analysis.

Analyze the indicator trends for the different cities and compare them across countries and cities.

Include possible correlations between the air quality variables and parameters of each country/city.

Use different types of visualizations relevant to the analysis.
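As a rough sketch of how the univariate and multivariate parts could start, the example below summarizes each parameter and then computes pairwise Pearson correlations. The data frame is a hypothetical stand-in for cleaned daily means, and its column names (`city`, `date`, `parameter`, `value`) are assumptions rather than the notebook's actual schema.

```python
import pandas as pd

# Hypothetical daily means in long format (one row per city, parameter, day).
daily = pd.DataFrame({
    "city": ["city_a"] * 8,
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"] * 2),
    "parameter": ["pm25"] * 4 + ["pm10"] * 4,
    "value": [10.0, 12.0, 9.0, 15.0, 20.0, 25.0, 18.0, 31.0],
})

# Univariate view: summary statistics per parameter.
summary = daily.groupby("parameter")["value"].describe()

# Multivariate view: one column per parameter, then pairwise Pearson correlation.
wide = daily.pivot_table(index=["city", "date"], columns="parameter", values="value")
correlation = wide.corr(method="pearson")

print(summary[["mean", "min", "max"]])
print(correlation.round(2))
```

The wide table is also a convenient input for plots: a line plot of `wide` shows trends over time, and a heatmap of `correlation` shows which parameters move together.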



# 5. Conclusions and recommendations:

Draw conclusions about how air quality has evolved in Costa Rica and in the selected cities, explaining how the data support these conclusions.

Research the environmental policies and regulations in these cities and show how the data reflect the effect of these policies.