# Final Project #
**Name:** Andrea Gonzalez Cruz

**e-mail:** andrea.gcruz@alumnos.udg.mx

## Minimum requirements ##
The final project is a dashboard that uses interactive graphics to display information. In this case, the minimum requirements are:
1. The dashboard must include a **parameters menu for modifying the characteristics of the dataset**. We are working with an existing dataset, so the parameters in the menu will act as filters.
2. Display a **graphical representation** of the source dataset.
3. Include a **control panel for computing metrics from the source dataset**. The metrics must also be represented graphically.

### Air Quality Dashboard ###
The purpose of the final project is to visualize environmental parameters for air quality prediction through graphs.
The variables contained in the dataset *(Air Quality and Pollution)* include **temperature, humidity level, concentration of pollutant gas particles** and **population density**. 

This section imports the libraries that will be used throughout the project.

In [1]:
import panel as pn #This library helps in the development of dashboards
import panel.widgets as pnw
import pandas as pd
import numpy as np
import plotly.graph_objects as go
pn.extension('plotly')
import math
import matplotlib.pyplot as plt

df = pd.read_csv('updated_pollution_dataset.csv') #Load the CSV dataset obtained from Kaggle (Air Quality and Pollution Assessment)
df = df.dropna() #Drop the rows that contain null values
df.head() #Show the first rows to verify that the dataset was imported correctly
df['Temperature'] = pd.to_numeric(df['temperature'], errors='coerce') #Convert temperature to numeric values to avoid errors in the graphs
df['Humidity'] = pd.to_numeric(df['humidity'], errors='coerce') #Convert humidity to numeric values to avoid errors in the graphs

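| Unnamed: 0 | temperature | humidity | PM25 | PM10 | NO2 | SO2 | CO | proximity_ia | population | air_quality |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 29.8 | 59.1 | 5.2 | 17.9 | 18.9 | 9.2 | 1.72 | 6.3 | 319 | Moderate |
| 1 | 28.3 | 75.6 | 2.3 | 12.2 | 30.8 | 9.7 | 1.64 | 6.0 | 611 | Moderate |
| 2 | 23.1 | 74.7 | 26.7 | 33.8 | 24.4 | 12.6 | 1.63 | 5.2 | 619 | Moderate |
| 3 | 27.1 | 39.1 | 6.1 | 6.3 | 13.5 | 5.3 | 1.15 | 11.1 | 551 | Good |
| 4 | 26.5 | 70.7 | 6.9 | 16.0 | 21.9 | 5.6 | 1.01 | 12.7 | 303 | Good |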
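
The numeric conversion used in the import cell can be illustrated on a few made-up values. This is only a sketch: the `raw` frame and its `"n/a"` entry are hypothetical and are not taken from the dataset. It shows that `errors='coerce'` turns unparseable entries into `NaN`, which can then be counted or dropped.

```python
import pandas as pd

# Hypothetical raw values; the "n/a" entry stands in for a malformed cell
raw = pd.DataFrame({"temperature": ["29.8", "n/a", "23.1"]})

# errors="coerce" replaces anything that cannot be parsed with NaN
raw["Temperature"] = pd.to_numeric(raw["temperature"], errors="coerce")

# Count the entries that could not be converted
n_invalid = int(raw["Temperature"].isna().sum())

# Drop the rows whose converted value is NaN
clean = raw.dropna(subset=["Temperature"])
print(n_invalid, len(clean))
```

Note that values coerced to `NaN` appear only after the conversion, so a `dropna()` call made before it does not remove them.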


### 1. Filter function ###
The **filtering** function takes the data from the dataset. In this case, we are going to filter the information using **air_quality, population density (people/km2) and proximity_ia (proximity to industrial areas in km)**. These variables can be modified in the dashboard.
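
As a quick illustration of these three conditions, the sketch below applies them to a handful of made-up rows. The `sample` frame and the name `filter_rows` are hypothetical and only mirror the logic described above.

```python
import pandas as pd

# Made-up rows using the same column names as the dataset
sample = pd.DataFrame({
    "air_quality": ["Good", "Good", "Moderate", "Good"],
    "population": [303, 551, 611, 250],
    "proximity_ia": [12.7, 11.1, 6.0, 3.0],
})

def filter_rows(data, air_quality, min_population, max_proximity_ia):
    """Keep rows of one air quality class within the population and distance limits."""
    mask = (
        (data["air_quality"] == air_quality)
        & (data["population"] >= min_population)
        & (data["proximity_ia"] <= max_proximity_ia)
    )
    return data[mask]

# Only the row with population 551 satisfies all three conditions
result = filter_rows(sample, "Good", 300, 12.0)
print(result)
```

Each condition produces a boolean Series, and `&` combines them element by element, which is why every comparison is wrapped in parentheses.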

The **heatmap_figure** function helps us represent the values of a parameter *(PM2.5, PM10, NO2, SO2 or CO)* against two axis variables *(proximity_ia on the x axis and population on the y axis)*. The colorbar indicates the range of values for the chosen parameter. In the **histogram**, the x axis shows the parameter chosen in the dashboard.
Finally, the variables **temperature (°C)** and **humidity (%)** are represented by a **scatter plot** with one trace per air_quality category, so each category can be shown or hidden from the legend.

In [9]:
def filtering(data, air_quality, min_population, max_proximity_ia):    
    return data[(data['air_quality'] == air_quality) & 
                (data['population'] >= min_population) & 
                (data['proximity_ia'] <= max_proximity_ia)]

#Define the parameters to create the Heatmap Plot
def heatmap_figure(data, parameter):
    fig = go.Figure(data=go.Heatmap(
        z=data[parameter],
        x=data['proximity_ia'],  
        y=data['population'],  
        hoverongaps=False,
        colorscale='Viridis',
        hovertemplate='Industrial Area Proximity: %{x}<br>Population: %{y}<br>' + parameter + ': %{z}<extra></extra>',
        colorbar=dict(title=parameter)
    ))
    fig.update_layout(
        title=f'Heat Map {parameter}',
        xaxis_title="IA proximity",
        yaxis_title="Population"
    )
    return fig

#Define the parameters to create the Histogram Plot
def histogram_figure(data, parameter):
    data = data.dropna(subset=[parameter])
    fig = go.Figure()
    fig.add_trace(go.Histogram(
        x=data[parameter], name=parameter,
        marker=dict(color='lightskyblue'),
        nbinsx=30
    ))
    fig.update_layout(
        title=f'Histogram {parameter}', 
        xaxis_title=parameter, 
        yaxis_title='Frequency', 
        showlegend=True, bargap=0.4)
    return fig

#Define the parameters to create the Scatter Plot
def scatter_figure(data):
    fig = go.Figure()
    for category in data['air_quality'].unique():
        filtered = data[data['air_quality'] == category]
        fig.add_trace(go.Scatter(
            x=filtered['Temperature'], y=filtered['Humidity'], mode='markers',
            name=category, marker=dict(size=8, opacity=0.7)
        ))
    fig.update_layout(title='Temperature vs Humidity by Air Quality', xaxis_title="Temperature (°C)", yaxis_title="Humidity (%)")
    return fig
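
When several records share similar proximity and population values, one possible alternative is to average the parameter inside a coarse grid before plotting it. The sketch below is not the approach used in `heatmap_figure`; the helper name `binned_grid`, the number of bins and the sample values are all assumptions made for illustration.

```python
import pandas as pd

def binned_grid(data, parameter, x_col="proximity_ia", y_col="population", bins=2):
    """Average `parameter` within each cell of a bins x bins grid."""
    x_bin = pd.cut(data[x_col], bins=bins)
    y_bin = pd.cut(data[y_col], bins=bins)
    # observed=False keeps empty cells in the grid (they show up as NaN)
    return data.groupby([y_bin, x_bin], observed=False)[parameter].mean().unstack()

# Made-up records; the first two fall into the same grid cell
sample = pd.DataFrame({
    "proximity_ia": [1, 1, 1, 9, 9],
    "population": [100, 100, 900, 100, 900],
    "PM25": [10, 20, 20, 30, 40],
})

grid = binned_grid(sample, "PM25")
print(grid)
```

The rows of `grid` are population bins and the columns are proximity bins, so `grid.values` could be passed as a two-dimensional `z` to `go.Heatmap`.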

### 2. Create the widgets ###
By using **pn.widgets.Select** we can create drop-down menus for the dashboard. The **air_quality** variable (evaluated as good, moderate, poor and hazardous) is the main filter. The parameter is chosen next, and finally the graph type.
The graphs show the following parameters:
1. **PM2.5 Concentration (μg/m3)**: Fine particulate matter levels.
2. **PM10 Concentration (μg/m3)**: Coarse particulate matter levels.
3. **NO2 Concentration (ppb)**: Nitrogen dioxide levels.
4. **SO2 Concentration (ppb)**: Sulfur dioxide levels.
5. **CO Concentration (ppm)**: Carbon monoxide levels.

The **pn.widgets.IntSlider** creates sliders to choose a numeric value. In this case, we use:
1. **Population density (people/km2)**
2. **Proximity to Industrial Areas (km)**

In [14]:
air_quality_select = pn.widgets.Select(name='Air Quality', options=list(df['air_quality'].unique()), width=200)
parameter_select = pn.widgets.Select(name='Parameter', options=['PM25', 'PM10', 'NO2', 'SO2', 'CO'], width=200)
graph_type_select = pn.widgets.Select(name='Graph type', options=['Heatmap', 'Frequency Histogram'], width=200)

#Slider to modify the population density (people/km2)
population_slider = pn.widgets.IntSlider(name='Min Population', 
                                         start=int(df['population'].min()), 
                                         end=int(df['population'].max()), 
                                         value=int(df['population'].min()), 
                                         step=100, 
                                         width=200)

#Slider to modify the proximity to industrial areas (km)
#math.ceil rounds the maximum up so the default value does not exclude the farthest rows
proximity_slider = pn.widgets.IntSlider(name='Max Proximity to IA', 
                                        start=int(df['proximity_ia'].min()), 
                                        end=math.ceil(df['proximity_ia'].max()), 
                                        value=math.ceil(df['proximity_ia'].max()), 
                                        step=10, 
                                        width=200)
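
Because proximity_ia holds decimal values while the slider works with integers, the way the bounds are rounded affects which rows pass the filter. The sketch below uses made-up distances and a hypothetical helper, `slider_bounds`, to compare rounding the maximum up with truncating it.

```python
import math
import pandas as pd

def slider_bounds(series):
    """Return integer (start, end) bounds that enclose every value in `series`."""
    return math.floor(series.min()), math.ceil(series.max())

# Made-up distances in km
proximity = pd.Series([2.5, 6.3, 11.1, 25.8])

start, end = slider_bounds(proximity)

# Rows kept by a `<=` filter when the maximum is rounded up vs. truncated
kept_rounded_up = int((proximity <= end).sum())
kept_truncated = int((proximity <= int(proximity.max())).sum())
print(start, end, kept_rounded_up, kept_truncated)
```

With the truncated maximum, the record at 25.8 km is excluded even when the slider is at its highest position.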

### 3. Metric and Panel ###
In this section, the interactive widgets are connected to **data_fig** so that changing them updates the dashboard. **filtered_data** filters the data according to the options chosen by the user, while **filtered_count** shows the number of records that match the specified parameters.
The interactive graph (heatmap or histogram) is then created, and the visualization panel displays the record count together with the graph.
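
A minimal sketch of the metric step is shown below, assuming it reduces to counting the filtered rows and averaging the chosen parameter. The function name `filtered_metrics`, the returned dictionary and the sample rows are hypothetical and do not reproduce the project's actual **data_fig** code.

```python
import pandas as pd

def filtered_metrics(data, air_quality, min_population, max_proximity_ia, parameter):
    """Return the record count and the mean of `parameter` for the filtered rows."""
    mask = (
        (data["air_quality"] == air_quality)
        & (data["population"] >= min_population)
        & (data["proximity_ia"] <= max_proximity_ia)
    )
    filtered_data = data[mask]
    return {
        "filtered_count": len(filtered_data),
        "mean": filtered_data[parameter].mean(),  # NaN when no rows match
    }

# Made-up rows using the same column names as the dataset
sample = pd.DataFrame({
    "air_quality": ["Good", "Good", "Moderate"],
    "population": [303, 551, 611],
    "proximity_ia": [12.7, 11.1, 6.0],
    "PM25": [6.9, 6.1, 2.3],
})

metrics = filtered_metrics(sample, "Good", 300, 13, "PM25")
print(metrics)
```

In the dashboard, a function of this kind could be passed to `pn.bind` together with the widgets, so that it runs again whenever a selection or slider value changes.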