# Project Overview
This project aims to analyze air quality data for Delhi during January 2023 to gain insights into the Air Quality Index (AQI) and pollutant concentrations. The analysis involves calculating AQI, categorizing air quality, exploring temporal trends, identifying correlations between pollutants, and comparing AQI metrics with recommended air quality standards.

# About the Dataset
The dataset contains air quality measurements for Delhi during January 2023, with 9 columns and 562 records. It includes the following variables:

- Date and time of measurement
- Concentrations of the air pollutants Carbon Monoxide (CO), Nitric Oxide (NO), Nitrogen Dioxide (NO2), Ozone (O3), Sulfur Dioxide (SO2), Particulate Matter (PM2.5 and PM10), and Ammonia (NH3), all in µg/m³

# Concepts Used in the Project

- Air Quality Index (AQI): AQI is a numerical scale used to represent the quality of air in a specific location. It considers various pollutants and categorizes air quality into levels ranging from "Good" to "Hazardous".

- Data Analysis: Techniques such as data cleaning, manipulation, and aggregation are employed to analyze the dataset effectively.

- Data Visualization: Visualizations including time series plots, bar charts, pie charts, and correlation matrices are used to represent and interpret the data.

- Statistical Analysis: Statistical methods are used to calculate AQI, identify trends, and analyze correlations between pollutants.

- Data Interpretation: The analysis interprets the data to understand the prevailing air quality conditions, identify pollution patterns, and assess the severity of air pollution in Delhi during January 2023.

# Importing Libraries and Dataset

In [None]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

# Read data
data = pd.read_csv("delhiaqi.csv")
data['date'] = pd.to_datetime(data['date'])
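
Before computing anything, it can help to confirm that the loaded frame matches the dataset description (9 columns, parsed timestamps, no missing values). The sketch below runs those checks on a tiny in-memory sample. The column names (`co`, `no`, `no2`, `o3`, `so2`, `pm2_5`, `pm10`, `nh3`) and the values are illustrative assumptions, not taken from `delhiaqi.csv`.

```python
import io
import pandas as pd

# Synthetic two-row sample standing in for delhiaqi.csv
# (column names and values are assumptions made for illustration)
sample_csv = io.StringIO(
    "date,co,no,no2,o3,so2,pm2_5,pm10,nh3\n"
    "2023-01-01 00:00:00,1655.58,1.66,39.41,5.90,17.88,169.29,194.64,5.83\n"
    "2023-01-01 01:00:00,1869.20,6.82,42.16,1.99,22.17,182.84,211.08,7.66\n"
)
sample = pd.read_csv(sample_csv)
sample['date'] = pd.to_datetime(sample['date'])

n_rows, n_cols = sample.shape                  # number of records and columns
missing_per_column = sample.isna().sum()       # missing values in each column
date_is_datetime = pd.api.types.is_datetime64_any_dtype(sample['date'])

print(n_rows, n_cols)
print(missing_per_column)
print(date_is_datetime)
```

Running the same three checks on `data` should report 562 rows and 9 columns if the file matches the description above.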

Defining AQI breakpoints and categories

In [None]:
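# Each tuple is (lower concentration bound, upper concentration bound,
# AQI value assigned to any concentration in that range)
aqi_breakpoints = [
    (0, 12.0, 50), (12.1, 35.4, 100), (35.5, 55.4, 150),
    (55.5, 150.4, 200), (150.5, 250.4, 300), (250.5, 350.4, 400),
    (350.5, 500.4, 500)
]

# Each tuple is (lowest AQI, highest AQI, category name)
aqi_categories = [
    (0, 50, 'Good'), (51, 100, 'Moderate'), (101, 150, 'Unhealthy for Sensitive Groups'),
    (151, 200, 'Unhealthy'), (201, 300, 'Very Unhealthy'), (301, 500, 'Hazardous')
]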

Function to calculate AQI

In [None]:
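def calculate_aqi(pollutant_name, concentration):
    # The same breakpoint table is applied to every pollutant, so
    # pollutant_name is accepted for readability but is not used.
    if concentration < 0:
        return -1
    for low, high, aqi in aqi_breakpoints:
        # Compare against the upper bound only, so values that fall in the
        # gaps between ranges (e.g. 12.05) are still assigned an AQI
        if concentration <= high:
            return aqi
    # Return a default value if the concentration is above the highest breakpoint
    return -1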
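
The function above assigns the top AQI value of whichever range the concentration falls into. A common refinement is to interpolate linearly within the range, I = (I_high - I_low) / (C_high - C_low) * (C - C_low) + I_low. The sketch below is one possible version, not the method used in this notebook. The concentration bounds are copied from `aqi_breakpoints`; the lower AQI bound of each range (0, 51, 101, ...) and the rounding to one decimal place are assumptions, and `interp_breakpoints` and `interpolate_aqi` are hypothetical names.

```python
# (C_low, C_high, I_low, I_high): a concentration range and the AQI range it maps to.
# The I_low values are assumed to start one above the previous range's I_high.
interp_breakpoints = [
    (0.0, 12.0, 0, 50), (12.1, 35.4, 51, 100), (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200), (150.5, 250.4, 201, 300), (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]

def interpolate_aqi(concentration):
    """Linearly interpolate an AQI value inside the matching concentration range."""
    # Round to one decimal place so a value cannot land between two ranges
    c = round(concentration, 1)
    for c_low, c_high, i_low, i_high in interp_breakpoints:
        if c_low <= c <= c_high:
            slope = (i_high - i_low) / (c_high - c_low)
            return round(slope * (c - c_low) + i_low)
    # Outside every range (negative or above the highest breakpoint)
    return -1

print(interpolate_aqi(6.0))    # midpoint of the first range
print(interpolate_aqi(12.0))   # upper edge of the first range
print(interpolate_aqi(600.0))  # above the highest breakpoint
```

With interpolation, two readings inside the same range no longer receive an identical AQI, which makes hourly trends less step-like.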

Function to calculate AQI category

In [None]:
def categorize_aqi(aqi_value):
    for low, high, category in aqi_categories:
        if low <= aqi_value <= high:
            return category
    return None
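
As an alternative to looping over `aqi_categories`, the same binning can be expressed with `pd.cut`. This is a sketch on made-up AQI values, not a replacement the notebook relies on. The bin edges are derived from the category bounds above, and values outside 0 to 500 (such as the -1 default) become `NaN` rather than `None`.

```python
import pandas as pd

# Illustrative AQI values, including the -1 "out of range" default
aqi_values = pd.Series([25, 50, 75, 150, 200, 300, 500, -1])

bin_edges = [0, 50, 100, 150, 200, 300, 500]
category_labels = ['Good', 'Moderate', 'Unhealthy for Sensitive Groups',
                   'Unhealthy', 'Very Unhealthy', 'Hazardous']

# Bins are closed on the right: (0, 50], (50, 100], ...;
# include_lowest=True makes the first bin [0, 50] so that 0 is labelled 'Good'
categories = pd.cut(aqi_values, bins=bin_edges, labels=category_labels,
                    include_lowest=True)
print(categories)
```

One design difference: a fractional value such as 50.5 falls into 'Moderate' here, whereas `categorize_aqi` returns `None` for it because its integer ranges leave gaps.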

Calculate AQI for each row

In [None]:
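# Skip the derived columns so this cell can be re-run after 'AQI' has been added
pollutant_columns = [col for col in data.columns[1:] if col not in ('AQI', 'AQI Category')]
data['AQI'] = data.apply(lambda row: max(calculate_aqi(col, row[col]) for col in pollutant_columns), axis=1)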
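
The row-wise calculation takes the worst (highest) per-pollutant value as the overall AQI. The sketch below reproduces that idea on a two-row synthetic frame so the intermediate per-pollutant values are visible. The column names and concentrations are made up, and `bucket_aqi` is a hypothetical stand-in for `calculate_aqi`.

```python
import pandas as pd

breakpoints = [
    (0, 12.0, 50), (12.1, 35.4, 100), (35.5, 55.4, 150),
    (55.5, 150.4, 200), (150.5, 250.4, 300), (250.5, 350.4, 400),
    (350.5, 500.4, 500),
]

def bucket_aqi(concentration):
    """Return the AQI assigned to the range containing the concentration, else -1."""
    for low, high, aqi in breakpoints:
        if low <= concentration <= high:
            return aqi
    return -1

# Synthetic readings (assumed values, in µg/m³)
sample = pd.DataFrame({
    'pm2_5': [10.0, 160.0],
    'pm10':  [40.0, 300.0],
    'co':    [600.0, 20.0],   # 600 is above the highest breakpoint
})

# Per-pollutant AQI for every cell, then the row-wise maximum
sub_indices = sample.apply(lambda column: column.map(bucket_aqi))
overall_aqi = sub_indices.max(axis=1)
dominant = sub_indices.idxmax(axis=1)   # pollutant responsible for each row's AQI

print(sub_indices)
print(overall_aqi.tolist())
print(dominant.tolist())
```

Note that a concentration above the highest breakpoint yields -1, so `max` silently ignores that pollutant instead of flagging it as the worst one.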


Categorize AQI

In [None]:
data['AQI Category'] = data['AQI'].apply(categorize_aqi)


Plot AQI over time

In [None]:
fig_aqi_time = px.bar(data, x="date", y="AQI",
                      title="AQI of Delhi in January 2023",
                      labels={"date": "Date", "AQI": "AQI"},
                      color="AQI Category",
                      color_discrete_sequence=px.colors.qualitative.Set1)
fig_aqi_time.show()

Plot pollutant concentrations

In [None]:
# Define color sequence for pollutants
pollutant_colors = px.colors.sequential.Pinkyl

# Sum each pollutant's concentration over the month; the pie chart shows each pollutant's share of that total
concentration_data = data.drop(columns=['date', 'AQI', 'AQI Category']).sum().reset_index()
concentration_data.columns = ['Pollutant', 'Concentration']

fig_pollutants = px.pie(concentration_data,
                        values='Concentration',
                        names='Pollutant',
                        title='Pollutant Concentrations in Delhi (Jan 2023)',
                        color='Pollutant',
                        color_discrete_sequence=pollutant_colors)

# Update layout to name the legend
fig_pollutants.update_layout(legend_title_text='Pollutant')

fig_pollutants.show()

Hourly average AQI trends

In [None]:
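# hourly_avg_aqi is not defined in the cells shown; it is computed here
# as the mean AQI for each hour of the day (0-23)
hourly_avg_aqi = data.groupby(data['date'].dt.hour)['AQI'].mean().rename_axis('Hour').reset_index()

# Create line plot for hourly average AQI trends
fig_hourly_avg_aqi = px.line(hourly_avg_aqi, x='Hour', y='AQI',
                             title='Hourly Average AQI Trends in Delhi (Jan 2023)',
                             labels={"Hour": "Hour of the Day", "AQI": "Average AQI"})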

# Update layout to change colors
fig_hourly_avg_aqi.update_layout(
    plot_bgcolor='lavenderblush',  # Set background color to light pink
    xaxis=dict(linecolor='black'),  # Set x-axis line color to black
    yaxis=dict(linecolor='black'),  # Set y-axis line color to black
    xaxis_title_text='Hour of the Day',  # Update x-axis title
    yaxis_title_text='Average AQI',  # Update y-axis title
    font=dict(color='black')  # Set font color to black
)

# Update line color
fig_hourly_avg_aqi.update_traces(line=dict(color='crimson'))

fig_hourly_avg_aqi.show()


Correlation between pollutants

In [None]:
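# Compute the correlation matrix once, using numeric columns only
# (data.corr() without numeric_only=True fails on the 'AQI Category' strings in pandas 2.x)
corr_matrix = data.corr(numeric_only=True)

fig_corr = px.imshow(corr_matrix,
                     x=corr_matrix.columns,
                     y=corr_matrix.columns,
                     title='Correlation Between Pollutants')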
fig_corr.show()
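
A heatmap shows the overall pattern, but a ranked list of pairs can make the strongest relationships easier to read. The sketch below is one way to produce such a list; it uses a small synthetic frame with assumed column names and values, so the numbers say nothing about the Delhi data.

```python
import numpy as np
import pandas as pd

# Synthetic concentrations (assumed values) with obvious relationships
sample = pd.DataFrame({
    'pm2_5': [10.0, 20.0, 30.0, 40.0, 50.0],
    'pm10':  [15.0, 31.0, 44.0, 61.0, 75.0],   # rises with pm2_5
    'o3':    [50.0, 41.0, 30.0, 19.0, 10.0],   # falls as pm2_5 rises
})

corr = sample.corr()

# Keep only the upper triangle (k=1 drops the diagonal) so each pair appears once
upper_mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
upper = corr.where(upper_mask)

# Flatten to (pollutant_a, pollutant_b) -> correlation, strongest first
pairs = upper.stack().dropna().sort_values(key=lambda s: s.abs(), ascending=False)
print(pairs)
```

Applying the same steps to `data.corr(numeric_only=True)` would rank the pollutant pairs in the actual dataset.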







Average AQI by day of the week

In [None]:
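# average_aqi_by_day is not defined in the cells shown; it is computed here
# as the mean AQI per weekday, indexed by day name in calendar order
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
average_aqi_by_day = (data.groupby(data['date'].dt.day_name().rename('Day_of_Week'))['AQI']
                      .mean().reindex(day_order).to_frame())

fig_avg_aqi_by_day = px.bar(average_aqi_by_day, x=average_aqi_by_day.index, y='AQI',
                            title='Average AQI by Day of the Week',
                            labels={"Day_of_Week": "Day of the Week", "AQI": "Average AQI"},
                            color_discrete_sequence=['teal'])
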
fig_avg_aqi_by_day.update_layout(
    plot_bgcolor='powderblue'
)

fig_avg_aqi_by_day.show()