# Laboratory exercise 4

## Warm-Up Mode (2 points)

**Task Description**  
Using the given dataset, develop and implement **3** different neural networks to predict the **air quality level**. Each network should differ in the following ways:  

- **layer configurations** - use different numbers and types of layers;
- **activation functions** - try different activation functions;
- **neurons per layer** - experiment with different numbers of neurons in each layer; and
- **number of layers** - build networks with varying depths.

After developing the models, evaluate and compare the performance of all **3** approaches.

**About the Dataset**  
This dataset focuses on air quality assessment across various regions. It contains 5,000 samples and captures critical environmental and demographic factors that influence pollution levels.

**Features**:  
- **Temperature (°C)**: Average temperature of the region.  
- **Humidity (%)**: Relative humidity recorded in the region.  
- **PM2.5 Concentration (µg/m³)**: Levels of fine particulate matter.  
- **PM10 Concentration (µg/m³)**: Levels of coarse particulate matter.  
- **NO2 Concentration (ppb)**: Nitrogen dioxide levels.  
- **SO2 Concentration (ppb)**: Sulfur dioxide levels.  
- **CO Concentration (ppm)**: Carbon monoxide levels.  
- **Proximity to Industrial Areas (km)**: Distance to the nearest industrial zone.  
- **Population Density (people/km²)**: Number of people per square kilometer in the region.  

**Target Variable**: **Air Quality**  
- **Good**: Clean air with low pollution levels.  
- **Moderate**: Acceptable air quality but with some pollutants present.  
- **Poor**: Noticeable pollution that may cause health issues for sensitive groups.  
- **Hazardous**: Highly polluted air posing serious health risks to the population.  

In [10]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from keras.layers import Dense, Dropout, Input
from keras.models import Sequential
from sklearn.metrics import classification_report, confusion_matrix, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from xgboost import XGBClassifier


In [3]:
from google.colab import files
uploaded = files.upload()

Saving pollution_dataset.csv to pollution_dataset.csv


In [4]:
data=pd.read_csv('pollution_dataset.csv')

In [6]:
data.head()

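| Unnamed: 0 | Temperature | Humidity | PM2.5 | PM10 | NO2 | SO2 | CO | Proximity_to_Industrial_Areas | Population_Density | Air Quality |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 29.8 | 59.1 | 5.2 | 17.9 | 18.9 | 9.2 | 1.72 | 6.3 | 319 | Moderate |
| 1 | 28.3 | 75.6 | 2.3 | 12.2 | 30.8 | 9.7 | 1.64 | 6.0 | 611 | Moderate |
| 2 | 23.1 | 74.7 | 26.7 | 33.8 | 24.4 | 12.6 | 1.63 | 5.2 | 619 | Moderate |
| 3 | 27.1 | 39.1 | 6.1 | 6.3 | 13.5 | 5.3 | 1.15 | 11.1 | 551 | Good |
| 4 | 26.5 | 70.7 | 6.9 | 16.0 | 21.9 | 5.6 | 1.01 | 12.7 | 303 | Good |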


In [7]:
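# Encode the target as ordered integers: Hazardous=0, Poor=1, Moderate=2, Good=3.
encoder = OrdinalEncoder(categories=[["Hazardous", "Poor", "Moderate", "Good"]])
# fit_transform returns a float array of shape (n, 1); flatten it and cast to integer class indices.
data['Air Quality'] = encoder.fit_transform(data[['Air Quality']]).ravel().astype(int)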
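The encoding step assigns each class the position it has in the `categories` list. As a minimal plain-Python sketch of that mapping (the helper names `encode` and `decode` are hypothetical and not part of scikit-learn), including the reverse lookup that is handy when reading predictions:

```python
# Category order copied from the OrdinalEncoder call above:
# the position in the list becomes the integer code.
CATEGORIES = ["Hazardous", "Poor", "Moderate", "Good"]

LABEL_TO_CODE = {label: code for code, label in enumerate(CATEGORIES)}
CODE_TO_LABEL = {code: label for label, code in LABEL_TO_CODE.items()}


def encode(labels):
    """Map label strings to integer codes (hypothetical helper)."""
    return [LABEL_TO_CODE[label] for label in labels]


def decode(codes):
    """Map integer (or float) codes back to label strings (hypothetical helper)."""
    return [CODE_TO_LABEL[int(code)] for code in codes]


sample = ["Moderate", "Good", "Hazardous"]  # toy input for illustration
codes = encode(sample)
print(codes)          # [2, 3, 0]
print(decode(codes))  # ['Moderate', 'Good', 'Hazardous']
```

With the fitted scikit-learn encoder, `encoder.inverse_transform` performs the same reverse lookup on a 2D array of codes.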

In [8]:
x = data.drop(columns=['Air Quality'])
y = data['Air Quality']

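In [11]:
# Split first so that the test set stays unseen during preprocessing.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

In [12]:
# Fit the scaler on the training data only, then apply the same transform to the test data.
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)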
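Standardization rescales each feature to zero mean and unit variance. The NumPy sketch below shows the underlying computation, with the statistics taken from the training rows only so that no information from the test rows enters preprocessing. The function names and the random toy matrix are illustrative assumptions, not part of the notebook.

```python
import numpy as np


def fit_standardizer(x_train):
    """Return per-feature mean and standard deviation of the training rows."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0)           # population std (ddof=0), as StandardScaler uses
    std = np.where(std == 0, 1.0, std)  # avoid division by zero for constant features
    return mean, std


def apply_standardizer(x, mean, std):
    """Apply previously fitted statistics to any set of rows."""
    return (x - mean) / std


# Toy data standing in for the feature matrix (assumption: 100 rows, 3 features).
rng = np.random.default_rng(0)
x_toy = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
x_tr, x_te = x_toy[:80], x_toy[80:]

mean, std = fit_standardizer(x_tr)
x_tr_scaled = apply_standardizer(x_tr, mean, std)
x_te_scaled = apply_standardizer(x_te, mean, std)  # reuses the training statistics

print(x_tr_scaled.mean(axis=0).round(6))
print(x_tr_scaled.std(axis=0).round(6))
```

The scaled test rows will generally not have exactly zero mean or unit variance, which is expected: they are measured against the training distribution.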

In [23]:
model1=Sequential([
    Dense(32, kernel_initializer="uniform", activation="relu"),
    Dense(16, kernel_initializer="uniform", activation="relu"),
    Dense(4, kernel_initializer="uniform", activation="softmax")
])

In [25]:
model1.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)
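The `sparse_categorical_crossentropy` loss takes integer class indices as targets and the softmax probabilities as predictions. For each sample it is the negative log of the probability assigned to the true class, averaged over the batch. A minimal NumPy sketch of that formula, using made-up probabilities for two samples (an illustration of the definition, not Keras's implementation):

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean negative log-probability of the true class.

    y_true: integer class indices, shape (n,)
    probs:  predicted class probabilities, shape (n, n_classes)
    """
    y_true = np.asarray(y_true, dtype=int)
    probs = np.clip(probs, eps, 1.0)                 # guard against log(0)
    picked = probs[np.arange(len(y_true)), y_true]   # probability of the true class
    return float(-np.log(picked).mean())


# Toy predictions over the 4 classes (Hazardous=0, Poor=1, Moderate=2, Good=3).
y_toy = np.array([0, 3])
p_toy = np.array([
    [0.70, 0.10, 0.10, 0.10],   # fairly confident and correct
    [0.25, 0.25, 0.25, 0.25],   # uniform guess
])

loss = sparse_categorical_crossentropy(y_toy, p_toy)
print(f"{loss:.4f}")  # 0.8715
```

A uniform guess over four classes costs ln(4) ≈ 1.386 per sample, which is a useful reference point when reading the loss values in the training logs.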

In [27]:
history1=model1.fit(x_train, y_train, validation_split=0.1, epochs=20, batch_size=32)

Epoch 1/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9364 - loss: 0.1562 - val_accuracy: 0.9375 - val_loss: 0.1595
Epoch 2/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9432 - loss: 0.1535 - val_accuracy: 0.9375 - val_loss: 0.1637
Epoch 3/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9454 - loss: 0.1450 - val_accuracy: 0.9350 - val_loss: 0.1620
Epoch 4/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9395 - loss: 0.1499 - val_accuracy: 0.9375 - val_loss: 0.1577
Epoch 5/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9473 - loss: 0.1346 - val_accuracy: 0.9375 - val_loss: 0.1617
Epoch 6/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9482 - loss: 0.1333 - val_accuracy: 0.9325 - val_loss: 0.1563
Epoch 7/20
113/113
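The `113/113` step counter in the log follows from the dataset size and the split settings used above. A short arithmetic sketch, assuming the 5,000 samples stated in the dataset description and no rows dropped before training:

```python
import math

n_samples = 5000         # dataset size stated in the description
test_size = 0.2          # train_test_split(..., test_size=0.2)
validation_split = 0.1   # model.fit(..., validation_split=0.1)
batch_size = 32

n_test = round(n_samples * test_size)       # rows held out for the final evaluation
n_train = n_samples - n_test                # rows passed to fit()
n_val = round(n_train * validation_split)   # rows Keras sets aside for validation
n_fit = n_train - n_val                     # rows actually used for weight updates

steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_test, n_train, n_val, n_fit)  # 1000 4000 400 3600
print(steps_per_epoch)                # 113
```

Because the validation set has 400 rows, each validation sample moves `val_accuracy` by 0.0025, which explains the coarse steps between the reported values.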

In [33]:
model2=Sequential([
    Dense(64, kernel_initializer="uniform", activation="tanh"),
    Dense(32, kernel_initializer="uniform", activation="relu"),
    Dense(16, kernel_initializer="uniform", activation="relu"),
    Dense(4, kernel_initializer="uniform", activation="softmax")
])

In [34]:
model2.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)

In [35]:
history2=model2.fit(x_train, y_train, validation_split=0.1, epochs=20, batch_size=32)

Epoch 1/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.4368 - loss: 1.2421 - val_accuracy: 0.6350 - val_loss: 0.6957
Epoch 2/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.6835 - loss: 0.6466 - val_accuracy: 0.9050 - val_loss: 0.4592
Epoch 3/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8939 - loss: 0.3562 - val_accuracy: 0.9300 - val_loss: 0.2062
Epoch 4/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9320 - loss: 0.1776 - val_accuracy: 0.9125 - val_loss: 0.2123
Epoch 5/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9240 - loss: 0.1837 - val_accuracy: 0.9350 - val_loss: 0.1722
Epoch 6/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9376 - loss: 0.1625 - val_accuracy: 0.9375 - val_loss: 0.1669
Epoch 7/20
113/113
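The validation columns of a log like this one are the usual basis for choosing which epoch to keep. The sketch below applies that idea to the six epochs visible above for model 2 (the remaining epochs are cut off in the output, so they are not included). The helper `best_epoch` is a hypothetical name; in the notebook the same lists are available as `history2.history["val_loss"]` and `history2.history["val_accuracy"]`.

```python
def best_epoch(values, mode="min"):
    """Return (1-based epoch, value) of the best entry in a metric history."""
    pick = min if mode == "min" else max
    index = pick(range(len(values)), key=lambda i: values[i])
    return index + 1, values[index]


# Values copied from the first six epochs of the model 2 log.
val_loss = [0.6957, 0.4592, 0.2062, 0.2123, 0.1722, 0.1669]
val_accuracy = [0.6350, 0.9050, 0.9300, 0.9125, 0.9350, 0.9375]

print(best_epoch(val_loss, mode="min"))      # (6, 0.1669)
print(best_epoch(val_accuracy, mode="max"))  # (6, 0.9375)
```

Both metrics are still improving at epoch 6, so for this model the visible part of the log gives no sign yet that training should stop early.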

In [36]:
model3=Sequential([
    Dense(128, kernel_initializer="uniform", activation="relu"),
    Dense(64, kernel_initializer="uniform", activation="relu"),
    Dense(32, kernel_initializer="uniform", activation="relu"),
    Dense(16, kernel_initializer="uniform", activation="relu"),
    Dense(4, kernel_initializer="uniform", activation="softmax")
])
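One way to make the difference between the three architectures concrete is to count their trainable parameters. A `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below applies this to the three layer stacks, assuming 9 input features as in the feature list of the dataset description; if an extra column such as a saved index is also fed to the network, the first layer grows accordingly.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Total weights and biases of a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix + bias vector
        fan_in = units
    return total


N_FEATURES = 9  # assumption: the 9 features listed in the dataset description

architectures = {
    "model1": [32, 16, 4],
    "model2": [64, 32, 16, 4],
    "model3": [128, 64, 32, 16, 4],
}

param_counts = {
    name: dense_param_count(N_FEATURES, sizes)
    for name, sizes in architectures.items()
}
for name, count in param_counts.items():
    print(f"{name}: {count} parameters")
# model1: 916 parameters
# model2: 3316 parameters
# model3: 12212 parameters
```

Once a Keras model has been built (for example after the first call to `fit`), `model.summary()` reports the same kind of count per layer.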

In [37]:
model3.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)

In [40]:
history3=model3.fit(x_train, y_train, validation_split=0.1, epochs=20, batch_size=32)

Epoch 1/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9701 - loss: 0.0840 - val_accuracy: 0.9400 - val_loss: 0.1788
Epoch 2/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9760 - loss: 0.0669 - val_accuracy: 0.9400 - val_loss: 0.1634
Epoch 3/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9757 - loss: 0.0624 - val_accuracy: 0.9275 - val_loss: 0.1932
Epoch 4/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9735 - loss: 0.0802 - val_accuracy: 0.9375 - val_loss: 0.1748
Epoch 5/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9706 - loss: 0.0766 - val_accuracy: 0.9350 - val_loss: 0.1925
Epoch 6/20
113/113 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9708 - loss: 0.0739 - val_accuracy: 0.9350 - val_loss: 0.1766
Epoch 7/20
113/113

In [47]:
print("Model 1: ")
model_1_eval = model1.evaluate(x_test, y_test, verbose=0)
print(f"Accuracy: {model_1_eval[1]:.4f}")

print('----------------------')

print("Model 2: ")
model_2_eval = model2.evaluate(x_test, y_test, verbose=0)
print(f"Accuracy: {model_2_eval[1]:.4f}")

print('----------------------')

print("Model 3: ")
model_3_eval = model3.evaluate(x_test, y_test, verbose=0)
print(f"Accuracy: {model_3_eval[1]:.4f}")

Model 1: 
Accuracy: 0.9420
----------------------
Model 2: 
Accuracy: 0.9450
----------------------
Model 3: 
Accuracy: 0.9440
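The three test accuracies differ by at most 0.003, which on a test set of 1,000 samples (20% of 5,000) corresponds to 3 samples. A rough way to judge whether such a gap is meaningful is the binomial standard error of an accuracy estimate, sqrt(p(1 - p) / n). The sketch below is an approximation under the assumption of independent test samples; since all three models are scored on the same test set, a paired comparison would be more precise.

```python
import math

n_test = 1000  # assumption: 20% of the 5,000 samples stated in the description

# Test accuracies copied from the output above.
accuracies = {"Model 1": 0.9420, "Model 2": 0.9450, "Model 3": 0.9440}


def accuracy_standard_error(acc, n):
    """Binomial standard error of an accuracy measured on n samples."""
    return math.sqrt(acc * (1.0 - acc) / n)


for name, acc in accuracies.items():
    se = accuracy_standard_error(acc, n_test)
    correct = round(acc * n_test)
    print(f"{name}: {correct}/{n_test} correct, accuracy {acc:.4f} +/- {1.96 * se:.4f}")

spread = max(accuracies.values()) - min(accuracies.values())
typical_se = accuracy_standard_error(0.9440, n_test)
print(f"spread between models: {spread:.4f}")
print(f"standard error of one estimate: {typical_se:.4f}")
```

The spread between the models is smaller than the standard error of a single estimate, so on this evidence the three architectures perform about equally well and the ranking could change with a different random split.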
