# Air Quality and Meteorological Conditions in Beijing

**The data used in this notebook was downloaded from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/dataset/501/beijing+multi+site+air+quality+data). The data consists of hourly data of six main air pollutants and six relevant meteorological variables at multiple sites in Beijing.**

In [1]:
#Importing python packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import yaml
from functions import wspm_conversion
#import functions as f

In [10]:
try:
    with open("../config.yaml", "r") as file:
        config = yaml.safe_load(file)
except FileNotFoundError:
    #Only catch the missing-file case so other errors are not hidden
    print("The configuration file was not found!")

In [14]:
config['data']['raw']['file1']

'.../data/Raw/PRSA_Data_Changping_20130301-20170228.csv'

In [None]:
#Provide a description for each column
data = {
    'Column':['year', 'month', 'day', 'hour', 'pm2.5', 'pm10', 'so2', 'no2', 'co', 'o3', 'temp', 'pres', 'dewp', 'rain', 'wd', 'wspm', 'station'],
    'Description': ['Year', 'Month', 'Day', 'Hour', 'PM2.5 concentration (ug/m^3)', 'PM10 concentration (ug/m^3)', 'Sulfur Dioxide concentration (ug/m^3)', 'Nitrogen Dioxide concentration (ug/m^3)', 'Carbon Monoxide concentration (ug/m^3)', 'Ozone concentration (ug/m^3)', 'Temperature (degree Celsius)', 'Pressure (hPa)', 'Dew Point Temperature (degree Celsius)', 'Precipitation (mm)', 'Wind Direction', 'Wind Speed (m/s)', 'Name of the Air-Quality Monitoring Site']
}

datainfo = pd.DataFrame(data)
datainfo

# 1. Data Cleaning

In [None]:
#Loading the data
#df_changping = pd.read_csv(config['data']['raw']['file1'])
df_changping = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Changping_20130301-20170228.csv")
df_dingling = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Dingling_20130301-20170228.csv")
df_dongsi = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Dongsi_20130301-20170228.csv")
df_guanyuan = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Guanyuan_20130301-20170228.csv")
df_gucheng = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Gucheng_20130301-20170228.csv")
df_huairou = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Huairou_20130301-20170228.csv")
df_nongzhanguan = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Nongzhanguan_20130301-20170228.csv")
df_shunyi = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Shunyi_20130301-20170228.csv")
df_tiantan = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Tiantan_20130301-20170228.csv")
df_wanliu = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Wanliu_20130301-20170228.csv")
df_wanshouxigong = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Wanshouxigong_20130301-20170228.csv")
df_aotizhongxin = pd.read_csv("/Users/ryanvandenakker/Documents/Ironhack/Week 3/Project1/data/Raw/PRSA_Data_Aotizhongxin_20130301-20170228.csv")
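
The twelve `read_csv` calls above differ only in the station name, so they could be collapsed into a loop. The following is a minimal sketch, assuming all raw files sit in one directory and follow the `PRSA_Data_<station>_...csv` naming seen above. `load_station_files`, `raw_dir`, and `pattern` are hypothetical names, not part of the notebook.

```python
from pathlib import Path

import pandas as pd


def load_station_files(raw_dir, pattern="PRSA_Data_*.csv"):
    """Read every CSV in raw_dir that matches pattern and stack them into one DataFrame."""
    paths = sorted(Path(raw_dir).glob(pattern))
    if not paths:
        # Fail loudly instead of returning an empty frame
        raise FileNotFoundError(f"No files matching {pattern!r} in {raw_dir}")
    frames = [pd.read_csv(path) for path in paths]
    # ignore_index=True gives a fresh 0..n-1 index, like reset_index(drop=True)
    return pd.concat(frames, ignore_index=True)
```

Calling `load_station_files(raw_dir)` with the folder that holds the raw files would return one stacked frame, so adding or removing a station file needs no code change.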

In [None]:
#Concatenate the dataframes into one
df = pd.concat([df_changping, df_dingling, df_dongsi, df_guanyuan, df_gucheng, df_huairou, df_nongzhanguan, df_shunyi, df_tiantan, df_wanliu, df_wanshouxigong, df_aotizhongxin]).reset_index(drop = True)
df.head()

In [None]:
#Checking the shape of the dataframe. It has 420,768 rows and 18 columns. 
df.shape

In [None]:
df.duplicated().sum() #Counting the duplicated rows in the data

In [None]:
df.dtypes #Checking the data types of the columns

In [None]:
df.info() #Overview information of the dataframe

In [None]:
#renaming all columns to be lowercase and removing white spaces
clean_df = df.copy()
clean_df = clean_df.rename(columns = lambda x: x.strip().lower())
clean_df.head()

- **The 'no' column is only a row counter and adds no value to our dataset.**
- **I will go ahead and remove this column first.**

In [None]:
#removing 'no' column
clean_df = clean_df.drop(['no'], axis = 1)
clean_df

In [None]:
#Checking for null values
print(clean_df.isna().any())
print("")
print("The columns with null values are:", clean_df.columns[clean_df.isna().any()].tolist())
print("")

In [None]:
#Observe which rows in the 'pm2.5' column have null values
clean_df[clean_df['pm2.5'].isna()]

- **Because the dataset contains hourly measurements of all variables, I believe the most appropriate approach to handling missing values is the forward-fill method.**
- **Additionally, I have decided to fill the missing values before any other form of data cleaning, as I want to preserve the integrity of the dataset before proceeding with the forward-fill method.**
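
One caveat with forward-filling a concatenated frame is that the last value of one station can be carried into the first rows of the next station. A hedged sketch of filling within each station instead is shown here on a made-up toy frame (the values are invented for illustration and are not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Toy data: station B starts with a missing value
toy = pd.DataFrame({
    "station": ["A", "A", "A", "B", "B"],
    "pm2.5": [10.0, np.nan, 12.0, np.nan, 30.0],
    "pm10": [20.0, 21.0, np.nan, np.nan, 50.0],
})

value_cols = ["pm2.5", "pm10"]
filled = toy.copy()
# Forward-fill inside each station so values never cross a station boundary
filled[value_cols] = toy.groupby("station")[value_cols].ffill()
```

With this variant, the first row of station B stays missing rather than inheriting station A's last reading, so any leading gaps would still need separate handling.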

In [None]:
#Fill missing values using the forward-fill method
clean_df = clean_df.ffill()
clean_df

In [None]:
#Check for remaining null values in data
remaining_null = clean_df.isna().sum()
remaining_null

In [None]:
#checking for unique values in 'wd' column
clean_df['wd'].unique()

- **After observing the unique values for 'wd', I have decided to drop the rows with intermediate wind directions (such as 'NNE' or 'WSW') and keep only the eight more traditional wind directions.**

In [None]:
#Dropping certain wind direction variables
drop_wd_variable = ['ENE', 'NNE', 'NNW', 'SSW', 'WSW', 'ESE', 'WNW', 'SSE']
variables_to_drop = clean_df[clean_df['wd'].isin(drop_wd_variable)].index
clean_df = clean_df.drop(variables_to_drop).reset_index(drop = True)
clean_df

In [None]:
clean_df['wd'].unique()

- **I will condense the 'year', 'month', and 'day' column into one to produce a traditional date format.**

In [None]:
#Create a 'date' column
clean_df['date'] = pd.to_datetime(clean_df[['year', 'month', 'day']])
#Extract the 'date' column
date_column = clean_df.pop('date')
#Insert the 'date' column to index 3
clean_df.insert(3, 'date', date_column)
clean_df
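
If an hourly timestamp is ever needed alongside the date, `pd.to_datetime` also accepts an `hour` column when assembling dates from a frame. A small sketch on a made-up two-row frame (the `timestamp` column name is an assumption, not part of the notebook):

```python
import pandas as pd

# Toy rows using the same column names as the dataset
toy = pd.DataFrame({
    "year": [2013, 2013],
    "month": [3, 3],
    "day": [1, 1],
    "hour": [0, 5],
})

# year/month/day are required; hour is an optional extra component
toy["timestamp"] = pd.to_datetime(toy[["year", "month", "day", "hour"]])
```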

- **When observing the units for the different variables, I noticed that wind speed is given in m/s. I have decided to convert it from m/s to km/h, as this unit is more commonly used.**

In [None]:
#The conversion function is imported from functions.py and is defined as:
#def wspm_conversion(m):
#    km = m * 3.6
#    return km
#Create a copy of the original dataset
df_clean = clean_df.copy()
#Use the apply method to convert all rows for wind speed from m/s to km/h
df_clean['wspm'] = clean_df['wspm'].apply(wspm_conversion)
df_clean.head()

- **I also noticed that the unit for pressure is hPa. However, the standard atmosphere (atm) is more commonly used for atmospheric pressure. Note that atm is not an SI unit; the SI unit of pressure is the pascal.**

In [None]:
#Define a function to convert unit for pressure
def pressure_conversion(p):
    atm = p * 0.000987
    return atm
#Use the apply method to convert all rows for pressure
df_clean['pres'] = df_clean['pres'].apply(pressure_conversion)
df_clean.head()
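
Both unit conversions are simple multiplications, so they can also be applied to whole columns at once instead of row by row with `apply`. The sketch below uses a made-up toy frame and the exact definition 1 atm = 1013.25 hPa, of which the factor 0.000987 used above is a rounding. The constant names are assumptions for illustration.

```python
import pandas as pd

MS_TO_KMH = 3.6        # 1 m/s = 3.6 km/h
HPA_PER_ATM = 1013.25  # 1 atm = 1013.25 hPa by definition

# Toy values in the original units (m/s and hPa)
toy = pd.DataFrame({
    "wspm": [0.0, 1.0, 2.5],
    "pres": [1013.25, 1000.0, 1020.0],
})

# Vectorised arithmetic converts every row in one step
converted = toy.assign(
    wspm=toy["wspm"] * MS_TO_KMH,
    pres=toy["pres"] / HPA_PER_ATM,
)
```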

- **I have decided to group the months of the year into seasons in order to analyze whether pollution levels differ between seasons.**
- **Beijing has 4 seasons: Winter, Spring, Summer, and Autumn.**

In [None]:
#Defining a dictionary to be used for mapping
group = {1:'Winter', 2:'Winter', 3:'Spring', 4:'Spring', 
         5:'Spring', 6:'Summer', 7:'Summer', 8:'Summer', 
         9:'Autumn', 10:'Autumn', 11:'Autumn', 12:'Winter'}
#Using map() method
df_clean['season'] = df_clean['month'].map(group)
df_clean.head()
#Extract the 'season' column and adding to index 5
season_column = df_clean.pop('season')
df_clean.insert(5, 'season', season_column)
df_clean

In [None]:
df_clean['hour'].unique()

- **I also wanted to analyze whether pollution levels differ based on the time of day. As there are 24 hours in a day, I decided the best way to proceed would be to divide the day into four six-hour blocks: 'Night', 'Morning', 'Afternoon', and 'Evening'.**

In [None]:
#grouping 'hour' into time of day
hour_grouped = {0:'Night', 1:'Night', 2:'Night', 3:'Night', 
                4:'Night', 5:'Night', 6:'Morning', 7:'Morning', 
                8:'Morning', 9:'Morning', 10:'Morning', 11:'Morning',
                12:'Afternoon', 13:'Afternoon', 14:'Afternoon', 15:'Afternoon',
                16:'Afternoon', 17:'Afternoon', 18:'Evening', 19:'Evening',
                20:'Evening', 21:'Evening', 22:'Evening', 23:'Evening'}
#Using map()
df_clean['time'] = df_clean['hour'].map(hour_grouped)
df_clean
#Extract the 'time' column and adding it to index 4
time_column = df_clean.pop('time')
df_clean.insert(4, 'time', time_column)
df_clean
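
The same six-hour grouping can be expressed with `pd.cut`, which avoids spelling out all 24 dictionary entries. A minimal sketch on a handful of sample hours, assuming the same block boundaries as the dictionary above:

```python
import pandas as pd

# A few sample hours, including the edges of each block
hours = pd.Series([0, 5, 6, 11, 12, 17, 18, 23])

# right=False makes each bin closed on the left: [0, 6), [6, 12), [12, 18), [18, 24)
time_of_day = pd.cut(
    hours,
    bins=[0, 6, 12, 18, 24],
    right=False,
    labels=["Night", "Morning", "Afternoon", "Evening"],
)
```

Because the result is an ordered categorical, later group-bys and plots should keep the chronological order (Night, Morning, Afternoon, Evening) rather than sorting the labels alphabetically.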

- **To finalize the data cleaning process and derive a final DataFrame, I will drop the remaining columns that will not provide any value to the analysis.**

In [None]:
#As the final dataframe, drop columns 'day' and 'hour' (drop returns a new DataFrame)
final_df = df_clean.drop(['day', 'hour'], axis=1)
display(final_df.head())
display(final_df.tail())
print("")

#Display the number of rows and columns in the final dataset
num_rows = final_df.shape[0]
num_cols = final_df.shape[1]
print(f"The final DataFrame has {num_rows} rows and {num_cols} columns")
print("")

In [None]:
final_df.info() #Summary information about the final dataframe

In [None]:
final_df.dtypes #Summary of datatypes of each column of final dataframe

In [None]:
#Save the cleaned dataframe to a new csv file
#index=False avoids writing the row index as an extra unnamed column
final_df.to_csv('cleaned_data.csv', index=False)

# Exploratory Data Analysis

- **Most air quality indices rely on PM2.5 and PM10 to assess air quality. As such, I have chosen to focus on these two variables for my analysis and will drop the other four pollutants from the dataset.**

In [None]:
#Dropping 'no2', 'o3', 'so2', and 'co'
df_final = final_df.copy()
df_final.drop(['no2', 'o3', 'so2', 'co'], axis = 1, inplace = True)
df_final

### Question: Which District in Beijing Recorded the Highest Average PM2.5 and PM10 Levels?

In [None]:
#Which station reported the highest average air pollutant
avg_pollutant = df_final.groupby('station')[['pm2.5', 'pm10']].mean()
display(avg_pollutant)

PM2.5 refers to atmospheric aerosol particles with a maximum diameter of 2.5 micrometers. This type of suspended particulate matter is considered the most dangerous to human health because it is fine enough to penetrate directly into the bloodstream.

According to the [Common Air Quality Index](https://airly.org/en/air-quality-index-caqi-and-aqi-methods-of-calculation/):
<br>
**PM 2.5**
<br>
***Low*** *(Green)*: 0 - 29 µg/m³
<br>
***Medium*** *(Yellow)*: 30 - 54 µg/m³
<br>
***High*** *(Orange)*: 55 - 110 µg/m³
<br>
***Very High*** *(Red)*: > 110 µg/m³
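
The bands above can be turned into a small lookup helper. This is only a sketch: `caqi_band` and the limit tuples are hypothetical names, and because the index lists whole-number ranges, it assumes that fractional values such as 29.5 fall into the lower band. The PM10 limits are the CAQI values 50, 90, and 180 µg/m³.

```python
def caqi_band(value, medium_from, high_from, very_high_above):
    """Return the CAQI band name for a concentration in µg/m³."""
    if value > very_high_above:
        return "Very High"
    if value >= high_from:
        return "High"
    if value >= medium_from:
        return "Medium"
    return "Low"


# (start of Medium, start of High, upper limit of High)
PM25_LIMITS = (30, 55, 110)
PM10_LIMITS = (50, 90, 180)

print(caqi_band(80, *PM25_LIMITS))  # a PM2.5 average of 80 falls in the High band
```

Keeping the limits in one place means the same function can serve both pollutants, and the band boundaries only need to be checked against the index once.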

In [None]:
#Average pm2.5 levels by station
avg_pm25_values = avg_pollutant['pm2.5']  # Average pm2.5 values
station_names = avg_pm25_values.index

# Define color mapping function for pm2.5
def get_pm25_color(value):
    #Thresholds follow the CAQI bands for PM2.5 listed above
    if value > 110:
        return 'red'
    if value >= 55:
        return 'orange'
    if value >= 30:
        return 'yellow'
    return 'green'

# Plotting bar chart for pm2.5 pollutant
plt.figure(figsize=(10, 6))
colors = [get_pm25_color(value) for value in avg_pm25_values]
plt.bar(station_names, avg_pm25_values, color=colors)
plt.title('Average PM2.5 Levels by Station')
plt.xlabel('Station')
plt.ylabel('Average PM2.5 (µg/m³)')
plt.xticks(rotation=90)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

PM10 is a mixture of particles suspended in the air that do not exceed 10 micrometers in diameter. It is harmful because it can contain benzopyrenes, furans, dioxins, and carcinogenic heavy metals.

According to the [Common Air Quality Index](https://airly.org/en/air-quality-index-caqi-and-aqi-methods-of-calculation/):
<br>
**PM 10**
<br>
***Low*** *(Green)*: 0 - 49 µg/m³
<br>
***Medium*** *(Yellow)*: 50 - 89 µg/m³
<br>
***High*** *(Orange)*: 90 - 180 µg/m³
<br>
***Very High*** *(Red)*: > 180 µg/m³

In [None]:
#Average pm10 levels by station
avg_pm10_values = avg_pollutant['pm10']
station_names = avg_pm10_values.index

# Define color mapping function for pm10
def get_pm10_color(value):
    #Thresholds follow the CAQI bands for PM10 listed above
    if value > 180:
        return 'red'
    if value >= 90:
        return 'orange'
    if value >= 50:
        return 'yellow'
    return 'green'

plt.figure(figsize=(10, 6))
colors = [get_pm10_color(value) for value in avg_pm10_values]
plt.bar(station_names, avg_pm10_values, color=colors)
plt.title('Average PM10 Levels by Station')
plt.xlabel('Station')
plt.ylabel('Average PM10 (µg/m³)')
plt.xticks(rotation=90)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

- **The bar colors in the plots above correspond to the Common Air Quality Index bands. All districts recorded high levels of PM2.5. All districts also recorded high levels of PM10, except for Dingling, which recorded medium levels of PM10.**
- **Based on the two bar plots above, we can see that among the 12 districts, Gucheng has the highest average levels of both PM2.5 and PM10, followed by Dongsi and Aotizhongxin.**
- **On the other hand, the Dingling district recorded the lowest average levels of both PM2.5 and PM10.**

### Question: What is the Effect of Time on Pollution Levels in Beijing?
Observing how pollutant levels change over time.

Let's start by looking at how the recorded PM2.5 and PM10 levels vary between 2013 and 2017.

In [None]:
#Plot the recorded 'PM2.5' levels over the period 2013 to 2017
plt.figure(figsize=(12, 6))
plt.plot(df_final['date'], df_final['pm2.5'])
plt.ylim(0, 1100)
plt.title('PM2.5 Levels Over Time')
plt.xlabel('Date')
plt.ylabel('PM2.5 Level (µg/m³)')
plt.xticks(rotation = 45)
plt.grid(True)
plt.show()

#Plot the recorded 'PM10' levels over the period 2013 to 2017
plt.figure(figsize=(12,6))
plt.plot(df_final['date'], df_final['pm10'])
plt.ylim(0, 1100)
plt.title('PM10 Levels Over Time')
plt.xlabel('Date')
plt.ylabel('PM10 Level (µg/m³)')
plt.xticks(rotation = 45)
plt.grid(True)
plt.show()

- **We can see that between March 2013 and February 2017, the upper limit of the medium band for both PM2.5 (<55 µg/m³) and PM10 (<90 µg/m³) was exceeded every year. Certain days even recorded hazardous levels surpassing 1000 µg/m³.**
- **We can also already see that the highest levels of PM2.5 and PM10 seem to be recorded at the start and end of every year. We will explore this further later on in the analysis.**
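
The line plots above draw every hourly record from all stations against the date. To look at daily averages instead, the records could first be aggregated per day. A minimal sketch on made-up numbers (the values are invented for illustration), using the same column names as `df_final`:

```python
import pandas as pd

# Toy records: two readings per day
toy = pd.DataFrame({
    "date": pd.to_datetime(["2013-03-01", "2013-03-01", "2013-03-02", "2013-03-02"]),
    "pm2.5": [10.0, 30.0, 100.0, 200.0],
    "pm10": [20.0, 40.0, 150.0, 250.0],
})

# One row per calendar day, averaged over all hours (and stations, if present)
daily_avg = toy.groupby("date")[["pm2.5", "pm10"]].mean()
```

Plotting `daily_avg.index` against `daily_avg['pm2.5']` would then give one point per day, which tends to be easier to read than several hundred thousand overlapping hourly values.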

Now I will check to see which year had the highest levels of PM2.5 and PM10

In [None]:
#Group pm2.5 and pm10 by 'year' and reset index to have 'year' as a column
pm_over_year = df_final.groupby('year')[['pm2.5', 'pm10']].mean().reset_index()
display(pm_over_year)

# Melt the DataFrame to long format for seaborn
pm_over_year_melted = pm_over_year.melt(id_vars='year', var_name='Pollutant', value_name='Average Level')

# Plot using Seaborn
plt.figure(figsize=(15, 8))
sns.barplot(data=pm_over_year_melted, x='year', y='Average Level', hue='Pollutant')
plt.title('Average PM2.5 and PM10 Levels by Year')
plt.xlabel('Year')
plt.ylabel('Average Pollutant Level (µg/m³)')
plt.legend(title='Pollutant', loc = 'upper left')
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

- **We can see from the bar plot above that 2014 produced the highest average levels of PM10 and 2017 produced the highest average levels of PM2.5, while 2016 recorded the lowest levels of both PM2.5 and PM10. Note that the data runs from March 2013 to February 2017, so 2013 and 2017 are partial years and 2017 covers only January and February.**
- **We can also derive from the bar plot that every year recorded dangerous levels of pollutants in the air.**

Let's explore this further and observe how pollutant levels vary per season.

In [None]:
#aggregating pm2.5 and pm10 by season
season_pm = df_final.groupby('season')[['pm2.5', 'pm10']].mean().reset_index()
display(season_pm)

# Melt the DataFrame to long format for seaborn
season_pm_melted = season_pm.melt(id_vars='season', var_name='Pollutant', value_name='Average Level')

#Plot average pollutant levels per season using seaborn
plt.figure(figsize=(20, 7))
sns.barplot(data=season_pm_melted, x='season', y='Average Level', hue='Pollutant')
plt.title('Average PM2.5 and PM10 Levels by Season')
plt.xlabel('Season')
plt.ylabel('Average Pollutant Level (µg/m³)')
plt.legend(title='Pollutant', loc = 'upper left')
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

- **As shown in the plot above, pollutant levels depend strongly on the season, leading to substantial seasonal variation in the average concentrations of PM2.5 and PM10.**
- **The highest average levels of PM2.5 were recorded during winter (Dec, Jan, Feb), followed closely by autumn (Sep, Oct, Nov).**
- **While concentrations of PM2.5 in the spring are relatively lower, the PM10 concentration is highest during spring. This is most likely due to pollination during these months, as pollen is a type of PM10. Additionally, Beijing is regularly hit by sandstorms in the spring, with the smog made worse by rising industrial activity and rapid deforestation.**

Let's look for an explanation for why air pollutant levels are higher during winter periods:

In [None]:
# Grouping by month and calculating the mean temperature
temp_over_year = df_final.groupby('month')['temp'].mean()
display(temp_over_year)

#Plot average temperatures per month using the pandas plotting interface
plt.figure(figsize=(10, 6))
temp_over_year.plot(kind='bar', color='skyblue')
plt.title('Average Temperature by Month')
plt.xlabel('Month')
plt.ylabel('Average Temperature (°C)')
plt.xticks(ticks=range(12), labels=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

- **The winter months in Beijing are the coldest. Therefore, a possible explanation for higher pollutant levels during these months is the increased burning of coal for heating.**
- **Another explanation is that cold air is denser and moves more slowly than warm air. Because of this density, cold air traps the pollution and the particles remain in place for much longer.**
- **In an attempt to improve air quality during these months, the Chinese authorities could consider incentivizing cleaner methods of heating, such as electricity or natural gas. These alternatives are considered cleaner and more efficient than coal.**

Finally, I will explore how PM levels differ depending on the time of day

In [None]:
# Grouping by time of day and calculating the mean pm2.5 and pm10
time_pm = df_final.groupby('time')[['pm2.5', 'pm10']].mean()
display(time_pm)

# Plotting (DataFrame.plot opens its own figure, so the size is passed to it directly)
time_pm.plot(kind='bar', figsize=(10, 6))
plt.title('Average Pollutant Level by Time of Day')
plt.xlabel('Time of Day')
plt.ylabel('Average Pollutant Level (µg/m³)')
plt.legend(title='Pollutant', loc = 'upper left')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

- **Based on the plot, the average PM2.5 and PM10 levels are lowest in the morning and highest in the evening.**
- **During the day, the sun heats up the ground, causing the air to rise and creating wind. On the other hand, at night time, the ground cools down which reduces the wind speed. With less wind to disperse pollutants, they tend to accumulate and settle near the surface.**

# Meteorological Conditions and Air Quality

### Question: Does Wind Speed Affect Air Quality?

In [None]:
#Grouping average windspeed by month
wspm_by_month = df_final.groupby('month')['wspm'].mean()
display(wspm_by_month)

# Plotting the average wind speed by month
plt.figure(figsize=(10, 6))
wspm_by_month.plot(kind='bar')
plt.title('Average Wind Speed by Month')
plt.xlabel('Month')
plt.ylabel('Average Wind Speed (km/h)')
plt.xticks(ticks=range(12), labels=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

In [None]:
#Specify size of the plot
plt.figure(figsize=(10, 6))

# Scatter plot for PM10
plt.scatter(df_final['wspm'], df_final['pm10'], alpha=0.5, label='PM10', marker='x', color='skyblue')

#Scatter plot for pm2.5
plt.scatter(df_final['wspm'], df_final['pm2.5'], alpha=0.5, label='PM2.5')

#Determine characteristics of the scatter plot
plt.title('Scatter Plot of PM2.5 and PM10 vs Wind Speed')
plt.xlabel('Wind Speed (km/h)')
plt.ylabel('Concentration (µg/m³)')
plt.legend()
plt.grid(True)
plt.show()

- **This is consistent with the previous finding regarding pollutant concentration and time of day. We can see that concentrations are highest when the wind speed is lowest, and lower concentrations were recorded when the wind speed was higher.**
- **The higher the wind speed, the more contaminants are dispersed and the lower their concentration.**
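
A scatter plot with this many points can hide the strength of a relationship, so a correlation coefficient is one way to put a number on it. The sketch below uses invented toy values purely for illustration. On the real data the same call would be made on `df_final['wspm']` and `df_final['pm2.5']`, and the result is not assumed here.

```python
import pandas as pd

# Toy values: pollutant concentration falls as wind speed rises
toy = pd.DataFrame({
    "wspm": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pm2.5": [200.0, 150.0, 90.0, 60.0, 20.0],
})

# Pearson correlation (the pandas default); negative values mean
# higher wind speed goes together with lower concentration
r = toy["wspm"].corr(toy["pm2.5"])
print(f"Pearson r between wind speed and PM2.5: {r:.2f}")
```

A value near -1 would indicate a strong inverse relationship, while a value near 0 would suggest that wind speed alone explains little of the variation.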

### Question: Does Precipitation Affect Air Quality?

In [None]:
#Specify size of the plot
plt.figure(figsize=(10, 6))

#Scatter plot for pm2.5
plt.scatter(df_final['rain'], df_final['pm2.5'], alpha=0.5, label='PM2.5')

# Scatter plot for PM10
plt.scatter(df_final['rain'], df_final['pm10'], alpha=0.5, label='PM10', marker='x', color='skyblue')

#Determine characteristics of the scatter plot
plt.title('Scatter Plot of PM2.5 and PM10 vs Precipitation')
plt.xlabel('Precipitation (mm)')
plt.ylabel('Concentration (µg/m³)')
plt.legend()
plt.grid(True)
plt.show()

- **On rainy days, rain droplets force the most common air pollutants, like particulate matter and pollen, down to the ground. As a result, air quality improves considerably. This phenomenon is called wet deposition.**
- **While the pollutants are washed out or dispersed, they are not gone. They have just been transported to another location. For instance, they may end up in bodies of water or in someone's lungs.**
- **This is consistent with the plot above. At times with heavier precipitation, we see a lower concentration of pollutants recorded.**
- **Individuals with health concerns are advised to minimize exposure to the outdoors or to consider wearing a face mask to prevent inhalation of pollutants.**
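
Since most hours have no rain at all, comparing summary statistics for dry and rainy hours can complement the scatter plot. A minimal sketch on a made-up toy frame (the values and the `condition` label are assumptions for illustration):

```python
import pandas as pd

# Toy records: two dry hours and two rainy hours
toy = pd.DataFrame({
    "rain": [0.0, 0.0, 0.5, 2.0],
    "pm2.5": [120.0, 80.0, 40.0, 20.0],
})

# Label each record as dry or rain based on the precipitation amount
condition = (toy["rain"] > 0).map({False: "dry", True: "rain"})

# Average concentration and number of records per condition
summary = toy.groupby(condition)["pm2.5"].agg(["mean", "count"])
```

Reporting the count alongside the mean matters here, because rainy hours are likely to be a small share of the records and their average rests on far fewer observations.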

### Question: Does Pressure Affect Air Quality?

In [None]:
#Specify size of the plot
plt.figure(figsize=(10, 6))

#Scatter plot for pm2.5
plt.scatter(df_final['pres'], df_final['pm2.5'], alpha=0.5, label='PM2.5')

# Scatter plot for PM10
plt.scatter(df_final['pres'], df_final['pm10'], alpha=0.5, label='PM10', marker='x', color='skyblue')

#Determine characteristics of the scatter plot
plt.title('Scatter Plot of PM2.5 and PM10 vs Pressure')
plt.xlabel('Pressure (atm)')
plt.ylabel('Concentration (µg/m³)')
plt.legend()
plt.grid(True)
plt.show()

- **Although the pressure does not vary significantly throughout the dataset, we can see that the highest pollutant concentrations cluster around 1.00 atm, which is the average pressure at mean sea level.**
- **Low-pressure conditions bring wet and windy weather. A passing storm front can wash pollutants out of the atmosphere or transport them to a new area. This explains why lower concentrations of pollutants can be observed at lower pressure levels.**
- **The opposite is true of high-pressure conditions, which can create stagnant air. When the air stops moving, pollutants concentrate over an area.**