## **What role does urbanization, using China as an example, play in shaping average temperature trends in a warming world?**

### Introduction

Global average temperature has risen over the past century, alongside the growth of modern cities, raising concerns about the impacts of climate change. While urbanization significantly improves living standards and the efficiency of society, it is commonly believed to be a significant contributor to global warming. China, one of the fastest-growing developing countries in the world, has undergone rapid urbanization in recent decades. This research examines the role of urbanization in shaping average temperature trends in China. The analysis focuses on the relationship between average temperature and explanatory variables such as city, longitude, latitude, and season, using a time series that runs from 1979 to 2013. This period begins when China embarked on its reform and opening-up. By investigating the impact of urbanization on temperature trends in China, this study aims to improve our understanding of the interaction between human activity and the environment.

### Data Cleaning

In [6]:
# Imports
import pandas as pd
import numpy as np

# Read dataset
df = pd.read_csv('/Users/booker/Desktop/ECO225Project/Data/GlobalLandTemperaturesByCity.csv')

In [7]:
# Check missing values
print(df.isnull().sum())

# Drop missing values
df.dropna(axis = 0, inplace = True)

dt                                    0
AverageTemperature               364130
AverageTemperatureUncertainty    364130
City                                  0
Country                               0
Latitude                              0
Longitude                             0
dtype: int64


In [8]:
# Convert all dates
df['Date'] = pd.to_datetime(df['dt'])
df.drop(columns = ['dt'], inplace = True)
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month

# Group by year, averaging only the numeric columns
# (pandas 2.0+ raises a TypeError on text columns such as City without numeric_only = True)
gb_year = df.groupby(['Year']).mean(numeric_only = True).reset_index()

# Describe the X and Y variables
print(gb_year['AverageTemperature'].describe())
print(gb_year['AverageTemperatureUncertainty'].describe())
print(gb_year['Year'].describe())

count    267.000000
mean      15.340967
std        3.360706
min        1.497593
25%       14.512396
50%       17.032359
75%       17.699692
max       19.061038
Name: AverageTemperature, dtype: float64
count    267.000000
mean       1.575500
std        1.335002
min        0.323927
25%        0.445444
50%        1.078305
75%        2.380854
max        5.470221
Name: AverageTemperatureUncertainty, dtype: float64
count     267.000000
mean     1879.955056
std        77.298695
min      1743.000000
25%      1813.500000
50%      1880.000000
75%      1946.500000
max      2013.000000
Name: Year, dtype: float64
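
The summary above still covers every country and every year from 1743 to 2013, whereas the introduction restricts the study to China between 1979 and 2013 and names latitude, longitude, and season as explanatory variables. The block below is a minimal sketch of how that subset might be prepared, not the method used in the rest of this notebook. It assumes that `Latitude` and `Longitude` are stored as strings with a hemisphere suffix (for example `39.38N`), and that seasons follow the meteorological convention (December to February as winter). The helper names `parse_coordinate` and `prepare_china_sample`, the `SEASONS` mapping, and the toy rows are all hypothetical and used only for illustration.

```python
import pandas as pd

# Assumed meteorological season for each calendar month (Northern Hemisphere)
SEASONS = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

def parse_coordinate(value):
    """Convert a string such as '39.38N' or '77.26W' to signed decimal degrees.

    Assumes a trailing hemisphere letter; S and W become negative.
    Values without a suffix are converted to float unchanged.
    """
    text = str(value).strip()
    hemisphere = text[-1].upper()
    if hemisphere in "NSEW":
        magnitude = float(text[:-1])
        return -magnitude if hemisphere in "SW" else magnitude
    return float(text)

def prepare_china_sample(df, start_year=1979, end_year=2013):
    """Keep Chinese cities within the study window and add derived columns."""
    # Series.between is inclusive at both ends by default
    mask = (df["Country"] == "China") & df["Year"].between(start_year, end_year)
    out = df.loc[mask].copy()
    out["Season"] = out["Month"].map(SEASONS)
    out["LatitudeDeg"] = out["Latitude"].map(parse_coordinate)
    out["LongitudeDeg"] = out["Longitude"].map(parse_coordinate)
    return out

# Toy rows (made-up values) in the same layout as the cleaned df above
toy = pd.DataFrame({
    "Date": pd.to_datetime(["1978-12-01", "1979-01-01", "2013-07-01", "2013-07-01"]),
    "AverageTemperature": [-2.0, -3.5, 26.1, 10.0],
    "City": ["Beijing", "Beijing", "Beijing", "Lima"],
    "Country": ["China", "China", "China", "Peru"],
    "Latitude": ["39.38N", "39.38N", "39.38N", "12.05S"],
    "Longitude": ["116.53E", "116.53E", "116.53E", "77.26W"],
})
toy["Year"] = toy["Date"].dt.year
toy["Month"] = toy["Date"].dt.month

china = prepare_china_sample(toy)
print(china[["Year", "Month", "Season", "LatitudeDeg", "LongitudeDeg"]])
```

Applied to the real data, the call would be `prepare_china_sample(df)`. Keeping the original string columns and writing the parsed values to new `LatitudeDeg` and `LongitudeDeg` columns makes it easy to check the conversion against the source values.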
