# Climate Change


“Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science.”

This project analyzes patterns in several aspects of the climate system to provide evidence about climate change, a top concern in today’s fast-developing world. The plan is to collect data on some of the main factors associated with a changing climate, such as rising temperatures, rising sea levels, and increasing air pollution. While many independent submissions address these issues individually, a consolidated view that connects the factors is still needed. The project sets out to answer some of these pressing questions with data.


### Import the necessary libraries

In [1]:
import sqlite3 as sq
import pandas as pd
import csv

### Create connection to sqlite3 database

In [2]:
db_file = 'Climate.db'
conn = sq.connect(db_file)
# conn.close()
c=conn.cursor()

### Load the datasets into tables 

We start by reading the SQL statements that create the tables from a file and executing them.

In [4]:
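def create_tables_from_file():
    # Assumes create_tables.sql holds one complete SQL statement per line.
    with open('create_tables.sql', 'r') as f:
        for query in f:
            if query.strip():  # skip blank lines
                c.execute(query)


create_tables_from_file()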
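Executing the file line by line only works while every statement fits on a single line. A minimal sketch of an alternative, assuming the file holds semicolon-terminated statements, passes the whole script to `sqlite3`'s `executescript`. The schema below is hypothetical: the column names are taken from the `INSERT` statements used later in this notebook, the column types are guesses, and the in-memory database is only there to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: column names follow the INSERT statements in this
# notebook, the column types are assumptions.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS COUNTRY (
    DATE TEXT,
    AVG_TEMP REAL,
    AVG_TEMP_UNC REAL,
    COUNTRY TEXT
);
CREATE TABLE IF NOT EXISTS STATE (
    DATE TEXT,
    AVG_TEMP REAL,
    AVG_TEMP_UNC REAL,
    STATE TEXT,
    COUNTRY TEXT
);
"""


def create_tables_from_script(connection, script):
    """Run every statement in a multi-statement SQL script."""
    connection.executescript(script)


demo_conn = sqlite3.connect(':memory:')  # throwaway database for the demo
create_tables_from_script(demo_conn, SCHEMA_SQL)

# List the tables that now exist in the demo database.
table_names = [row[0] for row in demo_conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(table_names)
```

With the real file, the script text could be read with `open('create_tables.sql').read()` and passed in the same way.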

The next step is to insert the records from the CSV files into their respective tables.

In [9]:
def insert_records():
    datasets = ['GlobalLandTemperaturesByCountry.csv',
                'GlobalLandTemperaturesByCity.csv',
                'GlobalTemperatures.csv',
                'GlobalLandTemperaturesByState.csv',
                'GlobalLandTemperaturesByMajorCity.csv']
    
    #datasets = ['GlobalLandTemperaturesByMajorCity.csv']
    
    for dataset in datasets:
        
        # Only the city file is opened with an explicit UTF-8 encoding.
        if dataset == 'GlobalLandTemperaturesByCity.csv':
            f = open(dataset, 'r', encoding="utf8")
        else:
            f = open(dataset, 'r')

        data = csv.DictReader(f)
        
        if dataset == 'GlobalLandTemperaturesByCountry.csv':
            to_db = [(i['dt'], i['AverageTemperature'],i['AverageTemperatureUncertainty'],i['Country']) for i in data]
            query = "INSERT INTO COUNTRY (DATE, AVG_TEMP,AVG_TEMP_UNC,COUNTRY) VALUES (?,?,?,?);"
        
        elif dataset == 'GlobalLandTemperaturesByCity.csv':
            to_db = [(i['dt'], i['AverageTemperature'],i['AverageTemperatureUncertainty'],i['City'],i['Country'],i['Latitude'],i['Longitude']) for i in data]
            query = "INSERT INTO CITY (DATE, AVG_TEMP,AVG_TEMP_UNC,CITY,COUNTRY,LATITUDE,LONGITUDE) VALUES (?, ?,?,?,?,?,?);"
        
        elif dataset == 'GlobalTemperatures.csv':
            to_db = [(i['dt'], i['LandAverageTemperature'],i['LandAverageTemperatureUncertainty'],i['LandMaxTemperature'],i['LandMaxTemperatureUncertainty'],i['LandMinTemperature'],i['LandMinTemperatureUncertainty'],i['LandAndOceanAverageTemperature'],i['LandAndOceanAverageTemperatureUncertainty']) for i in data]
            query = "INSERT INTO GLOBAL_TEMP (DATE,LAND_AVG_TEMP,LAND_AVG_TEMP_UNC,LAND_MAX_TEMP,LAND_MAX_UNC,LAND_MIN_TEMP,LAND_MIN_UNC,LAND_OC_TEMP,LAND_OC_UNC) VALUES (?, ?,?,?,?,?,?,?,?);"
        
        elif dataset == 'GlobalLandTemperaturesByState.csv':
            to_db = [(i['dt'], i['AverageTemperature'],i['AverageTemperatureUncertainty'],i['State'],i['Country']) for i in data]
            query = "INSERT INTO STATE (DATE,AVG_TEMP,AVG_TEMP_UNC,STATE,COUNTRY) VALUES (?, ?,?,?,?);"
        
        elif dataset == 'GlobalLandTemperaturesByMajorCity.csv':
            to_db = [(i['dt'], i['AverageTemperature'],i['AverageTemperatureUncertainty'],i['City'],i['Country'],i['Latitude'],i['Longitude']) for i in data]
            query = "INSERT INTO MAJOR_CITY (DATE,AVG_TEMP,AVG_TEMP_UNC,CITY,COUNTRY,LATITUDE,LONGITUDE) VALUES (?, ?,?,?,?,?,?);"
            
 
        c.executemany(query, to_db)
        conn.commit()
        f.close()        
            

insert_records()
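`csv.DictReader` returns blank fields as empty strings, so missing measurements are stored as `''` rather than `NULL`, which is why the later query has to filter with `!= ''`. A minimal sketch of one alternative, assuming `NULL` is an acceptable representation for missing values, maps blanks to `None` before inserting. The helper `blank_to_none`, the sample rows, and the column types are made up for illustration.

```python
import csv
import io
import sqlite3


def blank_to_none(value):
    """Map an empty CSV field to None so SQLite stores it as NULL."""
    return value if value != '' else None


# Made-up sample in the layout of GlobalLandTemperaturesByCountry.csv.
sample_csv = io.StringIO(
    "dt,AverageTemperature,AverageTemperatureUncertainty,Country\n"
    "1900-01-01,1.5,0.3,Testland\n"
    "1900-02-01,,,Testland\n"
    "1900-03-01,2.5,0.4,Testland\n"
)

demo_conn = sqlite3.connect(':memory:')  # throwaway database for the demo
demo_conn.execute(
    "CREATE TABLE COUNTRY (DATE TEXT, AVG_TEMP REAL, AVG_TEMP_UNC REAL, COUNTRY TEXT)")

rows = [(r['dt'],
         blank_to_none(r['AverageTemperature']),
         blank_to_none(r['AverageTemperatureUncertainty']),
         r['Country'])
        for r in csv.DictReader(sample_csv)]
demo_conn.executemany(
    "INSERT INTO COUNTRY (DATE, AVG_TEMP, AVG_TEMP_UNC, COUNTRY) VALUES (?,?,?,?);",
    rows)
demo_conn.commit()

# AVG skips NULL values, so no "!= ''" filter is needed here.
null_count, avg_temp = demo_conn.execute(
    "SELECT SUM(AVG_TEMP IS NULL), AVG(AVG_TEMP) FROM COUNTRY").fetchone()
print(null_count, avg_temp)
```

The same conversion could be applied to each tuple built in `insert_records`, at the cost of wrapping every numeric field in the helper.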

In [10]:
# Count the rows loaded into each table.
for table in ['COUNTRY', 'CITY', 'GLOBAL_TEMP', 'STATE', 'MAJOR_CITY']:
    c.execute('select count(*) from ' + table)
    rows = c.fetchall()
    print(rows)

[(577462,)]
[(8599212,)]
[(3192,)]
[(645675,)]
[(239177,)]


In [8]:
#c.execute('delete from  COUNTRY')
#c.execute('delete from  CITY')
#c.execute('delete from  GLOBAL_TEMP')
#c.execute('delete from  STATE')
#c.execute('delete from  MAJOR_CITY')


In [None]:
#c.execute('drop table  COUNTRY')
#c.execute('drop table  CITY')
#c.execute('drop table  GLOBAL_TEMP')
#c.execute('drop table  STATE')
#c.execute('drop table  MAJOR_CITY')

In [3]:
c.execute('select count(*) from GLOBAL_TEMP')
rows = c.fetchall()
print(rows)

[(3192,)]


## Visualizations

### Is the temperature really increasing?

Let's take a look at global temperatures over the years.

In [4]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.tools as tls
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

Create a dataframe with the average global land temperature for each year.

In [5]:
query = """SELECT strftime('%Y', DATE) AS Year, AVG(LAND_AVG_TEMP) AS LandAverageTemperature
            FROM GLOBAL_TEMP
            WHERE LAND_AVG_TEMP != ''
            GROUP BY Year"""
df = pd.read_sql(query, con = conn)
df.head(5)

   Year  LandAverageTemperature
0  1750                8.719364
1  1751                7.976143
2  1752                5.779833
3  1753                8.388083
4  1754                8.469333


In [6]:
trace = go.Scatter(x=df['Year'], y=df['LandAverageTemperature'], mode='lines')
data = [trace]

py.iplot(data, filename='line-mode')

The plot suggests that the average land temperature has been increasing over the years.
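Reading a trend off a line plot is subjective, so it may help to put a number on it. The sketch below fits a straight line to yearly averages with `numpy.polyfit` and reports the slope per decade. The function name `trend_per_decade` is hypothetical, and the demo series is synthetic rather than taken from the dataset. A linear fit is only a rough summary and assumes the change is roughly steady over the period.

```python
import numpy as np


def trend_per_decade(years, temperatures):
    """Least-squares slope of temperature against year, in degrees per decade."""
    # strftime('%Y', ...) returns years as strings, so convert them first.
    x = np.array([float(y) for y in years])
    t = np.array([float(v) for v in temperatures])
    slope, _intercept = np.polyfit(x, t, 1)
    return slope * 10.0


# Synthetic series, NOT real measurements: 8.0 degrees in 1900,
# rising by 0.01 degrees each year.
demo_years = [str(y) for y in range(1900, 1911)]
demo_temps = [8.0 + 0.01 * (y - 1900) for y in range(1900, 1911)]

demo_trend = trend_per_decade(demo_years, demo_temps)
print(round(demo_trend, 3))
```

With the dataframe built above, the call would be `trend_per_decade(df['Year'], df['LandAverageTemperature'])`.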

In [None]:
conn.close()