# Average Temperature Correlations?

"Some skeptics of human-induced climate change blame global warming on natural variations in the sun’s output due to sunspots and/or solar wind. They say it’s no coincidence that an increase in sunspot activity and a run-up of global temperatures on Earth are happening concurrently, and view regulation of carbon emissions as folly with negative ramifications for our economy and tried-and-true energy infrastructure."
https://www.scientificamerican.com/article/sun-spots-and-climate-change/

## Sunspot Data: 
SWS is often asked about historical sunspot numbers. The table below, available from the National Geophysical Data Center in Boulder (USA), lists the yearly values of sunspot number from 1700.
http://www.sws.bom.gov.au/Educational/2/3/6

## Average CO2 (ppm) Data:
The carbon dioxide record from Mauna Loa Observatory, known as the “Keeling Curve,” is the world’s longest unbroken record of atmospheric carbon dioxide concentrations. Scientists make atmospheric measurements in remote locations to sample air that is representative of a large volume of Earth’s atmosphere and relatively free from local influences.

# Import Libraries and Data

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from subprocess import check_output
print(check_output(["ls", "../input"]).decode("utf8"))

# Any results you write to the current directory are saved as output.
df = pd.read_csv("../input/climate-change-earth-surface-temperature-data/GlobalLandTemperaturesByState.csv")
ss = pd.read_csv("../input/yearly-mean-sunspot-numbers/sunspotnumber.csv")
cd = pd.read_csv("../input/carbon-dioxide/archive.csv")

# Check Average Temperature Data

In [None]:
df.head()

In [None]:
#clean up year
def to_year(date):
    """
    Return the year as an int from a 'YYYY-MM-DD' date string.
    """
    return int(date.split('-')[0])

df['year'] = df['dt'].apply(to_year)


#select only data from the united states
dfs = df[df['Country'] == 'United States']

#calculate average temp per year
#keep the first row of each year and overwrite its temperature with that year's mean
#(DataFrame.append was removed in pandas 2.0, so avoid building the frame row by row)
dfa = dfs.drop_duplicates(subset='year').copy()
dfa['AverageTemperature'] = dfs.groupby('year')['AverageTemperature'].transform('mean')

In [None]:
#drop temps below 9 degrees and plot the rest
df_nine = dfa[dfa['AverageTemperature'] >= 9]
df_nine.plot.scatter(x='year', y='AverageTemperature', c = 'AverageTemperature', cmap ='coolwarm')
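
The per-year averaging above can also be written as a single `groupby`. The snippet below is a minimal sketch on a made-up toy table (the values are illustrative, not taken from the Kaggle file), assuming the `dt` column holds `YYYY-MM-DD` strings as in the real data.

```python
import pandas as pd

# Toy stand-in for the state-level table; values are made up for illustration
toy = pd.DataFrame({
    "dt": ["2000-01-01", "2000-02-01", "2001-01-01", "2001-02-01"],
    "AverageTemperature": [1.0, 3.0, 2.0, None],
})

# Parse the date strings and pull out the calendar year
toy["year"] = pd.to_datetime(toy["dt"]).dt.year

# One row per year; mean() skips missing values by default
yearly = toy.groupby("year", as_index=False)["AverageTemperature"].mean()
print(yearly)
```

Unlike the notebook's `dfa`, this result keeps only the `year` and `AverageTemperature` columns, which is usually all the later merges need.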

# Check out Sun Spot Data

In [None]:
ss.head()

In [None]:
#drop unneeded columns
ss.drop(['Unnamed: 2', 'Unnamed: 3','Unnamed: 4','Unnamed: 5',
         'Unnamed: 6', 'Unnamed: 7','Unnamed: 8', 'Unnamed: 9'], axis=1, inplace=True)


#merge sun spot data and average temp data
dfsc = pd.merge(dfa, ss, on=['year'])
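
One thing worth keeping in mind about the merge: `pd.merge` performs an inner join by default, so only years present in both tables survive. A small sketch with made-up numbers (the frames `temps` and `spots` are hypothetical stand-ins for `dfa` and `ss`):

```python
import pandas as pd

# Hypothetical stand-ins for the temperature and sunspot tables; values are made up
temps = pd.DataFrame({"year": [1999, 2000, 2001],
                      "AverageTemperature": [11.2, 11.5, 11.9]})
spots = pd.DataFrame({"year": [2000, 2001, 2002],
                      "suns_spot_number": [119.6, 111.0, 104.0]})

# how="inner" is the default: 1999 and 2002 are dropped because
# each appears in only one of the two tables
merged = pd.merge(temps, spots, on="year")
print(merged)
```

Comparing `len(merged)` against the lengths of the inputs is a quick way to see how many years were lost in the join.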

In [None]:
plt.scatter(x=ss['year'], y=ss['suns_spot_number'])
plt.ylabel('Sun Spot Number')
plt.xlabel('Years')
plt.title('Number of Sunspots since 1700')

# Check out Carbon Dioxide (ppm)
Average the CO2 readings per year, rename the Year column to year, merge with the temperature and sunspot dataframe, then drop unwanted columns.

In [None]:
cd.head()

In [None]:
#average CO2 ppm per year
#keep the first row of each year and overwrite its CO2 value with that year's mean
#(DataFrame.append was removed in pandas 2.0, so avoid building the frame row by row)
dfc = cd.drop_duplicates(subset='Year').copy()
dfc['Carbon Dioxide (ppm)'] = cd.groupby('Year')['Carbon Dioxide (ppm)'].transform('mean')
 
#change Year column to year
dfc.rename(index=str, columns={"Year": "year"}, inplace=True)

#merge CO2 data with temp and sun spot data
dfcss = pd.merge(dfsc, dfc, on=['year'])

#drop unwanted columns
dfcss.drop(['Seasonally Adjusted CO2 (ppm)', 
           'Carbon Dioxide Fit (ppm)', 
           'Seasonally Adjusted CO2 Fit (ppm)',
          'Decimal Date',
          'Month'], inplace=True, axis=1)

In [None]:
sns.lmplot(x='year', y='Carbon Dioxide (ppm)', data=dfcss)


# Check out Data Correlations

In [None]:
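#correlate numeric columns only; text columns such as State make corr() raise in pandas 2.x
sns.heatmap(dfcss.select_dtypes('number').corr())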
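
Each cell of the heatmap is a Pearson correlation coefficient, which is the default method of `DataFrame.corr`. To make that number less of a black box, here is a minimal hand-rolled version in plain Python. The helper `pearson_r` and the sample values are illustrative assumptions, not part of the notebook or its data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up yearly values that rise together, for illustration only
co2 = [316.0, 325.0, 338.0, 354.0, 369.0]
temp = [11.0, 11.1, 11.4, 11.6, 11.9]
r = pearson_r(co2, temp)
print(round(r, 3))
```

A value near 1 or -1 means the two series move together along a straight line; it says nothing by itself about which one drives the other.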

# Linear Relationship between Temp and CO2

In [None]:
sns.lmplot(x='AverageTemperature', y='Carbon Dioxide (ppm)', data =dfcss)

# Linear Relationship between Sun Spot Number and Temp

In [None]:
sns.lmplot(x='AverageTemperature', y='suns_spot_number', data =dfcss)
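
`sns.lmplot` draws a least-squares regression line but does not report its coefficients. If the slope and intercept are wanted as numbers, a first-degree `np.polyfit` is one way to get them. The sketch below uses made-up points that lie exactly on a line, so the fitted values are easy to check; `x` and `y` are hypothetical stand-ins for two columns of `dfcss`.

```python
import numpy as np

# Made-up points lying exactly on y = 2x + 1, for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Degree-1 least-squares fit; coefficients come back highest power first
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)
```

With real columns, rows containing missing values would need to be dropped first (for example with `dropna`), since `np.polyfit` does not skip NaNs.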