# 07B: Methane

In [None]:
# This code will load the R packages we will use
suppressPackageStartupMessages({
    library(coursekata)
})

<img src="https://i.postimg.cc/2CdHrqmS/methane-in-the-environment.jpg" alt="A chimney burning methane" width = 50%>

## Climate and Methane

Our changing climate (e.g., more extreme hurricanes, droughts, warming temperatures) has been linked to increases in pollutants and greenhouse gases (such as methane). But there are skeptics who do not believe human activities are linked to changes in climate.

A climate change skeptic chose one summer month and one winter month from each of the years 1999-2018. For each of those months, the skeptic found data on the average global temperature and atmospheric methane level. 

The skeptic claims that his data shows a **negative relationship** between methane and temperature. In other words, higher concentrations of the greenhouse gas tend to be associated with lower average global temperature. 

We will take a look at this data (in the data frame `temp_data`). Is the climate change skeptic correct? Could this be evidence that greenhouse gases and human pollution aren't driving climate change? Let's explore.

### Motivating Question: Is Climate Change Real?

### The Dataset `temp_data`
##### Description
A climate change skeptic (non-randomly) chose one summer month and one winter month from each of the years 1999-2018. For each of those months, the skeptic found data on the average global temperature and atmospheric methane level. 

##### Variables
- `year`: Year
- `month`: Month (1 = Jan, 2 = Feb, etc.)
- `decimal`: Month as a decimal of the year
- `season`: Winter or Summer
- `methane`: Atmospheric methane concentration, parts per trillion (ppt)
- `temp_anomaly`: Average global temperature, relative to mean temperature from 1980 - 2015, in degrees Celsius

**Data Sources:**

- Temperature readings from GISS/NASA: https://data.giss.nasa.gov/gistemp/graphs_v4/. Shows global temperature anomaly relative to mean temp from 1980 - 2015. 
- Methane levels from NOAA: https://gml.noaa.gov/ccgg/trends_ch4. Converted ppb to ppt (divided by 1,000). Measurements conducted monthly from 1999 - 2018.

### 1.0 - A climate skeptic's analysis

**1.1 -** The code below reads in the data. Take a look at the data frame. Do you have any questions about the variables?

In [None]:
# Read in data
temp_data <- read.csv("https://docs.google.com/spreadsheets/d/e/2PACX-1vQxO1NNRbXecdTvW8bfSlEytylLLv3jZ_ElehIakBQ157vKHQywyDs_cmyHZG9S0pjQN_SMKrAwSHEy/pub?gid=2129057940&single=true&output=csv")


**1.2 -** What is the climate skeptic's hypothesis about temperature and methane? Write it as a regular sentence as well as a word equation.

**1.3 -** Create a model to examine the climate skeptic's hypothesis.

**1.4 -** Is the skeptic's claim true that these two factors (temperature anomaly and methane levels) are negatively associated? How can you tell? (Support your answer by interpreting the parameter estimates or other statistics.)
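
As a minimal sketch of how the sign of a slope estimate can support an answer like the one asked for in 1.4, the example below fits a single-predictor model to a small made-up data frame. The names `toy`, `x`, and `y` are hypothetical and are not part of `temp_data`; the same `lm()` pattern should carry over once you swap in the worksheet's variables.

```r
# Hypothetical toy data (NOT the worksheet's temp_data)
toy <- data.frame(
  x = c(1, 2, 3, 4, 6, 7, 8, 9),
  y = c(10.2, 10.9, 12.1, 12.8, 2.1, 2.8, 4.2, 4.9)
)

# Fit a single-predictor model: y = b0 + b1 * x
toy_model <- lm(y ~ x, data = toy)

# Pull out the slope estimate (b1) and report its direction
b1 <- coef(toy_model)[["x"]]
direction <- if (b1 < 0) "negative" else "positive"
cat("Slope for x:", round(b1, 3), "-> a", direction, "association\n")
```

A negative `b1` means that higher values of the predictor go with lower predicted values of the outcome, at least in this one-predictor model.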

### 2.0 - Exploring the data behind the model

Typically, we start by exploring the variation with a visualization. Let's check out this climate skeptic's data this way.

**2.1 -** Visualize the relationship between temperature anomaly and methane. What do you notice?

<div class="alert alert-block alert-info">

<b> <font size="+1">Key Question</font></b>

**2.2 -** Look back at the dataset. Is there a variable that may explain the pattern you see above? Test your idea by creating a new visual or set of visuals. 

</div>

**2.3 -** Below, we've visualized the relationship between temperature and methane in our dataset, showing the line of best fit (used in the climate skeptic's model). Does the model fit the data well? Why does the model have a negative slope?


In [None]:
methane_model <- lm(temp_anomaly ~ methane, data = temp_data)

# Scatterplot of temperature anomaly vs. methane, colored by season
gf_point(temp_anomaly ~ methane, data = temp_data, color = ~season) %>%
  gf_labs(title = "Global Temps vs. Global Methane Concentration (1999 - 2018)",
    x = "Methane Concentration (parts per trillion)",
    y = "Temperature Anomaly (C), w.r.t 1980-2015 Avg Temp") %>%
  gf_model(methane_model)

<div class="alert alert-block alert-info">

<b> <font size="+1">Key Question</font></b>


**2.4 -** Create a subset of the data that is *only* the winter months. Then, visualize the relationship between methane and temperature. What do you notice? Do the same with the summer month data. What do you notice?

</div>
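
One way to approach a subset-then-refit task like 2.4 is sketched below on made-up data. The data frame `toy` and its columns `x`, `y`, and `group` are hypothetical stand-ins; base R's `subset()` is used here so the sketch runs without extra packages, though `filter()` may feel more familiar if you are working with the CourseKata packages.

```r
# Hypothetical toy data with two groups (NOT the worksheet's temp_data)
toy <- data.frame(
  x     = c(1, 2, 3, 4, 6, 7, 8, 9),
  y     = c(10.2, 10.9, 12.1, 12.8, 2.1, 2.8, 4.2, 4.9),
  group = factor(rep(c("A", "B"), each = 4))
)

# Keep only the rows belonging to one group at a time
toy_a <- subset(toy, group == "A")
toy_b <- subset(toy, group == "B")

# Fit the same single-predictor model within each subset and on all rows
slope_a   <- coef(lm(y ~ x, data = toy_a))[["x"]]
slope_b   <- coef(lm(y ~ x, data = toy_b))[["x"]]
slope_all <- coef(lm(y ~ x, data = toy))[["x"]]

cat("Slope within A:", round(slope_a, 3), "\n")
cat("Slope within B:", round(slope_b, 3), "\n")
cat("Slope, all rows:", round(slope_all, 3), "\n")
```

In this toy example the slope is positive inside each group but negative when the groups are pooled, which is the kind of pattern worth looking for in your own subsets.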

### 3.0 - Getting multivariate

<div class="alert alert-block alert-info">

<b> <font size="+1">Key Question</font></b>

**3.1 -** Create and fit a model predicting temperature from methane, taking season into account. How does the coefficient value for `methane` differ from the one seen in the initial model in the notebook? Why do you think this change occurred?

</div>
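
The sketch below shows, on made-up data, how adding a second predictor to `lm()` can change the coefficient on the first. The names `toy`, `x`, `y`, and `group` are hypothetical; only the formula pattern `outcome ~ predictor1 + predictor2` is meant to transfer to the worksheet.

```r
# Hypothetical toy data with two groups (NOT the worksheet's temp_data)
toy <- data.frame(
  x     = c(1, 2, 3, 4, 6, 7, 8, 9),
  y     = c(10.2, 10.9, 12.1, 12.8, 2.1, 2.8, 4.2, 4.9),
  group = factor(rep(c("A", "B"), each = 4))
)

# Single-predictor model vs. a model that also accounts for group
single_model <- lm(y ~ x, data = toy)
multi_model  <- lm(y ~ x + group, data = toy)

# Compare the coefficient on x across the two models
x_single <- coef(single_model)[["x"]]
x_multi  <- coef(multi_model)[["x"]]

cat("x coefficient, single-predictor model:", round(x_single, 3), "\n")
cat("x coefficient, controlling for group: ", round(x_multi, 3), "\n")
```

Here the coefficient on `x` flips sign once `group` is in the model, because the second model estimates the effect of `x` while holding group membership constant.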

**3.2 -** Create a scatterplot showing temperature as predicted by methane, and color the data according to season. Then, visualize your multiple regression fit on the graph.

**3.3 -** Your model had a coefficient value for `seasonwinter`. What feature of your graph does this value describe?
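
A rough way to check what a group coefficient measures is to compare predictions for two observations that share the same predictor value but differ in group. The sketch below does this on made-up data; `toy`, `x`, `y`, `group`, and the level names `A` and `B` are all hypothetical.

```r
# Hypothetical toy data with two groups (NOT the worksheet's temp_data)
toy <- data.frame(
  x     = c(1, 2, 3, 4, 6, 7, 8, 9),
  y     = c(10.2, 10.9, 12.1, 12.8, 2.1, 2.8, 4.2, 4.9),
  group = factor(rep(c("A", "B"), each = 4))
)

multi_model <- lm(y ~ x + group, data = toy)

# Two made-up observations: same x, different group
new_obs <- data.frame(
  x     = c(5, 5),
  group = factor(c("A", "B"), levels = c("A", "B"))
)
preds <- predict(multi_model, newdata = new_obs)

# Vertical gap between the two parallel lines at x = 5
gap      <- unname(preds[2] - preds[1])
b_groupB <- coef(multi_model)[["groupB"]]

cat("Prediction gap (B minus A):", round(gap, 3), "\n")
cat("groupB coefficient:        ", round(b_groupB, 3), "\n")
```

Because this model has no interaction term, the two values match at any choice of `x`, not only at `x = 5`.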

<div class="alert alert-block alert-info">

<b> <font size="+1">Key Question</font></b>


**3.4 -** What feature of your graph does the `methane` coefficient describe? How does its value contradict the claim made by the climate change skeptic?

</div>

**3.5 -** Which model is more convincing: The skeptic's single-predictor model or the multivariate model? Provide evidence using the visualizations created above and supernova tables.
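
As a sketch of how two nested models might be compared numerically, the example below uses base R's `summary()` and `anova()` on made-up data. This is a stand-in for, not a reproduction of, the supernova tables the question asks for; `toy`, `x`, `y`, and `group` are hypothetical names.

```r
# Hypothetical toy data with two groups (NOT the worksheet's temp_data)
toy <- data.frame(
  x     = c(1, 2, 3, 4, 6, 7, 8, 9),
  y     = c(10.2, 10.9, 12.1, 12.8, 2.1, 2.8, 4.2, 4.9),
  group = factor(rep(c("A", "B"), each = 4))
)

single_model <- lm(y ~ x, data = toy)
multi_model  <- lm(y ~ x + group, data = toy)

# Proportion of variation in y explained by each model (R-squared)
r2_single <- summary(single_model)$r.squared
r2_multi  <- summary(multi_model)$r.squared
cat("R-squared, single predictor:", round(r2_single, 3), "\n")
cat("R-squared, two predictors:  ", round(r2_multi, 3), "\n")

# F test for whether adding group reduces error beyond the simpler model
comparison <- anova(single_model, multi_model)
print(comparison)
```

A larger R-squared together with a small p-value in the comparison table would suggest that the added predictor explains variation the simpler model leaves behind.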