# University of Illinois Data Mining Specialization
## Course 01: Data Visualization
*2018-04-30 to 2018-05-06 - Week 02*

### Programming Assignment

1. Take the data from the [GISTEMP site][gis-temperatures], specifically the data from “Table Data: Global and Hemispheric Monthly Means and Zonal Annual Means.”
2. Parse the data into a format suitable for the tools that you are using.
    1. The course provided two files (in JS, TXT, and CSV formats).
    2. These files are a subset of the data on the GISTEMP site.
3. Visualize the data to meet the requirements of the [Programming Assignment 1 Rubric](#grading-rubric).

#### Grading Rubric

| Criteria | Poor (1–2 points) | Fair (3 points) | Good (4 points) | Great (5 points) |
| :--- | :--- | :--- | :--- | :--- |
| *Appropriate Chart Selection and Variables* | Chart is indecipherable or significantly misleading because of poor chart type or assignment of variables to elements | Major problem(s) with chart selection or assignment of elements to variables | Minor problem(s) with chart selection or assignment of elements to variables | Chart selection is appropriate for data and its elements properly assigned to appropriate data variables |
| *Design of the Chart*<sup>1</sup> | No apparent attention paid to design | Evidence that several of the design rules should have been followed but were not | Evidence that one of the design rules should have been followed but was not | Attention paid to all design rules |
| *Contest*<sup>2</sup> | Misleading | Boring | Not boring | Interesting |

<sup>1</sup>Does the chart effectively display the data, based on the design rules in lecture 2.3.1?
<br><sup>2</sup>How interesting is the result? Does this represent an interesting choice of data and/or an interesting way to display the data? For example, was a streamgraph used instead of an ordinary bar chart?

<!--Link Aliases-->
[gis-temperatures]: http://data.giss.nasa.gov/gistemp/

In [1]:
# Imports
import pandas as pd
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go

# Set Plotly to display in notebook
init_notebook_mode(connected=True)

#### Plot Global Temperature Variations by Month

In [2]:
# Load data and inspect
dataSeasonal = pd.read_csv("data/ExcelFormattedGISTEMPDataCSV.csv")
print("Shape: %s" % (dataSeasonal.shape, ))
dataSeasonal.set_index("Year", inplace=True) # Index rows by year
print(dataSeasonal.head(5))

Shape: (136, 19)
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  J-D  D-N  \
Year                                                                         
1880  -29  -19  -17  -27  -13  -28  -22   -6  -16  -15  -18  -20  -19  ***   
1881   -8  -13    2   -2   -3  -27   -5   -1   -8  -18  -25  -14  -10  -11   
1882   10   10    2  -19  -17  -24   -9    5    0  -21  -20  -24   -9   -8   
1883  -32  -41  -17  -23  -24  -11   -7  -12  -18  -11  -19  -17  -19  -20   
1884  -17  -11  -33  -35  -31  -37  -33  -25  -22  -22  -30  -28  -27  -26   

       DJF  MAM  JJA  SON  
Year                       
1880  ****  -19  -19  -16  
1881   -13   -1  -11  -17  
1882     2  -11   -9  -14  
1883   -32  -22  -10  -16  
1884   -15  -33  -32  -25  
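
The `***` and `****` entries in the output above are placeholders for values that are not available (presumably because the D-N and DJF means for 1880 would need December 1879). Left as text, they force the affected columns to a string dtype. The following is a minimal sketch, not the notebook's own method, of parsing them as missing values with the `na_values` parameter of `pd.read_csv`. The three-row sample and the names `sample_csv`, `sample`, and `sample_celsius` are hypothetical, and the division by 100 assumes the values are anomalies in hundredths of a degree Celsius.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same layout as the course CSV
sample_csv = io.StringIO(
    "Year,Jan,Feb,D-N,DJF\n"
    "1880,-29,-19,***,****\n"
    "1881,-8,-13,-11,-13\n"
    "1882,10,10,-8,2\n"
)

# Treat the asterisk placeholders as missing so every column parses as numeric
sample = pd.read_csv(sample_csv, index_col="Year", na_values=["***", "****"])

# Convert hundredths of a degree to degrees Celsius (assumed unit)
sample_celsius = sample / 100

print(sample_celsius)
```

With the placeholders parsed as `NaN`, the seasonal columns can be plotted or averaged like the monthly ones, and missing points simply appear as gaps.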


In [3]:
# Plot monthly temperatures
plotData = []
for month in dataSeasonal.loc[:, "Jan":"Dec"]:
    plotData.append(go.Scatter(name=month, x=dataSeasonal.index.values, y=dataSeasonal[month]))

# Add vertical lines to show start and end of baseline
plotShapes = [
    # Start baseline measurement year
    {
        "type": "line",
        "xref": "x",
        "yref": "paper",
        "x0": 1951,
        "y0": 0,
        "x1": 1951,
        "y1": 1,
        "line": {"width": 1, "dash": "dot"}
    },
    # End baseline measurement year
    {
        "type": "line",
        "xref": "x",
        "yref": "paper",
        "x0": 1980,
        "y0": 0,
        "x1": 1980,
        "y1": 1,
        "line": {"width": 1, "dash": "dot"}
    }
]

# Annotate vertical lines for start and end of baseline
plotAnnotations = [
    # Annotate start baseline measurement year
    {
        "xref": "x",
        "yref": "paper",
        "x": 1951,
        "y": 1.03,
        "text": "Means from 1951",
        "showarrow": False
    },
    # Annotate end baseline measurement year
    {
        "xref": "x",
        "yref": "paper",
        "x": 1980,
        "y": 1.03,
        "text": "to 1980 as Zero",
        "showarrow": False
    }
]

# Set title and label axes
plotLayout = go.Layout(
    title="Global Temperature Variations by Month",
    xaxis={"title": "Year"},
    # Source values are anomalies in hundredths of a degree Celsius
    yaxis={"title": "Difference from Mean (0.01 \u00B0C)"},
    shapes=plotShapes,
    annotations=plotAnnotations
)

# Plot entire figure (both interactive and to file)
plotFigure = go.Figure(data=plotData, layout=plotLayout)
iplot(plotFigure)
plot(plotFigure, filename="output/global-temperature-variations-by-month.html")

'file:///mnt/d/GitHub/uoi-coursera-data-mining/crs01/wk02/programming-assignment/output/global-temperature-variations-by-month.html'
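
The two baseline markers in the cell above differ only in their year and label text. As a sketch of one way to avoid repeating the dictionaries, the hypothetical helper `baseline_markers` below builds both lists from a start and end year. It returns plain dictionaries in the same shape the cell passes to `go.Layout`, so it does not itself need Plotly.

```python
def baseline_markers(start_year, end_year):
    """Return (shapes, annotations) marking a baseline period on the x axis."""
    labelled_years = [
        (start_year, "Means from %d" % start_year),
        (end_year, "to %d as Zero" % end_year),
    ]
    shapes = []
    annotations = []
    for year, text in labelled_years:
        # Dotted vertical line spanning the full height of the plot area
        shapes.append({
            "type": "line",
            "xref": "x",
            "yref": "paper",
            "x0": year,
            "y0": 0,
            "x1": year,
            "y1": 1,
            "line": {"width": 1, "dash": "dot"},
        })
        # Label placed just above the top of the plot area
        annotations.append({
            "xref": "x",
            "yref": "paper",
            "x": year,
            "y": 1.03,
            "text": text,
            "showarrow": False,
        })
    return shapes, annotations


plotShapes, plotAnnotations = baseline_markers(1951, 1980)
print(len(plotShapes), len(plotAnnotations))
```

The returned lists can be passed to `go.Layout(shapes=..., annotations=...)` exactly as the hand-written ones are.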

#### Plot Global Temperature Variations by Latitude

In [4]:
# Load data and inspect
dataLatitude = pd.read_csv("data/ExcelFormattedGISTEMPData2CSV.csv")
print("Shape: %s" % (dataLatitude.shape, ))
dataLatitude.set_index("Year", inplace=True) # Index rows by year
print(dataLatitude.head(5))

Shape: (135, 15)
      Glob  NHem  SHem  24N-90N  24S-24N  90S-24S  64N-90N  44N-64N  24N-44N  \
Year                                                                           
1880   -19   -33    -5      -38      -16       -5      -89      -54      -22   
1881   -10   -18    -2      -27       -2       -5      -54      -40      -14   
1882    -9   -17    -1      -21      -10        4     -125      -20       -3   
1883   -19   -30    -8      -34      -22       -2      -28      -57      -20   
1884   -27   -42   -12      -56      -17      -11     -127      -58      -41   

      EQU-24N  24S-EQU  44S-24S  64S-44S  90S-64S  
Year                                               
1880      -26       -5       -2       -8       39  
1881       -5        2       -6       -3       37  
1882      -12       -8        3        8       42  
1883      -25      -19       -1        0       37  
1884      -21      -14      -15       -5       40  
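
The three broad zonal bands (`24N-90N`, `24S-24N`, `90S-24S`) sit next to each other in the column order printed above, which makes a label slice a convenient way to pick them out. The sketch below uses a hypothetical one-row frame (`frame`, built from the 1880 values shown) to illustrate two pandas behaviours: `.loc` label slices include both endpoints, and iterating over a DataFrame yields its column labels.

```python
import pandas as pd

# Hypothetical one-row frame with the same column order as the output above
columns = ["Glob", "NHem", "SHem", "24N-90N", "24S-24N", "90S-24S", "64N-90N"]
frame = pd.DataFrame(
    [[-19, -33, -5, -38, -16, -5, -89]], index=[1880], columns=columns
)
frame.index.name = "Year"

# Label slices include both endpoints, so this keeps exactly three columns
bands = frame.loc[:, "24N-90N":"90S-24S"]

# Iterating over a DataFrame yields its column labels
band_names = [name for name in bands]
print(band_names)
```

Because the slice depends on column order, it would select different columns if the file layout changed; passing an explicit list of labels is the more defensive alternative.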


In [5]:
# Plot temperatures by latitude band
plotData = []
for latitude in dataLatitude.loc[:, "24N-90N":"90S-24S"]:
    plotData.append(go.Scatter(name=latitude, x=dataLatitude.index.values, y=dataLatitude[latitude]))

# Add vertical lines to show start and end of baseline
plotShapes = [
    # Start baseline measurement year
    {
        "type": "line",
        "xref": "x",
        "yref": "paper",
        "x0": 1951,
        "y0": 0,
        "x1": 1951,
        "y1": 1,
        "line": {"width": 1, "dash": "dot"}
    },
    # End baseline measurement year
    {
        "type": "line",
        "xref": "x",
        "yref": "paper",
        "x0": 1980,
        "y0": 0,
        "x1": 1980,
        "y1": 1,
        "line": {"width": 1, "dash": "dot"}
    }
]

# Annotate vertical lines for start and end of baseline
plotAnnotations = [
    # Annotate start baseline measurement year
    {
        "xref": "x",
        "yref": "paper",
        "x": 1951,
        "y": 1.03,
        "text": "Means from 1951",
        "showarrow": False
    },
    # Annotate end baseline measurement year
    {
        "xref": "x",
        "yref": "paper",
        "x": 1980,
        "y": 1.03,
        "text": "to 1980 as Zero",
        "showarrow": False
    }
]

# Set title and label axes
plotLayout = go.Layout(
    title="Global Temperature Variations by Latitude",
    xaxis={"title": "Year"},
    # Source values are anomalies in hundredths of a degree Celsius
    yaxis={"title": "Difference from Mean (0.01 \u00B0C)"},
    shapes=plotShapes,
    annotations=plotAnnotations
)

# Plot entire figure (both interactive and to file)
plotFigure = go.Figure(data=plotData, layout=plotLayout)
iplot(plotFigure)
plot(plotFigure, filename="output/global-temperature-variations-by-latitude.html")

'file:///mnt/d/GitHub/uoi-coursera-data-mining/crs01/wk02/programming-assignment/output/global-temperature-variations-by-latitude.html'