## Heatmap Visualizations
*Written by Alexandra Rivera, Duke University, Sep 10, 2020*

> Refer to [ITP Profile Visualizations](https://colab.research.google.com/drive/1IDAnyuaYz5H54QDRjDdKRYlhgwBqSFGf?usp=sharing) for more background on this project.

In [None]:
import csv
import math
import os
import pathlib
from datetime import datetime, timedelta

import pandas as pd
from bokeh.io import output_file, output_notebook, show
from bokeh.models import (BasicTicker, ColorBar, HoverTool, LinearAxis,
                          LinearColorMapper, PrintfTickFormatter,
                          SingleIntervalTicker)
from bokeh.palettes import mpl
from bokeh.plotting import figure

# Render Bokeh plots inline in the notebook
output_notebook()

> We can pick whichever machine we want to visualize. **Choose the machine from the dropdown menu, and run the cell.**

In [None]:
machine = 'ITP41' #@param ["ITP1", "ITP2", "ITP6", "ITP8", "ITP41", "ITP48", "ITP49", "ITP86", "ITP91", "ITP92"]
itpnumber = machine
itpnumber2 = machine.lower()

> We will be working with pandas `DataFrame` objects.
For this visualization, we build a master dataframe that combines the data from each individual CSV file (the CSVs for this dataset are provided to you). Each data point (pressure and temperature) also includes the date on which it was taken.
**Run the cell to see the first few lines of `master_df`.**
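
> Each profile stores its time as a year plus a fractional day count, which the next cell turns into a `YYYY-MM-DD` string. Here is a minimal standalone sketch of that conversion. It mirrors the notebook's arithmetic (whole days added to January 1 of the given year); the function name `yearday_to_date` is hypothetical and not part of the original notebook.

```python
import math
from datetime import datetime, timedelta


def yearday_to_date(year, year_fraction):
    """Convert a year and a fractional day count to a 'YYYY-MM-DD' string.

    Follows the notebook's convention: the fractional part is dropped and
    the whole days are added to January 1 of `year`.
    """
    year_start = datetime(int(year), 1, 1)
    whole_days = math.floor(float(year_fraction))
    return (year_start + timedelta(days=whole_days)).strftime('%Y-%m-%d')


# Sample values chosen for illustration only
print(yearday_to_date(2011, 339.5))
```

> Keeping the conversion in one function makes it easy to check the day-count convention against a known profile date before building the full dataframe.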

In [None]:
# Creating dataframe for heatmap
master_df = pd.DataFrame(columns=['Temperature', 'Pressure', 'Time'])

# Importing csvs
df = pd.read_csv('https://raw.githubusercontent.com/explore-ITP/explore-itp.github.io/master/data/'+itpnumber2+'final_csvs/'+itpnumber2+'_urls.csv')
url_list = df['0'].tolist()

# The columns from the .dat files we are interested in
col_list = ['%pressure(dbar)', 'temperature(C)']

# Iterating through each CSV file to add to master dataframe
# Change the slice of url_list to make it your own!
for csvfilename in url_list[0:100]:
    # Dataframe of pressure and temperature datapoints
    df = pd.read_csv(csvfilename, skiprows=1, usecols = col_list)
    df.columns = ['Pressure', 'Temperature']
    # Dataframe of file's information (time and geographic coordinates)
    info = pd.read_csv(csvfilename, nrows=1, names=['Year', 'YearFrac', 'Long', 'Lat', 'Ndepths'])

    # Convert the year and day count to a YYYY-MM-DD date string
    # (read the values directly instead of slicing the Series' printed text)
    year_start = datetime(int(info['Year'].iloc[0]), 1, 1)
    itp_yearfraction = float(info['YearFrac'].iloc[0])
    itp_day_whole = math.floor(itp_yearfraction)
    final_date = year_start + timedelta(days=itp_day_whole)
    final_date_str = final_date.strftime('%Y-%m-%d')

    # Adds time data to dataframe
    df["Time"] = final_date_str
    target_df = df[['Temperature', 'Pressure', 'Time']]
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    master_df = pd.concat([master_df, target_df])

master_df = master_df.reset_index(drop=True)
print(master_df.head())

   Temperature  Pressure        Time
0      -1.4575       6.9  2011-12-06
1      -1.4581       7.9  2011-12-06
2      -1.4578       8.9  2011-12-06
3      -1.4577       9.9  2011-12-06
4      -1.4578      10.9  2011-12-06
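
> Growing the master dataframe inside the loop copies all accumulated rows on every iteration. A common alternative is to collect one dataframe per file in a list and concatenate once at the end. The sketch below shows that pattern on two small in-memory files. The sample values, the `salinity` column, and the helper name `load_profile` are hypothetical, and the file layout (one metadata row, then a header row, then data rows) is an assumption inferred from the `read_csv` calls above.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two profile CSVs (values are made up)
SAMPLE_FILES = [
    "2011,339.5,-140.0,75.0,2\n"
    "%pressure(dbar),temperature(C),salinity\n"
    "6.9,-1.4575,28.1\n"
    "7.9,-1.4581,28.1\n",
    "2011,340.5,-140.1,75.1,2\n"
    "%pressure(dbar),temperature(C),salinity\n"
    "6.9,-1.4600,28.2\n"
    "7.9,-1.4610,28.2\n",
]


def load_profile(text):
    """Return one profile's pressure/temperature rows tagged with year and day."""
    # First row: profile metadata
    meta = pd.read_csv(io.StringIO(text), nrows=1,
                       names=['Year', 'YearFrac', 'Long', 'Lat', 'Ndepths'])
    # Second row onward: column header plus measurements
    data = pd.read_csv(io.StringIO(text), skiprows=1,
                       usecols=['%pressure(dbar)', 'temperature(C)'])
    # Rename by name so the result does not depend on column order in the file
    data = data.rename(columns={'%pressure(dbar)': 'Pressure',
                                'temperature(C)': 'Temperature'})
    data['Year'] = int(meta['Year'].iloc[0])
    data['YearFrac'] = float(meta['YearFrac'].iloc[0])
    return data


# Collect one frame per file, then concatenate a single time
frames = [load_profile(text) for text in SAMPLE_FILES]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

> With `ignore_index=True` the result already has a clean 0..n-1 index, so no separate `reset_index` call is needed.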


> Now we will create our heatmap and corresponding colorbar using Bokeh. **Run the cell to view what we created.**

In [None]:
# Plotting heatmap 
TOOLS = "hover,save,pan,box_zoom,reset,wheel_zoom"

colors = mpl['Plasma'][11]
mapper = LinearColorMapper(
    palette=colors, low=master_df.Temperature.min(), high=master_df.Temperature.max())

# Sorting the timestamps and removing duplicates
simplified_tl = sorted(set(master_df.Time))

# Initializing the heatmap
hm = figure(title='Temperature Heatmap: ' + master_df.Time.min() + ' to ' + master_df.Time.max() + ', ' + itpnumber,
            x_range=simplified_tl, y_range=(760, 0),  # reversed range: pressure increases downward
            x_axis_type=None, width=1000, height=400,  # plot_width/plot_height were removed in Bokeh 3.0
            tools=TOOLS, toolbar_location='above')

# Adding data into the heatmap
hm.rect(x="Time", y="Pressure", width=1, height=1, source=master_df,
        fill_color={'field': 'Temperature', 'transform': mapper},
        line_color=None)

# Creating the colorbar
color_bar = ColorBar(color_mapper=mapper, major_label_text_font_size="11px",
                     ticker=BasicTicker(desired_num_ticks=13),
                     formatter=PrintfTickFormatter(format='%1.2f' '\N{DEGREE SIGN}C'),
                     label_standoff=12, border_line_color=None, location=(0, 0))
hm.add_layout(color_bar, 'right')

# Other formatting for plot
ticker = SingleIntervalTicker(interval=10, num_minor_ticks=5)
xaxis = LinearAxis(ticker=ticker)
hm.add_layout(xaxis, 'below')
hm.xaxis.axis_label = 'Time (days)'
hm.xaxis.major_label_orientation = math.pi / 3
hm.yaxis.axis_label = 'Pressure (dbar)'
hm.select_one(HoverTool).tooltips = [('Time', '@Time'),('Pressure', '@Pressure'), ('Temperature', '@Temperature')]
hm.background_fill_color = "black"

show(hm)
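
> The heatmap relies on `LinearColorMapper` to turn each temperature into one of the 11 palette colors. The sketch below illustrates the general idea of a linear mapping in plain Python: the range from `low` to `high` is split into equal-width bins, one per color, and out-of-range values are clamped to the end colors. This is a conceptual approximation, not Bokeh's actual implementation, and the function name `linear_color_index` is hypothetical.

```python
def linear_color_index(value, low, high, n_colors):
    """Return the palette index for `value` under an equal-width linear mapping."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Position of the value within the range, as a fraction from 0 to 1
    fraction = (value - low) / (high - low)
    index = int(fraction * n_colors)
    # Clamp so values at or beyond the ends use the first or last color
    return max(0, min(n_colors - 1, index))


# Illustrative range and values only
for temperature in (-2.0, 0.0, 2.0):
    print(temperature, linear_color_index(temperature, -2.0, 2.0, 11))
```

> Because `low` and `high` come from the minimum and maximum of the plotted data, a single extreme reading stretches the scale and compresses the color contrast for everything else.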