# Tutorial 1: Plot data in Watson Studio

In this tutorial you will:  
1. Upload external data to IBM Cloud Object Storage
2. Load this data into the Watson Studio environment
3. Create an interactive time series plot of the data using Plotly

# Preparatory steps

### Toggle here to run on Watson Studio or locally

In [27]:
# Set to True when running in Watson Studio, False when running locally
running_watson_studio = False

### First, upload sample data to IBM Cloud

1. Locate the data file "sample_climate_data.csv" in the repository folder: c2ma-tutorials/sample-data/sample_climate_data.csv.

2. Download this file to your local machine. 
  
3. Upload the file to your Watson Studio project assets: from the main project page, navigate to Assets -> Data assets and click "New Data asset +". Drag and drop the file and wait for the upload to complete.


### Set up the Watson Studio project token

Replace the project ID and access token with the values for your Watson Studio project, as described in the workshop setup instructions [here](https://github.com/C2MA-workshop/c2ma-docs).

In [12]:
# @hidden_cell
# The project token is an authorization token used to access project resources such as data sources and connections, and is used by platform APIs.
if running_watson_studio:
    from project_lib import Project
    project = Project(project_id='XXXX', project_access_token='XXXX')
    pc = project.project_context

# Load and plot the sample data

### Load the required libraries  

In [23]:
import numpy as np
import pandas as pd
import math
import plotly.graph_objects as go
from plotly.subplots import make_subplots

### Load the data from the project assets

In [68]:
if running_watson_studio:
    my_file = project.get_file("sample_climate_data.csv") 
    my_file.seek(0)
    df = pd.read_csv(my_file)

### (Alternative version to load data from local storage)

In [69]:
if not running_watson_studio:
    df = pd.read_csv("./sample-data/sample_climate_data.csv")

In [70]:
df.head()

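|   | day | month | year | longitude | latitude | rainfall | temperature | datetime |
|---|-----|-------|------|-----------|----------|----------|-------------|----------|
| 0 | 1 | 1 | 2001 | 29.9 | -22.27 | 0.41736 | 29.974467 | 2001-01-01 |
| 1 | 1 | 1 | 2001 | 30.5 | -22.97 | 1.939459 | 24.973081 | 2001-01-01 |
| 2 | 1 | 1 | 2001 | 31.59 | -24.99 | 38.258316 | 24.669413 | 2001-01-01 |
| 3 | 1 | 1 | 2002 | 29.9 | -22.27 | 0.117372 | 27.76304 | 2002-01-01 |
| 4 | 1 | 1 | 2002 | 30.5 | -22.97 | 0.091737 | 23.332628 | 2002-01-01 |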


In [71]:
df['timestamp'] = pd.to_datetime(df['datetime'])

In [72]:
df.head()

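|   | day | month | year | longitude | latitude | rainfall | temperature | datetime | timestamp |
|---|-----|-------|------|-----------|----------|----------|-------------|----------|-----------|
| 0 | 1 | 1 | 2001 | 29.9 | -22.27 | 0.41736 | 29.974467 | 2001-01-01 | 2001-01-01 |
| 1 | 1 | 1 | 2001 | 30.5 | -22.97 | 1.939459 | 24.973081 | 2001-01-01 | 2001-01-01 |
| 2 | 1 | 1 | 2001 | 31.59 | -24.99 | 38.258316 | 24.669413 | 2001-01-01 | 2001-01-01 |
| 3 | 1 | 1 | 2002 | 29.9 | -22.27 | 0.117372 | 27.76304 | 2002-01-01 | 2002-01-01 |
| 4 | 1 | 1 | 2002 | 30.5 | -22.97 | 0.091737 | 23.332628 | 2002-01-01 | 2002-01-01 |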
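
The conversion above turns the `datetime` strings into proper pandas timestamps, which is what lets the plot use a real time axis. As a minimal sketch, assuming a small hand-made frame (`sample` is hypothetical and only reuses the tutorial's column names), the same dates could also be assembled from the separate `year`, `month` and `day` columns:

```python
import pandas as pd

# Hypothetical three-row sample using the tutorial's column names (not the real file).
sample = pd.DataFrame({
    "day": [1, 1, 2],
    "month": [1, 1, 1],
    "year": [2001, 2002, 2001],
    "datetime": ["2001-01-01", "2002-01-01", "2001-01-02"],
})

# Parse the string column, as in the cell above.
sample["timestamp"] = pd.to_datetime(sample["datetime"])

# Alternative: build the dates from the year/month/day columns.
from_parts = pd.to_datetime(sample[["year", "month", "day"]])

# Both routes should give the same dates.
print((from_parts == sample["timestamp"]).all())
```

Either route gives a datetime column; parsing the existing string column is the simpler choice when it is already present, as it is here.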


### There are three locations contained in the file

In [73]:
locations = df[['latitude','longitude']].value_counts().to_frame().reset_index()[['latitude', 'longitude']]

In [74]:
locations

|   | latitude | longitude |
|---|----------|-----------|
| 0 | -24.99 | 31.59 |
| 1 | -22.97 | 30.5 |
| 2 | -22.27 | 29.9 |
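
The `value_counts()` chain above is one way to list the distinct coordinate pairs. A possible alternative, sketched here on a hypothetical frame (`sample`) that reuses the three coordinate pairs shown above, is `drop_duplicates()` followed by an explicit sort so that the row order is predictable:

```python
import pandas as pd

# Hypothetical sample reusing the coordinate pairs from the tutorial data.
sample = pd.DataFrame({
    "latitude": [-22.27, -22.97, -24.99, -22.27, -22.97],
    "longitude": [29.9, 30.5, 31.59, 29.9, 30.5],
})

# Keep one row per (latitude, longitude) pair, sorted by latitude.
unique_locations = (
    sample[["latitude", "longitude"]]
    .drop_duplicates()
    .sort_values("latitude")
    .reset_index(drop=True)
)
print(unique_locations)
```

Sorting explicitly means the index of each location does not depend on how often it occurs in the file.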


### Select the location and variable to be plotted

In [75]:
lat = locations.loc[0, 'latitude']
lon = locations.loc[0, 'longitude']
print(lat, lon)
var = "rainfall"
unit = "mm"

-24.99 31.59
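
Because `lat` and `lon` are read straight from the `locations` table, an exact equality test against the data matches these values. If the coordinates were typed in by hand or computed, exact floating-point comparison might miss rows. The sketch below shows a tolerance-based selection on a hypothetical frame; `select_location` and `tol` are illustrative names, not part of the tutorial:

```python
import numpy as np
import pandas as pd

# Hypothetical sample using the tutorial's column names.
sample = pd.DataFrame({
    "latitude": [-22.27, -22.97, -24.99, -24.99],
    "longitude": [29.9, 30.5, 31.59, 31.59],
    "rainfall": [0.4, 1.9, 38.3, 0.1],
})

def select_location(frame, lat, lon, tol=1e-6):
    """Return the rows whose coordinates are within `tol` degrees of (lat, lon)."""
    mask = (
        np.isclose(frame["latitude"], lat, rtol=0, atol=tol)
        & np.isclose(frame["longitude"], lon, rtol=0, atol=tol)
    )
    return frame.loc[mask]

subset = select_location(sample, -24.99, 31.59)
print(len(subset))
```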


In [76]:
# Select the time series at the chosen location, keeping the timestamps for the x-axis.
# The rows in the file are not in chronological order (see df.head()), so sort by time.
dfplot = df.loc[(df['latitude']==lat) & (df['longitude']==lon), ['timestamp', var]]
dfplot = dfplot.sort_values('timestamp')

infostr = ' for location: ' + str(lat) + ' N, ' + str(lon) + ' E'

fig = make_subplots(rows=1, cols=1, shared_xaxes=True,
                   subplot_titles = [var + infostr],
                   vertical_spacing = 0.05)
fig.add_trace(
    go.Scatter(x=dfplot['timestamp'], y=dfplot[var], showlegend=False),
    row=1, col=1)

fig.update_layout(
    autosize=False,
    width=800,
    height=900)
fig.update_yaxes(title_text= var + " " + unit, row=1, col=1)
fig.for_each_yaxis(lambda axis: axis.title.update(font=dict(size=12)))
fig.show()

### Author and license

Anne Jones is a Research Staff Member at IBM Research, specialising in AI for Climate Risk and Impacts. 

Copyright © 2021 IBM. This notebook and its source code are released under the terms of the MIT License.