# Plotting
We will be plotting using both `plotly` and `plotly express`.

The `plotly express` API can be seen here:
https://plot.ly/python/plotly-express/

Most `plotly express` plots have the form:
```python
px.SOMEPLOT(df, x="VAR0", y="VAR1")
```
where `df` is a `pandas` DataFrame, and `VAR0` and `VAR1` are strings referring to the column names in `df` that you wish to plot.

Base `plotly` examples can be found here:
https://plot.ly/python/

In base `plotly` the syntax is different: a figure builds on a `json` structure, so one has to think in terms of dictionaries and lists to follow the `plotly` logic.
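```python
import numpy as np
import plotly.graph_objects as go

# Example data: two random series over a shared x-axis
random_x = np.linspace(0, 1, 100)
random_y0 = np.random.randn(100) + 5
random_y1 = np.random.randn(100)

fig = go.Figure()
fig.add_trace(go.Scatter(x=random_x, y=random_y0,
                    mode='lines',
                    name='lines'))
fig.add_trace(go.Scatter(x=random_x, y=random_y1,
                    mode='lines+markers',
                    name='lines+markers'))
```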
where each `add_trace` adds an entry to the `json` that is constructed for the plot.

## Load libraries
Import `plotly.express`. Use the abbreviations shown in the slides:

In [109]:
import plotly.express as px

## Loading data
The dataset `titanic_train.csv` will be used in the first exercises. Start by loading it into a variable called `df`.

In [110]:
import pandas as pd

In [111]:
#ANS
df = pd.read_csv('../data/titanic_train.csv')
df.head()

## Correlation matrix
Create a correlation matrix showing the correlation between the features, and plot it as a heatmap using `go.Heatmap`.

HINT: A correlation matrix can be made by applying the `corr` method on a `pandas` dataframe [Link](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.corr.html)
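
As a small illustration of the hint, the sketch below applies `corr` to a made-up DataFrame (the columns `a`, `b`, `c` and `label` are placeholders). Non-numeric columns are dropped first with `select_dtypes`, since correlations are only defined for numeric data.

```python
import pandas as pd

toy = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],              # increases with a
    "c": [4, 3, 2, 1],              # decreases as a increases
    "label": ["w", "x", "y", "z"],  # non-numeric, excluded below
})

# Keep numeric columns only, then compute pairwise Pearson correlations
corr = toy.select_dtypes(include="number").corr()
print(corr)
```

The result is a square DataFrame whose index and columns are both the numeric column names, which is the shape a heatmap expects.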

In [112]:
#ANS
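import plotly.graph_objects as go

# Restrict to numeric columns; text columns such as names cannot be correlated
corr = df.select_dtypes(include="number").corr()
go.Figure(data=[
    go.Heatmap(
        x=corr.columns,  # x-axis labels
        y=corr.index,    # y-axis labels
        z=corr.values,   # matrix of correlation coefficients
        colorscale='Viridis')
])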

## Scatter plot
Scatter plots are quite useful for visualizing data. Load the `iris.csv` dataset and plot the first two features in a scatter plot using `px.scatter`.

In [113]:
#ANS

# Load data
iris = pd.read_csv('../data/iris.csv', header = 0, sep = ";", index_col = 0, decimal = ",")

# Plot
px.scatter(iris,x = "Sepal.Length", y = "Sepal.Width", color = "Species")

## Barplot

The barplot takes a category as `x` and a numeric column as `y`. If more than one numeric value exists per category, the barplot stacks the entries so that the bar height is their `sum`, while keeping each observation separated by a thin line.

```python
tips = px.data.tips()
fig = px.bar(tips, x="sex", y="total_bill", color="smoker", barmode="group")
fig.show()
```

Create a barplot of the survival probability for men and women. Use `px.bar`.

In [114]:
# Average only the Survived column; text columns cannot be averaged
df_survive = df.groupby("Sex", as_index=False)["Survived"].mean()
# Write your implementation here


In [115]:
#ANS
fig = px.bar(df_survive, x="Sex", y="Survived")
fig.show()

This involves some data-wrangling, which we might want to avoid.

The same can be achieved with `px.histogram` by passing the string `"avg"` as the `histfunc` argument.

In [116]:
#ANS
px.histogram(df, x="Sex", y="Survived", histfunc="avg")

Try making a histogram of these data using `Age` and `Sex` as the variables.

In [117]:
#ANS
px.histogram(df, x="Age", color="Sex")

Create a plot where the count of each `Survived` value is shown on the y-axis for both men and women. With plotly express, `px.histogram` without a `y` argument counts the rows.

### Pokemon
Load the Pokemon dataset from `Pokemon.csv`.

In [118]:
#ANS
poke = pd.read_csv('../data/Pokemon.csv', encoding = 'latin1')

Create a histogram showing the mean `Attack` for each `Type 1`.

In [119]:
#ANS
px.histogram(poke,y="Attack",x="Type 1", histfunc="avg")



## Boxplot
We will do a couple of boxplots based on the Pokemon dataset.

Try to create a boxplot of the following stats: Attack, Defense and Speed. If you are unfamiliar with boxplots, you can read more about them here: https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51
Otherwise you can skip to the next exercise.

In [120]:
# Copy Type 2, labelling Pokemon without a second type as "Typeless"
poke["Type 3"] = poke["Type 2"].fillna("Typeless")

In [121]:
#ANS
px.box(poke, x="Type 1", y="Attack", color="Type 3", notched=True)

## Create a facet histogram

Use the option `facet_row` along with `category_orders` to examine the probability of survival on the Titanic across social classes, using the variable `Pclass`.

In [122]:
#ANS
px.histogram(df, x="Sex", y="Survived", histfunc="avg", barmode="group",
             facet_row="Pclass", category_orders={"Pclass": [1, 2, 3]})


## Time series plot
Load the `sp500.csv` dataset and make a time series plot of the closing price. Try using the candlestick plot:

```
go.Candlestick(
    x=...,
    open=..., high=...,
    low=..., close=...,
    increasing_line_color=color_1,
    decreasing_line_color=color_2
)
```
An alternative is to call the `plot()` method on an individual column of a DataFrame in which the dates are loaded as the index column. Use the `parse_dates` argument to do this.
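
Loading dates as an index can be sketched on a small in-memory CSV. The file content and the column names `Date` and `Close` below are assumptions for illustration; the real `sp500.csv` may be laid out differently.

```python
import io
import pandas as pd

# Invented two-row CSV standing in for a price file
csv_text = "Date,Close\n2020-01-02,10.0\n2020-01-03,11.5\n"

# parse_dates converts the column to datetimes; index_col makes it the index
toy = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")
print(toy.index)
```

With a datetime index in place, `toy.index` can be passed directly as the `x` argument of a trace.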

In [123]:
#ANS

# Load data
sp = pd.read_csv('../data/sp500.csv', parse_dates = ['Date'], index_col=0)

go.Figure(data=[go.Candlestick(
    x=sp.index,
    open=sp['Open'], high=sp['High'],
    low=sp['Low'], close=sp['Close'],
    increasing_line_color= 'cyan', decreasing_line_color= 'gray'
)])

Another approach to plotting time series data is to use the function `px.line` in plotly express.

Try to plot the sp500 dataset using this function. 

Try using both `sp` and `sp_melt`.

In [124]:
sp_melt = pd.melt(sp.reset_index(), id_vars="Date")

sp_melt = sp_melt[sp_melt.variable.isin(["Open","High","Low"])]

sp = sp.reset_index()

In [125]:
#ANS
px.line(sp,x="Date",y="Open")

px.line(sp_melt, x="Date", y="value", color="variable")

# Subplots
Now you will create subplots using the `make_subplots` function and the brain scan datasets.

### Brain scans 

#### DWI

Diffusion-weighted imaging (DWI) is a form of MR imaging based upon measuring the random Brownian motion of water molecules within a voxel of tissue. In simplified terms, highly cellular tissues or those with cellular swelling exhibit lower diffusion coefficients. Diffusion is particularly useful in tumor characterization and cerebral ischemia.

#### PWI


Perfusion-weighted imaging (PWI) is a term used to denote a variety of MRI techniques able to give insights into the perfusion of tissues by blood.


HINTS:
- Create the basis of the plots using `make_subplots`
- Then add the plots 1 by 1 using `fig.add_trace`
- Use the `go.Contour` function - which takes a matrix as `z` value 
```python
    fig = make_subplots(rows=1, cols=2)
    fig.add_trace( go.Contour(z=matrix) )
```

In [126]:
from plotly.subplots import make_subplots

pwi=pd.read_csv("../data/brain/pwi.csv",header=None)
dwi=pd.read_csv("../data/brain/dwi.csv", header=None)


In [127]:
#ANS
fig = make_subplots(rows=1, cols=2)

fig.add_trace(
    go.Contour(
        z=dwi.values,
        colorscale='Viridis' # Electric
    ),
    row=1, col=1
)


fig.add_trace(
    go.Contour(
        z=pwi.values,
        colorscale='Viridis', # Electric
        coloraxis=None,
        showscale=False
    ),
    row=1, col=2
)

fig.show()

# Bonus exercises

Make a surface plot of the `dwi` data using `go.Surface`.

In [128]:
#ANS
go.Figure(data=[go.Surface(z=dwi.values)])

## `if time_left > 0:`
Go to the webpage below and follow the tutorial that you find most relevant:

https://plot.ly/python/

In [129]:
#CONFIG
from IPython.display import HTML
# Hide code tagged with #ANS
HTML('''<script>
function code_hide() {
    var cells = IPython.notebook.get_cells()
    cells.forEach(function(x){ if(x.get_text().includes("#ANS")){
        if (x.get_text().includes("#CONFIG")){

        } else{
            x.input.hide()
            x.output_area.clear_output()
        }

        
    }
    })
}
function code_hide2() {
    var cells = IPython.notebook.get_cells();
    cells.forEach(function(x){
    if( x.cell_type != "markdown"){
        x.input.show()      
    }
    
        });
} 
$( document ).ready(code_hide);
$( document ).ready(code_hide2);
</script>
<form action="javascript:code_hide()"><input type="submit" value="Hide answers"></form>
<form action="javascript:code_hide2()"><input type="submit" value="Show answers"></form>''')