# Making Box Plots Better with Jitter

## Getting ready


In addition to `plotly`, `numpy`, and `pandas`, make sure the `scipy` Python library is available in your Python environment.
You can install it using the command:

```
pip install scipy 
```

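For this recipe, we will create two data sets: `data1`, which holds a single normal sample, and `data2`, which stacks a normal sample and a Student's t sample under a label column.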

1. Import the Python modules `numpy` and `pandas`, then import the [`norm`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) and `t` objects from `scipy.stats`. These objects let us draw random samples from a normal distribution and a Student's t distribution, which we use to build the data sets for this recipe.

In [56]:
import numpy as np
import pandas as pd
from scipy.stats import norm, t

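2. Draw two samples of size `n = 200`, one from a standard normal distribution and one from a Student's t distribution with 3 degrees of freedom, and build the two data sets from them.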

In [80]:
n = 200
sample1 = norm(loc=0).rvs(n)
sample2 = t(df=3).rvs(n)

In [81]:
data1 = pd.DataFrame({'Normal': sample1})

In [59]:
samples =  np.concatenate( (sample1, sample2))
labels = ['Normal']*n + ['t-Student']*n 
data2 = pd.DataFrame({'Data': samples, 'Label':labels})

## How to do it

1. Import the `plotly.express` module as `px`

In [60]:
import plotly.express as px

2. Make a box plot of the `data1` data set using `go.Box` from `plotly.graph_objects`. Setting `boxpoints="all"` draws every sample point alongside the box.

In [61]:
df = data1

In [62]:
import plotly.graph_objects as go

In [63]:
fig = go.Figure()
fig.add_trace(go.Box(x=df["Normal"], 
                     notched=True, 
                     marker_color='purple', 
                     boxpoints="all",
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot Sample from a Normal Distribution', 
                  height = 500, width = 800,)

In [64]:
fig = go.Figure()
fig.add_trace(go.Box(x=df["Normal"], 
                     jitter=0.5,
                     notched=True, 
                     marker_color='coral', 
                     boxpoints="all",
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot Sample from a Normal Distribution', 
                  height = 500, width = 800,)

In [65]:
fig = go.Figure()
fig.add_trace(go.Box(x=df["Normal"], 
                     jitter=0.0,
                     notched=True, 
                     marker_color='teal', 
                     boxpoints="all",
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot Sample from a Normal Distribution', 
                  height = 500, width = 800,)
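
The last two cells differ only in `jitter`: `0.5` spreads the points out, while `0.0` collapses them onto a single line. As a rough mental model (a sketch, not Plotly's actual implementation), jitter can be thought of as a random offset added perpendicular to the data axis, drawn from a band whose width is a fraction of the box width. The helper `jitter_offsets` and its `box_width` and `seed` parameters are hypothetical names used only for illustration.

```python
import random


def jitter_offsets(n_points, jitter, box_width=1.0, seed=0):
    """Return one random perpendicular offset per point.

    Hypothetical helper, not part of Plotly: offsets are drawn uniformly
    from a band of width `jitter * box_width`, centred on zero.
    """
    if not 0.0 <= jitter <= 1.0:
        raise ValueError("jitter must be between 0 and 1")
    rng = random.Random(seed)
    half_band = jitter * box_width / 2.0
    return [rng.uniform(-half_band, half_band) for _ in range(n_points)]


no_jitter = jitter_offsets(5, jitter=0.0)
half_jitter = jitter_offsets(5, jitter=0.5)

print(no_jitter)    # every offset equals zero, so the points share one line
print(half_jitter)  # offsets fall inside the band [-0.25, 0.25]
```

Larger `jitter` values reduce overplotting at the cost of using more space beside the box, which is why a moderate value such as `0.5` is often a reasonable starting point.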

In [67]:
fig = go.Figure()
fig.add_trace(go.Box(x=df["Normal"], 
                     pointpos=0.0,
                     jitter=0.5,
                     notched=True, 
                     marker_color='green', 
                     boxpoints="all",
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot Sample from a Normal Distribution', 
                  height = 500, width = 800,)

In [68]:
df = data2

In [69]:
fig = go.Figure()

for label in df['Label'].unique():
    subdata = df[df.Label == label]

    fig.add_trace(go.Box(x=subdata["Data"], 
                        notched=True, 
                        boxpoints="all",
                        name=label
                        ))
fig.update_layout(title='Box Plots: Normal vs. t-Student Samples', 
                  height = 500, width = 800,)

In [70]:
fig = go.Figure()

colors = ['teal', 'purple']
for label, color in zip(df['Label'].unique(), colors):
    subdata = df[df.Label == label]

    fig.add_trace(go.Box(x=subdata["Data"], 
                        notched=True, 
                        boxpoints="all",
                        name=label,
                        marker_color=color, 
                        ))
fig.update_layout(title='Box Plots: Normal vs. t-Student Samples', 
                  height = 500, width = 800,)

In [71]:
fig = go.Figure()

colors = ['teal', 'purple']
for label, color in zip(df['Label'].unique(), colors):
    subdata = df[df.Label == label]

    fig.add_trace(go.Box(x=subdata["Data"], 
                        pointpos=0.0,
                        jitter=0.5,
                        notched=True, 
                        boxpoints="all",
                        name=label,
                        marker_color=color, 
                        ))
fig.update_layout(title='Box Plots: Normal vs. t-Student Samples', 
                  height = 500, width = 800,)

In [73]:
fig = go.Figure()

colors = ['teal', 'purple']
for label, color in zip(df['Label'].unique(), colors):
    subdata = df[df.Label == label]

    fig.add_trace(go.Box(x=subdata["Data"], 
                        jitter=0.0,
                        notched=True, 
                        boxpoints="all",
                        name=label,
                        marker_color=color, 
                        ))
fig.update_layout(title='Box Plots: Normal vs. t-Student Samples', 
                  height = 500, width = 800,)

## There is more

In [134]:
n = 500
sample = norm(loc=0).rvs(n)
data1 = pd.DataFrame({'Sample': sample})

In [113]:
df = data1

In [136]:
fig = go.Figure()
fig.add_trace(go.Box(x=df["Sample"], 
                    #  boxmean=True,
                    #  pointpos=0.0,
                    #  jitter=0.5,
                     notched=True, 
                     marker_color='green', 
                     boxpoints="outliers",
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot', 
                  height = 500, width = 800,)
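
With `boxpoints="outliers"`, only the points lying beyond the whiskers are drawn. A common convention, which Plotly's documentation describes for its default whiskers, places the fences 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule using only the standard library; `tukey_outliers` is a hypothetical helper, and because quartile conventions differ between libraries, its numbers may not match Plotly's exactly.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles.

    Hypothetical helper for illustration; quartiles use the "inclusive"
    method of `statistics.quantiles`.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


sample = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 4.0]
outliers = tukey_outliers(sample)
print(outliers)  # [4.0]
```

Here the quartiles are -0.05 and 0.4, so the fences sit at -0.725 and 1.075, and only the value 4.0 falls outside them.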

In [142]:
fig = go.Figure()
fig.add_trace(go.Box(x=df["Sample"], 
                     boxmean=True,
                     notched=True, 
                     marker_color='green', 
                     boxpoints="outliers",
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot', 
                  height = 500, width = 800,)

In [143]:
fig = go.Figure()
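# Note: per the Plotly reference, `lowerfence` and `upperfence` take effect only
# together with precomputed `q1`/`median`/`q3` values. With a raw sample in `y`,
# the fences are computed from the data, so these two arguments should have no
# effect here.
fig.add_trace(go.Box(y=df["Sample"], 
                     lowerfence=[-2.],
                     upperfence=[2.],
                     boxmean=True,
                     notchwidth=0.5,
                     notched=True, 
                     name='Normal',
                     ))
fig.update_layout(title='Box Plot', 
                  height = 700, width = 500,)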