# Making Histograms

## Getting ready


In addition to `plotly`, `numpy` and `pandas`, make sure the `scipy` Python library is available in your Python environment.
You can install it using the command:

```
pip install scipy 
```

For this recipe, we will create two data sets:

1. Import the Python modules `numpy` and `pandas`, and import the [`norm`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) object from `scipy.stats`. This object allows us to generate random samples from a normal distribution, which we will use to build the data sets for this recipe.

In [1]:
import numpy as np
import pandas as pd
from scipy.stats import norm

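2. Create the two data sets to be used in this recipe. `data1` holds a single sample of 400 draws from a standard normal distribution. `data2` stacks that sample with a second one, drawn from a normal distribution with mean 3 and standard deviation 0.5, and labels each value with the sample it came from.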

In [2]:
n = 400
sample1 = norm().rvs(n)
sample2 = norm(loc=3, scale=0.5).rvs(n)

In [3]:
data1 = pd.DataFrame({'Normal': sample1})

In [4]:
samples = np.concatenate((sample1, sample2))
labels = ['Sample 1'] * n + ['Sample 2'] * n
data2 = pd.DataFrame({'Data': samples, 'Label': labels})
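
Because the samples are random, every run of the cells above produces slightly different data and therefore slightly different charts. The following is a minimal sketch of a reproducible variant, assuming you want identical draws on each run. The helper name `make_data` and the seed value are illustrative choices, not part of the recipe. It relies on the `random_state` argument of `rvs`.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


def make_data(n=400, seed=0):
    """Build the two data frames of this recipe from seeded random draws."""
    rng = np.random.default_rng(seed)  # the seed value is an arbitrary choice
    sample1 = norm().rvs(size=n, random_state=rng)
    sample2 = norm(loc=3, scale=0.5).rvs(size=n, random_state=rng)

    data1 = pd.DataFrame({'Normal': sample1})
    data2 = pd.DataFrame({
        'Data': np.concatenate((sample1, sample2)),
        'Label': ['Sample 1'] * n + ['Sample 2'] * n,
    })
    return data1, data2


data1, data2 = make_data()
print(data1.shape, data2.shape)
# Quick sanity check: the group means should sit near 0 and 3.
print(data2.groupby('Label')['Data'].agg(['mean', 'std']))
```

Calling `make_data` twice with the same seed returns the same values, which makes it easier to compare charts while you tune their appearance.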

## How to do it

1. Import the `plotly.express` module as `px`

In [5]:
import plotly.express as px

2. Make a simple histogram to show the distribution of the values in the `data1` data set using the function `histogram`

In [6]:
# df = px.data.tips()
# fig = px.histogram(df, x="total_bill")
# fig.show()

In [7]:
df = data1
fig = px.histogram(df, x="Normal")
fig.show()

3. Add a title to your chart by passing a string to the `title` input of the function `histogram`.
4. Customise the size of the figure using the inputs `height` and `width`. Both must be integers and give the size of the figure in pixels.

In [8]:
# fig = px.histogram(df, x="total_bill", 
#                    height = 500, width = 800,
#                    title='Total Bill Distribution')
# fig.show()

In [9]:
fig = px.histogram(df, x="Normal", 
                   height = 500, width = 800,
                   title='Sample from a Normal Distribution')
fig.show()

In [10]:
fig = px.histogram(df, x="Normal", 
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Sample from a Normal Distribution')
fig.show()
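
The cell above adds `histnorm='probability density'`, which rescales the bars so that their areas, rather than their heights, add up to 1. The sketch below reproduces that normalisation by hand with `numpy.histogram`, to show what the bar heights mean. The seeded sample and the choice of 25 equal-width bins are assumptions made for illustration, and Plotly may pick different bin edges for the same data.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed
sample = rng.normal(size=400)        # stand-in for the 'Normal' column

counts, edges = np.histogram(sample, bins=25)
widths = np.diff(edges)

# Probability density: the share of points in each bin divided by the bin width.
density = counts / (counts.sum() * widths)

# The bar areas (height * width) add up to 1.
total_area = float(np.sum(density * widths))
print(round(total_area, 6))
```

This scaling is what allows a histogram to be compared directly with a probability density function, since both are measured in the same units.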

In [11]:
fig = px.histogram(df, x="Normal", 
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Sample from a Normal Distribution')
fig.show()
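
The cell above adds `nbins=25` to control how finely the data is binned. As a rough illustration of the trade-off, the sketch below bins one sample with several bin counts using `numpy.histogram`. The helper `equal_width_bins` is a hypothetical name introduced here. Note that NumPy uses exactly the requested number of equal-width bins, whereas Plotly, as far as its documentation indicates, treats the requested number as an upper bound while choosing a convenient bin size, so the two will not always agree.

```python
import numpy as np


def equal_width_bins(values, bins):
    """Return (bin width, counts) for `bins` equal-width bins over the data range."""
    counts, edges = np.histogram(values, bins=bins)
    return float(edges[1] - edges[0]), counts


rng = np.random.default_rng(0)       # arbitrary seed
sample = rng.normal(size=400)        # stand-in for the 'Normal' column

for bins in (10, 25, 50):
    width, counts = equal_width_bins(sample, bins)
    print(f"{bins:>2} bins: width={width:.3f}, tallest bar={counts.max()}")
```

Fewer bins give a smoother but coarser picture, while more bins reveal detail at the cost of noisier bars.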

5. Customise the color of the bars using the input `color_discrete_sequence` as follows. Note that we have to pass a list of strings, where each string corresponds to a color. In this case, we pass a list with the single color `teal`.

In [12]:
fig = px.histogram(df, x="Normal", 
                   color_discrete_sequence=['teal'],
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Sample from a Normal Distribution')
fig.show()

In [13]:
fig = px.histogram(df, x="Normal", 
                   opacity=0.75,
                   color_discrete_sequence=['teal'],
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Sample from a Normal Distribution')
fig.show()

In [14]:
fig = px.histogram(df, x="Normal", 
                   cumulative=True,
                   opacity=0.75,
                   color_discrete_sequence=['teal'],
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Sample from a Normal Distribution')
fig.show()
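
The two cells above add `opacity=0.75`, which makes the bars slightly transparent, and then `cumulative=True`, which makes each bar show the running total of the binned values up to that bin instead of the value of its own bin. The sketch below illustrates the idea of a cumulative histogram with `numpy.cumsum`. It is a conceptual illustration on an assumed seeded sample, and it does not attempt to reproduce how Plotly scales the bars when `cumulative=True` is combined with `histnorm`.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed
sample = rng.normal(size=400)        # stand-in for the 'Normal' column

counts, edges = np.histogram(sample, bins=25)

# Running total of points at or below the right edge of each bin.
cumulative_counts = np.cumsum(counts)

# The same running total expressed as a share of all points.
cumulative_share = cumulative_counts / counts.sum()

print(cumulative_counts[-1], cumulative_share[-1])
```

The running total never decreases, and its final value accounts for every point in the sample.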

In [15]:
df = data2

In [16]:
fig = px.histogram(df, x='Data',
                   color='Label', 
                   opacity=0.75,
                #    color_discrete_sequence=['teal'],
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Samples from Two Normal Distributions')
fig.show()

In [17]:
fig = px.histogram(df, x='Data',
                   color='Label', 
                   opacity=0.75,
                   barmode="overlay",
                   color_discrete_sequence=['teal', 'pink'],
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Samples from Two Normal Distributions')
fig.show()

In [18]:
fig = px.histogram(df, x='Data',
                   color='Label', 
                   barmode="overlay",
                   opacity=0.5,
                   color_discrete_sequence=px.colors.qualitative.Prism,
                   nbins=25,
                   histnorm='probability density',
                   height = 500, width = 800,
                   title='Samples from Two Normal Distributions')
fig.show()