# 4 Best (Often Better) Alternatives To Histograms
## Get More Insightful Distributions
<img src='images/analysis.jpg'></img>
<figcaption style="text-align: center;">
    <strong>
        Photo by 
        <a href='https://www.pexels.com/@marketingtuig?utm_content=attributionCopyText&utm_medium=referral&utm_source=pexels'>Timur Saglambilek</a>
        on 
        <a href='https://www.pexels.com/photo/analytics-text-185576/?utm_content=attributionCopyText&utm_medium=referral&utm_source=pexels'>Pexels</a>
    </strong>
</figcaption>

### Why Histograms May Not Be the Best Option

### Refresher On Discrete And Continuous Data

Before we move on to the alternatives, here is a short refresher on data types for those who are not familiar with them.

There are two types of numeric data:
- Discrete data - any data that is recorded by counting, such as age, test scores, and sometimes individual components of time like year, weekday or month number.
- Continuous data - any data that is recorded by measuring, such as height, weight or distance. Time itself is also considered continuous. One defining aspect of continuous data is that the same data can be represented in different units of measurement. For example, distance can be measured in miles, kilometers, meters, centimeters, millimeters and the list **continues**. No matter how small the unit, a smaller unit of measurement can be found for continuous data.

> A note on money and prices: statisticians debate whether money is continuous or discrete, so I won't go into it too much. However, it is worth noting that the banking industry and tax systems treat money as continuous data.

### Setup

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

sns.set()

from empiricaldist import Pmf
from empiricaldist import Cdf

### Probability Mass Function - PMF Plots

The first alternative to histograms is a plot of a Probability Mass Function (PMF).

A Probability Mass Function takes a distribution (any sequence) of values and returns the frequency of each unique value. Consider this small distribution:

In [2]:
x = [4, 6, 5, 6, 4, 3, 2]

To compute the PMF of this distribution, we will use the `Pmf` class from the `empiricaldist` library (written by Allen B. Downey, author of well-known books such as _ThinkStats_ and _ThinkBayes_):

In [3]:
# Import the Pmf class
from empiricaldist import Pmf  # pip install empiricaldist

# Compute PMF
pmf_dist = Pmf.from_seq(x, normalize=False)
pmf_dist

|   | probs |
|---|-------|
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |


The result is a `Pmf` object (a `pandas` Series under the hood). The unique values of the passed distribution form a sorted index, and their frequencies (counts) are listed under `probs`.
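To make it concrete what the unnormalized PMF contains, the same counts can be reproduced with the standard library alone. This is only a minimal sketch of the idea, not how `empiricaldist` is implemented; the variable name `counts` is an illustrative choice.

```python
from collections import Counter

x = [4, 6, 5, 6, 4, 3, 2]

# Count how many times each unique value occurs, then order by value
# so the result mirrors the sorted index of the Pmf object
counts = dict(sorted(Counter(x).items()))

for value, count in counts.items():
    print(value, count)
```

Each printed pair corresponds to one row of the table above: the unique value and how often it appears in `x`.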

If we set `normalize` to `True`, `probs` instead contains the probability of drawing each value when we pick one element of `x` at random:

In [4]:
pmf_dist_norm = Pmf.from_seq(x, normalize=True)
pmf_dist_norm

|   | probs |
|---|----------|
| 2 | 0.142857 |
| 3 | 0.142857 |
| 4 | 0.285714 |
| 5 | 0.142857 |
| 6 | 0.285714 |
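Normalizing simply divides each count by the total number of observations, so the probabilities add up to 1. The sketch below works this out by hand with the standard library, assuming nothing about the internals of `empiricaldist`; the names `total` and `probs` are illustrative.

```python
from collections import Counter

x = [4, 6, 5, 6, 4, 3, 2]

counts = Counter(x)
total = len(x)  # 7 observations in total

# Probability of each unique value = its count / number of observations
probs = {value: count / total for value, count in sorted(counts.items())}

for value, p in probs.items():
    print(value, round(p, 6))

# The probabilities of all unique values should sum to 1
# (up to floating point rounding)
print(sum(probs.values()))
```

For example, the value 4 appears twice among 7 observations, which gives 2/7, or roughly 0.2857.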


To get the probability of any single value, we can use the bracket operator:

In [5]:
pmf_dist_norm[4]

0.2857142857142857

This was a trivial example to give you an idea of what a Probability Mass Function is. Next, I will load the `diamonds` and `tips` datasets from `seaborn`, and we will work through more examples and get to plotting the results.

### Cumulative Distribution Function (CDF) Plot

### Probability Density Function (PDF) Plot

### Swarm plot

### KDE Plot