In [None]:
import pandas as pd

pd.options.plotting.backend = "plotly"

import plotly.io as pio

pio.templates.default = "plotly_dark+presentation"

# Categorical data

In [None]:
categorical = pd.Series(["c", "a", "b", "a", "b", "b"], name="cat")
categorical

The `value_counts` method counts how often each unique value occurs in a Series. It returns a new Series that is indexed by the unique values and holds their counts.

Note that the result is sorted by count in descending order by default.


In [None]:
categorical.value_counts()
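
As a small sketch of how the default ordering can be changed: `normalize=True` returns relative frequencies instead of counts, and chaining `sort_index()` orders the result by label rather than by count. The variable names `counts`, `shares` and `by_label` are only illustrative.

```python
import pandas as pd

categorical = pd.Series(["c", "a", "b", "a", "b", "b"], name="cat")

# Default behaviour: absolute counts, most frequent value first.
counts = categorical.value_counts()

# Relative frequencies (shares that sum to 1) instead of absolute counts.
shares = categorical.value_counts(normalize=True)

# Same counts, but ordered by label instead of by frequency.
by_label = categorical.value_counts().sort_index()

print(counts)
print(shares)
print(by_label)
```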

Plotting works through the `plot` accessor of the Series, on which we then select the type of plot. In this case, we want a histogram.

In [None]:
categorical.plot.hist()

For categorical data, a horizontal layout is typically easier to read. We can flip the axes by setting the `orientation` parameter to `"h"` for "horizontal". The default is `"v"` for "vertical".

In [None]:
categorical.plot.hist(orientation="h")

# Continuous data

In [None]:
continuous = pd.Series([1.57, 0.09, 1, 2.9, 1.25, 1, 0.35, 2.3, 2.15], name="cont")
continuous

The `value_counts` method behaves in the same way as above. Note, however, that it
does little to summarize continuous data. Here, only one value occurs twice and
all others occur exactly once. Hence, the Series of value counts has just one element
fewer than the original Series, which hardly counts as a summary.

In [None]:
continuous.value_counts()

For these data, plotly chooses 3 bins by default, resulting in the bins $[0, 1)$, $[1, 2)$, $[2, 3)$. For other data, the default number of bins will be larger. However, the bins will always have equal width.

In [None]:
continuous.plot.hist().write_image("screencast/public/hist_cont.png")
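
The counts behind such a histogram can also be computed without plotting. The sketch below uses `pd.cut` with the bin edges 0, 1, 2, 3 written out by hand, which is an assumption matching the bins described above, not something read back from plotly. Setting `right=False` makes the intervals closed on the left, i.e. $[0, 1)$, $[1, 2)$, $[2, 3)$.

```python
import pandas as pd

continuous = pd.Series([1.57, 0.09, 1, 2.9, 1.25, 1, 0.35, 2.3, 2.15], name="cont")

# Assign each observation to a left-closed interval [0, 1), [1, 2) or [2, 3).
intervals = pd.cut(continuous, bins=[0, 1, 2, 3], right=False)

# Count the observations per interval and order the result by interval.
binned = intervals.value_counts().sort_index()

print(binned)
```

In contrast to `value_counts` on the raw data, this yields three counts that actually summarize the nine observations.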

# Discrete, ordered data

In [None]:
discrete = pd.Series([1, 1, 2, 5, 1, 5], name="dis")
discrete

The `value_counts` method has the same behaviour as above.

In [None]:
discrete.value_counts()

The plot now looks like the one for continuous data: there is no gap between the bars and the x-axis is continuous. Whether that makes sense for your application is up to you to decide.

In [None]:
discrete.plot.hist()

If you want each unique outcome to appear as its own bar, a simple way to get that into the visualisation is to increase the number of bins:

In [None]:
discrete.plot.hist(nbins=5)

We can achieve a result similar to the one for categorical data by making a bar chart
out of the value counts. Note that this keeps empty space for 3 and 4.

In [None]:
discrete.value_counts().plot.bar()
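
The empty space arises because 3 and 4 lie within the numeric range of the index but never occur in the data. If those outcomes should appear explicitly with a count of zero, for example in a table, one option is to reindex the value counts over the full range. The range 1 to 5 below is an assumption about which outcomes are possible.

```python
import pandas as pd

discrete = pd.Series([1, 1, 2, 5, 1, 5], name="dis")

# Counts for the observed outcomes only (1, 2 and 5).
observed = discrete.value_counts().sort_index()

# Counts for all outcomes from 1 to 5; unobserved outcomes get a count of 0.
complete = discrete.value_counts().reindex(range(1, 6), fill_value=0)

print(observed)
print(complete)
```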

Again, we can swap the axes, which is usually the most readable layout for such data.

In [None]:
discrete.value_counts().plot.bar(orientation="h")