<h1>Counting values with <code>adviz</code></h1>

## Counting values

> Count the unique values in a collection of items

* Count in absolute and proportion (percentage) terms
* Show the cumulative count and percent
* Determine the number of top items to show
* Show all remaining items lumped into one item named `Others:`
* Change the name of the items, as well as the caption and various other styling options
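
To make the columns listed above concrete, here is a minimal standard-library sketch of the same bookkeeping: absolute counts, running totals, and both expressed as proportions. It does not use `adviz`; the helper name `count_table` is hypothetical and only illustrates the arithmetic behind such a table.

```python
from collections import Counter


def count_table(items):
    """Return rows of (value, count, cum_count, perc, cum_perc), most frequent first."""
    total = len(items)
    rows, running = [], 0
    for value, count in Counter(items).most_common():
        running += count
        rows.append((value, count, running, count / total, running / total))
    return rows


rows = count_table(["red", "blue", "red", "green", "red", "blue"])
for row in rows:
    print(row)
```

The last row's cumulative proportion is always 1.0, which is a quick sanity check on any table of this kind.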


## Installation and usage

#### Installation:

```bash
python3 -m pip install adviz
```

#### Usage:

```python
import adviz
adviz.value_counts_plus([item_1, item_2, ... item_n], ...)
```

## Usage

Generate a random list of 10,000 color names, then append 240 missing values (`NaN`), for 10,240 items in total:

In [None]:
#| echo: true
#| code-fold: true
import random
import numpy as np
import matplotlib as mpl

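# All named matplotlib colors
colors = list(mpl.colors.cnames.keys())
# Sample 10,000 names; the 4 weights are repeated 37 times so each color name gets one weight
colors = random.choices(colors, weights=[0.9, 0.04, 0.05, 0.09]*37, k=10_000)
# Append 240 missing values
colors += [np.nan] * 240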
colors[:20]

['mediumaquamarine',
 'linen',
 'mediumslateblue',
 'papayawhip',
 'palegoldenrod',
 'lightgreen',
 'rebeccapurple',
 'lightgreen',
 'darkred',
 'blanchedalmond',
 'cyan',
 'mediumslateblue',
 'mediumseagreen',
 'darkslategray',
 'dodgerblue',
 'linen',
 'lightsteelblue',
 'dodgerblue',
 'red',
 'lightcoral']

In [None]:
from functools import partial
import adviz
# Pre-set size=14 as the default for every value_counts_plus call that follows
adviz.value_counts_plus = partial(adviz.value_counts_plus, size=14)

## View default output

In [None]:
#| echo: true
import adviz
adviz.value_counts_plus(colors)

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,fuchsia,238,1484,2.3%,14.5%
7,plum,238,1722,2.3%,16.8%
8,salmon,235,1957,2.3%,19.1%
9,mediumslateblue,235,2192,2.3%,21.4%
10,coral,233,2425,2.3%,23.7%


## Change the number of top values: `show_top=n`

In [None]:
#| echo: true
adviz.value_counts_plus(colors, show_top=5)

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,Others:,8994,10240,87.8%,100.0%
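
The `Others:` row is simply what remains once the top items are taken out of the total. A minimal sketch of that arithmetic, using the counts from the table above (the helper `lump_others` is hypothetical and not part of `adviz`):

```python
def lump_others(counts, total, label="Others:"):
    """Append one row holding everything not covered by `counts`.

    `counts` is a list of (value, count) pairs for the top items and
    `total` is the number of items in the whole collection.
    """
    shown = sum(count for _, count in counts)
    return counts + [(label, total - shown)]


top5 = [("sienna", 265), ("linen", 251), ("darkgray", 246),
        ("darkslategray", 244), (None, 240)]  # None stands in for the missing values
rows = lump_others(top5, total=10_240)
print(rows[-1])                 # ('Others:', 8994)
print(f"{8994 / 10_240:.1%}")   # 87.8%
```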


## Remove styling: `style=False`

In [None]:
#| echo: true
#| body-width: 200px
adviz.value_counts_plus(colors, show_top=5, style=False)

Unnamed: 0,data,count,cum_count,perc,cum_perc
0,sienna,265,265,0.025879,0.025879
1,linen,251,516,0.024512,0.050391
2,darkgray,246,762,0.024023,0.074414
3,darkslategray,244,1006,0.023828,0.098242
4,,240,1246,0.023438,0.12168
5,Others:,8994,10240,0.87832,1.0


## Styled vs non-styled tables
* Styled tables number rows from 1, which reads more naturally for a non-technical audience
* Non-styled tables keep 0-based indexing for further processing
* Styled tables display more readable column headers (for example `cum. %` instead of `cum_perc`)

In [None]:
#| layout-ncol: 2
display(adviz.value_counts_plus(colors, show_top=5))
display(adviz.value_counts_plus(colors, show_top=5, style=False))

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,Others:,8994,10240,87.8%,100.0%


Unnamed: 0,data,count,cum_count,perc,cum_perc
0,sienna,265,265,0.025879,0.025879
1,linen,251,516,0.024512,0.050391
2,darkgray,246,762,0.024023,0.074414
3,darkslategray,244,1006,0.023828,0.098242
4,,240,1246,0.023438,0.12168
5,Others:,8994,10240,0.87832,1.0
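
The two tables hold the same numbers; the styled one only changes how they are presented. As a rough illustration (plain Python string formatting, not the styling code `adviz` itself uses), the raw proportions map to the displayed percentages like this:

```python
# Raw proportions from the non-styled table above.
perc = [0.025879, 0.024512, 0.024023, 0.023828, 0.023438, 0.87832]

# One decimal place, shown as a percentage.
formatted = [f"{p:.1%}" for p in perc]
print(formatted)

# Shift a 0-based row position to a 1-based label for display.
for position, text in enumerate(formatted, start=1):
    print(position, text)
```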


## Change the size of the table: `size`

In [None]:
#| echo: true
adviz.value_counts_plus(colors, size=5)

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,fuchsia,238,1484,2.3%,14.5%
7,plum,238,1722,2.3%,16.8%
8,salmon,235,1957,2.3%,19.1%
9,mediumslateblue,235,2192,2.3%,21.4%
10,coral,233,2425,2.3%,23.7%


## Change the size of the table: `size` (larger value)

In [None]:
#| echo: true
adviz.value_counts_plus(colors, size=20)

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,fuchsia,238,1484,2.3%,14.5%
7,plum,238,1722,2.3%,16.8%
8,salmon,235,1957,2.3%,19.1%
9,mediumslateblue,235,2192,2.3%,21.4%
10,coral,233,2425,2.3%,23.7%


## Sort `Others:` with the other rows: `sort_others=True`
* Are the remaining values significant when taken together?
* Where would `Others:` rank if it were sorted by count along with the top items?

In [None]:
#| echo: true
#| incremental: false
adviz.value_counts_plus(colors, sort_others=True)

Unnamed: 0,data,count,cum. count,%,cum. %
1,Others:,7815,7815,76.3%,76.3%
2,sienna,265,8080,2.6%,78.9%
3,linen,251,8331,2.5%,81.4%
4,darkgray,246,8577,2.4%,83.8%
5,darkslategray,244,8821,2.4%,86.1%
6,,240,9061,2.3%,88.5%
7,fuchsia,238,9299,2.3%,90.8%
8,plum,238,9537,2.3%,93.1%
9,salmon,235,9772,2.3%,95.4%
10,mediumslateblue,235,10007,2.3%,97.7%
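
One way to picture this option: treat `Others:` as an ordinary row and sort it by count with the rest. The sketch below is an assumption about the behavior, based on the output above, and not `adviz`'s implementation; the sample rows reuse counts from that output.

```python
rows = [("sienna", 265), ("linen", 251), ("darkgray", 246), ("Others:", 7815)]

# Sorted layout: every row, including "Others:", ordered by count (descending).
sorted_rows = sorted(rows, key=lambda row: row[1], reverse=True)
print(sorted_rows[0])  # ('Others:', 7815)

# Running totals follow the new order.
running = 0
for value, count in sorted_rows:
    running += count
    print(value, count, running)
```

The running totals (7815, 8080, 8331, 8577) match the first four `cum. count` values in the table above.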


## Change the theme of the table: `background_gradient`

In [None]:
#| echo: true
adviz.value_counts_plus(colors, background_gradient='magma')

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,fuchsia,238,1484,2.3%,14.5%
7,plum,238,1722,2.3%,16.8%
8,salmon,235,1957,2.3%,19.1%
9,mediumslateblue,235,2192,2.3%,21.4%
10,coral,233,2425,2.3%,23.7%


## Get the reverse of the theme by adding `_r`

In [None]:
#| echo: true
adviz.value_counts_plus(colors, background_gradient='magma_r')

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,fuchsia,238,1484,2.3%,14.5%
7,plum,238,1722,2.3%,16.8%
8,salmon,235,1957,2.3%,19.1%
9,mediumslateblue,235,2192,2.3%,21.4%
10,coral,233,2425,2.3%,23.7%


## Remove missing values: `dropna`

In [None]:
#| echo: true
adviz.value_counts_plus(colors, dropna=True)

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.2%
3,darkgray,246,762,2.5%,7.6%
4,darkslategray,244,1006,2.4%,10.1%
5,fuchsia,238,1244,2.4%,12.4%
6,plum,238,1482,2.4%,14.8%
7,mediumslateblue,235,1717,2.4%,17.2%
8,salmon,235,1952,2.4%,19.5%
9,coral,233,2185,2.3%,21.8%
10,slategray,232,2417,2.3%,24.2%
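
Dropping missing values changes the denominator as well as the rows: the percentages above are computed over the 10,000 remaining items instead of all 10,240. A small sketch of that effect in plain Python, with a shortened stand-in list (`items` and `is_missing` are illustrative names, not `adviz` API):

```python
import math


def is_missing(value):
    """True for float NaN, the missing-value marker used in the example data."""
    return isinstance(value, float) and math.isnan(value)


items = ["linen", "sienna", float("nan"), "linen", float("nan")]
kept = [value for value in items if not is_missing(value)]
print(len(items), len(kept))  # 5 3

# The same cumulative count gives a different share once the total shrinks.
with_missing = f"{516 / 10_240:.1%}"
without_missing = f"{516 / 10_000:.1%}"
print(with_missing)     # 5.0%
print(without_missing)  # 5.2%
```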


## Use different symbols for `thousands` and `decimal`

In [None]:
#| echo: true
#| column: margin
adviz.value_counts_plus(colors, thousands='.', decimal=',', background_gradient='summer')

Unnamed: 0,data,count,cum. count,%,cum. %
1,sienna,265,265,"2,6%","2,6%"
2,linen,251,516,"2,5%","5,0%"
3,darkgray,246,762,"2,4%","7,4%"
4,darkslategray,244,1.006,"2,4%","9,8%"
5,,240,1.246,"2,3%","12,2%"
6,fuchsia,238,1.484,"2,3%","14,5%"
7,plum,238,1.722,"2,3%","16,8%"
8,salmon,235,1.957,"2,3%","19,1%"
9,mediumslateblue,235,2.192,"2,3%","21,4%"
10,coral,233,2.425,"2,3%","23,7%"
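
For readers unfamiliar with this convention, the sketch below shows the effect of swapping the two symbols using plain Python formatting. It only illustrates the output format and is not how `adviz` applies it; `swap_separators` is a hypothetical helper.

```python
# Map "," -> "." and "." -> "," in a single pass so the two swaps do not collide.
SWAP = str.maketrans({",": ".", ".": ","})


def swap_separators(text):
    """Turn '1,006' into '1.006' and '9.8%' into '9,8%'."""
    return text.translate(SWAP)


print(swap_separators(f"{1006:,}"))            # 1.006
print(swap_separators(f"{0.098242:.1%}"))      # 9,8%
print(swap_separators(f"{1234567.891:,.2f}"))  # 1.234.567,89
```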


## Rename the data column: `name`

In [None]:
#| echo: true
adviz.value_counts_plus(colors, name='colors', background_gradient='cool_r')

Unnamed: 0,colors,count,cum. count,%,cum. %
1,sienna,265,265,2.6%,2.6%
2,linen,251,516,2.5%,5.0%
3,darkgray,246,762,2.4%,7.4%
4,darkslategray,244,1006,2.4%,9.8%
5,,240,1246,2.3%,12.2%
6,fuchsia,238,1484,2.3%,14.5%
7,plum,238,1722,2.3%,16.8%
8,salmon,235,1957,2.3%,19.1%
9,mediumslateblue,235,2192,2.3%,21.4%
10,coral,233,2425,2.3%,23.7%


## Convert to raw HTML: `to_html`

In [None]:
#| echo: true

html_table = adviz.value_counts_plus(
    colors,
    background_gradient='winter_r',
    thousands='.',
    decimal=',',
    name='Colors').to_html()

print(html_table[:600])

<style type="text/css">
#T_aad9b_row0_col1, #T_aad9b_row0_col3 {
  background-color: #00fe80;
  color: #000000;
}
#T_aad9b_row0_col2, #T_aad9b_row0_col4, #T_aad9b_row1_col1, #T_aad9b_row1_col3, #T_aad9b_row2_col1, #T_aad9b_row2_col3, #T_aad9b_row3_col1, #T_aad9b_row3_col3, #T_aad9b_row4_col1, #T_aad9b_row4_col3, #T_aad9b_row5_col1, #T_aad9b_row5_col3, #T_aad9b_row6_col1, #T_aad9b_row6_col3, #T_aad9b_row7_col1, #T_aad9b_row7_col3, #T_aad9b_row8_col1, #T_aad9b_row8_col3, #T_aad9b_row9_col1, #T_aad9b_row9_col3 {
  background-color: #00ff80;
  color: #000000;
}
#T_aad9b_row1_col2, #T_aad9b_row1_co
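
Because `to_html()` returns a plain string, it can be written to a file or embedded in a page like any other text. A minimal sketch with the standard library; the `html_table` value here is a short stand-in for the real output, and the file name is arbitrary:

```python
import tempfile
from pathlib import Path

# Stand-in for the string returned by .to_html() above.
html_table = "<table><tr><td>sienna</td><td>265</td></tr></table>"

out_path = Path(tempfile.gettempdir()) / "value_counts.html"
out_path.write_text(html_table, encoding="utf-8")

# Read it back to confirm the round trip.
round_trip = out_path.read_text(encoding="utf-8")
print(round_trip == html_table)
```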


## Get started now:
<h3><code>python3 -m pip install adviz</code></h3>

Explore more advertools [data visualizations](https://eliasdabbas.github.io/adviz)