<h1>Counting values with <code>adviz</code></h1>

## Counting values

> Count the unique values in a collection of items

* Count in absolute and proportional (percentage) terms
* Show the cumulative count and cumulative percentage
* Set the number of top items to show
* Show all remaining items lumped into one item named `Others:`
* Change the name of the items, as well as the caption and various other styling options
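
The counting described in the list above can be sketched with the standard library alone. This is a minimal illustration, not how `adviz` is implemented: the function name `count_values` and the tuple layout are assumptions made for this example, and only the `show_top` idea and the `Others:` label come from the list.

```python
from collections import Counter


def count_values(items, show_top=10):
    """Return rows of (value, count, cum_count, perc, cum_perc).

    Hypothetical helper for illustration only; not part of adviz.
    """
    counts = Counter(items).most_common()  # sorted by count, descending
    total = sum(count for _, count in counts)

    top, rest = counts[:show_top], counts[show_top:]
    if rest:
        # Lump everything beyond the top items into a single row.
        top.append(("Others:", sum(count for _, count in rest)))

    rows, running = [], 0
    for value, count in top:
        running += count  # cumulative count so far
        rows.append((value, count, running, count / total, running / total))
    return rows


for row in count_values(["a", "b", "a", "c", "a", "b", "d"], show_top=2):
    print(row)
```

The last row always ends with a cumulative proportion of 1.0, because the `Others:` row absorbs whatever the top items leave out.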


## Installation and usage

#### Installation:

```bash
python3 -m pip install adviz
```

#### Usage:

```python
import adviz
adviz.value_counts_plus([item_1, item_2, ... item_n], ...)
```

## Usage

Generate a random list of 10,000 colors, then append 240 missing values (`NaN`):

```python
import random
import numpy as np
import matplotlib as mpl

# Named matplotlib colors; the four weights repeat to give one weight per color
colors = list(mpl.colors.cnames.keys())
colors = random.choices(colors, weights=[0.9, 0.04, 0.05, 0.09]*37, k=10_000)
# Append 240 missing values so the examples also show how NaN is counted
colors += [np.nan] * 240
colors[:20]
```

```
['aliceblue',
 'palegoldenrod',
 'paleturquoise',
 'goldenrod',
 'antiquewhite',
 'linen',
 'mediumblue',
 'lightseagreen',
 'steelblue',
 'grey',
 'deeppink',
 'chartreuse',
 'grey',
 'slategray',
 'olivedrab',
 'papayawhip',
 'mediumseagreen',
 'paleturquoise',
 'crimson',
 'slategray']
```
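
The sample above relies on the `weights` argument of `random.choices`, which treats weights as relative, so they do not need to sum to 1. The sketch below shows this on a two-item population. The items, weights, and fixed seed are assumptions chosen for this illustration only.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed only so the sketch is repeatable

# "a" is nine times as likely as "b"; weights=[9, 1] would behave the same way
sample = random.choices(["a", "b"], weights=[0.9, 0.1], k=10_000)
counts = Counter(sample)

share_a = counts["a"] / len(sample)
print(f"share of 'a': {share_a:.3f}")
```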

```python
from functools import partial
import adviz

# Preset `size=14` as the default for the calls that follow
adviz.value_counts_plus = partial(adviz.value_counts_plus, size=14)
```
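
The setup above uses `functools.partial` to preset a keyword argument. A keyword preset this way is only a default, so a call can still pass its own value. The sketch below shows that behavior with a hypothetical stand-in function, `render`, which is not part of `adviz`.

```python
from functools import partial


def render(items, size=10):
    """Hypothetical stand-in for a function that accepts a `size` keyword."""
    return f"{len(items)} items, size={size}"


# Preset a new default for `size`, as the setup cell does
render = partial(render, size=14)

print(render(["a", "b"]))          # uses the preset: size=14
print(render(["a", "b"], size=5))  # an explicit keyword overrides the preset
```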

## View default output

```python
import adviz
adviz.value_counts_plus(colors)
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | burlywood | 244 | 1522 | 2.4% | 14.9% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.2% |
| 8 | nan | 240 | 2005 | 2.3% | 19.6% |
| 9 | dodgerblue | 238 | 2243 | 2.3% | 21.9% |
| 10 | plum | 234 | 2477 | 2.3% | 24.2% |


## Change the number of top values: `show_top=n`

```python
adviz.value_counts_plus(colors, show_top=5)
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | Others: | 8962 | 10240 | 87.5% | 100.0% |


## Remove styling: `style=False`

```python
adviz.value_counts_plus(colors, show_top=5, style=False)
```

|  | data | count | cum_count | perc | cum_perc |
|---:|:---|---:|---:|---:|---:|
| 0 | olivedrab | 268 | 268 | 0.026172 | 0.026172 |
| 1 | salmon | 262 | 530 | 0.025586 | 0.051758 |
| 2 | red | 252 | 782 | 0.024609 | 0.076367 |
| 3 | coral | 251 | 1033 | 0.024512 | 0.100879 |
| 4 | darkmagenta | 245 | 1278 | 0.023926 | 0.124805 |
| 5 | Others: | 8962 | 10240 | 0.875195 | 1.0 |


## Styled vs non-styled tables
* Styled tables are indexed from 1, which is friendlier to a non-technical audience
* Non-styled tables keep the 0-based index for further processing
* Styled tables show more readable column headers (`cum. count`, `%`, `cum. %`) in place of the raw names (`cum_count`, `perc`, `cum_perc`)

```python
display(adviz.value_counts_plus(colors, show_top=5))
display(adviz.value_counts_plus(colors, show_top=5, style=False))
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | lightgreen | 272 | 272 | 2.7% | 2.7% |
| 2 | linen | 261 | 533 | 2.5% | 5.2% |
| 3 | darkred | 249 | 782 | 2.4% | 7.6% |
| 4 | plum | 246 | 1028 | 2.4% | 10.0% |
| 5 | lightcoral | 245 | 1273 | 2.4% | 12.4% |
| 6 | Others: | 8967 | 10240 | 87.6% | 100.0% |

|  | data | count | cum_count | perc | cum_perc |
|---:|:---|---:|---:|---:|---:|
| 0 | lightgreen | 272 | 272 | 0.026562 | 0.026562 |
| 1 | linen | 261 | 533 | 0.025488 | 0.052051 |
| 2 | darkred | 249 | 782 | 0.024316 | 0.076367 |
| 3 | plum | 246 | 1028 | 0.024023 | 0.100391 |
| 4 | lightcoral | 245 | 1273 | 0.023926 | 0.124316 |
| 5 | Others: | 8967 | 10240 | 0.875684 | 1.0 |


## Use a smaller font for the table: `size=5`

```python
adviz.value_counts_plus(colors, size=5)
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | burlywood | 244 | 1522 | 2.4% | 14.9% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.2% |
| 8 | nan | 240 | 2005 | 2.3% | 19.6% |
| 9 | dodgerblue | 238 | 2243 | 2.3% | 21.9% |
| 10 | plum | 234 | 2477 | 2.3% | 24.2% |


## Use a larger font for the table: `size=20`

```python
adviz.value_counts_plus(colors, size=20)
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | burlywood | 244 | 1522 | 2.4% | 14.9% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.2% |
| 8 | nan | 240 | 2005 | 2.3% | 19.6% |
| 9 | dodgerblue | 238 | 2243 | 2.3% | 21.9% |
| 10 | plum | 234 | 2477 | 2.3% | 24.2% |


## Sort `Others:`
* Is the combined count of the remaining values significant?
* Where would `Others:` rank if it were sorted by count along with the top values?

```python
adviz.value_counts_plus(colors, sort_others=True)
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | Others: | 7763 | 7763 | 75.8% | 75.8% |
| 2 | olivedrab | 268 | 8031 | 2.6% | 78.4% |
| 3 | salmon | 262 | 8293 | 2.6% | 81.0% |
| 4 | red | 252 | 8545 | 2.5% | 83.4% |
| 5 | coral | 251 | 8796 | 2.5% | 85.9% |
| 6 | darkmagenta | 245 | 9041 | 2.4% | 88.3% |
| 7 | burlywood | 244 | 9285 | 2.4% | 90.7% |
| 8 | indigo | 243 | 9528 | 2.4% | 93.0% |
| 9 | nan | 240 | 9768 | 2.3% | 95.4% |
| 10 | dodgerblue | 238 | 10006 | 2.3% | 97.7% |
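
The effect of `sort_others` can be pictured as a choice between leaving the lumped row last and letting it compete for rank by count. The sketch below is one way to express that idea. The helper `place_others` is hypothetical and is not how `adviz` is implemented; the three counts come from the tables above, and the remainder is computed from the 10,240 total.

```python
def place_others(top_rows, others_count, sort_others=False):
    """Append an "Others:" row to (value, count) rows sorted by count.

    Hypothetical helper for illustration only; not part of adviz.
    """
    rows = top_rows + [("Others:", others_count)]
    if sort_others:
        # sorted() is stable, so rows with equal counts keep their order
        rows = sorted(rows, key=lambda row: row[1], reverse=True)
    return rows


top = [("olivedrab", 268), ("salmon", 262), ("red", 252)]
others = 10_240 - sum(count for _, count in top)  # everything not in the top 3

print(place_others(top, others))                    # "Others:" stays last
print(place_others(top, others, sort_others=True))  # "Others:" moves to the top
```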


## Change the theme of the table: `background_gradient`

```python
adviz.value_counts_plus(colors, background_gradient='magma')
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | burlywood | 244 | 1522 | 2.4% | 14.9% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.2% |
| 8 | nan | 240 | 2005 | 2.3% | 19.6% |
| 9 | dodgerblue | 238 | 2243 | 2.3% | 21.9% |
| 10 | plum | 234 | 2477 | 2.3% | 24.2% |


## Reverse the theme by appending `_r` to its name

```python
adviz.value_counts_plus(colors, background_gradient='magma_r')
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | burlywood | 244 | 1522 | 2.4% | 14.9% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.2% |
| 8 | nan | 240 | 2005 | 2.3% | 19.6% |
| 9 | dodgerblue | 238 | 2243 | 2.3% | 21.9% |
| 10 | plum | 234 | 2477 | 2.3% | 24.2% |


## Remove missing values: `dropna`

```python
adviz.value_counts_plus(colors, dropna=True)
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.7% | 2.7% |
| 2 | salmon | 262 | 530 | 2.6% | 5.3% |
| 3 | red | 252 | 782 | 2.5% | 7.8% |
| 4 | coral | 251 | 1033 | 2.5% | 10.3% |
| 5 | darkmagenta | 245 | 1278 | 2.5% | 12.8% |
| 6 | burlywood | 244 | 1522 | 2.4% | 15.2% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.6% |
| 8 | dodgerblue | 238 | 2003 | 2.4% | 20.0% |
| 9 | plum | 234 | 2237 | 2.3% | 22.4% |
| 10 | blanchedalmond | 234 | 2471 | 2.3% | 24.7% |
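
Dropping missing values changes the percentages as well as the rows, because the denominator shrinks from 10,240 to 10,000. The arithmetic below reproduces the `olivedrab` percentages from the two tables. It is a worked check using the counts shown above, not a call into `adviz`.

```python
count = 268                       # olivedrab, from the tables above
total_with_nan = 10_000 + 240     # colors plus the appended NaN values
total_without_nan = 10_000        # NaN values dropped

perc_with_nan = f"{count / total_with_nan:.1%}"
perc_without_nan = f"{count / total_without_nan:.1%}"

print(perc_with_nan)     # 2.6%
print(perc_without_nan)  # 2.7%
```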


## Use different symbols for `thousands` and `decimal`

```python
adviz.value_counts_plus(colors, thousands='.', decimal=',', background_gradient='summer')
```

|  | data | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2,6% | 2,6% |
| 2 | salmon | 262 | 530 | 2,6% | 5,2% |
| 3 | red | 252 | 782 | 2,5% | 7,6% |
| 4 | coral | 251 | 1.033 | 2,5% | 10,1% |
| 5 | darkmagenta | 245 | 1.278 | 2,4% | 12,5% |
| 6 | burlywood | 244 | 1.522 | 2,4% | 14,9% |
| 7 | indigo | 243 | 1.765 | 2,4% | 17,2% |
| 8 | nan | 240 | 2.005 | 2,3% | 19,6% |
| 9 | dodgerblue | 238 | 2.243 | 2,3% | 21,9% |
| 10 | plum | 234 | 2.477 | 2,3% | 24,2% |
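
Swapping the two separators amounts to formatting the number the usual way and then exchanging the characters. The sketch below does this in plain Python. The helper `format_number` and its `spec` parameter are assumptions for this illustration and say nothing about how `adviz` formats its tables internally.

```python
def format_number(value, spec=",.0f", thousands=",", decimal="."):
    """Format `value`, then swap in custom separator symbols.

    Hypothetical helper for illustration only; not part of adviz.
    """
    text = format(value, spec)  # e.g. "2,477" or "24.2"
    # translate() maps both characters in one pass, so the swaps cannot collide
    swap = str.maketrans({",": thousands, ".": decimal})
    return text.translate(swap)


print(format_number(2477, thousands=".", decimal=","))                      # 2.477
print(format_number(24.2, spec=",.1f", thousands=".", decimal=",") + "%")   # 24,2%
```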


## Rename the data column: `name`

```python
adviz.value_counts_plus(colors, name='colors', background_gradient='cool_r')
```

|  | colors | count | cum. count | % | cum. % |
|---:|:---|---:|---:|---:|---:|
| 1 | olivedrab | 268 | 268 | 2.6% | 2.6% |
| 2 | salmon | 262 | 530 | 2.6% | 5.2% |
| 3 | red | 252 | 782 | 2.5% | 7.6% |
| 4 | coral | 251 | 1033 | 2.5% | 10.1% |
| 5 | darkmagenta | 245 | 1278 | 2.4% | 12.5% |
| 6 | burlywood | 244 | 1522 | 2.4% | 14.9% |
| 7 | indigo | 243 | 1765 | 2.4% | 17.2% |
| 8 | nan | 240 | 2005 | 2.3% | 19.6% |
| 9 | dodgerblue | 238 | 2243 | 2.3% | 21.9% |
| 10 | plum | 234 | 2477 | 2.3% | 24.2% |


## Convert to raw HTML: `to_html`

```python
html_table = adviz.value_counts_plus(
    colors,
    background_gradient='winter_r',
    thousands='.',
    decimal=',',
    name='Colors').to_html()

print(html_table[:600])
```

```
<style type="text/css">
#T_966c6_row0_col1, #T_966c6_row0_col3 {
  background-color: #00fe80;
  color: #000000;
}
#T_966c6_row0_col2, #T_966c6_row0_col4, #T_966c6_row1_col1, #T_966c6_row1_col3, #T_966c6_row2_col1, #T_966c6_row2_col3, #T_966c6_row3_col1, #T_966c6_row3_col3, #T_966c6_row4_col1, #T_966c6_row4_col3, #T_966c6_row5_col1, #T_966c6_row5_col3, #T_966c6_row6_col1, #T_966c6_row6_col3, #T_966c6_row7_col1, #T_966c6_row7_col3, #T_966c6_row8_col1, #T_966c6_row8_col3, #T_966c6_row9_col1, #T_966c6_row9_col3 {
  background-color: #00ff80;
  color: #000000;
}
#T_966c6_row1_col2, #T_966c6_row1_co
```
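
Because the result of `to_html()` is an ordinary string (it is sliced and printed above), it can be written to a file like any other text. The sketch below uses a short stand-in string in place of the real table so that it runs on its own. The file name and the temporary directory are assumptions for this example.

```python
import tempfile
from pathlib import Path

# Stand-in for the string returned by `.to_html()` in the example above
html_table = "<table><tr><td>olivedrab</td><td>268</td></tr></table>"

out_dir = Path(tempfile.mkdtemp())         # throwaway directory for the sketch
out_file = out_dir / "value_counts.html"   # hypothetical file name
out_file.write_text(html_table, encoding="utf-8")

print(out_file.read_text(encoding="utf-8") == html_table)
```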


## Get started now:
<h3><code>python3 -m pip install adviz</code></h3>

Explore more advertools [data visualizations](https://eliasdabbas.github.io/adviz)