<a href="https://colab.research.google.com/github/gabrielnd312/bar_chart_race/blob/main/Bar_Chart_Race.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Exploring the Utilities of the "Bar Chart Race" in Data Visualization**

<p align=left>
<img src="https://github.com/gabrielnd312/bar_chart_race/blob/main/bar_chart_race.gif?raw=true" width="100%"></p>

---
Data visualization plays a crucial role in analyzing and effectively communicating complex information.

Among the various techniques available, the **Bar Chart Race** has emerged as a powerful tool for portraying the evolution of data over time.

This technique combines the simplicity of a **bar chart with animation**, allowing data scientists to dynamically represent the **changing positions of different categories**. In this article, we will explore the uses of Bar Chart Race and how it can aid in understanding **patterns**, **trends**, and **insights** across diverse datasets.

<p align=left>
<img src="https://sharkcoder.com/files/article/webp/mpl-bar-chart-race-main.webp" width="80%"></p>


To use this type of visualization, we need a dataset in wide format, where:

* Each row represents a period in time.

* Each column contains the values for one category.

* The index contains the time element (optional).
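
If your data is in long format (one row per period and category), it can be reshaped into this layout with pandas. The sketch below uses made-up values and hypothetical column names (`year`, `city`, `population`); it only illustrates the expected shape and is not part of the library.

```python
import pandas as pd

# Hypothetical long-format data: one row per (year, city) pair
long_df = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001],
    "city": ["A", "B", "A", "B"],
    "population": [100, 80, 110, 95],
})

# Reshape to wide format: one row per period, one column per category
wide_df = long_df.pivot(index="year", columns="city", values="population")

# Optional: convert the index to real dates so the period can be formatted
wide_df.index = pd.to_datetime(wide_df.index.astype(str), format="%Y")

print(wide_df)
```

The resulting `wide_df` has one row per year and one column per city, which matches the three requirements listed above.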


To create this type of chart, we will first install the necessary library:

In [1]:
#Installing the library
!pip install bar_chart_race -q

For the demonstration we will use the example available in the library itself, with information about the **urban population**.



<p align=left>
<img src="https://images.unsplash.com/photo-1496120005468-ab3ddc9991dd?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=2071&q=80" width="70%"></p>


To load it, we use *bcr.load_dataset('urban_pop')* and store the result in a DataFrame:


In [2]:
# Importing the packages
import bar_chart_race as bcr
import pandas as pd

# Loading the example dataset and previewing the first rows
df = bcr.load_dataset('urban_pop')
df.head()



| year | United States | India | China | Ethiopia | Poland | Malaysia | Peru | Venezuela | Iraq | Saudi Arabia | ... | Iran | Turkey | Germany | Pakistan | Nigeria | Mexico | Russia | Japan | Indonesia | Brazil |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1976-01-01 | 160611122 | 138219074 | 162497601 | 3194879 | 19215135 | 4802814 | 9834645 | 10382196 | 7530612 | 4673719 | ... | 15742290 | 16914583 | 56885943 | 18365687 | 13177979 | 38883279 | 90810675 | 85642808 | 26596457 | 67790415 |
| 1977-01-01 | 162256678 | 143699557 | 165293316 | 3300643 | 19625330 | 5038232 | 10197056 | 10778965 | 7901451 | 5041250 | ... | 16521005 | 17474315 | 56801863 | 19175026 | 13868037 | 40371908 | 92479950 | 86538157 | 28001978 | 70478354 |
| 1978-01-01 | 164005080 | 149379782 | 171153535 | 3406129 | 20007316 | 5284698 | 10567468 | 11185526 | 8253727 | 5441027 | ... | 17336704 | 18048189 | 56796181 | 20032626 | 14598364 | 41879098 | 94157479 | 87391419 | 29468690 | 73245834 |
| 1979-01-01 | 165847531 | 155285824 | 180399661 | 3522584 | 20341874 | 5539880 | 10945723 | 11600322 | 8598800 | 5885282 | ... | 18227951 | 18640066 | 56865826 | 20941767 | 15355488 | 43407693 | 95659612 | 88197927 | 30996679 | 76091693 |
| 1980-01-01 | 167551171 | 161444128 | 189947471 | 3658252 | 20663601 | 5801267 | 11331194 | 12022351 | 8945814 | 6382806 | ... | 19206467 | 19252680 | 57028530 | 21906732 | 16131172 | 44952217 | 96960865 | 88958689 | 32591870 | 79015954 |


In [3]:
# Creating the chart with the default settings
bcr.bar_chart_race(df=df, filename=None)



# **And this is how the chart works.**

This is an animation in which the loaded data is presented period by period, preferably in **chronological order.**
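
The smooth movement between periods can be pictured as filling in extra frames between consecutive rows. The following is a minimal sketch of that idea using pandas, assuming linear interpolation and made-up values; the variable names are illustrative and the code is not taken from the library's source.

```python
import pandas as pd

# Hypothetical data: two periods, two categories
df = pd.DataFrame({"A": [10.0, 20.0], "B": [15.0, 5.0]}, index=[2000, 2001])

steps_per_period = 5  # frames generated for each period

# Spread the original rows out to leave room for the in-between frames
frames = df.reset_index(drop=True)
frames.index = frames.index * steps_per_period

# Add empty rows for the intermediate frames and fill them linearly
frames = frames.reindex(range(frames.index[-1] + 1)).interpolate()

print(frames)
```

With these values, `A` starts below `B` and passes it partway through the period, which is the moment the two bars would swap places in the animation.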

By default, when one category overtakes another (in our case, when one country's urban population exceeds another's), its bar also moves past the other in the chart.
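
To see which category leads in each period, the values can be ranked row by row. This is a small sketch with invented numbers and placeholder country names (`X`, `Y`, `Z`), meant only to illustrate the idea of bars changing position.

```python
import pandas as pd

# Hypothetical values for three countries over three years
df = pd.DataFrame(
    {"X": [50, 60, 70], "Y": [55, 58, 65], "Z": [20, 25, 90]},
    index=[2000, 2001, 2002],
)

# Rank the categories within each period (1 = largest value)
ranks = df.rank(axis=1, ascending=False).astype(int)

print(ranks)
```

Here `Y` leads in 2000, `X` overtakes it in 2001, and `Z` jumps to first place in 2002.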

However, this type of chart allows a series of **modifications and customizations**, including fixing the order of the bars.


The possible modifications range from the **chart orientation** and the **number of elements** shown in the animation to **bar size**, **font**, **title** and much more.

You can see the need for these modifications already in our chart above.


<p align=left>
<img src="https://images.unsplash.com/photo-1561250091-aadeb0248b04?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=1176&q=80" width="80%"></p>

With **a lot of information**, the bars quickly overlap and too many numbers and names are shown at once. The bars are also very thin, and the number format is not visually appealing.

Therefore, we are going to make some changes to our chart to make it more **visually pleasing**.

To facilitate **understanding and visualization**, the meaning of each parameter is written next to it as a comment, using "#".

Below are the customizations and the **result** of the modified graph:


In [11]:
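# Customizations in the chart
bcr.bar_chart_race(
    df=df, # wide-format data to plot
    filename=None, # output file name; None returns the animation as HTML for display in the notebook
    orientation='h', # bar orientation: 'h' (horizontal) or 'v' (vertical)
    sort='desc', # sort order of the bars: 'desc' or 'asc'
    n_bars=10, # number of bars shown at once
    fixed_order=False, # False lets the bars swap places as values change
    fixed_max=True, # keep the maximum of the value axis constant
    steps_per_period=10, # animation frames per period
    interpolate_period=False, # whether to interpolate the period label between periods
    label_bars=True, # show the value next to each bar
    bar_size=.95, # bar thickness, between 0 and 1
    period_label={'x': .99, 'y': .25, 'ha': 'right', 'va': 'center'}, # position and alignment of the period label
    period_fmt='%B %d, %Y', # date format of the period label
    perpendicular_bar_func='median', # draw a perpendicular bar at the median
    period_length=500, # duration of each period in milliseconds
    figsize=(5, 3), # figure size in inches
    dpi=144, # image resolution (dots per inch)
    cmap='dark12', # color scheme
    title='Urban Population', # chart title
    title_size='', # title font size
    bar_label_size=7, # font size of the bar labels
    tick_label_size=7, # font size of the tick labels
    shared_fontdict={'family' : 'Helvetica', 'color' : '.1'}, # font settings shared by all text
    scale='linear', # value-axis scale: 'linear' or 'log'
    writer=None, # animation writer; None uses the default
    fig=None, # existing matplotlib figure to draw on; None creates a new one
    bar_kwargs={'alpha': .7}, # extra keyword arguments for the bars (here, transparency)
    filter_column_colors=False)  # whether to assign colors only to columns that appear in the animation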



In this second chart, we already have a **much clearer and more pleasant view** of the information provided.

The changes are easy to spot: fewer bars appear at once, the date is displayed in a different format, the animation progresses differently, and a **gray bar** has been added that shows where the median is at each moment.
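
As a rough illustration of what the median marker represents, the median of each period can be computed directly with pandas. This sketch uses invented values and assumes the median is taken across all categories in a period; how the library treats columns hidden by `n_bars` is not confirmed here.

```python
import pandas as pd

# Hypothetical values for three categories over two periods
df = pd.DataFrame(
    {"X": [50, 60], "Y": [55, 58], "Z": [20, 25]},
    index=[2000, 2001],
)

# Median across the categories of each period (one value per row)
medians = df.median(axis=1)

print(medians)
```

Each value in `medians` corresponds to the position where the marker would sit for that period under this assumption.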

This cleanup will make it easier to understand what is being analyzed and will make **passing on information** much more efficient.

In addition to viewing the chart, you can also save it in **HTML** for use in other tools.

In [6]:
# Saving as HTML
bcr_html = bcr.bar_chart_race(df=df, filename='bar_chart_race.html')



This type of chart is not among the most common, especially in corporate environments.



<p align=left>
<img src="https://images.unsplash.com/photo-1506452819137-0422416856b8?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=773&q=80" width="80%"></p>


But depending on the dataset, it becomes an interesting presentation option, especially for events shown in their **chronological order**.