# <font color=blue> Data Visualization </font>

<font color=blue> **The objective of this project is twofold: to study visualization libraries and to go further with Jupyter Notebook and Python. The project works through one main dataset and offers a range of exercises on other datasets.** </font>

This study follows a few conventions:

1- Some commands are followed by a short summary of what they do.

2- All text is written in English.

3- The data comes from exercises on the Alura platform.

4- Each dataset is introduced with a summary of its contents.

# About

We have received a dataset containing the sales report of our online department store for the years 2016 to 2019. Our goal is to build visualizations that convey relevant information from this report to other departments within the company.

To do this, we first need to understand the dataset and the kind of information it provides. The list below describes each column of the primary dataset; the file itself uses the Portuguese names shown in parentheses:

 - date_ordered (data_pedido): the date on which the customer placed the product order.
 - shipping_method (modo_envio): the shipping method specified by the customer.
 - customer_name (nome_cliente): the name of the customer.
 - customer_segment (segmento_cliente): the segment to which the customer belongs, either B2B or B2C.
 - city (cidade): the destination city of the order.
 - state (estado): the destination state of the order.
 - region (regiao): the region of the destination state of the order.
 - department (departamento): the department of the ordered product.
 - product_type (tipo_produto): the category of the ordered product.
 - sales (vendas): the sales value of the product.
 - quantity (quantidade): the quantity ordered for the product.
 - discount (desconto): the discount provided on the purchase.
 - profit (lucro): the profit/loss value for the company, considering expenses related to the product.
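Since the file uses the Portuguese names, one option is to rename the columns to the English names listed above. The sketch below does this on a tiny hand-made frame; the `column_map` dictionary and the `sample` frame are illustrative assumptions, and the rest of this notebook keeps the original names.

```python
import pandas as pd

# Hypothetical mapping from the Portuguese column names to the English names listed above
column_map = {
    'data_pedido': 'date_ordered',
    'modo_envio': 'shipping_method',
    'nome_cliente': 'customer_name',
    'segmento_cliente': 'customer_segment',
    'cidade': 'city',
    'estado': 'state',
    'regiao': 'region',
    'departamento': 'department',
    'tipo_produto': 'product_type',
    'vendas': 'sales',
    'quantidade': 'quantity',
    'desconto': 'discount',
    'lucro': 'profit',
}

# One sample row standing in for the real report (a subset of its columns)
sample = pd.DataFrame({
    'data_pedido': ['2018-11-09'],
    'modo_envio': ['Econômica'],
    'vendas': [890.66],
    'lucro': [142.51],
})

# rename() skips keys that are not present, so the same map works on a subset of columns
renamed = sample.rename(columns=column_map)
print(list(renamed.columns))
```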

# To-do List

1- The sales team needs to know the 10 largest customers by total sales.

2- Build a visualization that relates total sales revenue to total profit for each type of product sold.

3- Provide the distribution of orders by region in Brazil.

4- Find a pattern in product demand per month throughout the year.

5- Relate the shipping method of the products to the type of customer, based on the sales value obtained.

In [1]:
import pandas as pd
import numpy as np

primary_data = pd.read_csv('Dados/relatorio_vendas.csv')

primary_data['data_pedido'] = pd.to_datetime(primary_data['data_pedido'])

print("Primary Data size is: {}".format(primary_data.size))
print("Primary Data type is: {}".format(type(primary_data)))

Primary Data size is: 107280
Primary Data type is: <class 'pandas.core.frame.DataFrame'>
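Note that `DataFrame.size` counts cells (rows times columns), not rows. The small sketch below, on a made-up frame named `toy`, shows how `size`, `shape` and `len` differ; under that reading, 107280 would be consistent with 8940 rows of 12 columns.

```python
import pandas as pd

# Toy frame (illustrative only): 3 rows and 2 columns
toy = pd.DataFrame({'vendas': [10.0, 20.0, 30.0], 'quantidade': [1, 2, 3]})

n_rows, n_cols = toy.shape   # (rows, columns)
print(toy.size)              # number of cells = rows * columns
print(len(toy))              # number of rows only
```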


In [2]:
primary_data.head()

,data_pedido,modo_envio,nome_cliente,segmento_cliente,cidade,estado,regiao,departamento,tipo_produto,vendas,quantidade,lucro
0,2018-11-09,Econômica,Thiago Silveira,B2C,Ribeirão Preto,São Paulo,Sudeste,Materiais de construção,encanamentos,890.66,2,142.51
1,2018-11-09,Econômica,Thiago Silveira,B2C,Ribeirão Preto,São Paulo,Sudeste,Materiais de construção,ferramentas,2488.6,3,746.58
2,2018-06-13,Econômica,Giovanna Lima,B2B,Rio de Janeiro,Rio de Janeiro,Sudeste,Jardinagem e paisagismo,sementes,49.71,2,23.36
3,2017-10-12,Entrega padrão,Ana Júlia da Cruz,B2C,Foz do Iguaçu,Paraná,Sul,Materiais de construção,materiais de revestimento,3255.76,5,-1302.31
4,2017-10-12,Entrega padrão,Ana Júlia da Cruz,B2C,Foz do Iguaçu,Paraná,Sul,Jardinagem e paisagismo,vasos,76.05,2,8.56


# 1- Top 10 Customers

In [3]:
# Total sales per customer, keeping only the 10 largest
top_10_customers = primary_data.groupby('nome_cliente')['vendas'].sum().nlargest(10)
top_10_customers = top_10_customers.reset_index()
top_10_customers.columns = ['Client', 'Sales']

# Rank from 1 to 10, used as the index
top_10_customers['Rank'] = top_10_customers.index + 1
top_10_customers = top_10_customers.set_index('Rank')
top_10_customers

Rank,Client,Sales
1,Maria Luiza Almeida,64777.54
2,Ana Julia Pinto,51398.95
3,Ryan Farias,48178.94
4,Heitor da Mata,47610.85
5,Maria Clara Gonçalves,46946.42
6,Raquel Freitas,44826.26
7,Davi Ramos,43769.21
8,Amanda Melo,42354.27
9,Alexia Ribeiro,41238.84
10,Calebe Ribeiro,41056.7
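The chain in the cell above sums every order of a customer before ranking. A minimal sketch on a made-up frame (the `orders` data and the customer names are hypothetical) makes the two steps visible:

```python
import pandas as pd

# Hypothetical orders: two customers appear more than once
orders = pd.DataFrame({
    'nome_cliente': ['Ana', 'Bruno', 'Ana', 'Carla', 'Bruno'],
    'vendas': [100.0, 250.0, 300.0, 50.0, 25.0],
})

# Step 1: one total per customer
totals = orders.groupby('nome_cliente')['vendas'].sum()

# Step 2: keep the n largest totals, already sorted in descending order
top_2 = totals.nlargest(2)
print(top_2)
```

Ranking the raw rows instead of the totals would put Bruno first here, since his single largest order is bigger than either of Ana's.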


In [4]:
# Creating the Styler object and formatting the sales as currency

s = top_10_customers.style
s.format({'Sales': 'R$ {:,.2f}'})
s

Rank,Client,Sales
1,Maria Luiza Almeida,"R$ 64,777.54"
2,Ana Julia Pinto,"R$ 51,398.95"
3,Ryan Farias,"R$ 48,178.94"
4,Heitor da Mata,"R$ 47,610.85"
5,Maria Clara Gonçalves,"R$ 46,946.42"
6,Raquel Freitas,"R$ 44,826.26"
7,Davi Ramos,"R$ 43,769.21"
8,Amanda Melo,"R$ 42,354.27"
9,Alexia Ribeiro,"R$ 41,238.84"
10,Calebe Ribeiro,"R$ 41,056.70"


# 2- Total sales / Total profit

In [5]:
# Creating dataframe
# Note: the 'Revenue' column below holds lucro, i.e. the profit/loss described in the column list

sales_revenue = primary_data.groupby(['tipo_produto'])[['vendas','lucro']].sum()
sales_revenue.columns = ['Sales','Revenue']
sales_revenue.index.name = 'Product Type'

# Creating styler

product_styler = sales_revenue.style
product_styler.format('R$ {:,.2f}')\
              .highlight_max(color='lightgreen')\
              .highlight_min(color='#F16165')

Product Type,Sales,Revenue
decoração de jardim,"R$ 82,680.87","R$ 19,880.86"
encanamentos,"R$ 373,224.39","R$ -11,243.39"
equipamentos de limpeza,"R$ 542,304.55","R$ 17,448.30"
ferramentas,"R$ 995,159.43","R$ 82,042.91"
ferramentas automotivas,"R$ 502,109.33","R$ 126,660.54"
ferramentas de jardinagem,"R$ 648,880.47","R$ 106,408.80"
fertilizantes,"R$ 53,144.55","R$ 22,509.86"
iluminação,"R$ 275,229.82","R$ 40,531.09"
materiais de paisagismo,"R$ 150,552.80","R$ -3,823.67"
materiais de revestimento,"R$ 629,656.41","R$ -57,737.23"


In [64]:
# Creating gradient

product_styler = sales_revenue.style
product_styler.format('R$ {:,.2f}').background_gradient(cmap='Greens')

Product Type,Sales,Revenue
decoração de jardim,"R$ 82,680.87","R$ 19,880.86"
encanamentos,"R$ 373,224.39","R$ -11,243.39"
equipamentos de limpeza,"R$ 542,304.55","R$ 17,448.30"
ferramentas,"R$ 995,159.43","R$ 82,042.91"
ferramentas automotivas,"R$ 502,109.33","R$ 126,660.54"
ferramentas de jardinagem,"R$ 648,880.47","R$ 106,408.80"
fertilizantes,"R$ 53,144.55","R$ 22,509.86"
iluminação,"R$ 275,229.82","R$ 40,531.09"
materiais de paisagismo,"R$ 150,552.80","R$ -3,823.67"
materiais de revestimento,"R$ 629,656.41","R$ -57,737.23"


In [63]:
# Creating a dictionary with the CSS rule for the header cells

header = {
    'selector': 'th',
    'props': 'font-weight: bold; font-family: Arial; text-align: center; text-transform: capitalize;'
}

product_styler.format('R$ {:,.2f}').background_gradient(cmap='Greens')
product_styler.set_table_styles([header], overwrite = False)

Product Type,Sales,Revenue
decoração de jardim,"R$ 82,680.87","R$ 19,880.86"
encanamentos,"R$ 373,224.39","R$ -11,243.39"
equipamentos de limpeza,"R$ 542,304.55","R$ 17,448.30"
ferramentas,"R$ 995,159.43","R$ 82,042.91"
ferramentas automotivas,"R$ 502,109.33","R$ 126,660.54"
ferramentas de jardinagem,"R$ 648,880.47","R$ 106,408.80"
fertilizantes,"R$ 53,144.55","R$ 22,509.86"
iluminação,"R$ 275,229.82","R$ 40,531.09"
materiais de paisagismo,"R$ 150,552.80","R$ -3,823.67"
materiais de revestimento,"R$ 629,656.41","R$ -57,737.23"


# 3- Orders by region in Brazil

In [8]:
# Create Dataframe

order_by_region = pd.DataFrame(primary_data['regiao'].value_counts())
order_by_region.columns = ['Number of Orders']
order_by_region.index.name = 'Region'
percent = order_by_region['Number of Orders'].to_numpy()
percent = 100*percent / percent.sum()
order_by_region['Percent'] = percent.round(2)

order_by_region

Region,Number of Orders,Percent
Sudeste,4470,50.0
Nordeste,2075,23.21
Centro-Oeste,983,11.0
Norte,779,8.71
Sul,633,7.08
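The percentages can also be obtained with `value_counts(normalize=True)`, which returns shares instead of counts. The sketch below rebuilds a stand-in `regiao` column from the counts shown above (an assumption made only so the example runs without the CSV file):

```python
import pandas as pd

# Stand-in for primary_data['regiao'], rebuilt from the counts in the table above
counts = {'Sudeste': 4470, 'Nordeste': 2075, 'Centro-Oeste': 983, 'Norte': 779, 'Sul': 633}
regiao = pd.Series([name for name, n in counts.items() for _ in range(n)], name='regiao')

number_of_orders = regiao.value_counts()
percent = (100 * regiao.value_counts(normalize=True)).round(2)

summary = pd.DataFrame({'Number of Orders': number_of_orders, 'Percent': percent})
summary.index.name = 'Region'
print(summary)
```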


In [52]:
# Styling the dataframe

styler_region = order_by_region.style

header = {
    'selector': 'th',
    'props': 'font-weight: bold; font-family: Arial; text-align: right; background-color: white;'
}


cells = {
    'selector':'td',
    'props':'background-color: white;'
}

styler_region.set_table_styles([header,cells])

styler_region.format({'Percent':'{:.2f}%'})\
             .bar(subset = 'Percent', vmin=0, vmax=100.0, color='#9CD33B')

Region,Number of Orders,Percent
Sudeste,4470,50.00%
Nordeste,2075,23.21%
Centro-Oeste,983,11.00%
Norte,779,8.71%
Sul,633,7.08%


# 4- Pattern of product demands

In [14]:
# Creating a dataframe sorted by order date, with a 'year - month' label for each order

months = primary_data.copy()
months = months.sort_values('data_pedido')
months['months'] = months['data_pedido'].dt.strftime('%Y - %b')
months = months.reset_index(drop=True)
months

,data_pedido,modo_envio,nome_cliente,segmento_cliente,cidade,estado,regiao,departamento,tipo_produto,vendas,quantidade,lucro,months
0,2016-01-04,Entrega padrão,Ana Júlia Monteiro,B2C,Belo Horizonte,Minas Gerais,Sudeste,Jardinagem e paisagismo,pesticidas,55.92,2,18.87,2016 - Jan
1,2016-01-05,Entrega padrão,Maria Cecília Jesus,B2B,Botucatu,São Paulo,Sudeste,Jardinagem e paisagismo,ferramentas de jardinagem,12.04,2,-18.66,2016 - Jan
2,2016-01-05,Entrega padrão,Maria Cecília Jesus,B2B,Botucatu,São Paulo,Sudeste,Jardinagem e paisagismo,vasos,927.30,3,-220.23,2016 - Jan
3,2016-01-05,Entrega padrão,Maria Cecília Jesus,B2B,Botucatu,São Paulo,Sudeste,Jardinagem e paisagismo,sementes,40.07,3,14.52,2016 - Jan
4,2016-01-06,Entrega padrão,Marcelo Rezende,B2C,Brasília,Distrito Federal,Centro-Oeste,Jardinagem e paisagismo,decoração de jardim,66.42,3,16.61,2016 - Jan
...,...,...,...,...,...,...,...,...,...,...,...,...,...
8935,2019-12-31,Entrega padrão,Lara Pinto,B2C,São Paulo,São Paulo,Sudeste,Jardinagem e paisagismo,ferramentas de jardinagem,179.44,3,67.29,2019 - Dec
8936,2019-12-31,Entrega padrão,Lara Pinto,B2C,São Paulo,São Paulo,Sudeste,Automotivo,pneus,309.16,7,9.27,2019 - Dec
8937,2019-12-31,Entrega padrão,Lara Pinto,B2C,São Paulo,São Paulo,Sudeste,Materiais de construção,encanamentos,1098.66,4,41.20,2019 - Dec
8938,2019-12-31,Entrega padrão,Cauê Martins,B2B,Osasco,São Paulo,Sudeste,Jardinagem e paisagismo,ferramentas de jardinagem,47.27,2,15.36,2019 - Dec


In [15]:
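# Creating pivot table
# Note: the 'months' labels are strings, so the columns come out in alphabetical order (Apr, Aug, Dec, ...), not chronological order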

month_sales = months.pivot_table(index= 'departamento', columns= 'months', values= 'quantidade', aggfunc= 'sum', sort= False)
month_sales

departamento \ months,2016 - Apr,2016 - Aug,2016 - Dec,2016 - Feb,2016 - Jan,2016 - Jul,2016 - Jun,2016 - Mar,2016 - May,2016 - Nov,...,2019 - Dec,2019 - Feb,2019 - Jan,2019 - Jul,2019 - Jun,2019 - Mar,2019 - May,2019 - Nov,2019 - Oct,2019 - Sep
Jardinagem e paisagismo,323,339,525,87,161,348,273,277,270,716,...,989,205,361,423,490,458,542,946,617,929
Automotivo,107,95,154,20,45,91,83,46,81,236,...,252,63,109,168,170,167,150,331,205,280
Materiais de construção,69,101,269,23,60,104,114,102,91,213,...,395,67,83,162,151,107,172,312,213,282
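The columns of this pivot table are ordered alphabetically (Apr, Aug, Dec, ...) because the `'%Y - %b'` labels are plain strings. One possible fix, sketched on a made-up frame, is a zero-padded numeric label such as `'%Y-%m'`, whose alphabetical order matches the calendar order. The `toy` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical orders, deliberately not in chronological order
toy = pd.DataFrame({
    'data_pedido': pd.to_datetime(['2016-04-10', '2016-01-05', '2016-12-20', '2016-01-18']),
    'departamento': ['Automotivo', 'Automotivo', 'Automotivo', 'Automotivo'],
    'quantidade': [3, 2, 5, 1],
})

# '2016-01' < '2016-04' < '2016-12' as strings, so sorted columns are chronological
toy['months'] = toy['data_pedido'].dt.strftime('%Y-%m')

month_sales_toy = toy.pivot_table(index='departamento', columns='months',
                                  values='quantidade', aggfunc='sum')
print(list(month_sales_toy.columns))
```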


In [50]:
# Creating styler

styler_month = month_sales.style
styler_month.set_sticky(axis= 'index')

columns = {
    'selector': '.col_heading',
    'props': 'font-weight: normal; font-family: Arial;'
}

tables = {
    'selector': 'td, th',
    'props': 'text-align: left;'
}

index = {
    'selector': '.index_name',
    'props': 'font-weight: bold; font-family: Arial; text-align: right;'
}


styler_month.set_table_styles([columns, tables, index], overwrite=False)

departamento \ months,2016 - Apr,2016 - Aug,2016 - Dec,2016 - Feb,2016 - Jan,2016 - Jul,2016 - Jun,2016 - Mar,2016 - May,2016 - Nov,2016 - Oct,2016 - Sep,2017 - Apr,2017 - Aug,2017 - Dec,2017 - Feb,2017 - Jan,2017 - Jul,2017 - Jun,2017 - Mar,2017 - May,2017 - Nov,2017 - Oct,2017 - Sep,2018 - Apr,2018 - Aug,2018 - Dec,2018 - Feb,2018 - Jan,2018 - Jul,2018 - Jun,2018 - Mar,2018 - May,2018 - Nov,2018 - Oct,2018 - Sep,2019 - Apr,2019 - Aug,2019 - Dec,2019 - Feb,2019 - Jan,2019 - Jul,2019 - Jun,2019 - Mar,2019 - May,2019 - Nov,2019 - Oct,2019 - Sep
Jardinagem e paisagismo,323,339,525,87,161,348,273,277,270,716,276,549,271,322,628,131,97,268,321,278,269,746,260,663,338,380,770,177,190,413,473,260,463,727,402,757,415,561,989,205,361,423,490,458,542,946,617,929
Automotivo,107,95,154,20,45,91,83,46,81,236,85,154,111,117,235,44,55,88,59,74,94,179,100,180,72,123,189,49,59,117,118,112,124,253,103,181,87,136,252,63,109,168,170,167,150,331,205,280
Materiais de construção,69,101,269,23,60,104,114,102,91,213,94,182,85,106,258,38,93,128,120,100,76,259,129,194,128,113,294,67,91,132,117,127,171,308,171,227,131,132,395,67,83,162,151,107,172,312,213,282
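In the table above, September, November and December tend to carry the largest quantities in every year. To check for a pattern that repeats each year, one option is to add up each calendar month across all years. The sketch below does this on a made-up frame; `toy` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical orders spanning two years
toy = pd.DataFrame({
    'data_pedido': pd.to_datetime(['2016-01-05', '2016-11-20', '2017-01-09', '2017-11-02']),
    'departamento': ['Automotivo'] * 4,
    'quantidade': [2, 7, 3, 9],
})

# Month number (1 to 12) without the year, so the same month of different years is grouped
toy['month'] = toy['data_pedido'].dt.month

by_month = toy.pivot_table(index='departamento', columns='month',
                           values='quantidade', aggfunc='sum')
print(by_month)
```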


# 5- Shipping method based on customer

In [55]:
# Creating pivot table

shipping_customer = primary_data.pivot_table(index='segmento_cliente', columns='modo_envio', values='vendas', aggfunc='sum')
shipping_customer['Total'] = shipping_customer.sum(axis=1)
shipping_customer.loc['Total'] = shipping_customer.sum()
shipping_customer

segmento_cliente \ modo_envio,24 horas,Econômica,Entrega padrão,Envio rápido,Total
B2B,217466.05,718462.52,1884740.78,602049.56,3422718.91
B2C,182531.8,729176.13,2248286.96,498295.55,3658290.44
Total,399997.85,1447638.65,4133027.74,1100345.11,7081009.35
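The row and column totals can also be requested from `pivot_table` itself through its `margins` and `margins_name` parameters. The sketch below rebuilds a small long-form frame from the aggregated values above (an assumption made so the example runs without the CSV file) and checks that the totals agree:

```python
import pandas as pd

# One row per (segment, shipping method), using the sums shown in the table above
long_form = pd.DataFrame({
    'segmento_cliente': ['B2B'] * 4 + ['B2C'] * 4,
    'modo_envio': ['24 horas', 'Econômica', 'Entrega padrão', 'Envio rápido'] * 2,
    'vendas': [217466.05, 718462.52, 1884740.78, 602049.56,
               182531.80, 729176.13, 2248286.96, 498295.55],
})

# margins=True appends a 'Total' row and a 'Total' column computed with the same aggfunc
with_totals = long_form.pivot_table(index='segmento_cliente', columns='modo_envio',
                                    values='vendas', aggfunc='sum',
                                    margins=True, margins_name='Total')
print(with_totals.round(2))
```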


In [60]:
# Creating styler object

sales_customer = shipping_customer.style.format('{:,.2f}')

# Changing table visualization

table = {
    'selector':'td, th:not(.index_name)',
    'props': 'font-weight: normal; font-family: Arial; text-align: center; background-color: white'
}

index = {
    'selector': '.index_name',
    'props': 'font-weight: normal; text-align: right; font-style: italic; color: #696969'
}

sales_customer.set_table_styles([table,index])

segmento_cliente \ modo_envio,24 horas,Econômica,Entrega padrão,Envio rápido,Total
B2B,217466.05,718462.52,1884740.78,602049.56,3422718.91
B2C,182531.8,729176.13,2248286.96,498295.55,3658290.44
Total,399997.85,1447638.65,4133027.74,1100345.11,7081009.35


In [62]:
# Adding a top border to the 'Total' and 'B2B' rows

sales_customer.set_table_styles({
    'Total': [{
        'selector': 'th',
        'props': 'border-top: 1px solid #181818'
    },
    {
        'selector': 'td',
        'props': 'border-top: 1px solid #181818'
    }],
    
    'B2B': [{
        'selector': 'th',
        'props': 'border-top: 1px solid #181818'
    },
    {
        'selector': 'td',
        'props': 'border-top: 1px solid #181818'
    }]
}, overwrite=False, axis=1)

segmento_cliente \ modo_envio,24 horas,Econômica,Entrega padrão,Envio rápido,Total
B2B,217466.05,718462.52,1884740.78,602049.56,3422718.91
B2C,182531.8,729176.13,2248286.96,498295.55,3658290.44
Total,399997.85,1447638.65,4133027.74,1100345.11,7081009.35


In [72]:
# Changing elements on columns: cells of the 'Total' column tagged with the 'true' class get a grey background

sales_customer.set_table_styles({
    'Total': [{
        'selector': '.true',
        'props': 'background-color: #D8D8D8'
    }]
}, overwrite=False, axis=0)

colors_columns = pd.DataFrame(['false','true','false'], index = shipping_customer['Total'].index,
                             columns=['Total'])

sales_customer.set_td_classes(colors_columns)

# Changing elements on rows: cells of the 'Total' row tagged with the 'true' class get a grey background

sales_customer.set_table_styles({
    'Total': [{
        'selector': '.true',
        'props': 'background-color: #D8D8D8;'
    }]
}, overwrite=False, axis=1)

colors_rows = pd.DataFrame([['false', 'false', 'true', 'false', 'false']],
                            columns=shipping_customer.columns,
                            index=['Total'])

sales_customer.set_td_classes(colors_rows)

segmento_cliente \ modo_envio,24 horas,Econômica,Entrega padrão,Envio rápido,Total
B2B,217466.05,718462.52,1884740.78,602049.56,3422718.91
B2C,182531.8,729176.13,2248286.96,498295.55,3658290.44
Total,399997.85,1447638.65,4133027.74,1100345.11,7081009.35
