## Case - Deep Dive Analysis

Background:
Value Sales have decreased for several manufacturers in the last month compared with the month before, and the business needs to understand exactly which area or level the drop comes from.

Objective:
Create a generic Python function that performs a deep-dive analysis and reports the exact focus areas behind the drop in Value Sales for a particular Manufacturer.

Steps to follow:
1. Check whether Value Sales dropped in the target month compared with the reference (previous) month
2. If Step 1 shows an increase, print "There is no drop in the sales for a <particular manufacturer> in the <specified time period>"
3. If Step 1 shows a drop, proceed with the deep-dive options: Product level and Geographical level
4. Do the deep dive on these levels: Zone, Region, Brand, Subbrand
5. For each focus area, calculate the growth rate of sales in the target month and its contribution to overall Value Sales
6. Multiply the two metrics to get a decisive metric, which can then be sorted to find the actual focus areas
7. Expected final input to the function:
    
    func(Manufacturer = "X", target_period = "May2019", reference_period = "Apr2019")
    
8. Expected final output of the function:
    
    
    Manufacturer  level     focus_area  growth_rate  contribution  product
    X             Zone                  -20%         50%           -0.10
    X             Zone                  -30%         10%           -0.03
    X             Zone
    X             Zone
    X             Subbrand
    X             Subbrand
    X             Region                -15%         30%           -0.05
    X             Region
    X             Brand                 -25%         5%            -0.01
    X
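
Steps 1 and 2 can be sketched as below, assuming a frame with the `Manufacturer`, `month`, and `Value Sales` columns used later in this notebook. The helper name `check_drop` and the sample figures are hypothetical; periods are parsed from the `May2019` style shown in step 7.

```python
import pandas as pd

def check_drop(df, manufacturer, target_period, reference_period):
    """Return True if Value Sales fell from reference_period to target_period."""
    # "May2019" -> Timestamp("2019-05-01")
    target = pd.to_datetime(target_period, format="%b%Y")
    reference = pd.to_datetime(reference_period, format="%b%Y")
    # Total Value Sales per month for the chosen manufacturer.
    sales = (df[df["Manufacturer"] == manufacturer]
             .groupby("month")["Value Sales"].sum())
    if sales.loc[target] >= sales.loc[reference]:
        print(f"There is no drop in the sales for {manufacturer} in {target_period}")
        return False
    return True

# Hypothetical toy data, not taken from the case-study file.
toy = pd.DataFrame({
    "Manufacturer": ["X", "X", "Y", "Y"],
    "month": pd.to_datetime(["2019-04-01", "2019-05-01"] * 2),
    "Value Sales": [100.0, 80.0, 50.0, 60.0],
})
dropped_x = check_drop(toy, "X", "May2019", "Apr2019")  # X falls from 100 to 80
dropped_y = check_drop(toy, "Y", "May2019", "Apr2019")  # Y rises from 50 to 60
```

Only when the check returns `True` would the deep dive in steps 3 to 6 run.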

    
Additional Information:
1. There is a Geographical Level Hierarchy - Zone, Region
2. There is a Product Level Hierarchy - Manufacturer, Brand, Subbrand, Item
3. Pack Type and Pack Size are present under Item level
4. Formula for growth_rate = ( Value Sales (focus_area) (target_period) - Value Sales (focus_area) (reference_period) ) * 100 / Value Sales (focus_area) (reference_period)
5. Formula for contribution = Value Sales (focus_area) (target_period) * 100 / Value Sales (X) (target_period)
6. Formula for product = growth_rate * contribution
7. Focus Area can be any individual value of Geographical Level Hierarchy and Product Level Hierarchy. For example, North Zone, Urban, Brand 1, Subbrand 1
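
As a quick check of formulas 4 to 6, the sketch below computes the three metrics for one focus area. The function name `focus_metrics` and the sales figures are made up for illustration. The expected-output table appears to multiply the two metrics as fractions (-20% times 50% gives -0.10), so the product here is taken on fractions rather than on percentages.

```python
def focus_metrics(target_sales, reference_sales, total_target_sales):
    """Return (growth_rate in %, contribution in %, product) for one focus area."""
    growth_rate = (target_sales - reference_sales) * 100 / reference_sales
    contribution = target_sales * 100 / total_target_sales
    # Product on fractions, matching the -0.10 style of the expected output.
    product = (growth_rate / 100) * (contribution / 100)
    return growth_rate, contribution, product

# Hypothetical numbers: a zone falls from 500 to 400 while the
# manufacturer sells 800 in total in the target period.
growth_rate, contribution, product = focus_metrics(400, 500, 800)
print(f"{growth_rate:.0f}% {contribution:.0f}% {product:.2f}")  # -20% 50% -0.10
```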
    
    
## Data Dictionary
    Label Name               Label Description
    Zone                     Geographical zone
    Region                   Geographical region
    Manufacturer             Name of the company
    Brand                    Name of the brand
    Subbrand                 Name of the subbrand
    Item                     Name of the item
    PackSize                 Pack size of items
    Packtype                 Type of pack of items
    month                    Month of sale
    Value Offtake(000 Rs)    Value Sales in Rs. thousands



#### Import the libraries

In [1]:
import pandas as pd
import warnings
warnings.filterwarnings("ignore")

#### Load the file from stage location

In [2]:
location = 'D:/Xzane/Priyesh/Courses - Training materials/Companies/LT/case study/Case Study - Deep Dive Analysis.xlsx'

In [3]:
df_file = pd.read_excel(location,sheet_name = 'input_data')

In [4]:
df_file.head(50)

Unnamed: 0,Zone,Region,Manufacturer,Brand,Subbrand,Item,PackSize,Packtype,month,Value Offtake(000 Rs)
0,East,Rural,GLAXOSMITHKLINE,HORLICKS,HORLICKS LITE,HORLICKS LITE 500 GMS REFILL PACK,500 GMS,REFILL PACK,2019-01-01,4883.05
1,East,Rural,GLAXOSMITHKLINE,HORLICKS,WOMENS HORLICKS,WOMENS HORLICKS 500 GMS REFILL PACK,500 GMS,REFILL PACK,2019-01-01,4460.013
2,East,Rural,GLAXOSMITHKLINE,BOOST,BOOST SPORTS,BOOST SPORTS 500 GMS REFILL PACK,500 GMS,REFILL PACK,2019-01-01,3230.254
3,East,Rural,GLAXOSMITHKLINE,HORLICKS,HORLICKS HEALTH AND NUTRITION,HORLICKS HEALTH AND NUTRITION 500 GMS REFILL PACK,500 GMS,REFILL PACK,2019-01-01,4548.961
4,East,Rural,GLAXOSMITHKLINE,BOOST,BOOST HEALTH AND ENERGY,BOOST HEALTH AND ENERGY 500 GMS REFILL PACK,500 GMS,REFILL PACK,2019-01-01,2993.13
5,East,Rural,GLAXOSMITHKLINE,HORLICKS,WOMENS HORLICKS,WOMENS HORLICKS 750 GMS REFILL PACK,750 GMS,REFILL PACK,2019-01-01,2312.412
6,East,Rural,GLAXOSMITHKLINE,HORLICKS,HORLICKS HEALTH AND NUTRITION,HORLICKS HEALTH AND NUTRITION 750 GMS REFILL PACK,750 GMS,REFILL PACK,2019-01-01,1603.273
7,East,Rural,GLAXOSMITHKLINE,HORLICKS,HORLICKS HEALTH AND NUTRITION,HORLICKS HEALTH AND NUTRITION 1000 GMS REFILL ...,1000 GMS,REFILL PACK,2019-01-01,995.398
8,East,Rural,GLAXOSMITHKLINE,HORLICKS,WOMENS HORLICKS,WOMENS HORLICKS 1000 GMS REFILL PACK,1000 GMS,REFILL PACK,2019-01-01,632.588
9,East,Rural,MONDELEZ,BOURNVITA,BOURNVITA PROHEALTH,BOURNVITA PROHEALTH 500 GMS REFILL PACK,500 GMS,REFILL PACK,2019-01-01,800.874


In [5]:
df_file.isnull().sum()

Zone                     0
Region                   0
Manufacturer             0
Brand                    0
Subbrand                 0
Item                     0
PackSize                 0
Packtype                 0
month                    0
Value Offtake(000 Rs)    0
dtype: int64

In [6]:
df_file.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1280 entries, 0 to 1279
Data columns (total 10 columns):
 #   Column                 Non-Null Count  Dtype         
---  ------                 --------------  -----         
 0   Zone                   1280 non-null   object        
 1   Region                 1280 non-null   object        
 2   Manufacturer           1280 non-null   object        
 3   Brand                  1280 non-null   object        
 4   Subbrand               1280 non-null   object        
 5   Item                   1280 non-null   object        
 6   PackSize               1280 non-null   object        
 7   Packtype               1280 non-null   object        
 8   month                  1280 non-null   datetime64[ns]
 9   Value Offtake(000 Rs)  1280 non-null   float64       
dtypes: datetime64[ns](1), float64(1), object(8)
memory usage: 100.1+ KB


In [7]:
df_file.describe()

Unnamed: 0,Value Offtake(000 Rs)
count,1280.0
mean,957.548663
std,1822.845836
min,0.0
25%,37.425
50%,233.1515
75%,929.90925
max,17548.553


###### Except for Value Offtake(000 Rs) and month (a datetime), all columns are categorical features. Value Offtake is highly dispersed: its standard deviation (about 1823) is nearly twice its mean (about 958)
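
The dispersion can be quantified from the `describe()` figures above. A minimal sketch using the printed mean and standard deviation (the variable names are illustrative):

```python
mean_sales = 957.548663   # mean from df_file.describe()
std_sales = 1822.845836   # std from df_file.describe()

# Coefficient of variation: standard deviation relative to the mean.
cv = std_sales / mean_sales
print(f"coefficient of variation = {cv:.2f}")
```

A ratio well above 1 suggests a long right tail, which is consistent with the maximum (17548.553) being far above the 75th percentile (929.90925).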

In [8]:
df = df_file.copy()

In [9]:
#df_file_test['Geo_lvl']= df_file_test['Zone'].str.cat(df_file_test['Region'], sep=",")
#df_file_test['Prod_lvl']= df_file_test['Brand']+','+df_file_test['Subbrand']+','+df_file_test['Item']

In [10]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1280 entries, 0 to 1279
Data columns (total 10 columns):
 #   Column                 Non-Null Count  Dtype         
---  ------                 --------------  -----         
 0   Zone                   1280 non-null   object        
 1   Region                 1280 non-null   object        
 2   Manufacturer           1280 non-null   object        
 3   Brand                  1280 non-null   object        
 4   Subbrand               1280 non-null   object        
 5   Item                   1280 non-null   object        
 6   PackSize               1280 non-null   object        
 7   Packtype               1280 non-null   object        
 8   month                  1280 non-null   datetime64[ns]
 9   Value Offtake(000 Rs)  1280 non-null   float64       
dtypes: datetime64[ns](1), float64(1), object(8)
memory usage: 100.1+ KB


"""
import sqlite3
db_path = "C:/Users/priye/PythonProjects/database/LntCaseStudy.db"
con = sqlite3.connect(db_path)
cur = con.cursor()
df_file_test.to_sql('LNT', con, if_exists='replace', index=False)
con.commit()
con.close()
"""

In [12]:
cols=['PackSize','Packtype']
#col=['month_new']
df.drop(cols,axis=1,inplace=True)

In [13]:
df.rename(columns={"Value Offtake(000 Rs)":"Value Sales"},inplace=True)
df.head()

Unnamed: 0,Zone,Region,Manufacturer,Brand,Subbrand,Item,month,Value Sales
0,East,Rural,GLAXOSMITHKLINE,HORLICKS,HORLICKS LITE,HORLICKS LITE 500 GMS REFILL PACK,2019-01-01,4883.05
1,East,Rural,GLAXOSMITHKLINE,HORLICKS,WOMENS HORLICKS,WOMENS HORLICKS 500 GMS REFILL PACK,2019-01-01,4460.013
2,East,Rural,GLAXOSMITHKLINE,BOOST,BOOST SPORTS,BOOST SPORTS 500 GMS REFILL PACK,2019-01-01,3230.254
3,East,Rural,GLAXOSMITHKLINE,HORLICKS,HORLICKS HEALTH AND NUTRITION,HORLICKS HEALTH AND NUTRITION 500 GMS REFILL PACK,2019-01-01,4548.961
4,East,Rural,GLAXOSMITHKLINE,BOOST,BOOST HEALTH AND ENERGY,BOOST HEALTH AND ENERGY 500 GMS REFILL PACK,2019-01-01,2993.13


In [14]:
level_df = pd.DataFrame([('Zone', 'East'),
                       ('Zone', 'North'),
                       ('Zone', 'South'),
                       ('Zone', 'West'),
                       ('Region', 'Rural'),
                       ('Region', 'Urban'),
                       ('Brand', 'AMUL'),
                       ('Brand', 'BOOST'),
                       ('Brand', 'BOURNVITA'),
                       ('Brand', 'COMPLAN'),
                       ('Brand', 'HORLICKS'),
                       ('Brand', 'MILO'),
                       ('Subbrand', 'AMUL PRO WHEY'),
                       ('Subbrand', 'BOOST HEALTH AND ENERGY'),
                       ('Subbrand', 'BOOST SPORTS'),
                       ('Subbrand', 'BOURNVITA PROHEALTH'),
                       ('Subbrand', 'COMPLAN ROYALE CHOCOLATE'),
                       ('Subbrand', 'HORLICKS HEALTH AND NUTRITION'),
                       ('Subbrand', 'HORLICKS LITE'),
                       ('Subbrand', 'MILO ACTIVE GO'),
                       ('Subbrand', 'WOMENS HORLICKS')],
                      columns = ['Level', 'focus_area'])
level_df.set_index('focus_area')

focus_area,Level
East,Zone
North,Zone
South,Zone
West,Zone
Rural,Region
Urban,Region
AMUL,Brand
BOOST,Brand
BOURNVITA,Brand
COMPLAN,Brand


In [15]:
# For Overall
df_grp_man = df.groupby(['Manufacturer','month'], sort=True, as_index=False)['Value Sales'].sum()
df_grp_man['diff_frm_prev_month'] = df_grp_man.groupby('Manufacturer')['Value Sales'].diff()



In [16]:
df_grp_man.head(10)

Unnamed: 0,Manufacturer,month,Value Sales,diff_frm_prev_month
0,AMUL,2019-01-01,1525.574,
1,AMUL,2019-02-01,1807.457,281.883
2,AMUL,2019-03-01,1755.112,-52.345
3,AMUL,2019-04-01,1955.994,200.882
4,AMUL,2019-05-01,1410.972,-545.022
5,GLAXOSMITHKLINE,2019-01-01,201960.379,
6,GLAXOSMITHKLINE,2019-02-01,197328.072,-4632.307
7,GLAXOSMITHKLINE,2019-03-01,176568.68,-20759.392
8,GLAXOSMITHKLINE,2019-04-01,158163.671,-18405.009
9,GLAXOSMITHKLINE,2019-05-01,176433.256,18269.585


In [17]:
# For zone 
df_grp_Z = df.groupby(['Manufacturer','month','Zone'], sort=True, as_index=False)['Value Sales'].sum()
df_grp_Z['diff_frm_prev_month'] = df_grp_Z.groupby(['Manufacturer','Zone'])['Value Sales'].diff()

In [18]:
df_grp_Z.head()

Unnamed: 0,Manufacturer,month,Zone,Value Sales,diff_frm_prev_month
0,AMUL,2019-01-01,East,80.591,
1,AMUL,2019-01-01,North,236.7,
2,AMUL,2019-01-01,South,849.921,
3,AMUL,2019-01-01,West,358.362,
4,AMUL,2019-02-01,East,139.413,58.822


In [19]:
# For Zone_Region 
df_grp_ZR = df.groupby(['Manufacturer','month','Zone','Region'], sort=True, as_index=False)['Value Sales'].sum()
df_grp_ZR['diff_frm_prev_month'] = df_grp_ZR.groupby(['Manufacturer','Zone','Region'])['Value Sales'].diff()

In [20]:
df_grp_ZR.head()

Unnamed: 0,Manufacturer,month,Zone,Region,Value Sales,diff_frm_prev_month
0,AMUL,2019-01-01,East,Rural,14.449,
1,AMUL,2019-01-01,East,Urban,66.142,
2,AMUL,2019-01-01,North,Rural,0.0,
3,AMUL,2019-01-01,North,Urban,236.7,
4,AMUL,2019-01-01,South,Rural,91.383,


In [21]:
# For Zone_Region_Brand 
df_grp_ZRB = df.groupby(['Manufacturer','month','Zone','Region','Brand'], sort=True, as_index=False)['Value Sales'].sum()
df_grp_ZRB['diff_frm_prev_month'] = df_grp_ZRB.groupby(['Manufacturer','Zone','Region','Brand'])['Value Sales'].diff()

In [22]:
df_grp_ZRB.head()

Unnamed: 0,Manufacturer,month,Zone,Region,Brand,Value Sales,diff_frm_prev_month
0,AMUL,2019-01-01,East,Rural,AMUL,14.449,
1,AMUL,2019-01-01,East,Urban,AMUL,66.142,
2,AMUL,2019-01-01,North,Rural,AMUL,0.0,
3,AMUL,2019-01-01,North,Urban,AMUL,236.7,
4,AMUL,2019-01-01,South,Rural,AMUL,91.383,


In [23]:
# For Zone_Region_Brand_Subbrand 
df_grp_ZRBS = df.groupby(['Manufacturer','month','Zone','Region','Brand','Subbrand'], sort=True, as_index=False)['Value Sales'].sum()
df_grp_ZRBS['diff_frm_prev_month'] = df_grp_ZRBS.groupby(['Manufacturer','Zone','Region','Brand','Subbrand'])['Value Sales'].diff()

In [66]:
def manufacturer(manu, date1, date2):
    """Return the focus areas of `manu` whose Value Sales dropped in month `date2`.

    `date1` is only used to validate the ordering; each drop is measured
    against the month immediately before `date2` (diff_frm_prev_month).
    """
    # ISO date strings (YYYY-MM-DD) compare correctly as plain strings.
    if date2 <= date1:
        print("Invalid Date selection. Try again!!")
        return None

    # Total Value Sales of the manufacturer in the target month.
    total_sales = df_grp_man['Value Sales'][(df_grp_man['month'] == date2)
                                            & (df_grp_man['Manufacturer'] == manu)].values[0]

    # One (aggregated frame, level column) pair per deep-dive level.
    levels = [(df_grp_Z, 'Zone'), (df_grp_ZR, 'Region'),
              (df_grp_ZRB, 'Brand'), (df_grp_ZRBS, 'Subbrand')]
    dfs = []
    for grp, level_col in levels:
        # Keep only the cells that dropped versus the previous month.
        # .copy() avoids modifying a slice of the aggregated frame.
        neg_vals = grp[(grp['diff_frm_prev_month'] < 0)
                       & (grp['Manufacturer'] == manu)
                       & (grp['month'] == date2)].copy()
        neg_vals['prev_Value Sales'] = neg_vals['Value Sales'] - neg_vals['diff_frm_prev_month']
        neg_vals['growth rate'] = neg_vals['diff_frm_prev_month'] / neg_vals['prev_Value Sales']
        neg_vals['contribution'] = neg_vals['Value Sales'] / total_sales
        neg_vals['product'] = neg_vals['growth rate'] * neg_vals['contribution']
        # NOTE: below Zone level the frames are finer than `level_col`
        # (e.g. one row per Zone and Region), so this sum adds the growth
        # rates of several cells and can fall below -1.
        net = neg_vals[['Manufacturer', level_col, 'growth rate', 'contribution', 'product']]
        net = net.groupby(['Manufacturer', level_col], sort=True, as_index=False).sum()
        net.rename(columns={level_col: "focus_area"}, inplace=True)
        dfs.append(net)

    # Stack the four levels and attach the level name of each focus area.
    df_con = pd.concat(dfs, axis=0, ignore_index=True, sort=False)
    df_con = pd.merge(df_con, level_df, how='left', on=['focus_area'])
    final_df = df_con[['Manufacturer', 'Level', 'focus_area', 'growth rate', 'contribution', 'product']]
    return final_df
            

manufacturer('KRAFT FOODS','2019-03-01','2019-04-01')
       

Unnamed: 0,Manufacturer,Level,focus_area,growth rate,contribution,product
0,KRAFT FOODS,Zone,East,-0.259213,0.181949,-0.047164
1,KRAFT FOODS,Zone,South,-0.008678,0.125672,-0.001091
2,KRAFT FOODS,Zone,West,-0.073576,0.350887,-0.025817
3,KRAFT FOODS,Region,Rural,-1.103542,0.006589,-0.004229
4,KRAFT FOODS,Region,Urban,-0.383361,0.893365,-0.089176
5,KRAFT FOODS,Brand,COMPLAN,-1.486903,0.899954,-0.093405
6,KRAFT FOODS,Subbrand,COMPLAN ROYALE CHOCOLATE,-1.486903,0.899954,-0.093405
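
Because the function sums the metrics of finer cells (one row per Zone and Region, and so on), a level's growth rate can fall below -100%, as in the Rural row above. One possible alternative, sketched here on made-up data, aggregates Value Sales directly at each level before computing the metrics, and also includes the Step 1 check. The helper `deep_dive`, its arguments, and the toy frame are assumptions for illustration, not the notebook's method.

```python
import pandas as pd

def deep_dive(df, manu, reference_month, target_month,
              levels=("Zone", "Region", "Brand", "Subbrand")):
    """Rank the dropping focus areas of `manu` by growth_rate * contribution."""
    sub = df[df["Manufacturer"] == manu]
    ref = sub[sub["month"] == reference_month]
    tgt = sub[sub["month"] == target_month]
    total_tgt = tgt["Value Sales"].sum()
    # Steps 1 and 2: stop early when overall sales did not drop.
    if total_tgt >= ref["Value Sales"].sum():
        print(f"There is no drop in the sales for {manu} in {target_month}")
        return None
    frames = []
    for level in levels:
        # Aggregate each period at this level only, aligned on the focus area.
        out = pd.DataFrame({
            "ref": ref.groupby(level)["Value Sales"].sum(),
            "tgt": tgt.groupby(level)["Value Sales"].sum(),
        }).fillna(0.0)
        out["growth_rate"] = (out["tgt"] - out["ref"]) / out["ref"]
        out["contribution"] = out["tgt"] / total_tgt
        out["product"] = out["growth_rate"] * out["contribution"]
        # Keep only the focus areas that actually dropped.
        out = out[out["growth_rate"] < 0].rename_axis("focus_area").reset_index()
        out.insert(0, "level", level)
        out.insert(0, "Manufacturer", manu)
        frames.append(out.drop(columns=["ref", "tgt"]))
    # Most negative product first: the largest drag on sales.
    return pd.concat(frames, ignore_index=True).sort_values("product")

# Hypothetical toy data: East drops in both regions, West is flat.
toy = pd.DataFrame({
    "Manufacturer": ["X"] * 8,
    "Zone": ["East", "East", "West", "West"] * 2,
    "Region": ["Rural", "Urban"] * 4,
    "Brand": ["B1"] * 8,
    "Subbrand": ["S1"] * 8,
    "month": ["2019-04-01"] * 4 + ["2019-05-01"] * 4,
    "Value Sales": [100.0, 200.0, 100.0, 100.0, 50.0, 150.0, 100.0, 100.0],
})
result = deep_dive(toy, "X", "2019-04-01", "2019-05-01")
print(result)
```

With this aggregation every growth rate stays at or above -1, since a focus area cannot lose more than all of its reference-period sales.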


In [33]:
# 4 to 1 mapping: build one lookup from every focus-area value to its level
# (last step, used to map values back to Zone/Region/Brand/Subbrand)

import numpy as np
level_map = {}
cols = ['Zone', 'Region', 'Brand', 'Subbrand']
for col in cols:
    # One-element list holding the matching column name, e.g. ['Zone'].
    x = [c for c in df.columns if c == col]
    col_val = np.unique(df[x].values)
    for val in col_val:
        # The stored value is the list x, so a lookup returns e.g. ['Region'].
        level_map[val] = x

#print(level_map)

In [34]:
df_list= [df_grp_Z,df_grp_ZR,df_grp_ZRB,df_grp_ZRBS]

In [40]:
for i in range(len(df_list)):
    tempdf = df_list[i]
    tempdf['level'] = 0 #= df[x].map(level_map)
    df_new=tempdf
#print(df_new)

In [36]:
df_grp_ZRB['Level']=df['Region'].map(level_map)

In [37]:
df_grp_ZRB.head()

Unnamed: 0,Manufacturer,month,Zone,Region,Brand,Value Sales,diff_frm_prev_month,level,Level
0,AMUL,2019-01-01,East,Rural,AMUL,14.449,,0,[Region]
1,AMUL,2019-01-01,East,Urban,AMUL,66.142,,0,[Region]
2,AMUL,2019-01-01,North,Rural,AMUL,0.0,,0,[Region]
3,AMUL,2019-01-01,North,Urban,AMUL,236.7,,0,[Region]
4,AMUL,2019-01-01,South,Rural,AMUL,91.383,,0,[Region]


In [63]:
cols = ['Zone', 'Region', 'Brand', 'Subbrand']
for col in cols:
    x = [c for c in df.columns if c == col]
    # Scratch attempt: df[x] is aligned by row index with the raw data, not
    # with the aggregated frames, so 'level' does not describe their rows.
    for tempdf in df_list:
        tempdf['level'] = df[x]
        print(tempdf)

   Manufacturer      month   Zone  Value Sales  diff_frm_prev_month  level
0          AMUL 2019-01-01   East       80.591                  NaN   East
1          AMUL 2019-01-01  North      236.700                  NaN   East
2          AMUL 2019-01-01  South      849.921                  NaN   East
3          AMUL 2019-01-01   West      358.362                  NaN   East
4          AMUL 2019-02-01   East      139.413               58.822   East
..          ...        ...    ...          ...                  ...    ...
95       NESTLE 2019-04-01   West     2658.452             -248.780  North
96       NESTLE 2019-05-01   East     3197.894              656.629  North
97       NESTLE 2019-05-01  North     3314.420              -59.308  North
98       NESTLE 2019-05-01  South     1485.547               65.020  North
99       NESTLE 2019-05-01   West     2738.004               79.552  North

[100 rows x 6 columns]
    Manufacturer      month   Zone Region  Value Sales  diff_frm_prev_month 

[240 rows x 9 columns]
    Manufacturer      month   Zone Region Brand        Subbrand  Value Sales  \
0           AMUL 2019-01-01   East  Rural  AMUL   AMUL PRO WHEY       14.449   
1           AMUL 2019-01-01   East  Urban  AMUL   AMUL PRO WHEY       66.142   
2           AMUL 2019-01-01  North  Rural  AMUL   AMUL PRO WHEY        0.000   
3           AMUL 2019-01-01  North  Urban  AMUL   AMUL PRO WHEY      236.700   
4           AMUL 2019-01-01  South  Rural  AMUL   AMUL PRO WHEY       91.383   
..           ...        ...    ...    ...   ...             ...          ...   
355       NESTLE 2019-05-01  North  Urban  MILO  MILO ACTIVE GO     2520.914   
356       NESTLE 2019-05-01  South  Rural  MILO  MILO ACTIVE GO      266.968   
357       NESTLE 2019-05-01  South  Urban  MILO  MILO ACTIVE GO     1218.579   
358       NESTLE 2019-05-01   West  Rural  MILO  MILO ACTIVE GO      503.869   
359       NESTLE 2019-05-01   West  Urban  MILO  MILO ACTIVE GO     2234.135   

     diff_frm_pr

In [68]:
from pandas_profiling import ProfileReport as pr
profile = pr(df=df,title = "Profiling report of case study",html={'style':{'full_width':True}})
profile.to_widgets()
