Project Name: Discount Analysis

Description:

This project aims to analyze sales and gross margin (GM) data for various OPCOs (Operating Companies). The methodology involves several steps:

1. Calculate Overall Sales and GM: The first step is to determine the overall sales and GM for each OPCO. The results will be summarized in the "opco_summary" sheet of the provided Excel file.

2. Identify Decrease in GM: Next, we analyze the OPCOs whose overall GM is below the target. Specifically, we break them down by the following customer types: direct, indirect other, and channel partners. The table shows only those customer types whose GM is below the target. The relevant information can be found in the "opco_below" tab.

3. Analyze Customer Orders: Based on the problematic customer types identified in the previous step, we prepare a table that lists the orders from customers responsible for the GM falling below the target. The script writes this data to the "results" tab.

4. Weighted Contribution Analysis: To determine the impact of different customer types, we calculate the weight of each type, that is, its share of the OPCO's total sales and gross profit in percent. This ensures that larger orders have a greater influence on the assessment of overall profitability.

5. Evaluation of Product Category Projects: We assess the achievement of the GM target by focusing on projects in the Product category during the months of April and May. The analysis provides an overall perspective on the GM performance.

6. Key OPCOs and Target Industries: We identify key OPCOs that made significant contributions to sales and GM decrease. Moreover, we examine the industries of customers involved, which are considered target industries for our business.
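
As a minimal illustration of steps 1 and 4, the sketch below uses made-up order figures (the OPCO code and all amounts are hypothetical; only the column names `sales_eur` and `gp_eur` follow the script). It shows why GM is taken as total GP over total sales: that ratio equals the sales-weighted average of the per-order margins, so large orders dominate, whereas a plain mean treats every order alike.

```python
import pandas as pd

# Hypothetical orders for one OPCO; values are made up for illustration.
orders = pd.DataFrame({
    "company_code_n": ["A", "A", "A"],
    "sales_eur": [1000.0, 100.0, 100.0],
    "gp_eur": [200.0, 50.0, 50.0],
})

orders["order_gm"] = orders["gp_eur"] / orders["sales_eur"]

# GM as the script computes it: total GP divided by total sales.
overall_gm = orders["gp_eur"].sum() / orders["sales_eur"].sum()

# The same figure expressed as a sales-weighted average of order margins.
weights = orders["sales_eur"] / orders["sales_eur"].sum()
weighted_gm = (orders["order_gm"] * weights).sum()

# An unweighted mean lets the two small orders dominate.
simple_mean_gm = orders["order_gm"].mean()

print(round(overall_gm, 4), round(weighted_gm, 4), round(simple_mean_gm, 4))
```

With these numbers the large order pulls the overall GM down to 0.25, while the unweighted mean of the three order margins would suggest 0.4.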

The Python script provided in this repository facilitates the execution of the methodology. It reads the data from the provided Excel file, performs the required calculations, and generates the relevant outputs. Make sure to customize the script with the appropriate file path, column names, and target GM value based on your specific data.

Dependencies:
- Python 3.x
- pandas library
- openpyxl (used by pandas to read and write .xlsx files)

Feel free to explore the code, modify it as needed, and utilize the generated insights to make informed decisions about sales and GM optimization.

In [None]:
import pandas as pd
import numpy as np
import sqlite3
import importlib

# keeping company information in additional file
import data_file

In [None]:
importlib.reload(data_file)

In [None]:
# Orders and Discount initial data
discount_data = pd.read_excel('data_files/discount_source.xlsx')
discount_data['sold_to_customer'] = discount_data['sold_to_customer'].astype(str)
discount_data['sold_to_customer'] = discount_data['sold_to_customer'].str.strip()

# Initial target data
target_gm = pd.read_excel('data_files/target_gm.xlsx')

# Customer Data
conn = sqlite3.connect('data_files/customer_data.db')
query = "SELECT * FROM customers"  # Replace 'customers' with your table name
df_customers = pd.read_sql_query(query, conn)
conn.close()

df_customers['tier'] = df_customers['tier'].fillna(df_customers['indirect_direct'])
df_customers['tier_new'] = df_customers['tier_new'].fillna(df_customers['indirect_direct'])
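
The loading pattern above can be tried without the real files. The sketch below is an assumption-laden stand-in: it builds a tiny in-memory SQLite table with made-up customers and applies the same `fillna` fallback. It also wraps the connection in `contextlib.closing`, so the connection is closed even if the query fails (a plain `with sqlite3.connect(...)` block only manages the transaction and does not close the connection).

```python
import sqlite3
from contextlib import closing

import pandas as pd

# In-memory stand-in for customer_data.db; the rows are made up.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute(
        "CREATE TABLE customers (sold_to_customer TEXT, indirect_direct TEXT, tier TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [("1001", "Direct", None), ("1002", "Indirect", "Tier 1")],
    )
    customers = pd.read_sql_query("SELECT * FROM customers", conn)

# Customers without a tier fall back to their direct/indirect flag.
customers["tier"] = customers["tier"].fillna(customers["indirect_direct"])
print(customers["tier"].tolist())
```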

In [None]:
order_data = discount_data.copy()
wdf = df_customers.copy()

In [None]:
len(order_data)

In [None]:
order_columns = list(order_data.columns) + ['customer_name', 'indirect_direct', 'tier', 'tier_new', 'type']

In [None]:
# agents df

list_of_sales_person_n = data_file.sales_person_n
exlpanrter = data_file.exlpanrter

# cast the customer number to str before building exl_list, so that the
# isin() check against the str values in discount_data can match
wdf['sold_to_customer'] = wdf['sold_to_customer'].astype(str)
exl_list = list(wdf[wdf[exlpanrter].notna()]['sold_to_customer'].unique())
agents_df = discount_data[(discount_data['sales_person_n'].isin(list_of_sales_person_n)) | (discount_data['sold_to_customer'].isin(exl_list))]
agents_df = agents_df.merge(wdf[['sold_to_customer', 'customer_name']], on='sold_to_customer', how='left')
agents_df['indirect_direct'] = 'Indirect'
agents_df['channel'] = 'Channel Partner'
agents_df['type'] = 'Agent'
agents_df['tier'] = 'Channel Partner'
agents_df['tier_new'] = 'Channel Partner'

In [None]:
# other df
# wdf['sold_to_customer'] was already cast to str in the agents cell
other_df = discount_data[(~discount_data['sales_person_n'].isin(list_of_sales_person_n)) & (~discount_data['sold_to_customer'].isin(exl_list))]
other_df = other_df.merge(wdf[['sold_to_customer', 'customer_name', 'indirect_direct', 'channel', 'type', 'tier', 'tier_new']], on='sold_to_customer', how='left')
final_df = pd.concat([other_df, agents_df])

final = final_df[final_df['sold_to_customer'].isin(wdf['sold_to_customer'])]
other = final_df[~final_df['sold_to_customer'].isin(wdf['sold_to_customer'])]

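# checking data integrity: both numbers should match; a larger first number
# means duplicate customer numbers in wdf multiplied rows during the merge
print(len(final) + len(other), len(order_data))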
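
A stricter variant of this check can be sketched with the `validate` and `indicator` options of `DataFrame.merge`. The data below is made up and the frame names are hypothetical; this is not how the notebook performs the split, only one way to make the same guarantees explicit.

```python
import pandas as pd

# Hypothetical orders and customer master data.
orders = pd.DataFrame({
    "sold_to_customer": ["1001", "1002", "9999"],
    "sales_eur": [10.0, 20.0, 5.0],
})
customers = pd.DataFrame({
    "sold_to_customer": ["1001", "1002"],
    "customer_name": ["Alpha", "Beta"],
})

# validate raises MergeError if a customer number occurs more than once
# on the right side, which would otherwise silently duplicate orders.
merged = orders.merge(
    customers,
    on="sold_to_customer",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# The _merge column tells matched orders apart from unknown customers.
matched = merged[merged["_merge"] == "both"]
unmatched = merged[merged["_merge"] == "left_only"]

print(len(merged), len(matched), len(unmatched))
```

Here the order for customer 9999 ends up in `unmatched`, and the merged table keeps exactly one row per original order.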

In [None]:
# preparation of summary information to get the list of OPCO which are below target

df = final.copy()
pivot_table = pd.pivot_table(df, values=['sales_eur', 'gp_eur'], index='company_code_n', aggfunc='sum')
pivot_table.reset_index(inplace=True)
pivot_table['result_gm'] = pivot_table['gp_eur'] / pivot_table['sales_eur']

merged_df = pd.merge(target_gm, pivot_table, left_on='OPCO', right_on='company_code_n')
merged_df = merged_df[['OPCO', 'sales_eur', 'gp_eur', 'target_gm', 'result_gm']]

overview_opco_df = merged_df.copy()
overview_opco_df['dif'] = overview_opco_df['result_gm'] - overview_opco_df['target_gm']

filtered_df = merged_df[merged_df['target_gm'] > merged_df['result_gm']]

opco_below_target = list(filtered_df['OPCO'].unique())
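
Note that `pd.merge` defaults to an inner join, so an OPCO that has orders but no row in the target table drops out of the summary without any message. The sketch below, on made-up figures, repeats the GM-versus-target comparison with a left join and lists such OPCOs; the OPCO codes, amounts and targets are all hypothetical.

```python
import pandas as pd

# Hypothetical per-order figures; OPCO "C" has no target defined.
orders = pd.DataFrame({
    "company_code_n": ["A", "A", "B", "C"],
    "sales_eur": [100.0, 300.0, 200.0, 50.0],
    "gp_eur": [30.0, 50.0, 80.0, 10.0],
})
targets = pd.DataFrame({"OPCO": ["A", "B"], "target_gm": [0.25, 0.30]})

summary = orders.groupby("company_code_n", as_index=False)[["sales_eur", "gp_eur"]].sum()
summary["result_gm"] = summary["gp_eur"] / summary["sales_eur"]

# Left join keeps every OPCO; _merge marks those without a target.
checked = summary.merge(
    targets, left_on="company_code_n", right_on="OPCO", how="left", indicator=True
)
missing_target = checked.loc[checked["_merge"] == "left_only", "company_code_n"].tolist()

# Comparisons against a missing target evaluate to False, so "C" is not flagged.
below = checked.loc[checked["result_gm"] < checked["target_gm"], "company_code_n"].tolist()

print(below, missing_target)
```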

In [None]:
# total sales and GP of all orders per OPCO, used later as the denominators for the weights

grouped_opco_sum_sales = final.groupby(['company_code_n']).agg({
    'sales_eur': 'sum',
    'gp_eur': 'sum'
})

grouped_opco_sum_sales.reset_index(inplace=True)
grouped_opco_sum_sales = grouped_opco_sum_sales.rename(columns={'sales_eur': 'sum_sales', 'gp_eur' : 'sum gp'})

In [None]:
# add the target GM and the OPCO totals to the order table
final_with_target_gm = final.merge(target_gm, left_on='company_code_n', right_on='OPCO')

final_with_target_gm = final_with_target_gm.merge(grouped_opco_sum_sales, on='company_code_n', how='left')
final_with_target_gm = final_with_target_gm[final_with_target_gm['company_code_n'].isin(opco_below_target)]

grouped_data = final_with_target_gm.groupby(['company_code_n', 'tier_new', 'target_gm', 'sum_sales', 'sum gp']).agg({
    'sales_eur': 'sum',
    'gp_eur': 'sum'
})

grouped_data['result gm'] = grouped_data['gp_eur'] / grouped_data['sales_eur']

final_data = grouped_data.reset_index()

In [None]:
final_data['sales weight'] = final_data['sales_eur'] / final_data['sum_sales'] * 100
final_data['sales weight'] = final_data['sales weight'].round(0)
final_data['gp weight'] = final_data['gp_eur'] / final_data['sum gp'] * 100
final_data['gp weight'] = final_data['gp weight'].round(0)

final_opco_data = final_data[['company_code_n', 'tier_new', 'target_gm', 'result gm', 'sales_eur', 'sales weight', 'gp_eur', 'gp weight']]

next_filtered_df = final_data[final_data['target_gm'] > final_data['result gm']]

# drop rows with zero or negative weight, or a gp weight above 100%, as irrelevant for analysis
# .copy() avoids a SettingWithCopyWarning when the 'dif' column is added below
filtered_final_opco_data = next_filtered_df[(next_filtered_df['sales weight'] > 0) & (next_filtered_df['gp weight'] > 0) & (next_filtered_df['gp weight'] <= 100)].copy()


filtered_final_opco_data['dif'] = filtered_final_opco_data['result gm'] - filtered_final_opco_data['target_gm']
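
The weights can also be derived without merging a separate totals table. The following sketch swaps in `groupby(...).transform("sum")` for that purpose; it is an alternative technique, not the notebook's approach, and it gives the same totals only as long as the table still holds every row of each OPCO. The figures are made up.

```python
import pandas as pd

# Hypothetical sales per OPCO and customer tier.
tiers = pd.DataFrame({
    "company_code_n": ["A", "A", "B"],
    "tier_new": ["Direct", "Channel Partner", "Direct"],
    "sales_eur": [750.0, 250.0, 400.0],
})

# transform("sum") broadcasts each OPCO total back onto its own rows.
opco_total = tiers.groupby("company_code_n")["sales_eur"].transform("sum")

# Share of each tier in its OPCO's sales, in percent.
tiers["sales weight"] = (tiers["sales_eur"] / opco_total * 100).round(0)

print(tiers["sales weight"].tolist())
```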

In [None]:
# now it is time to dive deeper with companies 

grouped_company_sum_sales = final.groupby(['company_code_n',  'customer_name']).agg({
    'sales_eur': 'sum',
    'gp_eur': 'sum'
})

grouped_company_sum_sales.reset_index(inplace=True)
grouped_company_sum_sales = grouped_company_sum_sales.rename(columns={'sales_eur': 'sum_sales', 'gp_eur' : 'sum gp'})

next_filtered_df = next_filtered_df[['company_code_n', 'tier_new', 'target_gm']]

companies_merged_df = final.merge(next_filtered_df, on=['company_code_n', 'tier_new'], how='inner')


companies_merged_df = companies_merged_df.merge(grouped_company_sum_sales, on=['company_code_n', 'customer_name'], how='left')

companies_grouped_data = companies_merged_df.groupby(['company_code_n',  'customer_name', 'ec_eu_customer_n', 'sales_person_n', 'sstp_approval_no', 'tier_new','target_gm', 'sales_order_so','sum_sales', 'sum gp']).agg({
    'sales_eur': 'sum',
    'gp_eur': 'sum'
})

companies_grouped_data.reset_index(inplace=True)

companies_grouped_data['sales weight'] = companies_grouped_data['sales_eur'] / companies_grouped_data['sum_sales'] * 100
companies_grouped_data['sales weight'] = companies_grouped_data['sales weight'].round(0)
companies_grouped_data['gp weight'] = companies_grouped_data['gp_eur'] / companies_grouped_data['sum gp'] * 100
companies_grouped_data['gp weight'] = companies_grouped_data['gp weight'].round(0)


companies_grouped_data['result gm'] = companies_grouped_data['gp_eur'] / companies_grouped_data['sales_eur']

companies_grouped_data = companies_grouped_data[companies_grouped_data['target_gm'] > companies_grouped_data['result gm']]

# drop rows with zero or negative weight, or a gp weight above 100%, as irrelevant for analysis
# .copy() avoids a SettingWithCopyWarning when the 'difference' column is added below
companies_grouped_data = companies_grouped_data[(companies_grouped_data['sales weight'] > 0) & (companies_grouped_data['gp weight'] > 0) & (companies_grouped_data['gp weight'] <= 100)].copy()

# 'difference' is the shortfall against the target (target minus result),
# i.e. the opposite sign of the 'dif' column in the OPCO sheets
companies_grouped_data['difference'] = companies_grouped_data['target_gm'] - companies_grouped_data['result gm']
companies_grouped_data = companies_grouped_data.sort_values(by=['sales_eur','sales weight', 'difference'], ascending=[False, False,False])

companies_grouped_data = companies_grouped_data[['company_code_n', 'customer_name', 'ec_eu_customer_n', 'sales_person_n',
       'sstp_approval_no', 'tier_new', 'target_gm', 'sales_order_so', 'sales_eur', 'gp_eur', 'sales weight',
       'gp weight', 'result gm', 'difference']]
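
One thing to watch with a long list of grouping keys: by default `groupby` leaves out every row in which any key is missing. Whether columns such as `sstp_approval_no` are ever empty in the source data is an assumption here; the sketch below uses made-up order lines to show the effect and the `dropna=False` option (available from pandas 1.1) that keeps such rows.

```python
import pandas as pd

# Hypothetical order lines; one order has no approval number.
lines = pd.DataFrame({
    "customer_name": ["Alpha", "Alpha", "Beta"],
    "sstp_approval_no": ["S-1", None, "S-2"],
    "sales_eur": [100.0, 40.0, 60.0],
})

keys = ["customer_name", "sstp_approval_no"]

# Default behaviour: rows with a missing key are left out of the result.
default_total = lines.groupby(keys)["sales_eur"].sum().sum()

# dropna=False keeps them as a group of their own.
kept_total = lines.groupby(keys, dropna=False)["sales_eur"].sum().sum()

print(default_total, kept_total)
```

If the two totals differ on the real data, some orders are missing from the result table.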

In [None]:
# Save each DataFrame to a separate sheet in the same file;
# the context manager saves and closes the file on exit, even if a write fails
with pd.ExcelWriter('data_files/result_table.xlsx') as writer:
    overview_opco_df.to_excel(writer, sheet_name='opco_summary', index=False)
    filtered_final_opco_data.to_excel(writer, sheet_name='opco_below', index=False)
    companies_grouped_data.to_excel(writer, sheet_name='results', index=False)