# E-commerce Analysis Case

* * *

## Analysis Process:

### 1. Identify Overall Operational Indicators
### 2. Find underperforming products from the price range and optimize the product structure
### 3. Determine poorly performing products from the discount range and optimize the product structure


* * *

## Introduction ##

&nbsp;&nbsp;&nbsp;&nbsp;This project's dataset comes from VIPS, a well-known Chinese website that specializes in selling discounted branded products online through special sales. A special sale generally means selling a specified set of goods at a discounted price during a specified period, usually in a mall or specialty store. This model has existed offline for a long time (e.g., shopping mall promotions, street sales), and other countries have similar formats in which large stores discount unsold goods, such as outlets. The goods on offer are generally clearance inventory, but some businesses produce goods specifically for special sales.

&nbsp;&nbsp;&nbsp;&nbsp;The special-sale industry is a real industrial chain, but because of distribution channels, geographical location, and other factors, most of it is clustered in first-tier cities, and people in less developed areas are still unfamiliar with it. As a result, a group of people have become the brands' porters: they use social media platforms such as WeChat and other channels to distribute big-brand inventory quickly, which clears inventory at low cost and speeds up the return of funds.

&nbsp;&nbsp;&nbsp;&nbsp;In terms of supply, branded tail goods (end-of-season leftover stock) are the most common source of discount retail goods, because brands have a natural need to clear them. In fact, as long as the cost is low enough, new product launches, custom underwriting, and private brands can all be sustainable sources for discount retail. In its early days, VIPS sourced mainly tail goods, but as it has developed in e-commerce, the proportion of new products and exclusive supplies has kept increasing. According to an analysis of Q2 2016, seasonal new products and platform special-offer products already accounted for 37%.


## Part 1. Evaluate and Optimize

#### In this part, we evaluate the results of each promotion and adjust the product mix where appropriate so that products sell better.

### Step 1. Read Each Dataset

In [17]:
import pandas as pd
import numpy as np

import warnings
warnings.filterwarnings('ignore')



In [69]:
import sqlalchemy


engine = sqlalchemy.create_engine('mysql+pymysql://frogdata05:Frogdata!1321@localhost:3306/froghd')

# Read data
# Commodity Information Sheet
sql_cmd = "select * from sales_info1"

# Execute the SQL query and load the result into a DataFrame
dt1 = pd.read_sql(sql=sql_cmd, con=engine)

dt1.rename(columns={"商品名": "sale_name",
                    "售卖价":"sale_price",
                    "吊牌价":"tag_price",
                    "折扣率":"discount",
                    "库存量":"inventory",
                    "货值":"inventory_value",
                    "成本价":"cost_price",
                    "利润率":"profit_rate",
                    "skus":"SKU"},
          inplace=True)

dt1.head()

Unnamed: 0,sale_name,sale_price,tag_price,discount,inventory,inventory_value,cost_price,profit_rate,SKU
0,A001,15,70,0.214286,501,35070,14,0.066667,2
1,A002,236,610,0.386885,423,258030,75,0.682203,1
2,A003,473,1253,0.377494,415,519995,394,0.167019,1
3,A004,320,835,0.383234,624,521040,279,0.128125,2
4,A005,15,82,0.182927,179,14678,27,-0.8,1
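
The connection string in the cell above embeds the user name and password directly. A minimal sketch of one alternative is shown here, assuming the credentials are supplied through environment variables; the variable names (`VIPS_DB_USER`, `VIPS_DB_PASSWORD`, `VIPS_DB_HOST`, `VIPS_DB_PORT`, `VIPS_DB_NAME`) and the helper `build_db_url` are hypothetical and not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=None):
    """Assemble a SQLAlchemy-style MySQL URL from environment variables.

    All variable names used here are hypothetical placeholders.
    """
    env = os.environ if env is None else env
    user = quote_plus(env["VIPS_DB_USER"])
    # URL-encode the password so characters such as '!' or '@' are safe
    password = quote_plus(env["VIPS_DB_PASSWORD"])
    host = env.get("VIPS_DB_HOST", "localhost")
    port = env.get("VIPS_DB_PORT", "3306")
    database = env["VIPS_DB_NAME"]
    return f"mysql+pymysql://{user}:{password}@{host}:{port}/{database}"


# Example with placeholder values (not real credentials)
example_url = build_db_url({
    "VIPS_DB_USER": "analyst",
    "VIPS_DB_PASSWORD": "p@ss!word",
    "VIPS_DB_NAME": "froghd",
})
print(example_url)
```

The returned string could then be passed to `sqlalchemy.create_engine(...)` in place of the literal URL.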


In [3]:
dt1.to_excel('new.xlsx')

In [19]:
# Read data
# Commodity Popularity Sheet
sql_cmd = "select * from sales_info2"

# Execute the SQL query and load the result into a DataFrame
dt2 = pd.read_sql(sql=sql_cmd, con=engine)


dt2.head()

Unnamed: 0,sale_name,uvs,collections,carts
0,A001,10926,48,372
1,A002,13124,84,193
2,A003,25657,45,173
3,A004,20833,5,273
4,A005,19371,71,356


In [70]:
# Read data
# User Sales Detail Sheet
sql_cmd = "select * from sales_info3"

# Execute the SQL query and load the result into a DataFrame
dt3 = pd.read_sql(sql=sql_cmd, con=engine)

dt3.rename(columns={"is_tui":"refund_or_not",
                    "tui_cons":"refundNums",
                    "tui_price":"refundPrice"},
          inplace=True)

# Map the "refund_or_not" values to 1 ("是", yes) and 0 ("否", no)
dt3["refund_or_not"]=dt3["refund_or_not"].map({"是":1,"否":0})
dt3.head()

Unnamed: 0,user_id,buy_date,sale_name,buy_cons,buy_price,cost_price,refund_or_not,refundNums,refundPrice
0,1,20191111,F001,1,920.0,920.0,1,1,920.0
1,2,20191111,B007,2,548.0,1096.0,0,0,0.0
2,2,20191111,E007,1,930.0,930.0,1,1,930.0
3,3,20191111,A004,2,320.0,640.0,1,2,640.0
4,3,20191111,H007,2,750.0,1500.0,0,0,0.0
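
One point worth checking after a `map` like the one above is whether every raw value was covered, because values missing from the mapping silently become `NaN`. The sketch below illustrates this on a small hypothetical Series; the value `"unknown"` is invented purely for illustration.

```python
import pandas as pd

# Hypothetical raw flags; "是" means yes and "否" means no, as in the source table
raw_flags = pd.Series(["是", "否", "是", "unknown"])

refund_flag = raw_flags.map({"是": 1, "否": 0})

# Values that are not keys of the mapping become NaN, so count them before going on
n_unmapped = int(refund_flag.isna().sum())
print(refund_flag.tolist())
print("unmapped rows:", n_unmapped)
```

If the count is zero, the column can be safely converted to an integer type with `astype(int)`.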


### Step 2. Merge "Commodity Information Sheet" and "Commodity Popularity Sheet"

In [71]:
# The result holds basic product information plus popularity metrics,
# such as add-to-cart counts, favorites (collections), and UVs (unique visitors)
dt_product = dt1.merge(dt2,how="left",on="sale_name")
dt_product.head()

Unnamed: 0,sale_name,sale_price,tag_price,discount,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,collections,carts
0,A001,15,70,0.214286,501,35070,14,0.066667,2,10926,48,372
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,84,193
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,45,173
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,5,273
4,A005,15,82,0.182927,179,14678,27,-0.8,1,19371,71,356
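
A left merge keeps every row of the left table, so it can be useful to confirm that the key is unique on both sides and to see which products found no match. The following sketch uses two hypothetical miniature tables; `validate` and `indicator` are standard `DataFrame.merge` parameters, but whether the original data needs these checks is an assumption.

```python
import pandas as pd

# Hypothetical miniature versions of the two sheets
info = pd.DataFrame({"sale_name": ["A001", "A002", "A003"],
                     "sale_price": [15, 236, 473]})
popularity = pd.DataFrame({"sale_name": ["A001", "A002"],
                           "uvs": [10926, 13124]})

merged = info.merge(
    popularity,
    how="left",
    on="sale_name",
    validate="one_to_one",  # raises MergeError if sale_name repeats on either side
    indicator=True,         # adds a _merge column: "both" or "left_only"
)

# Products present in the information sheet but absent from the popularity sheet
unmatched = merged.loc[merged["_merge"] == "left_only", "sale_name"].tolist()
print(unmatched)
```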


### Step 3. Merge Step 2 Sheet with "User Sales Detail Sheet"

In [26]:
# Summarize the selling situation of each product and rename columns

product_sales = dt3.groupby("sale_name").agg({"buy_cons":"sum",
                                                 "cost_price":"sum",
                                                 "refundNums":"sum",
                                                 "refundPrice":"sum",
                                                 "buy_price":"mean",
                                                 "user_id":pd.Series.nunique}).reset_index()
# "refundNums" is not an aggregated "refund_or_not" column, so it keeps its name;
# later cells refer to it as "refundNums"
product_sales.rename(columns={"buy_cons":"Num_sales",
                              "cost_price":"Amount_sales",
                              "refundPrice":"Amount_refund",
                              "buy_price":"Unit_price",
                              "user_id":"Num_users"},inplace=True)
product_sales.head()

Unnamed: 0,sale_name,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users
0,A001,185,2775.0,59,885.0,15.0,116
1,A002,146,34456.0,31,7316.0,236.0,87
2,A003,144,68112.0,31,14663.0,473.0,94
3,A004,172,55040.0,56,17920.0,320.0,111
4,A005,122,1830.0,32,480.0,15.0,81
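
The aggregate-then-rename pattern above can also be written with pandas named aggregation, which sets the output column names in the same call. This is a sketch on hypothetical order lines that reuse the column names of `dt3`; in the sample rows shown earlier, `cost_price` appears to hold the line total (`buy_cons` times `buy_price`), and the toy data follows that assumption.

```python
import pandas as pd

# Hypothetical order lines using the same column names as dt3
orders = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "sale_name": ["A001", "A001", "A002", "A002"],
    "buy_cons": [1, 2, 1, 3],
    "buy_price": [15.0, 15.0, 236.0, 236.0],
    "cost_price": [15.0, 30.0, 236.0, 708.0],  # assumed line total
})

summary = (
    orders.groupby("sale_name")
    .agg(
        Num_sales=("buy_cons", "sum"),        # units sold
        Amount_sales=("cost_price", "sum"),   # sales amount
        Unit_price=("buy_price", "mean"),     # average unit price
        Num_users=("user_id", "nunique"),     # unique buyers per product
    )
    .reset_index()
)
print(summary)
```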


In [27]:
# Merge Product Information
dt_product_sales = dt_product.merge(product_sales,how="left",on="sale_name")
dt_product_sales.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,collections,carts,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users
0,A001,15,70,0.214286,501,35070,14,0.066667,2,10926,48,372,185,2775.0,59,885.0,15.0,116
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,84,193,146,34456.0,31,7316.0,236.0,87
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,45,173,144,68112.0,31,14663.0,473.0,94
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,5,273,172,55040.0,56,17920.0,320.0,111
4,A005,15,82,0.182927,179,14678,27,-0.8,1,19371,71,356,122,1830.0,32,480.0,15.0,81


### Step 4. Overall Operations Evaluation

In the overall operations section, the main focus is on sales and sell-through, with UV, conversion rate, and other indicators serving as auxiliary indicators. Sales are compared against the expected target, and the sales (sell-through) ratio shows how well the goods are moving.

- **GMV**: Gross Merchandise Volume, the total value of transactions within a given period. Widely used in the e-commerce industry, it usually also includes orders that have been placed but not yet paid.
- **Real Sale Volume**: GMV - Rejected and Refunded Amount
- **Sales Volume**: Cumulative number of units sold (including rejected and refunded items)
- **Per Customer Transaction**: GMV / Number of Customers; positively related to gross profit margin
- **UV**: Number of unique visitors to the product's page
- **Conversion Rate**: Number of Customers / UV
- **Discount Rate**: GMV / Total Tag Amount (tag price * sales volume)
- **Stock Value**: Tag Price * Inventory Quantity
- **Sales Ratio**: GMV / Stock Value
- **Collections**: Number of users who have added a product to their favorites
- **Additional Purchases**: Number of users who have added a product to their cart
- **SKU**: SKU count in promotional activities (generally refers to the item number)
- **SPU**: SPU count in promotional activities (generally refers to the style number)
- **Rejected Volume**: Total number of rejected and returned items
- **Rejected Amount**: Total value of rejected and returned items
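
The ratio indicators in this list reduce to simple arithmetic. As a quick illustration, the sketch below computes several of them from hypothetical round numbers (none of these inputs come from the dataset); the function names are made up for this example.

```python
def real_sale_volume(gmv, refund_amount):
    """GMV minus the rejected and refunded amount."""
    return gmv - refund_amount


def per_customer_transaction(gmv, num_customers):
    """Average spend per customer."""
    return gmv / num_customers


def conversion_rate(num_customers, uv):
    """Share of unique visitors who became customers."""
    return num_customers / uv


def sales_ratio(gmv, stock_value):
    """Share of the stock value (tag price * inventory) that was sold."""
    return gmv / stock_value


# Hypothetical round numbers, chosen only to make the arithmetic easy to follow
gmv = 100_000.0
refund_amount = 25_000.0
num_customers = 200
uv = 40_000
stock_value = 500_000.0

print(real_sale_volume(gmv, refund_amount))          # 75000.0
print(per_customer_transaction(gmv, num_customers))  # 500.0
print(conversion_rate(num_customers, uv))            # 0.005
print(sales_ratio(gmv, stock_value))                 # 0.2
```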

In [29]:
#1、GMV: the total volume of transactions, including return amount
gmv = dt_product_sales["Amount_sales"].sum()
gmv

3747167.0

In [30]:
#2、Real Sale Volume: GMV - Rejected and Refunded Amount
return_sales = dt_product_sales["Amount_refund"].sum()
return_money = gmv - return_sales
return_money

2607587.0

In [31]:
#3、Sales Volume: Cumulative number of units sold (including rejected and refunded items)
all_sales = dt_product_sales["Num_sales"].sum()
all_sales

12017

In [32]:
#4、Per Customer Transaction: GMV / the Number of Customers, positively related to Gross Profit Margin
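# Note: Num_users is the number of unique buyers per product, so this sum counts a user
# once for every product they bought; dt3["user_id"].nunique() would count distinct customers overall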

custom_price = gmv / dt_product_sales["Num_users"].sum()
custom_price

493.56783456269756
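
The denominator used above is the sum of per-product unique buyers, which is not necessarily the same as the number of distinct customers. The toy example below (hypothetical data, not taken from `dt3`) shows how the two counts can differ when one user buys several products.

```python
import pandas as pd

# Hypothetical orders: user 1 buys two different products, user 2 buys one
orders = pd.DataFrame({
    "user_id": [1, 1, 2],
    "sale_name": ["A001", "A002", "A001"],
})

per_product_buyers = orders.groupby("sale_name")["user_id"].nunique()
sum_of_per_product = int(per_product_buyers.sum())      # user 1 is counted twice
distinct_customers = int(orders["user_id"].nunique())   # each user counted once

print(sum_of_per_product, distinct_customers)
```

Which denominator is appropriate depends on whether the indicator is meant per product purchase or per customer.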

In [33]:
# 5、UV: number of unique visits to the product's page
uv_cons = dt_product_sales["uvs"].sum()
uv_cons

1176103

In [34]:
# 6、Conversion Rate: The Number of Customers / UV
uv_rate = dt_product_sales["Num_users"].sum() / dt_product_sales["uvs"].sum()
uv_rate

0.006455216932530569

In [35]:
# 7、Discount Rate: GMV / Total Amount of Tag (tag price * sales volume)
tags_sales = np.sum(dt_product_sales["tag_price"] * dt_product_sales["Num_sales"])
discount_rate= gmv / tags_sales 
discount_rate

0.4179229541452886

In [37]:
# 8、Stock Value: Tag Price * Inventory Amount
goods_value = dt_product_sales["inventory_value"].sum()
goods_value

18916395

In [38]:
# 9、Sales Ratio: GMV / Stock Value
sales_rate = gmv / goods_value
sales_rate

0.19809096817866195

In [39]:
# 10、Collections: The number of users who have collected a product
coll_cons = dt_product_sales["collections"].sum()
coll_cons

6224

In [40]:
# 11、Additional Purchases: The number of users who have added products to their carts
add_shop_cons = dt_product_sales["carts"].sum()
add_shop_cons

18690

In [41]:
# 12、SKU: SKU count in promotional activities (generally refers to the item number)
sku_cons = dt_product_sales["SKU"].sum()
sku_cons

125

In [42]:
# 13、SPU: SPU count in promotional activities (generally refers to the style number)
spu_cons = len(dt_product_sales["sale_name"].unique())
spu_cons

80

In [43]:
# 14、Rejected Volume: The total number of rejected and returned goods
reject_cons = dt_product_sales["refundNums"].sum()
reject_cons

3643

In [45]:
# 15、Rejected Amount: The total amount of rejections and returns
reject_money = dt_product_sales["Amount_refund"].sum()
reject_money

1139580.0

In [46]:
# Summary: "dangqi" is the current period, "tongqi" is the same period last year


sales_state_dangqi = pd.DataFrame(
    {"GMV":[gmv,],"Real Sale Volume":[return_money,],"Sales Volume":[all_sales,],"Per Customer Transaction":[custom_price,],
     "UV":[uv_cons,],"Conversion Rate":[uv_rate,],"Discount Rate":[discount_rate,],"Stock Value":[goods_value,],
     "Sales Ratio":[sales_rate,],"Collections":[coll_cons,],"Additional Purchases":[add_shop_cons,],"SKU":[sku_cons,],
     "SPU":[spu_cons,],"Rejected Volume":[reject_cons,],"Rejected Amount":[reject_money,],}, 
    ) #index=["2020 Double11 Shopping Festival",]

# Statistics for the 2019 Double11 Shopping Festival, which were calculated beforehand
sales_state_tongqi = pd.DataFrame(
    {"GMV":[2261093,],"Real Sale Volume":[1464936.517,],"Sales Volume":[7654,],"Per Customer Transaction":[609.34567,],
     "UV":[904694,],"Conversion Rate":[0.0053366,],"Discount Rate":[0.46,],"Stock Value":[12610930,],
     "Sales Ratio":[0.1161,],"Collections":[4263,],"Additional Purchases":[15838,],"SKU":[82,],
     "SPU":[67,],"Rejected Volume":[2000,],"Rejected Amount":[651188.57,],}, 
    ) #index=["2019 Double11 Shopping Festival",]

#sales_state = pd.concat([sales_state_dangqi, sales_state_tangqi])
sales_state_dangqi_s = pd.DataFrame(sales_state_dangqi.stack()).reset_index().iloc[:,[1,2]]
sales_state_dangqi_s.columns = ["Indicators","2020 double11"]
sales_state_tongqi_s = pd.DataFrame(sales_state_tongqi.stack()).reset_index().iloc[:,[1,2]]
sales_state_tongqi_s.columns = ["Indicators","2019 double11"]
sales_state = pd.merge(sales_state_dangqi_s, sales_state_tongqi_s,on="Indicators")
sales_state["Year-on-Year Ratio"] = (sales_state["2020 double11"] - sales_state["2019 double11"]) / sales_state["2019 double11"]
sales_state

Unnamed: 0,Indicators,2020 double11,2019 double11,Year-on-Year Ratio
0,GMV,3747167.0,2261093.0,0.657237
1,Real Sale Volume,2607587.0,1464937.0,0.78
2,Sales Volume,12017.0,7654.0,0.570029
3,Per Customer Transaction,493.5678,609.3457,-0.190004
4,UV,1176103.0,904694.0,0.300001
5,Conversion Rate,0.006455217,0.0053366,0.209612
6,Discount Rate,0.417923,0.46,-0.091472
7,Stock Value,18916400.0,12610930.0,0.5
8,Sales Ratio,0.198091,0.1161,0.70621
9,Collections,6224.0,4263.0,0.460005
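
The year-on-year comparison can also be sketched more directly by aligning two Series on the indicator name. The block below is an alternative illustration, not the notebook's method; it copies three of the indicator values from the cells above.

```python
import pandas as pd

# Three indicators copied from the summary above (current vs. previous period)
current = pd.Series({"GMV": 3747167.0, "Sales Volume": 12017, "UV": 1176103})
previous = pd.Series({"GMV": 2261093.0, "Sales Volume": 7654, "UV": 904694})

comparison = pd.DataFrame({"current": current, "previous": previous})

# Year-on-year ratio: (current - previous) / previous
comparison["yoy"] = (comparison["current"] - comparison["previous"]) / comparison["previous"]
print(comparison.round(6))
```

Because the Series share an index, pandas aligns the rows by indicator name, so no stacking or merging is needed.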


## Part 2. Identify and Optimize From the Price Range

#### Next, we explore the data for different price ranges in depth in order to optimize the structure of later promotions. The first step is to pull the source sales data for the chosen range in this promotion; it should show each item's model number, sales amount, sales volume, and other information. The second step is to calculate the conversion rate and discount rate of each item.

### Indicators:
- Sales Amount
- Sales Volume
- Per Customer Transaction
- Number of Customers
- UV
- Conversion Rate
- Inventory Volume
- Inventory Value
- Sales Ratio

In [51]:
# Divide price range
# Set the segmentation range
listBins = [0,200, 400, 100000]

# Set labels for each range
listLabels = ['1_200','200_400','400 or more']

# Use pd.cut to discretize the data; there must be exactly one label per bin
"""
pandas.cut(x,bins,right=True,labels=None,retbins=False,precision=3,include_lowest=False)

"""
dt_product_sales['price range'] = pd.cut(dt_product_sales['sale_price'], bins=listBins, labels=listLabels, include_lowest=True)
dt_product_sales.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,collections,carts,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range
0,A001,15,70,0.214286,501,35070,14,0.066667,2,10926,48,372,185,2775.0,59,885.0,15.0,116,1_200,1_200
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,84,193,146,34456.0,31,7316.0,236.0,87,200_400,200_400
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,45,173,144,68112.0,31,14663.0,473.0,94,400 or more,400 or more
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,5,273,172,55040.0,56,17920.0,320.0,111,200_400,200_400
4,A005,15,82,0.182927,179,14678,27,-0.8,1,19371,71,356,122,1830.0,32,480.0,15.0,81,1_200,1_200
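
With the default `right=True`, each bin includes its right edge, so a price of exactly 200 falls into the first group and exactly 400 into the second; `include_lowest=True` additionally places a price of 0 in the first group. The sketch below checks this on a few hypothetical boundary prices, using the same bins and labels as the cell above.

```python
import pandas as pd

# Hypothetical prices chosen to sit on or near the bin edges
prices = pd.Series([0, 15, 200, 236, 400, 473])
bins = [0, 200, 400, 100000]
labels = ["1_200", "200_400", "400 or more"]

groups = pd.cut(prices, bins=bins, labels=labels, include_lowest=True)

# Convert the categorical result to plain strings for easy inspection
group_names = groups.astype(str).tolist()
print(group_names)
```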


In [54]:
dt_product_sales_info = dt_product_sales.groupby("price range").agg({
                                        "inventory_value":"sum",
                                        "Amount_sales":"sum",
                                        "Num_sales":"sum",
                                        "uvs":"sum",
                                        "Num_users":"sum",
                                        "collections":"sum",
                                        "carts":"sum"
                                        }).reset_index()
dt_product_sales_info.head()

Unnamed: 0,price range,inventory_value,Amount_sales,Num_sales,uvs,Num_users,collections,carts
0,1_200,1573146,339896.0,3615,369561,2280,1733,5324
1,200_400,8585973,1417702.0,4978,465547,3151,2608,8302
2,400 or more,8757276,1989569.0,3424,340995,2161,1883,5064


In [56]:
# Calculate indicators
dt_product_sales_info["Proportion_values"]=dt_product_sales_info["inventory_value"]/dt_product_sales_info["inventory_value"].sum()
dt_product_sales_info["Proportion_sales"]=dt_product_sales_info["Amount_sales"]/dt_product_sales_info["Amount_sales"].sum()
dt_product_sales_info["Per Customer Transaction"]=dt_product_sales_info["Amount_sales"]/dt_product_sales_info["Num_users"]
dt_product_sales_info["Conversion Rate"]=dt_product_sales_info["Num_users"]/dt_product_sales_info["uvs"]

dt_product_sales_info.head()

Unnamed: 0,price range,inventory_value,Amount_sales,Num_sales,uvs,Num_users,collections,carts,Proportion_values,Proportion_sales,Per Customer Transaction,Conversion Rate
0,1_200,1573146,339896.0,3615,369561,2280,1733,5324,0.083163,0.090707,149.077193,0.006169
1,200_400,8585973,1417702.0,4978,465547,3151,2608,8302,0.453891,0.37834,449.921295,0.006768
2,400 or more,8757276,1989569.0,3424,340995,2161,1883,5064,0.462946,0.530953,920.670523,0.006337


In [57]:
# Select products in the '400 or more' price range; .copy() avoids
# SettingWithCopyWarning when new columns are added below
product_400 = dt_product_sales[dt_product_sales["price range"]=='400 or more'].copy()
product_400.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,collections,carts,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,45,173,144,68112.0,31,14663.0,473.0,94,400 or more,400 or more
5,A006,428,1493,0.286674,264,394152,233,0.455607,1,5805,134,161,143,61204.0,46,19688.0,428.0,90,400 or more,400 or more
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,6,242,158,67308.0,43,18318.0,426.0,101,400 or more,400 or more
13,B004,491,1394,0.352224,396,552024,353,0.281059,2,14535,120,211,160,78560.0,47,23077.0,491.0,102,400 or more,400 or more
15,B006,484,1467,0.329925,296,434232,398,0.177686,2,3733,115,285,141,68244.0,48,23232.0,484.0,91,400 or more,400 or more


In [58]:
# Calculate indicators for this price range
# Conversion Rate: The Number of Customers / UV
product_400['Conversion Rate'] = product_400["Num_users"]/product_400["uvs"]
# Stock Value: Tag Price * Inventory Amount
product_400["stock value"] = product_400["tag_price"]*product_400["inventory"]
product_400.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,Conversion Rate,stock value
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,144,68112.0,31,14663.0,473.0,94,400 or more,400 or more,0.003664,519995
5,A006,428,1493,0.286674,264,394152,233,0.455607,1,5805,...,143,61204.0,46,19688.0,428.0,90,400 or more,400 or more,0.015504,394152
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,158,67308.0,43,18318.0,426.0,101,400 or more,400 or more,0.004939,536959
13,B004,491,1394,0.352224,396,552024,353,0.281059,2,14535,...,160,78560.0,47,23077.0,491.0,102,400 or more,400 or more,0.007018,552024
15,B006,484,1467,0.329925,296,434232,398,0.177686,2,3733,...,141,68244.0,48,23232.0,484.0,91,400 or more,400 or more,0.024377,434232


In [60]:
# Sales Ratio: GMV / Stock Value
product_400["Sales Ratio"] = product_400["Amount_sales"]/product_400["stock value"]
product_400.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,Conversion Rate,stock value,Sales Ratio
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,68112.0,31,14663.0,473.0,94,400 or more,400 or more,0.003664,519995,0.130986
5,A006,428,1493,0.286674,264,394152,233,0.455607,1,5805,...,61204.0,46,19688.0,428.0,90,400 or more,400 or more,0.015504,394152,0.15528
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,67308.0,43,18318.0,426.0,101,400 or more,400 or more,0.004939,536959,0.12535
13,B004,491,1394,0.352224,396,552024,353,0.281059,2,14535,...,78560.0,47,23077.0,491.0,102,400 or more,400 or more,0.007018,552024,0.142313
15,B006,484,1467,0.329925,296,434232,398,0.177686,2,3733,...,68244.0,48,23232.0,484.0,91,400 or more,400 or more,0.024377,434232,0.15716


In [62]:
product_400[["sale_name","Amount_sales","Num_sales","Num_users","uvs",'Conversion Rate',"inventory","stock value","Sales Ratio"]]

Unnamed: 0,sale_name,Amount_sales,Num_sales,Num_users,uvs,Conversion Rate,inventory,stock value,Sales Ratio
2,A003,68112.0,144,94,25657,0.003664,415,519995,0.130986
5,A006,61204.0,143,90,5805,0.015504,264,394152,0.15528
10,B001,67308.0,158,101,20448,0.004939,479,536959,0.12535
13,B004,78560.0,160,102,14535,0.007018,396,552024,0.142313
15,B006,68244.0,141,91,3733,0.024377,296,434232,0.15716
16,B007,110148.0,201,122,29492,0.004137,325,487175,0.226095
17,B008,65280.0,136,82,18574,0.004415,339,482058,0.135419
22,C003,70950.0,150,92,17244,0.005335,242,319682,0.221939
26,C007,70122.0,174,104,20754,0.005011,258,289476,0.242238
29,C010,87750.0,117,75,5044,0.014869,229,280754,0.312551


### Optimization Suggestions:

#### - Commodities with a conversion rate greater than 0.7%: temporarily reserved for the next promotion

#### - Commodities with a conversion rate less than 0.7% but a sales ratio of 36% or more: reserved for the next promotion

#### - Commodities with a conversion rate less than 0.7% and a sales ratio less than 36%: clearance sale
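
The three rules can be expressed as one small decision function. This is a sketch only: the function name and the returned labels are hypothetical, and a conversion rate of exactly 0.7% is reported as unclassified because the rules as written do not cover it.

```python
CONVERSION_THRESHOLD = 0.007   # 0.7%
SALES_RATIO_THRESHOLD = 0.36   # 36%


def suggest_action(conversion_rate, sales_ratio):
    """Return a hypothetical action label for one commodity."""
    if conversion_rate > CONVERSION_THRESHOLD:
        return "keep_temporarily"
    if conversion_rate < CONVERSION_THRESHOLD:
        if sales_ratio >= SALES_RATIO_THRESHOLD:
            return "keep"
        return "clearance"
    # Exactly 0.7%: not covered by the rules as stated
    return "unclassified"


# A006 and A003 use the values shown in the table above; the third case is invented
print(suggest_action(0.015504, 0.15528))   # keep_temporarily
print(suggest_action(0.003664, 0.130986))  # clearance
print(suggest_action(0.005, 0.40))         # keep
```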


In [63]:
# Select qualified commodities
# 1、Commodities with a conversion rate greater than 0.7%: temporarily reserved for the next promotion
stay_stocks571 = product_400[product_400["Conversion Rate"]>0.007]
stay_stocks571

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,Conversion Rate,stock value,Sales Ratio
5,A006,428,1493,0.286674,264,394152,233,0.455607,1,5805,...,61204.0,46,19688.0,428.0,90,400 or more,400 or more,0.015504,394152,0.15528
13,B004,491,1394,0.352224,396,552024,353,0.281059,2,14535,...,78560.0,47,23077.0,491.0,102,400 or more,400 or more,0.007018,552024,0.142313
15,B006,484,1467,0.329925,296,434232,398,0.177686,2,3733,...,68244.0,48,23232.0,484.0,91,400 or more,400 or more,0.024377,434232,0.15716
29,C010,750,1226,0.611746,229,280754,128,0.829333,1,5044,...,87750.0,43,32250.0,750.0,75,400 or more,400 or more,0.014869,280754,0.312551
46,E007,930,1578,0.589354,409,645402,356,0.617204,1,7264,...,143220.0,47,43710.0,930.0,96,400 or more,400 or more,0.013216,645402,0.221908
50,F001,920,1438,0.639777,217,312046,237,0.742391,1,4630,...,106720.0,40,36800.0,920.0,80,400 or more,400 or more,0.017279,312046,0.342001
51,F002,454,1383,0.328272,211,291813,326,0.281938,2,7366,...,70370.0,57,25878.0,454.0,94,400 or more,400 or more,0.012761,291813,0.241148
60,G001,463,1266,0.365719,142,179772,268,0.421166,2,13011,...,65746.0,46,21298.0,463.0,95,400 or more,400 or more,0.007302,179772,0.365719
62,G003,800,1275,0.627451,328,418200,264,0.67,2,11500,...,111200.0,46,36800.0,800.0,92,400 or more,400 or more,0.008,418200,0.265901
73,H004,1000,1466,0.682128,174,255084,347,0.653,1,10986,...,163000.0,56,56000.0,1000.0,98,400 or more,400 or more,0.00892,255084,0.639005


In [64]:
# Select qualified commodities
# 2、Commodities with a conversion rate less than 0.7% but a sales ratio of 36% or more: reserved for the next promotion
stay_stocks573 = product_400[(product_400["Sales Ratio"]>=0.36)&(product_400["Conversion Rate"]<0.007)]
stay_stocks573

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,Conversion Rate,stock value,Sales Ratio


In [65]:
# Select qualified commodities
# 3、Commodities with a conversion rate less than 0.7% and a sales ratio less than 36%: clearance sale
stay_stocks574 = product_400[(product_400["Sales Ratio"]<0.36)&(product_400["Conversion Rate"]<0.007)]
stay_stocks574

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,Conversion Rate,stock value,Sales Ratio
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,68112.0,31,14663.0,473.0,94,400 or more,400 or more,0.003664,519995,0.130986
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,67308.0,43,18318.0,426.0,101,400 or more,400 or more,0.004939,536959,0.12535
16,B007,548,1499,0.365577,325,487175,420,0.233577,2,29492,...,110148.0,66,36168.0,548.0,122,400 or more,400 or more,0.004137,487175,0.226095
17,B008,480,1422,0.337553,339,482058,302,0.370833,2,18574,...,65280.0,39,18720.0,480.0,82,400 or more,400 or more,0.004415,482058,0.135419
22,C003,473,1321,0.358062,242,319682,254,0.463002,2,17244,...,70950.0,44,20812.0,473.0,92,400 or more,400 or more,0.005335,319682,0.221939
26,C007,403,1122,0.35918,258,289476,167,0.585608,2,20754,...,70122.0,47,18941.0,403.0,104,400 or more,400 or more,0.005011,289476,0.242238
36,D007,431,1279,0.336982,387,494973,356,0.174014,1,20943,...,76718.0,60,25860.0,431.0,106,400 or more,400 or more,0.005061,494973,0.154994
42,E003,486,1349,0.360267,354,477546,220,0.547325,2,19094,...,59292.0,43,20898.0,486.0,77,400 or more,400 or more,0.004033,477546,0.12416
48,E009,401,1004,0.399402,224,224896,268,0.331671,2,25477,...,62155.0,50,20050.0,401.0,99,400 or more,400 or more,0.003886,224896,0.276372
56,F007,488,1351,0.361214,235,317485,402,0.17623,2,25320,...,73200.0,34,16592.0,488.0,95,400 or more,400 or more,0.003752,317485,0.230562


## Part 3. Identify and Optimize From the Discount Range


#### Likewise, we choose the 0.35-0.4 discount range for further exploration. From the dt_product_discount_info table, this range has a sales ratio of 16.90%, a conversion rate of 0.53%, and a discount rate of 37%, which serve as benchmarks when optimizing the product structure.

In [66]:
dt_product_sales.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,collections,carts,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range
0,A001,15,70,0.214286,501,35070,14,0.066667,2,10926,48,372,185,2775.0,59,885.0,15.0,116,1_200,1_200
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,84,193,146,34456.0,31,7316.0,236.0,87,200_400,200_400
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,45,173,144,68112.0,31,14663.0,473.0,94,400 or more,400 or more
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,5,273,172,55040.0,56,17920.0,320.0,111,200_400,200_400
4,A005,15,82,0.182927,179,14678,27,-0.8,1,19371,71,356,122,1830.0,32,480.0,15.0,81,1_200,1_200


In [81]:
# Divide discount range
# Set the segmentation range
listBins = [0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 1]

# Set labels for each range
listLabels = ['0.15_0.2','0.2_0.25','0.25_0.3','0.3_0.35','0.35_0.4','0.4_0.45','0.45_0.5','0.5_0.55','0.55_0.6','0.6_0.65','0.65_0.7','0.7_1']

# Use pd.cut to discretize the data; there must be exactly one label per bin
"""
pandas.cut(x,bins,right=True,labels=None,retbins=False,precision=3,include_lowest=False)

"""
dt_product_sales['discount range'] = pd.cut(dt_product['discount'], bins=listBins, labels=listLabels, include_lowest=True)
dt_product_sales.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range
0,A001,15,70,0.214286,501,35070,14,0.066667,2,10926,...,185,2775.0,59,885.0,15.0,116,1_200,1_200,0.2_0.25,0.2_0.25
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,...,146,34456.0,31,7316.0,236.0,87,200_400,200_400,0.35_0.4,0.35_0.4
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,144,68112.0,31,14663.0,473.0,94,400 or more,400 or more,0.35_0.4,0.35_0.4
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,...,172,55040.0,56,17920.0,320.0,111,200_400,200_400,0.35_0.4,0.35_0.4
4,A005,15,82,0.182927,179,14678,27,-0.8,1,19371,...,122,1830.0,32,480.0,15.0,81,1_200,1_200,0.15_0.2,0.15_0.2


In [76]:
dt_product_discount_info = dt_product_sales.groupby("discount range").agg({
                                        "inventory_value":"sum",
                                        "Amount_sales":"sum",
                                        "Num_sales":"sum",
                                        "uvs":"sum",
                                        "Num_users":"sum",
                                        "collections":"sum",
                                        "carts":"sum"
                                        }).reset_index()
dt_product_discount_info

Unnamed: 0,discount range,inventory_value,Amount_sales,Num_sales,uvs,Num_users,collections,carts
0,0.15_0.2,14678,1830.0,122,19371,81,71,356
1,0.2_0.25,597376,106944.0,1052,67808,634,520,1505
2,0.25_0.3,546516,79924.0,725,66471,462,538,971
3,0.3_0.35,2553886,382794.0,1065,87609,660,536,1530
4,0.35_0.4,8105784,1369758.0,3696,443317,2341,2046,5884
5,0.4_0.45,2098352,453179.0,1988,184205,1258,845,3428
6,0.45_0.5,1869262,311158.0,1452,138194,934,683,1996
7,0.5_0.55,112395,38024.0,196,26088,124,25,84
8,0.55_0.6,645402,143220.0,154,7264,96,78,388
9,0.6_0.65,1785946,590706.0,1144,98210,735,630,1795


In [77]:
# Calculate indicators
dt_product_discount_info["Proportion_values"]=dt_product_discount_info["inventory_value"]/dt_product_discount_info["inventory_value"].sum()
dt_product_discount_info["Proportion_sales"]=dt_product_discount_info["Amount_sales"]/dt_product_discount_info["Amount_sales"].sum()
dt_product_discount_info["Per Customer Transaction"]=dt_product_discount_info["Amount_sales"]/dt_product_discount_info["Num_users"]
dt_product_discount_info["Conversion Rate"]=dt_product_discount_info["Num_users"]/dt_product_discount_info["uvs"]

dt_product_discount_info

Unnamed: 0,discount range,inventory_value,Amount_sales,Num_sales,uvs,Num_users,collections,carts,Proportion_values,Proportion_sales,Per Customer Transaction,Conversion Rate
0,0.15_0.2,14678,1830.0,122,19371,81,71,356,0.000776,0.000488,22.592593,0.004182
1,0.2_0.25,597376,106944.0,1052,67808,634,520,1505,0.03158,0.02854,168.681388,0.00935
2,0.25_0.3,546516,79924.0,725,66471,462,538,971,0.028891,0.021329,172.995671,0.00695
3,0.3_0.35,2553886,382794.0,1065,87609,660,536,1530,0.135009,0.102156,579.990909,0.007533
4,0.35_0.4,8105784,1369758.0,3696,443317,2341,2046,5884,0.428506,0.365545,585.116617,0.005281
5,0.4_0.45,2098352,453179.0,1988,184205,1258,845,3428,0.110928,0.120939,360.237679,0.006829
6,0.45_0.5,1869262,311158.0,1452,138194,934,683,1996,0.098817,0.083038,333.14561,0.006759
7,0.5_0.55,112395,38024.0,196,26088,124,25,84,0.005942,0.010147,306.645161,0.004753
8,0.55_0.6,645402,143220.0,154,7264,96,78,388,0.034119,0.038221,1491.875,0.013216
9,0.6_0.65,1785946,590706.0,1144,98210,735,630,1795,0.094413,0.157641,803.681633,0.007484
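
The sales ratio and conversion rate of the 0.35_0.4 range can be reproduced from its row in the table above. The short check below copies the four figures from that row and applies the definitions used throughout this analysis (Sales Ratio = sales amount / stock value, Conversion Rate = customers / UV).

```python
# Figures for the 0.35_0.4 row, copied from the table above
inventory_value = 8105784
amount_sales = 1369758.0
num_users = 2341
uvs = 443317

range_sales_ratio = amount_sales / inventory_value
range_conversion_rate = num_users / uvs

print(f"{range_sales_ratio:.2%}")      # 16.90%
print(f"{range_conversion_rate:.2%}")  # 0.53%
```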


In [78]:
# Select products in the 0.35-0.4 discount range; .copy() avoids
# SettingWithCopyWarning when new columns are added below
product_354 = dt_product_sales[dt_product_sales["discount range"]=='0.35_0.4'].copy()
product_354.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Num_sales,Amount_sales,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,...,146,34456.0,31,7316.0,236.0,87,200_400,200_400,0.35_0.4,0.35_0.4
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,144,68112.0,31,14663.0,473.0,94,400 or more,400 or more,0.35_0.4,0.35_0.4
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,...,172,55040.0,56,17920.0,320.0,111,200_400,200_400,0.35_0.4,0.35_0.4
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,158,67308.0,43,18318.0,426.0,101,400 or more,400 or more,0.35_0.4,0.35_0.4
12,B003,288,746,0.386059,439,327494,109,0.621528,1,23170,...,151,43488.0,44,12672.0,288.0,89,200_400,200_400,0.35_0.4,0.35_0.4


In [79]:
# Calculate indicators for this discount range
# Conversion Rate: Number of Customers / UV

product_354['Conversion Rate'] = product_354["Num_users"]/product_354["uvs"]

# Stock Value: Tag Price * Inventory Amount
product_354["stock value"] = product_354["tag_price"]*product_354["inventory"]
product_354.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,refundNums,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range,Conversion Rate,stock value
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,...,31,7316.0,236.0,87,200_400,200_400,0.35_0.4,0.35_0.4,0.006629,258030
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,31,14663.0,473.0,94,400 or more,400 or more,0.35_0.4,0.35_0.4,0.003664,519995
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,...,56,17920.0,320.0,111,200_400,200_400,0.35_0.4,0.35_0.4,0.005328,521040
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,43,18318.0,426.0,101,400 or more,400 or more,0.35_0.4,0.35_0.4,0.004939,536959
12,B003,288,746,0.386059,439,327494,109,0.621528,1,23170,...,44,12672.0,288.0,89,200_400,200_400,0.35_0.4,0.35_0.4,0.003841,327494


In [80]:
# Sales Ratio: GMV / Stock Value
product_354["Sales Ratio"] = product_354["Amount_sales"]/product_354["stock value"]
product_354.head()

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range,Conversion Rate,stock value,Sales Ratio
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,...,7316.0,236.0,87,200_400,200_400,0.35_0.4,0.35_0.4,0.006629,258030,0.133535
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,14663.0,473.0,94,400 or more,400 or more,0.35_0.4,0.35_0.4,0.003664,519995,0.130986
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,...,17920.0,320.0,111,200_400,200_400,0.35_0.4,0.35_0.4,0.005328,521040,0.105635
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,18318.0,426.0,101,400 or more,400 or more,0.35_0.4,0.35_0.4,0.004939,536959,0.12535
12,B003,288,746,0.386059,439,327494,109,0.621528,1,23170,...,12672.0,288.0,89,200_400,200_400,0.35_0.4,0.35_0.4,0.003841,327494,0.13279

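As a worked check of the three product-level indicators (stock value, sales ratio, conversion rate), the sketch below recomputes them for product A002 from its raw figures in the output above. It is plain Python rather than pandas, and the function name `product_metrics` is an assumption made for illustration.

```python
def product_metrics(row):
    """Derive the three product-level indicators from one raw record."""
    stock_value = row["tag_price"] * row["inventory"]
    return {
        "stock value": stock_value,
        "Sales Ratio": row["Amount_sales"] / stock_value,
        "Conversion Rate": row["Num_users"] / row["uvs"],
    }


# Raw figures for product A002, copied from the cell output.
a002 = {
    "tag_price": 610,
    "inventory": 423,
    "Amount_sales": 34456.0,
    "Num_users": 87,
    "uvs": 13124,
}

a002_metrics = product_metrics(a002)
for key, value in a002_metrics.items():
    print(key, round(value, 6))
```

Rounded to six decimals, the results should match the `stock value`, `Sales Ratio`, and `Conversion Rate` entries shown for A002.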

In [84]:
product_354[["sale_name","Amount_sales","Num_sales","Unit_price","Num_users","uvs","inventory","stock value","discout","Sales Ratio",'Conversion Rate']]

Unnamed: 0,sale_name,Amount_sales,Num_sales,Unit_price,Num_users,uvs,inventory,stock value,discout,Sales Ratio,Conversion Rate
1,A002,34456.0,146,236.0,87,13124,423,258030,0.386885,0.133535,0.006629
2,A003,68112.0,144,473.0,94,25657,415,519995,0.377494,0.130986,0.003664
3,A004,55040.0,172,320.0,111,20833,624,521040,0.383234,0.105635,0.005328
10,B001,67308.0,158,426.0,101,20448,479,536959,0.380018,0.12535,0.004939
12,B003,43488.0,151,288.0,89,23170,439,327494,0.386059,0.13279,0.003841
13,B004,78560.0,160,491.0,102,14535,396,552024,0.352224,0.142313,0.007018
16,B007,110148.0,201,548.0,122,29492,325,487175,0.365577,0.226095,0.004137
19,B010,46800.0,120,390.0,82,7934,188,186496,0.393145,0.250944,0.010335
21,C002,29748.0,148,201.0,97,7835,287,158711,0.363472,0.187435,0.01238
22,C003,70950.0,150,473.0,92,17244,242,319682,0.358062,0.221939,0.005335


### Optimization Suggestions:

#### Products with a discount rate **greater than 37%**: keep those with a sales ratio greater than 36.5% and a conversion rate greater than 0.7%; put the rest on clearance.

#### Products with a discount rate **of 37% or less**: keep those with a sales ratio greater than 36.5% and a conversion rate greater than 0.7%; put the rest on clearance.
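
The keep-or-clear rule can also be written as one function, which makes it easy to see that every product lands in exactly one of the four groups. This is a minimal sketch under stated assumptions: the names `classify`, `SALES_RATIO_MIN`, `CONVERSION_MIN`, and `DISCOUNT_SPLIT` are hypothetical, and the sample figures are copied from the cell outputs in this section.

```python
SALES_RATIO_MIN = 0.365  # 36.5% sales ratio threshold from the suggestions
CONVERSION_MIN = 0.007   # 0.7% conversion rate threshold
DISCOUNT_SPLIT = 0.37    # boundary between the two discount groups


def classify(discount, sales_ratio, conversion):
    """Return (discount group, decision) for one product."""
    group = "above_37" if discount > DISCOUNT_SPLIT else "at_or_below_37"
    # A product is kept only if it clears BOTH thresholds.
    keep = sales_ratio > SALES_RATIO_MIN and conversion > CONVERSION_MIN
    return group, ("keep" if keep else "clear")


# (sale_name, discout, Sales Ratio, Conversion Rate)
samples = [
    ("G005", 0.375850, 0.365623, 0.021228),
    ("A002", 0.386885, 0.133535, 0.006629),
    ("B010", 0.393145, 0.250944, 0.010335),
    ("G001", 0.365719, 0.365719, 0.007302),
    ("C002", 0.363472, 0.187435, 0.012380),
]

decisions = {name: classify(d, s, c) for name, d, s, c in samples}
for name, (group, decision) in decisions.items():
    print(name, group, decision)
```

Because both discount groups use the same two thresholds, the 37% split affects only which group a product is reported in, not whether it is kept. B010 and C002 show that a good conversion rate alone is not enough: both are cleared because their sales ratio is below 36.5%.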







In [85]:
# Select qualifying products
# 1. Keep: discount rate greater than 37%, sales ratio greater than 36.5%, and conversion rate greater than 0.7%
stay_stocks1 = product_354[(product_354["discout"]>0.37)&(product_354["Sales Ratio"]>0.365)&(product_354["Conversion Rate"]>0.007)]
stay_stocks1

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range,Conversion Rate,stock value,Sales Ratio
64,G005,221,588,0.37585,147,86436,106,0.520362,2,4334,...,9503.0,221.0,92,200_400,200_400,0.35_0.4,0.35_0.4,0.021228,86436,0.365623


In [86]:
# 2. Clearance: discount rate of 37% or more, with a sales ratio of at most 36.5% or a conversion rate of at most 0.7%
# (despite the stay_ prefix, this frame holds the products to clear)
stay_stocks2 = product_354[(product_354["discout"]>=0.37)&((product_354["Sales Ratio"]<=0.365)|(product_354["Conversion Rate"]<=0.007))]
stay_stocks2

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range,Conversion Rate,stock value,Sales Ratio
1,A002,236,610,0.386885,423,258030,75,0.682203,1,13124,...,7316.0,236.0,87,200_400,200_400,0.35_0.4,0.35_0.4,0.006629,258030,0.133535
2,A003,473,1253,0.377494,415,519995,394,0.167019,1,25657,...,14663.0,473.0,94,400 or more,400 or more,0.35_0.4,0.35_0.4,0.003664,519995,0.130986
3,A004,320,835,0.383234,624,521040,279,0.128125,2,20833,...,17920.0,320.0,111,200_400,200_400,0.35_0.4,0.35_0.4,0.005328,521040,0.105635
10,B001,426,1121,0.380018,479,536959,311,0.269953,1,20448,...,18318.0,426.0,101,400 or more,400 or more,0.35_0.4,0.35_0.4,0.004939,536959,0.12535
12,B003,288,746,0.386059,439,327494,109,0.621528,1,23170,...,12672.0,288.0,89,200_400,200_400,0.35_0.4,0.35_0.4,0.003841,327494,0.13279
19,B010,390,992,0.393145,188,186496,265,0.320513,2,7934,...,16770.0,390.0,82,200_400,200_400,0.35_0.4,0.35_0.4,0.010335,186496,0.250944
37,D008,340,916,0.371179,287,262892,204,0.4,2,6236,...,12240.0,340.0,84,200_400,200_400,0.35_0.4,0.35_0.4,0.01347,262892,0.16037
48,E009,401,1004,0.399402,224,224896,268,0.331671,2,25477,...,20050.0,401.0,99,400 or more,400 or more,0.35_0.4,0.35_0.4,0.003886,224896,0.276372
63,G004,392,1040,0.376923,479,498160,234,0.403061,2,15356,...,19600.0,392.0,90,200_400,200_400,0.35_0.4,0.35_0.4,0.005861,498160,0.115674
70,H001,297,755,0.393377,338,255190,166,0.441077,1,6856,...,12474.0,297.0,92,200_400,200_400,0.35_0.4,0.35_0.4,0.013419,255190,0.173412


In [88]:
# Select qualifying products:
# 3. Keep: discount rate of 37% or less, sales ratio greater than 36.5%, and conversion rate greater than 0.7%
stay_stocks3 = product_354[(product_354["discout"]<=0.37)&(product_354["Conversion Rate"]>0.007)&(product_354["Sales Ratio"]>0.365)] 
stay_stocks3

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range,Conversion Rate,stock value,Sales Ratio
60,G001,463,1266,0.365719,142,179772,268,0.421166,2,13011,...,21298.0,463.0,95,400 or more,400 or more,0.35_0.4,0.35_0.4,0.007302,179772,0.365719


In [90]:
# 4. Clearance: discount rate less than 37%, with a sales ratio of at most 36.5% or a conversion rate of at most 0.7%
# (<= on the thresholds so that this is the exact complement of the "keep" condition in step 3)
stay_stocks4 = product_354[(product_354["discout"]<0.37)&((product_354["Sales Ratio"]<=0.365)|(product_354["Conversion Rate"]<=0.007))]
stay_stocks4

Unnamed: 0,sale_name,sale_price,tag_price,discout,inventory,inventory_value,cost_price,profit_rate,SKU,uvs,...,Amount_refund,Unit_price,Num_users,价格分组,price range,discount rage,discount range,Conversion Rate,stock value,Sales Ratio
13,B004,491,1394,0.352224,396,552024,353,0.281059,2,14535,...,23077.0,491.0,102,400 or more,400 or more,0.35_0.4,0.35_0.4,0.007018,552024,0.142313
16,B007,548,1499,0.365577,325,487175,420,0.233577,2,29492,...,36168.0,548.0,122,400 or more,400 or more,0.35_0.4,0.35_0.4,0.004137,487175,0.226095
21,C002,201,553,0.363472,287,158711,105,0.477612,1,7835,...,8643.0,201.0,97,200_400,200_400,0.35_0.4,0.35_0.4,0.01238,158711,0.187435
22,C003,473,1321,0.358062,242,319682,254,0.463002,2,17244,...,20812.0,473.0,92,400 or more,400 or more,0.35_0.4,0.35_0.4,0.005335,319682,0.221939
24,C005,270,765,0.352941,178,136170,115,0.574074,2,12610,...,9450.0,270.0,92,200_400,200_400,0.35_0.4,0.35_0.4,0.007296,136170,0.291474
26,C007,403,1122,0.35918,258,289476,167,0.585608,2,20754,...,18941.0,403.0,104,400 or more,400 or more,0.35_0.4,0.35_0.4,0.005011,289476,0.242238
30,D001,346,951,0.363828,239,227289,269,0.222543,1,24418,...,12802.0,346.0,90,200_400,200_400,0.35_0.4,0.35_0.4,0.003686,227289,0.200942
32,D003,193,533,0.362101,417,222261,165,0.145078,1,27367,...,9457.0,193.0,91,1_200,1_200,0.35_0.4,0.35_0.4,0.003325,222261,0.131121
41,E002,389,1080,0.360185,629,679320,244,0.372751,2,24150,...,14004.0,389.0,94,200_400,200_400,0.35_0.4,0.35_0.4,0.003892,679320,0.088185
42,E003,486,1349,0.360267,354,477546,220,0.547325,2,19094,...,20898.0,486.0,77,400 or more,400 or more,0.35_0.4,0.35_0.4,0.004033,477546,0.12416
