<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Setup" data-toc-modified-id="Setup-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Setup</a></span></li><li><span><a href="#Cost-Plus-Pricing" data-toc-modified-id="Cost-Plus-Pricing-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Cost-Plus Pricing</a></span></li><li><span><a href="#Competitive-Pricing" data-toc-modified-id="Competitive-Pricing-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Competitive Pricing</a></span></li></ul></div>

## Setup

In [1]:
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import numpy as np

import plotly.express as px
import plotly.graph_objects as go

import datetime

import os
if not os.path.exists("images"):
    os.mkdir("images")

In [2]:
df = pd.read_csv("Products.csv")
print(df.shape)
print(df.columns)

(168, 26)
Index(['Id', 'Price-Max', 'Price-Min', 'Availability', 'Condition',
       'Price-Currency', 'Date-Seen', 'On-Sale', 'Merchant', 'Shipping',
       'Price-Source-URLs', 'ASINS', 'Brand', 'Category-Labels', 'Date-Added',
       'Date-Updated', 'EAN', 'Image URLs', 'Keys', 'Manufacturer',
       'Manufacturer-Id', 'Name', 'Primary-Category', 'Review-Source-URLs',
       'UPC', 'Weight'],
      dtype='object')


In [3]:
# Remove erroneously priced product

df = df[df["Price-Min"] > 1]

In [4]:
# Convert Date Seen string to datetime

df["Date-Seen"] = df["Date-Seen"].apply(lambda x: x.split(","))
df["Date-Seen"] = df["Date-Seen"].apply(lambda x: [datetime.datetime.strptime(y,"%Y-%m-%dT%H:%M:%SZ") for y in x])
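As an alternative to the `strptime` loop above, the same conversion can be sketched with `pd.to_datetime`. This is only an illustration on a made-up frame: the `sample` data and product names are hypothetical, and it assumes every timestamp follows the `%Y-%m-%dT%H:%M:%SZ` pattern with no spaces around the commas.

```python
import pandas as pd

# Hypothetical rows mimicking the comma-separated "Date-Seen" strings
sample = pd.DataFrame({
    "Name": ["tv-a", "tv-b"],
    "Date-Seen": [
        "2018-05-26T15:00:00Z,2018-05-27T13:00:00Z",
        "2018-06-01T08:30:00Z",
    ],
})

# Split each string into a list, then parse each list in a single call
sample["Date-Seen"] = sample["Date-Seen"].str.split(",").apply(
    lambda stamps: list(pd.to_datetime(stamps, format="%Y-%m-%dT%H:%M:%SZ"))
)
print(sample["Date-Seen"].iloc[0])
```

Each cell still holds a list of timestamps, so a later `explode` on the column would work the same way.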

## Cost-Plus Pricing

The box plot below shows the range of minimum and maximum prices for each TV model. Some ranges are wide, indicating that markup varies considerably across retailers; for those models, cost-plus pricing may be advantageous. Other models have very narrow ranges, in which case cost-plus pricing may not be the most effective strategy: a markup well beyond that of other retailers may deter consumers.

In [6]:
fig = go.Figure()
fig.add_trace(go.Box(x=df.Name, y=df["Price-Min"], name="Price-Min"))
fig.add_trace(go.Box(x=df.Name, y=df["Price-Max"], name="Price-Max"))
fig.update_layout(
    yaxis_title='Price',
    boxmode='group'
)
fig.show()

In [7]:
# fig.write_image("images/cost_plus_pricing.png")
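The spread that the box plot conveys visually can also be put into numbers. The sketch below is a minimal illustration on made-up data: the model names, the prices, the `unit_cost` values, and the `markup` rate are all assumptions, not values from `Products.csv`. It computes each model's listed price range and checks whether a hypothetical cost-plus price would fall inside it.

```python
import pandas as pd

# Hypothetical listings; the notebook itself would use df from Products.csv
listings = pd.DataFrame({
    "Name": ["model-a", "model-a", "model-b", "model-b"],
    "Price-Min": [1800.0, 2100.0, 790.0, 800.0],
    "Price-Max": [2200.0, 2600.0, 800.0, 810.0],
})

# Lowest and highest listed price per model, and the spread between them
spread = listings.groupby("Name").agg(
    low=("Price-Min", "min"),
    high=("Price-Max", "max"),
)
spread["range"] = spread["high"] - spread["low"]

# Cost-plus price = unit cost * (1 + markup); both inputs are assumed values
unit_cost = pd.Series({"model-a": 1700.0, "model-b": 700.0})
markup = 0.25
spread["cost_plus"] = unit_cost * (1 + markup)

# True when the cost-plus price stays within what other retailers charge
spread["within_range"] = spread["cost_plus"].between(spread["low"], spread["high"])
print(spread)
```

In this toy data the wide-range model absorbs the markup, while the narrow-range model ends up priced above every other listing, which is the situation the paragraph above warns about.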

## Competitive Pricing

Using the median prices per product, high-end models can be grouped as those priced at \\$2,000 and above, mid-range models at \\$1,000 - \\$1,999, and low-end models at \\$999 and below.

In [8]:
# Median values by product
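# numeric_only=True skips string columns, which recent pandas versions refuse to aggregate
df.groupby("Name").median(numeric_only=True).sort_values("Price-Min", ascending=False)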

| Name | Price-Max | Price-Min | On-Sale | EAN | UPC |
|---|---|---|---|---|---|
| Sony - 55 class - oled - a1e series - 2160p - smart - 4k uhd tv with hdr" | 3099.35 | 3050.0 | False | NaN | 27242910000.0 |
| Samsung - 55 class - led - q8f series - 2160p - smart - 4k uhd tv with hdr" | 2197.99 | 2097.99 | True | NaN | 887000000000.0 |
| Samsung - 55 class - led - curved - q7c series - 2160p - smart - 4k uhd tv with hdr" | 2197.99 | 1997.99 | True | NaN | 887000000000.0 |
| X900f-series 55-class hdr uhd smart led tv | 1398.995 | 1398.995 | False | NaN | 27242910000.0 |
| Sony - 55 class - led - x800e series - 2160p - smart - 4k uhd tv with hdr" | 1098.5 | 998.0 | False | NaN | 27242900000.0 |
| Samsung 55 class 4k (2160p) smart led tv (un55ku7000)" | 954.99 | 954.99 | False | NaN | 887000000000.0 |
| Samsung - 55 class - led - curved - mu6490 series - 2160p - smart - 4k ultra hd tv with hdr" | 799.99 | 797.99 | False | NaN | 887000000000.0 |
| Lg - 55 class - led - uj7700 series - 2160p - smart - 4k uhd tv with hdr" | 1299.0 | 796.99 | True | NaN | 719000000000.0 |
| Sony xbr55x700d 55-inch 4k ultra hd smart led tv (2016 model) | 789.99 | 789.99 | False | NaN | 27242900000.0 |
| Hisense - 55 class - led - h9 series - 2160p - smart - 4k uhd tv with hdr" | 699.99 | 699.99 | False | NaN | 888000000000.0 |

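The tier grouping described in this section could be assigned programmatically rather than by eye. The sketch below uses `pd.cut` on four median `Price-Min` values taken from the output above; the shortened product labels are hypothetical stand-ins for the full names, and treating everything below \\$1,000 as low-end (and below \\$2,000 as mid-range) is an assumption about how prices between the stated bands should be handled.

```python
import pandas as pd

# Median Price-Min per product; labels are shortened, hypothetical names
median_min = pd.Series(
    {
        "Sony a1e": 3050.0,
        "Samsung q7c": 1997.99,
        "Sony x800e": 998.0,
        "Hisense h9": 699.99,
    },
    name="Price-Min",
)

# Left-closed bins: [0, 1000) low-end, [1000, 2000) mid-range, [2000, inf) high-end
tiers = pd.cut(
    median_min,
    bins=[0, 1000, 2000, float("inf")],
    labels=["low-end", "mid-range", "high-end"],
    right=False,
)
print(tiers)
```

With `right=False` a product at exactly \\$2,000 lands in the high-end tier, matching the "\\$2,000 and above" wording.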

In [11]:
# One row per (listing, observation date), ordered chronologically
exploded_date_seen = df.explode("Date-Seen").sort_values("Date-Seen")
price_min = exploded_date_seen["Price-Min"]

# Tier bounds are applied to each listing's Price-Min and match the grouping in the text
# High-end: $2,000 and above
fig1 = px.line(exploded_date_seen[price_min >= 2000], x="Date-Seen", y="Price-Min", color="Name")
fig1.update_layout(showlegend=False)#, title="High-End Price by Date Seen")
fig1.show()

# Mid-range: $1,000 up to, but not including, $2,000
fig2 = px.line(exploded_date_seen[(price_min >= 1000) & (price_min < 2000)],
               x="Date-Seen", y="Price-Min", color="Name")
fig2.update_layout(showlegend=False)#, title="Mid-Range Price by Date Seen")
fig2.show()

# Low-end: below $1,000
fig3 = px.line(exploded_date_seen[price_min < 1000], x="Date-Seen", y="Price-Min", color="Name")
fig3.update_layout(showlegend=False)#, title="Low-End Price by Date Seen")
fig3.show()

In [12]:
# fig1.write_image("images/high_end_time_series.png")
# fig2.write_image("images/mid_range_time_series.png")
# fig3.write_image("images/low_end_time_series.png")

The graphs above show price fluctuations over time for each product. Higher-end models fluctuate by more than \\$2,000, which makes a case for premium competitive pricing: if the eCommerce platform can provide incentives that justify the added price, these models may sell well. The same high-end models could also serve as loss leaders: if they are sold below market rate, the platform may be able to pair them with other high-end products that carry a greater markup, since customers shopping for high-end products likely have the income to make additional purchases. <br><br> Mid-range and low-end product prices fluctuate by \\$800 and \\$600 respectively, making them the best candidates for price matching, or possibly loss leading. Price matching is likely the safest route, allowing the platform to make a relatively small but less risky profit, whereas loss leading may result in overall losses if the platform cannot persuade consumers to buy additional products to offset the loss.
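The fluctuation figures quoted above are read from the charts; a rough way to compute them is sketched below. The observations are made up, and interpreting "price matching" as adopting the most recently observed `Price-Min` is an assumption for illustration, not a rule stated in this notebook.

```python
import pandas as pd

# Hypothetical price observations over time for two products
obs = pd.DataFrame({
    "Name": ["tv-high", "tv-high", "tv-high", "tv-mid", "tv-mid"],
    "Date-Seen": pd.to_datetime(
        ["2018-01-05", "2018-02-10", "2018-03-15", "2018-01-07", "2018-02-20"]
    ),
    "Price-Min": [4500.0, 3100.0, 2400.0, 1500.0, 1100.0],
})

# Fluctuation = highest minus lowest observed Price-Min per product
stats = obs.groupby("Name")["Price-Min"].agg(["min", "max"])
stats["fluctuation"] = stats["max"] - stats["min"]

# Assumed price-matching rule: adopt the most recently observed price
stats["match_price"] = (
    obs.sort_values("Date-Seen").groupby("Name")["Price-Min"].last()
)
print(stats)
```

On the real data, the same aggregation could be run on `exploded_date_seen` to check the \\$2,000, \\$800, and \\$600 figures against the plotted series.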