<h1>Pricing Analysis</h1>
<br>
<p>This project demonstrates several methods for setting prices across a range of products.</p>
<p>The data is a list of metals with made-up figures for price, sales and supply price over an arbitrary timeframe.</p>
<p>Gold, silver and the rest are stand-ins for any range of similar products, such as washing machines or mattresses. The assumption is that they are versions of the same kind of product at different quality levels.</p>

In [70]:
# import libraries
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np

<h1>Constructing the Data</h1>

In [71]:
# dictionary of the product catalogue with price, sales and supply price
stock = {'gold':{'price':100, 'sales':10, 'supply_price': 50},
         'silver':{'price':70,'sales':50, 'supply_price': 30},
         'steel':{'price':50,'sales':80, 'supply_price':25},
         'tamahagane':{'price':45,'sales':90, 'supply_price':20},
         'bronze':{'price':10, 'sales':100, 'supply_price':4},
         'copper':{'price':8,'sales':120, 'supply_price':1}}
stock_df = pd.DataFrame(stock)
stock_df

              gold  silver  steel  tamahagane  bronze  copper
price          100      70     50          45      10       8
sales           10      50     80          90     100     120
supply_price    50      30     25          20       4       1


In [72]:
# create arrays of prices, sales and supply prices
price = np.array([stock[item]['price'] for item in stock])
sales = np.array([stock[item]['sales'] for item in stock])
supply_prices = np.array([stock[item]['supply_price'] for item in stock])

<h1>Demand Modelling</h1>
<p>In this section we use linear regression to estimate the relationship between price and sales for these products.</p>

<h2>Plotting the Actual Price/Sales Distribution</h2>

In [73]:
fig1 = go.Figure(data=[go.Scatter(x=sales, y=price, text=[item for item in stock], name='Actual')])

fig1.update_layout(
    title='Sales by Price and Quantity',
    xaxis_title='Sale Quantity',
    yaxis_title='Sale Price'
    
)
fig1.show()

<h1>Fit and Test Regression Models</h1>

<h2>Linear Regression</h2>

In [74]:
regr = linear_model.LinearRegression()
regr.fit(price.reshape(-1,1), sales)

In [75]:
intercept = regr.intercept_
coef = regr.coef_
sales_pred = regr.predict(price.reshape(-1,1))
r2 = r2_score(sales, sales_pred)
mse = mean_squared_error(sales, sales_pred)
print(f'coef: {coef}, intercept: {intercept}, r-squared: {r2}, mse: {mse}')

coef: [-1.0759781], intercept: 125.75030044064627, r-squared: 0.9322829612463872, mse: 87.46784172341654
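<p>As a cross-check on the fit above, the same straight line can be recovered with plain NumPy. This is a minimal sketch that swaps in <code>np.polyfit</code> for <code>LinearRegression</code> and computes R-squared by hand; the variable names <code>slope</code>, <code>fitted</code>, <code>ss_res</code> and <code>ss_tot</code> are illustrative and not part of the notebook.</p>

```python
import numpy as np

# Price and sales figures copied from the stock catalogue above
price = np.array([100, 70, 50, 45, 10, 8])
sales = np.array([10, 50, 80, 90, 100, 120])

# Degree-1 least-squares fit: sales ≈ slope * price + intercept
slope, intercept = np.polyfit(price, sales, 1)

# R-squared from the residual and total sums of squares
fitted = slope * price + intercept
ss_res = np.sum((sales - fitted) ** 2)
ss_tot = np.sum((sales - sales.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f'slope: {slope:.4f}, intercept: {intercept:.2f}, r-squared: {r_squared:.4f}')
```

<p>Read as a demand curve, the slope says that each extra £1 on the price is associated with roughly one fewer unit sold, within the observed price range.</p>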


<h2>Logarithmic Model</h2>

In [76]:
log_prices = np.log(price)
log_model = linear_model.LinearRegression()
log_model.fit(log_prices.reshape(-1,1), sales)
intercept = log_model.intercept_
coef = log_model.coef_
log_sales_pred = log_model.predict(log_prices.reshape(-1,1))
r2 = r2_score(sales, log_sales_pred)
mse = mean_squared_error(sales, log_sales_pred)
print(f'coef: {coef}, intercept: {intercept}, r-squared: {r2}, mse: {mse}')

coef: [-32.33624779], intercept: 187.93099082106096, r-squared: 0.7416428751382155, mse: 333.71128627980505


<h2>Exponential Model</h2>

In [77]:
log_sales = np.log(sales)
exp_model = linear_model.LinearRegression()
exp_model.fit(price.reshape(-1,1), log_sales)
intercept = exp_model.intercept_
coef = exp_model.coef_
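# exp_model is fitted on log(sales), so its predictions are on the log scale
exp_sales_pred = exp_model.predict(price.reshape(-1,1))
# NOTE: the metrics below compare log-scale predictions with raw sales;
# apply np.exp(exp_sales_pred) first for a like-for-like comparison
r2 = r2_score(sales, exp_sales_pred)
mse = mean_squared_error(sales, exp_sales_pred)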
print(f'coef: {coef}, intercept: {intercept}, r-squared: {r2}, mse: {mse}')

coef: [-0.02323808], intercept: 5.177580561031846, r-squared: -3.8539186375683867, mse: 6269.6449068591655
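<p>The negative R-squared above arises because the exponential model predicts <code>log(sales)</code> while the metrics compare those values with raw sales. A minimal sketch of scoring the model in units sold follows. It uses <code>np.polyfit</code> in place of <code>LinearRegression</code> to stay dependency-light, and the names <code>exp_sales_units</code>, <code>r2_units</code> and <code>mse_units</code> are assumptions introduced here for illustration.</p>

```python
import numpy as np

# Price and sales figures copied from the stock catalogue above
price = np.array([100, 70, 50, 45, 10, 8])
sales = np.array([10, 50, 80, 90, 100, 120])

# Fit log(sales) against price, mirroring the exponential model cell
slope, intercept = np.polyfit(price, np.log(sales), 1)

# The fit predicts log(sales); exponentiate to get back to units sold
exp_sales_units = np.exp(intercept + slope * price)

# Score the back-transformed predictions against the raw sales figures
mse_units = np.mean((sales - exp_sales_units) ** 2)
ss_tot = np.sum((sales - sales.mean()) ** 2)
r2_units = 1 - np.sum((sales - exp_sales_units) ** 2) / ss_tot

print(f'r-squared: {r2_units}, mse: {mse_units}')
```

<p>Scored this way the exponential model can be compared directly with the linear and logarithmic models, since all three are then measured in units sold.</p>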


In [78]:
fig1.add_trace(go.Scatter(x=sales_pred, y=price, name='Linear Prediction'))
fig1.add_trace(go.Scatter(x=log_sales_pred, y=price, name='Logarithmic Prediction'))
# fig1.add_trace(go.Scatter(x=exp_sales_pred, y=price, name='Exponential Prediction')) omitted due to poor fit

<p>With only six data points we can extract rough indicators at best, but of the models tested the plain linear model has the lowest mean squared error (about 87.5, against about 333.7 for the logarithmic model) and a reasonably high R-squared (about 0.93).</p>
<p>Although individual points deviate from the fitted line, we will assume a linear relationship between price and sales.</p>
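<p>Because the scores above are measured on the same six points used for fitting, one optional sanity check is leave-one-out validation: refit on five points and predict the sixth, for each point in turn. The sketch below is not part of the original analysis; the helper <code>loo_mse</code> is hypothetical and uses <code>np.polyfit</code> rather than scikit-learn.</p>

```python
import numpy as np

# Price and sales figures copied from the stock catalogue above
price = np.array([100, 70, 50, 45, 10, 8], dtype=float)
sales = np.array([10, 50, 80, 90, 100, 120], dtype=float)

def loo_mse(x, y):
    """Leave-one-out mean squared error for a straight-line fit of y on x."""
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i  # boolean mask that drops point i
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return float(np.mean(errors))

linear_loo = loo_mse(price, sales)        # sales against price
log_loo = loo_mse(np.log(price), sales)   # sales against log(price)
print(f'linear LOO mse: {linear_loo:.2f}, logarithmic LOO mse: {log_loo:.2f}')
```

<p>Leave-one-out errors are never smaller than the in-sample errors for a least-squares line, so these figures give a more cautious view of how each model might behave on a new product.</p>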

In [79]:
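# plot the linear model across every integer price from 1 to 100
all_prices = np.arange(1, 101)
linear_sales_pred = regr.predict(all_prices.reshape(-1,1))
# log_model was fitted on log(price), so transform the prices before predicting
log_sales_pred = log_model.predict(np.log(all_prices).reshape(-1,1))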
fig2 = go.Figure(data=[
    go.Scatter(x=sales,y=price, name='Actual'),
    go.Scatter(x=linear_sales_pred, y=all_prices, name='Linear Model')
])
fig2.update_layout(
    title='Sales Price Model',
    xaxis_title='Predicted Sale Quantity',
    yaxis_title='For Sale Price'
    
)

<p>The plot above shows the sales predicted by the linear model at every integer price point from £1 to £100. If we introduce a new product, the model predicts, for example, sales of about 60 units at a price of £61, or about 101 units at a price of £23.</p>
<p>This does not yet take profit margins into account; they are discussed later.</p>
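<p>The two example predictions can be reproduced directly from the fitted line. The sketch below hard-codes the coefficients printed by the linear regression cell; the function names <code>predicted_sales</code> and <code>price_for_sales</code> are illustrative helpers, not part of the notebook.</p>

```python
# Coefficients taken from the linear regression output above (rounded)
INTERCEPT = 125.7503
SLOPE = -1.0759781

def predicted_sales(price):
    """Units the linear model expects to sell at the given price."""
    return INTERCEPT + SLOPE * price

def price_for_sales(target_sales):
    """Invert the line: the price at which the model predicts target_sales units."""
    return (target_sales - INTERCEPT) / SLOPE

print(round(predicted_sales(61)))     # 60
print(round(predicted_sales(23)))     # 101
print(round(price_for_sales(60), 2))  # price that targets 60 units
```

<p>The line is only a plausible guide inside the observed price range of £8 to £100; beyond that it eventually predicts negative sales.</p>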