# A Theoretical and Practical Review of Elasticity and Pricing Strategy

Pricing is a complex topic, and every business in the world has to deal with it. Whether you are a nine-year-old with a lemonade stand or a transnational corporation, price is the mechanism that you use to operate your business. You provide a good or service. A customer pays the price for that good or service.

Over the last twenty-five years or so, I have had the privilege to work on several pricing-related projects. Each one was unique, and each required considerable thought and, ultimately, a specific solution. That said, some general concepts are helpful when you tackle pricing problems. That's my goal with this article: not to provide a one-size-fits-all pricing solution that works everywhere, but a general guide to data science techniques that are useful for pricing-related problems.

Note that in this exercise, I am using a Python Jupyter notebook inside Watson Studio from IBM. For my visualizations, I am using Plotly. Also, please note that all data used in this notebook is 100% fake. I manufactured it for this demonstration. Although it is completely fake, it does accurately represent several projects on which I have worked in the past.

For more detail, refer to my article,
<a href="https://jshadgriffin.medium.com/a-theoretical-and-practical-review-of-elasticity-and-pricing-strategy-a42b1cc09dfe" target="_blank" rel="noopener noreferrer">A Theoretical and Practical Review of Elasticity and Pricing Strategy</a> and <a href="https://youtu.be/45CFpetqR30" target="_blank" rel="noopener noreferrer">video</a>.
This notebook runs on Python.

## Table of Contents

1. [Review of Microeconomics](#1.0)<br>
    1.1 [Import Libraries](#1.1)<br>
    1.2 [Create Demand Curve and Demand Schedule](#1.2)<br>
    1.3 [Price Elasticity of Demand and Price Elasticity of Revenue](#1.3)<br>
2. [Estimate Elasticity with a Real World Example](#2.0)<br>
    2.1 [Import data, transform data and data exploration](#2.1)<br>
    2.2 [Estimate Price Elasticity of Demand](#2.2)<br>
    2.3 [Creating a Demand Curve](#2.3)<br>
    2.4 [Estimate Price Elasticity of Revenue](#2.4)<br>
3. [Identify Different Price Elasticities of Demand and Price Discriminate](#3.0)<br>
    3.1 [Read, Explore and Transform Data](#3.1)<br>
    3.2 [Build a Demand Curve for All Stores](#3.2)<br>
    3.3 [Collect and Exploit the Residuals](#3.3)<br>
    3.4 [Cluster Stores](#3.4)<br>
    3.5 [Estimate Revenue Elasticity for Store Clusters](#3.5)<br>
    3.6 [Build Demand Curve for Each Store Cluster](#3.6)<br>
    
    
    
    
    
    


## 1.0 Review of Microeconomics <a id="1.0"></a>

If you took economics in school, you probably remember drawing a bunch of supply and demand curves. Hopefully, you'll remember the elegance of equilibriums and optimal spaces where things seemed to fit "just right." (If you are not a dork like me and remember econ 101 as a series of formulas you had to memorize, that's ok too :)).

You'll probably also remember something called elasticity. Elasticity is an integral part of price theory but also has tons of other applications. In this notebook, we walk through the basic concepts associated with pricing and price elasticity.




### 1.1 Import Libraries <a id="1.1"></a>

In [None]:
# Install/upgrade the plotting libraries before importing them.
!pip install plotly --upgrade
!pip install chart_studio --upgrade

# Data handling and modeling
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Plotting
import plotly
import plotly.graph_objs as go
import plotly.offline  # provides plotly.offline.iplot, used for every chart below
import chart_studio.plotly as py



### 1.2 Create Demand Curve and Demand Schedule <a id="1.2"></a>


In [None]:
trace = go.Scatter(
    x = [1,2,3,4,5,6,6.5,7,8,9,10],
    y = [100,90,80,70,60,50,45,40,30,20,10],
    mode = 'lines'
)



layout = go.Layout(
    title='Demand Curve',
    xaxis=dict(
        # Nested title.text / title.font replaces the deprecated `titlefont` attribute.
        title=dict(
            text='Price',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    ),
    yaxis=dict(
        title=dict(
            text='Quantity',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    )
)
    
data=[trace]  
fig = go.Figure(data=data, layout=layout)

plotly.offline.iplot(fig, filename='shapes-lines')

A demand curve reflects the relationship between price and quantity. All companies face demand curves. All businesses have customers, and those customers will buy a certain number of items based on the price you charge. If you lower the price, you will sell more. If you raise the price, you will sell less.  This simple fact is called the law of demand.

Note that our demand curve has quantity on the vertical axis and price on the horizontal axis. This is common in most graduate-level economics textbooks. Undergraduate-level textbooks usually do the opposite (price is on the vertical axis). I know that doesn't make any sense, but we'll save a discussion of why this is the case for another day.

Let's look specifically at the demand curve we just drew. Based on the relationship between price and quantity, if the firm charges one dollar, it will sell 100 units. At 10 dollars, it will sell ten units, and for 6 dollars, it will sell 50 units.

Of course, there are a bunch of other options. You can easily see them in a demand schedule. A demand schedule is just the data underlying a demand curve.



In [None]:
d = {'price': [1,2,3,4,5,6,6.5,7,8,9,10], 'quantity': [100,90,80,70,60,50,45,40,30,20,10]}
df_demand_schedule= pd.DataFrame(data=d)

df_demand_schedule

###  1.3 Price Elasticity of Demand and Price Elasticity of Revenue <a id="1.3"></a>

Price elasticity of demand is the percentage change in quantity divided by the percentage change in price. It is a unitless number that details the price sensitivity at a particular point on the linear demand curve. Note that the elasticity of demand will vary as you move up and down a traditional demand curve. For example, on the curve above, the price elasticity of demand when the price is 2 dollars differs from the price elasticity of demand when the price is 8 dollars. I will show this graphically in just a second.

Note that there is also something called a constant elasticity demand curve where elasticity doesn't change from point to point. Put that aside for the moment. We'll get back to constant elasticity demand curves in part two of our exercise.

Here is the equation for price elasticity of demand.

(% change in quantity) / (% change in price), or ((Q2-Q1)/Q1) / ((P2-P1)/P1)

Note that the equation above will always be a negative number. If you increase (decrease) price, the quantity will always decrease (increase). Sometimes you'll see the equation like this: -1 * (((Q2-Q1)/Q1) / ((P2-P1)/P1))

The -1 at the front of the equation makes it positive. In this exercise, we will just let it be a negative number.

In advanced economics studies, the price elasticity of demand is usually expressed as a derivative (calculus).

(dQ/dP) * (P1/Q1)
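To make the derivative form concrete, here is a minimal sketch. It assumes the straight line behind the demand schedule above, Q = 110 - 10 * P (which reproduces every row of the schedule), so dQ/dP is -10 everywhere. The function names are mine and purely illustrative.

```python
# Linear demand implied by the demand schedule above: Q = 110 - 10 * P
def quantity(price):
    return 110 - 10 * price


def point_elasticity(price):
    """Point price elasticity of demand: (dQ/dP) * (P / Q)."""
    slope = -10  # dQ/dP is constant on a straight-line demand curve
    return slope * price / quantity(price)


elasticity_at_2 = point_elasticity(2)  # -10 * 2 / 90
elasticity_at_8 = point_elasticity(8)  # -10 * 8 / 30

print(round(elasticity_at_2, 3), round(elasticity_at_8, 3))
```

The slope is the same at both prices, but the ratio P/Q is not, which is why elasticity changes as you move along a linear demand curve. The percentage-change version computed from the rows of the demand schedule uses discrete steps, so its numbers will differ somewhat from these point values.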

Let's add elasticity to our demand schedule, then look at it visually.

In [None]:
df_demand_schedule['Price_Elasticity_of_Demand']=(df_demand_schedule.quantity.pct_change() / df_demand_schedule.price.pct_change())
df_demand_schedule

In [None]:
x1=df_demand_schedule['price']
y1=df_demand_schedule['quantity']
z1=df_demand_schedule['Price_Elasticity_of_Demand']

trace1 = go.Scatter(
    x=x1,
    y=y1,
    name='Demand Curve'
)
trace2 = go.Scatter(
    x=x1,
    y=z1,
    name='Elasticity',
    yaxis='y2'
)
data = [trace1, trace2]
layout = go.Layout(
    title='Price Elasticity of Demand and the Demand Schedule',
    yaxis=dict(
        title='Quantity'
    ),
    yaxis2=dict(
        # Nested title.text / title.font replaces the deprecated `titlefont` attribute.
        title=dict(
            text='Price Elasticity of Demand',
            font=dict(color='rgb(148, 103, 189)')
        ),
        tickfont=dict(
            color='rgb(148, 103, 189)'
        ),
        overlaying='y',
        side='right'
    ),
    xaxis=dict(
        title=dict(
            text='Price',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    )
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='shapes-lines')

Note that when the price elasticity of demand is greater than -1 (that is, between -1 and 0), demand is inelastic: a 1% increase in price will lower quantity by less than 1%. When the price elasticity of demand is less than -1, demand is elastic: a 1% increase in price will lead to a greater than 1% decrease in quantity. When it is equal to -1, demand is unit elastic: a 1% increase in price will lead to a 1% decrease in quantity.
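The three cases can be written as a small helper. This is only a sketch; the function name and the labels are my own shorthand, using the sign convention from this notebook (elasticity left as a negative number).

```python
def classify_demand(elasticity):
    """Label a (negative) price elasticity of demand relative to -1."""
    if elasticity < -1:
        return "elastic"        # quantity falls by more than 1% per 1% price increase
    if elasticity > -1:
        return "inelastic"      # quantity falls by less than 1% per 1% price increase
    return "unit elastic"       # quantity falls by exactly 1% per 1% price increase


for value in (-0.25, -1.0, -2.5):
    print(value, classify_demand(value))
```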

Another critically important metric is revenue. Revenue is (price*quantity).

Typically, you'll want to choose the price that maximizes revenue or profit.

Price elasticity of revenue is another critical metric. Similar to the price elasticity of demand, the price elasticity of revenue is the percentage change in revenue divided by the percentage change in price.

Calculate the price elasticity of revenue with the following formula.

(%change in revenue)/(%change in price)

or

((R2-R1)/R1) / ((P2-P1)/P1)

In advanced economics studies, the price elasticity of revenue is almost always expressed as a derivative (calculus).

(dR/dP) * (P1/R1)
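One relationship worth knowing: because revenue is price times quantity, the point price elasticity of revenue equals one plus the point price elasticity of demand. Here is a minimal check, again assuming the linear demand Q = 110 - 10 * P behind the demand schedule above. The function names are illustrative.

```python
def quantity(price):
    return 110 - 10 * price              # linear demand behind the schedule


def revenue(price):
    return price * quantity(price)       # R = P * Q = 110P - 10P^2


def demand_elasticity(price):
    return -10 * price / quantity(price)  # (dQ/dP) * (P / Q)


def revenue_elasticity(price):
    marginal_revenue = 110 - 20 * price   # dR/dP
    return marginal_revenue * price / revenue(price)  # (dR/dP) * (P / R)


for p in (2, 5.5, 8):
    print(p, round(demand_elasticity(p), 3), round(revenue_elasticity(p), 3))
```

At each price the second number is the first number plus one. That is why revenue elasticity crosses zero exactly where demand elasticity equals -1.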

Now that we have defined revenue and the price elasticity of revenue let's add them to our demand schedule.

In [None]:
df_demand_schedule['Revenue']=df_demand_schedule['price']*df_demand_schedule['quantity']
df_demand_schedule['Price_Elasticity_of_Revenue']=(df_demand_schedule.Revenue.pct_change() / df_demand_schedule.price.pct_change())
df_demand_schedule

Now, let's plot revenue alongside demand.

In [None]:
x1=df_demand_schedule['price']
y1=df_demand_schedule['quantity']
z1=df_demand_schedule['Revenue']

trace1 = go.Scatter(
    x=x1,
    y=y1,
    name='Demand Curve'
)
trace2 = go.Scatter(
    x=x1,
    y=z1,
    name='Revenue',
    yaxis='y2'
)
data = [trace1, trace2]
layout = go.Layout(
    title='Revenue, Quantity and Price',
    yaxis=dict(
        title='Quantity'
    ),
    yaxis2=dict(
        # Nested title.text / title.font replaces the deprecated `titlefont` attribute.
        title=dict(
            text='Revenue',
            font=dict(color='rgb(148, 103, 189)')
        ),
        tickfont=dict(
            color='rgb(148, 103, 189)'
        ),
        overlaying='y',
        side='right'
    ),
    xaxis=dict(
        title=dict(
            text='Price',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    )
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='shapes-lines')

Now, let's plot price elasticity of revenue next to Demand.


In [None]:
x1=df_demand_schedule['price']
y1=df_demand_schedule['quantity']
z1=df_demand_schedule['Price_Elasticity_of_Revenue']

trace1 = go.Scatter(
    x=x1,
    y=y1,
    name='Demand Curve'
)
trace2 = go.Scatter(
    x=x1,
    y=z1,
    name='Price Elasticity of Revenue',
    yaxis='y2'
)
data = [trace1, trace2]
layout = go.Layout(
    title='Price Elasticity of Revenue, Quantity and Price',
    yaxis=dict(
        title='Quantity'
    ),
    yaxis2=dict(
        # Nested title.text / title.font replaces the deprecated `titlefont` attribute.
        title=dict(
            text='Price Elasticity of Revenue',
            font=dict(color='rgb(148, 103, 189)')
        ),
        tickfont=dict(
            color='rgb(148, 103, 189)'
        ),
        overlaying='y',
        side='right'
    ),
    xaxis=dict(
        title=dict(
            text='Price',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    )
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='shapes-lines')

The price elasticity of revenue is similar to, but slightly different from, the price elasticity of demand. Price elasticity of revenue is centered on zero, which makes it easier to interpret. If revenue elasticity is greater than zero, a price increase will increase revenue. If it is less than zero, increasing price will lower revenue.

Based on the charts above, you can see that a price between 5 and 6 dollars will maximize revenue. To maximize profit, you'd need to incorporate cost information.
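If you want to pin the number down rather than read it off the chart, a simple grid search works. This sketch assumes the linear demand Q = 110 - 10 * P behind the demand schedule and checks prices in one-cent steps. The variable names are illustrative.

```python
def revenue(price):
    return price * (110 - 10 * price)  # R = P * Q with Q = 110 - 10 * P


# Candidate prices from 1.00 to 10.00 in one-cent steps
prices = [cents / 100 for cents in range(100, 1001)]

best_price = max(prices, key=revenue)
best_revenue = revenue(best_price)

print(best_price, best_revenue)
```

The search lands midway between 5 and 6 dollars, the same point where the price elasticity of revenue crosses zero.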




Another concept that is critical to understand is price discrimination. Let's say that two 22-year-old men walk into a convenience store off an old highway in rural West Texas to buy a bottle of water. This part of Texas is a desert. It is the middle of July. The sun is hot, and rain is scarce. One of the men drove to the store in a car with air conditioning. The second man walked three miles to get there. The man who walked is likely to be thirstier than the man who drove, right? You could even say that the demand curve of the man who walked is higher than that of the man who drove. Even though the store owner knows this, he really can't charge the thirstier man more for the water, even though the thirstier man would likely pay more. In other words, the store owner cannot price discriminate between the two customers.

Price discrimination is charging different prices to different customers based on their demand for your product. The more a firm can price discriminate, the higher their profits will be. There are books written on why this is so, but for now, understanding this basic fact is good enough. Be careful, though; sometimes, price discrimination is illegal. There are many situations where it is lawful, however. The airlines are masters of price discrimination. Historically, they separate consumers who book travel early versus those who book travel late. The earlier you book a flight, typically the cheaper the fare. Movie theaters that offer a senior discount also use price discrimination. They offer a lower price because seniors have a different demand curve than younger moviegoers.

One last economics concept to understand related to pricing is inflation. Inflation is the general upward drift in prices that occurs across an economy over time. If you've lived life, you have probably noticed that most things get a little more expensive each year. Another way to think of inflation is as the change in the cost of living. In the United States, inflation has been pretty low over the last twenty years. Typically, the US inflation rate is estimated to be about 1% to 3% annually. Understanding inflation is essential because when we set prices in the real world, we must take it into account. Usually, this means that price changes are set relative to inflation. For example, if you have a market that is ripe for a price increase, you'll likely increase prices by inflation plus some percentage. If the inflation rate is 1.5%, you may decide to raise prices by inflation plus 5%. This means the actual change in prices is 6.5%. Likewise, if you want to decrease the price, you can hold it constant. If inflation is 1.5% and the price of your product remains unchanged year over year, essentially, this amounts to a 1.5% decrease in price. Think of it like this.
If everything else in the economy increases in price by 1.5% and your product's price did not change, in relative or real terms, the price of your product is lower than it was the previous year.

It's all relative, right? I remember when I was a kid, and all my friends were growing taller when I wasn't. I wasn't shrinking, but it sure felt like it.
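The nominal-versus-real arithmetic from the inflation discussion can be sketched in a couple of lines. The function name is illustrative; the rates are the ones used in the example above.

```python
def real_price_change(nominal_change, inflation):
    """Price change relative to the general price level, as a fraction."""
    return (1 + nominal_change) / (1 + inflation) - 1


# Holding the price constant while inflation runs at 1.5%
held_constant = real_price_change(0.0, 0.015)

# Raising the price by inflation plus 5% (6.5% nominal)
inflation_plus_five = real_price_change(0.065, 0.015)

print(f"{held_constant:.2%}", f"{inflation_plus_five:.2%}")
```

Strictly speaking, the real changes come out slightly smaller in size than 1.5% and 5%, because the division by (1 + inflation) shaves a little off. At low inflation rates, the simple add-and-subtract rule of thumb is close enough.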

Now that we understand the basic concepts let's move to a real-world example.

## 2.0 Estimate Elasticity with a Real World Example <a id="2.0"></a>

In this use case, we will examine price elasticity for a regional retailer with 219 locations. The type of retailer isn't essential. It could be a convenience store, a restaurant, or a coffee shop. Whatever the business of the retailer, their data should look the same.

It is important to note that this data is 100% fake and it belongs to me. I created this data from scratch. Although it is synthetic, it is very consistent with real data I have worked with in the past.

The success of a retailer depends on several factors. One is management and management decisions. Pricing, for example, is 100% controllable by management. Environmental factors surrounding the store are also critical and typically not very controllable. For example, a retailer will naturally do better if the people living around the store are wealthy than if they are poor.

Our data set in this exercise is a combination of controllable and non-controllable factors. The controllable factor is the price. We also have numerous non-controllable environmental factors.

Our goal in this exercise is to understand better the relationship between price, quantity, and revenue. We achieve this goal by estimating the price elasticity of demand and the price elasticity of revenue.

### 2.1 Import data, transform data and data exploration <a id="2.1"></a>

The first step is to read in the data from GitHub.

In [None]:
!rm -f RETAIL_DATA.csv
!wget https://raw.githubusercontent.com/shadgriffin/Pricing_Tutorial/master/RETAIL_DATA.csv


In [None]:
pd_datax = pd.read_csv("RETAIL_DATA.csv")

df_retail = pd_datax
df_retail.head()

The first few rows of the data set should appear above.

Here is a definition of each field.

STORE_ID is a unique id specific to each retail outlet.

PERCENTAGE_OF_RENTERS is the percentage of households surrounding the store that rent their housing.

PERCENT_HAVING_CHILDREN is the percentage of households surrounding the store that have children.

AVERAGE_INCOME is the average annual income of the households surrounding the store.

AVERAGE_AGE_IN_YEARS is the average age of the head of household in the vicinity of the retail outlet.

AVERAGE_LENGTH_OF_RESIDENCE is an average of the time individuals surrounding the retail outlet have lived at their current address.

PERCENT_SPEAKING_SPANISH is the percentage of households surrounding the store that speak Spanish.

PRICE is the average price across multiple items sold at the retail outlet.

QUANTITY is the number of items sold by the retail outlet in the last year.

REVENUE is the total revenue for the store in the last year.

This exercise aims to calculate the price elasticity of demand and the price elasticity of revenue. (Please look at the first section in this series to understand these terms.) This isn't that hard. To estimate an elasticity, you can use a standard ordinary least squares regression and natural log (base e) transformed variables.
The first step is to take the natural log of each variable. The cell below does this. Note the new variables created at the end of the DF.

In [None]:
df_retail['LN_PRICE'] = np.log((df_retail.PRICE))
df_retail['LN_REVENUE'] = np.log((df_retail.REVENUE))
df_retail['LN_QUANTITY'] = np.log((df_retail.QUANTITY))
df_retail['LN_INCOME'] = np.log((df_retail.AVERAGE_INCOME))
df_retail['LN_AVERAGE_AGE_IN_YEARS'] = np.log((df_retail.AVERAGE_AGE_IN_YEARS))
df_retail['LN_AVERAGE_LENGTH_OF_RESIDENCE'] = np.log((df_retail.AVERAGE_LENGTH_OF_RESIDENCE))
df_retail['LN_PERCENT_SPEAKING_SPANISH'] = np.log((df_retail.PERCENT_SPEAKING_SPANISH))
df_retail['LN_PERCENT_HAVING_CHILDREN'] = np.log((df_retail.PERCENT_HAVING_CHILDREN))
df_retail['LN_PERCENTAGE_OF RENTERS'] = np.log((df_retail.PERCENTAGE_OF_RENTERS))




### 2.2 Estimate Price Elasticity of Demand <a id="2.2"></a>

Now, let's build an ordinary least squares regression using the log-transformed variables.

Define your independent and dependent variables with the following code cell.

Note that we only include statistically significant variables.

In [None]:
independentx = df_retail[['LN_PRICE','LN_PERCENTAGE_OF RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME',
                          'LN_PERCENT_SPEAKING_SPANISH']]
independent = sm.add_constant(independentx, prepend=False)
dependent=df_retail['LN_QUANTITY']

The next few cells run the OLS regression.

In [None]:
mod = sm.OLS(dependent, independent)

In [None]:
res = mod.fit()

In [None]:
print(res.summary())

So, the price elasticity of demand is -.64. This comes from the regression summary above and is the estimated coefficient on LN_PRICE when LN_QUANTITY is regressed on it.

A 1% increase in price will lower the quantity sold by about .64%.

The other coefficients can be interpreted similarly. 

A 1% increase in people's average income around a store will increase the quantity sold by .55%.
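If it seems like magic that a log-log coefficient is an elasticity, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the data are generated from a curve with a known elasticity of -0.64 (chosen only to echo the estimate above), the scale of 1000 and the noise level are arbitrary, and I use NumPy's least squares instead of statsmodels to keep it short.

```python
import numpy as np

rng = np.random.default_rng(0)

true_elasticity = -0.64                      # hypothetical, echoes the estimate above
price = rng.uniform(2.0, 10.0, size=500)     # made-up prices
noise = rng.normal(0.0, 0.05, size=500)      # made-up multiplicative noise (in logs)
quantity = 1000.0 * price ** true_elasticity * np.exp(noise)

# OLS of ln(quantity) on ln(price) plus an intercept
X = np.column_stack([np.log(price), np.ones_like(price)])
coefficients, *_ = np.linalg.lstsq(X, np.log(quantity), rcond=None)
estimated_elasticity = coefficients[0]

print(round(estimated_elasticity, 3))
```

Taking logs turns Q = A * P^e into ln(Q) = ln(A) + e * ln(P), so the slope on ln(P) is the elasticity e itself. The fitted slope should land close to the value used to generate the data.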

### 2.3 Creating a Demand Curve <a id="2.3"></a>

Now, let's make a demand curve. To do this, we will need to build a demand schedule using the predicted quantity at different prices based on our OLS regression model.

First, let's summarize the variables other than price that are significant in the model. We will take the average and then evaluate the relationship between price and quantity when the other variables are at their mean.

In [None]:
df_retail['chachacha']=1

doodad=['PERCENTAGE_OF_RENTERS', 'PERCENT_HAVING_CHILDREN','AVERAGE_INCOME','PERCENT_SPEAKING_SPANISH']
wookie = df_retail.groupby(['chachacha'])[doodad].mean()

wookie.reset_index(level=0, inplace=True)
wookie.head()





Next, we will create an array of prices and then convert the list to a pandas data frame.

In [None]:
#create an array of prices
price = [1.50,1.75,2.0,2.25,2.50,2.75,3.0,3.25,3.50,3.75,4.0,4.25,4.50,4.75,5.0,5.25,5.50,5.75,6.0,6.25,6.50,6.75,
         7.0,7.25,7.5,7.75,8.0,8.25,8.50,8.75,9.0,9.25,9.50,9.75,10.0,10.25,10.50,10.75,11.0,11.25,11.50,11.75,12.0]
df_price=pd.DataFrame(price)
df_price.columns = ['PRICE']
df_price['chachacha']=1


Then, merge the average value for the non-price variables to the prices we constructed above and calculate the log.

In [None]:
#join the array of prices to the average values of the other independent variables.
df_price =df_price.merge(wookie, on=['chachacha'], how='inner')

#Create Log Transformed Variables
df_price['LN_PRICE'] = np.log((df_price.PRICE))
df_price['LN_INCOME'] = np.log((df_price.AVERAGE_INCOME))
df_price['LN_PERCENT_SPEAKING_SPANISH'] = np.log((df_price.PERCENT_SPEAKING_SPANISH))
df_price['LN_PERCENT_HAVING_CHILDREN'] = np.log((df_price.PERCENT_HAVING_CHILDREN))
df_price['LN_PERCENTAGE_OF_RENTERS'] = np.log((df_price.PERCENTAGE_OF_RENTERS))
df_price['const']=1

We can now use the input variables we manufactured in the last few lines of code and our model to predict quantity at each price point. The result will be a demand schedule.

In [None]:
#create our scoring data set.
scoring= df_price[['LN_PRICE','LN_PERCENTAGE_OF_RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME',
                          'LN_PERCENT_SPEAKING_SPANISH','const']]
#score the scoring data set
ln_q_hat=pd.DataFrame(res.predict(scoring))
#name the columns correctly
ln_q_hat.columns = ['LN_Q_HAT']
#combine ln of price and ln of predicted q into a new data frame
df_ce_demand = pd.concat([scoring['LN_PRICE'], ln_q_hat], axis=1)


#exponentiate the ln variables to get predicted quantity and price
df_ce_demand['Q_HAT']=np.exp(df_ce_demand['LN_Q_HAT'])
df_ce_demand['PRICE']=np.exp(df_ce_demand['LN_PRICE'])

#eliminate the ln variables and make the demand schedule.
df_ce_demand = df_ce_demand[['Q_HAT','PRICE']]

df_ce_demand.head()

Bingo! Now we have a demand schedule and we can plot it as a demand curve.



In [None]:

trace = go.Scatter(
    x = df_ce_demand['PRICE'],
    y = df_ce_demand['Q_HAT'],
    mode = 'lines'
)



layout = go.Layout(
    title='Demand Curve',
    xaxis=dict(
        # Nested title.text / title.font replaces the deprecated `titlefont` attribute.
        title=dict(
            text='Price',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    ),
    yaxis=dict(
        title=dict(
            text='Quantity',
            font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
        )
    )
)
    
data=[trace]  
fig = go.Figure(data=data, layout=layout)

#plot_url = py.plot(fig, filename='styling-names')
plotly.offline.iplot(fig, filename='shapes-lines')


Wait a minute! That's not linear! Nope, it isn't. This is a variation of the demand curves we developed in part 1 of our exercise. It is what we call a constant elasticity demand curve. The elasticity is the same at all points.

Remember, in our earlier discussion, that elasticity was different at each price point of the demand curve. With this demand curve, elasticity is the same at all the price points. It still shows the relationship between price and quantity.

For example, for 6 dollars, our firm can sell about 75,000 units at each store.
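To see what "constant elasticity" means in practice, this sketch measures elasticity between two nearby prices at the low end and at the high end of a curve of the form Q = A * P^e. The scale of 1000 and the elasticity of -0.64 are assumptions for illustration, not the fitted model from the cells above.

```python
import math


def quantity(price, scale=1000.0, elasticity=-0.64):
    """Constant elasticity demand curve: Q = scale * P ** elasticity."""
    return scale * price ** elasticity


def log_elasticity(p1, p2):
    """Change in ln(Q) divided by change in ln(P) between two prices."""
    change_in_ln_q = math.log(quantity(p2)) - math.log(quantity(p1))
    change_in_ln_p = math.log(p2) - math.log(p1)
    return change_in_ln_q / change_in_ln_p


low_end = log_elasticity(2.0, 2.25)
high_end = log_elasticity(10.0, 10.25)

print(round(low_end, 4), round(high_end, 4))
```

Both measurements come back as the same number, whereas on the linear demand curve from section 1 the elasticity changed with every price.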

If you wanted a linear demand curve, you could get there. You would regress quantity on price (instead of ln of quantity on ln of price). Of course, if you built your model this way, the coefficient wouldn't be an elasticity.

### 2.4 Estimate Price Elasticity of Revenue <a id="2.4"></a>

Estimating the price elasticity of revenue follows a similar process. The difference is we will use the natural log of revenue as a dependent variable instead of the natural log of quantity.

In [None]:
independentx = df_retail[['LN_PRICE','LN_PERCENTAGE_OF RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME',
                          'LN_PERCENT_SPEAKING_SPANISH']]
independent = sm.add_constant(independentx, prepend=False)
dependent=df_retail['LN_REVENUE']

In [None]:
mod = sm.OLS(dependent, independent)

In [None]:
res = mod.fit()

In [None]:
print(res.summary())

The regression results suggest that the price elasticity of revenue is .358. This means that a 1% increase in price will lead to a .358% increase in revenue. In other words, this firm would make more revenue if it increased prices.

There are a few important caveats that I should probably mention.

One, this is a point estimate. That is a fancy way of saying that you shouldn't get too crazy. If you increase prices by 1%, you probably will increase revenue by .35%. However, if you raise prices by 100%, you probably would not realize a 35% increase in revenue. Baked into the model is an established historical relationship between your customers and your prices. If you do something very different than the historical norm, don't expect the model to be predictive.
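Here is a small sketch of why the "1% gives .35%" reading does not scale up. It takes the constant elasticity form at face value, which is itself an assumption that gets shakier the further you move from historical prices. The function name is illustrative.

```python
revenue_elasticity = 0.358  # estimate from the regression above


def revenue_change(price_change, elasticity=revenue_elasticity):
    """Revenue change implied by a constant elasticity curve, R proportional to P ** elasticity."""
    return (1 + price_change) ** elasticity - 1


small = revenue_change(0.01)   # 1% price increase
large = revenue_change(1.00)   # 100% price increase

print(f"{small:.3%}", f"{large:.1%}")
```

For a 1% change, the result is essentially the elasticity itself. For a 100% change, even this idealized curve gives noticeably less than 35.8%, and in the real world the curve probably would not hold that far outside the data anyway.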

Two, it is essential to understand that this elasticity is an average across all stores. The price elasticity of revenue is .358, on average. There are 291 stores in the data set. Some probably have an elasticity greater than .35. Others probably have an elasticity that is less than .35. In other words, if you increase prices by 3% across the board, you will realize roughly a 1.05% increase in revenue on average. Some stores will realize more than 1.05%, and others will realize less. A 3% increase in prices may even cause some stores to lose revenue.

What if you could tailor the price increase for each store?

That is, increase prices by an average of 3%, but give some stores a higher bump in prices than others. You could even decrease prices in some stores if it makes sense. Tailoring each store's price increase based on their specific market will lead to an even more significant revenue increase. For example, you can raise prices by 3% on average and get an increase in revenue greater than 1.05%.
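A toy illustration of the idea, with numbers I made up: two equally sized groups of stores whose revenue elasticities (0.6 and 0.1) average out to .35. One plan raises prices 3% everywhere. The other raises them 5% in the less price-sensitive group and 1% in the other, which is still 3% on average.

```python
# Hypothetical store groups: (baseline revenue, price elasticity of revenue)
groups = {
    "less_price_sensitive": (100.0, 0.6),
    "more_price_sensitive": (100.0, 0.1),
}


def total_revenue_gain(price_changes):
    """Overall fractional revenue gain for a dict of per-group price changes."""
    baseline = sum(rev for rev, _ in groups.values())
    new_total = sum(
        rev * (1 + price_changes[name]) ** elasticity
        for name, (rev, elasticity) in groups.items()
    )
    return new_total / baseline - 1


uniform_gain = total_revenue_gain(
    {"less_price_sensitive": 0.03, "more_price_sensitive": 0.03}
)
tailored_gain = total_revenue_gain(
    {"less_price_sensitive": 0.05, "more_price_sensitive": 0.01}
)

print(f"{uniform_gain:.2%}", f"{tailored_gain:.2%}")
```

Under these assumptions, the tailored plan produces a larger revenue gain from the same average price increase. The hard part in practice is finding the groups and their elasticities, not the arithmetic.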

There are many ways to accomplish this goal. In the third part of this exercise, we will examine a relatively simple and straightforward way to make a market-based pricing decision for each of our 291 stores.


## 3.0 Identify Different Price Elasticities of Demand and Price Discriminate <a id="3.0"></a>

Previously, we learned that our organization's price elasticity of revenue is .35. A 1% increase in price will generate a .35% increase in revenue. If we increase prices across all stores by 3%, we will increase revenue by 1.05%.

Can we do better than this? Yes, by price discriminating. To price discriminate, we must identify stores that have a higher or lower price elasticity of revenue. The higher the price elasticity of revenue, the greater our ability to generate revenue by raising prices. The lower it is (in particular, when it is below zero), the greater our ability to increase revenue by lowering prices.

So how do you know which stores can sustain a higher price increase?






We segment our stores by their price sensitivity using the following steps (I will explain why each step is necessary below).
1. Build a demand curve that includes all stores.
2. Use the residual from the demand curve equation and each store's price to segment the stores into groups.
3. Estimate the price elasticity of revenue for each group.
4. Set prices based on the price elasticity of revenue.

### 3.1 Read, Explore and Transform Data <a id="3.1"></a>

In [None]:
df_retail = pd_datax
df_retail.head()

Again, here is a definition of each field.

STORE_ID is a unique id specific to each retail outlet.

PERCENTAGE_OF_RENTERS is the percentage of households surrounding the store that rent their housing.

PERCENT_HAVING_CHILDREN is the percentage of households surrounding the store that have children.

AVERAGE_INCOME is the average annual income of the households surrounding the store.

AVERAGE_AGE_IN_YEARS is the average age of the head of household in the vicinity of the retail outlet.

AVERAGE_LENGTH_OF_RESIDENCE is an average of the time individuals surrounding the retail outlet have lived at their current address.

PERCENT_SPEAKING_SPANISH is the percentage of households surrounding the store that speak Spanish.

PRICE is the average price across multiple items sold at the retail outlet.

QUANTITY is the number of items sold by the retail outlet in the last year.

REVENUE is the total revenue for the store in the last year.

Now, create new fields by taking the natural log of each variable.

In [None]:
df_retail['LN_PRICE'] = np.log((df_retail.PRICE))
df_retail['LN_REVENUE'] = np.log((df_retail.REVENUE))
df_retail['LN_QUANTITY'] = np.log((df_retail.QUANTITY))
df_retail['LN_INCOME'] = np.log((df_retail.AVERAGE_INCOME))
df_retail['LN_AVERAGE_AGE_IN_YEARS'] = np.log((df_retail.AVERAGE_AGE_IN_YEARS))
df_retail['LN_AVERAGE_LENGTH_OF_RESIDENCE'] = np.log((df_retail.AVERAGE_LENGTH_OF_RESIDENCE))
df_retail['LN_PERCENT_SPEAKING_SPANISH'] = np.log((df_retail.PERCENT_SPEAKING_SPANISH))
df_retail['LN_PERCENT_HAVING_CHILDREN'] = np.log((df_retail.PERCENT_HAVING_CHILDREN))
df_retail['LN_PERCENTAGE_OF RENTERS'] = np.log((df_retail.PERCENTAGE_OF_RENTERS))

df_retail.head()


### 3.2 Build a Demand Curve for All Stores <a id="3.2"></a>

In [None]:
df_retaily=df_retail

Define the regressors and the dependent variable.

In [None]:
independentx = df_retaily[['LN_PERCENTAGE_OF RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME',
                          'LN_PERCENT_SPEAKING_SPANISH','LN_PRICE']]
independent = sm.add_constant(independentx, prepend=False)
dependent=df_retaily['LN_QUANTITY']

Build the model.

In [None]:
mod = sm.OLS(dependent, independent)

Examine the results.

In [None]:
results = mod.fit()
print(results.summary())

As before, the price elasticity of demand for all stores is -.64. A 1% increase in price will lead to a .64% decrease in quantity.



### 3.3 Collect and Exploit the Residuals <a id="3.3"></a>

Next, we will use the residual from the demand equation to segment our stores into groups.  The first step is to save the residuals as a variable in our dataframe.

In [None]:
df_q_pred = pd.DataFrame(results.predict(independent))

df_q_pred.columns = ['P_LN_QUANTITY']
df_q_pred['P_QUANTITY']=np.exp(df_q_pred['P_LN_QUANTITY'])
df_retailx = pd.concat([df_retaily, df_q_pred], axis=1)
df_retailx['R_QUANTITY']=df_retailx['QUANTITY']-df_retailx['P_QUANTITY']

df_forview=df_retailx[['STORE_ID','QUANTITY','P_QUANTITY','R_QUANTITY']]


df_forview.head(2)

Now, let's take a minute and discuss what the residual from this linear equation means. The residual, as you recall, is the difference between the actual quantity and the predicted quantity.

Let's look at the two rows above. In the first row, the actual quantity is about 95,000, and the amount predicted is about 75,000 for a difference of about 20,000.
Given the price and market factors of STORE_ID 177473, we would expect the quantity sold to be about 75,000. In reality, this store is doing much better than that. It is selling almost 20,000 more units than we would expect. In other words, it is over-performing.

In the second row, the actual quantity sold is about 95,000, and the expected quantity sold (given the price and market factors) is about 95,000. In other words, STORE_ID 177467 is an average performer.

So, what do the previous statements tell us about the current prices at each store? It is hard to say at this point. It could be that the first store is over-performing because it is priced optimally. Or, it could be telling us that this store is on a completely different demand curve, and there is an opportunity to raise prices.
One thing is for sure. These stores are different.
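
To make the residual logic concrete, here is a small sketch that uses only the round figures quoted above for the two stores, not actual model output. The 10% threshold and the `PERFORMANCE` label are assumptions added for illustration.

```python
import pandas as pd

# Round figures quoted in the text for the two stores; not real model output.
stores = pd.DataFrame({
    "STORE_ID": [177473, 177467],
    "QUANTITY": [95_000, 95_000],
    "P_QUANTITY": [75_000, 95_000],
})

# Residual = actual minus predicted, as in the cell above.
stores["R_QUANTITY"] = stores["QUANTITY"] - stores["P_QUANTITY"]

# Hypothetical rule of thumb: flag residuals larger than 10% of the prediction.
share = stores["R_QUANTITY"] / stores["P_QUANTITY"]
stores["PERFORMANCE"] = "average"
stores.loc[share > 0.10, "PERFORMANCE"] = "over"
stores.loc[share < -0.10, "PERFORMANCE"] = "under"
print(stores)
```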

Next, let's look at how the stores are priced relative to each other. That is, which stores are priced higher than average and which are priced lower.

In the next step, we will calculate PRICE_DELTA, the difference between the price of each store and the average price of all stores.


In [None]:
df=df_retailx.copy()
df['AVERAGE_PRICE']=df['PRICE'].mean()


df['PRICE_DELTA']=df['PRICE']-df['AVERAGE_PRICE']


df_forview=df[['STORE_ID','QUANTITY','P_QUANTITY','R_QUANTITY', 'PRICE', 'PRICE_DELTA']]


df_forview.head(2)



We can now see that the store in the first row is over-performing and is priced 2.4 cents higher than average. The store in the second row is priced 1.4 cents higher than average.

We still don’t have enough information to know whether these stores deserve a price increase or decrease. Let’s see if we can use these new metrics to cluster our stores into groups that likely face a similar demand curve.

### 3.4 Cluster Stores <a id="3.4"></a>

Import K-means libraries.

In [None]:

from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

Define the features for the K-Means procedure.

In [None]:
features = df[['PRICE_DELTA','R_QUANTITY']]

Standardize the features.

In [None]:
scaler = StandardScaler()
scaled_features = scaler.fit_transform(features)

Create the clusters.  Note that I came up with 5 clusters after a little trial and error.

In [None]:
kmeans = KMeans(
init="random",
n_clusters=5,
n_init=10,
max_iter=300,
random_state=42)

kmeans.fit(scaled_features)

Examine statistics of clusters and kmeans procedure.

In [None]:
kmeans.inertia_

In [None]:
kmeans.cluster_centers_

In [None]:
kmeans.n_iter_
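
The trial and error behind the choice of 5 clusters can be made a little more systematic by comparing inertia and silhouette scores across several values of k. The sketch below is one possible way to do that. It runs on synthetic stand-in data (the `centers` and noise levels are made up), so it illustrates the procedure rather than reproducing the notebook's result.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the (PRICE_DELTA, R_QUANTITY) features; not the notebook's data.
rng = np.random.default_rng(42)
centers = np.array([[-0.3, -8000], [-0.1, 15000], [0.1, 9000], [0.1, -15000], [0.3, -6000]])
demo_features = np.vstack(
    [c + rng.normal(scale=[0.03, 1500], size=(60, 2)) for c in centers]
)
demo_scaled = StandardScaler().fit_transform(demo_features)

# Fit K-Means for a range of k and record both diagnostics.
scores = {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(demo_scaled)
    scores[k] = (model.inertia_, silhouette_score(demo_scaled, model.labels_))

for k, (inertia, silhouette) in scores.items():
    print(f"k={k}  inertia={inertia:8.1f}  silhouette={silhouette:.3f}")
```

A common heuristic is to look for the k where inertia stops falling sharply (the "elbow") and where the silhouette score is high. Business interpretability of the resulting groups still matters as much as either statistic.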

Append new clusters to the original dataframe.

In [None]:
zz=kmeans.labels_
df['CLUSTER'] = pd.Series(zz, index=df.index)

Find the average price delta and quantity residual for each segment.

In [None]:
df_output = df.groupby(["CLUSTER"])[["PRICE_DELTA", "R_QUANTITY"]].mean()
df_output= df_output.sort_values(["PRICE_DELTA"])
df_output

This is very insightful information.

- Cluster 3 represents stores that are priced very low and are under-performing.
- Cluster 4 represents stores that are priced a little low and are way over-performing.
- Cluster 0 represents stores that are priced a little high and over-performing.
- Cluster 1 represents stores that are priced a little high and are severely under-performing.
- Cluster 2 represents stores that are priced very high and under-performing.

Let's examine the clusters graphically to get even more clarity.

In [None]:
df.CLUSTER = df.CLUSTER.astype(str)

In [None]:
import plotly.express as px

fig = px.scatter(df, y="PRICE_DELTA", x="R_QUANTITY", color="CLUSTER", size_max=45)

fig.update_layout(legend=dict(
    orientation="h",
    yanchor="bottom",
    y=1.02,
    xanchor="right",
    x=1
))

fig.show()

### 3.5 Estimate Revenue Elasticity for Store Clusters <a id="3.5"></a>

We will now incorporate these segments into the revenue elasticity regression model to estimate the revenue elasticity of demand for each segment.

To accomplish this, we will need to create dummy variables that represent each cluster.

In [None]:
df['CLUSTER'] = pd.Series(zz, index=df.index)
xx=pd.get_dummies(df['CLUSTER'],prefix='seg')
df = pd.concat([df, xx], axis=1)
df.head()

Note that we now have the segments expressed as dummy variables at the end of the dataframe.

Next, create interaction terms by multiplying LN_PRICE by each segment dummy.  Segment 0 is the base value, so it gets no interaction term.

In [None]:
df['LN_P_SEG_1']=(df['LN_PRICE']*df['seg_1'])
df['LN_P_SEG_2']=(df['LN_PRICE']*df['seg_2'])
df['LN_P_SEG_3']=(df['LN_PRICE']*df['seg_3'])
df['LN_P_SEG_4']=(df['LN_PRICE']*df['seg_4'])
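
As a quick illustration of what these columns contain, here is a sketch on a four-row toy frame. The prices and cluster labels are made up, and passing `dtype=float` to `pd.get_dummies` is a choice made here so the dummies stay numeric regardless of pandas version.

```python
import numpy as np
import pandas as pd

# Toy rows; the prices and cluster labels are made up for illustration.
toy = pd.DataFrame({"PRICE": [5.0, 6.0, 7.0, 8.0], "CLUSTER": [0, 1, 2, 1]})
toy["LN_PRICE"] = np.log(toy["PRICE"])

# One dummy column per segment (seg_0, seg_1, seg_2).
dummies = pd.get_dummies(toy["CLUSTER"], prefix="seg", dtype=float)
toy = pd.concat([toy, dummies], axis=1)

# Interaction term: LN_PRICE for rows in the segment, 0 elsewhere.
# Segment 0 gets no interaction column, so it serves as the baseline.
for seg in (1, 2):
    toy[f"LN_P_SEG_{seg}"] = toy["LN_PRICE"] * toy[f"seg_{seg}"]
print(toy)
```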


Include the interaction terms in the revenue model.

In [None]:
independentx = df[['LN_PERCENTAGE_OF RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME',
                          'LN_PERCENT_SPEAKING_SPANISH','LN_PRICE','LN_P_SEG_1','LN_P_SEG_2','LN_P_SEG_3','LN_P_SEG_4']]
independent = sm.add_constant(independentx, prepend=False)
dependent=df['LN_REVENUE']

mod = sm.OLS(dependent, independent)

results = mod.fit()

print(results.summary())

Export the Coefficients.

In [None]:
coeff = results.params

results_df = pd.DataFrame({#"pvals":pvals,
                               "coeff":coeff,
                               #"conf_lower":conf_lower,
                               #"conf_higher":conf_higher
                                })

#Reordering...
results_df = results_df[["coeff"]]

results_df.reset_index(level=0, inplace=True)

xx=results_df.loc[(results_df['index']=='LN_PRICE') | (results_df['index']=='LN_P_SEG_1')| (results_df['index']=='LN_P_SEG_2')| (results_df['index']=='LN_P_SEG_3')
                 | (results_df['index']=='LN_P_SEG_4')]
xx=xx.copy()
xx['index'] = xx['index'].astype('string')
xx = xx.rename(columns={'index': 'THINGY'})

xx['CLUSTER']=np.where((xx.THINGY==('LN_P_SEG_1')),1,
                      np.where((xx.THINGY==('LN_PRICE')),0,
                              np.where((xx.THINGY==('LN_P_SEG_2')),2,
                                      np.where((xx.THINGY==('LN_P_SEG_3')),3,
                                              np.where((xx.THINGY==('LN_P_SEG_4')),4,99)))))

xx


Transform the coefficient so that it represents the elasticity for each segment.

In [None]:
xx['wookie']=1
zz=xx[xx['THINGY']=='LN_PRICE']

zz = zz.rename(columns={'coeff': 'BASE_P'})
zz=zz[['wookie','BASE_P']]

xx =xx.merge(zz, on=['wookie'], how='inner')

xx['BASE_P']=np.where((xx.THINGY==('LN_PRICE')),0,xx['BASE_P'])

xx['REV_ELASTICITY']=xx['coeff']+xx['BASE_P']
qq=xx[['CLUSTER','REV_ELASTICITY']]
qq
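
The transformation above boils down to adding each interaction coefficient to the base LN_PRICE coefficient. A more compact sketch of the same idea follows. The coefficient values are hypothetical placeholders standing in for `results.params`, and the `demo` frame is only for illustration.

```python
import pandas as pd

# Hypothetical coefficients standing in for results.params; not the notebook's output.
params = pd.Series({
    "LN_PRICE": 0.50,
    "LN_P_SEG_1": -0.10,
    "LN_P_SEG_2": -0.05,
    "LN_INCOME": 0.20,
})

base = params["LN_PRICE"]
interaction = params.filter(like="LN_P_SEG_")

# The baseline segment uses the LN_PRICE coefficient alone;
# every other segment adds its own interaction coefficient.
elasticity = {0: base}
for name, value in interaction.items():
    elasticity[int(name.rsplit("_", 1)[1])] = base + value

demo = pd.Series(elasticity, name="REV_ELASTICITY").rename_axis("CLUSTER").reset_index()
print(demo)
```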

Merge revenue elasticity with the other segment metrics.

In [None]:
df_output.reset_index(level=0, inplace=True)
df_output=df_output[['CLUSTER','PRICE_DELTA','R_QUANTITY']]
df_output =df_output.merge(qq, on=['CLUSTER'], how='inner')
df_output=df_output[['CLUSTER','PRICE_DELTA','R_QUANTITY','REV_ELASTICITY']]
df_output= df_output.sort_values(["REV_ELASTICITY"])
df_output


Now, let's take another look at our Clusters.

Cluster 4 includes stores with the highest revenue elasticity. This makes sense, given that it is priced less than average and is performing well.
Cluster 0 has the second-highest revenue elasticity. It is priced higher than average and also appears to face a strong demand.

The other three clusters also have a consequential revenue elasticity.

There should be one thing that jumps right out as you review these elasticity numbers. Can you guess?

Remember the overall price elasticity of revenue for all stores? It was .36. All of the segments above have a price elasticity of revenue greater than .36. Some should be lower, and some should be higher, right?

Yeah, they should. While the five individual segments can have a higher elasticity than the five combined, it isn't likely.
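
As a reminder of where the overall .36 comes from: assuming the log-log specification used for the all-store demand curve, the revenue elasticity is one plus the price elasticity of demand, which is consistent with the -.64 estimated earlier.

$$
R = P \cdot Q
\;\Rightarrow\;
\ln R = \ln P + \ln Q
\;\Rightarrow\;
\frac{d \ln R}{d \ln P} = 1 + \frac{d \ln Q}{d \ln P} = 1 + (-0.64) = 0.36
$$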

This anomaly allows me to make an essential point about this data and all data observed outside of a laboratory. A friend of mine once said that using a sophisticated model on data from a corporate data warehouse is like using a laser beam to slice bologna. Nothing could be more accurate. We live in an imperfect world filled with random noise and random error. You have to remember this as you "do" your data science. Most of the time, your models' output is directional and subject to the imperfect world that we inhabit. That doesn't mean that the work we do as data scientists is not useful; it is. Just remember that the goal is to guide the business towards better revenue, lower costs, and higher profit. Perfection is a goal, not a destination. We are subject to an imperfect world, but data science can make it better.

Given this, I sometimes wonder about the emphasis on prediction accuracy that I've seen in the last few years. I mean, with the Kaggle competitions and the highly complex algorithms, are we really improving things? Does improving the accuracy from .1% to .01% matter? I mean, at the end of the day, aren't we still just left with a sliced piece of bologna? Not always, but in many situations, I think we are.

So, now what do we do? That isn't easy to say. Moving from your Jupyter notebook to the real-world is never a straightforward task, especially given the limitations of our imperfect world. Having said that, given all the information presented so far, we can take action.

Here are a few things to keep in mind. One, history is essential. If the historical rate increase for a market is 1%, I wouldn't recommend implementing a 20% increase in prices. I would want to keep it consistent with what consumers are used to seeing. Two, listen to others in the organization, especially those in customer-facing roles. When it comes to anecdote versus data, I am 100% in the data camp. Having said that, I don't think you can ignore those who are intimately connected to the business. For example, if a store manager is convinced his store cannot handle a price increase and your model says it is prime for one, I wouldn't go as big as I would if the store manager were indifferent or advocating strongly for a price increase.

Here is an example of how you can use our analysis to set price changes. We will assume that this firm increased prices for the last few years by the inflation rate across the board.

For segment 4, price is below average, quantity is way above average, and the revenue elasticity is 0.72. I would increase prices by inflation + 4%. This should yield a 2.88% real increase in revenue for these stores.

For segment 0, price is above average, quantity is above average, and the revenue elasticity is .59. I would increase prices by inflation + 2%. This should yield about a 1.2% real increase in revenue.

For segment 3, price is way below average, quantity is below average, and the revenue elasticity is .48. I would increase prices by inflation + 1%. This should yield a .48% increase in real revenue.

For segment 2, price is way above average, quantity is below average, and the revenue elasticity is .45. I would increase prices by inflation + .5%. This should yield a .225% increase in real revenue.

For segment 1, price is above average, quantity is below average, and the revenue elasticity is .39. I would increase prices by inflation only. This should yield a 0% increase in real revenue.

Again, there is no perfect answer here, but the recommendations above are solid and consistent with our analysis.
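
The arithmetic behind these recommendations is simply the revenue elasticity multiplied by the real (above-inflation) price increase. A short sketch using the figures quoted above is shown here. The `plan` frame and its column names are illustrative, and the result is a first-order approximation that is most reliable for small price changes.

```python
import pandas as pd

# Revenue elasticities and real (above-inflation) price increases quoted in the text.
plan = pd.DataFrame({
    "CLUSTER": [4, 0, 3, 2, 1],
    "REV_ELASTICITY": [0.72, 0.59, 0.48, 0.45, 0.39],
    "REAL_PRICE_INCREASE_PCT": [4.0, 2.0, 1.0, 0.5, 0.0],
})

# First-order approximation: % change in revenue = elasticity * % change in price.
plan["REAL_REVENUE_CHANGE_PCT"] = (
    plan["REV_ELASTICITY"] * plan["REAL_PRICE_INCREASE_PCT"]
)
print(plan)
```

For segment 0 the product is 1.18%, which the text rounds to 1.2%.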


### 3.6 Build Demand Curve for Each Store Cluster <a id="3.6"></a>

For the heck of it, let's look at the demand curves for each of the segments.

In [None]:
independentx = df[['LN_PERCENTAGE_OF RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME',
                          'LN_PERCENT_SPEAKING_SPANISH','LN_PRICE','LN_P_SEG_1','LN_P_SEG_2','LN_P_SEG_3','LN_P_SEG_4']]
independent = sm.add_constant(independentx, prepend=False)
dependent=df['LN_QUANTITY']

mod = sm.OLS(dependent, independent)

results = mod.fit()

print(results.summary())

In [None]:
#Export coefficients
coeff = results.params

results_df = pd.DataFrame({#"pvals":pvals,
                               "coeff":coeff,
                               #"conf_lower":conf_lower,
                               #"conf_higher":conf_higher
                                })

#Reordering...
results_df = results_df[["coeff"]]
results_df.reset_index(level=0, inplace=True)
#format and rename columns
xx=results_df.copy()
xx['index'] = xx['index'].astype('string')

xx = xx.rename(columns={'index': 'THINGY'})

#transpose the results
results_flipped = xx.transpose()
df2=results_flipped
header=df2.iloc[0]
df2=df2[1:]
df2.columns=header
results_flipped=df2

#create stand alone pricing coefficients.
results_flipped['LN_P_SEG_1X']=results_flipped['LN_PRICE']+results_flipped['LN_P_SEG_1']
results_flipped['LN_P_SEG_2X']=results_flipped['LN_PRICE']+results_flipped['LN_P_SEG_2']
results_flipped['LN_P_SEG_3X']=results_flipped['LN_PRICE']+results_flipped['LN_P_SEG_3']
results_flipped['LN_P_SEG_4X']=results_flipped['LN_PRICE']+results_flipped['LN_P_SEG_4']
results_flipped['LN_P_SEG_0X']=results_flipped['LN_PRICE']
#pare down the dataframe
results_flipped=results_flipped[['LN_PERCENTAGE_OF RENTERS','LN_PERCENT_HAVING_CHILDREN','LN_INCOME','LN_PERCENT_SPEAKING_SPANISH',
                                 'LN_P_SEG_1X','LN_P_SEG_2X','LN_P_SEG_3X','LN_P_SEG_4X','LN_P_SEG_0X','const']].copy()

#create a flag to join 
results_flipped['chachacha']=1
#rename columns
results_flipped = results_flipped.rename(columns={'LN_PERCENTAGE_OF RENTERS': 'B_LN_PERCENTAGE_OF_RENTERS','LN_PERCENT_HAVING_CHILDREN':'B_LN_PERCENT_HAVING_CHILDREN',
                                                 'LN_INCOME':'B_LN_INCOME','LN_PERCENT_SPEAKING_SPANISH':'B_LN_PERCENT_SPEAKING_SPANISH',
                                                 'LN_P_SEG_1X':'B_LN_P_SEG_1', 'LN_P_SEG_2X':'B_LN_P_SEG_2','LN_P_SEG_3X':'B_LN_P_SEG_3',
                                                 'LN_P_SEG_4X':'B_LN_P_SEG_4','LN_P_SEG_0X':'B_LN_P_SEG_0','const':'const'})


#create a list to get averages for the non-price independent variables
dingle=['PERCENTAGE_OF_RENTERS', 'PERCENT_HAVING_CHILDREN','AVERAGE_INCOME','PERCENT_SPEAKING_SPANISH','QUANTITY']
#get averages of non-price independent variables
df['chachacha']=1
wookie = df.groupby(['chachacha'])[dingle].mean()

wookie.reset_index(level=0, inplace=True)

#merge the independent variables and coefficients into one dataframe

wookie =wookie.merge(results_flipped, on=['chachacha'], how='inner')
#create an array of prices to score
price = [4.00,4.10,4.20,4.30,4.40,4.50,4.60,4.70,4.80,4.90,5.00,5.10,5.20,5.30,5.40,5.50,5.60,5.70,5.80,5.90,
        6.00,6.10,6.20,6.30,6.40,6.50,6.60,6.70,6.80,6.90,7.00,7.10,7.20,7.30,7.40,7.50,7.60,7.70,7.80,7.90,
        8.00,8.10,8.20,8.30,8.40,8.50,8.60,8.70,8.80,8.90,9.00,9.10,9.20,9.30,9.40,9.50,9.60,9.70,9.80,9.90]
df_price=pd.DataFrame(price)
df_price.columns = ['PRICE']
df_price['chachacha']=1

#merge prices, coefficients and non-price independent variables
df_price =df_price.merge(wookie, on=['chachacha'], how='inner')

#Create Log Transformed Variables
df_price['LN_PRICE'] = np.log((df_price.PRICE))
df_price['LN_INCOME'] = np.log((df_price.AVERAGE_INCOME))
df_price['LN_PERCENT_SPEAKING_SPANISH'] = np.log((df_price.PERCENT_SPEAKING_SPANISH))
df_price['LN_PERCENT_HAVING_CHILDREN'] = np.log((df_price.PERCENT_HAVING_CHILDREN))
df_price['LN_PERCENTAGE_OF_RENTERS'] = np.log((df_price.PERCENTAGE_OF_RENTERS))

#score the quantity variables

df_price['P_LN_Q_SEG_0']=(df_price['LN_PRICE']*df_price['B_LN_P_SEG_0']+\
                          df_price['B_LN_PERCENTAGE_OF_RENTERS']*df_price['LN_PERCENTAGE_OF_RENTERS']+\
df_price['B_LN_PERCENT_HAVING_CHILDREN']*df_price['LN_PERCENT_HAVING_CHILDREN']+df_price['B_LN_INCOME']*df_price['LN_INCOME']+\
df_price['B_LN_PERCENT_SPEAKING_SPANISH']*df_price['LN_PERCENT_SPEAKING_SPANISH']+df_price['const']).astype(float)

df_price['P_LN_Q_SEG_1']=(df_price['LN_PRICE']*df_price['B_LN_P_SEG_1']+df_price['B_LN_PERCENTAGE_OF_RENTERS']*df_price['LN_PERCENTAGE_OF_RENTERS']+\
df_price['B_LN_PERCENT_HAVING_CHILDREN']*df_price['LN_PERCENT_HAVING_CHILDREN']+df_price['B_LN_INCOME']*df_price['LN_INCOME']+\
df_price['B_LN_PERCENT_SPEAKING_SPANISH']*df_price['LN_PERCENT_SPEAKING_SPANISH']+df_price['const']).astype(float)

df_price['P_LN_Q_SEG_2']=(df_price['LN_PRICE']*df_price['B_LN_P_SEG_2']+df_price['B_LN_PERCENTAGE_OF_RENTERS']*df_price['LN_PERCENTAGE_OF_RENTERS']+\
df_price['B_LN_PERCENT_HAVING_CHILDREN']*df_price['LN_PERCENT_HAVING_CHILDREN']+df_price['B_LN_INCOME']*df_price['LN_INCOME']+\
df_price['B_LN_PERCENT_SPEAKING_SPANISH']*df_price['LN_PERCENT_SPEAKING_SPANISH']+df_price['const']).astype(float)

df_price['P_LN_Q_SEG_3']=(df_price['LN_PRICE']*df_price['B_LN_P_SEG_3']+df_price['B_LN_PERCENTAGE_OF_RENTERS']*df_price['LN_PERCENTAGE_OF_RENTERS']+\
df_price['B_LN_PERCENT_HAVING_CHILDREN']*df_price['LN_PERCENT_HAVING_CHILDREN']+df_price['B_LN_INCOME']*df_price['LN_INCOME']+\
df_price['B_LN_PERCENT_SPEAKING_SPANISH']*df_price['LN_PERCENT_SPEAKING_SPANISH']+df_price['const']).astype(float)

df_price['P_LN_Q_SEG_4']=(df_price['LN_PRICE']*df_price['B_LN_P_SEG_4']+df_price['B_LN_PERCENTAGE_OF_RENTERS']*df_price['LN_PERCENTAGE_OF_RENTERS']+\
df_price['B_LN_PERCENT_HAVING_CHILDREN']*df_price['LN_PERCENT_HAVING_CHILDREN']+df_price['B_LN_INCOME']*df_price['LN_INCOME']+\
df_price['B_LN_PERCENT_SPEAKING_SPANISH']*df_price['LN_PERCENT_SPEAKING_SPANISH']+df_price['const']).astype(float)

df_price['P_Q_SEG_0']=np.exp(df_price['P_LN_Q_SEG_0'])
df_price['P_Q_SEG_1']=np.exp(df_price['P_LN_Q_SEG_1'])
df_price['P_Q_SEG_2']=np.exp(df_price['P_LN_Q_SEG_2'])
df_price['P_Q_SEG_3']=np.exp(df_price['P_LN_Q_SEG_3'])
df_price['P_Q_SEG_4']=np.exp(df_price['P_LN_Q_SEG_4'])

#plot in two dimensions

x1=df_price['PRICE']
y1=df_price['P_Q_SEG_0']
z1=df_price['P_Q_SEG_1']
a1=df_price['P_Q_SEG_2']
b1=df_price['P_Q_SEG_3']
c1=df_price['P_Q_SEG_4']

trace1 = go.Scatter(
    x=x1,
    y=y1,
    name='Segment 0'
)
trace2 = go.Scatter(
    x=x1,
    y=z1,
    name='Segment 1',
    yaxis='y1'
)
trace3 = go.Scatter(
    x=x1,
    y=a1,
    name='Segment 2',
    yaxis='y1'
)
trace4 = go.Scatter(
    x=x1,
    y=b1,
    name='Segment 3',
    yaxis='y1'
)
trace5 = go.Scatter(
    x=x1,
    y=c1,
    name='Segment 4',
    yaxis='y1'
)
data = [trace1, trace2,trace3,trace4,trace5]
layout = go.Layout(
    title='Demand Curves by Segment',
    yaxis=dict(
        title='Quantity'
    ),
    xaxis=dict(
        # title text and font are set together; titlefont is deprecated in newer Plotly releases
        title=dict(
            text='Price',
            font=dict(
                family='Courier New, monospace',
                size=18,
                color='#7f7f7f'
            )
        )
    )
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig, filename='shapes-lines')
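
The scoring step above repeats the same expression once per segment. As a sketch of a more compact alternative, the curves can be generated in a loop. Everything numeric here is hypothetical: `base_log_quantity` stands in for the constant plus the non-price terms evaluated at their averages, and `price_coeff` stands in for the per-segment price coefficients.

```python
import numpy as np
import pandas as pd

# Hypothetical values; in the notebook these would come from the fitted model.
base_log_quantity = 12.0
price_coeff = {0: -0.40, 1: -0.70, 2: -0.65, 3: -0.60, 4: -0.30}

# Same price grid as above: 4.00 through 9.90 in steps of 0.10.
prices = np.round(np.linspace(4.0, 9.9, 60), 2)
curves = pd.DataFrame({"PRICE": prices})

for seg, beta in price_coeff.items():
    # ln(Q) = non-price terms + beta * ln(P), then convert back to levels.
    curves[f"P_Q_SEG_{seg}"] = np.exp(base_log_quantity + beta * np.log(prices))

print(curves.head())
```

Each `P_Q_SEG_*` column can then be passed to a `go.Scatter` trace in the same way as the columns built above.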

I hope this was helpful.  Please reach out if you have any questions.

### Author

Shad Griffin is a Certified Thought Leader and a Data Scientist at IBM.

Copyright © 2021. This notebook and its source code are released under the terms of the MIT License.