Naive Bayes is a fast classification method that predicts a class label for new data from probabilities estimated on training samples.

The blockchain is an effective technology for managing supply chains, as it allows manufacturers, distributors, retailers, and customers to share a transaction network. With blockchain, when a new transaction occurs, a block is created in the network that is shared by all parties involved. This block contains information about the transaction, such as the purchase of a product X under a contract agreement.

In the real world of production and sales, distributors must anticipate demand. One way to achieve this is by using the Naive Bayesian classification method.
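As a sketch of the idea: with a single feature (here, the number of blocks $b$) and a demand class $d \in \{\text{high}, \text{low}\}$, Bayes' theorem gives the posterior that the classifier compares across classes. The symbols $b$ and $d$ are shorthand introduced here, not names from the dataset.

$$
p(d \mid b) = \frac{p(b \mid d)\,p(d)}{p(b)}, \qquad p(b) = \sum_{d'} p(b \mid d')\,p(d')
$$

The predicted demand is the class $d$ with the larger posterior. Because $p(b)$ is the same for both classes, comparing $p(b \mid d)\,p(d)$ is enough to pick the winner. With several features, the "naive" assumption would factor the likelihood into a product of per-feature terms.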

Let's say you're a distributor and you have access to the `product_demand.csv` dataset, which contains information on product demand and the number of blocks from the previous day. Keep in mind that a small number of blocks (transactions) may correspond to a high demand for the product.

In [4]:
import pandas as pd
df = pd.read_csv("product_demand.csv")
df.head()

| Unnamed: 0 | day | number of blocks | demand |
|---|---|---|---|
| 0 | 0 | some | low |
| 1 | 1 | few | low |
| 2 | 2 | some | high |
| 3 | 3 | some | high |
| 4 | 4 | many | high |


* **Number of Blocks** is the number of blocks (purchase transactions) recorded on the previous day, grouped into three categories: "few", "some", and "many".
* **Demand** indicates whether customer demand for the product is low or high.

###  
Use the dataset `product_demand.csv` to calculate the prior probabilities and print the results.

(1) $p(\text{demand}=\text{"high"})$

In [7]:
data = sum(df['demand']=='high')
length = len(df['demand'])
probability = data/length
print(f'The probability is {probability}')

The probability is 0.5


(2) $p(\text{demand}=\text{"low"})$

In [8]:
data_1 = sum(df['demand']=='low')
length_1 = len(df['demand'])
probability_1 = data_1/length_1
print(f'The probability is {probability_1}')

The probability is 0.5
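Both priors can also be read off in one step with `value_counts(normalize=True)`. The sketch below uses a small hypothetical DataFrame named `toy` as a stand-in, since the contents of `product_demand.csv` are not reproduced here; only the column names are taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for product_demand.csv (values invented for illustration).
toy = pd.DataFrame({
    "number of blocks": ["some", "few", "some", "some", "many", "few"],
    "demand": ["low", "low", "high", "high", "high", "low"],
})

# The relative frequency of each class is the estimated prior probability.
priors = toy["demand"].value_counts(normalize=True)
print(priors["high"], priors["low"])
```

Replacing `toy` with `df` should give the same two priors as the cells above.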


###

Calculate the conditional probabilities using Bayes' theorem and print the results.

(1) $p(\text{blocks}=\text{"few"}\mid\text{demand}=\text{"high"})$

In [15]:
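# p(few): share of days with "few" blocks
data_3 = sum(df['number of blocks'] == 'few')
length_3 = len(df['number of blocks'])
probability_2 = data_3/length_3
# p(high | few): share of "few" days that had high demand
data_4 = sum((df['number of blocks'] == 'few') & (df['demand'] == 'high'))
probability_3 = data_4/data_3
# Bayes' theorem: p(few | high) = p(high | few) * p(few) / p(high)
probability_4 = (probability_3*probability_2)/probability
print(f'The probability is {probability_4}')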

The probability is 0.13


(2) $p(\text{blocks}=\text{"some"}\mid\text{demand}=\text{"high"})$

In [13]:

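# p(some): share of days with "some" blocks
data_4 = sum(df['number of blocks'] == 'some')
probability_5 = data_4/length_3
# p(high | some): share of "some" days that had high demand
data_5 = sum((df['number of blocks'] == 'some') & (df['demand'] == 'high'))
probability_6 = data_5/data_4
# Bayes' theorem: p(some | high) = p(high | some) * p(some) / p(high)
probability_7 = (probability_6*probability_5)/probability
print(f'The probability is {probability_7}')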

The probability is 0.33999999999999997


(3) $p(\text{blocks}=\text{"some"}\mid\text{demand}=\text{"low"})$

In [34]:

# Bayes' theorem: p(some | low) = p(low | some) * p(some) / p(low)
p_8 = sum((df["number of blocks"] == "some") & (df["demand"] == "low"))/length_3
p_9 = sum(df["number of blocks"] == "some")/length_3
p = (p_8/p_9)*p_9/probability_1
print(f'The probability is {p}')

The probability is 0.35


(4) $p(\text{blocks}=\text{"many"}\mid\text{demand}=\text{"low"})$

In [28]:

# Bayes' theorem: p(many | low) = p(low | many) * p(many) / p(low)
probability_10 = sum((df["number of blocks"] == "many") & (df["demand"] == "low"))/length_3
probability_11 = sum(df["number of blocks"] == "many")/length_3
probability_12 = (probability_10/probability_11)*probability_11/probability_1
print(f'The probability is {probability_12}')

The probability is 0.16
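The Bayes' theorem route above reduces algebraically to a direct count ratio, count(blocks and demand) / count(demand), so the whole table of conditional probabilities can be sketched with `pd.crosstab`. The DataFrame `toy` is a hypothetical stand-in for the real file, so its numbers will not match the outputs above.

```python
import pandas as pd

# Hypothetical stand-in for product_demand.csv (values invented for illustration).
toy = pd.DataFrame({
    "number of blocks": ["some", "few", "some", "some", "many", "few"],
    "demand": ["low", "low", "high", "high", "high", "low"],
})

# Rows are demand classes, columns are block categories.
# normalize="index" makes each row sum to 1, so
# cond.loc[d, b] estimates p(blocks = b | demand = d).
cond = pd.crosstab(toy["demand"], toy["number of blocks"], normalize="index")
print(cond)
```

Each row of `cond` is a full conditional distribution over the block categories, which makes it easy to check that the estimates for one demand class sum to 1.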


### 

Under the Naive Bayesian model and your estimates of the above probabilities, what is the most likely demand for the product if there are "many" blocks for the previous day?

In [29]:

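# Joint and marginal relative frequencies for "many" blocks
p = sum((df["number of blocks"] == "many") & (df["demand"] == "high"))/len(df)
p_1 = sum(df["number of blocks"] == "many")/len(df)
# p(many | high) via Bayes' theorem
p_2 = (p/p_1)*p_1/probability
# Posteriors: p(demand | many) = p(demand) * p(many | demand) / p(many)
Bayesian_low = probability_1*probability_12/p_1
Bayesian_high = probability*p_2/p_1
print(f'The posterior probability of low demand is {Bayesian_low}')
print(f'The posterior probability of high demand is {Bayesian_high}')

The posterior probability of low demand is 0.23188405797101452
The posterior probability of high demand is 0.7681159420289856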
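These two values can be checked by hand from the probabilities printed earlier. The check assumes the three block categories are exhaustive, so that $p(\text{blocks}=\text{"many"}\mid\text{demand}=\text{"high"}) = 1 - 0.13 - 0.34 = 0.53$; that figure is inferred here rather than printed above.

$$
p(\text{many}) = 0.5 \times 0.53 + 0.5 \times 0.16 = 0.345
$$

$$
p(\text{high} \mid \text{many}) = \frac{0.5 \times 0.53}{0.345} \approx 0.768, \qquad p(\text{low} \mid \text{many}) = \frac{0.5 \times 0.16}{0.345} \approx 0.232
$$

The two posteriors sum to 1, as they should, and agree with the printed results.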


In [35]:
print('The posterior probability of "low" demand is about 23.19% and that of "high" demand is about 76.81%, so if there were "many" blocks on the previous day, the demand for the product is most likely "high".')

The posterior probability of "low" demand is about 23.19% and that of "high" demand is about 76.81%, so if there were "many" blocks on the previous day, the demand for the product is most likely "high".
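The same steps can be wrapped in a small helper that returns the posterior for any block category. This is a minimal sketch, not the notebook's own method: the function name `posterior` and the DataFrame `toy` are hypothetical, and the toy values are invented, so the numbers differ from those above.

```python
import pandas as pd

def posterior(data, blocks, feature="number of blocks", target="demand"):
    """Estimate p(target = d | feature = blocks) for every class d."""
    # Prior p(d) from relative class frequencies.
    priors = data[target].value_counts(normalize=True)
    # Likelihood p(blocks | d): one column of the row-normalized crosstab.
    likelihood = pd.crosstab(data[target], data[feature], normalize="index")[blocks]
    # Joint p(blocks, d); pandas aligns the two Series on the class label.
    joint = priors * likelihood
    # Dividing by p(blocks) = sum of the joint terms gives the posterior.
    return joint / joint.sum()

# Hypothetical stand-in for product_demand.csv (values invented for illustration).
toy = pd.DataFrame({
    "number of blocks": ["some", "few", "some", "some", "many", "few"],
    "demand": ["low", "low", "high", "high", "high", "low"],
})

post = posterior(toy, "some")
print(post)
print("Most likely demand:", post.idxmax())
```

On this toy data, "some" blocks favors "high" demand. Calling `posterior(df, "many")` on the real dataset should reproduce the two posteriors computed above.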
