# Prediction
User Story: The user provides a Customer ID and a date, and the program predicts the
quantity that customer is likely to purchase.



# Introduction:

In the dynamic world of retail and e-commerce, understanding customer behavior and predicting
future purchasing patterns is a crucial aspect of business success. 
The "Personalized Quantity Prediction in Retail" task addresses the need to harness
data-driven insights to enhance customer experience, optimize inventory management, 
and boost overall sales.

# Problem Statement:

The primary objective of this task is to enable a program to predict the quantity of products 
that a specific customer is likely to purchase on a given date. To achieve this,
the program takes two key inputs from the user: Customer ID and Date. With these inputs, 
the program employs machine learning and predictive modeling techniques to estimate
the expected quantity of products that the specified customer will buy on the provided date.

# Import Libraries

In [2]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Import Data 

In [3]:
data = pd.read_excel("Online Retail.xlsx")
data

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
...,...,...,...,...,...,...,...,...
541904,581587,22613,PACK OF 20 SPACEBOY NAPKINS,12,2011-12-09 12:50:00,0.85,12680.0,France
541905,581587,22899,CHILDREN'S APRON DOLLY GIRL,6,2011-12-09 12:50:00,2.10,12680.0,France
541906,581587,23254,CHILDRENS CUTLERY DOLLY GIRL,4,2011-12-09 12:50:00,4.15,12680.0,France
541907,581587,23255,CHILDRENS CUTLERY CIRCUS PARADE,4,2011-12-09 12:50:00,4.15,12680.0,France


In [4]:
data.head()

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom


# Check for null values

In [5]:
data.isnull().sum()

InvoiceNo           0
StockCode           0
Description      1454
Quantity            0
InvoiceDate         0
UnitPrice           0
CustomerID     135080
Country             0
dtype: int64

# Drop rows with a missing CustomerID

In [6]:
data.dropna(subset=['CustomerID'], inplace=True)


In [7]:
data.isnull().sum()

InvoiceNo      0
StockCode      0
Description    0
Quantity       0
InvoiceDate    0
UnitPrice      0
CustomerID     0
Country        0
dtype: int64
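
The cleaning step above can be tried on a small frame. The following is a minimal sketch on made-up rows, not the notebook's actual data; the names `sample` and `cleaned` are assumptions introduced for illustration. It also shows that `CustomerID`, which pandas reads as float because of the missing values, can be cast to an integer once those rows are gone.

```python
import numpy as np
import pandas as pd

# Tiny synthetic frame standing in for the retail data (values are made up).
sample = pd.DataFrame({
    "Description": ["LANTERN", None, "NAPKINS"],
    "Quantity": [6, 2, 12],
    "CustomerID": [17850.0, np.nan, 12680.0],
})

# Keep only rows where the customer is known; return a new frame instead of mutating.
cleaned = sample.dropna(subset=["CustomerID"]).copy()

# CustomerID is float only because of the NaNs; without them it can be an integer.
cleaned["CustomerID"] = cleaned["CustomerID"].astype("int64")

print(cleaned)
```

In this toy frame the row with the missing `Description` is also the row with the missing `CustomerID`, so one `dropna` call clears both columns, which mirrors the null counts shown above.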

In [8]:
data.dtypes

InvoiceNo              object
StockCode              object
Description            object
Quantity                int64
InvoiceDate    datetime64[ns]
UnitPrice             float64
CustomerID            float64
Country                object
dtype: object

# Extract date features for prediction

In [9]:
data['Year'] = data['InvoiceDate'].dt.year
data['Month'] = data['InvoiceDate'].dt.month
data['Day'] = data['InvoiceDate'].dt.day
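
Because the same three columns are needed again when building the prediction input, the extraction could be wrapped in a small helper. This is only a sketch: `add_date_features` and `demo` are hypothetical names that do not appear in the notebook.

```python
import pandas as pd

def add_date_features(frame, column="InvoiceDate"):
    """Return a copy of `frame` with Year, Month and Day derived from `column`."""
    out = frame.copy()
    stamps = pd.to_datetime(out[column])  # accepts strings or datetime values
    out["Year"] = stamps.dt.year
    out["Month"] = stamps.dt.month
    out["Day"] = stamps.dt.day
    return out

# One timestamp taken from the first row of the data preview.
demo = add_date_features(pd.DataFrame({"InvoiceDate": ["2010-12-01 08:26:00"]}))
print(demo[["Year", "Month", "Day"]])
```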

# Value counts for 2010 and 2011

In [10]:
data['Year'].value_counts()

2011    379979
2010     26850
Name: Year, dtype: int64

# Features and target variable

In [11]:
# Define your features and target variable
X = data[['CustomerID', 'Year', 'Month', 'Day']]  # Features
y = data['Quantity']  # Target variable

# Split the data and train a model

In [12]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and fit a linear regression model
model = LinearRegression()
model.fit(X_train, y_train)
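
The notebook imports `mean_squared_error` and holds out a test set but never scores the model. One way to do that is sketched below, under the assumption that a comparison against a "predict the training mean" baseline is a useful sanity check. The data here is synthetic so the block runs on its own; `X_demo`, `y_demo` and `demo_model` are illustrative names, and with the real data the notebook's `X_test`, `y_test` and `model` would take their place.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the retail features (random values, not real results).
rng = np.random.default_rng(42)
n = 500
X_demo = pd.DataFrame({
    "CustomerID": rng.integers(12000, 18000, size=n),
    "Year": rng.choice([2010, 2011], size=n),
    "Month": rng.integers(1, 13, size=n),
    "Day": rng.integers(1, 29, size=n),
})
y_demo = pd.Series(rng.poisson(8, size=n), name="Quantity")

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
demo_model = LinearRegression().fit(X_tr, y_tr)

# Error of the fitted model on rows it has not seen.
model_mse = mean_squared_error(y_te, demo_model.predict(X_te))

# Baseline: always predict the mean quantity of the training set.
baseline_mse = mean_squared_error(y_te, np.full(len(y_te), y_tr.mean()))

print(f"Model RMSE:    {np.sqrt(model_mse):.3f}")
print(f"Baseline RMSE: {np.sqrt(baseline_mse):.3f}")
```

If the model's error is not clearly below the baseline's, the features carry little signal for the target. That outcome is plausible here, since `CustomerID` is an identifier rather than a quantity with a meaningful linear scale.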


# Collect user input

In [14]:
# User input for CustomerID, Year, and Month
customer_id = int(input("Enter CustomerID: "))
year = int(input("Enter Year: "))
month = int(input("Enter Month (1-12): "))


Enter CustomerID:  17850
Enter Year:  2010
Enter Month (1-12):  12


# Predict the quantity

In [15]:
# Create a DataFrame with the input data.
# Day is fixed to 1 because only the year and month were collected from the user.
input_data = pd.DataFrame({'CustomerID': [customer_id], 'Year': [year], 'Month': [month], 'Day': [1]})


# Use the trained model to make predictions
predicted_quantity = model.predict(input_data)


print(f"Predicted Quantity: {predicted_quantity[0]}")


Predicted Quantity: 9.685981324020986
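
The user story asks for a Customer ID and a full date, while the cells above collect only a year and a month. A minimal sketch of a helper that accepts a complete date is shown below. `predict_quantity` and `FEATURES` are hypothetical names, and the toy model is fitted on made-up rows purely so the example runs on its own; in the notebook the trained `model` would be passed in.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Column order must match the features the model was trained on.
FEATURES = ["CustomerID", "Year", "Month", "Day"]

def predict_quantity(model, customer_id, date_text):
    """Predict the quantity for one customer on one date (hypothetical helper)."""
    when = pd.to_datetime(date_text)  # raises ValueError on unparseable input
    row = pd.DataFrame(
        [[int(customer_id), when.year, when.month, when.day]],
        columns=FEATURES,
    )
    return float(model.predict(row)[0])

# Toy model fitted on made-up rows, only so the sketch is self-contained.
toy_X = pd.DataFrame(
    [
        [17850, 2010, 12, 1],
        [12680, 2011, 12, 9],
        [17850, 2011, 1, 15],
        [12680, 2011, 6, 20],
    ],
    columns=FEATURES,
)
toy_y = [6, 12, 8, 4]
toy_model = LinearRegression().fit(toy_X, toy_y)

estimate = predict_quantity(toy_model, 17850, "2010-12-01")
print(f"Predicted Quantity: {estimate:.2f}")
```

Building the row with the same column names used in training avoids the feature-name mismatch warning that scikit-learn emits when a model fitted on a DataFrame is given unnamed input.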



# Challenges:

Accurate prediction depends on the availability and quality of historical transaction data.

The model must account for seasonality and external factors that affect buying behavior.

Customer-specific information must be handled in compliance with data privacy regulations
such as GDPR.

# Conclusion

The "Personalized Quantity Prediction in Retail" task represents a valuable and practical
application of data-driven decision-making in the dynamic world of retail and e-commerce.
This task harnesses the power of historical transaction data, machine learning, and
personalized insights to address several critical objectives:

Enhanced Customer Experience: By predicting the quantity of products a specific customer is
likely to purchase on a given date, retailers can provide personalized recommendations and offers, 
ultimately improving the shopping experience and increasing customer loyalty.

Optimized Inventory Management: Accurate predictions enable retailers to optimize their inventory
levels, reducing the risk of stockouts and overstocking. This, in turn, leads to cost savings and
improved operational efficiency.

Targeted Marketing and Promotions: Personalized predictions empower retailers to design targeted
marketing campaigns and promotions, increasing the effectiveness of their marketing efforts and 
driving higher sales.

Data-Driven Decision-Making: Retailers can make informed decisions based on predictive insights,
allowing them to plan production, logistics, and supply chain operations more effectively.

However, it's important to acknowledge that the success of this task depends on various factors, including the availability and quality of historical data, the ability to handle seasonality and external factors, and compliance with data privacy regulations.

In summary, the "Personalized Quantity Prediction in Retail" task demonstrates how the synergy
between data analysis, machine learning, and domain knowledge can lead to improved customer 
satisfaction, operational efficiency, and business profitability in the ever-evolving retail
landscape. It underscores the importance of leveraging data-driven solutions to stay competitive and meet the evolving demands of modern consumers.





