# Introduction
Anny seriously loves Japanese food, so at the beginning of 2021 he decides to embark upon a risky venture and opens a cute little restaurant that sells his 3 favourite foods: **sushi, curry and ramen.**

Anny’s Diner needs your help to stay afloat. The restaurant has captured some very basic data from its first few months of operation but has no idea how to use that data to run the business.

## Problem statement
Anny wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they’ve spent and also which menu items are their favourite. Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers.

He plans on using these insights to help him decide whether he should expand the existing customer loyalty program - additionally he needs help to generate some basic datasets so his team can easily inspect the data without needing to use SQL.

Anny has provided you with a sample of his overall customer data due to privacy issues, but he hopes that these examples are enough for you to write fully functioning pandas code to help him answer his questions!

Anny has shared with you 3 key datasets for this case study:

- sales
- menu
- members

### 1. Bring in the necessary libraries for your work. Import the tools and resources needed to accomplish your tasks.
### 2. Import the necessary data for analysis. Bring in the information that you need to examine and draw insights from.
### 3. Explore the details of all datasets by checking their information.
### 4. Make sure that each type of information (like numbers or dates) is stored in the correct way. This helps ensure that the data is accurate and ready for analysis, making your work more reliable and meaningful.

In [207]:
import numpy as np
import pandas as pd
import seaborn as sns
menu=pd.read_csv(r'C:\Users\Sam\Downloads\menu.csv')
members=pd.read_csv(r"C:\Users\Sam\Downloads\members.csv")
sales=pd.read_csv(r"C:\Users\Sam\Downloads\sales.csv")

In [208]:
members

Unnamed: 0,customer_id,join_date
0,A,2021-01-07
1,B,2021-01-09


In [209]:
menu

Unnamed: 0,product_id,product_name,price
0,1,sushi,10
1,2,curry,15
2,3,ramen,12


In [210]:
sales

Unnamed: 0,customer_id,order_date,product_id
0,A,2021-01-01,1
1,A,2021-01-01,2
2,A,2021-01-07,2
3,A,2021-01-10,3
4,A,2021-01-11,3
5,A,2021-01-11,3
6,B,2021-01-01,2
7,B,2021-01-02,2
8,B,2021-01-04,1
9,B,2021-01-11,1


In [211]:
members.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   customer_id  2 non-null      object
 1   join_date    2 non-null      object
dtypes: object(2)
memory usage: 164.0+ bytes


In [212]:
menu.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   product_id    3 non-null      int64 
 1   product_name  3 non-null      object
 2   price         3 non-null      int64 
dtypes: int64(2), object(1)
memory usage: 204.0+ bytes


In [213]:
sales.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15 entries, 0 to 14
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   customer_id  15 non-null     object
 1   order_date   15 non-null     object
 2   product_id   15 non-null     int64 
dtypes: int64(1), object(2)
memory usage: 492.0+ bytes


In [214]:
sales['order_date']=sales['order_date'].astype('datetime64[ns]')
members['join_date']=members['join_date'].astype('datetime64[ns]')
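
A quick dtype check confirms that a date conversion like the one above took effect. The sketch below uses `pd.to_datetime` on a small stand-in table; `members_demo` is a hypothetical name holding the two member rows shown earlier, and the explicit `format` is an assumption based on the `YYYY-MM-DD` strings in the data.

```python
import pandas as pd

# Stand-in for the members table shown above (dates still stored as text).
members_demo = pd.DataFrame({
    "customer_id": ["A", "B"],
    "join_date": ["2021-01-07", "2021-01-09"],
})

# Parse the strings into timestamps; an unparseable value raises an error by default.
members_demo["join_date"] = pd.to_datetime(members_demo["join_date"], format="%Y-%m-%d")

# Confirm the column now has a datetime dtype instead of object.
is_datetime = pd.api.types.is_datetime64_any_dtype(members_demo["join_date"])
print(members_demo.dtypes)
print("join_date is datetime:", is_datetime)
```

Once the column is a datetime, comparisons such as `order_date >= join_date` and the `.dt` accessors used later behave as date operations rather than string operations.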

### 1. What is the total amount each customer spent at the restaurant?
### 2. How many days has each customer visited the restaurant?
### 3. What was the first item from the menu purchased by each customer?
### 4. What is the most purchased item on the menu and how many times was it purchased by all customers?
### 5. Which item was the most popular for each customer?
### 6. Which item was purchased first by the customer after they became a member?
### 7. Which item was purchased just before the customer became a member?
### 8. What is the total items and amount spent for each member before they became a member?
### 9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?
### 10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?


In [215]:
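# Attach product_name and price to every sale; with no `on=` given,
# merge joins on the only column the two tables share, product_id
tab = pd.merge(sales, menu)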

In [216]:
tab

Unnamed: 0,customer_id,order_date,product_id,product_name,price
0,A,2021-01-01,1,sushi,10
1,B,2021-01-04,1,sushi,10
2,B,2021-01-11,1,sushi,10
3,A,2021-01-01,2,curry,15
4,A,2021-01-07,2,curry,15
5,B,2021-01-01,2,curry,15
6,B,2021-01-02,2,curry,15
7,A,2021-01-10,3,ramen,12
8,A,2021-01-11,3,ramen,12
9,A,2021-01-11,3,ramen,12


In [217]:
# Same join as `tab`, with the key and join type spelled out
z = pd.merge(sales, menu, on='product_id', how='inner')

In [218]:
z

Unnamed: 0,customer_id,order_date,product_id,product_name,price
0,A,2021-01-01,1,sushi,10
1,B,2021-01-04,1,sushi,10
2,B,2021-01-11,1,sushi,10
3,A,2021-01-01,2,curry,15
4,A,2021-01-07,2,curry,15
5,B,2021-01-01,2,curry,15
6,B,2021-01-02,2,curry,15
7,A,2021-01-10,3,ramen,12
8,A,2021-01-11,3,ramen,12
9,A,2021-01-11,3,ramen,12


In [219]:
q1=z.groupby('customer_id')['price'].sum()

In [220]:
q1

customer_id
A    76
B    74
C    36
Name: price, dtype: int64
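
A merge can silently duplicate or drop sales if the menu has repeated or missing product ids, and that would distort the totals. The sketch below shows one possible guard: `merge(..., validate="many_to_one")` plus a count of unmatched rows. The `_demo` tables are hypothetical stand-ins rebuilt from the rows printed in this notebook, not the original CSV files.

```python
import pandas as pd

# Stand-in tables rebuilt from the outputs in this notebook (assumed to match the CSVs).
sales_demo = pd.DataFrame({
    "customer_id": list("AAAAAA") + list("BBBBBB") + list("CCC"),
    "product_id": [1, 2, 2, 3, 3, 3, 2, 2, 1, 1, 3, 3, 3, 3, 3],
})
menu_demo = pd.DataFrame({
    "product_id": [1, 2, 3],
    "product_name": ["sushi", "curry", "ramen"],
    "price": [10, 15, 12],
})

# validate="many_to_one" raises MergeError if product_id is duplicated in menu_demo.
merged_demo = sales_demo.merge(menu_demo, on="product_id", how="left", validate="many_to_one")

# A left join keeps every sale; a missing price would mean an unknown product_id.
unmatched = int(merged_demo["price"].isna().sum())

# Total spend per customer.
totals_demo = merged_demo.groupby("customer_id")["price"].sum()
print("unmatched sales:", unmatched)
print(totals_demo)
```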

### 2. How many days has each customer visited the restaurant?

In [221]:
sales.groupby('customer_id')['order_date'].nunique()  # number of distinct days on which each customer ordered

customer_id
A    4
B    6
C    2
Name: order_date, dtype: int64

In [222]:
sales.groupby('customer_id')['order_date'].value_counts()  # number of orders per customer on each visit day

customer_id  order_date
A            2021-01-01    2
             2021-01-11    2
             2021-01-07    1
             2021-01-10    1
B            2021-01-01    1
             2021-01-02    1
             2021-01-04    1
             2021-01-11    1
             2021-01-16    1
             2021-02-01    1
C            2021-01-01    2
             2021-01-07    1
Name: count, dtype: int64

### 3. What was the first item from the menu purchased by each customer?

In [223]:

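# rank() defaults to method='average', so purchases tied on the same date share a
# fractional rank (two orders on a customer's first day both get 1.5)
tab['purchase_rank'] = tab.groupby('customer_id')['order_date'].rank()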


In [224]:
tab

Unnamed: 0,customer_id,order_date,product_id,product_name,price,purchase_rank
0,A,2021-01-01,1,sushi,10,1.5
1,B,2021-01-04,1,sushi,10,3.0
2,B,2021-01-11,1,sushi,10,4.0
3,A,2021-01-01,2,curry,15,1.5
4,A,2021-01-07,2,curry,15,3.0
5,B,2021-01-01,2,curry,15,1.0
6,B,2021-01-02,2,curry,15,2.0
7,A,2021-01-10,3,ramen,12,4.0
8,A,2021-01-11,3,ramen,12,5.5
9,A,2021-01-11,3,ramen,12,5.5


In [225]:
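# First-day purchases: rank 1.5 when two orders tie on the first date,
# rank 1 when the first date has a single order
first_purchase = tab[tab['purchase_rank'] == 1.5]
first_purchase1 = tab[tab['purchase_rank'] == 1]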

In [226]:
first_purchase

Unnamed: 0,customer_id,order_date,product_id,product_name,price,purchase_rank
0,A,2021-01-01,1,sushi,10,1.5
3,A,2021-01-01,2,curry,15,1.5
12,C,2021-01-01,3,ramen,12,1.5
13,C,2021-01-01,3,ramen,12,1.5


In [227]:
first_purchase1

Unnamed: 0,customer_id,order_date,product_id,product_name,price,purchase_rank
5,B,2021-01-01,2,curry,15,1.0


In [228]:
fp = pd.merge(first_purchase, first_purchase1, on='customer_id', how='outer')

In [229]:
fp

Unnamed: 0,customer_id,order_date_x,product_id_x,product_name_x,price_x,purchase_rank_x,order_date_y,product_id_y,product_name_y,price_y,purchase_rank_y
0,A,2021-01-01,1.0,sushi,10.0,1.5,NaT,,,,
1,A,2021-01-01,2.0,curry,15.0,1.5,NaT,,,,
2,C,2021-01-01,3.0,ramen,12.0,1.5,NaT,,,,
3,C,2021-01-01,3.0,ramen,12.0,1.5,NaT,,,,
4,B,NaT,,,,,2021-01-01,2.0,curry,15.0,1.0
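
The rank-based approach above needs one filter per tie pattern (1.5 and 1) and an outer merge to bring the pieces together. A possibly simpler route is to compare each order date with the customer's earliest order date. The sketch below assumes a hypothetical `orders_demo` table rebuilt from the rows printed in this notebook. Because the data has dates but no timestamps, every item bought on the first day counts as a "first item".

```python
import pandas as pd

# Stand-in for the merged sales/menu table (rows rebuilt from this notebook's outputs).
orders_demo = pd.DataFrame({
    "customer_id": list("AAAAAA") + list("BBBBBB") + list("CCC"),
    "order_date": pd.to_datetime([
        "2021-01-01", "2021-01-01", "2021-01-07", "2021-01-10", "2021-01-11", "2021-01-11",
        "2021-01-01", "2021-01-02", "2021-01-04", "2021-01-11", "2021-01-16", "2021-02-01",
        "2021-01-01", "2021-01-01", "2021-01-07",
    ]),
    "product_name": [
        "sushi", "curry", "curry", "ramen", "ramen", "ramen",
        "curry", "curry", "sushi", "sushi", "ramen", "ramen",
        "ramen", "ramen", "ramen",
    ],
})

# Earliest order date per customer, broadcast back onto every row.
first_date = orders_demo.groupby("customer_id")["order_date"].transform("min")

# Keep first-day rows, drop repeated items, and tidy the result.
first_items_demo = (
    orders_demo.loc[orders_demo["order_date"] == first_date,
                    ["customer_id", "order_date", "product_name"]]
    .drop_duplicates()
    .sort_values(["customer_id", "product_name"])
    .reset_index(drop=True)
)
print(first_items_demo)
```

This keeps all customers in one frame with one set of columns, and it works regardless of how many items are tied on the first day.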


### 4. What is the most purchased item on the menu and how many times was it purchased by all customers?

In [230]:
pur_product = tab.groupby('product_name')['product_id'].count().reset_index()
pur_product.columns = ['product_name', 'purchase_count']


In [231]:
most_purchased_item = pur_product.loc[pur_product['purchase_count'].idxmax()]
print('The most purchased item and the number of times it was purchased by all customers:', most_purchased_item)

The most purchased item and the number of times it was purchased by all customers: product_name      ramen
purchase_count        8
Name: 1, dtype: object
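
Printing a whole Series also prints its `Name:` and `dtype` lines, as seen above. A more compact alternative, sketched below, counts items with `Series.value_counts` and formats only the two values of interest. `product_names_demo` is a hypothetical stand-in listing the product of each of the 15 sales, rebuilt from this notebook's outputs.

```python
import pandas as pd

# One entry per sale (A's six orders, then B's six, then C's three).
product_names_demo = pd.Series(
    ["sushi", "curry", "curry", "ramen", "ramen", "ramen",
     "curry", "curry", "sushi", "sushi", "ramen", "ramen",
     "ramen", "ramen", "ramen"],
    name="product_name",
)

# value_counts tallies each distinct item; idxmax/max pick out the winner.
counts_demo = product_names_demo.value_counts()
top_item = counts_demo.idxmax()
top_count = int(counts_demo.max())

print(f"Most purchased item: {top_item} ({top_count} purchases)")
```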


### 5. Which item was the most popular for each customer?

In [232]:
customer_product_counts = tab.groupby(['customer_id', 'product_name']).size().reset_index(name='purchase_count')


In [233]:
most_popular_for_each_customer = customer_product_counts.loc[customer_product_counts.groupby('customer_id')['purchase_count'].idxmax()]

In [234]:
most_popular_for_each_customer  # most popular item per customer (on a tie, idxmax keeps only the first item)

Unnamed: 0,customer_id,product_name,purchase_count
1,A,ramen,3
3,B,curry,2
6,C,ramen,3
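
`idxmax` returns only the first row that reaches the maximum, so a customer with several equally popular items shows just one of them. In this data customer B bought curry, sushi and ramen twice each, yet only curry appears above. The sketch below keeps every tied item by comparing each count with the per-customer maximum. `orders_demo` is a hypothetical stand-in rebuilt from the rows printed in this notebook.

```python
import pandas as pd

# One row per sale, rebuilt from this notebook's outputs.
orders_demo = pd.DataFrame({
    "customer_id": list("AAAAAA") + list("BBBBBB") + list("CCC"),
    "product_name": [
        "sushi", "curry", "curry", "ramen", "ramen", "ramen",
        "curry", "curry", "sushi", "sushi", "ramen", "ramen",
        "ramen", "ramen", "ramen",
    ],
})

# Purchases per customer and item.
counts_demo = (
    orders_demo.groupby(["customer_id", "product_name"])
    .size()
    .reset_index(name="purchase_count")
)

# Highest count per customer, broadcast to every row of that customer.
max_per_customer = counts_demo.groupby("customer_id")["purchase_count"].transform("max")

# Keep every item that reaches the maximum, so ties are preserved.
favourites_demo = counts_demo[counts_demo["purchase_count"] == max_per_customer].reset_index(drop=True)
print(favourites_demo)
```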


### 6. Which item was purchased first by the customer after they became a member?

In [235]:
memsal = pd.merge(members, sales, on='customer_id')

In [236]:
memsal

Unnamed: 0,customer_id,join_date,order_date,product_id
0,A,2021-01-07,2021-01-01,1
1,A,2021-01-07,2021-01-01,2
2,A,2021-01-07,2021-01-07,2
3,A,2021-01-07,2021-01-10,3
4,A,2021-01-07,2021-01-11,3
5,A,2021-01-07,2021-01-11,3
6,B,2021-01-09,2021-01-01,2
7,B,2021-01-09,2021-01-02,2
8,B,2021-01-09,2021-01-04,1
9,B,2021-01-09,2021-01-11,1


In [239]:
allmerge=pd.merge(memsal,menu, on='product_id',how='outer')

In [241]:
allmerge

Unnamed: 0,customer_id,join_date,order_date,product_id,product_name,price
0,A,2021-01-07,2021-01-01,1,sushi,10
1,B,2021-01-09,2021-01-04,1,sushi,10
2,B,2021-01-09,2021-01-11,1,sushi,10
3,A,2021-01-07,2021-01-01,2,curry,15
4,A,2021-01-07,2021-01-07,2,curry,15
5,B,2021-01-09,2021-01-01,2,curry,15
6,B,2021-01-09,2021-01-02,2,curry,15
7,A,2021-01-07,2021-01-10,3,ramen,12
8,A,2021-01-07,2021-01-11,3,ramen,12
9,A,2021-01-07,2021-01-11,3,ramen,12


In [242]:
filtered_data1 = allmerge[allmerge['order_date'] >= allmerge['join_date']]
# Earliest order on or after the join date, per customer
first_item_after_join = filtered_data1.sort_values('order_date').groupby('customer_id').first()

In [243]:
filtered_data1

Unnamed: 0,customer_id,join_date,order_date,product_id,product_name,price
2,B,2021-01-09,2021-01-11,1,sushi,10
4,A,2021-01-07,2021-01-07,2,curry,15
7,A,2021-01-07,2021-01-10,3,ramen,12
8,A,2021-01-07,2021-01-11,3,ramen,12
9,A,2021-01-07,2021-01-11,3,ramen,12
10,B,2021-01-09,2021-01-16,3,ramen,12
11,B,2021-01-09,2021-02-01,3,ramen,12


In [244]:
first_item_after_join

Unnamed: 0_level_0,join_date,order_date,product_id,product_name,price
customer_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
A,2021-01-07,2021-01-07,2,curry,15
B,2021-01-09,2021-01-11,1,sushi,10


### 7. Which item was purchased just before the customer became a member?

In [245]:
filtered_data = allmerge[allmerge['order_date'] < allmerge['join_date']]

last_item_before_join = filtered_data.sort_values(['customer_id', 'order_date']).groupby('customer_id').last()


In [246]:
last_item_before_join

Unnamed: 0_level_0,join_date,order_date,product_id,product_name,price
customer_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
A,2021-01-07,2021-01-01,2,curry,15
B,2021-01-09,2021-01-04,1,sushi,10


### 8. What is the total items and amount spent for each member before they became a member?

In [247]:
total_items_and_amount = filtered_data.groupby('customer_id').agg(
    total_items=pd.NamedAgg(column='product_id', aggfunc='count'),
    total_amount=pd.NamedAgg(column='price', aggfunc='sum')
).reset_index()


In [248]:
total_items_and_amount

Unnamed: 0,customer_id,total_items,total_amount
0,A,2,25
1,B,3,40


### 9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?

In [249]:
def calculate_points(row):
    # 10 points per $1 spent; sushi earns double
    multiplier = 2 if row['product_name'] == 'sushi' else 1
    return row['price'] * 10 * multiplier

# Work on an explicit copy so adding a column does not raise SettingWithCopyWarning.
# Note: filtered_data1 holds only orders placed on or after the join date.
filtered_data1 = filtered_data1.copy()
filtered_data1['points'] = filtered_data1.apply(calculate_points, axis=1)

total_points = filtered_data1.groupby('customer_id')['points'].sum().reset_index()

total_points


Unnamed: 0,customer_id,points
0,A,510
1,B,440
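
The cell above scores only the rows in `filtered_data1`, that is, orders placed by members on or after their join date. If the question is instead read as covering every purchase by every customer, the same rule can be applied to the full merged table. The sketch below is a vectorised version using `numpy.where`. `orders_demo` and `prices_demo` are hypothetical stand-ins rebuilt from the rows printed in this notebook.

```python
import numpy as np
import pandas as pd

# Menu prices and one row per sale, rebuilt from this notebook's outputs.
prices_demo = {"sushi": 10, "curry": 15, "ramen": 12}
orders_demo = pd.DataFrame({
    "customer_id": list("AAAAAA") + list("BBBBBB") + list("CCC"),
    "product_name": [
        "sushi", "curry", "curry", "ramen", "ramen", "ramen",
        "curry", "curry", "sushi", "sushi", "ramen", "ramen",
        "ramen", "ramen", "ramen",
    ],
})
orders_demo["price"] = orders_demo["product_name"].map(prices_demo)

# 10 points per $1; sushi rows get a 2x multiplier, all others 1x.
multiplier = np.where(orders_demo["product_name"] == "sushi", 2, 1)
orders_demo["points"] = orders_demo["price"] * 10 * multiplier

# Points per customer across all purchases, members or not.
points_all_demo = orders_demo.groupby("customer_id")["points"].sum()
print(points_all_demo)
```

Under this reading customer C, who never joined the program, also receives a total. Which reading is intended depends on whether points are meant to accrue before membership.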


### 10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?

In [257]:
allmerge['days_since_join'] = (allmerge['order_date'] - allmerge['join_date']).dt.days

def calculate_points(row):
    # First week of membership = join date plus the next 6 days (days 0 to 6): 2x on every item
    in_first_week = 0 <= row['days_since_join'] <= 6
    # Outside that week (including before joining) only sushi keeps its 2x multiplier
    multiplier = 2 if in_first_week or row['product_name'] == 'sushi' else 1
    return row['price'] * 10 * multiplier


allmerge['points'] = allmerge.apply(calculate_points, axis=1)

january_data = allmerge[(allmerge['order_date'].dt.month == 1) & (allmerge['order_date'].dt.year == 2021)]


total_points_january = january_data.groupby('customer_id')['points'].sum().reset_index()

total_points_january

Unnamed: 0,customer_id,points
0,A,1370
1,B,820
