# Problem 1

We need to predict which of the Magic Keys listed in "problem 1.csv" will buy milk and/or meat in the first 15 days of March 2019. For each Magic Key, the purchase column should contain Y if a purchase is predicted and N otherwise.

## Necessary Libraries

In [24]:
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

## Data Preprocessing

In [15]:
# Load the data
boxes_df = pd.read_csv('boxes.csv')
purchase_df = pd.read_csv('purchase.csv')
problem_df = pd.read_csv('problem 1.csv')

In [16]:
# Convert the purchase date to datetime format
purchase_df['PURCHASE_DATE'] = pd.to_datetime(purchase_df['PURCHASE_DATE'],format='%d/%m/%Y')

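# Keep purchases made on days 1-15 of any month in the history.
# Note: only the day of the month is checked, not the month or year, so this
# selects early-in-the-month buyers rather than March 2019 purchases.
first_half_purchases = purchase_df[purchase_df['PURCHASE_DATE'].dt.day <= 15]

# Merge the purchase data with the boxes data to get details about the purchased boxes
merged_df = pd.merge(first_half_purchases, boxes_df, on='BOX_ID')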

# Group by MAGIC_KEY and check if they purchased milk or meat
purchase_summary = merged_df.groupby('MAGIC_KEY').agg({'MILK': 'sum', 'MEAT': 'sum'})

# Predict which Magic Keys will make a purchase:
# 'Y' if the key's summed MILK or MEAT is positive, 'N' otherwise
bought_milk_or_meat = (purchase_summary['MILK'] > 0) | (purchase_summary['MEAT'] > 0)
purchase_summary['PURCHASE'] = bought_milk_or_meat.map({True: 'Y', False: 'N'})
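
To make the rule concrete, here is a minimal sketch of the filter, merge, and aggregate steps on a tiny made-up dataset. The rows are invented, and treating `MILK` and `MEAT` as numeric amounts (0 meaning none) is an assumption; the real `boxes.csv` may encode box contents differently.

```python
import pandas as pd

# Hypothetical toy data; column names follow the notebook, values are invented.
toy_purchases = pd.DataFrame({
    "MAGIC_KEY": ["A", "A", "B", "C"],
    "BOX_ID": [1, 2, 2, 1],
    "PURCHASE_DATE": ["03/01/2019", "20/01/2019", "10/02/2019", "25/02/2019"],
})
toy_boxes = pd.DataFrame({
    "BOX_ID": [1, 2],
    "MILK": [1, 0],  # assumed: amount of milk in the box (0 = none)
    "MEAT": [0, 0],  # assumed: amount of meat in the box (0 = none)
})

toy_purchases["PURCHASE_DATE"] = pd.to_datetime(
    toy_purchases["PURCHASE_DATE"], format="%d/%m/%Y"
)

# Days 1-15 of any month: neither month nor year is checked.
first_half = toy_purchases[toy_purchases["PURCHASE_DATE"].dt.day <= 15]

# Attach box contents and total them per Magic Key.
toy_summary = (
    first_half.merge(toy_boxes, on="BOX_ID")
    .groupby("MAGIC_KEY")[["MILK", "MEAT"]]
    .sum()
)

# Flag keys with any milk or meat in the filtered history.
has_any = (toy_summary["MILK"] > 0) | (toy_summary["MEAT"] > 0)
toy_summary["PURCHASE"] = has_any.map({True: "Y", False: "N"})
print(toy_summary)
```

In this toy run, key `A` is flagged `Y`, key `B` is flagged `N` because its early-month box contains neither milk nor meat, and key `C` is absent from the summary altogether because its only purchase fell on day 25.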

In [30]:
# Merge the problem data with our purchase summary to get the final prediction
submission_df = pd.merge(problem_df, purchase_summary[['PURCHASE']], on='MAGIC_KEY', how='left')

# Fill missing values with 'N' and ensure the column is of object dtype to avoid dtype incompatibility
submission_df['PURCHASE'] = submission_df['PURCHASE'].fillna('N').astype(object)

# Save the submission file
submission_df.to_csv('submission1.csv', index=False)
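
A left merge keeps every row of the problem file, and keys with no matching history come back as `NaN` before they are filled with `N`. The sketch below shows this on made-up frames. It also uses two optional `merge` arguments that the notebook does not: `indicator=True` to count keys that fell back to the default, and `validate="many_to_one"` to fail fast if the summary ever contains duplicate keys. All `toy_*` names and values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for problem_df and purchase_summary.
toy_problem = pd.DataFrame({"MAGIC_KEY": ["A", "B", "Z"]})
toy_summary = pd.DataFrame({"MAGIC_KEY": ["A", "B"], "PURCHASE": ["Y", "N"]})

toy_submission = toy_problem.merge(
    toy_summary,
    on="MAGIC_KEY",
    how="left",              # keep every key from the problem file
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
    validate="many_to_one",  # raises if the summary has duplicate keys
)

# Keys with no purchase history at all.
n_missing = int((toy_submission["_merge"] == "left_only").sum())

toy_submission["PURCHASE"] = toy_submission["PURCHASE"].fillna("N")
toy_submission = toy_submission.drop(columns="_merge")

print(toy_submission)
print("keys without history:", n_missing)
```

Counting the `left_only` rows is a cheap way to see how much of the submission is decided by the default `N` rather than by purchase history.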

In [32]:
submission_df

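| Unnamed: 0 | MAGIC_KEY | PURCHASE |
| --- | --- | --- |
| 0 | 28D5BB06356 | N |
| 1 | 293BEAB4E98 | Y |
| 2 | 2962EE8065C | N |
| 3 | 2957BE29EA9 | Y |
| 4 | 28E351A0745 | N |
| ... | ... | ... |
| 58684 | 28FB7C09776 | Y |
| 58685 | 28E0E3B69BF | Y |
| 58686 | 28D343103A7 | Y |
| 58687 | 290B1D6D5CB | Y |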
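
No labels exist yet for March 2019, so these predictions cannot be scored directly. One possible sanity check, sketched here on invented data, is a backtest: build the rule from earlier months only and compare it with what actually happened in the first half of a later, held-out month. The hold-out month, the toy rows, and the assumption that `history` already contains only milk or meat purchases are all hypothetical; the notebook itself does not perform this step.

```python
import pandas as pd

# Hypothetical history of milk/meat purchases (values invented).
history = pd.DataFrame({
    "MAGIC_KEY": ["A", "A", "B", "C", "C", "D"],
    "PURCHASE_DATE": pd.to_datetime(
        ["05/12/2018", "07/02/2019", "10/01/2019",
         "20/01/2019", "03/02/2019", "12/02/2019"],
        format="%d/%m/%Y",
    ),
})
all_keys = ["A", "B", "C", "D", "E"]  # keys to score, including one with no history

first_half = history[history["PURCHASE_DATE"].dt.day <= 15]
holdout_start = pd.Timestamp("2019-02-01")  # assumed hold-out month

# Rule built only from purchases before the hold-out month.
before = first_half["PURCHASE_DATE"] < holdout_start
train_keys = set(first_half.loc[before, "MAGIC_KEY"])

# Ground truth: who actually bought in days 1-15 of the hold-out month.
actual_keys = set(first_half.loc[~before, "MAGIC_KEY"])

y_pred = ["Y" if key in train_keys else "N" for key in all_keys]
y_true = ["Y" if key in actual_keys else "N" for key in all_keys]

# Confusion table: rows are actual outcomes, columns are predictions.
confusion = pd.crosstab(
    pd.Series(y_true, name="actual"),
    pd.Series(y_pred, name="predicted"),
)
print(confusion)
```

The same `y_true` and `y_pred` lists could also be passed to `classification_report`, which is imported at the top of the notebook, to get precision and recall per class.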
