# Problem 1

We need to predict which of the Magic Keys listed in “problem 1.csv” will buy milk and/or meat in the first 15 days of March 2019. For each Magic Key, we put Y in the purchase column if we predict a purchase and N otherwise.

## Necessary Libraries

In [1]:
import pandas as pd
import numpy as np

## Data Preprocessing

In [2]:
# Load the data
boxes_df = pd.read_csv('boxes.csv')
purchase_df = pd.read_csv('purchase.csv')
problem_df = pd.read_csv('problem 1.csv')
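Before preprocessing, it can help to confirm that each frame carries the columns the later cells rely on. The sketch below is illustrative only: `REQUIRED` and `missing_columns` are hypothetical names, the column-to-file mapping is assumed from how the frames are used further down, and the toy frame stands in for the real CSV.

```python
import pandas as pd

# Assumed mapping of columns to files, inferred from the later cells (not confirmed by the data)
REQUIRED = {
    "boxes": {"BOX_ID", "MILK", "MEAT"},
    "purchase": {"PURCHASE_DATE", "MAGIC_KEY", "BOX_ID"},
    "problem": {"MAGIC_KEY"},
}

def missing_columns(df: pd.DataFrame, required: set) -> set:
    """Return the required column names that are absent from df."""
    return set(required) - set(df.columns)

# Toy frame standing in for boxes.csv (hypothetical values, MEAT deliberately left out)
toy_boxes = pd.DataFrame({"BOX_ID": [1, 2], "MILK": [1.0, 0.0]})
print(missing_columns(toy_boxes, REQUIRED["boxes"]))
```

On the real data the same call would be `missing_columns(boxes_df, REQUIRED["boxes"])`; an empty set means nothing is missing.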

In [3]:
# Convert the purchase date to datetime format
purchase_df['PURCHASE_DATE'] = pd.to_datetime(purchase_df['PURCHASE_DATE'],format='%d/%m/%Y')

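# Filter purchases made in the first 15 days of March
# Note: the condition checks month and day only, so March purchases from every year in the data are kept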
march_purchases = purchase_df[(purchase_df['PURCHASE_DATE'].dt.month == 3) & (purchase_df['PURCHASE_DATE'].dt.day <= 15)]

# Merge the purchase data with the boxes data to get details about the purchased boxes
merged_df = pd.merge(march_purchases, boxes_df, on='BOX_ID')

# Group by MAGIC_KEY and check if they purchased milk or meat
purchase_summary = merged_df.groupby('MAGIC_KEY').agg({'MILK': 'sum', 'MEAT': 'sum'})

# Predict which Magic Keys will make a purchase
# Label a Magic Key 'Y' if its summed MILK or MEAT in the window is positive, otherwise 'N'
purchase_summary['PURCHASE'] = np.where((purchase_summary['MILK'] > 0) | (purchase_summary['MEAT'] > 0), 'Y', 'N')
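To see what the filter, merge, and aggregation do, the same steps can be run on a few in-memory rows. This is a minimal sketch: only the column names come from the notebook, and every value in the toy frames is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names are taken from the notebook
toy_purchases = pd.DataFrame({
    "PURCHASE_DATE": ["05/03/2019", "20/03/2019", "10/03/2019", "12/02/2019"],
    "MAGIC_KEY": ["A", "A", "B", "C"],
    "BOX_ID": [1, 1, 2, 1],
})
toy_boxes = pd.DataFrame({
    "BOX_ID": [1, 2],
    "MILK": [1.0, 0.0],
    "MEAT": [0.0, 0.0],
})

# Day-first dates, parsed with the same format string as the notebook
toy_purchases["PURCHASE_DATE"] = pd.to_datetime(
    toy_purchases["PURCHASE_DATE"], format="%d/%m/%Y"
)

# Keep rows dated 1 to 15 March, then attach the box contents
dates = toy_purchases["PURCHASE_DATE"]
in_window = (dates.dt.month == 3) & (dates.dt.day <= 15)
toy_merged = toy_purchases[in_window].merge(toy_boxes, on="BOX_ID")

# Sum milk and meat per Magic Key and label each key
toy_summary = toy_merged.groupby("MAGIC_KEY")[["MILK", "MEAT"]].sum()
toy_summary["PURCHASE"] = np.where(
    (toy_summary["MILK"] > 0) | (toy_summary["MEAT"] > 0), "Y", "N"
)
print(toy_summary)
```

Key A is labelled Y because its 5 March box contains milk, and key B is labelled N because its box contains neither. Key C does not appear in the summary at all, since its only purchase falls outside the window; such keys receive their label later, when missing values are filled.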

In [4]:
# Merge the problem data with our purchase summary to get the final prediction
submission_df = pd.merge(problem_df, purchase_summary[['PURCHASE']], on='MAGIC_KEY', how='left')

# Fill missing values with 'N' and ensure the column is of object dtype to avoid dtype incompatibility
submission_df['PURCHASE'] = submission_df['PURCHASE'].fillna('N').astype(object)

# Save the submission file
submission_df.to_csv('submission.csv', index=False)
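A quick sanity check on the merged result can catch silent problems, such as duplicated rows or leftover missing labels, before the file is written. The sketch below uses hypothetical toy frames. It also moves `MAGIC_KEY` from the index into a column before merging and passes `validate="many_to_one"`, which is an optional safeguard rather than something the cells above do.

```python
import pandas as pd

# Hypothetical stand-ins for problem_df and purchase_summary
toy_problem = pd.DataFrame({"MAGIC_KEY": ["A", "B", "C"]})
toy_summary = pd.DataFrame(
    {"PURCHASE": ["Y", "N"]},
    index=pd.Index(["A", "B"], name="MAGIC_KEY"),
)

toy_submission = toy_problem.merge(
    toy_summary.reset_index()[["MAGIC_KEY", "PURCHASE"]],
    on="MAGIC_KEY",
    how="left",              # keep every Magic Key from the problem file
    validate="many_to_one",  # raise if a key is duplicated in the summary
)
toy_submission["PURCHASE"] = toy_submission["PURCHASE"].fillna("N")

# The left merge should preserve the row count, and every label should be Y or N
assert len(toy_submission) == len(toy_problem)
assert toy_submission["PURCHASE"].isin(["Y", "N"]).all()
print(toy_submission)
```

The same two assertions can be applied to `submission_df` and `problem_df` before calling `to_csv`.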