# Instacart Market Basket Analysis

This Jupyter notebook analyzes the Instacart market basket data. It covers loading, exploring, cleaning, and visualizing the data, in that order.

## Importing the required libraries

We import pandas and NumPy for data handling, and Matplotlib and seaborn for plotting.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

## Loading the datasets

We load the five datasets from the `../data` directory. The files are semicolon-delimited, so every `read_csv` call passes `sep=';'`.

In [None]:
orders = pd.read_csv('../data/instacart_orders.csv', sep=';')
products = pd.read_csv('../data/products.csv', sep=';')
departments = pd.read_csv('../data/departments.csv', sep=';')
aisles = pd.read_csv('../data/aisles.csv', sep=';')
order_products = pd.read_csv('../data/order_products.csv', sep=';')
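The effect of the `sep=';'` argument can be checked on a small in-memory sample. The following is a minimal sketch, assuming a made-up two-row sample (`sample` is hypothetical, not taken from the real files): without the separator, pandas reads each line as a single column.

```python
import io
import pandas as pd

# Hypothetical two-row sample in a semicolon-delimited layout (not real data)
sample = "order_id;user_id;order_dow\n1;10;3\n2;11;5\n"

# Default separator is ',', so each whole line lands in a single column
wrong = pd.read_csv(io.StringIO(sample))

# With sep=';' the three fields are split into separate columns
right = pd.read_csv(io.StringIO(sample), sep=';')

print(wrong.shape)  # (2, 1)
print(right.shape)  # (2, 3)
```

A single-column DataFrame after loading is therefore a quick sign that the separator is wrong.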

## Data Exploration

We call `info()` on each DataFrame to check its column names, dtypes, and non-null counts.

In [None]:
orders.info()
products.info()
departments.info()
aisles.info()
order_products.info()
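The `info()` output is long when five tables are printed back to back. As a complement, here is a minimal sketch of a compact summary. The helper `summarize` and the toy frames are hypothetical and only illustrate the idea; on the real data, the dictionary would hold the DataFrames loaded above.

```python
import pandas as pd

def summarize(frames):
    """Return one row per table: size, missing cells, and duplicate rows."""
    rows = []
    for name, df in frames.items():
        rows.append({
            "table": name,
            "rows": len(df),
            "columns": df.shape[1],
            "missing_cells": int(df.isna().sum().sum()),
            "duplicate_rows": int(df.duplicated().sum()),
        })
    return pd.DataFrame(rows).set_index("table")

# Toy stand-ins for the real tables (made-up values)
toy_products = pd.DataFrame(
    {"product_id": [1, 2, 3], "product_name": ["Milk", None, "Bread"]}
)
toy_orders = pd.DataFrame({"order_id": [1, 1, 2], "order_dow": [0, 0, 3]})

summary = summarize({"products": toy_products, "orders": toy_orders})
print(summary)
```

Tables with nonzero `missing_cells` or `duplicate_rows` are the ones that need attention during cleaning.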

## Data Cleaning

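We clean the datasets in two steps: missing product names are replaced with the placeholder `'unknown'`, and fully duplicated rows are dropped from `orders` and `order_products`.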

In [None]:
# Replace missing product names with a placeholder
products['product_name'] = products['product_name'].fillna('unknown')

# Drop fully duplicated rows and rebuild a contiguous index
orders = orders.drop_duplicates().reset_index(drop=True)
order_products = order_products.drop_duplicates().reset_index(drop=True)
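To confirm that the cleaning steps did what was intended, it can help to compare row counts before and after. The sketch below applies the same `fillna` and `drop_duplicates` calls to a toy frame (`toy` holds made-up values, not rows from the dataset) and then checks the result.

```python
import pandas as pd

# Toy frame with one missing-name pair that is also a duplicated row
toy = pd.DataFrame(
    {"product_id": [1, 2, 2, 3], "product_name": ["Milk", None, None, "Bread"]}
)

rows_before = len(toy)

# Same cleaning steps as in the cell above
toy["product_name"] = toy["product_name"].fillna("unknown")
toy = toy.drop_duplicates().reset_index(drop=True)

rows_removed = rows_before - len(toy)
print(f"removed {rows_removed} duplicate row(s)")

# Sanity checks: no missing names, no duplicates, contiguous index
assert toy["product_name"].isna().sum() == 0
assert not toy.duplicated().any()
assert list(toy.index) == list(range(len(toy)))
```

The same three assertions can be run against the real `products`, `orders`, and `order_products` frames after cleaning.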

## Data Visualization

We plot the number of orders placed on each day of the week to see when customers shop.

In [None]:
plt.figure(figsize=(10, 6))
sns.countplot(data=orders, x='order_dow')
plt.title('Order Distribution by Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Number of Orders')
plt.show()
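The bar heights in the count plot can also be read as numbers. Here is a minimal sketch using toy values, assuming `order_dow` is an integer code from 0 to 6 (an assumption, since the notebook does not state the coding). Reindexing over the full range keeps days with no orders visible as zeros.

```python
import pandas as pd

# Toy order_dow values (made up); the 0-6 coding is an assumption
toy_orders = pd.DataFrame({"order_dow": [0, 0, 1, 3, 3, 3, 6]})

# Count orders per day code, sorted by day rather than by frequency
counts = toy_orders["order_dow"].value_counts().sort_index()

# Fill in days that have no orders so all seven codes appear
counts = counts.reindex(range(7), fill_value=0)

print(counts)
```

Replacing `toy_orders` with the cleaned `orders` frame gives the table behind the plot.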

## Conclusion

This notebook walks through loading, exploring, cleaning, and visualizing the Instacart market basket data, ending with the distribution of orders by day of the week.