# E-commerce Shopping Behavior Analysis

**Goal:** Practice Pandas data manipulation and Seaborn/Matplotlib visualization to understand customer habits.

### 1. Imports

Import the libraries used throughout the notebook: Pandas and NumPy for data manipulation, Matplotlib and Seaborn for visualization, and `pathlib` for file path handling.

In [6]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from pathlib import Path

### 2. Plotting Configuration

Set global styles so all visualizations look consistent. We use Seaborn's `whitegrid` style and set a default figure size with `plt.rc`, which updates Matplotlib's runtime configuration (`rcParams`).

In [5]:
# Set global visual preferences for plots
sns.set_style('whitegrid')
# Use plt.rc to configure runtime settings - set default figure size
plt.rc('figure', figsize=(10, 6))
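
As a minimal sketch of how this global default behaves, the snippet below sets the same figure size and then overrides it temporarily with `plt.rc_context`. The `(4, 3)` override and the variable names are illustrative assumptions, not part of the notebook.

```python
import matplotlib.pyplot as plt

# Same global default as in the cell above.
plt.rc("figure", figsize=(10, 6))
default_size = tuple(plt.rcParams["figure.figsize"])

# Temporarily override the default; figures created inside this block
# would use the smaller size (the (4, 3) value is only an illustration).
with plt.rc_context({"figure.figsize": (4, 3)}):
    override_size = tuple(plt.rcParams["figure.figsize"])

# Outside the context manager the global default is back in effect.
restored_size = tuple(plt.rcParams["figure.figsize"])
```

This can be handy when a single plot needs a different size without changing the default for the rest of the notebook.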

### 3. Define Project Paths

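Build file paths with `pathlib.Path` relative to the notebook's working directory, so the notebook can locate the data directory wherever the project lives on the filesystem. This assumes the notebook is run from a folder directly beneath the project root, because the project root is taken to be the parent of the current working directory.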

In [8]:
# --- Define Paths Robustly ---
notebook_dir = Path.cwd()
project_root = notebook_dir.parent
data_dir = project_root / 'data'
# The CSV file itself is located dynamically when the data is loaded

print(f"Project Root: {project_root}")
print(f"Data Directory: {data_dir}")

Project Root: /home/prof-adept/Shared/Alpha/Programming/Python/Data Science Projects/ecommerce-analysis
Data Directory: /home/prof-adept/Shared/Alpha/Programming/Python/Data Science Projects/ecommerce-analysis/data
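
Because `Path.cwd().parent` depends on where the notebook is launched from, one possible alternative is to walk upward until a directory containing `data` is found. The sketch below assumes this approach; `find_project_root` and its `marker` parameter are hypothetical names, not part of the notebook.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Optional[Path] = None, marker: str = "data") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`."""
    start = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each ancestor up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"No '{marker}' directory found at or above {start}")
```

With such a helper, `project_root = find_project_root()` would give the same result whether the notebook is started from the project root or from a nested folder.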


### 4. Load Data Dynamically

The following cell loads the primary dataset. It uses `Path.glob` to find the `.csv` file in the `data/` directory automatically, so the notebook keeps working if the filename changes in a future dataset version. It reports an error if no CSV file is present, and if several are present it prints a warning and loads the first one.

In [9]:
csv_file_path = None
df = None # Initialize df to None

try:
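    # Find all CSV files in the data directory, matching the extension case-insensitively.
    # Filtering on the lowercased suffix avoids duplicate hits on case-insensitive filesystems.
    csv_files = sorted(p for p in data_dir.glob('*') if p.suffix.lower() == '.csv')
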
    if not csv_files:
        print(f"❌ Error: No CSV file found in the directory: {data_dir}")
        print("Please ensure you have downloaded the dataset CSV and placed it there.")
    elif len(csv_files) > 1:
        print(f"⚠️ Warning: Multiple CSV files found in {data_dir}. Using the first one found:")
        print([f.name for f in csv_files])
        csv_file_path = csv_files[0] # Default to the first one
    else:
        # Exactly one CSV file found
        csv_file_path = csv_files[0]
        print(f"Found CSV file: {csv_file_path.name}")

    # Proceed to load if a path was determined
    if csv_file_path:
        print(f"Loading data from: {csv_file_path}")
        df = pd.read_csv(csv_file_path)
        print("✅ Data loaded successfully!")
        display(df.head())

except Exception as e:
    print(f"❌ An unexpected error occurred while finding or loading the data: {e}")

# Report clearly if the data failed to load before moving on
if df is None:
    print("\nStopping analysis because data failed to load.")
    # This message does not halt execution. In a standalone script you would
    # raise an error here; in the notebook, later cells will fail while df is None.

Found CSV file: shopping_behavior_updated.csv
Loading data from: /home/prof-adept/Shared/Alpha/Programming/Python/Data Science Projects/ecommerce-analysis/data/shopping_behavior_updated.csv
✅ Data loaded successfully!


| | Customer ID | Age | Gender | Item Purchased | Category | Purchase Amount (USD) | Location | Size | Color | Season | Review Rating | Subscription Status | Shipping Type | Discount Applied | Promo Code Used | Previous Purchases | Payment Method | Frequency of Purchases |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 55 | Male | Blouse | Clothing | 53 | Kentucky | L | Gray | Winter | 3.1 | Yes | Express | Yes | Yes | 14 | Venmo | Fortnightly |
| 1 | 2 | 19 | Male | Sweater | Clothing | 64 | Maine | L | Maroon | Winter | 3.1 | Yes | Express | Yes | Yes | 2 | Cash | Fortnightly |
| 2 | 3 | 50 | Male | Jeans | Clothing | 73 | Massachusetts | S | Maroon | Spring | 3.1 | Yes | Free Shipping | Yes | Yes | 23 | Credit Card | Weekly |
| 3 | 4 | 21 | Male | Sandals | Footwear | 90 | Rhode Island | M | Maroon | Spring | 3.5 | Yes | Next Day Air | Yes | Yes | 49 | PayPal | Weekly |
| 4 | 5 | 45 | Male | Blouse | Clothing | 49 | Oregon | M | Turquoise | Spring | 2.7 | Yes | Free Shipping | Yes | Yes | 31 | PayPal | Annually |
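
The loading cell prints a message when the CSV file is missing or ambiguous and then carries on. A stricter variant, sketched here as an assumption rather than as the notebook's own method, raises an exception instead so the problem cannot go unnoticed. The helper name `find_single_csv` is hypothetical.

```python
from pathlib import Path


def find_single_csv(directory: Path) -> Path:
    """Return the only CSV file in `directory`, or raise if there is not exactly one."""
    # Compare lowercased suffixes so .csv, .CSV, and .Csv all match,
    # and each file is listed only once on case-insensitive filesystems.
    csv_files = sorted(
        p for p in directory.glob("*") if p.is_file() and p.suffix.lower() == ".csv"
    )
    if not csv_files:
        raise FileNotFoundError(f"No CSV file found in {directory}")
    if len(csv_files) > 1:
        names = [p.name for p in csv_files]
        raise ValueError(f"Expected one CSV file in {directory}, found {names}")
    return csv_files[0]
```

In the notebook this could be used as `df = pd.read_csv(find_single_csv(data_dir))`. Unlike the cell above, this variant treats multiple CSV files as an error rather than loading the first one.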
