# Applied Data Lab

# Project 01: Customer Shopping Trends Dataset

![Person Shopping gif](https://cdn.dribbble.com/users/1948198/screenshots/4377223/dribble.gif)

## About Dataset

The Customer Shopping Preferences Dataset provides insights into consumer behavior and preferences, enabling businesses to tailor their strategies and improve customer satisfaction. With 3900 records, this dataset covers various customer attributes crucial for data analysis and machine learning practice.

| **Attribute**              | **Description**                                                      |
|---------------------------|----------------------------------------------------------------------|
| Customer ID                | Unique identifier for each customer                                  |
| Age                       | Age of the customer                                                  |
| Gender                    | Gender of the customer (Male/Female)                                 |
| Item Purchased            | The item purchased by the customer                                   |
| Category                  | Category of the purchased item                                       |
| Purchase Amount (USD)     | The purchase amount in USD                                           |
| Location                  | Location where the purchase was made                                |
| Size                      | Size of the purchased item                                           |
| Color                     | Color of the purchased item                                          |
| Season                    | Season during which the purchase was made                            |
| Review Rating             | Rating given by the customer for the purchased item                  |
| Subscription Status       | Indicates if the customer has a subscription (Yes/No)                |
| Shipping Type             | Type of shipping chosen by the customer                              |
| Discount Applied          | Indicates if a discount was applied to the purchase (Yes/No)         |
| Promo Code Used           | Indicates if a promo code was used for the purchase (Yes/No)         |
| Previous Purchases        | The total count of transactions concluded by the customer at the store, excluding the ongoing transaction |
| Payment Method            | Customer's most preferred payment method                             |
| Frequency of Purchases    | Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly) |


## Project Objective:

Analyze the Customer Shopping Preferences Dataset to gain insights into customer behavior, preferences, and purchasing patterns. The objectives below provide a structured approach, but feel free to explore the dataset and derive your own insights; it offers plenty of opportunities for exploration and analysis.

**Useful Guides**
- [Pandas GroupBy guide (Real Python)](https://realpython.com/pandas-groupby/)
- [Pandas GroupBy guide (GeeksforGeeks)](https://www.geeksforgeeks.org/python-pandas-dataframe-groupby/)

### Objectives:

1. **Customer Profiling**:
   - *Objective*: Create customer profiles by grouping them based on a specific attribute, e.g., age or gender.
   - *Steps*:
     - Use pandas to group customers by the chosen attribute.
     - Calculate summary statistics like mean and median for various attributes within each group.

2. **Purchase History Analysis**:
   - *Objective*: Aggregate data by "Item Purchased" or "Category" to uncover trends and popular products.
   - *Steps*:
     - Utilize pandas to group data by "Item Purchased" or "Category."
     - Calculate common statistics, such as identifying the most frequently purchased item or the average purchase amount.

3. **Preferred Payment Method Exploration**:
   - *Objective*: Explore the distribution of preferred payment methods among customers.
   - *Steps*:
     - Use pandas' `value_counts()` function to count occurrences of each payment method.
     - Calculate the percentage of customers using different payment methods.

4. **Rating Analysis**:
   - *Objective*: Analyze the numerical review ratings provided by customers to gain insights into overall customer satisfaction and identify trends in review ratings.
   - *Steps*:
     - Calculate summary statistics for the review ratings, such as the mean, median, and distribution.
     - Examine how review ratings vary based on customer attributes (e.g., age, gender, subscription status).
     - Assess if there are any seasonal trends in review ratings.
     - Explore whether there are differences in review ratings based on the preferred payment method or purchase frequency.


5. **Subscription Status Analysis**:
   - *Objective*: Investigate the distribution of subscription status (Yes/No) among customers and its impact on purchase behavior.
   - *Steps*:
     - Group the data by "Subscription Status."
     - Calculate statistics on purchase behavior, such as the average purchase amount or frequency, for each group.

6. **Location-Based Analysis**:
   - *Objective*: Analyze customer demographics and purchasing behavior by location.
   - *Steps*:
     - Group the data by "Location."
     - Calculate statistics, like average purchase amount or preferred payment methods, for different regions.

7. **Discount and Promo Code Usage**:
   - *Objective*: Examine the prevalence of discount and promo code usage among customers.
   - *Steps*:
     - Use the `value_counts()` function to count occurrences of "Discount Applied" and "Promo Code Used."
     - Calculate the percentage of customers using discounts or promo codes.

8. **Customer Purchase Patterns**:
   - *Objective*: Explore patterns in customer purchase frequencies and analyze factors influencing purchase behavior.
   - *Steps*:
     - Group data by "Frequency of Purchases" (e.g., Weekly, Fortnightly).
     - Calculate statistics on purchase behavior, such as average purchase amount, within each group.

9. **Customer Age Groups**:
   - *Objective*: Categorize customers into age groups (e.g., 18-25, 26-35, etc.) and assess their purchasing habits.
   - *Steps*:
     - Use pandas to categorize customers by age groups.
     - Analyze the distribution of customers within each age group and their purchasing patterns.

10. **Item Category Analysis**:
    - *Objective*: Group data by "Category" and analyze statistics for each category, e.g., average purchase amount.
    - *Steps*:
      - Group data by "Category" using pandas.
      - Calculate statistics, such as average purchase amount, for each product category.
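
The three pandas techniques that recur across the objectives above (grouping with summary statistics, `value_counts()` percentages, and age binning) can be sketched as follows. This is a minimal illustration, not a solution: the `toy` DataFrame holds made-up rows whose column names follow the attribute table, and the age bin edges and labels are arbitrary assumptions.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the attribute table.
toy = pd.DataFrame({
    "Age": [19, 24, 31, 45, 52, 67],
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Category": ["Clothing", "Footwear", "Clothing",
                 "Accessories", "Clothing", "Footwear"],
    "Purchase Amount (USD)": [50, 80, 30, 20, 70, 60],
    "Payment Method": ["Cash", "PayPal", "Cash", "Venmo", "Cash", "PayPal"],
})

# Objectives 1, 2, 10: group by an attribute and summarise a numeric column.
category_stats = (
    toy.groupby("Category")["Purchase Amount (USD)"]
    .agg(["mean", "median", "count"])
)

# Objectives 3, 7: share of rows per value, expressed as percentages.
payment_pct = toy["Payment Method"].value_counts(normalize=True) * 100

# Objective 9: bin ages into groups. The edges and labels are arbitrary choices;
# with the default right-closed intervals, (17, 25] covers ages 18 to 25.
bins = [17, 25, 35, 50, 100]
labels = ["18-25", "26-35", "36-50", "51+"]
toy["Age Group"] = pd.cut(toy["Age"], bins=bins, labels=labels)
age_group_stats = (
    toy.groupby("Age Group", observed=True)["Purchase Amount (USD)"].mean()
)

print(category_stats)
print(payment_pct)
print(age_group_stats)
```

On the real dataset, the same calls would be applied to the `data` variable instead of `toy`, swapping in whichever grouping column the objective names (for example "Subscription Status", "Location", or "Frequency of Purchases").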

## Project Starts Here

## Setting Up the Path
In the cells below, the `PATH` variable is set to the current working directory, i.e. the directory where the notebook is open. This makes it easy to load the dataset file from that location.

In [2]:
import pandas as pd

In [None]:
# Run this cell
import os
PATH = os.getcwd() + '/'
PATH

**ONLY FOR GOOGLE COLAB USERS**

For those who are using **Google Colab**, uncomment and run the cell below.

**Note**: You have to replace the value of the variable `YOUR_PATH_TO_DATASET_DIRECTORY` with the path of the Google Drive folder where your dataset is placed.



In [None]:
# from google.colab import drive
# drive.mount('/content/drive/')
# YOUR_PATH_TO_DATASET_DIRECTORY = "work/Applied_Data_Lab/phase_2"
# PATH = "/content/drive/MyDrive/"+YOUR_PATH_TO_DATASET_DIRECTORY+"/"
# PATH

Importing the `shopping_trends.csv` file into the `data` variable.
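
A minimal sketch of this import step, assuming `shopping_trends.csv` sits in the directory named by `PATH`. The helper name `load_shopping_data` is hypothetical and not part of the original notebook; `os.path.join` is used so the path works with or without a trailing slash.

```python
import os
import pandas as pd

def load_shopping_data(directory, filename="shopping_trends.csv"):
    """Read the dataset CSV from `directory` into a DataFrame (hypothetical helper)."""
    csv_path = os.path.join(directory, filename)
    return pd.read_csv(csv_path)

# Assumes the CSV has been placed next to the notebook; skipped otherwise.
PATH = os.getcwd() + "/"
if os.path.exists(os.path.join(PATH, "shopping_trends.csv")):
    data = load_shopping_data(PATH)
    print(data.shape)   # the dataset description mentions 3900 records
    print(data.head())  # quick look at the first rows
```

Checking `data.shape` and `data.head()` right after loading is a quick way to confirm that the file was found and that the columns match the attribute table.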
