# 🛒 Product Brand Sales Analysis Tool
This notebook takes a product keyword (e.g., 'milk', 'eggs') and does the following:
- Identifies the top-selling brand in each store
- Saves results to a CSV file
- Displays the cheapest matching product
- Compares the cheapest options between Woolworths and Coles

In [1]:
# Importing required libraries and loading datasets
import pandas as pd

# Load Woolworths and Coles transaction data
woolies_df = pd.read_csv('customer_transactions_woolies.csv')
coles_df = pd.read_csv('customer_transactions_coles.csv')

## Tag and Combine Store Data
We add a `Store` column to distinguish between Woolworths and Coles, and combine them into a single dataset for analysis.

In [2]:
# Tag each dataset with the store name and combine them
woolies_df['Store'] = 'Woolworths'
coles_df['Store'] = 'Coles'

# Combine both datasets into one
combined_df = pd.concat([woolies_df, coles_df], ignore_index=True)

## Brand Extraction
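For simplicity, we assume the first word in the `ItemName` is the brand. This is a rough heuristic: it works for common formats like 'Coles Free Range Eggs', but it truncates multi-word brands ('Nong Shim' becomes 'Nong') and treats spelling variants such as "Smith's" and "Smiths" as separate brands.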

In [3]:
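# Extract the brand name (first word of the item name)
def extract_brand(item_name):
    # Non-string values (e.g. NaN) have no brand
    if not isinstance(item_name, str):
        return None
    # Blank or whitespace-only names split into an empty list, so guard before indexing
    words = item_name.split()
    return words[0] if words else None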

# Apply brand extraction to each row
combined_df['Brand'] = combined_df['ItemName'].apply(extract_brand)
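
The first-word heuristic splits multi-word brand names (in the outputs further down, 'Nong Shim' is reported as 'Nong'). One possible refinement, sketched here as an assumption rather than something the notebook does, is to check a short list of known brands before falling back to the first word. `KNOWN_BRANDS` and `extract_brand_with_lookup` are hypothetical names, and the list would need to be maintained by hand.

```python
import pandas as pd

# Hypothetical, hand-maintained list of multi-word brands (extend as needed)
KNOWN_BRANDS = ["Nong Shim"]

def extract_brand_with_lookup(item_name, known_brands=KNOWN_BRANDS):
    """Return a known multi-word brand if the name starts with one, else the first word."""
    if not isinstance(item_name, str):
        return None
    name = item_name.strip()
    if not name:
        return None
    lowered = name.lower()
    # Check longer brand names first so the most specific match wins
    for brand in sorted(known_brands, key=len, reverse=True):
        prefix = brand.lower()
        if lowered == prefix or lowered.startswith(prefix + " "):
            return brand
    # Fall back to the original first-word heuristic
    return name.split()[0]

sample = pd.Series([
    "Nong Shim Shrimp Meat Chips 75g",
    "Coles Free Range Eggs",
    None,
])
sample_brands = sample.apply(extract_brand_with_lookup)
print(sample_brands)
```

Names that match no known brand behave exactly as before, so the lookup only changes rows whose brand is on the list.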

## Top Brand Calculation Logic
This function filters the data based on the input keyword, aggregates quantities per brand and store, and returns the top-selling brand per store.

In [4]:
# Function to find top-selling brands based on a search keyword
def top_brands_by_keyword(df, keyword):
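    # Filter products whose name contains the keyword
    # (case-insensitive, literal match: regex=False keeps characters like '+' or '(' from being parsed as a pattern)
    filtered_df = df[df['ItemName'].str.contains(keyword, case=False, na=False, regex=False)]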
    
    # Group by Store and Brand, sum quantities
    grouped = filtered_df.groupby(['Store', 'Brand'])['Quantity'].sum().reset_index()
    
    # Get the top-selling brand in each store
    top_brands = grouped.loc[grouped.groupby('Store')['Quantity'].idxmax()]
    return top_brands
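
To make the group-then-`idxmax` step easier to follow, here is a minimal sketch on a toy table. The rows and the brand `OtherBrand` are invented for illustration and are not taken from the real datasets.

```python
import pandas as pd

# Toy data invented for illustration only
toy_df = pd.DataFrame({
    "Store": ["Coles", "Coles", "Coles", "Woolworths", "Woolworths"],
    "Brand": ["Coles", "Smiths", "Coles", "Smiths", "OtherBrand"],
    "Quantity": [3, 2, 4, 5, 1],
})

# Step 1: total quantity per (Store, Brand) pair
totals = toy_df.groupby(["Store", "Brand"])["Quantity"].sum().reset_index()

# Step 2: for each store, idxmax returns the row label of the largest total
winner_labels = totals.groupby("Store")["Quantity"].idxmax()

# Step 3: look those labels up to get one row per store
toy_top = totals.loc[winner_labels]
print(toy_top)
```

Note that `idxmax` returns only the first label at the maximum, so if two brands tie within a store, just one of them is reported.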

## User Input and Brand Analysis
Here, the user inputs a keyword and the tool outputs the top-selling brand in each store. It also saves the result to a CSV file.

In [5]:
# Ask user to input a keyword
keyword = input("Enter a keyword to search for (e.g., 'milk'): ")

# Call the analysis function
top_brands = top_brands_by_keyword(combined_df, keyword)

# Display results
print("\nTop Selling Brand(s) by Store for keyword '{}':\n".format(keyword))
print(top_brands)

# Save the results to a CSV file
top_brands.to_csv('top_brands_by_store.csv', index=False)
print("\nSaved results to 'top_brands_by_store.csv'")


Top Selling Brand(s) by Store for keyword 'Chips':

         Store    Brand  Quantity
6        Coles    Coles       425
76  Woolworths  Smith's       852

Saved results to 'top_brands_by_store.csv'


## Absolute Cheapest Product (Across All Stores)
This section identifies the **single cheapest product** from both stores based on the entered keyword.
It filters all matching products and selects the one with the lowest unit price overall.

In [8]:
# Find the cheapest product among the filtered results
filtered_df = combined_df[combined_df['ItemName'].str.contains(keyword, case=False, na=False, regex=False)]
cheapest_product = filtered_df.loc[filtered_df['UnitPrice'].idxmin()]

print("\nCheapest product matching the keyword '{}':\n".format(keyword))
print(cheapest_product[['Store', 'ItemName', 'Brand', 'UnitPrice']])


Cheapest product matching the keyword 'Chips':

Store                                              Woolworths
ItemName     Nong Shim Shrimp Meat Chip Shrimp Meat Chips 75g
Brand                                                    Nong
UnitPrice                                                1.45
Name: 27172, dtype: object


**Conclusion**: This is the product with the lowest unit price **regardless of store**. It represents the best option in terms of cost alone among all matched results.
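
`idxmin` returns a single row even when several products share the lowest price. As a hedged alternative, `DataFrame.nsmallest` with `keep='all'` returns every row tied at the minimum. The sketch below uses invented toy rows, not the real data.

```python
import pandas as pd

# Toy rows invented for illustration; two products tie at the lowest price
toy_prices = pd.DataFrame({
    "Store": ["Coles", "Woolworths", "Woolworths"],
    "ItemName": ["Chips A", "Chips B", "Chips C"],
    "UnitPrice": [1.50, 1.45, 1.45],
})

# keep='all' retains every row that ties for the smallest UnitPrice
cheapest_rows = toy_prices.nsmallest(1, "UnitPrice", keep="all")
print(cheapest_rows)
```

On transaction-level data the same product can appear in many rows, so it may be worth dropping duplicate `ItemName` values before reading the result.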

## Cheapest Product Per Store & Comparison
This section finds the cheapest product **in each store individually** and compares them.
It helps determine which store offers a better price for the searched keyword.

In [9]:
# Compare cheapest product across Woolworths and Coles
filtered_df = combined_df[combined_df['ItemName'].str.contains(keyword, case=False, na=False, regex=False)]
cheapest_per_store = filtered_df.loc[filtered_df.groupby('Store')['UnitPrice'].idxmin()]

# Display cheapest product from each store
print("\nCheapest product(s) matching the keyword '{}':\n".format(keyword))
print(cheapest_per_store[['Store', 'ItemName', 'Brand', 'UnitPrice']])


Cheapest product(s) matching the keyword 'Chips':

             Store                                          ItemName   Brand  \
582861       Coles   Smiths Grainwaves Chips Sour Cream Chives | 40g  Smiths   
27172   Woolworths  Nong Shim Shrimp Meat Chip Shrimp Meat Chips 75g    Nong   

        UnitPrice  
582861       1.50  
27172        1.45  


In [10]:
# Identify which store offers the absolute cheapest product
min_price_row = cheapest_per_store.loc[cheapest_per_store['UnitPrice'].idxmin()]
print("\n The cheapest product is in {} with '{}' at $ {:.2f}".format(
    min_price_row['Store'],
    min_price_row['ItemName'],
    min_price_row['UnitPrice']
))


 The cheapest product is in Woolworths with 'Nong Shim Shrimp Meat Chip Shrimp Meat Chips 75g' at $ 1.45


**Conclusion**: For the keyword 'Chips', the cheapest product across both stores was offered by **Woolworths** ($1.45, versus $1.50 at Coles).

This lets users make informed purchasing decisions based on store-wise pricing.
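
The per-store comparison can also be wrapped in a small reusable helper that reports the price gap between stores. This is a sketch under assumptions: `cheapest_by_store` is a hypothetical name, and the toy rows only mirror the two chip prices shown in the output above, with the remaining rows invented.

```python
import pandas as pd

def cheapest_by_store(df, keyword):
    """Cheapest matching row per store, sorted by price (hypothetical helper)."""
    mask = df["ItemName"].str.contains(keyword, case=False, na=False, regex=False)
    matches = df[mask]
    if matches.empty:
        # Nothing matched; return the empty frame and let the caller decide
        return matches
    cheapest = matches.loc[matches.groupby("Store")["UnitPrice"].idxmin()]
    return cheapest.sort_values("UnitPrice").reset_index(drop=True)

# Toy rows: the 1.50 and 1.45 prices mirror the output above, the rest is invented
toy_df = pd.DataFrame({
    "Store": ["Coles", "Coles", "Woolworths", "Woolworths"],
    "ItemName": [
        "Smiths Grainwaves Chips Sour Cream Chives | 40g",
        "Coles Full Cream Milk",
        "Nong Shim Shrimp Meat Chip Shrimp Meat Chips 75g",
        "Smith's Crinkle Cut Chips",
    ],
    "UnitPrice": [1.50, 1.60, 1.45, 4.00],
})

result = cheapest_by_store(toy_df, "chips")
# Gap between the most and least expensive of the per-store minimums
gap = result["UnitPrice"].iloc[-1] - result["UnitPrice"].iloc[0]
print(result[["Store", "UnitPrice"]])
print("Price gap between stores: ${:.2f}".format(gap))
```

Returning the empty frame when nothing matches avoids the error that `idxmin` raises on empty input, which can otherwise happen when a keyword has no hits.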