<div style="background-color:#e68867; padding:10px; border:2px solid black;">
    <h1><b>Quality | Ethics | Transparency</b></h1>
</div>

<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
<h2>🌿 Cannabis Market Snapshot: Québec’s SQDC Landscape (July 2025)</h2>
    
🔹 **Author** : MagaliTrueAnalytics  
🔹 **Date** : 2025-07-19  
🔹 **Sources** : [Project data folder on GitHub](https://github.com/MagaliTrueAnalytics/Portfolio/blob/main/Projet6/Data/)  
🔹 **Objective** : This analysis aims to provide a comprehensive snapshot of Québec’s cannabis retail ecosystem as of July 2025.

By combining manually collected data and web-scraped datasets, the project delivers an in-depth portrait of:

- 🧬 **Product Offering**: Distribution of cannabis products across key categories (e.g. dried flower, hashish, pre-rolls), strain types (Indica, Sativa, Hybrid), and dominance profiles (THC, CBD, Balanced)
- 💰 **Pricing Trends**: Average prices, price ranges, and comparative analysis across strains, producers, and categories
- 🏪 **Retail Landscape**: Mapping of SQDC store expansion across Québec and of catalog diversity by producer province
- 📈 **Benchmarking Tool**: An interactive framework designed to support producers and stakeholders in comparing their brand’s retail presence, pricing strategy, and catalog footprint against competitors

The Power BI dashboard acts as both an analytical tool and strategic resource for anyone seeking to better understand Québec’s regulated cannabis market—from data enthusiasts to industry professionals.
</div>

<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
    
<h2>📈 Dataset Presentation</h2>
</div>

This analysis relies on a combination of **manually collected** and **web-scraped datasets**, focusing on the SQDC cannabis offering in Québec as of **July 2025**.

### 📝 1. SQDC Website (Manual Collection)
Between **July 13 and 17, 2025**, product-level data was manually collected from the official SQDC website. The focus was on four key product categories representing over 80% of the catalog:
- *Dried flower*
- *Ground cannabis*
- *Pre-rolls*
- *Hashish*

Collected information includes:
- Producer and brand names
- Product pricing
- Strain type (Indica, Sativa, Hybrid)
- Dominance profile (THC/CBD/Balanced)
- Category breakdown

A separate Excel file also consolidates high-level stats such as:
- Total number of products across all categories (including beverages, edibles, extracts)
- Distribution by strain, dominance, and category

### 🕸️ 2. Weedcrawler.ca (Web Scraped Data)

On **July 17–18, 2025**, authorized web scraping was performed on [quebec.weedcrawler.ca](https://quebec.weedcrawler.ca) using Python scripts executed in **Visual Studio Code**.

Extracted data includes:
- SQDC store names and their opening dates
- Brands and products listed per store

Three custom Python scripts used during this process are available in this repository:
- `weedcrawler.py`: Extracts SQDC store names and their opening dates
- `SQDC_StoreID.py`: Retrieves unique store IDs to streamline store-specific URL navigation
- `Produits_succursales.py`: Collects detailed product information per store

All data was exported and saved as CSV files for seamless integration into the analysis workflow.
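
As a rough illustration of the extract-and-export step, the sketch below parses store names and opening dates from an HTML table and writes them out as CSV. It is not the content of `weedcrawler.py`: the markup, the `StoreTableParser` class, the column names, and the sample rows are all hypothetical, since the real structure of quebec.weedcrawler.ca is not described here.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup and made-up rows, used only to exercise the parser.
SAMPLE_HTML = """
<table>
  <tr><td>Store A</td><td>2018-10-17</td></tr>
  <tr><td>Store B</td><td>2019-06-03</td></tr>
</table>
"""


class StoreTableParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())


def stores_to_csv(html):
    """Parse the table and return its rows as CSV text with a header line."""
    parser = StoreTableParser()
    parser.feed(html)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Succursale", "Date_Ouverture"])  # hypothetical column names
    writer.writerows(parser.rows)
    return buffer.getvalue()


csv_text = stores_to_csv(SAMPLE_HTML)
```

In a real run the HTML would come from an HTTP response rather than a string, and the CSV text would be written to a file such as `Succursales_map.csv`.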

### 🌍 3. Location Enhancement
The store dataset was enriched with **latitude and longitude** information:
- Geo-coordinates for each city were generated using a GenAI tool
- For cities with multiple stores (e.g. Montréal, Québec City, Laval), precise geo-coordinates were manually retrieved via Google Maps to avoid limitations in scraping and reduce API complexity

### 🆔 4. Data Preprocessing
Before importing into Power BI:
- Unique IDs were generated for producers and brands using randomized Excel formulas
- Data cleaning and structuring were carried out in **Power Query**, including transformations, table relations, and schema refinement

<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
<h2>🧹 ETL Process: Data Extraction, Transformation & Loading</h2>
</div>

The Power BI model ingests five data tables loaded into Power Query from the following sources:

- **Produits_Marques_Prix_SQDC.xlsx** (manually collected from the official SQDC website), which includes three sheets:
  - `Producteurs_SQDC`
  - `Marques_Producteurs`
  - `Produits_Prix`
- **Succursales_map.csv** and **Succursales_Produits.csv** (web-scraped using Python)

#### 🔧 Data Cleaning & Transformation

Extensive data cleansing operations were applied to ensure consistency and enable smooth semantic modeling. Key challenges emerged due to subtle text differences between manually collected and scraped datasets (e.g. apostrophes and special characters).

Steps performed across all tables include:
- Trimming and cleaning text fields
- Converting text to uppercase
- Replacing inconsistent values
- Updating data types (e.g. price fields converted to fixed decimal numbers)
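
The cleaning itself was done in Power Query, but the same steps can be sketched in Python to show why apostrophes and special characters matter for matching. The function names, the list of apostrophe variants, and the French-style price format (`7,32 $`) are assumptions made for illustration.

```python
import unicodedata
from decimal import Decimal

# Curly apostrophes and look-alikes mapped to a plain ASCII apostrophe.
APOSTROPHES = {"\u2019": "'", "\u2018": "'", "\u02bc": "'", "`": "'"}


def clean_text(value):
    """Normalize Unicode form, unify apostrophes, tidy whitespace, uppercase."""
    text = unicodedata.normalize("NFC", value)
    for variant, plain in APOSTROPHES.items():
        text = text.replace(variant, plain)
    # Trims both ends and also collapses repeated inner spaces.
    text = " ".join(text.split())
    return text.upper()


def clean_price(value):
    """Parse a price string such as '7,32 $' into a two-decimal Decimal."""
    digits = value.replace("$", "").replace("\u00a0", " ").strip().replace(",", ".")
    return Decimal(digits).quantize(Decimal("0.01"))


# The same label typed two ways ends up identical after cleaning.
scraped = clean_text("  L\u2019Herbe  verte ")
manual = clean_text("L'herbe verte")
price = clean_price("7,32 $")
```

Using `Decimal` rather than `float` mirrors the fixed decimal type chosen in Power BI and avoids rounding drift when prices are averaged.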

#### 🧩 Table Preparation for Modeling

- **Succursales_Produits**
  - Split the `Marque-Nom` column using the `-` delimiter
  - Created a new `Marque_Nom_Clean` column using M code

- **Marques_Producteurs**
  - Added `Marque_Nom_Clean` column for matching purposes

- **Produits_Dominance**
  - Created by duplicating `Produits_Prix` and retaining only relevant fields: `Categorie`, `Marque`, `Dominance`, and `Souche`

- **Succursales_Produits** (continued)
  - Left outer joined with `Marques_Producteurs` on `Marque_Nom_Clean` to add `Marque_ID` and `Producteur_ID`
  - Left outer joined with `Produits_Dominance` using three keys (`Categorie`, `Marque_ID`, `Souche`) to add `Dominance`
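
The split-then-join logic above can be sketched in plain Python. The actual M code is not shown in this document, so the rule used to build `Marque_Nom_Clean` (uppercase, alphanumeric characters only), the helper names, and the sample rows are assumptions.

```python
def split_marque_nom(value, delimiter="-"):
    """Split 'BRAND - PRODUCT' at the first delimiter into (brand, product)."""
    brand, _, product = value.partition(delimiter)
    return brand.strip(), product.strip()


def make_key(brand):
    """Assumed cleaning rule: uppercase, keeping only letters and digits."""
    return "".join(ch for ch in brand.upper() if ch.isalnum())


def left_join(left_rows, right_rows, key, added_fields):
    """Left outer join: keep every left row; unmatched rows get None."""
    lookup = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = lookup.get(row[key], {})
        joined.append({**row, **{field: match.get(field) for field in added_fields}})
    return joined


# Made-up rows standing in for Succursales_Produits and Marques_Producteurs.
store_products = [
    {"Marque-Nom": "Brand A - Product 1"},
    {"Marque-Nom": "Brand Z - Product 9"},
]
for row in store_products:
    brand, _product = split_marque_nom(row["Marque-Nom"])
    row["Marque_Nom_Clean"] = make_key(brand)

brands = [{"Marque_Nom_Clean": "BRANDA", "Marque_ID": "M001", "Producteur_ID": "P001"}]
result = left_join(store_products, brands, "Marque_Nom_Clean", ["Marque_ID", "Producteur_ID"])
```

Rows left with `None` after the join are worth inspecting: they usually point to a spelling difference between the two sources, or to a brand name that itself contains the delimiter.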

#### 🔐 Anonymization Strategy

To protect brand and producer identities:
- Added `Producteur_Alias` column in `Producteurs_SQDC` and `Marques_Producteurs` via M code
- Added `Marque_Alias` column in `Marques_Producteurs`
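
One simple way to build such alias columns is to number the distinct IDs, as in the sketch below. The alias format (`Producteur 001`) and the `build_aliases` helper are hypothetical; the M code used in the project may follow a different rule.

```python
def build_aliases(ids, prefix):
    """Map each distinct ID to a stable alias such as 'Producteur 001'."""
    ordered = sorted(set(ids))
    width = max(3, len(str(len(ordered))))  # zero-pad to at least 3 digits
    return {
        id_: f"{prefix} {number:0{width}d}"
        for number, id_ in enumerate(ordered, start=1)
    }


# Made-up IDs; duplicates collapse to a single alias.
aliases = build_aliases(["P20", "P03", "P20"], "Producteur")
```

Because the alias is derived from the ID rather than from the name, the same producer receives the same alias in every table that carries `Producteur_ID`.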

#### 🧮 Filtering for Active Producers

To focus the analysis on currently active producers:

- Created the `Producteurs_ID_Actifs` table by duplicating `Succursales_Produits`, extracting `Producteur_ID`, and removing duplicates.  
- This table was **not** loaded into the Power BI model but used as a filter via inner joins with:
  - `Marques_Producteurs`
  - `Producteurs_SQDC`
  - `Produits_Prix`

Only producers referenced in the **Weedcrawler** dataset were retained for final analysis. 
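
In Python terms, this filter amounts to collecting the distinct producer IDs and keeping only the rows that match them. The sketch below uses made-up IDs and hypothetical helper names.

```python
def active_producer_ids(store_products):
    """Distinct, non-empty Producteur_ID values seen in the store listings."""
    return {row["Producteur_ID"] for row in store_products if row.get("Producteur_ID")}


def inner_join_on_active(rows, ids):
    """Inner-join equivalent: keep only rows whose producer is active."""
    return [row for row in rows if row.get("Producteur_ID") in ids]


# Illustrative rows with made-up IDs.
store_products = [
    {"Producteur_ID": "P001"},
    {"Producteur_ID": "P001"},
    {"Producteur_ID": None},  # a brand that found no match in the earlier join
    {"Producteur_ID": "P003"},
]
producers = [
    {"Producteur_ID": "P001"},
    {"Producteur_ID": "P002"},
    {"Producteur_ID": "P003"},
]

ids = active_producer_ids(store_products)
active_producers = inner_join_on_active(producers, ids)
```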

🔗[Link to raw data and Python scripts](https://github.com/MagaliTrueAnalytics/Portfolio/tree/main/Projet6/Data)

<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
<h2>📊 Analysis & Visualization</h2>
</div>

The report was designed using dynamic visualizations supported by custom DAX measures to derive insights from structured data. Three key measures were created:

- `PrixMoyen`: Calculates the average product price
- `SuccursalesCumulees`: Tracks the cumulative expansion of SQDC stores over time
- `TotalReferences`: Counts the total number of product references listed in stores

#### 🧭 Page 1 — *Overview_Offer*

This page provides a high-level view of SQDC’s product coverage and store distribution:

- 🗺️ **Map**: SQDC Store Network and Reference Count by Location
- 📈 **Area Chart**: Cumulative Store Extension Over Time
- 🍩 **Donut Chart**: Share of SQDC Product Offering by Province
- 📊 **Bar Chart**: Producer Catalog Presence at SQDC, color-coded by province
- 🌳 **Tree Map**: Product Breakdown by Category and Strain
- 🍩 **Donut Chart**: Product Distribution by Category
- 🍩 **Donut Chart**: Product Distribution by Dominance Type

#### 💰 Page 2 — *Overview_Price*

This section focuses on pricing insights across product categories and strain profiles:

- 📦 **Box Plot**: Price Distribution by Product Category
- 📊 **Stacked Column Chart**: Product Spread Across Price Ranges by Category
- 📈 **Multi-Column Chart**: Price Trends by Strain, Product Category, and Dominance Profile
- 📊 **Stacked Bar Chart**: Average Price by Category and Strain Type

#### 🧪 Page 3 — *Benchmarking*

This dashboard enables focused benchmarking analysis with filterable producer views:

- 🎛️ Filters: by Product Category and by Producteur_Alias
- 🧾 KPI Cards:
  - Total Unique Products Referenced in SQDC Stores
  - Retail Presence Across Store Locations
  - Total Product References in All SQDC Stores
  - Average Price per Gram
- 📊 **Bar Chart**: Producer Catalog Presence by Province
- 📋 **Table View**: Displays detailed pricing breakdown—Product Category, Strain, Minimum Price, Average Price, Maximum Price—based on producer selection
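
The table view boils down to a grouped minimum, average, and maximum for the selected producer. Below is a minimal Python sketch of that aggregation; the `Prix_g` column name, the alias values, and the prices are illustrative and not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def price_summary(products, producer_alias):
    """Min / average / max price per (category, strain) for one producer."""
    groups = defaultdict(list)
    for product in products:
        if product["Producteur_Alias"] == producer_alias:
            groups[(product["Categorie"], product["Souche"])].append(product["Prix_g"])
    return {
        key: {"min": min(prices), "avg": round(mean(prices), 2), "max": max(prices)}
        for key, prices in sorted(groups.items())
    }


# Made-up rows; prices are in CAD per gram.
products = [
    {"Producteur_Alias": "Producteur 001", "Categorie": "Dried flower", "Souche": "Indica", "Prix_g": 7.0},
    {"Producteur_Alias": "Producteur 001", "Categorie": "Dried flower", "Souche": "Indica", "Prix_g": 8.0},
    {"Producteur_Alias": "Producteur 001", "Categorie": "Hashish", "Souche": "Hybrid", "Prix_g": 15.0},
    {"Producteur_Alias": "Producteur 002", "Categorie": "Hashish", "Souche": "Hybrid", "Prix_g": 12.0},
]

summary = price_summary(products, "Producteur 001")
```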

🔗[Link to reports](https://github.com/MagaliTrueAnalytics/Portfolio/tree/main/Projet6/Report)


<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
<h2>📣 Data Insights Analysis</h2>
</div>

#### 🏪 SQDC Retail Expansion

- As of July 2025, **107 SQDC stores** are operating across Québec—up from **11 in 2018**.
- Store openings per year show rapid growth between 2018 and 2021, followed by deceleration:
  - 2018 (11), 2019 (21), 2020 (23), 2021 (25), 2022 (11), 2023 (6), 2024 (4), 2025 (6).
- The Montréal Saint-Hubert store is listed as permanently closed in its Google business listing, so it was removed from the data.
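
The yearly openings listed above add up to the 107-store total, which a running sum makes easy to check. This short Python sketch mirrors the idea behind the `SuccursalesCumulees` measure; the DAX itself is not reproduced here.

```python
from itertools import accumulate

# Store openings per year, as listed above.
openings = {2018: 11, 2019: 21, 2020: 23, 2021: 25, 2022: 11, 2023: 6, 2024: 4, 2025: 6}

# Running total of stores at the end of each year.
cumulative = dict(zip(openings, accumulate(openings.values())))

# Share of today's network that was already open by the end of 2021.
share_by_2021 = round(cumulative[2021] / cumulative[2025] * 100, 1)
```

By the end of 2021 the network had reached 80 stores, roughly three quarters of the July 2025 total, which is what the deceleration after 2021 looks like in cumulative terms.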

#### 🍁 Market Representation by Producer

- **Québec-based producers dominate** SQDC’s in-store product listings, representing **74.45%**, followed by **Ontario at 17.56%**.
- The top five producers—**all from Québec**—account for a significant share of product references:
  - #1 producer: 5,084 references
  - #2 producer: 2,654 references
  - #3–#5 range from ~2,100 to ~1,900 references
- Ontario’s leading producer ranks sixth with 1,861 references.
- This reflects strong prioritization of **local producers** in the SQDC retail environment.

#### 🌿 Product Types & Strain Dominance

- Product breakdown in SQDC stores:
  - **Dried Flower**: 54.28% (19,517 references)
  - **Pre-rolls**: 29% (10,643)
  - **Hashish**: 13.5% (4,963)
  - **Ground Cannabis**: 3.23% (1,184)
- Dominance profile by reference count:
  - **THC-dominant**: 76.48%
  - **Balanced**: 15.62%
  - **CBD-dominant**: 7.9%

#### 💵 Price Analysis by Category & Strain

- **Hashish** and **Pre-rolls** show wide price variability:
  - Hashish range: 9.77 to 29.90 CAD/g (avg. 16.27 CAD/g)
  - Pre-rolls range: 4.57 to 23.50 CAD/g (avg. 10.31 CAD/g)
- **Dried Flower** pricing is more stable:
  - Avg. 7.69 CAD/g, median 7.32 CAD/g, max 14.29 CAD/g
- **Ground Cannabis** has the lowest average price:
  - Avg. 5.08 CAD/g, 75% of products below 4.79 CAD/g

- Strain-based pricing confirms:
  - *Derived strains* are most expensive across all categories except dried flower
    - Ground: 12.83 CAD/g
    - Hashish: 16.06 CAD/g
    - Pre-rolls: 16.14 CAD/g
  - No derived strains are present in dried flower

- Within the same product category, strain types **Indica**, **Sativa**, and **Hybrid** show minimal price variance:
  - Pre-rolls: ~8.68–8.96 CAD/g
  - Dried Flower: ~7.31–7.51 CAD/g
  - Ground: ~4.50–4.58 CAD/g

#### 📊 Price Distribution Patterns

- Major price ranges by category:
  - 4.71–9.42 CAD/g: Most products across categories (esp. dried flower, pre-rolls)
  - 3.14–4.71 CAD/g: Primarily ground cannabis
  - 12.56–15.70 CAD/g: Bulk of hashish products
- Outliers:
  - 2 hashish products priced above 29.80 CAD/g, both from the same brand
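
The range boundaries quoted above (3.14, 4.71, 9.42, 12.56, 15.70) are all multiples of 1.57, which suggests fixed-width bins. Under that assumption, which is inferred from the numbers rather than confirmed by the report, a price can be assigned to its bin as follows.

```python
from decimal import Decimal

# Assumption: bin width inferred from the boundaries quoted in the text.
BIN_WIDTH = Decimal("1.57")


def price_bin(price):
    """Return the (lower, upper) bounds of the bin containing the price.

    Bins are closed on the left: a price equal to a boundary falls in the
    bin that starts at that boundary.
    """
    value = Decimal(str(price))
    index = int(value // BIN_WIDTH)
    lower = index * BIN_WIDTH
    return lower, lower + BIN_WIDTH


# Category averages taken from the price analysis section.
dried_flower_bin = price_bin("7.69")
ground_bin = price_bin("5.08")
hashish_bin = price_bin("16.27")
```

Working in `Decimal` keeps the boundaries exact, so a price of exactly 4.71 is not pushed into the wrong bin by floating-point rounding.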

#### 🔎 Dominance vs. Pricing

- Excluding derived strains, **CBD-dominant products** are generally less expensive than THC or Balanced:
  - Exception: Hybrid pre-rolls, where THC averages slightly lower than CBD
- **Balanced products** consistently show higher average prices:
  - Dried Flower: 7.49 CAD/g (vs. THC 7.45 CAD/g, CBD 7.05 CAD/g)
  - Ground Cannabis: 8.80 CAD/g (vs. THC 8.73 CAD/g, CBD 8.69 CAD/g)
  - Hashish: 17.67 CAD/g (vs. THC 15.82 CAD/g, CBD 12.36 CAD/g)

#### 📌 Benchmarking Report

No aggregated insights are provided for the *Benchmarking* page because it is designed for individual use. Producers can apply its filters (`Category`, `Producteur_Alias`) to view KPIs and pricing tables specific to their own catalog and retail footprint.

<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
<h2>🙏 Acknowledgements</h2>
</div>

A heartfelt thank you to **Alexandre Voyer**, creator of the [Weedcrawler](https://weedcrawler.ca/) platform, for generously granting permission to access and scrape data from his site. Weedcrawler is a brilliantly designed, user-friendly tool that provides real-time visibility into SQDC stock levels—an essential resource for cannabis producers seeking to stand out in a competitive market.

This project would not have been possible without Alexandre’s openness and support. His platform played a pivotal role in enabling meaningful analysis and insights throughout this report.

<div style="background-color:#c7cbe9; padding:10px;border:2px solid black;">
<h2>🛡️ Intellectual Property & Usage Notice</h2>
</div>

This notebook and its contents are the result of independent data collection, transformation, and visualization efforts as part of the **Cannabis Market Snapshot: Québec’s SQDC Landscape (July 2025)** project. The materials, analyses, and visualizations presented here are for educational and informational purposes only.

Please respect the terms of use of the original data sources, including [Weedcrawler.ca](https://quebec.weedcrawler.ca), and do not reuse the scraped data commercially without appropriate authorization. Any benchmarking insights are intended to support fair market transparency and strategic exploration within the cannabis industry.

The author retains rights over the code, structure, and curated visualizations. If you wish to reference or reuse elements of this project, please credit appropriately or reach out for collaboration.