In [1]:
import pandas as pd
import requests
from bs4 import BeautifulSoup

# 📊 Executive Summary – Retail Store Scraping Project (Belgium)

## 📁 Project Scope
This freelance project involved extracting and structuring store location data from **9 major retail chains operating in Belgium**. Each target website presented unique technical challenges, including dynamic rendering, JavaScript-heavy interfaces, GraphQL endpoints, and nested JSON APIs.

The client required complete store-level data across the following attributes:

- ✅ Location Name  
- ✅ Country  
- ✅ Full Address  
- ✅ Latitude & Longitude  
- ✅ Parking Type  
- ✅ Contact Phone & Email  
- ✅ Contact First Name & Last Name (if available)  

## 🧠 Technical Strategy
To deliver accurate and reliable datasets under a 3-day deadline, I applied a diverse set of scraping techniques:

| Technique              | Application                                                   |
|------------------------|----------------------------------------------------------------|
| `requests + BeautifulSoup` | For parsing static HTML and detail pages (e.g., Medi-Market, AVA) |
| `Selenium`             | To render JS-heavy pages and trigger dynamic scroll/load (e.g., Trafic, Maxi Zoo) |
| `GraphQL API`          | Reverse-engineered Action's persisted query structure         |
| `REST API (Hidden)`    | Used for Carrefour, Krëfel, Brico                             |
| `Geocoding (Google Maps API)` | Used to enrich addresses with lat/lng when coordinates were missing |
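
The geocoding row above can be illustrated with a small sketch. It assumes the Google Geocoding API's JSON endpoint and response layout (`results[0].geometry.location`). The helper names `build_geocode_url` and `extract_lat_lng`, the placeholder key, and the sample payload are illustrative and not taken from the project code.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Return the request URL for geocoding a single address."""
    return f"{GEOCODE_ENDPOINT}?{urlencode({'address': address, 'key': api_key})}"


def extract_lat_lng(payload):
    """Return (lat, lng) from a geocoding JSON payload, or None if nothing matched."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written sample payload (illustrative values, not real API output).
sample = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 50.85, "lng": 4.35}}}],
}

print(build_geocode_url("Grand Place, Brussels", "YOUR_API_KEY"))
print(extract_lat_lng(sample))
print(extract_lat_lng({"status": "ZERO_RESULTS", "results": []}))
```

Keeping the parsing step separate from the HTTP call makes the fallback easy to check: an address without a match yields `None`, which can then be mapped to whatever placeholder the dataset uses.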

## 🔧 Tools & Libraries
- Python (3.10+)
- `requests`, `beautifulsoup4`, `selenium`
- `pandas`, `json`, `time`, `re`
- Google Maps API (for geolocation enrichment)
- Browser automation via ChromeDriver

## 🗂️ Websites Processed
The final dataset covered the following 9 retailers:

1. **AVA** – Multi-page scrape + detail page extraction  
2. **Carrefour** – Hidden JSON API with filtering by brand & region  
3. **Maxi Zoo** – JavaScript-rendered regional detail navigation  
4. **Brico** – Two-layer REST API from CloudFront & detail JSON  
5. **Medi-Market** – Static HTML with manual geocoding  
6. **Jumbo** – Deeply nested frontend JSON structure  
7. **Intersport** – Pre-rendered HTML copied and parsed  
8. **Krëfel** – Nested JSON with address, status, and signings  
9. **Action** – GraphQL scraping via persisted query discovery  

## 📈 Results & Delivery
- ✔️ Delivered 9 structured datasets in Excel format
- ✔️ Each dataset normalized and cleaned with fallback handling
- ✔️ Project completed within 3-day SLA, under 10-hour time budget
- ✔️ Ready for use in dashboards, store mapping, logistics, or BI tools

---

## 💡 Key Takeaways
This project showcases my ability to:
- Reverse-engineer APIs (REST, GraphQL, CDN JSON)
- Automate scraping from JS-heavy pages
- Merge multi-source data into clean, scalable formats
- Deliver professional-quality results under time constraints

🔗 This case study is available on GitHub and can be adapted to any future scraping or data extraction needs — across industries, countries, or technical stacks.




## 🛍 Project: Scraping AVA Store Locations – Belgium

### 🌐 Website Overview
AVA is a retail chain specializing in party supplies, office products, and decorations with multiple stores across Belgium. The store locator is built as a standard HTML page containing embedded attributes and links to detailed subpages for each store.

### 🚧 Challenge
Key technical aspects:
- Store data was split across **main listing blocks** and **individual detail pages**.
- Latitude and longitude were embedded as custom attributes (`data-lat`, `data-long`) in each `.row` block.
- Contact details like **store manager name** and **email** were only available after parsing the **detail URL** for each store.
- Layout inconsistencies and occasional missing tags required robust fallback logic.
- Risk of getting blocked during iteration over detail pages due to fast repeated requests.
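
The blocking risk in the last bullet is commonly handled by pacing requests and retrying failures with a growing delay. The sketch below is one way to do that. `fetch_with_backoff` and its parameters are hypothetical names introduced for illustration, and the fetcher is injected so the logic can be exercised without network access.

```python
import time


def fetch_with_backoff(fetch, url, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url); after a failure wait base_delay * 2**attempt, then retry."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception as error:  # broad on purpose: fetch is caller-supplied
            last_error = error
            if attempt < retries - 1:  # no point sleeping after the final attempt
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Demo with a fake fetcher that fails twice before succeeding (no network needed).
calls = {"count": 0}
delays = []


def flaky_fetch(url):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated failure")
    return f"<html>{url}</html>"


page = fetch_with_backoff(flaky_fetch, "https://example.com/store/1", sleep=delays.append)
print(page)
print(delays)
```

With `requests`, the injected fetcher could be something like `lambda u: requests.get(u, headers=headers, timeout=10)`.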

### 🧠 Solution Rationale
The strategy was to:
1. Parse the main store locator page using `BeautifulSoup` to extract:
   - Store name
   - Coordinates (from `data-lat` and `data-long`)
   - Full address
   - Phone number
   - Detail page URL
2. For each store, request the detail page and parse:
   - Contact person's full name (split into first and last name)
   - Email address (from `mailto:` links or inner span content)
3. Insert delay (`time.sleep(0.5)`) between requests to reduce the risk of blocking.

Why this works:
- Combines lightweight scraping with deep page traversal.
- Avoids heavy browser automation (Selenium), keeping the solution fast and scalable.
- Ensures full contact info extraction, aligning with the client’s required data fields.
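
The name-splitting and email-fallback rules from step 2 can be isolated into two small functions. This is a sketch of that logic rather than the exact project code: the function names are illustrative, and stripping a `?subject=...` suffix from `mailto:` links is an added assumption.

```python
def split_contact_name(full_name):
    """Split 'First Last Name' into (first, rest); use '-' for missing parts."""
    parts = full_name.split()
    if not parts:
        return "-", "-"
    return parts[0], " ".join(parts[1:]) or "-"


def pick_email(href, text):
    """Prefer the mailto: href, fall back to the anchor text, else '-'."""
    if "@" in href:
        # Drop the scheme and any query string such as ?subject=...
        return href.replace("mailto:", "", 1).split("?")[0].strip()
    if "@" in text:
        return text.strip()
    return "-"


print(split_contact_name("Jo"))
print(split_contact_name("Anne Marie Dupont"))
print(pick_email("mailto:info@example.com?subject=Hello", ""))
print(pick_email("#", "shop@example.com"))
print(pick_email("#", "Contact us"))
```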

### 📦 Data Fields Extracted
Each AVA store record includes:
- ✅ Location Name
- ✅ Country
- ✅ Coordinates (Latitude & Longitude)
- ✅ Full Address
- ✅ Parking Type (defaulted when not present)
- ✅ Contact First Name & Last Name
- ✅ Contact Email
- ✅ Contact Phone Number
- ✅ Detail URL (for validation/reference)

### 🗂 Output Format
All data was exported as a structured Excel file (`Ava Stores Belgium.xlsx`) and reviewed for completeness. The output followed the client's formatting standards and uses `-` as the fallback for missing fields.
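
Because the client asked for latitude and longitude, the combined `Coordinates` string may need to be split into two numeric values downstream. A minimal sketch, assuming the `"lat, long"` format shown in the output. `split_coordinates` is a hypothetical helper, not part of the delivered script.

```python
def split_coordinates(coords):
    """Turn 'lat, long' text into (float, float); return (None, None) if unparseable."""
    try:
        lat_text, lng_text = coords.split(",")
        return float(lat_text), float(lng_text)
    except (AttributeError, ValueError):
        # Covers None, placeholders such as 'Not available', and malformed text
        return None, None


print(split_coordinates("51.2064721, 4.3970237"))
print(split_coordinates("Not available"))
print(split_coordinates(None))
```

In pandas, the result can be expanded into two columns with `pd.DataFrame(df["Coordinates"].map(split_coordinates).tolist(), columns=["Latitude", "Longitude"], index=df.index)`.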

---

This project demonstrates end-to-end scraping from **main listings to nested detail pages**, use of **HTML attributes for geolocation**, and clean **contact extraction** — a real-world freelance case of extracting actionable retail data under deadline.


In [None]:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import time

headers = {
    "User-Agent": "Mozilla/5.0"
}

base_url = 'https://www.ava.be/fr/les-magasins'
response = requests.get(base_url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')

store_blocks = soup.find_all("div", class_="row", attrs={"data-lat": True, "data-long": True})
results = []

for block in store_blocks:
    try:
        lat = block.get("data-lat")
        long = block.get("data-long")
        coords = f"{lat}, {long}"

        name_el = block.select_one(".title a")
        name = name_el.get_text(strip=True) if name_el else "-"
        detail_url = f"https://www.ava.be{name_el['href']}" if name_el else "-"

        address_lines = block.select(".addressWrapper .span4 a")
        address1 = address_lines[1].get_text(strip=True) if len(address_lines) > 1 else "-"
        address2 = address_lines[2].get_text(strip=True) if len(address_lines) > 2 else "-"
        full_address = f"{address1}, {address2}".strip(", ")

        phone = block.select_one(".contactWrapper .span6 a[href^='tel:']")
        phone_number = phone.get_text(strip=True) if phone else "-"

        contact_first_name = "-"
        contact_last_name = "-"
        contact_email = "-"
        parking_type = "-"

        if detail_url != "-":
            try:
                detail_response = requests.get(detail_url, headers=headers)
                detail_soup = BeautifulSoup(detail_response.text, 'html.parser')

                manager_el = detail_soup.select_one(".storeManager h4")
                if manager_el:
                    full_name = manager_el.get_text(strip=True)
                    if full_name:
                        parts = full_name.split()
                        contact_first_name = parts[0]
                        contact_last_name = " ".join(parts[1:]) if len(parts) > 1 else "-"

                # Look for an explicit mailto: link in the shop info or column blocks
                email_anchor = detail_soup.select_one(".shopInfo a[href^='mailto:'], .column a[href^='mailto:']")
                if email_anchor:
                    href_email = email_anchor.get("href", "")
                    text_email = email_anchor.get_text(strip=True)

                    if "@" in href_email:
                        contact_email = href_email.replace("mailto:", "").strip()
                    elif "@" in text_email:
                        contact_email = text_email
                    else:
                        contact_email = "-"
                else:
                    contact_email = "-"
            except Exception as e:
                print(f"[!] Failed to fetch details for {name}: {e}")

        results.append({
            "Location Name": name,
            "Country": "Belgium",
            "Coordinates": coords,
            "Full Address": full_address,
            "Parking Type": parking_type,
            "Contact First Name": contact_first_name,
            "Contact Last Name": contact_last_name,
            "Contact Email": contact_email,
            "Contact Phone Number": phone_number,
            "Detail URL": detail_url
        })

        time.sleep(0.5)
    except Exception as e:
        print(f"[!] Failed to parse block: {e}")
        continue

df = pd.DataFrame(results)
df.to_excel('Final Data/Ava Stores Belgium.xlsx', index=False)

Unnamed: 0,Location Name,Country,Coordinates,Full Address,Parking Type,Contact First Name,Contact Last Name,Contact Email,Contact Phone Number,Detail URL
0,AVA Antwerpen,Belgium,"51.2064721, 4.3970237","Brederodestraat 9-15, 2018 Antwerpen",-,Jo,-,-,03 294 10 92,https://www.ava.be/fr/shop/ava-antwerpen/1941
1,AVA Geel,Belgium,"51.1464356, 4.957474299999999","Antwerpseweg 81 A, 2440 Geel",-,Kim,-,-,014 70 21 17,https://www.ava.be/fr/shop/ava-geel/1951
2,AVA Lier,Belgium,"51.147644, 4.5360279","Antwerpsesteenweg 471, 2500 Lier",-,Wim,-,-,03 289 26 56,https://www.ava.be/fr/shop/ava-lier/1973
3,AVA Mechelen,Belgium,"51.0400047, 4.4582314","Nora Tilleylaan 10, 2800 Mechelen",-,Fatima,-,-,015 68 68 51,https://www.ava.be/fr/shop/ava-mechelen/3853
4,AVA Rijkevorsel,Belgium,"51.3472399, 4.8086231","Merksplassesteenweg 106 B0001, 2310 Rijkevorsel",-,Suzy,-,-,03 502 08 15,https://www.ava.be/fr/shop/ava-rijkevorsel/3210
...,...,...,...,...,...,...,...,...,...,...
101,AVA Marche-en-Famenne,Belgium,"50.2203966, 5.3286548","-, -",-,Anne,-,-,-,https://www.ava.be/fr/shop/ava-marche-en-famen...
102,AVA Messancy,Belgium,"49.6107178, 5.8103029","-, -",-,Claire-Line,-,-,-,https://www.ava.be/fr/shop/ava-messancy/1966
103,AVA Pommerloch (Luxembourg),Belgium,"49.9634575, 5.8595363","-, -",-,Delphine,-,-,-,https://www.ava.be/fr/shop/ava-pommerloch-luxe...
104,AVA Gembloux,Belgium,"50.5594267, 4.6780236","-, -",-,Chris,-,-,-,https://www.ava.be/fr/shop/ava-gembloux/1971


In [28]:
df.to_excel('Ava Stores 2.xlsx', index=False)

## 🛒 Project: Scraping Carrefour Store Locations – Belgium

### 🌐 Website Overview
Carrefour is a multinational retail chain with multiple brand variants operating across Belgium, including Hyper, Market, Express, and Bio. Unlike other retailers, Carrefour offers a structured JSON API to serve store locator data, though it is not documented or publicly advertised.

### 🚧 Challenge
Key characteristics:
- The data is loaded dynamically through a **hidden API endpoint** (`/api/v3/locations`) that **does not appear in the static HTML** and only shows up in the browser's network traffic.
- The API includes a **bounding box filter** and brand-specific query parameters, which needed to be understood and reverse-engineered.
- Contact names and emails are not exposed in the API — requiring client-aligned fallbacks.

### 🧠 Solution Rationale
To efficiently handle this scraping task:
1. I inspected browser network traffic and discovered the **Carrefour internal API**.
2. A crafted `GET` request with appropriate headers and brand slugs returned a complete list of stores in Belgium.
3. The script extracted structured fields such as:
   - Location name
   - Full address (street, city, postal code)
   - Coordinates (latitude & longitude)
   - Parking availability (`locationItemIds` flags)
4. Contact fields not present (email, name) were defaulted to `Not Available`, as agreed with the client.

Why this worked:
- Direct API access avoids the need for rendering or HTML parsing.
- Highly reliable and scalable approach — immune to layout changes or JavaScript issues.
- Significantly faster than Selenium- or DOM-based scraping.
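
The request described in step 2 can be assembled from its parts instead of being hard-coded. The sketch below rebuilds the query string from a bounding box and a list of brand slugs. `build_locations_url` is an illustrative name, the coordinate order inside `within` is copied from the request as observed (its meaning is an assumption), and the `openOnSundaysIsChecked` flag from the original URL is left out here.

```python
from urllib.parse import urlencode

BASE_URL = "https://magasins.carrefour.be/api/v3/locations"


def build_locations_url(bbox, brand_slugs):
    """Build the store-locator URL for a bounding box and a list of brand slugs."""
    params = [("within", ",".join(str(value) for value in bbox))]
    # The endpoint expects one brandSlugs[] parameter per brand
    params += [("brandSlugs[]", slug) for slug in brand_slugs]
    # Keep commas and brackets readable, as in the request seen in the browser
    return f"{BASE_URL}?{urlencode(params, safe=',[]')}"


url = build_locations_url(
    (52.15, -1.39, 50.04, 10.29),
    ["hyper", "bio", "market", "orange", "express"],
)
print(url)
```

Building the URL this way makes it straightforward to query one brand at a time or to narrow the bounding box when checking results.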

### 📦 Data Fields Extracted
Each Carrefour store record includes:
- ✅ Location Name
- ✅ Country
- ✅ Full Address
- ✅ Coordinates (Latitude & Longitude)
- ✅ Parking Type (inferred from location service tags)
- ✅ Contact First Name, Last Name, Email, Phone (fallback: Not Available)

### 🗂 Output Format
Data was exported into a structured Excel file (`Carefour Store Data.xlsx`) and delivered within the project’s 3-day deadline.

---

This project highlights the ability to **reverse-engineer private APIs**, extract structured geo-data, and produce clean client deliverables without overengineering the solution — a fast, elegant scraping execution.


In [None]:
import requests
import pandas as pd

headers = {
    "User-Agent": "Mozilla/5.0"
}


url = "https://magasins.carrefour.be/api/v3/locations?within=52.15,-1.39,50.04,10.29&openOnSundaysIsChecked=true&brandSlugs[]=hyper&brandSlugs[]=bio&brandSlugs[]=market&brandSlugs[]=orange&brandSlugs[]=express"

response = requests.get(url, headers=headers)
data = response.json()


records = []

for item in data:
    address = item.get("address", {})

    # Basic location fields
    name = item.get("name", "Not Available")
    street = address.get("street", "Not Available")
    locality = address.get("locality", "Not Available")
    zip_code = address.get("zipCode", "Not Available")
    country = address.get("country", "Not Available")
    latitude = address.get("latitude", "Not Available")
    longitude = address.get("longitude", "Not Available")


    full_address = f"{street}, {zip_code} {locality}".strip(", ")
    coords = f"{latitude}, {longitude}"


    parking_type = "Available" if "service.carrefour.car.parking" in item.get("locationItemIds", []) else "Not Available"
    phone = "Not Available"
    email = "Not Available"
    contact_first = "Not Available"
    contact_last = "Not Available"

    records.append({
        "Location Name": name,
        "Country": country,
        "Coordinates": coords,
        "Full Address": full_address,
        "Parking Type": parking_type,
        "Contact First Name": contact_first,
        "Contact Last Name": contact_last,
        "Contact Email": email,
        "Contact Phone Number": phone
    })

df = pd.DataFrame(records)
df.to_excel("Final Data/Carefour Store Data.xlsx", index=False)


✅ Data successfully saved to 'carrefour_stores_cleaned2.xlsx'


## 🐾 Project: Scraping Maxi Zoo Store Locations – Belgium

### 🌐 Website Overview
Maxi Zoo is a pet supply retail chain with multiple store branches across Belgium. Their store locator features **multi-layer navigation**, starting from a region-based list and drilling down into detail pages for each store.

### 🚧 Challenge
Key complexities in this project:
- The store locator relies entirely on **JavaScript rendering**, meaning the HTML structure isn’t visible through static inspection.
- The initial page displays only links to region-level pages (e.g., “Bruxelles”, “Liège”), and each link leads to a dynamically loaded store detail page.
- Important information such as **full address**, **opening hours**, **phone number**, and **parking availability** is only available after rendering each store’s individual page.

### 🧠 Solution Rationale
The scraping approach included:
1. **Selenium** was used to render the JavaScript-heavy initial page and load all `.regions-link` anchors.
2. For each region/store link:
   - Navigated using Selenium
   - Parsed the resulting HTML using `BeautifulSoup`
   - Extracted fields like:
     - Store name
     - Full address
     - Phone number
     - Today's opening hours
     - Parking availability (derived from `.store-service-item`)
     - List of available services

3. A short delay was introduced between requests to ensure stable rendering and avoid bot detection.

Why this approach worked:
- Selenium allowed full page execution, enabling access to deeply nested store detail pages.
- Parsing strategy combined dynamic navigation and stable HTML structure recognition.
- Because a real browser renders each page, the scraper sees the same DOM a visitor would, which makes it less likely to trip basic bot checks.
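
Two small pieces of the flow above lend themselves to isolated helpers: resolving each region link to an absolute URL and deriving the parking flag from the list of service labels. A sketch under stated assumptions: the function names, the example path, and the sample service labels are hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.maxizoo.be/fr/storefinder/"


def absolute_url(href):
    """Resolve a relative or absolute href against the store finder page."""
    return urljoin(BASE_URL, href)


def summarize_services(service_texts):
    """Derive the parking flag and a joined service list from raw label texts."""
    cleaned = [text.strip() for text in service_texts if text and text.strip()]
    has_parking = any("parking" in text.lower() for text in cleaned)
    return {
        "Parking Type": "Available" if has_parking else "Not available",
        "All Services": ", ".join(cleaned),
    }


# Hypothetical path and labels, used only to exercise the helpers
print(absolute_url("/fr/stores/example-store/"))
print(summarize_services([" Parking gratuit ", "Click & Collect", ""]))
print(summarize_services([]))
```

Using `urljoin` instead of string concatenation keeps the result correct if a link is already absolute.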

### 📦 Data Fields Extracted
Each store record includes:
- ✅ Location Name
- ✅ Full Address
- ✅ Contact Phone Number
- ✅ Opening Hours (Today)
- ✅ Parking Type (derived from services)
- ✅ All Services Offered
- ✅ Detail Page URL

### 🗂 Output Format
The final dataset was saved as an Excel file (`Maxizoo Data.xlsx`) in a clean tabular format, ready for mapping, analytics, or retail planning.

---

This project demonstrates end-to-end scraping of **JavaScript-heavy, multi-layer navigation websites**, use of **Selenium for robust rendering**, and **multi-feature extraction from nested HTML** — a great example of hands-on, real-world data collection for retail intelligence.


In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import pandas as pd
import time

options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=options)

url = 'https://www.maxizoo.be/fr/storefinder/'
driver.get(url)
time.sleep(5)  
soup = BeautifulSoup(driver.page_source, 'html.parser')

store_links = soup.select("a.regions-link")
store_data = []

for link in store_links:
    name = link.get_text(strip=True)
    href = link['href']
    full_url = "https://www.maxizoo.be" + href

    print(f"🔍 Visiting: {name}")
    driver.get(full_url)
    time.sleep(2)

    detail_soup = BeautifulSoup(driver.page_source, 'html.parser')


    address = detail_soup.select_one(".store-address")
    address_text = address.get_text(" ", strip=True) if address else "Not available"


    phone_tag = detail_soup.select_one(".store-contact a[href^='tel']")
    phone = phone_tag.get_text(strip=True) if phone_tag else "Not available"


    opening = detail_soup.select_one(".ss-title")
    opening_hours = opening.get_text(strip=True) if opening else "Not available"

    
    services = detail_soup.select(".store-service-item .ssi-text")
    parking = "Available" if any("parking" in s.get_text(strip=True).lower() for s in services) else "Not available"
    service_list = [s.get_text(strip=True) for s in services]

    store_data.append({
        "Location Name": name,
        "URL": full_url,
        "Full Address": address_text,
        "Contact Phone Number": phone,
        "Opening Hours Today": opening_hours,
        "Parking Type": parking,
        "All Services": ", ".join(service_list)
    })

driver.quit()  # release the headless browser once all pages are scraped

df = pd.DataFrame(store_data)
df.to_excel("Final Data/Maxizoo Data.xlsx", index=False)


🔍 Visiting: Maxi Zoo Anderlecht (Brussel)
🔍 Visiting: Maxi Zoo Anderlecht / Westland Shopping (Brussel)
🔍 Visiting: Maxi Zoo Sint-Lambrechts-Woluwe (Brussel)
🔍 Visiting: Maxi Zoo Uccle (Brussel)
🔍 Visiting: Maxi Zoo Andenne (Namur)
🔍 Visiting: Maxi Zoo Auvelais
🔍 Visiting: Maxi Zoo Couillet
🔍 Visiting: Maxi Zoo Couvin
🔍 Visiting: Maxi Zoo Froyennes
🔍 Visiting: Maxi Zoo Gembloux
🔍 Visiting: Maxi Zoo Gosselies
🔍 Visiting: Maxi Zoo Grivegnée
🔍 Visiting: Maxi Zoo Herstal
🔍 Visiting: Maxi Zoo Saint-Georges-sur-Meuse
🔍 Visiting: Maxi Zoo Waterloo
🔍 Visiting: Maxi Zoo Aalst
🔍 Visiting: Maxi Zoo Aarschot
🔍 Visiting: Maxi Zoo Aartselaar
🔍 Visiting: Maxi Zoo Beringen
🔍 Visiting: Maxi Zoo Bilzen
🔍 Visiting: Maxi Zoo Boortmeerbeek (Leuven)
🔍 Visiting: Maxi Zoo Brugge
🔍 Visiting: Maxi Zoo Brugge Sint-Kruis
🔍 Visiting: Maxi Zoo Dendermonde
🔍 Visiting: Maxi Zoo Deurne
🔍 Visiting: Maxi Zoo Geel
🔍 Visiting: Maxi Zoo Genk
🔍 Visiting: Maxi Zoo Gent
🔍 Visiting: Maxi Zoo Geraardsbergen
🔍 Visiting: Maxi Zoo

## 🧺 Project: Scraping Trafic Store Locations – Belgium

### 🌐 Website Overview
Trafic is a retail chain in Belgium that sells home goods, garden, and seasonal items. Their store locator is built on a modern JavaScript-based frontend that dynamically loads all locations **only after full scroll** inside a container element — a common challenge in scraping modern SPAs (Single Page Applications).

### 🚧 Challenge
Technical difficulties included:
- Store data is not statically rendered and only loads after **scrolling within a scrollable container**, not the full page.
- Traditional page scraping would miss most locations unless full scroll behavior was simulated.
- Contact info like email, coordinates, or store manager details were not exposed, even in detail links.
- The site uses **custom data attributes and lazy rendering** tied to UI framework components.

### 🧠 Solution Rationale
The scraping strategy included:
1. **Selenium** to open and interact with the scrollable list container (`.ubsf_locations-list`).
2. Used **JavaScript execution** to scroll within the container until all stores were loaded.
3. Parsed the final DOM using `BeautifulSoup`, extracting:
   - Store Name
   - Street and Zip/City
   - Full Address (manually constructed)
   - Phone number
   - Detail URL (for optional extension)

4. Default placeholders were added for unavailable data (coordinates, contact names, emails).

Why this worked:
- Directly mimics user behavior to unlock hidden content via full scroll.
- Avoids incomplete data extraction by ensuring all elements are rendered.
- Cleanly separates extraction from interaction using BS4 + Selenium hybrid.
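
The scroll-until-loaded idea in step 2 boils down to a loop that stops once the container height no longer changes. The sketch below expresses that loop with injected callables so it can be run against a fake container. `scroll_until_stable`, `FakeList`, and the `max_rounds` safety cap are illustrative additions, not part of the original script.

```python
import time


def scroll_until_stable(get_height, scroll_to_bottom, pause=2.0, max_rounds=50, sleep=time.sleep):
    """Scroll repeatedly until the height stops growing; return the rounds used."""
    previous = get_height()
    for rounds in range(1, max_rounds + 1):
        scroll_to_bottom()
        sleep(pause)  # give lazy-loaded items time to render
        current = get_height()
        if current == previous:
            return rounds
        previous = current
    return max_rounds  # safety cap reached


class FakeList:
    """Stand-in for a lazy-loading container that gains one batch per scroll."""

    def __init__(self, batches):
        self.loaded = 1
        self.batches = batches

    def height(self):
        return self.loaded * 100

    def scroll(self):
        self.loaded = min(self.loaded + 1, self.batches)


fake = FakeList(batches=3)
rounds = scroll_until_stable(fake.height, fake.scroll, sleep=lambda seconds: None)
print(rounds, fake.loaded)

# With Selenium, the callables could wrap execute_script on the container element:
# get_height = lambda: driver.execute_script("return arguments[0].scrollHeight", scrollable)
# scroll_to_bottom = lambda: driver.execute_script(
#     "arguments[0].scrollTop = arguments[0].scrollHeight", scrollable)
```

The cap guards against a page that keeps growing indefinitely, which an unbounded `while True` loop would never exit.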

### 📦 Data Fields Extracted
Each store record includes:
- ✅ Location Name
- ✅ Country
- ✅ Full Address (combined manually)
- ✅ Contact Phone Number
- ✅ Store Detail URL (when available)
- ⚠ Coordinates, Email, and Contact Names set as Not Available due to frontend limitations

### 🗂 Output Format
The final output is a structured DataFrame containing all Trafic store listings in Belgium, suitable for expansion (e.g., detail scraping, geocoding) or integration into downstream workflows.

---

This project showcases real-world techniques to extract data from **scroll-dependent, JS-heavy interfaces**, using **precise container interaction** and **hybrid scraping logic** — essential skills for modern web data acquisition.



In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
import time
import pandas as pd


options = Options()
options.add_argument("--headless=new")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get("https://trafic.com/fr_BE/magasins/#")


wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CLASS_NAME, "ubsf_locations-list-item")))


scrollable = driver.find_element(By.CLASS_NAME, "ubsf_locations-list")
previous_height = driver.execute_script("return arguments[0].scrollHeight", scrollable)

while True:
    driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", scrollable)
    time.sleep(2)
    new_height = driver.execute_script("return arguments[0].scrollHeight", scrollable)
    if new_height == previous_height:
        break
    previous_height = new_height

soup = BeautifulSoup(driver.page_source, "html.parser")
store_blocks = soup.select(".ubsf_locations-list-item")


store_data = []
for block in store_blocks:
    try:
        name = block.select_one("[data-testid='location-list-item-name']").text.strip()
        street = block.select_one(".ubsf_locations-list-item-street").text.strip()
        zip_city = block.select_one(".ubsf_locations-list-item-zip-city").text.strip()
        full_address = f"{street}, {zip_city}, Belgium"

        phone_tag = block.select_one(".ubsf_phone-link")
        phone = phone_tag.text.strip() if phone_tag else "Not available"

        url_tag = block.select_one("a[href*='#!/l/']")
        detail_url = "https://trafic.com" + url_tag['href'] if url_tag else None

        store_data.append({
            "Location Name": name,
            "Country": "Belgium",
            "Coordinates": "Not available",
            "Full Address": full_address,
            "Parking Type": "Not available",
            "Contact First Name": "Not available",
            "Contact Last Name": "Not available",
            "Contact Email": "Not available",
            "Contact Phone Number": phone,
            "Detail URL": detail_url
        })
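    except Exception as e:
        # Skip blocks that lack a required element, but report them
        print(f"[!] Failed to parse block: {e}")
        continue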

driver.quit()



# Build the DataFrame from the scraped records before exporting
df = pd.DataFrame(store_data)
df.to_excel("Final Data/Trafic Stores Data.xlsx", index=False)
df


Unnamed: 0,Location Name,Country,Coordinates,Full Address,Parking Type,Contact First Name,Contact Last Name,Contact Email,Contact Phone Number
0,ANDENNE TRAFFIC,Belgium,Not available,"Avenue Roi Albert 15, 5300 Andenne, Belgium",Not available,Not available,Not available,Not available,+32 85 84 19 10
1,ANDERLUES TRAFFIC,Belgium,Not available,"Mons Road 201, 6150 Anderlues, Belgium",Not available,Not available,Not available,Not available,+32 64 31 01 05
2,ATH TRAFFIC,Belgium,Not available,"Mons Road 427, 7810 Ath, Belgium",Not available,Not available,Not available,Not available,+32 68 26 40 01
3,AUVELAIS TRAFFIC,Belgium,Not available,"263 Falisolle Street, 5060 Sambreville, Belgium",Not available,Not available,Not available,Not available,+32 71 74 13 01
4,BEAUMONT TRAFFIC,Belgium,Not available,"Fernand Deliège 90th avenue, 6500 Beaumont, Be...",Not available,Not available,Not available,Not available,+32 71 12 93 60
5,BELGRADE TRAFFIC,Belgium,Not available,"Plain Path 4, 5001 Namur, Belgium",Not available,Not available,Not available,Not available,+32 81 73 54 79
6,BERTRIX TRAFFIC,Belgium,Not available,"Route des Gohineaux 9, 6880 Bertrix, Belgium",Not available,Not available,Not available,Not available,+32 61 40 46 07
7,BIERGES TRAFFIC,Belgium,Not available,"40 Champles Street, 1301 Wavre, Belgium",Not available,Not available,Not available,Not available,+32 10 24 13 48
8,BRAINE L'ALLEUD TRAFFIC,Belgium,Not available,"7 Avenue of Crafts, 1420 Braine-l'Alleud, Belgium",Not available,Not available,Not available,Not available,+32 2 387 46 54
9,BRAINE-LE-COMTE TRAFFIC,Belgium,Not available,"Brussels Road 182, 7090 Braine-le-Comte, Belgium",Not available,Not available,Not available,Not available,+32 67 85 28 48


In [53]:
df.to_excel('Trafic Store Data.xlsx', index=False)

## 👟 Project: Scraping Intersport Store Locations – Belgium

### 🌐 Website Overview
Intersport is a well-known sports retail chain with several branches across Belgium. Their store locator page renders all store data directly into a nested HTML list within a container (`ul.store__list-content`). This HTML structure contains clean store metadata, but requires detailed parsing to extract all required fields such as contact info, full address, and opening hours.

### 🚧 Challenge
Key aspects of the project:
- Store data was deeply embedded within nested `<li>` elements and not exposed through an API or JSON object.
- Each store entry contained:
  - Address fragments (street, postal code, city)
  - Phone number and maps link
  - Opening hours listed as `<li>` items inside a weekly schedule
- Schedules were split into day–hour combinations that required clean formatting for readability.

### 🧠 Solution Rationale
To extract clean, structured data:
1. The page HTML (or a copy of `ul.store__list-content`) was saved locally and parsed using `BeautifulSoup`.
2. Each `li.stores-list__item` block was parsed to extract:
   - Store name
   - Address and postal code
   - City
   - Phone number
   - Maps link (`href`)
   - Full weekly schedule (assembled into one string from daily entries)

Why this works:
- Since the content is fully pre-rendered, no need for JavaScript or Selenium.
- Parsing each day and hour pair individually tolerates minor differences in label casing and spacing, so the full weekly schedule is captured.
- Output is ready for display, mapping, or operational use.
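
Assembling the weekly schedule into one string, as described in step 2, might look like the sketch below. It also shows an optional extension that is an assumption rather than part of the described approach: reading the destination coordinates from the end of the itinerary link. Both function names are illustrative.

```python
import re

# Matches a trailing "lat,lng" pair such as ".../50.8384238,4.2857714"
COORD_PAIR = re.compile(r"(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)$")


def format_schedule(entries):
    """Join (day_label, hours) pairs into 'Monday: 10:00 - 19:30; ...'."""
    parts = []
    for day, hours in entries:
        # Labels arrive as 'Monday :' or 'THURSDAY :' with stray whitespace
        clean_day = " ".join(day.replace(":", " ").split()).capitalize()
        parts.append(f"{clean_day}: {hours.strip()}")
    return "; ".join(parts)


def coords_from_link(href):
    """Return (lat, lng) when the link ends in a coordinate pair, else None."""
    match = COORD_PAIR.search(href)
    if not match:
        return None
    return float(match.group(1)), float(match.group(2))


schedule = format_schedule([("Monday \n:", "10:00 - 19:30"), ("THURSDAY :", "10:00 - 19:30")])
print(schedule)
print(coords_from_link(
    "https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.8384238,4.2857714"
))
```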

### 📦 Data Fields Extracted
Each store record includes:
- ✅ Store Name
- ✅ Street Address
- ✅ City
- ✅ Postal Code
- ✅ Contact Phone Number
- ✅ Google Maps Link
- ✅ Full Weekly Opening Hours (Monday–Sunday)

### 🗂 Output Format
Final output is a clean `DataFrame` with a row per store, ready to be exported or integrated into reporting tools or retail analytics workflows.

---

This project demonstrates a real-world use of **structured static HTML parsing**, with clear formatting of **multi-day schedules**, extraction of **contact metadata**, and efficient parsing using Python and `BeautifulSoup` — excellent for enterprise-grade scraping tasks.


In [None]:
from bs4 import BeautifulSoup
import pandas as pd


html_content = """
<div id="stores-list" class="stores-list"><div class="stores-list__header" data-gtm-container-name="search section"><h2 class="stores-list__title"><span class="stores-list__title-number"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">8</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">participating stores</font></font></h2><form action="#" id="stores-list__form" class="stores-list__form"><input id="stores-list__input" class="stores-list__input" type="text" fdprocessedid="6u9ls1"> <label class="stores-list__label placeholder-label"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">City or postal code</font></font></label> <button type="submit" class="stores-list__submit" data-gtm-cta="search button" fdprocessedid="953i3e"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-search"></use></svg></span></button></form><button data-gtm-cta="Geolocalisation button" id="stores-list__geoloc" class="stores-list__geoloc" fdprocessedid="yklarm"><div class="stores-list__geoloc-btn"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-location"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Geolocate me</font></font></div></button></div><ul data-gtm-container-name="magasin" class="store__list-content"><li class="stores-list__item stores-list__item--openned" data-storeid="01004_000" id="stores-list__item-0"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="d4wp4k"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Anderlecht</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container 
svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Boulevard Sylvain Dupuis, 365 - Westland Shopping – B</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">1070 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">ANDERLECHT</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:(0032) 02 378 21 30" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">(0032) 02 378 21 30</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.8384238,4.2857714" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.8384238,4.2857714" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT - ANDERLECHT store" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday 
:</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 20:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" data-storeid="01003_000" id="stores-list__item-1"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="rmg4c8"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Arlon</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" 
focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">112 rue de Grass, Sterpenich</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">6700 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">ARLON - BELGIUM</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:0032 63330300" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">0032 63330300</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=49.636341320655916,5.8881659306822565" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/49.636341320655916,5.8881659306822565" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT store - ARLON" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul 
class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" data-storeid="01022_000" id="stores-list__item-2"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="mea2js"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: 
inherit;">Intersport - Charleroi</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">14/15 RAILWAY STREET</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">6041 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">CHARLEROI - GOSSELIES</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:(00) 32 71 593 258" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">(00) 32 71 593 258</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.47308734791102,4.439360095301595" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.47308734791102,4.439360095301595" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT store - CHARLEROI" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font 
style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" data-storeid="01025_000" id="stores-list__item-3"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" 
fdprocessedid="cw0ml"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Chatelineau</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">FrunPark Shopping Center, Rue des Mottards</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">6200 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">CHATELINEAU - Châtelet</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:+32 (0)71 88 83 37" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">+32 (0)71 88 83 37</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.4135349109074,4.49581291442835" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.4135349109074,4.49581291442835" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT - CHATELINEAU store" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span 
class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">9:30 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" 
data-storeid="01041_000" id="stores-list__item-4"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="5mn8m"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Drogenbos</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">158 Long Street</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">1620 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">DROGENBOS</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:+32 2 227 50 10" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">+32 2 227 50 10</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.7942246793109,4.314802713719257" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.7942246793109,4.314802713719257" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT - DROGENBOS store" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your 
itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div 
class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" data-storeid="01077_000" id="stores-list__item-5"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="dfw2t8"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Mons</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">1 B Place des Grands Prés</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">7000 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">MONS - BELGIUM</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:(00) 3265220610" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">(00) 3265220610</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.45859302206051,3.931263628270091" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.45859302206051,3.931263628270091" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT store - MONS" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use 
xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font 
style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" data-storeid="01727_000" id="stores-list__item-6"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="79okxj"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Tournai</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">100 RUE DES BASTIONS, LES BASTIONS COMMERCIAL PARK</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">7500 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">TOURNAI</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:3269844314" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">3269844314</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.60135014681818,3.4068341552641837" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.60135014681818,3.4068341552641837" target="_blank" data-gtm-cta="magasin - itinéraire" title="Directions to the INTERSPORT store - TOURNAI" 
class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:00</font></font></li><li><span><font 
style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li><li class="stores-list__item stores-list__item--openned" data-storeid="01898_000" id="stores-list__item-7"><span class="store-dist"></span><button class="stores-list__a" data-gtm-cta="magasin button" fdprocessedid="v2pgt4"><span class="store-name"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Intersport - Waterloo</font></font></span> <span class="svg-container svg-container--plus"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-plus"></use></svg></span> <span class="svg-container svg-container--moins"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-moins"></use></svg></span></button><div class="store__detail-content"><p><span class="store-adress"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">4 Drève Richelle</font></font></span><span class="store-cp"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">1410 </font></font></span> <span class="store-city"><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Waterloo</font></font></span></p><div class="store-contact-container"><a data-gtm-cta="magasin - téléphone" href="tel:+32 2 215 54 94" class="store-tel"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-phone"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">+32 2 215 54 94</font></font></a><a rel="https://maps.google.com/maps?t=m&amp;f=d&amp;saddr={currentlatitude},{currentlongitude}&amp;daddr=50.70566938,4.414493757" href="https://fr.mappy.com/itineraire#/recherche/undefined,undefined/50.70566938,4.414493757" target="_blank" data-gtm-cta="magasin - 
itinéraire" title="Directions to the INTERSPORT store - WATERLOO" class="store-itinerary"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-locator"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Your itinerary</font></font></a></div><div class="store-contact-container"><span class="store-schedule"><span class="svg-container"><svg aria-hidden="true" focusable="false"><use xlink:href="#icon-ico-clock"></use></svg></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Schedules :</font></font></span></div><ul class="store-schedules"><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Monday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Tuesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Wednesday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">THURSDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Friday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">SATURDAY :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: 
inherit;">10:00 - 19:30</font></font></li><li><span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Sunday :</font></font></span><font style="vertical-align: inherit;"><font style="vertical-align: inherit;">Farm</font></font></li><div class="store-list__important-msg"></div></ul></div></li></ul><p id="stores-list__message" class="stores-list__message stores-list__message--hide"></p></div>
"""


soup = BeautifulSoup(html_content, 'html.parser')
stores = soup.find_all("li", class_="stores-list__item")


store_data = []


for store in stores:
    try:
        name = store.find("span", class_="store-name").text.strip()
        address = store.find("span", class_="store-adress").text.strip()
        city = store.find("span", class_="store-city").text.strip()
        postal = store.find("span", class_="store-cp").text.strip()
        phone = store.find("a", class_="store-tel").text.strip() if store.find("a", class_="store-tel") else None
        # Itinerary link (points to fr.mappy.com); its last path segment is the store's "lat,lng" pair
        maps_link = store.find("a", class_="store-itinerary")['href']

        # Opening hours
        schedule_items = store.find_all("ul", class_="store-schedules")
        schedule_text = []
        for sched in schedule_items:
            lines = sched.find_all("li")
            for line in lines:
                day = line.find("span").text.strip()
                hours = line.find_all("font")[-1].text.strip()
                schedule_text.append(f"{day} {hours}")
        schedules = " | ".join(schedule_text)

        store_data.append({
            "Name": name,
            "Address": address,
            "City": city,
            "Postal Code": postal,
            "Phone": phone,
            "Google Maps": maps_link,
            "Opening Hours": schedules
        })
    except Exception as e:
        print(f"❗ Error parsing store: {e}")

df = pd.DataFrame(store_data)

# Save to the "Final Data" folder (the folder must already exist)
df.to_excel('Final Data/Intersport Belgium.xlsx', index=False)

Unnamed: 0,Name,Address,City,Postal Code,Phone,Google Maps,Opening Hours
0,Intersport - Anderlecht,"Boulevard Sylvain Dupuis, 365 - Westland Shopp...",ANDERLECHT,1070,(0032) 02 378 21 30,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 10:00 - 19:30 | Tuesday : 10:00 - 19:...
1,Intersport - Arlon,"112 rue de Grass, Sterpenich",ARLON - BELGIUM,6700,0032 63330300,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 10:00 - 19:00 | Tuesday : 10:00 - 19:...
2,Intersport - Charleroi,14/15 RAILWAY STREET,CHARLEROI - GOSSELIES,6041,(00) 32 71 593 258,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 9:30 - 19:00 | Tuesday : 9:30 - 19:00...
3,Intersport - Chatelineau,"FrunPark Shopping Center, Rue des Mottards",CHATELINEAU - Châtelet,6200,+32 (0)71 88 83 37,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 9:30 - 19:00 | Tuesday : 9:30 - 19:00...
4,Intersport - Drogenbos,158 Long Street,DROGENBOS,1620,+32 2 227 50 10,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 10:00 - 19:00 | Tuesday : 10:00 - 19:...
5,Intersport - Mons,1 B Place des Grands Prés,MONS - BELGIUM,7000,(00) 3265220610,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 10:00 - 19:30 | Tuesday : 10:00 - 19:...
6,Intersport - Tournai,"100 RUE DES BASTIONS, LES BASTIONS COMMERCIAL ...",TOURNAI,7500,3269844314,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 10:00 - 19:00 | Tuesday : 10:00 - 19:...
7,Intersport - Waterloo,4 Drève Richelle,Waterloo,1410,+32 2 215 54 94,https://fr.mappy.com/itineraire#/recherche/und...,Monday : 10:00 - 19:30 | Tuesday : 10:00 - 19:...


## 💻 Project: Scraping Krëfel Store Locations – Belgium

### 🌐 Website Overview
Krëfel is a major electronics and appliance retailer in Belgium. Unlike many retail websites, Krëfel exposes its store locator data in a **clean, structured JSON format**, likely used internally by their frontend application. The JSON includes all necessary store metadata, including nested fields like address, geo-coordinates, and store status.

### 🚧 Challenge
The main task was not in page rendering or interaction, but in:
- Traversing **nested JSON structures** and arrays (`signings`, `geoPoint`, and `address`)
- Transforming the JSON into a flat, human-readable format
- Extracting optional or conditionally present fields (e.g., second phone number, closed status)
- Enriching output by formatting composite fields (e.g., `Full Address`, `Coordinates`)

### 🧠 Solution Rationale
The approach was:
1. Load the saved `data.json` file using Python's `json` module.
2. Extract fields from each store entry, flattening subfields:
   - Address lines 1 and 2
   - Postal code and city
   - Latitude and Longitude (from `geoPoint`)
   - Store signings (from list of dictionaries)
   - Temporary closure status and date
3. Combine fields to form composite ones like:
   - `Full Address`
   - `Coordinates`
4. Export the results to Excel for client-ready delivery.

Why this works:
- Clean separation between raw JSON structure and processed output.
- Supports future re-use for automation or integration.
- Client receives a clean, business-ready file, not raw JSON.
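
The flattening described in steps 2 and 3 can be sketched on a single record. This is a minimal illustration rather than the delivered script: the sample record is invented, the helper name `flatten_store` is hypothetical, and the field names (`displayName`, `address`, `geoPoint`, `signings`) are assumed to match the saved `data.json`.

```python
def flatten_store(store):
    """Flatten one nested store record into a single-level dict."""
    address = store.get("address") or {}
    geo = store.get("geoPoint") or {}
    signings = store.get("signings") or []

    # Skip empty address parts so the composite field has no stray commas
    parts = [address.get(key) for key in ("line1", "line2", "postalCode", "town")]
    full_address = ", ".join(str(part) for part in parts if part)

    # Only format coordinates when both values are present
    lat, lng = geo.get("latitude"), geo.get("longitude")
    coordinates = f"{lat}, {lng}" if lat is not None and lng is not None else ""

    return {
        "Location Name": store.get("displayName", ""),
        "Full Address": full_address,
        "Coordinates": coordinates,
        "Signings": ", ".join(s.get("title", "Unknown") for s in signings),
        "Temporarily Closed": store.get("temporaryClosed", False),
    }


# Invented sample record, for illustration only
sample = {
    "displayName": "Example Store",
    "address": {"line1": "Main Street 1", "line2": "", "postalCode": "1000", "town": "Brussels"},
    "geoPoint": {"latitude": 50.85, "longitude": 4.35},
    "signings": [{"title": "Kitchen"}],
}
flat = flatten_store(sample)
print(flat["Full Address"])
```

Filtering out empty parts before joining is a design choice: it keeps the composite address readable when optional fields such as the second address line are blank.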

### 📦 Data Fields Extracted
Each store record includes:
- ✅ Location Name
- ✅ Country
- ✅ Latitude & Longitude
- ✅ Coordinates (formatted)
- ✅ Full Address (combined from multiple fields)
- ✅ Contact Phone(s)
- ✅ Store Signings (e.g., store type or labels)
- ✅ Temporary Closure Status and Date

### 🗂 Output Format
Final output was exported as an Excel file (`Kerfel Data.xlsx`), structured for readability and compatibility with downstream tools like Excel, Power BI, or GIS platforms.

---

This project demonstrates **navigating nested JSON**, **transforming it into flat records**, and **delivering polished outputs**, all from a clean, API-ready data source.



In [None]:
import json
import pandas as pd
from pathlib import Path


json_path = Path("data.json")  # Replace this with your actual JSON path
with open(json_path, "r", encoding="utf-8") as f:
    data = json.load(f)


records = []
for store in data.get("stores", []):
    address = store.get("address", {})
    geo = store.get("geoPoint", {})
    signings = store.get("signings", [])
    
    records.append({
        "Location Name": store.get("displayName", ""),
        "Country": "Belgium",
        "Latitude": geo.get("latitude"),
        "Longitude": geo.get("longitude"),
        "Coordinates": f"{geo.get('latitude')}, {geo.get('longitude')}",
        # Join only the non-empty parts so a blank line2 leaves no stray commas
        "Full Address": ", ".join(
            str(part)
            for part in (address.get("line1"), address.get("line2"), address.get("postalCode"), address.get("town"))
            if part
        ),
        "Street 1": address.get("line1", ""),
        "Street 2": address.get("line2", ""),
        "Postal Code": address.get("postalCode", ""),
        "City": address.get("town", ""),
        "Contact Phone": address.get("phone", ""),
        "Contact Phone 2": address.get("phone2", ""),
        "Signings": ", ".join([s.get("title", "Unknown") for s in signings]),
        "Temporarily Closed": store.get("temporaryClosed", False),
        "Temporary Closed Date": store.get("temporaryClosedDate", "")
    })

df = pd.DataFrame(records)
df.to_excel('Final Data/Kerfel Data.xlsx', index=False)

## Action Data Extraction

In [None]:
import requests
import pandas as pd
import time

def get_store_details(store_id):
    url = "https://www.action.com/api/graphql/"
    headers = {
        "Content-Type": "application/json",
        "Accept-Language": "fr-BE", 
    }
    payload = {
        "operationName": "StoreDetails",
        "variables": {
            "storeId": str(store_id)
        },
        "extensions": {
            "persistedQuery": {
                "version": 1,
                "sha256Hash": "e2e55059c17cf6d45bddf2aacd30a4cf114d7ae68b0ad2d57e9be3fded475e1b"
            }
        }
    }

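    # A timeout keeps one stalled request from hanging the whole ID sweep
    response = requests.post(url, headers=headers, json=payload, timeout=30)
    if response.status_code == 200:
        # "data" can be null in a GraphQL error response, so fall back to {}
        data = (response.json().get("data") or {}).get("storeDetails")
        return data
    return None


# Probe a range of candidate store IDs; IDs without a store return no data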
store_ids = range(2000, 3001)
results = []

for sid in store_ids:
    print(f"🔍 Checking Store ID {sid}...")
    detail = get_store_details(sid)
    if detail:
        # "or" also covers fields that are present but explicitly null
        geo = detail.get("geoLocation") or {}
        addr = detail.get("address") or {}
        hours = detail.get("openingDays") or []

        opening_hours = "; ".join([
            f"{d['dayName']}: {d['openingHour'][0]['openFrom']} - {d['openingHour'][0]['openUntil']}" if d['openingHour']
            else f"{d['dayName']}: Closed" for d in hours
        ])

        results.append({
            "Store ID": sid,
            "Name": detail.get("name", "Not available"),
            "Street": addr.get("street", "Not available"),
            "Postal Code": addr.get("postalCode", "Not available"),
            "City": addr.get("city", "Not available"),
            "Country": addr.get("countryCode", "Not available"),
            "Latitude": geo.get("lat", "Not available"),
            "Longitude": geo.get("long", "Not available"),
            "Phone": "Not available",
            "Email": "Not available",
            "Opening Hours": opening_hours
        })
        print(f"✅ Found: {detail['name']}")
    else:
        print(f"❌ Store ID {sid} not found or no data.")

    time.sleep(1.5)

df = pd.DataFrame(results)
df.to_excel("Final Data/Action Store.xlsx", index=False)


🔍 Checking Store ID 2000...
❌ Store ID 2000 not found or no data.
🔍 Checking Store ID 2001...
✅ Found: Rijkevorsel
🔍 Checking Store ID 2002...
✅ Found: Kuurne
🔍 Checking Store ID 2003...
✅ Found: Schoten
🔍 Checking Store ID 2004...
✅ Found: Middelkerke
🔍 Checking Store ID 2005...
✅ Found: Sint-Truiden
🔍 Checking Store ID 2006...
✅ Found: Westerlo
🔍 Checking Store ID 2007...
✅ Found: Aalst
🔍 Checking Store ID 2008...
✅ Found: Lokeren
🔍 Checking Store ID 2009...
✅ Found: Wevelgem
🔍 Checking Store ID 2010...
✅ Found: Deurne Antwerpen
🔍 Checking Store ID 2011...
✅ Found: Wetteren
🔍 Checking Store ID 2012...
✅ Found: Geraardsbergen
🔍 Checking Store ID 2013...
✅ Found: Ronse
🔍 Checking Store ID 2014...
✅ Found: Temse
🔍 Checking Store ID 2015...
✅ Found: Kalmthout
🔍 Checking Store ID 2016...
✅ Found: Opglabbeek
🔍 Checking Store ID 2017...
✅ Found: Izegem
🔍 Checking Store ID 2018...
✅ Found: Genk-Waterschei
🔍 Checking Store ID 2019...
✅ Found: Maaseik
🔍 Checking Store ID 2020...
✅ Found: Schel

# Brico

## 🔨 Project: Scraping Brico Store Locations – Belgium

### 🌐 Website Overview
Brico is a Belgian retail chain specializing in home improvement and DIY products. Their store locator is backed by a well-structured JSON architecture: one file lists store IDs and locations, and a separate endpoint provides detailed store information. This API is not officially documented but is openly accessible and easy to query.

### 🚧 Challenge
This scraping task involved:
- Navigating a **two-tier API setup**:
  - A base endpoint (`.json`) that lists all stores with coordinates and minimal metadata
  - A secondary endpoint (`/rest/v1/storefinder/store/{id}`) to retrieve full store details
- Joining location data (lat/lng) from the first source with rich metadata (name, contact, address) from the second
- Handling some missing fields (e.g., email and contact names not consistently populated)

### 🧠 Solution Rationale
To ensure clean and complete data:
1. The base list file was retrieved from a public CloudFront CDN URL and parsed to get all store IDs, lat/lng, and parking info.
2. For each store ID:
   - A second API call was made to the detail endpoint to fetch address, email, and phone.
3. Extracted data was combined into a unified record with:
   - Store name
   - Full address
   - Contact info
   - Coordinates
   - Parking (from `box` field)

4. Defaults were added for unavailable contact names and missing fields.

Why this works:
- API-based approach ensures clean, structured data — no need for page rendering or scraping.
- Two-stage retrieval separates concerns and avoids redundant requests.
- Scalable and adaptable if the number of stores increases.
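
The two-stage join can be sketched without network access. Both inputs below are invented stand-ins for the base list and the detail responses; the keys (`id`, `lat`, `lng`, `box`, `displayName`, `address`) follow the fields described above, and `merge_store` is a hypothetical helper, not part of any Brico API.

```python
def merge_store(base, detail):
    """Combine one base-list entry with its detail record (which may be None)."""
    detail = detail or {}
    addr = detail.get("address") or {}
    return {
        "Location Name": detail.get("displayName", ""),
        # Coordinates and parking come from the base list
        "Coordinates": f'{base["lat"]}, {base["lng"]}',
        "Parking Type": base.get("box", "Not Available"),
        # Address and contact fields come from the detail endpoint
        "Full Address": addr.get("formattedAddress", ""),
        "Contact Email": addr.get("email") or "Not Available",
        "Contact Phone Number": addr.get("phone") or "Not Available",
    }


# Invented sample data, for illustration only
base_list = [
    {"id": "1001", "lat": 50.85, "lng": 4.35, "box": "Outdoor"},
    {"id": "1002", "lat": 51.05, "lng": 3.72},
]
details_by_id = {
    "1001": {
        "displayName": "Example Store A",
        "address": {"formattedAddress": "Main Street 1, 1000 Brussels", "phone": "+32 2 000 00 00"},
    },
    # "1002" has no detail record, mimicking a failed detail request
}

merged = [merge_store(base, details_by_id.get(base["id"])) for base in base_list]
for row in merged:
    print(row)
```

This sketch keeps a row even when the detail lookup fails, so the coordinates from the base list are not lost. Dropping such rows instead is an equally valid choice when incomplete records are not wanted.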

### 📦 Data Fields Extracted
Each store entry includes:
- ✅ Location Name
- ✅ Country
- ✅ Full Address
- ✅ Coordinates (Latitude, Longitude from base JSON)
- ✅ Parking Type
- ✅ Contact Phone Number
- ✅ Contact Email (if available)
- ⚠ Contact First/Last Name marked as `Not Available` by default

### 🗂 Output Format
Data was compiled into a structured `DataFrame` and is ready for export to CSV or Excel. Cleaned fields and standardized structure support downstream use in dashboards, analytics, or marketing workflows.

---

This project demonstrates the ability to **join multi-source API layers**, extract rich structured data, and prepare it for professional delivery: a client-ready solution balancing speed, accuracy, and flexibility.


In [None]:
import requests
import pandas as pd

base_list_url = "https://d1pb0z5hi4vdgm.cloudfront.net/assets/storefinder/brico.20f5ca8ca68112ecbca1268f86712ba7.json"
store_list = requests.get(base_list_url).json()

# Small subset for a quick test run; the loop below iterates over the full list
sample_stores = store_list[:5]


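def fetch_store_detail(store_id):
    """Return the detail JSON for one store, or None on a non-200 response."""
    url = f"https://www.brico.be/rest/v1/storefinder/store/{store_id}?format=brico&lang=fr"
    # A timeout keeps one stalled request from blocking the whole loop
    res = requests.get(url, timeout=30)
    if res.status_code == 200:
        return res.json()
    return None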


results = []
for store in store_list:
    detail = fetch_store_detail(store["id"])
    if detail:
        addr = detail.get("address", {})
        results.append({
            "Location Name": detail.get("displayName", ""),
            "Country": addr.get("country", {}).get("name", "Belgium"),
            "Coordinates": f'{store["lat"]}, {store["lng"]}',
            "Full Address": addr.get("formattedAddress", ""),
            "Parking Type": store.get("box", "Not Available"),
            "Contact First Name": "Not Available",
            "Contact Last Name": "Not Available",
            "Contact Email": addr.get("email", "Not Available"),
            "Contact Phone Number": addr.get("phone", "Not Available")
        })

df_brico = pd.DataFrame(results)
df_brico.to_excel("Final Data/Brico Data.xlsx", index=False)