Scraping the chili pepper flakes!!!! 

# GET request

- GET: `https://foodsofnations.com/collections/chili-chile-pepper-whole-powder-flakes`

```py
import requests

# Fetch a page with a GET request
response = requests.get("....url....")

# Form fields to send with a POST request
data = {
    'q': 'chili flakes',
    'sort': 'price',
}

response = requests.post("...url...", data=data)
```
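
To see how the same fields travel differently in a GET and a POST, here is a minimal offline sketch. It uses only the standard library (`urllib`) instead of `requests` so that nothing is actually sent, and `https://example.com/search` is a made-up endpoint for illustration. With `requests`, the rough equivalents would be passing `params=` to `requests.get` and `data=` to `requests.post`.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only for illustration; no request is sent.
BASE_URL = "https://example.com/search"

fields = {"q": "chili flakes", "sort": "price"}
encoded = urlencode(fields)  # form-encodes keys and values

# GET: the fields travel in the URL's query string.
get_request = Request(f"{BASE_URL}?{encoded}", method="GET")

# POST: the same fields travel in the request body, as bytes.
post_request = Request(BASE_URL, data=encoded.encode("ascii"), method="POST")

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

The request objects are only constructed here, so the sketch is safe to run without network access.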

In [1]:
!pip install requests beautifulsoup4 pandas tqdm



In [6]:
import requests
from bs4 import BeautifulSoup
import pandas as pd
from tqdm import tqdm
import time

In [7]:
def scrape_page(page_num):
    """Scrape products from a single page"""
    url = f"https://foodsofnations.com/collections/chili-chile-pepper-whole-powder-flakes?page={page_num}"
    
    print(f"Fetching page {page_num}...")
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Stop on HTTP errors instead of parsing an error page
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Find all product cards
    products = soup.find_all('div', class_='card card--standard card--media')
    print(f"Found {len(products)} products on page {page_num}")
    
    page_data = []
    
    for product in products:
        # Extract product name
        name_tag = product.find('h3', class_='card__heading h5')
        name = name_tag.find('a').text.strip() if name_tag else None
        
        # Extract product URL
        url_tag = product.find('a', class_='full-unstyled-link')
        product_url = 'https://foodsofnations.com' + url_tag['href'] if url_tag else None
        
        # Extract image URL
        img_tag = product.find('img')
        image_url = 'https:' + img_tag['src'] if img_tag and 'src' in img_tag.attrs else None
        
        # Extract price
        price_tag = product.find('span', class_='price-item price-item--regular')
        price = price_tag.text.strip() if price_tag else None
        
        # Check availability (sold out badge)
        badge = product.find('span', class_='badge')
        availability = 'Sold out' if badge and 'Sold out' in badge.text else 'In stock'
        
        page_data.append({
            'Product Name': name,
            'Price': price,
            'Product URL': product_url,
            'Image URL': image_url,
            'Availability': availability
        })
    
    return page_data

# Test on page 1
test_data = scrape_page(1)
print(f"\nFirst product: {test_data[0]}")

Fetching page 1...
Found 20 products on page 1

First product: {'Product Name': 'Aleppo Pepper Flakes Medium-Hot', 'Price': 'From $5.99 USD', 'Product URL': 'https://foodsofnations.com/products/aleppo-pepper-flakes-medium-hot-67012300178', 'Image URL': 'https://foodsofnations.com/cdn/shop/files/aleppo-pepper-flakes-medium-hot-67012300178.png?v=1739396354&width=533', 'Availability': 'In stock'}
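
The scraper builds absolute URLs by string concatenation (`'https://foodsofnations.com' + href` and `'https:' + src`), which works as long as every link has the expected shape. A possible alternative, sketched below, is `urllib.parse.urljoin`, which resolves root-relative, protocol-relative, and already-absolute links against the page URL. The helper name `absolutize` and the sample links are made up for illustration and are not taken from the live page.

```python
from urllib.parse import urljoin

# Page the links were scraped from (the collection URL used in scrape_page).
PAGE_URL = (
    "https://foodsofnations.com/collections/"
    "chili-chile-pepper-whole-powder-flakes?page=1"
)


def absolutize(link, base=PAGE_URL):
    """Return an absolute URL for a scraped href/src, or None if it is missing."""
    if not link:
        return None
    return urljoin(base, link)


# Illustrative inputs only (hypothetical paths):
print(absolutize("/products/example-item"))          # root-relative href
print(absolutize("//foodsofnations.com/cdn/a.png"))  # protocol-relative src
print(absolutize("https://cdn.example.com/a.png"))   # already absolute
print(absolutize(None))                              # attribute was missing
```

Resolving against the page URL means the same helper can serve both the product link and the image link.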


In [8]:
# Scrape all 10 pages
all_products = []

for page in tqdm(range(1, 11), desc="Scraping pages"):
    page_data = scrape_page(page)
    all_products.extend(page_data)
    time.sleep(1)  # Be polite to the server
    
print(f"\n✓ Scraping complete! Total products collected: {len(all_products)}")

Scraping pages:   0%|          | 0/10 [00:00<?, ?it/s]

Fetching page 1...
Found 20 products on page 1


Scraping pages:  10%|█         | 1/10 [00:01<00:13,  1.52s/it]

Fetching page 2...
Found 20 products on page 2


Scraping pages:  20%|██        | 2/10 [00:03<00:12,  1.54s/it]

Fetching page 3...
Found 19 products on page 3


Scraping pages:  30%|███       | 3/10 [00:04<00:11,  1.58s/it]

Fetching page 4...
Found 20 products on page 4


Scraping pages:  40%|████      | 4/10 [00:06<00:09,  1.61s/it]

Fetching page 5...
Found 19 products on page 5


Scraping pages:  50%|█████     | 5/10 [00:07<00:07,  1.59s/it]

Fetching page 6...
Found 20 products on page 6


Scraping pages:  60%|██████    | 6/10 [00:09<00:06,  1.59s/it]

Fetching page 7...
Found 19 products on page 7


Scraping pages:  70%|███████   | 7/10 [00:11<00:04,  1.57s/it]

Fetching page 8...
Found 20 products on page 8


Scraping pages:  80%|████████  | 8/10 [00:12<00:03,  1.58s/it]

Fetching page 9...
Found 20 products on page 9


Scraping pages:  90%|█████████ | 9/10 [00:14<00:01,  1.56s/it]

Fetching page 10...
Found 6 products on page 10


Scraping pages: 100%|██████████| 10/10 [00:15<00:00,  1.55s/it]


✓ Scraping complete! Total products collected: 183
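
The loop above hardcodes 10 pages. One way to avoid that, sketched here, is to keep requesting pages until one comes back empty. This assumes a page past the end of the collection yields no product cards, which has not been checked for this site. `scrape_until_empty` and `fake_fetch` are hypothetical names; `fake_fetch` stands in for `scrape_page` so the sketch runs offline.

```python
import time


def scrape_until_empty(fetch_page, max_pages=100, delay=1.0):
    """Collect rows page by page, stopping at the first page with no products."""
    rows = []
    for page_num in range(1, max_pages + 1):
        page_rows = fetch_page(page_num)
        if not page_rows:
            break  # Assumed signal that we ran past the last page
        rows.extend(page_rows)
        time.sleep(delay)  # Be polite to the server
    return rows


def fake_fetch(page_num):
    """Offline stand-in for scrape_page: two pages of made-up data, then nothing."""
    fake_pages = {
        1: [{"Product Name": "A"}, {"Product Name": "B"}],
        2: [{"Product Name": "C"}],
    }
    return fake_pages.get(page_num, [])


products = scrape_until_empty(fake_fetch, delay=0)
print(len(products))
```

In the notebook, the call would be something like `scrape_until_empty(scrape_page)`; `max_pages` is a safety cap in case the site never returns an empty page.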





In [9]:
# Create DataFrame
df = pd.DataFrame(all_products)

# Display summary
print(f"Total products scraped: {len(df)}")
print(f"\nAvailability breakdown:")
print(df['Availability'].value_counts())
print(f"\nFirst 5 products:")
df.head()

Total products scraped: 183

Availability breakdown:
Availability
In stock    178
Sold out      5
Name: count, dtype: int64

First 5 products:


| | Product Name | Price | Product URL | Image URL | Availability |
|---|---|---|---|---|---|
| 0 | Aleppo Pepper Flakes Medium-Hot | From $5.99 USD | https://foodsofnations.com/products/aleppo-pep... | https://foodsofnations.com/cdn/shop/files/alep... | In stock |
| 1 | Kashmiri Red Chilli Powder Mild | From $6.99 USD | https://foodsofnations.com/products/kashmiri-r... | https://foodsofnations.com/cdn/shop/files/kash... | In stock |
| 2 | Spanish Smoked Paprika Pimenton Sweet | From $6.99 USD | https://foodsofnations.com/products/spanish-sm... | https://foodsofnations.com/cdn/shop/files/span... | In stock |
| 3 | Paprika, Spanish, Smoked, Sweet | $7.99 USD | https://foodsofnations.com/products/paprika-sp... | https://foodsofnations.com/cdn/shop/products/p... | In stock |
| 4 | Paprika, Spanish, Smoked, Hot | $7.99 USD | https://foodsofnations.com/products/paprika-sp... | https://foodsofnations.com/cdn/shop/products/p... | In stock |
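
The `Price` column holds raw strings such as `From $5.99 USD` and `$7.99 USD`, so it cannot be sorted or averaged as-is. Below is a rough sketch of one way to split those strings into a number, a currency code, and a "starting from" flag. The pattern only covers the two shapes seen in the output above. `parse_price` and the field names are hypothetical, and any other price format returns `None`.

```python
import re

# Matches the two shapes seen in the scraped data:
#   "From $5.99 USD"  and  "$7.99 USD"
PRICE_PATTERN = re.compile(
    r"^(?P<prefix>From\s+)?\$(?P<amount>[\d,]+(?:\.\d+)?)\s+(?P<currency>[A-Z]{3})$"
)


def parse_price(text):
    """Split a scraped price string into parts, or return None if it doesn't match."""
    if text is None:
        return None
    match = PRICE_PATTERN.match(text.strip())
    if match is None:
        return None
    return {
        "amount": float(match.group("amount").replace(",", "")),
        "currency": match.group("currency"),
        "is_minimum": match.group("prefix") is not None,  # True for "From ..." prices
    }


print(parse_price("From $5.99 USD"))
print(parse_price("$7.99 USD"))
print(parse_price("Sold out"))
```

With the DataFrame from the notebook, something like `df['Price'].apply(parse_price)` would give one parsed dict (or `None`) per row.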


In [10]:
# Save to CSV
filename = 'chili_pepper_products.csv'
df.to_csv(filename, index=False)

print(f"✓ Data saved to {filename}")
print(f"✓ Total rows: {len(df)}")
print(f"✓ Columns: {', '.join(df.columns)}")

✓ Data saved to chili_pepper_products.csv
✓ Total rows: 183
✓ Columns: Product Name, Price, Product URL, Image URL, Availability
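
Several product names contain commas (for example `Paprika, Spanish, Smoked, Sweet`), so it is worth checking that they survive a CSV round trip. The sketch below uses the standard-library `csv` module on an in-memory buffer rather than pandas, so it runs without touching the saved file. The two rows are copied from the output above, trimmed to two columns. `df.to_csv` uses minimal quoting by default as well, so the saved file should quote these names the same way.

```python
import csv
import io

rows = [
    {"Product Name": "Paprika, Spanish, Smoked, Sweet", "Price": "$7.99 USD"},
    {"Product Name": "Aleppo Pepper Flakes Medium-Hot", "Price": "From $5.99 USD"},
]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Product Name", "Price"])
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)  # The name with commas is wrapped in double quotes

# Read it back and confirm nothing was split on the embedded commas.
read_back = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
print(read_back == rows)
```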


In [11]:
!open .
