Web Data Scraper with MySQL and Docker

This setup demonstrates an end-to-end approach to extracting structured data (e.g., the name and price of a product such as sourdough bread) from a simple HTML page and storing it in a MySQL database, with the scraper containerized using Docker for portable deployment.

Tech Stack

Python 3.10
BeautifulSoup (HTML parsing)
MySQL (Data storage)
Docker (Environment containerization)
Ubuntu / WSL2
VS Code

Use Case

Extract and store product data from a local HTML file using Python. The flow simulates scraping a bakery-style website: the script looks up an item such as Sourdough Bread by name, reads its price, and saves both values to the database.

🔄 How It Works

HTML Structure:
- bakery.html contains the product listing.
Python Script:
- Scrape.py uses BeautifulSoup to locate a specific item.
- Connects to a local MySQL database.
- Inserts extracted data into a defined table.
Docker Integration:
- The scraper runs in an isolated Docker container and connects to MySQL running on the host.
- Dockerfile builds the image and installs all dependencies.
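The lookup step can be sketched as follows. This is a minimal illustration, not the repository's actual Scrape.py: the markup in `SAMPLE_HTML`, the `product`, `name`, and `price` class names, and the `find_price` helper are all assumptions made for the example, since the real page structure is not shown here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real bakery page may use different tags and classes.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Sourdough Bread</span><span class="price">200</span></li>
  <li class="product"><span class="name">Croissant</span><span class="price">120</span></li>
</ul>
"""


def find_price(html, product_name):
    """Return the price of product_name as an int, or None if it is not listed."""
    soup = BeautifulSoup(html, "html.parser")
    for item in soup.find_all("li", class_="product"):
        name = item.find("span", class_="name")
        price = item.find("span", class_="price")
        # Skip malformed entries that lack either field.
        if name and price and name.get_text(strip=True) == product_name:
            return int(price.get_text(strip=True))
    return None


if __name__ == "__main__":
    print("Sourdough Bread:", find_price(SAMPLE_HTML, "Sourdough Bread"))
```

Returning `None` for a missing item lets the caller decide whether to skip the database insert rather than fail inside the parser.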

Getting Started

1. Clone the Repository

git clone https://github.com/JPOORNA/web-scraper-mysql-docker.git
cd web-scraper-mysql-docker

2. Update `Scrape.py` with Your MySQL Credentials

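import mysql.connector

# host.docker.internal lets the container reach MySQL on the host
# (available by default under Docker Desktop, including with WSL2).
conn = mysql.connector.connect(
    host="host.docker.internal",
    user="root",
    password="your_password",  # replace with your own MySQL password
    database="bakery"
)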
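As an alternative to editing credentials directly in the script, the connection settings could be read from environment variables. The sketch below is one possible approach, not part of the repository; the `load_db_config` helper and the `DB_HOST`, `DB_USER`, `DB_PASSWORD`, and `DB_NAME` variable names are hypothetical.

```python
import os


def load_db_config(env=os.environ):
    """Build connection keyword arguments from environment variables."""
    return {
        "host": env.get("DB_HOST", "host.docker.internal"),
        "user": env.get("DB_USER", "root"),
        "password": env["DB_PASSWORD"],  # no default: raises KeyError if unset
        "database": env.get("DB_NAME", "bakery"),
    }


if __name__ == "__main__":
    # A plain dict stands in for the real environment in this demo.
    config = load_db_config({"DB_PASSWORD": "example-only"})
    print({key: value for key, value in config.items() if key != "password"})
    # With the connector installed: conn = mysql.connector.connect(**config)
```

Under this assumption, the password would be supplied at run time, for example with `docker run -e DB_PASSWORD=... bakery-scraper`, so it never has to be committed to the repository.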

3. Build Docker Image

docker build -t bakery-scraper .

4. Run the Container

docker run bakery-scraper

✅ Expected Output

checking Sourdough Bread
Sourdough Bread: 200
data inserted

What’s Covered

Tag-based HTML parsing with BeautifulSoup
Host-to-container database communication via Docker
Running entirely on a local WSL2 environment, with no cloud services
Insertion of extracted data into MySQL as soon as it is scraped
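The insertion step might look like the sketch below. To keep the example runnable without a MySQL server, it uses `sqlite3` from the standard library as a stand-in; the `products` table, its `name` and `price` columns, and the `save_product` helper are assumptions, since the actual schema lives in database.sql. With `mysql.connector`, the same pattern applies, but the placeholders are `%s` instead of `?`.

```python
import sqlite3


def save_product(conn, name, price):
    """Insert one product row using a parameterized query."""
    cur = conn.cursor()
    # Placeholders keep scraped text from being interpreted as SQL.
    cur.execute("INSERT INTO products (name, price) VALUES (?, ?)", (name, price))
    conn.commit()
    cur.close()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
    conn.execute("CREATE TABLE products (name TEXT, price INTEGER)")
    save_product(conn, "Sourdough Bread", 200)
    print(conn.execute("SELECT name, price FROM products").fetchall())
    conn.close()
```

Passing values as parameters, rather than formatting them into the SQL string, matters for a scraper in particular because the inserted text comes from an external page.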

Possible Enhancements

Add cron or schedule for automation
Include UI or dashboard using Flask / Streamlit
Connect scraped data to analytics layer (Power BI / Pandas)

👤 Author

Poorna Chandra
Python | DevOps | Cloud
🔗 GitHub: github.com/JPOORNA
🌐 LinkedIn: linkedin.com/in/yourprofile

✨ This setup helped build confidence working with Docker containers, database integrations, and local scraping logic without relying on AWS or cloud platforms.

