# Automated Data Collection System (Python)

Web scraping project completed for the Master's in Data Science at Dalarna University.

This project demonstrates automated data collection techniques using Python web scraping. Three different scrapers were built to collect data from various online sources.
## Task 1

- File: `Task1.py`
- Output: `task1_output.txt`

Scrapes articles about Machine Learning and AI from:

- TechTarget
- IBM

Extracts headlines and full article content from both sources.
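The headline-and-body extraction pattern can be sketched as follows. The HTML and the `headline` / `article-body` selectors below are illustrative placeholders; the real TechTarget and IBM pages use different markup, so `Task1.py` would adapt the selectors per site.

```python
from bs4 import BeautifulSoup

# Illustrative stand-in for a fetched article page.
SAMPLE_HTML = """
<html><body>
  <article>
    <h1 class="headline">What Is Machine Learning?</h1>
    <div class="article-body">
      <p>Machine learning is a branch of AI.</p>
      <p>Models learn patterns from data.</p>
    </div>
  </article>
</body></html>
"""

def extract_article(html):
    """Return (headline, full_text) from an article page."""
    soup = BeautifulSoup(html, "html.parser")
    headline = soup.find("h1", class_="headline").get_text(strip=True)
    # Collect every paragraph inside the article body and join them.
    paragraphs = [p.get_text(strip=True) for p in soup.select("div.article-body p")]
    return headline, "\n".join(paragraphs)

headline, body = extract_article(SAMPLE_HTML)
print(headline)   # What Is Machine Learning?
```

In the actual script the HTML would come from `requests.get(url).text` rather than a string literal.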
## Task 2

- File: `task2.py`
- Output: `task2_products.csv`

Scrapes product information from the Books to Scrape website:

- Product names
- Prices
- Exports to CSV format
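A minimal sketch of the scrape-and-export step, using a snippet that mirrors the `product_pod` markup on books.toscrape.com (the full title lives in the link's `title` attribute, the price in `p.price_color`). The CSV is written to an in-memory buffer here for illustration; `task2.py` would write to `task2_products.csv` instead.

```python
import csv
import io

from bs4 import BeautifulSoup

# Trimmed-down copy of the Books to Scrape product listing markup.
SAMPLE_PAGE = """
<article class="product_pod">
  <h3><a title="A Light in the Attic" href="#">A Light in ...</a></h3>
  <p class="price_color">£51.77</p>
</article>
<article class="product_pod">
  <h3><a title="Tipping the Velvet" href="#">Tipping the ...</a></h3>
  <p class="price_color">£53.74</p>
</article>
"""

def parse_products(html):
    """Return a list of (name, price) tuples from a listing page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for pod in soup.select("article.product_pod"):
        name = pod.h3.a["title"]  # visible link text is truncated; title attr is full
        price = pod.select_one("p.price_color").get_text(strip=True)
        rows.append((name, price))
    return rows

products = parse_products(SAMPLE_PAGE)

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "price"])
writer.writerows(products)
csv_text = buf.getvalue()
```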
## Task 3

- File: `task3.py`
- Output: `task3_weather.txt`

Collects weather data from multiple sources:

- Wttr.in (weather API)
- TimeAndDate.com

Features fault-tolerant error handling that continues running even when sources fail.
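The fault-tolerant pattern boils down to wrapping each source in its own `try`/`except` so one failure never stops the loop. The sketch below uses stub fetchers (one deliberately broken) instead of live network calls; in `task3.py` the fetchers would call wttr.in and TimeAndDate.com via `requests`.

```python
def fetch_wttr():
    # Stand-in for a network call that fails (e.g. timeout, DNS error).
    raise ConnectionError("wttr.in unreachable")

def fetch_timeanddate():
    # Stand-in for a successful scrape.
    return "Stockholm: 3°C, overcast"

SOURCES = {
    "wttr.in": fetch_wttr,
    "timeanddate.com": fetch_timeanddate,
}

def collect_weather(sources):
    """Query every source; record errors instead of crashing."""
    results = {}
    for name, fetch in sources.items():
        try:
            results[name] = fetch()
        except Exception as exc:
            results[name] = f"ERROR: {exc}"  # log the failure, keep going
    return results

report = collect_weather(SOURCES)
```

Because each iteration catches its own exception, the working source still produces data even though the first one failed.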
## Technologies

- Python 3.x
- BeautifulSoup4 - HTML parsing
- Requests - HTTP requests
- CSV - data export
## Installation

```bash
pip install beautifulsoup4 requests
```

## Usage

```bash
python Task1.py
python task2.py
python task3.py
```

Each script generates its own output file:

- `task1_output.txt` - extracted articles
- `task2_products.csv` - product data (opens in Excel)
- `task3_weather.txt` - weather information
## Ethical Scraping

This project follows ethical web scraping practices:
- Respects robots.txt
- Uses appropriate User-Agent headers
- Implements delays between requests
- Only accesses publicly available data
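The first three practices can be sketched with the standard library's `urllib.robotparser` plus a fixed delay between requests. The robots.txt content, user-agent string, and delay value below are illustrative assumptions, not the project's actual configuration.

```python
import time
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt; a real scraper would fetch it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Descriptive User-Agent so site operators can identify the scraper.
HEADERS = {"User-Agent": "DalarnaStudentScraper/1.0 (educational use)"}
REQUEST_DELAY = 2.0  # seconds to sleep between requests, e.g. time.sleep(REQUEST_DELAY)

def allowed(url, user_agent="*"):
    """Check robots.txt before requesting a URL."""
    return rp.can_fetch(user_agent, url)

ok = allowed("https://example.com/catalogue/")
blocked = allowed("https://example.com/private/data")
```

In the scrapers, every `requests.get(url, headers=HEADERS)` call would be preceded by an `allowed(url)` check and followed by `time.sleep(REQUEST_DELAY)`.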
## Course Information

- Program: Master's in Data Science
- University: Dalarna University, Sweden
- Course: Data Collection and Quality

## Author

Mhmoud Ahmad
LinkedIn

This project is for educational purposes.
Repository: https://github.com/Mhmoud94/Automated-Data-Collection-System---Python