A collection of three data scraping projects showcasing different scraping methods: HTML scraping with `requests` and `BeautifulSoup`, and API scraping via RapidAPI. Each project covers data extraction, preprocessing, and storage of the results in a structured format for analysis.
Objective: Scrape UHD TV product listings from Flipkart for basic price and feature comparison.
- Source: flipkart.com
- Method: HTML scraping using `requests` and `BeautifulSoup`
- Pages Scraped: 10
- Total Rows: 240 products
- Columns: 7 (`Name`, `Price`, `Rating`, `Discount`, `Launch Year`, `Operating System`, `Delivery Type`)
- Output: CSV file
Key challenges (see the sketch below):
- Handling pagination across result pages
- Dealing with inconsistent HTML tags and missing data
- Setting request headers and spoofing the user agent
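A minimal sketch of this approach, assuming a hypothetical search URL and placeholder CSS selectors (Flipkart's real class names change often, so they are not reproduced here); it shows the pagination loop, the spoofed headers, and guards for missing fields:

```python
import csv
import time

import requests
from bs4 import BeautifulSoup

HEADERS = {
    # Spoof a desktop browser so the server returns the full HTML page
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Language": "en-US,en;q=0.9",
}

rows = []
for page in range(1, 11):  # 10 result pages
    # Hypothetical search URL and query parameters
    url = f"https://www.flipkart.com/search?q=uhd+tv&page={page}"
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    for card in soup.select("div.product-card"):  # placeholder selector
        name = card.select_one("div.title")       # placeholder selector
        price = card.select_one("div.price")      # placeholder selector
        # Missing fields are common, so guard every lookup
        rows.append({
            "Name": name.get_text(strip=True) if name else None,
            "Price": price.get_text(strip=True) if price else None,
        })
    time.sleep(1)  # be polite between page requests

with open("flipkart_uhd_tvs.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["Name", "Price"])
    writer.writeheader()
    writer.writerows(rows)
```

Only two of the seven columns are shown; the remaining fields follow the same guarded-lookup pattern.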
Objective: Extract book details for the dystopian genre to analyze popularity and patterns.
- Source: goodreads.com
- Method: HTML scraping using `requests` and `BeautifulSoup`
- Pages Scraped: 40
- Total Rows: 3,636 books
- Columns: 6 (`Book Title`, `Author`, `Ratings`, `Avg Rating`, `Score`, `Total Votes`)
- Output: CSV file
Key challenges (see the sketch below):
- Extracting data nested inside the HTML
- Managing a long pagination run without being blocked by the server
- Parsing numeric and textual data out of strings
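A similarly hedged sketch for the Goodreads scrape, assuming a hypothetical list URL and placeholder selectors; it illustrates pulling nested rating text out of each row, parsing numbers from strings with regular expressions, and pacing requests so the server does not block the run:

```python
import csv
import re
import time

import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

books = []
for page in range(1, 41):  # 40 list pages
    # Hypothetical list URL; substitute the actual Goodreads list being scraped
    url = f"https://www.goodreads.com/shelf/show/dystopia?page={page}"
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    for row in soup.select("tr.book-row"):  # placeholder selector
        title = row.select_one("a.bookTitle")        # placeholder selector
        author = row.select_one("a.authorName")      # placeholder selector
        rating_el = row.select_one("span.minirating")  # placeholder selector
        rating_text = rating_el.get_text(strip=True) if rating_el else ""

        # Parse numbers out of text such as "4.05 avg rating, 1,234,567 ratings"
        avg = re.search(r"([\d.]+)\s+avg", rating_text)
        count = re.search(r"([\d,]+)\s+ratings", rating_text)
        books.append({
            "Book Title": title.get_text(strip=True) if title else None,
            "Author": author.get_text(strip=True) if author else None,
            "Avg Rating": float(avg.group(1)) if avg else None,
            "Ratings": int(count.group(1).replace(",", "")) if count else None,
        })
    time.sleep(2)  # pause between pages to avoid being blocked

with open("dystopian_books.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["Book Title", "Author", "Avg Rating", "Ratings"])
    writer.writeheader()
    writer.writerows(books)
```

The remaining columns (`Score`, `Total Votes`) come from the same rows and are parsed with the same regex approach.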
Objective: Collect tweet and user metadata using RapidAPI to experiment with API data extraction.
- Source: RapidAPI - twitter154.p.rapidapi.com
- Method: API scraping using `requests` with API key authentication (see the sketch below)
- Total Rows: 83 tweets
- Columns: 27 (tweet_id, creation_date, text, media_url, video_url, user, language, favorite_count, retweet_count, reply_count, quote_count, retweet, views, timestamp, video_view_count, in_reply_to_status_id, quoted_status_id, binding_values, expanded_url, retweet_tweet_id, extended_entities, conversation_id, retweet_status, quoted_status, bookmark_count, source, community_note)
- Output: CSV or JSON
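A minimal sketch of the API call, assuming a hypothetical endpoint path, query parameters, and response key (the twitter154 listing on RapidAPI documents the real ones); the `X-RapidAPI-Key` and `X-RapidAPI-Host` headers carry the authentication, and `pandas.json_normalize` is used here to flatten the nested tweet objects before writing both CSV and JSON:

```python
import json
import os

import pandas as pd
import requests

API_KEY = os.environ["RAPIDAPI_KEY"]  # never hard-code the key in the script

url = "https://twitter154.p.rapidapi.com/search/search"  # hypothetical endpoint
headers = {
    "X-RapidAPI-Key": API_KEY,
    "X-RapidAPI-Host": "twitter154.p.rapidapi.com",
}
params = {"query": "data science", "limit": 100}  # hypothetical parameters

resp = requests.get(url, headers=headers, params=params, timeout=30)
resp.raise_for_status()
payload = resp.json()

# The key holding the tweet list depends on the API's response shape
tweets = payload.get("results", [])

# Flatten nested tweet/user objects into tabular columns and persist both formats
df = pd.json_normalize(tweets)
df.to_csv("tweets.csv", index=False)
with open("tweets.json", "w", encoding="utf-8") as f:
    json.dump(tweets, f, ensure_ascii=False, indent=2)
```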