This project automates the extraction of smartphone data (brand, model, color, memory, battery capacity, and price) from Flipkart using Python. The scraped data is cleaned, structured, and stored in CSV and Excel formats for analysis and comparison.
With new smartphones released continuously, manually comparing specifications and prices across online platforms is tedious and time-consuming. This project automates the process by scraping the latest mobile listings from Flipkart and organizing them into a structured table.
- Language: Python
- Libraries Used:
  - requests – to fetch webpage data
  - BeautifulSoup – to parse HTML content
  - re – for data cleaning using Regular Expressions
  - pandas – for data manipulation and export
- Output Files:
  - Mobiles Data.csv
  - Mobiles Data Excel.xlsx
- Fetch HTML data from Flipkart’s mobile listing page.
- Extract key attributes:
  - Brand and Model
  - Price
  - Color
  - ROM and GB
  - Display Size
  - Battery Capacity
- Clean the extracted text using regular expressions.
- Create a DataFrame with structured columns.
- Export the data to CSV and Excel formats.
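
The cleaning and export steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample strings in `RAW_ROWS`, the assumed title layout, the column names, and the helpers `first_number`, `clean_row`, and `export` are all hypothetical. Fetching and parsing are left out because the HTML selectors depend on Flipkart's current page markup.

```python
import re

# Hypothetical raw strings standing in for text pulled from listing cards.
RAW_ROWS = [
    {
        "title": "SAMSUNG Galaxy F05 (Twilight Blue, 64 GB)",
        "specs": "4 GB RAM | 64 GB ROM | 17.12 cm (6.74 inch) HD+ Display | 5000 mAh Battery",
        "price": "₹6,499",
    },
    {
        "title": "Motorola G35 5G (Leaf Green, 128 GB)",
        "specs": "4 GB RAM | 128 GB ROM | 17.07 cm (6.72 inch) Full HD+ Display | 5000 mAh Battery",
        "price": "₹9,999",
    },
]

# Assumed title layout: "<Brand> <Model> (<Color>, <N> GB)".
TITLE_RE = re.compile(r"^(\S+)\s+(.+?)\s*\(([^,()]+),\s*\d+\s*GB\)\s*$")


def first_number(pattern, text, cast=int):
    """Return the first captured number in text, or None if the pattern is absent."""
    match = re.search(pattern, text)
    return cast(match.group(1)) if match else None


def clean_row(raw):
    """Turn one raw listing (title, specs, price strings) into a flat record."""
    title = TITLE_RE.match(raw["title"].strip())
    brand, model, color = title.groups() if title else (None, None, None)
    digits = re.sub(r"\D", "", raw["price"])  # drop currency symbol and commas
    return {
        "Brand": brand,
        "Model": model,
        "Color": color.strip() if color else None,
        "ROM (GB)": first_number(r"(\d+)\s*GB\s*ROM", raw["specs"]),
        "Display (inch)": first_number(r"\((\d+(?:\.\d+)?)\s*inch\)", raw["specs"], float),
        "Battery (mAh)": first_number(r"(\d+)\s*mAh", raw["specs"]),
        "Price (INR)": int(digits) if digits else None,
    }


def export(records, csv_path="Mobiles Data.csv", xlsx_path="Mobiles Data Excel.xlsx"):
    """Write the records to CSV and Excel and return the DataFrame."""
    import pandas as pd  # imported here so the cleaning step runs without pandas

    df = pd.DataFrame(records)
    df.to_csv(csv_path, index=False)
    df.to_excel(xlsx_path, index=False)  # needs an Excel engine such as openpyxl
    return df


records = [clean_row(row) for row in RAW_ROWS]
```

Calling `export(records)` would then write both output files. Fields that a pattern fails to match are stored as `None`, so a listing with an unexpected layout yields an empty cell rather than stopping the run.
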
- Extend scraping to Amazon and the official Samsung site.
- Include ratings, reviews, and camera specifications.
- Visualize trends using Power BI or Tableau.
- Automate data refresh using schedulers or APIs.