GitHub - Cyklic/Web-Scraping-Assignment-with-Python

Web-Scraping-Assignment-with-Python

Completed on 29th November, 2024

The file Umoru_Leonard_Assign2.ipynb is a Jupyter notebook containing the web scraping script.

The website being scraped is https://www.mercadolibre.com/; the specific webpage is https://listado.mercadolibre.com.mx/computacion/accesorios-pc-gaming/audifonos/headsets_NoIndex_True. The data points scraped are the following product details:

Name: The name of the product.
Price: The price of the product in dollars.
Rating: The average rating of the product, out of 5.
Review Count: The number of people who left a review for the product.
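As a rough illustration of how these four fields might be pulled out of a listing page with bs4, here is a minimal sketch. The HTML snippet, the class names (`item`, `item-title`, and so on), and the helper names are all hypothetical; they are not the real Mercado Libre selectors and not necessarily what the notebook uses. Which placeholder ('N/A', 'NULL', or '0') goes with which field is also an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the tag and class names are invented for illustration
# and are NOT the real Mercado Libre selectors.
SAMPLE_HTML = """
<ol>
  <li class="item">
    <h2 class="item-title">Headset A</h2>
    <span class="item-price">$499</span>
    <span class="item-rating">4.5</span>
    <span class="item-reviews">(120)</span>
  </li>
  <li class="item">
    <h2 class="item-title">Headset B</h2>
    <span class="item-price">$1299</span>
  </li>
</ol>
"""


def text_or_default(card, selector, default):
    """Return the stripped text of the first match, or a placeholder if absent."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else default


def parse_cards(html):
    """Collect one dict per product card with the four data points."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("li.item"):
        rows.append(
            {
                # Placeholder-to-field mapping is an assumption.
                "Name": text_or_default(card, ".item-title", "N/A"),
                "Price": text_or_default(card, ".item-price", "NULL"),
                "Rating": text_or_default(card, ".item-rating", "NULL"),
                "Review Count": text_or_default(card, ".item-reviews", "0"),
            }
        )
    return rows


rows = parse_cards(SAMPLE_HTML)
for row in rows:
    print(row)
```

The second card has no rating or review count, so those fields fall back to the placeholders instead of raising an error.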

I explained the ethics and best practices of web scraping. I imported the necessary libraries: requests, bs4, pandas, seaborn, and matplotlib.pyplot. I made an HTTP request and created a soup object to access the website, then tested extracting data from the first page of the webpage. Using a for loop that handles pagination, I gathered all the data I wanted from 3 pages. I also replaced empty cells with 'N/A', 'NULL', and '0'. I stored the data in a dataframe and cleaned it by removing the dollar sign and parentheses, changing ',' to '.', and converting the features to appropriate data types. I also dropped duplicate records. I performed some basic analysis and visualization on the clean data to show the headsets with the highest and lowest prices, ratings, and review counts. I also showed summary statistics and the correlation between features, and plotted graphs to show the relationships between features. Finally, I saved the data to a CSV file.
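The cleaning steps described above could look roughly like the following pandas sketch. The sample rows are made up (not real scraped data), the function name `clean_headsets` is hypothetical, and the code assumes the comma in a price string is a decimal separator; if it is a thousands separator, it should be removed rather than replaced with '.'.

```python
import pandas as pd

# Hypothetical raw rows standing in for scraped values; not real site data.
raw = pd.DataFrame(
    {
        "Name": ["Headset A", "Headset B", "Headset A", "Headset C"],
        "Price": ["$499,50", "$1299", "$499,50", "N/A"],
        "Rating": ["4.5", "4.8", "4.5", "NULL"],
        "Review Count": ["(120)", "(35)", "(120)", "0"],
    }
)


def clean_headsets(df):
    """Strip symbols, convert column types, and drop duplicate records."""
    out = df.copy()
    # Remove the dollar sign and change ',' to '.' (decimal comma is assumed).
    out["Price"] = (
        out["Price"]
        .str.replace("$", "", regex=False)
        .str.replace(",", ".", regex=False)
    )
    # Strip the parentheses around the review count.
    out["Review Count"] = out["Review Count"].str.replace(r"[()]", "", regex=True)
    # Convert to numeric types; placeholders such as 'N/A' or 'NULL' become NaN.
    out["Price"] = pd.to_numeric(out["Price"], errors="coerce")
    out["Rating"] = pd.to_numeric(out["Rating"], errors="coerce")
    out["Review Count"] = (
        pd.to_numeric(out["Review Count"], errors="coerce").fillna(0).astype(int)
    )
    # Drop duplicated records and renumber the rows.
    return out.drop_duplicates().reset_index(drop=True)


clean = clean_headsets(raw)
print(clean)
print(clean.dtypes)
```

From here, `clean.describe()` gives summary statistics, `clean.corr(numeric_only=True)` gives the correlation between numeric features (pandas 1.5 or later), and `clean.to_csv("headsets.csv", index=False)` writes the result to a CSV file; the file name is only an example.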

Instructions and dependencies for running the script:

Ensure you have the necessary libraries installed; if not, install them before importing. The libraries are requests, bs4 (installed as beautifulsoup4), pandas, seaborn, and matplotlib. Install commands: pip install requests; pip install beautifulsoup4; pip install pandas; pip install matplotlib seaborn
Run the notebook cells in order, from top to bottom.
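Before running the notebook, a quick check like the one below can confirm that the dependencies are importable. This is only a convenience sketch using the standard library; the helper name `missing_modules` is hypothetical and is not part of the notebook.

```python
import importlib.util

# Import names for the listed libraries; beautifulsoup4 is imported as bs4.
REQUIRED = ["requests", "bs4", "pandas", "seaborn", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```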

Repository files (3 commits):

README.md
Umoru_Leonard_Assign2.ipynb
Umoru_Leonard_headset_data.csv
