🕸️ Web Scraper 💻

Web scraping is the practice of extracting content and data from a website using bots. Unlike screen scraping, which copies only the pixels displayed onscreen, web scraping retrieves the underlying HTML and, with it, the data embedded in the page, which often originates in a database. A scraper can then copy that content to another location. The scripts in this repository search a page's source code for the specific elements you define and extract their content.
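To make the "search the source for specific parts" idea concrete, here is a minimal sketch. It uses Python's built-in html.parser as a stand-in so it runs without extra installs; the scripts in this repository use Beautiful Soup instead. The sample HTML, the class names, and the helper extract_by_class are all hypothetical and only illustrate the idea.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element that carries a given CSS class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.results = []
        self._depth = 0      # > 0 while we are inside a matching element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1  # nested tag inside a match
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:  # the matching element just closed
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_by_class(html, target_class):
    """Return the text of each element whose class list contains target_class."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    return parser.results


# Hypothetical page fragment; real pages use their own markup and class names.
SAMPLE_HTML = """
<div class="item"><a class="item-title" href="#">Example GPU</a>
  <span class="price">$199.99</span></div>
<div class="item"><a class="item-title" href="#">Example <b>SSD</b></a>
  <span class="price">$89.50</span></div>
"""

print(extract_by_class(SAMPLE_HTML, "item-title"))
print(extract_by_class(SAMPLE_HTML, "price"))
```

With Beautiful Soup the same lookup is usually a one-liner per element type, which is one reason the project relies on it.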

⚠️ Beware

Before scraping any website, check its terms and conditions page for explicit rules about scraping, and follow any that you find. If there are none, deciding whether scraping is acceptable is more of a guessing game.
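Besides the human-readable terms page, many sites publish a machine-readable robots.txt file. This project does not mention it, so treat the following as an optional complement rather than part of the original scripts. The sketch uses the standard library's urllib.robotparser on a made-up rule set; the rules, the example.com URLs, and the user agent name are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. For a real site you would call
# parser.set_url("https://<site>/robots.txt") followed by parser.read().
EXAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /private/",
    "Crawl-delay: 2",
]

parser = RobotFileParser()
parser.parse(EXAMPLE_RULES)

USER_AGENT = "my-scraper"  # hypothetical bot name

allowed_public = parser.can_fetch(USER_AGENT, "https://example.com/products")
allowed_private = parser.can_fetch(USER_AGENT, "https://example.com/private/data")
delay = parser.crawl_delay(USER_AGENT)

print("products page allowed:", allowed_public)
print("private page allowed:", allowed_private)
print("requested delay between requests (seconds):", delay)
```

A robots.txt file does not replace the terms and conditions; it only tells well-behaved bots which paths the site owner prefers them to avoid.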

😔Note

Sadly, not all websites allow or support web scraping.

📚Resources Used

Newegg eCommerce Online Store. Newegg Commerce, Inc. is a company that sells computer hardware and consumer gadgets online. Its headquarters are in the City of Industry, California.

Newegg.eCommerce.mp4

National Weather Service. The National Weather Service is a federal government agency responsible for delivering weather forecasts, hazardous weather warnings, and other weather-related services to organizations and the general public for protection, safety, and general information.

NWS.mp4

🛠️Tools & Languages Used

Anaconda Version 4.10.1
Python Version 3.8.8
Beautiful Soup - a Python package for parsing HTML and XML documents.
urllib.request - an extensible library for opening URLs.
requests - a Python HTTP library, used in NWS-webscraper.py.

🔆 Best Practices when Web Scraping

Never scrape more frequently than you need to.
Consider caching the content you scrape so that it’s only downloaded once.
Build pauses into your code using functions like time.sleep() to keep from overwhelming servers with too many requests too quickly.
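The practices above can be combined in one small helper. This is only a sketch under stated assumptions: the class name PoliteFetcher, its parameters, and the fake_fetch stand-in are hypothetical and are not taken from the repository's scripts. It caches each page in memory so a URL is downloaded once, and it uses time.sleep() to keep a minimum gap between real requests.

```python
import time
import urllib.request


def fetch_with_urllib(url):
    """Download a page with the standard library (not called in the demo below)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class PoliteFetcher:
    """Fetch pages with an in-memory cache and a pause between real requests."""

    def __init__(self, fetch=fetch_with_urllib, delay_seconds=1.0):
        self._fetch = fetch
        self._delay = delay_seconds
        self._cache = {}
        self._last_request = None  # time.monotonic() of the last real request

    def get(self, url):
        # Cached pages are returned immediately and never re-downloaded.
        if url in self._cache:
            return self._cache[url]
        # Sleep only as long as needed to respect the minimum gap.
        if self._last_request is not None:
            wait = self._delay - (time.monotonic() - self._last_request)
            if wait > 0:
                time.sleep(wait)
        body = self._fetch(url)
        self._last_request = time.monotonic()
        self._cache[url] = body
        return body


# Demo with a fake fetch function so the example runs offline.
def fake_fetch(url):
    return "<html>content of " + url + "</html>"


fetcher = PoliteFetcher(fetch=fake_fetch, delay_seconds=0.1)
first = fetcher.get("https://example.com/a")   # real (fake) request
again = fetcher.get("https://example.com/a")   # served from the cache
other = fetcher.get("https://example.com/b")   # waits about 0.1 s first
print(first == again, len(other))
```

An in-memory cache disappears when the script exits; saving pages to disk would be the next step if the same content is needed across runs.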

🔌 What to Expect

Script Results in cmd.exe

Results in the Products.csv file
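For readers who cannot see the screenshot, the following sketch shows one way scraped rows could end up in a file like Products.csv, using the standard csv module. The column names and sample rows are hypothetical, and ecommerce-webscraper.py may lay out its file differently.

```python
import csv
import io

# Hypothetical column layout; the real Products.csv may use other headers.
FIELDNAMES = ["brand", "product_name", "price"]


def write_products(rows, file_obj):
    """Write a header line followed by one CSV line per product dictionary."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows standing in for scraped data.
sample_rows = [
    {"brand": "ExampleBrand", "product_name": "Example GPU, 8GB", "price": "$199.99"},
    {"brand": "OtherBrand", "product_name": "Example SSD 1TB", "price": "$89.50"},
]

# Written to an in-memory buffer here so the sketch leaves no file behind.
buffer = io.StringIO()
write_products(sample_rows, buffer)
print(buffer.getvalue())
```

To write a real file, pass an opened file instead of the buffer, for example open("Products.csv", "w", newline="", encoding="utf-8"). The csv module quotes fields that contain commas, so product names such as "Example GPU, 8GB" stay in one column.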

Dataframe Display in Terminal for NWS-webscraper.py

This code was built with ❤️ and 2 cups of Coffee☕

octocatblain/Webscraper
