Washington Post Web Scraper

Metadata

Project

  • Title: Washington Post Web Scraper
  • Difficulty:
    • Beginner
    • Intermediate
    • Advanced
  • Scale:
    • Small
    • Medium
    • Big

Repository Description

This project uses Python to scrape newspaper article content from The Washington Post. The article used here is "87 percent of websites are tracking you. This new tool will let you run a creepiness check", and the scraped items are the article's title, author, date, and body. The original idea comes from "Web scraper to get news article content" by DevProjects on Codementor.
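The general approach is the usual requests plus BeautifulSoup pattern. The sketch below is only illustrative: the URL placeholder, the User-Agent header, and the tag/attribute selectors are assumptions rather than the repository's actual code, and the real selectors have to be read from the article page's HTML.

import requests
from bs4 import BeautifulSoup

# Placeholder; the repository targets the specific article named above.
ARTICLE_URL = "https://www.washingtonpost.com/<path-to-article>"

# Some sites serve different or blocked responses to the default requests User-Agent.
response = requests.get(ARTICLE_URL, headers={"User-Agent": "Mozilla/5.0"})
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Illustrative selectors; inspect the page source to find the real tags and classes.
title_tag = soup.find("h1")
author_tag = soup.find("a", attrs={"rel": "author"})
date_tag = soup.find("time")
paragraphs = soup.find_all("p")

print("Title :", title_tag.get_text(strip=True) if title_tag else "not found")
print("Author:", author_tag.get_text(strip=True) if author_tag else "not found")
print("Date  :", date_tag.get_text(strip=True) if date_tag else "not found")
print("Body  :", " ".join(p.get_text(strip=True) for p in paragraphs)[:300])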

Installation

Tools

Description

Check whether Python is installed by running python --version (or the short form python -V) in the CLI. Clone the project repository from GitHub to your local machine with Git. Use py -m pip install package_name (or python -m pip install package_name on systems without the py launcher) to install the necessary Python libraries; see the pip documentation to learn more about pip install. Check the top of the .py script file for the list of required libraries. For example, you will need the requests and beautifulsoup4 packages if you see the following lines at the top of the script file:

import requests
from bs4 import BeautifulSoup
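
In that case, the two packages could be installed together, for example:

py -m pip install requests beautifulsoup4

Note that BeautifulSoup is published on PyPI as beautifulsoup4 but imported as bs4.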

If pip fails to locate a package, you can find it on the Python Package Index (PyPI). Run the script from a CLI with python file_name.py, or use an IDE such as VS Code to run it; there is usually a [Run] button in the top-right corner of the opened script file.

Credits

Contributors

References

 

1st Completion Date: Oct 23, 2022
