This repository contains the code from this post on the devinsimplewords blog.
The post explains how to use the beautifulsoup package, and the repository contains two Python files.
The first one is parse_html_file.py; it analyzes the HTML file in the html_file folder.
The second one is web_scraping.py; it performs a simple web scraping task on the 'HelloWorld' Wikipedia page.
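As a rough idea of the kind of parsing these scripts rely on, here is a minimal sketch with beautifulsoup. It is not the code from either file: the sample markup, the SAMPLE_HTML name, and the summarize helper are assumptions made for illustration, and it uses Python's built-in "html.parser" backend so nothing beyond beautifulsoup itself is required.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for this sketch.
# The repository's real input file lives in the html_file folder.
SAMPLE_HTML = """
<html>
  <head><title>Hello World</title></head>
  <body>
    <h1 id="main-title">Hello World</h1>
    <p class="intro">First paragraph.</p>
    <p>Second paragraph with a <a href="https://example.com">link</a>.</p>
  </body>
</html>
"""


def summarize(html: str) -> dict:
    """Collect the title, first heading, paragraph texts, and link targets."""
    # "html.parser" is the parser bundled with the standard library.
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.title.string,
        "heading": soup.find("h1").get_text().strip(),
        # get_text() flattens nested tags such as <a> into plain text.
        "paragraphs": [p.get_text().strip() for p in soup.find_all("p")],
        # href=True keeps only <a> tags that actually carry an href attribute.
        "links": [a["href"] for a in soup.find_all("a", href=True)],
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_HTML).items():
        print(f"{key}: {value}")
```

The same calls should work on a file read from disk or on a page body fetched over HTTP, since BeautifulSoup only needs the markup as a string.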
Steps to run the scripts:
- Prepare a virtualenv and install the dependencies listed in requirements.txt using the following command:
pip install -r requirements.txt
If you don't know how to create and activate a virtualenv, please check this post.
- Run one of the scripts using the following command:
python parse_html_file.py
or
python web_scraping.py