simpleCrawler

A simple Python script that uses the bs4 (Beautiful Soup) package to extract the text from the body of a web page, without HTML tags or JavaScript.

Requirements

Install the following packages:

pip install requests
pip install bs4

The Selenium crawler also needs the selenium package:

pip install selenium

You also have to download the driver from here and unzip it into a directory named driver.

Run python simple.py www.google.com to test it.
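For readers who want to see the idea before running it, here is a minimal sketch of how a script like simple.py could work. It is not the repository's actual code: the function names extract_text, fetch_html and main are hypothetical, and prepending http:// to a bare host such as www.google.com is an assumption made so that requests accepts the argument.

```python
import sys

import requests
from bs4 import BeautifulSoup


def extract_text(html):
    """Return the readable body text of an HTML document (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove elements whose content is code or styling rather than page text.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    # Fall back to the whole document if the page has no <body> element.
    root = soup.body or soup
    # Put each text fragment on its own line and trim surrounding whitespace.
    return root.get_text(separator="\n", strip=True)


def fetch_html(url):
    """Download a page. Assumption: a bare host gets http:// prepended."""
    if "://" not in url:
        url = "http://" + url
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def main(argv):
    if len(argv) < 2:
        print("usage: python simple.py <url>")
        return
    print(extract_text(fetch_html(argv[1])))


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the parsing in extract_text, separate from the download, means it can be tried on any HTML string without network access.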

Run python simpleSelenium.py www.google.com to test it with the Google Chrome driver.
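The Selenium variant can be sketched in a similar way. This is an illustration under stated assumptions, not the contents of simpleSelenium.py: the helper names with_scheme and fetch_rendered_html are hypothetical, the driver path driver/chromedriver is a guess at what the unzipped driver directory contains (on Windows the file would be chromedriver.exe), and the calls follow the Selenium 4 API.

```python
def with_scheme(url):
    """Prepend http:// to a bare host such as www.google.com (assumption)."""
    if "://" not in url:
        return "http://" + url
    return url


def fetch_rendered_html(url, driver_path="driver/chromedriver"):
    """Load a page in headless Chrome and return its HTML after scripts have run.

    driver_path is a hypothetical default; point it at the unzipped driver.
    """
    # Imported here so with_scheme stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.add_argument("--headless")  # do not open a browser window
    driver = webdriver.Chrome(
        service=Service(executable_path=driver_path), options=options
    )
    try:
        driver.get(with_scheme(url))
        return driver.page_source
    finally:
        # Always shut the browser down, even if loading the page fails.
        driver.quit()
```

The returned HTML can then go through the same bs4 text extraction as the plain requests version. The difference is that the browser has already executed the page's JavaScript, so dynamically inserted content is included.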
