A Python program that scrapes information from Wikipedia articles, such as hyperlink and image counts. It relies on BeautifulSoup and Urllib2, and was written to try out these libraries and some basic web page scraping.
Simply run it in your preferred Python IDE.
Requires Python 3.x.x
- BeautifulSoup - Filtering and extracting of webpage content
- Urllib2 - Gathering webpage data and getting the HTML to be processed
- Python - IDE and language
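
As a rough illustration of how the two libraries listed above fit together, here is a minimal sketch that counts the hyperlinks and images in a page. It is not this project's actual code. The function names `fetch_html` and `count_links_and_images` and the `User-Agent` string are hypothetical, and the sketch assumes `urllib.request`, the Python 3 successor to `urllib2`, since `urllib2` itself only exists in Python 2.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    # Assumption: a descriptive User-Agent header is sent with the request.
    request = Request(url, headers={"User-Agent": "wiki-scraper-example/0.1"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")


def count_links_and_images(html):
    """Return (hyperlink_count, image_count) for an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # Count only <a> tags that actually carry an href attribute.
    links = soup.find_all("a", href=True)
    images = soup.find_all("img")
    return len(links), len(images)
```

To try it on a live article, one could call `count_links_and_images(fetch_html("https://en.wikipedia.org/wiki/Web_scraping"))`. Keeping the download and the parsing in separate functions lets the counting step be checked against a small HTML string without network access.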
Fhoughton - Initial work
See also the list of contributors who participated in this project.
This project is licensed under the GNU General Public License. See the LICENSE.md file for details.
- Based on an article by DigitalOcean