This is a web application built with the Python Django framework; the backend uses Django 3.0.4. It is a collection of small programs that extract data from multiple websites and package it into a usable form with BeautifulSoup, a Python package for parsing HTML and XML documents. Once you retrieve the raw HTML of a site, you can select and extract data with BeautifulSoup, which parses raw HTML strings and produces an object that mirrors the HTML document's structure.
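To make the parsing step concrete, here is a minimal sketch of selecting and extracting data from raw HTML with BeautifulSoup. The HTML snippet, the `item` class name, and the `extract_links` helper are assumptions made up for illustration; they are not taken from this project's code.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, standing in for the raw markup fetched from a real site.
RAW_HTML = """
<html>
  <head><title>Sample listing</title></head>
  <body>
    <ul id="items">
      <li class="item"><a href="/a">First</a></li>
      <li class="item"><a href="/b">Second</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(raw_html):
    """Return (text, href) pairs for the first link in each <li class="item">."""
    # "html.parser" is Python's built-in parser, so no extra parser is needed.
    soup = BeautifulSoup(raw_html, "html.parser")
    pairs = []
    for item in soup.find_all("li", class_="item"):
        link = item.find("a")
        if link is not None and link.has_attr("href"):
            pairs.append((link.get_text(strip=True), link["href"]))
    return pairs


# The parsed object mirrors the document tree, so tags are reachable by name.
soup = BeautifulSoup(RAW_HTML, "html.parser")
page_title = soup.title.string
links = extract_links(RAW_HTML)
```

Real pages use their own tag and class names, so the arguments to `find_all` would need to be adapted to each site.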
- This project is for study purposes only.
- Check a website's Terms and Conditions before scraping it, and read the statements about legal use of the data.
- Do not request data from the website too aggressively, and ensure that your program behaves in a reasonable manner.
- Revisit the website periodically and update the code as needed, since the layout of the site may change.
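One way to avoid requesting data too aggressively is to enforce a minimum delay between consecutive requests. The sketch below is one possible approach that uses only the standard library. The `Throttle` class and the two-second interval are hypothetical and not part of this project; a suitable delay depends on the site being scraped.

```python
import time


class Throttle:
    """Keep consecutive requests at least min_interval seconds apart.

    Hypothetical helper for illustration; clock and sleep are injectable
    so the behaviour can be checked without actually waiting.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to respect min_interval, then record the time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed value: wait at least two seconds between requests.
throttle = Throttle(min_interval=2.0)
```

Calling `throttle.wait()` immediately before each request keeps the request rate bounded regardless of how quickly the rest of the program runs.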
- Install Python 3.8 if it is not already installed.
- Download and extract the application.
- Import the project into an IDE of your choice.
- Install BeautifulSoup.
pip install beautifulsoup4
- Now run the application.
python manage.py runserver
- Browse through the app. Happy coding!