I'm crawling the Wikipedia website and storing the scraped pages in a database (PostgreSQL). My future plan is to use this database to build a full stack app.
P.S.: Docker is required.
- clone the repo
git clone https://github.com/cs-fedy/wikipedia-crawler
- run the following to start the db:
docker compose up -d
- install virtualenv using pip:
sudo pip install virtualenv
- create a new virtualenv:
virtualenv venv
- activate the virtualenv:
source venv/bin/activate
- install requirements:
pip install -r requirements.txt
- run the script and enjoy:
python scraper.py
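Before running the script, the database credentials need to reach it somehow; with python-dotenv in the requirements, a `.env` file is the likely mechanism. Below is a minimal sketch of assembling a libpq-style connection string from environment variables. The variable names (`DB_HOST`, `DB_PORT`, etc.) are assumptions for illustration, not necessarily the names the repo actually uses:

```python
import os

# Assumed variable names -- check the repo's .env example for the real ones.
os.environ.setdefault("DB_HOST", "localhost")
os.environ.setdefault("DB_PORT", "5432")
os.environ.setdefault("DB_NAME", "wikipedia")
os.environ.setdefault("DB_USER", "postgres")
os.environ.setdefault("DB_PASSWORD", "postgres")

def build_dsn() -> str:
    """Assemble a libpq-style DSN string from environment variables."""
    return (
        f"host={os.environ['DB_HOST']} "
        f"port={os.environ['DB_PORT']} "
        f"dbname={os.environ['DB_NAME']} "
        f"user={os.environ['DB_USER']} "
        f"password={os.environ['DB_PASSWORD']}"
    )

print(build_dsn())
```

In the real script, python-dotenv's `load_dotenv()` would populate these variables from `.env`, and the resulting DSN (or equivalent keyword arguments) would be passed to `psycopg2.connect()`.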
- requests: Python HTTP for Humans.
- BeautifulSoup: Beautiful Soup is a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.
- python-dotenv: Adds .env support to your Django/Flask apps in development and deployment.
- psycopg2: Python-PostgreSQL database adapter.
- tabulate: Pretty-print tabular data.
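To illustrate how the scraping side of these libraries fits together, here is a minimal sketch of using BeautifulSoup to collect internal `/wiki/` article links from a page. The HTML snippet and the function name are illustrative only, not taken from the repo's actual scraper.py:

```python
from bs4 import BeautifulSoup

def extract_wiki_links(html: str) -> list[str]:
    """Return hrefs of internal Wikipedia article links, skipping special pages."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        href = a["href"]
        # Internal article links start with /wiki/; namespaced pages
        # (File:, Category:, Special:, ...) contain a colon.
        if href.startswith("/wiki/") and ":" not in href:
            links.append(href)
    return links

sample = """
<p>See <a href="/wiki/Web_crawler">Web crawler</a>,
<a href="/wiki/File:Logo.png">a file page</a>, and
<a href="https://example.com">an external site</a>.</p>
"""
print(extract_wiki_links(sample))  # ['/wiki/Web_crawler']
```

In a full crawl loop, the page HTML would come from `requests.get(...)` and each discovered link would be queued for a later visit and stored via psycopg2.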
created at 🌙 with 💻 and ❤ by f0ody
- Fedi Abdouli - wikipedia crawler
- my twitter account FediAbdouli
- my instagram account f0odyy