
Wikipedia crawler:

I'm crawling the Wikipedia website and storing the pages in a database (PostgreSQL, most likely). My future plan is to use this database to build a full-stack app.

P.S.: Docker is required.

Installation:

  1. Clone the repo: git clone https://github.com/cs-fedy/wikipedia-crawler
  2. Run docker compose up -d to start the database (a quick connection check is sketched after this list).
  3. Install virtualenv using pip: sudo pip install virtualenv
  4. Create a new virtualenv: virtualenv venv
  5. Activate the virtualenv: source venv/bin/activate
  6. Install the requirements: pip install -r requirements.txt
  7. Run the script and enjoy: python scraper.py
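
Before running the scraper, a quick way to confirm the Dockerized database is up is a short psycopg2 connection check. This is a minimal sketch, assuming credentials are read from a .env file; the variable names (DB_HOST, DB_NAME, etc.) are placeholders and may not match what scraper.py actually expects:

    import os

    import psycopg2
    from dotenv import load_dotenv

    # Read connection settings from .env (variable names here are assumptions).
    load_dotenv()

    conn = psycopg2.connect(
        host=os.getenv("DB_HOST", "localhost"),
        port=os.getenv("DB_PORT", "5432"),
        dbname=os.getenv("DB_NAME", "wikipedia"),
        user=os.getenv("DB_USER", "postgres"),
        password=os.getenv("DB_PASSWORD", ""),
    )

    # A successful query here means docker compose brought the DB up correctly.
    with conn.cursor() as cur:
        cur.execute("SELECT version();")
        print(cur.fetchone()[0])

    conn.close()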

Used tools (a combined usage sketch follows this list):

  1. requests: Python HTTP for Humans.
  2. BeautifulSoup: a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.
  3. python-dotenv: adds .env support to Django/Flask apps in development and deployment.
  4. psycopg2: Python-PostgreSQL database adapter.
  5. tabulate: pretty-print tabular data.
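
To show how these tools fit together, here is a minimal sketch (not the actual scraper.py logic): it fetches one Wikipedia article with requests, extracts the title and internal links with BeautifulSoup, and pretty-prints them with tabulate. The URL and User-Agent string are arbitrary examples:

    import requests
    from bs4 import BeautifulSoup
    from tabulate import tabulate

    # Fetch a single article (example URL; any article works).
    url = "https://en.wikipedia.org/wiki/Web_crawler"
    resp = requests.get(url, headers={"User-Agent": "wikipedia-crawler-demo"})
    resp.raise_for_status()

    # Parse the HTML and pull the title plus the first internal links.
    soup = BeautifulSoup(resp.text, "html.parser")
    title = soup.find("h1").get_text(strip=True)
    rows = [
        [a.get_text(strip=True), a["href"]]
        for a in soup.select("a[href^='/wiki/']")[:10]
    ]

    print(title)
    print(tabulate(rows, headers=["link text", "href"]))

In the real crawler, links like these would feed the crawl queue, and psycopg2 would persist the rows to PostgreSQL instead of printing them.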

Author:

created at 🌙 with 💻 and ❤ by f0ody
