mycrawler

A project on crawling that demonstrates the concept of collecting links and pages from the web.

The script prompts for a starting URL and collects the links found in that page's HTML. The crawler then keeps collecting the "links of the links" indefinitely.
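The idea can be sketched as a breadth-first traversal. This is a minimal illustration using only the standard library, not the code in run.py. The names LinkParser, extract_links, fetch and crawl are hypothetical, and the max_pages limit is an assumption added so that the example terminates (the actual crawler runs indefinitely).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the absolute http(s) links found in html, resolved against base_url."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop the "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def fetch(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl from start_url; returns the visited URLs in order."""
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for link in extract_links(html, url):
            if link not in seen:  # the "links of the links", each queued once
                seen.add(link)
                queue.append(link)
    return visited
```

Passing fetch_page as a parameter makes it possible to try the traversal without network access, for example with a function that reads pages from a dictionary.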

Installation on Linux

Python 3 and virtualenv must be installed.

Create a virtualenv.

$ virtualenv -p python3 venv

Activate the environment.

$ source venv/bin/activate

Install the Python libraries into the environment using the "requirements.txt" file.

$ pip install -r requirements.txt

Run the crawler.

$ python run.py

If you have any questions, get in touch: yoshiodeveloper@gmail.com
