mycrawler

A crawling project that demonstrates the concept of collecting links and pages from the web.

The script prompts for a starting URL and collects the links found in that page's HTML. The crawler then keeps collecting the "links of the links" indefinitely.
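The idea described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code in mycrawler.py or myparser.py. The names `LinkParser`, `extract_links`, and `crawl`, the `fetch` callable, and the `max_pages` limit are all assumptions made for the example; the limit in particular is added so the sketch terminates, whereas the crawler described here runs indefinitely.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) URLs found in html, resolved against base_url."""
    parser = LinkParser()
    parser.feed(html)
    found = []
    for href in parser.links:
        # Resolve relative links and drop the #fragment part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            found.append(absolute)
    return found


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl; fetch(url) must return the page HTML as a string.

    max_pages is a hypothetical safety limit, not part of the original project.
    """
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:  # queue each URL only once
                seen.add(link)
                queue.append(link)
    return visited
```

Passing `fetch` as a parameter keeps the sketch independent of any HTTP library; in practice it could wrap `urllib.request.urlopen` or whichever client is listed in "requirements.txt". The `seen` set is what prevents the "links of the links" loop from revisiting the same page.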

Installation on Linux

Python 3 and virtualenv must be installed.

Create a virtualenv.

$ virtualenv -p python3 venv

Activate the environment.

$ source venv/bin/activate

Install the Python libraries into the environment using the "requirements.txt" file.

$ pip install -r requirements.txt

Run the crawler.

$ python run.py

If you have any questions, get in touch: yoshiodeveloper@gmail.com
