carloocchiena/python_url_crawler

python_url_crawler

A script that, starting from a web page, iterates through all of its links and appends them to a list. It is a rough way to collect every page of a website.
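As an illustration of the idea (not the code in main.py), here is a minimal sketch of a breadth-first link crawl using only the standard library. The names `LinkParser`, `fetch_html`, `crawl` and the `max_pages` cap are assumptions made for this example. The sketch follows every http(s) link it finds and does not filter by domain.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Hypothetical default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first walk from start_url; returns URLs in discovery order."""
    found = [start_url]
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        page = queue.popleft()
        visited += 1
        try:
            html = fetch(page)
        except OSError:
            continue  # unreachable page: skip it and keep crawling
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any "#fragment" part.
            link, _ = urldefrag(urljoin(page, href))
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                found.append(link)
                queue.append(link)
    return found
```

Passing the fetcher as a parameter keeps the crawl logic testable without network access: any function that maps a URL to an HTML string can stand in for `fetch_html`.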

old_main is a raw first version I put together in an hour from a Stack Overflow question;

main.py is a better version I wrote from scratch, with less clutter in the code. It seems to work decently.

Note that the script only collects URLs within the starting domain, but this can be changed easily by tweaking the "cleaner" function.
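To show what such a tweak could look like, here is a hedged sketch of a domain filter. The README names a "cleaner" function, but its real signature is not shown here, so the parameters `urls`, `base_url` and `same_domain_only` below are assumptions for illustration only.

```python
from urllib.parse import urlparse


def cleaner(urls, base_url, same_domain_only=True):
    """Deduplicate urls, optionally keeping only those on base_url's host.

    Hypothetical signature; the function in main.py may differ.
    """
    base_host = urlparse(base_url).netloc.lower()
    kept = []
    seen = set()
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript: and similar links
        if same_domain_only and parsed.netloc.lower() != base_host:
            continue  # external link
        if url not in seen:
            seen.add(url)
            kept.append(url)
    return kept
```

With `same_domain_only=False` the same function keeps external links as well, which is the kind of configuration change described above. Note that this comparison treats subdomains (for example `blog.example.com`) as different hosts.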
