themooer1/LinkScanner

LinkScanner

Follow links from website to website.

Usage

LinkScanner reads a text file of newline-separated links and locates the links on each of those pages. It then either writes the links it found to links.txt or keeps following them, page by page, down to a set recursion depth.
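The behaviour described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not LinkScanner's actual implementation: `LinkCollector`, `extract_links`, `scan`, and the `fetch` callable are hypothetical names introduced here, and the sketch assumes each iteration rescans only the links discovered in the previous one.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they were found on.
                absolute = urljoin(self.base_url, value)
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def extract_links(base_url, html):
    """Return the http(s) links found in one page of HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def scan(start_urls, fetch, iterations=3):
    """Follow links for `iterations` rounds.

    `fetch` is any callable mapping a URL to its HTML text (an assumption
    of this sketch, so the logic can run without network access).
    """
    seen = set(start_urls)
    frontier = list(start_urls)
    for _ in range(iterations):
        next_frontier = []
        for url in frontier:
            for link in extract_links(url, fetch(url)):
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return seen
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; in practice it could wrap `urllib.request.urlopen` or similar.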

Import

import LinkScanner

Once imported, it can be used like this:

if __name__ == '__main__':
    linkscanner = LinkScanner.Scanner(iterations=3, maxthreads=20, siteList='pathToInputLinks.txt')
    linkscanner.startScan()  # run the scan
    linkscanner.save()       # save the results

The line that constructs the Scanner is the most important one: it sets the scanner's options and provides the input. Most arguments are optional, but setting them explicitly is recommended.

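| Option     | Necessary | Default      | Purpose                                                    |
|------------|-----------|--------------|------------------------------------------------------------|
| iterations | no        | 3            | How many times the results will be scanned before output.  |
| maxthreads | no        | 8            | The number of worker threads allowed to run at once.       |
| siteList   | yes       | siteList.txt | The name of the input file in the same directory.          |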
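To illustrate what a cap such as maxthreads means in practice, here is a minimal sketch of fetching several pages with a bounded pool of worker threads. It uses the standard library's `concurrent.futures` and is not taken from LinkScanner itself; `fetch_all` and the `fetch` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, maxthreads=8):
    """Apply `fetch` to every URL using at most `maxthreads` worker threads.

    `fetch` is any callable mapping a URL to a result (an assumption of
    this sketch). Returns a dict of URL -> result.
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=maxthreads) as pool:
        # pool.map yields results in the same order as `urls`.
        return dict(zip(urls, pool.map(fetch, urls)))
```

A larger pool fetches more pages concurrently but also opens more simultaneous connections, which is presumably why the limit is configurable.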

Input

The input file (siteList.txt by default) must contain one link per line, each with its scheme, as follows.

https://wikipedia.org
http://msn.com
https://example.com

Use http:// if in doubt.
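As an illustration of this input format, the sketch below parses the contents of such a file into a list of links. It is not LinkScanner's own loader: `parse_site_list` is a hypothetical helper, and prepending `http://` to entries that lack a scheme is an assumption made here, following the advice above.

```python
def parse_site_list(text, default_scheme="http://"):
    """Turn newline-separated text into a list of links (hypothetical helper)."""
    urls = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if not entry.startswith(("http://", "https://")):
            # Assumption of this sketch: fall back to http:// when no scheme is given.
            entry = default_scheme + entry
        urls.append(entry)
    return urls


sample = "https://wikipedia.org\nhttp://msn.com\nexample.com\n"
print(parse_site_list(sample))
```

To use it on a real file, read the file's text first, for example with `open('siteList.txt').read()`.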
