matteoredaelli/ragno

ragno.erl

Ragno is a lightweight crawler for web domains. It extracts some useful information about each domain.

  / _ \
\_\(_)/_/
 _//o\\_ Ragno
  /   \

Build

BUILD_WITHOUT_QUIC=1 make

Usage

From the command line

_rel/ragno_release/bin/ragno_release foreground -run ragno_app crawl_domains_string www.libero.it,www.redaelli.org
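
The command above takes its domains as a single comma-separated string. When the domains live in a file, one per line, that string can be built with a small shell helper. This is only a sketch: the `join_domains` function and the `domains.txt` file name are hypothetical and not part of ragno.

```shell
# Hypothetical helper (not part of ragno): turn a file with one domain
# per line into the comma-separated string used on the command line.
join_domains() {
  # Drop blank lines, then join the remaining lines with commas.
  grep -v '^[[:space:]]*$' "$1" | paste -s -d, -
}

# Possible usage, assuming a file named domains.txt exists:
#   _rel/ragno_release/bin/ragno_release foreground -run ragno_app \
#     crawl_domains_string "$(join_domains domains.txt)"
```

Quoting the command substitution keeps the whole list as one argument, which matches the single-string form shown in the original command.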

Extract unvisited domains (todo)

cd _rel/ragno_release/data
spark-sql -i ../../../utils/views.sql -f ../../../utils/extract_new_domains.sql

About

Simple web domain crawler written in Erlang
