tomlinsonk/site-graph

Website link graph visualization

Dependencies

python3

  • bs4
  • pyvis
  • networkx
  • requests
  • scipy

Running

git clone https://github.com/tomlinsonk/site-graph.git
cd site-graph
pip3 install -r requirements.txt
python3 site_graph.py https://www.cs.cornell.edu/~kt/

To graph a different site, replace the URL with the one you are interested in.

To see more options, run: python3 site_graph.py -h

Blue nodes are internal pages, green nodes are internal resource files (anything that isn't HTML), orange nodes are external pages, and red nodes are pages with errors. Hover over nodes to see URLs and specific errors (e.g. 404, 500, timeout).
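As a rough illustration of the colour scheme described above, the sketch below maps a crawled URL to one of the four node colours. It is not taken from site_graph.py: the helper names `classify_node` and `node_color`, the `content_type` and `error` parameters, and the rule that "internal" means "same host and under the start URL's path" are all assumptions made for this example.

```python
from urllib.parse import urlparse

# Colour scheme as described in the README text above.
COLORS = {
    "internal_page": "blue",
    "internal_resource": "green",
    "external_page": "orange",
    "error": "red",
}


def classify_node(url, site_url, content_type="text/html", error=None):
    """Return a node category for a crawled URL (hypothetical helper).

    content_type: value of the response's Content-Type header.
    error: any failure seen while fetching, e.g. "404" or "timeout".
    """
    if error is not None:
        return "error"
    site = urlparse(site_url)
    target = urlparse(url)
    # Assumption: a URL is internal if it shares the host of the start
    # URL and its path lies under the start URL's path.
    internal = target.netloc == site.netloc and target.path.startswith(site.path)
    if not internal:
        return "external_page"
    # Drop parameters such as "; charset=utf-8" before comparing.
    if content_type.split(";")[0].strip().lower() == "text/html":
        return "internal_page"
    return "internal_resource"


def node_color(url, site_url, content_type="text/html", error=None):
    """Look up the display colour for a crawled URL (hypothetical helper)."""
    return COLORS[classify_node(url, site_url, content_type, error)]
```

For example, under these assumptions a PDF under the start URL would come out green, while any URL whose fetch failed would come out red regardless of where it lives.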

To graph local files, serve them with a simple local HTTP server and pass the resulting URL to the script. For example, with Twisted (Python): twistd -no web --path=[path to files], or with http-server (Node.js): http-server [path to files]. Then run: python3 site_graph.py http://localhost:8080/
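If neither Twisted nor http-server is installed, Python's standard library can also serve a directory over HTTP. The sketch below is one possible alternative, not something the project itself prescribes; the function name `start_local_server` and the default port 8080 are assumptions chosen to match the example URL above.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_local_server(directory, port=8080):
    """Serve `directory` on 127.0.0.1:`port` from a background thread.

    Hypothetical helper. Returns the server object; call its shutdown()
    method when you are done. Requires Python 3.7 or newer.
    """
    # Bind the handler to the chosen directory instead of the current one.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # A daemon thread keeps the server from blocking interpreter exit.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

The server stops when the Python process exits, so to graph the files it is simpler to run the equivalent one-liner in a separate terminal, python3 -m http.server 8080 --directory [path to files], and then point site_graph.py at http://localhost:8080/ as above.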

Contributing

This code is released under the MIT License. Pull requests are welcome if there are features you'd like included or bugs you'd like fixed.

