A simple tool to find and visualise links between two sets of websites built with Scrapy and Graphviz

INNOVINATI/linkminer

linkminer

A simple tool to find and visualise links between two sets of websites built with Scrapy and Graphviz

About

linkminer uses Scrapy to crawl two sets of URLs and build a higher-level network graph of the links between them, which is then visualised with Graphviz. We use this tool internally for competitive intelligence, e.g. when we want to find out which customers have some kind of relationship with specific competitors.
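
To illustrate the idea, here is a minimal, standard-library-only sketch of turning crawled links into a domain-level graph and serialising it as Graphviz DOT. It is not linkminer's actual implementation: the helper names (`domain`, `build_edges`, `to_dot`), the `crawled` input dictionary, and the example domains are all assumptions made for illustration, and the crawling step that Scrapy would perform is replaced by hard-coded data.

```python
from urllib.parse import urlparse


def domain(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def build_edges(outgoing_links, target_urls):
    """Map crawled links to (source domain, target domain) edges.

    outgoing_links is a hypothetical input: a dict mapping each source URL
    to the list of URLs found on that site (normally produced by a crawler).
    """
    targets = {domain(u) for u in target_urls}
    edges = set()
    for source_url, links in outgoing_links.items():
        src = domain(source_url)
        for link in links:
            dst = domain(link)
            # Keep only links that point at one of the target sites.
            if dst in targets and dst != src:
                edges.add((src, dst))
    return sorted(edges)


def to_dot(edges):
    """Serialise edges as a Graphviz DOT digraph."""
    lines = ["digraph links {"]
    lines += [f'    "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)


# Hard-coded stand-in for crawl results (illustrative domains only).
crawled = {
    "https://www.customer-a.example/": [
        "https://competitor-x.example/partners",
        "https://unrelated.example/",
    ],
    "https://customer-b.example/": ["https://www.competitor-x.example/"],
}
edges = build_edges(crawled, ["https://competitor-x.example/"])
print(to_dot(edges))
```

The resulting DOT text can be rendered with the Graphviz `dot` command-line tool. Collapsing URLs to domains is one possible reading of "higher-level" here; the tool itself may group links differently.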

Getting started

Install via PyPI:

pip install linkminer

Install via Git:

git clone https://github.com/INNOVINATI/linkminer.git
cd linkminer
virtualenv venv # optional
source venv/bin/activate # optional
pip install .

Usage

Extract links from two given sets of URLs:

from linkminer.miner import LinkMiner

source_urls = [...]
target_urls = [...]

m = LinkMiner(source_urls, target_urls)
m.extract()
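
In practice the two URL lists are often kept in plain text files. The following is a small sketch of a helper for that, assuming one URL per line with `#` for comments; `parse_url_list` and the file names are hypothetical and not part of linkminer.

```python
def parse_url_list(text):
    """Parse one URL per line, skipping blank lines and '#' comments.

    Duplicates are dropped while preserving the original order.
    """
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return list(dict.fromkeys(urls))


# Hypothetical usage with files named customers.txt and competitors.txt:
# from pathlib import Path
# source_urls = parse_url_list(Path("customers.txt").read_text(encoding="utf-8"))
# target_urls = parse_url_list(Path("competitors.txt").read_text(encoding="utf-8"))
```

The resulting lists can then be passed to `LinkMiner(source_urls, target_urls)` as shown above.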

Render the graph:

m.render('testfile')

Export the graph and data as a JSON file:

m.export_json('testfile')
