Deep Web Crawler for Data Enrichment

DeepER - Deep Entity Resolution

DeepER is a web data integration tool and a novel framework for deep entity resolution, which aims to find pairs of records that describe the same entity between a local database and a hidden database. Entity resolution of this kind has many applications in data enrichment and data cleaning.

DeepER is easy to configure, fully functional, and offers a smooth interface.
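To make the matching task concrete, here is a minimal sketch in Python. It is not DeepER's algorithm. It assumes records are plain strings and compares them with token-level Jaccard similarity; the names `tokens`, `jaccard`, `match_records`, the sample records, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
def tokens(text):
    """Lowercase a record and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / float(len(a | b))


def match_records(local_records, hidden_records, threshold=0.5):
    """Return (local, hidden, score) for pairs with similarity >= threshold."""
    pairs = []
    for local in local_records:
        for hidden in hidden_records:
            score = jaccard(tokens(local), tokens(hidden))
            if score >= threshold:
                pairs.append((local, hidden, score))
    return pairs


# Hypothetical sample data, for illustration only.
local_db = ["Thai Palace Restaurant Vancouver", "Blue Water Cafe"]
hidden_db = ["Thai Palace Vancouver", "Green Leaf Sushi"]
matches = match_records(local_db, hidden_db)
```

In this sketch both sides are in-memory lists. With a real hidden database, the second list would instead have to be collected through the database's query API.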

API Support

DeepER supports the following APIs:

  • DBLP (DataBase systems and Logic Programming)
  • Yelp (Yelp Fusion API)
  • AMiner (ArnetMiner)

Custom

To integrate a new API and collect more data, implement a subclass of deeper.api.simapi and pass it to deeper.core.smartcrawl.
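The interface of deeper.api.simapi is not described here, so the following is only a sketch of the subclassing pattern under stated assumptions. `BaseSearchApi`, `call_api`, `InMemoryApi`, and `crawl` are hypothetical stand-ins, not DeepER's real classes or functions; check the documentation for the actual method names and signatures before writing a real subclass.

```python
class BaseSearchApi(object):
    """Hypothetical stand-in for the base class in deeper.api.simapi."""

    def call_api(self, query):
        """Return a list of records matching a list of keywords."""
        raise NotImplementedError


class InMemoryApi(BaseSearchApi):
    """Hypothetical subclass: serves records from a list instead of the web."""

    def __init__(self, records, top_k=2):
        self.records = records
        self.top_k = top_k  # assumed cap on results returned per query

    def call_api(self, query):
        # A record matches when it contains every keyword in the query.
        wanted = set(word.lower() for word in query)
        hits = [r for r in self.records
                if wanted <= set(r.lower().split())]
        return hits[:self.top_k]


def crawl(api, queries):
    """Hypothetical stand-in for a crawler: issue each query, keep unique hits."""
    seen = []
    for query in queries:
        for record in api.call_api(query):
            if record not in seen:
                seen.append(record)
    return seen


api = InMemoryApi(["deep web crawling", "entity resolution survey",
                   "deep entity resolution"])
results = crawl(api, [["deep"], ["entity", "resolution"]])
```

The point of the pattern is that the crawler depends only on the base-class method, so a new data source can be added by writing one subclass without touching the crawling logic.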

Documentation

Fantastic documentation is available at https://sfu-db.github.io/deeperlib/

Requirements

  • pqdict>=1.0.0
  • requests>=2.18.4
  • simplejson>=3.11.1
  • rauth>=0.7.3

DeepER officially supports Python 2.7.13 and runs great on PyPy.

Installation and Update

pip install deeperlib
pip install --upgrade deeperlib

Changelog

v0.2a

  • 2017/09/19: added support for Windows (32-bit/64-bit), Linux (32-bit/64-bit), and macOS (64-bit), plus CSV and pickle input

v0.1a

  • 2017/09/14: DeepER's birthday

Team

  • Jiannan Wang, Assistant Professor at Simon Fraser University
  • Eugene Wu, Assistant Professor at Columbia University
  • Ryan Shea, Research Associate at Simon Fraser University
  • Pei Wang, Ph.D. Student at Simon Fraser University
  • Yongjun He, Undergraduate Student at Nanjing University

Discussion

Maintainer email
Yongjun He 141250047@smail.nju.edu.cn