Stravanni/blast

# Blast

  • Run entity resolution on the term "entity resolution" itself and you find its many synonyms: record linkage, duplicate detection, entity matching, deduplication, fuzzy matching.

  • Run entity resolution on the term "records" and you find: tuples, entity profiles, semi-structured documents.

What does Blast do?

It helps to scale Entity Resolution: it efficiently extracts loose schema information and uses this information to group together records that are most likely to match.

Basically, instead of comparing all possible pairs of records, you only compare the pairs that fall within the same group.

P.S.: Blast employs only unsupervised techniques.
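The idea of comparing only subsets of pairs can be sketched with plain schema-agnostic token blocking. This is a minimal illustration, not Blast's algorithm: it ignores the loose schema information and the meta-blocking step that Blast adds on top. The class name `TokenBlockingSketch`, its methods, and the sample records are all hypothetical and do not come from this repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of token blocking; not part of the Blast code base.
public class TokenBlockingSketch {

    // Build one block per token. Each block holds the ids of every record
    // whose attribute values contain that token, whatever the attribute name.
    static Map<String, List<Integer>> buildBlocks(List<Map<String, String>> records) {
        Map<String, List<Integer>> blocks = new HashMap<>();
        for (int id = 0; id < records.size(); id++) {
            // Collect the distinct tokens of this record so that its id is
            // added to each block at most once.
            Set<String> tokens = new TreeSet<>();
            for (String value : records.get(id).values()) {
                for (String token : value.toLowerCase().split("\\W+")) {
                    if (!token.isEmpty()) {
                        tokens.add(token);
                    }
                }
            }
            for (String token : tokens) {
                blocks.computeIfAbsent(token, k -> new ArrayList<>()).add(id);
            }
        }
        return blocks;
    }

    // Candidate pairs are the distinct pairs of records that co-occur in at
    // least one block. Ids inside a block are in ascending order, so each
    // pair is encoded as "smallerId-largerId".
    static Set<String> candidatePairs(List<Map<String, String>> records) {
        Set<String> pairs = new TreeSet<>();
        for (List<Integer> block : buildBlocks(records).values()) {
            for (int i = 0; i < block.size(); i++) {
                for (int j = i + 1; j < block.size(); j++) {
                    pairs.add(block.get(i) + "-" + block.get(j));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up records with heterogeneous attribute names (no shared schema).
        List<Map<String, String>> records = List.of(
            Map.of("name", "John Smith", "city", "Modena"),
            Map.of("fullName", "J. Smith", "location", "Bologna"),
            Map.of("title", "Anna Rossi", "town", "Modena"),
            Map.of("n", "Marco Bianchi", "addr", "Torino"));

        Set<String> pairs = candidatePairs(records);
        int allPairs = records.size() * (records.size() - 1) / 2;
        System.out.println(pairs + " (" + pairs.size() + " of " + allPairs + " pairs)");
    }
}
```

On these four sample records only the pairs sharing the tokens "smith" or "modena" remain as candidates, instead of all six possible pairs. The pair that shares only a city name is a weak candidate, which is the kind of superfluous comparison a meta-blocking approach aims to prune.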

When to use Blast?

When you have semi-structured data to clean but cannot perform the schema matching that traditional blocking techniques require.

Current Project Version

This repository contains the code for "BLAST: a loosely schema-aware meta-blocking approach for entity resolution". If you use it, please cite [1].

The approach is implemented on top of the Blocking Framework, a framework for blocking-based Entity Resolution [2].

Where to start

Take a look at Experiments -> Test_metablocking.java

References

[1] Simonini, Giovanni, Sonia Bergamaschi, and H. V. Jagadish. "BLAST: a loosely schema-aware meta-blocking approach for entity resolution." Proceedings of the VLDB Endowment 9.12 (2016): 1173-1184.

[2] sourceforge.net/projects/erframework/
