30 Jun 14:15

Nikoletos-K

0.0.7

Fixed:

Issues in block filtering
Issues in vector-based blocking
Data model set types
Incorrect naming of EJoin

Added:

Prioritization algorithms
Tf-Idf functionality
More metrics on entity matching
Optional data cleaning functionalities
New visualizations
New stats for the blocking workflows


06 Jun 11:55

Nikoletos-K

v0.0.6

Fixed an issue in vector-based blocking (VB).


22 May 14:57

Nikoletos-K

v0.0.5

Added:

New evaluation module
Matching metrics
Vector-based blocking techniques
Data process methods
Entity matching plots
Sphinx documentation website
New tests

Fixed:

Architecture, abstract data types
Data bugs in block building
Bugs in vector-based blocking
Using workflows without ground truth (gt)
Code runtime


05 Oct 09:30

Nikoletos-K

v0.0.4

Python 3.7 and 3.8 are now supported!

New dependencies. pyJedAI now supports older Python versions.
All supported versions:

3.7
3.8
3.9
3.10

Also added tests for all supported Python versions and for macOS.


26 Sep 12:07

Nikoletos-K

v0.0.3

First official release on PyPI

Contains:

Tutorials and demos
Fixed issues


21 Sep 14:03

Nikoletos-K

v0.0.2

Optimizations, User-friendly Approach Updates

This is the second release. The project is still under development. In this release we:

Added the WorkFlow module: a high-level, user-friendly method that simplifies the whole process.
Added comments in the basic methods.
Performed time optimizations by making the most of Python.
Created automatic tests.
Created a new block building method that uses pre-trained embeddings and Gensim, with similarity search through the FAISS framework.
Uploaded to PyPI.
Added visualization techniques for checking performance.
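
As a rough illustration of the embedding-based block building idea mentioned above, here is a minimal sketch in plain Python. It is not the pyJedAI API: the function names (`cosine`, `top_k_blocks`) and the toy vectors are hypothetical, the hand-written vectors stand in for pre-trained embeddings, and a brute-force cosine search stands in for FAISS.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_blocks(queries, index, k=1):
    """For each query entity, keep its k most similar indexed entities.

    Brute-force search; a library such as FAISS would replace this loop
    on realistic data sizes.
    """
    blocks = {}
    for query_id, query_vec in queries.items():
        ranked = sorted(
            index,
            key=lambda cand_id: cosine(query_vec, index[cand_id]),
            reverse=True,
        )
        blocks[query_id] = ranked[:k]
    return blocks


# Toy vectors (assumed for illustration; real ones would come from a
# pre-trained embedding model).
index_vectors = {
    "a1": [1.0, 0.0, 0.0],
    "a2": [0.0, 1.0, 0.0],
    "a3": [0.0, 0.0, 1.0],
}
query_vectors = {
    "b1": [0.9, 0.1, 0.0],
    "b2": [0.0, 0.2, 0.8],
}

blocks = top_k_blocks(query_vectors, index_vectors, k=1)
print(blocks)
```

Each query entity ends up in a block with its nearest neighbours, so only those pairs are passed on to matching instead of the full cross product.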


22 Jul 16:57

Nikoletos-K

v0.0.1

First pyJedAI release: This release brings the basic structure of the well-known JedAI toolkit to the Python environment. Contains:

Data reading techniques: RDF/OWL, SPARQL, CSV, JSON, DB
Block building: Standard Blocking, QGrams & Extended, SuffixArray & Extended
Block cleaning: Block purging, Block filtering
Comparison cleaning: Weighted edge/node pruning, Cardinality edge/node pruning, BLAST, etc.
Entity matching: strsimpy
Entity clustering: Connected component clustering
Similarity Joins: SchemaAgnosticΕJoin, TopKSchemaAgnosticJoin
Evaluation through Jupyter notebook
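
To make the steps listed above more concrete, here is a minimal sketch of a blocking, matching, and clustering pipeline in plain Python. It does not use the pyJedAI API: all function names and the sample records are hypothetical, token-based Standard Blocking is implemented from its usual definition, Jaccard similarity with an assumed threshold of 0.5 stands in for the matching step, and clustering is connected components via union-find.

```python
from collections import defaultdict
from itertools import combinations


def standard_blocking(entities):
    """Token blocking: every distinct token becomes a block key."""
    blocks = defaultdict(set)
    for entity_id, text in entities.items():
        for token in set(text.lower().split()):
            blocks[token].add(entity_id)
    # A block holding a single entity yields no comparisons, so drop it.
    return {key: ids for key, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Distinct entity pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Jaccard similarity over whitespace tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def connected_components(nodes, edges):
    """Group nodes into clusters using union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for n in nodes:
        clusters[find(n)].add(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical sample records, for illustration only.
entities = {
    1: "Apple iPhone 12 black",
    2: "iPhone 12 Apple black 64GB",
    3: "Samsung Galaxy S21",
    4: "Galaxy S21 Samsung phone",
    5: "Nokia 3310",
}

blocks = standard_blocking(entities)
pairs = candidate_pairs(blocks)
matches = [(a, b) for a, b in pairs if jaccard(entities[a], entities[b]) >= 0.5]
clusters = connected_components(entities, matches)
print(clusters)
```

Blocking reduces the ten possible pairs among the five records to two candidates, and clustering then turns the matched pairs into groups of records that refer to the same entity.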
