A rather rough PoC for wrapping the dedupe active-learning console inside a Streamlit app

KlausGPaul/streamlit-dedupe

streamlit-dedupe

A rather rough PoC for wrapping the dedupe active-learning console inside a Streamlit app. The purpose is to make training available to a group of users who lack access to the infrastructure and data required to set up and run dedupe's interactive labelling themselves.

You will need a file named "ops.parquet" that contains the entries you want to deduplicate. At the moment, the implementation uses a static schema, so the fields to compare are fixed in the code.
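As a rough illustration of what a static schema and the input data might look like, the sketch below defines a fixed field list and converts row dictionaries into the `{record_id: {field: value}}` mapping that dedupe works with. The field names (`name`, `city`) and the helper `to_dedupe_records` are hypothetical and not taken from this repository; the list-of-dicts schema style is the one accepted by dedupe 2.x, and newer releases may expect variable objects instead.

```python
from typing import Any

# Hypothetical static schema: field names are illustrative only.
# dedupe 2.x accepts a list of dicts in this shape.
FIELDS = [
    {"field": "name", "type": "String"},
    {"field": "city", "type": "String", "has missing": True},
]


def to_dedupe_records(rows: list[dict[str, Any]]) -> dict[int, dict[str, Any]]:
    """Build {record_id: {field: value}}, keeping only schema fields.

    Blank strings and NaN values are mapped to None, which is how
    dedupe expects missing data to be represented.
    """
    wanted = [f["field"] for f in FIELDS]
    records: dict[int, dict[str, Any]] = {}
    for record_id, row in enumerate(rows):
        cleaned: dict[str, Any] = {}
        for field in wanted:
            value = row.get(field)
            if isinstance(value, str):
                value = value.strip() or None
            elif isinstance(value, float) and value != value:  # NaN check
                value = None
            cleaned[field] = value
        records[record_id] = cleaned
    return records


# In the app, the rows would presumably come from the parquet file, e.g.
#   rows = pandas.read_parquet("ops.parquet").to_dict("records")
rows = [
    {"name": " Acme Ltd ", "city": "Berlin", "extra": 1},
    {"name": "ACME Limited", "city": ""},
]
records = to_dedupe_records(rows)
```

Keeping the schema in one constant means that adapting the app to a different dataset only requires editing `FIELDS`, at the cost of not discovering columns automatically.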

The output is a labelled JSON file, which can be used to (re)train a dedupe model.
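The sketch below shows one way such a labelled file could be assembled, assuming it follows the layout dedupe uses for training data: a JSON object with a `match` list and a `distinct` list, each holding pairs of records. The helper `add_label` and the sample records are hypothetical; the app's actual code may differ.

```python
import json


def add_label(labels: dict, pair: tuple, answer: str) -> dict:
    """Record one labelling decision.

    'y' marks the pair as the same entity, 'n' as different entities;
    any other answer (for example 'u' for unsure) is not stored.
    """
    if answer == "y":
        labels["match"].append(list(pair))
    elif answer == "n":
        labels["distinct"].append(list(pair))
    return labels


labels = {"match": [], "distinct": []}

# Hypothetical records for illustration.
rec_a = {"name": "Acme Ltd", "city": "Berlin"}
rec_b = {"name": "ACME Limited", "city": None}
rec_c = {"name": "Zenith GmbH", "city": "Munich"}

add_label(labels, (rec_a, rec_b), "y")
add_label(labels, (rec_a, rec_c), "n")
add_label(labels, (rec_b, rec_c), "u")  # unsure: skipped

# Serialize the decisions; this string is what would be written to disk.
serialized = json.dumps(labels, indent=2)
restored = json.loads(serialized)
```

A file in this shape can typically be passed back to dedupe when retraining, for example through the `training_file` argument of `Dedupe.prepare_training` in dedupe 2.x.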
