diff --git a/assets/update.gif b/assets/update.gif
new file mode 100644
index 000000000..1ca7d0ce2
Binary files /dev/null and b/assets/update.gif differ
diff --git a/docs/generatingDocumentation.md b/docs/generatingDocumentation.md
index e28f8f885..7177e97af 100644
--- a/docs/generatingDocumentation.md
+++ b/docs/generatingDocumentation.md
@@ -3,6 +3,6 @@ layout: default
 nav_order: 6
 ---
 ## Generating Documentation
-Zingg allows generating readable documentation about the training data, including those marked as matches as non matches. The documentation is written to the zinggDir/modelId folder and can be built using the following
+Zingg generates readable documentation about the training data, including those marked as matches and non-matches. The documentation is written to the zinggDir/modelId folder and can be built using the following:
 
 `./scripts/zingg.sh --phase generateDocs --conf `
diff --git a/docs/stepByStep.md b/docs/stepByStep.md
index c1b44de23..29bf6d909 100644
--- a/docs/stepByStep.md
+++ b/docs/stepByStep.md
@@ -31,4 +31,6 @@ The training data in Step 4 above is used to train Zingg and build and save the
 ### Step 6: Voila, lets match!
 Its now time to apply the model above on our data. This si done by running the *match* or the *link* phases depending on whether you are matching within a single source or linking multiple sources respectively. You can read more about [matching](setup/match.md) and [linking](setup/linking.md)
 
-As long as your input columns and the field types are not changing, the same model should work and you do not need to build a new model.
+As long as your input columns and the field types are not changing, the same model should work and you do not need to build a new model. If you change the match type, you can continue to use the training data and add more labelled pairs on top of it.
+
+
diff --git a/docs/updatingLabels.md b/docs/updatingLabels.md
new file mode 100644
index 000000000..4decd597d
--- /dev/null
+++ b/docs/updatingLabels.md
@@ -0,0 +1,13 @@
+---
+layout: default
+nav_order: 6
+---
+## Updating Labeled Pairs
+As our understanding of our data changes, we may need to revisit the previously marked pairs and update them. To do this, please [generate the documentation of the model.](./generatingDocumentation.md)
+
+You can then invoke the updater by running
+`./scripts/zingg.sh --phase updateLabel --conf `
+
+This brings up the console labeller, which accepts the cluster id of the pairs you want to update.
+
+![Shows records and asks user to update yes, no, can't say on the CLI.](/assets/update.gif)