Results from link phase change between runs #601
Comments
Thanks for reporting, @spentelow. Will look into this.
I have started looking into this and can reproduce the issue; I will investigate further. I ran the link phase twice and got 79 results the first time and 65 the second time. Steps followed: ./scripts/zingg.sh --phase findTrainingData --conf examples/febrl/config.json --zinggDir /tmp/z_601
Pull request #617 raised.
Issue #601 linker has inconsistent results
Keeping the issue open: adding a count is more of a workaround that triggers the explain plan. We still need to find the root cause of why the joins behave this way when there is no action until the final join.
Describe the bug
The output data produced by the link phase changes each time the model is run.
To Reproduce
Steps to reproduce the behavior:
1. Change matchType in examples/febrl/configLink.json from 'exact' to 'fuzzy' (resolves an issue with this example in version 0.3.4 related to Issue 427).
2. Run the 'febrl' model in link mode.
/tmp/zinggOutput)
z_score values.
Expected behavior
My expectation is that sequential runs without config or input file changes would produce identical results (except, perhaps, in z_cluster labels).
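One way to check this expectation is to compare the records from two runs while ignoring the z_cluster column. The following is only a minimal sketch: the column names z_score and z_cluster come from this report, but the helper names (normalize, diff_runs, load_csv) are hypothetical, and it assumes the link output has been exported as CSV with a header row, which may not match your output configuration.

```python
import csv
from collections import Counter

# Cluster labels may legitimately differ between runs, so they are excluded.
IGNORED_COLUMNS = {"z_cluster"}


def normalize(row, ignored=IGNORED_COLUMNS):
    """Turn one record (a dict) into a hashable key that ignores column order."""
    return tuple(sorted((k, v) for k, v in row.items() if k not in ignored))


def diff_runs(run_a, run_b, ignored=IGNORED_COLUMNS):
    """Return (only_in_a, only_in_b) as Counters of normalized records."""
    a = Counter(normalize(r, ignored) for r in run_a)
    b = Counter(normalize(r, ignored) for r in run_b)
    return a - b, b - a


def load_csv(path):
    """Hypothetical loader: assumes the output was written as CSV with a header."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

If both Counters returned by diff_runs are empty, the two runs agree on everything except cluster labels. A difference in z_score for an otherwise identical record shows up as one entry on each side.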