HMASynthesizer throws an error when sampling multi table models with three levels of depths #1600
Comments
We are working on a potential fix for the current graph traversal algorithm implementation. We will create a pull request with the fix.
Hi @portovep and @elisaherrmann, very nice to meet you and thanks for filing such a detailed issue. Please hold off on any such pull request, as we already have a fix that was merged into the main branch two weeks ago. See #1562. I don't believe any release candidates were made after this merge. So if you'd like to test it, I'd recommend installing directly from the
Hi @npatki, nice to meet you too. Thanks for replying so quickly. We are glad to hear that support for multi table sampling for models with 3+ levels of depth will be added in SDV 1.5.0. Our current use case requires this feature, so we are very pleased to see it coming in the next release. The error described in this issue was encountered while running the latest code from the main branch, which includes the changes introduced in #1562. We found the error while traversing graphs with the following characteristics:
You can see the proposed fix implementation and two integration tests that cover the above-mentioned scenarios in this fork's branch: Let me know if you would like me to raise a PR so you can check if the proposed fix is valid.
Hi @portovep, no problem. I had hoped to avoid any duplicate efforts but I realize that you were already using the up-to-date code. Indeed, I can replicate the problem on the
Seems like it never even reaches the asserts and fails right on the
Next Steps
Supporting these types of schemas is important to us. Our team has been actively working on this area, and we'd like to clean up more of the traversal code to make it simpler, and prevent these edge cases. We'll also be checking to ensure it works with other parts of the SDV software. So no need to submit any fixes as of yet. In the meantime, you are welcome to continue using your fork for personal use if it's working for you. We'll reach out if we need any PRs.
@npatki cleaning up the traversal code to make it simpler and prevent edge cases makes sense. Perhaps an established graph traversal algorithm like depth-first search (DFS) could help here. Thanks for your help. We will wait until the fix gets released as part of a future version and use a workaround in the meantime.
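To make the DFS suggestion concrete, here is a minimal sketch of how a depth-first search could order tables so that every parent comes before its children, even with several root parents. It is not SDV code: the function name `parents_first_order`, the `(parent, child)` tuple format, and the table names in the usage note are all assumptions made for illustration.

```python
from collections import defaultdict


def parents_first_order(relationships):
    """Order tables so that each parent appears before all of its children.

    `relationships` is a list of (parent, child) table-name pairs.
    This is a hypothetical helper, not part of the SDV API.
    """
    children = defaultdict(list)
    tables = []
    for parent, child in relationships:
        children[parent].append(child)
        for name in (parent, child):
            if name not in tables:
                tables.append(name)

    visited, in_progress, postorder = set(), set(), []

    def visit(table):
        if table in visited:
            return
        if table in in_progress:
            raise ValueError(f"Cycle detected at table {table!r}")
        in_progress.add(table)
        for child in children[table]:
            visit(child)
        in_progress.discard(table)
        visited.add(table)
        # A table is recorded only after all of its descendants.
        postorder.append(table)

    # Start from every table, so no root or leaf can be skipped.
    for table in tables:
        visit(table)

    # Reversed DFS post-order is a valid parents-first ordering.
    return postorder[::-1]
```

For example, with the made-up schema `[("root1", "parent"), ("parent", "child1"), ("root2", "child1")]`, the returned list contains all four tables, and `child1` appears after both `parent` and `root2`. Starting a visit from every table, rather than only from detected roots, is what guarantees the last child is not dropped.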
My pleasure, @portovep. We hope to have that fix soon.
Yes indeed, the initial SDV paper had a DFS approach.
Environment Details
Please indicate the following details about the environment in which you found the bug:
Error Description
@elisaherrmann and I are getting an error when sampling a multi table model with three levels of depth and multiple root parent nodes with the HMASynthesizer.
The model we are trying to sample:
Steps to reproduce
We created an integration test to reproduce:
When we run the integration test above we get this error:
We think the implemented graph traversal algorithm contains a bug. The last child node to be traversed is never sampled by the BaseHierarchicalSampler. When adding relationships in the BaseHierarchicalSampler, the last child (in the provided example, child1) is not found in the sampled_data dictionary, which causes the error.
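The condition described above can be illustrated without SDV. The sketch below checks which tables referenced by the relationships are missing from a `sampled_data` dictionary before any relationships are added. The helper `find_unsampled_tables`, the `(parent, child)` tuple format, and the toy table names are assumptions for illustration only; they do not reproduce the BaseHierarchicalSampler internals.

```python
def find_unsampled_tables(relationships, sampled_data):
    """List tables named in `relationships` that are absent from `sampled_data`.

    Hypothetical helper: `relationships` is a list of (parent, child)
    table-name pairs, `sampled_data` maps table names to sampled rows.
    """
    missing = []
    for parent, child in relationships:
        for table in (parent, child):
            if table not in sampled_data and table not in missing:
                missing.append(table)
    return missing


# Toy schema (assumed): three levels of depth and two root parents.
relationships = [
    ("root1", "parent"),
    ("parent", "child1"),
    ("root2", "child1"),
]

# Simulates a traversal that stopped before sampling the last child.
sampled_data = {"root1": [], "root2": [], "parent": []}

missing = find_unsampled_tables(relationships, sampled_data)
print(missing)
```

A check like this, run before the step that adds relationships, would turn a dictionary lookup failure into an explicit report of which table the traversal skipped.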
Notes
We switched to the latest development version (1.4.1.dev0) because we found a similar error when sampling the provided model with the latest stable version (1.4.0). We observed that in version 1.4.0 the hierarchical sampler was not able to sample more than one level of depth.