Support for multiple foreign keys in one table. #185
Comments
Hi @JagdishKolhe, would you mind providing a few more details about what you mean in this case? Do you mean supporting multiple relationships between the same two tables? For example, one such scenario might be:
Is this what you mean?
Hi @csala,
@csala, thanks for your attention. The above can be one scenario; I have not yet thought much about it.
Hi @JagdishKolhe, what you are describing seems to be a multi-parent scenario, which was added in #162. You can test it using the demo data:
>>> from sdv import SDV
>>> from sdv.demo import load_demo
>>>
>>> metadata, tables = load_demo('got_families', metadata=True)
>>>
>>> sdv = SDV()
>>> sdv.fit(metadata, tables)
>>> sdv.sample()
{'characters': character_id name age
0 0 Arya 14
1 1 Arya 22
2 2 Robb 25
3 3 Robb 21
4 4 Daenerys 16
5 5 Bran 17
6 6 Sansa 21, 'character_families': character_id family_id type generation
0 0 4 both 4
1 0 5 mother 7
2 1 5 mother 7
3 2 5 mother 7
4 0 6 both 6
5 1 6 both 6
6 0 7 both 12
7 1 7 both 11, 'families': family_id name
0 4 Lannister
1 5 Tully
2 6 Lannister
3 7 Lannister}
You can also see an example of how to define a schema like this in issue #193.
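To make the multi-parent structure in the sampled output concrete, here is a minimal pandas-only sketch. It does not use any SDV API; it simply rebuilds the three tables from the output shown above and checks that both foreign keys in character_families (character_id and family_id) resolve to rows in their parent tables. The variable names orphan_characters, orphan_families and fk_ok are illustrative, not part of SDV.

```python
import pandas as pd

# Tables copied by hand from the sampled output above.
characters = pd.DataFrame({
    "character_id": [0, 1, 2, 3, 4, 5, 6],
    "name": ["Arya", "Arya", "Robb", "Robb", "Daenerys", "Bran", "Sansa"],
    "age": [14, 22, 25, 21, 16, 17, 21],
})
families = pd.DataFrame({
    "family_id": [4, 5, 6, 7],
    "name": ["Lannister", "Tully", "Lannister", "Lannister"],
})
# Child table with two foreign keys, one per parent table.
character_families = pd.DataFrame({
    "character_id": [0, 0, 1, 2, 0, 1, 0, 1],
    "family_id": [4, 5, 5, 5, 6, 6, 7, 7],
    "type": ["both", "mother", "mother", "mother", "both", "both", "both", "both"],
    "generation": [4, 7, 7, 7, 6, 6, 12, 11],
})

# A child row is an orphan if its key has no match in the parent table.
orphan_characters = ~character_families["character_id"].isin(characters["character_id"])
orphan_families = ~character_families["family_id"].isin(families["family_id"])

# True only when every child row points at an existing row in both parents.
fk_ok = not (orphan_characters.any() or orphan_families.any())
print("all foreign keys resolve:", fk_ok)
```

A check like this only confirms referential integrity of the sample; it says nothing about how the rows were generated.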
Hi, would it preserve column-to-column correlations between columns in the parent tables? (This is Wim Blommaert.)
@csala: It is clearly evident in your example (got_families) that character_id = 0 is generated with every family_id (4, 5, 6, 7), and likewise character_id = 1 is generated with family_id (5, 6, 7). This raises the question: how are the samples generated?
Note: We tried the multi-parent scenario on our own data model and our results also look similar. Kindly let us know how sampling works in the multi-parent scenario, and whether the system considers correlations between columns.
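The observation above can be reproduced from the sampled output with a short pandas sketch. It assumes only the character_id and family_id columns of character_families as printed earlier in the thread, and the name families_per_character is illustrative. It shows which family_id values each character_id is paired with; it does not explain how SDV produces them.

```python
import pandas as pd

# The two foreign-key columns of character_families, copied from the sample above.
character_families = pd.DataFrame({
    "character_id": [0, 0, 1, 2, 0, 1, 0, 1],
    "family_id": [4, 5, 5, 5, 6, 6, 7, 7],
})

# Collect the sorted family_id values linked to each character_id.
families_per_character = {
    int(character_id): sorted(int(family_id) for family_id in group)
    for character_id, group in character_families.groupby("character_id")["family_id"]
}

for character_id, family_ids in families_per_character.items():
    print(character_id, family_ids)
```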
Multi-foreign-key scenarios are supported after #298, so this can be closed.
* Add working addons
* Add eradicate
* Add dlint
* Decrease complexity (sdv-dev#184)
* Add addon (sdv-dev#186)
* Add `pytest-style` (sdv-dev#192)
* Add addon
* Fix randomized error message
* Add addon (sdv-dev#188)
* Add addon (#191)
* Add `pandas-vet` (sdv-dev#190)
* Add addon
* noqa torch.stack
* remove double quotes (sdv-dev#187)
* Add addon (sdv-dev#185)
* Add `flake8-docstrings` (sdv-dev#193)
* Add addon
* Fix D100
* Add more docstrings
* Fix docstrings
* Update docstrings
* Fix lint
* Add `flake8-builtins` (sdv-dev#189)
* Add addon
* Add variables-names
* Fix bug
* Fix mistakes
* Add `flake8-multiline-containers` (sdv-dev#183)
* Add addon
* Add addon
* Address feedback
* Fix lint
* Fix bugs
* Remove pydoclint
* Ignore D101 errors
* Update ignores
Description
This is a new feature request. Many real databases have such scenarios.