Support for multiple foreign keys in one table. #185

JagdishKolhe · 2020-09-03T10:05:12Z

SDV version: 0.4.0
Python version: 3.6.9
Operating System: CentOS

Description

This is a new feature request. Many real databases have such scenarios.

csala · 2020-09-14T16:37:29Z

Hi @JagdishKolhe, would you mind providing a few more details about what you mean in this case?

Do you mean supporting multiple relationships between the same two tables?

For example, one such scenario might be:

You have an employees table that has an employee_id as the primary key.
You have a tasks table with, among other things:
- assignee_id: The employee to whom the task is assigned - defines a first Foreign Key to employees
- supervisor_id: The employee who will supervise the task - defines a second Foreign Key to employees

Is this what you mean?
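
As a rough illustration of the scenario described above, the sketch below builds toy `employees` and `tasks` tables in plain Python and checks each of the two foreign keys against the same parent key. The rows and the `dangling_references` helper are made up for illustration and are not part of SDV.

```python
# Toy rows for the employees/tasks scenario; all values are illustrative.
employees = [
    {"employee_id": 1, "name": "Ann"},
    {"employee_id": 2, "name": "Bob"},
    {"employee_id": 3, "name": "Cy"},
]
tasks = [
    {"task_id": 10, "assignee_id": 1, "supervisor_id": 2},
    {"task_id": 11, "assignee_id": 3, "supervisor_id": 3},
    {"task_id": 12, "assignee_id": 2, "supervisor_id": 9},  # 9 is not an employee
]


def dangling_references(child_rows, foreign_key, parent_rows, primary_key):
    """Return the child rows whose foreign key value has no matching parent row."""
    parent_keys = {row[primary_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_keys]


# Both foreign keys point at the same parent table and the same primary key.
bad_assignees = dangling_references(tasks, "assignee_id", employees, "employee_id")
bad_supervisors = dangling_references(tasks, "supervisor_id", employees, "employee_id")

print(bad_assignees)
print(bad_supervisors)
```

Each foreign key has to be validated (and, for synthetic data, generated) separately, even though both reference `employees.employee_id`.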

kvrameshreddy · 2020-09-16T06:20:05Z

Hi @csala,
Does SDV work for the above scenario?

JagdishKolhe · 2020-10-20T10:25:59Z

@csala, thanks for your attention.

The above can be one scenario; I have not yet thought much about it.
But we have a similar scenario, as follows.

user table with primary key user_id
product table with primary key product_id
order table with primary key order_id and TWO foreign keys (user.user_id and product.product_id)
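
The two scenarios raised in this thread differ in shape: `tasks` has two foreign keys to one parent table, while `order` has one foreign key to each of two different parents. The sketch below makes that distinction explicit. The tuple notation and the `classify` helper are hypothetical and are not SDV's metadata format; the foreign key column names in `order` are assumed to match the parent key names.

```python
from collections import defaultdict

# (parent_table, parent_key, child_table, foreign_key)
# Hypothetical notation for illustration only, not SDV metadata.
relationships = [
    ("user", "user_id", "order", "user_id"),
    ("product", "product_id", "order", "product_id"),
    ("employees", "employee_id", "tasks", "assignee_id"),
    ("employees", "employee_id", "tasks", "supervisor_id"),
]


def classify(relationships):
    """Label each child table by how its foreign keys are spread over parents."""
    parents = defaultdict(list)
    for parent, _parent_key, child, _foreign_key in relationships:
        parents[child].append(parent)

    kinds = {}
    for child, parent_list in parents.items():
        if len(set(parent_list)) > 1:
            kinds[child] = "multi-parent"
        elif len(parent_list) > 1:
            kinds[child] = "multiple keys to one parent"
        else:
            kinds[child] = "single parent"
    return kinds


kinds = classify(relationships)
print(kinds)
```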

csala · 2020-10-20T11:06:31Z

Hi @JagdishKolhe, what you are describing seems to be a multi-parent scenario, which was added in #162.

You can test it using the got_families demo, which has this structure:

>>> from sdv import SDV
>>> from sdv.demo import load_demo
>>> 
>>> metadata, tables = load_demo('got_families', metadata=True)
>>> 
>>> sdv = SDV()
>>> sdv.fit(metadata, tables)
>>> sdv.sample()
{'characters':    character_id      name  age
0             0      Arya   14
1             1      Arya   22
2             2      Robb   25
3             3      Robb   21
4             4  Daenerys   16
5             5      Bran   17
6             6     Sansa   21, 'character_families':    character_id  family_id    type  generation
0             0          4    both           4
1             0          5  mother           7
2             1          5  mother           7
3             2          5  mother           7
4             0          6    both           6
5             1          6    both           6
6             0          7    both          12
7             1          7    both          11, 'families':    family_id       name
0          4  Lannister
1          5      Tully
2          6  Lannister
3          7  Lannister}

You can also see an example of how to define a schema like this in issue #193

Wim65 · 2020-10-21T08:02:10Z

Hi, would it preserve the column-to-column correlations between columns in the parent tables?

(This is Wim Blommaert.)

abhisheknagar1983 · 2020-12-09T15:40:39Z

@csala: It is clearly evident in your example (got_families) that character_id = 0 is generated in combination with every family_id (4, 5, 6, 7), and likewise character_id = 1 is generated with family_id (5, 6, 7).

Now the question arises: how are the samples being generated?

Are they based on cardinality (1:1 or 1:N)?
In your example no rows are generated for character_id = (3, 4, 5, 6), so it seems that the system generates child rows for only a few of the parent rows, not for all of them.

Note: We tried the multi-parent scenario on our own data model and our results also look similar.

Kindly let us know how sampling works in the multi-parent scenario, and whether the system considers the correlation between columns.
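
The observation above can be checked against the sampled output shown earlier in the thread. This small sketch copies the `(character_id, family_id)` pairs from the sampled `character_families` table and counts child rows per character. It only inspects that one sample and says nothing about how SDV generates it.

```python
from collections import Counter

# (character_id, family_id) pairs copied from the sampled character_families table.
character_families = [
    (0, 4), (0, 5), (1, 5), (2, 5),
    (0, 6), (1, 6), (0, 7), (1, 7),
]
# character_id values from the sampled characters table.
character_ids = [0, 1, 2, 3, 4, 5, 6]

# Number of child rows that reference each character.
children_per_character = Counter(cid for cid, _family_id in character_families)

# Characters that never appear as a foreign key value in the child table.
childless = [cid for cid in character_ids if cid not in children_per_character]

print(dict(children_per_character))
print(childless)
```

In this sample, characters 0, 1 and 2 account for all eight child rows, and characters 3 to 6 have none.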

csala · 2021-01-21T20:34:36Z

Multi-foreign-key scenarios are supported after #298 so this can be closed.


csala mentioned this issue Jan 19, 2021

Multi-parent re-model and re-sample issue #298

Merged

csala closed this as completed Jan 21, 2021

csala added the "feature request" label Jan 21, 2021

csala added this to the 0.6.2 milestone Jan 21, 2021

csala self-assigned this Jan 21, 2021
