Multi-table should show foreign key transformers as None #1249

Closed
amontanez24 opened this issue Feb 9, 2023 · 0 comments
Labels: bug (Something isn't working), internal (The issue doesn't change the API or functionality)

Environment Details

Please indicate the following details about the environment in which you found the bug:

  • SDV version: v1.0.0
  • Python version: Any
  • Operating System: Any

Error Description

In the multi-table case, every foreign key column is assigned None as its transformer. However, this assignment happens in preprocess rather than in auto_assign_transformers. As a result, if a user calls get_transformers after auto-assigning transformers but before preprocessing or fitting, the foreign key does not show None. The proposed fix is to move the logic in _skip_foreign_key_transformations into auto_assign_transformers. This will require some refactoring to loop over all tables and set the transformers for their foreign keys to None.
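
The refactoring described above could look roughly like the sketch below. It is not SDV's actual implementation: the function name skip_foreign_key_transformers is hypothetical, the transformers are represented as plain strings, and the relationship dictionaries (with child_table_name and child_foreign_key keys) as well as the Country table's columns are assumptions made for illustration.

```python
def skip_foreign_key_transformers(transformers_by_table, relationships):
    """Return a copy of the per-table transformer mapping with foreign keys set to None.

    Hypothetical helper: it only illustrates looping over all relationships
    and clearing the transformer of each child table's foreign key column.
    """
    # Copy each table's mapping so the caller's input is left untouched.
    updated = {table: dict(columns) for table, columns in transformers_by_table.items()}
    for relationship in relationships:
        child_table = relationship['child_table_name']
        foreign_key = relationship['child_foreign_key']
        if child_table in updated and foreign_key in updated[child_table]:
            updated[child_table][foreign_key] = None
    return updated


# Stand-in data: strings take the place of real transformer objects.
# The Country table and its 'Code' key are assumed for this example.
auto_assigned = {
    'City': {
        'ID': None,
        'CountryCode': 'RegexGenerator()',
        'Population': 'FloatFormatter()',
    },
    'Country': {
        'Code': None,
        'Name': 'LabelEncoder()',
    },
}
relationships = [
    {
        'parent_table_name': 'Country',
        'parent_primary_key': 'Code',
        'child_table_name': 'City',
        'child_foreign_key': 'CountryCode',
    },
]

fixed = skip_foreign_key_transformers(auto_assigned, relationships)
print(fixed['City'])
```

If a step like this ran at the end of auto_assign_transformers, get_transformers would report None for CountryCode right away instead of only after fit.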

Steps to reproduce

from sdv.multi_table import HMASynthesizer
from sdv.datasets.demo import download_demo

data, metadata = download_demo('multi_table', 'world_v1')
synth = HMASynthesizer(metadata=metadata)
synth.auto_assign_transformers(data)
synth.get_transformers(table_name='City')

Observe the following output, in which the foreign key CountryCode is assigned RegexGenerator() rather than None:

{'Name': LabelEncoder(add_noise=True),
 'add_numerical': FloatFormatter(computer_representation='Int64'),
 'ID': None,
 'CountryCode': RegexGenerator(),
 'District': LabelEncoder(add_noise=True),
 'Population': FloatFormatter(computer_representation='Int64')}

Then fit the synthesizer and call get_transformers again:

synth.fit(data)
synth.get_transformers('City')

Notice the foreign key CountryCode now has None as a transformer.

{'Name': LabelEncoder(add_noise=True),
 'add_numerical': FloatFormatter(computer_representation='Int64'),
 'ID': None,
 'CountryCode': None,
 'District': LabelEncoder(add_noise=True),
 'Population': FloatFormatter(computer_representation='Int64')}

amontanez24 added the bug and internal labels on Feb 9, 2023
amontanez24 added this to the 1.0.0 milestone on Feb 9, 2023