UniqueCombinations constraint outputting wrong data type #510

Closed
fealho opened this issue Jul 12, 2021 · 4 comments · Fixed by #538

fealho (Member) commented Jul 12, 2021

The UniqueCombinations constraint sometimes produces the wrong output data type. For example, if the input contains only integers, the output should contain only integers as well, but some columns can come back as floats. The code below reproduces the problem.

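import pandas as pd
from sdv.constraints import UniqueCombinations
from sdv.tabular import CTGAN

# Three identical integer columns holding the values 0..9.
zero_to_nine = list(range(10))
data = pd.DataFrame({
    'a': zero_to_nine,
    'b': zero_to_nine,
    'c': zero_to_nine
})

# Only (a, b) pairs present in the training data may appear in the samples.
constraint = UniqueCombinations(
    columns=['a', 'b'],
)

model = CTGAN(constraints=[constraint])
model.fit(data)

# Sample while conditioning on 'a'; 'b' is expected to stay an integer column.
samples = model.sample(10, conditions={'a': 1})
samples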

The code above returns the following:

	a	b	c
0	1	1.0	5
1	1	1.0	6
2	1	1.0	5
3	1	1.0	2
4	1	1.0	3
5	1	1.0	3
6	1	1.0	6
7	1	1.0	4
8	1	1.0	2
9	1	1.0	4
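
The float values in column b above can be confirmed programmatically by comparing dtypes between the training data and the samples. The following is a minimal sketch using only pandas. The helper names `dtype_mismatches` and `restore_dtypes` are hypothetical and not part of SDV, and the `samples` frame is a hand-built stand-in for the output shown above rather than a real model sample. Casting back is only a workaround under the assumption that the affected column contains no nulls; it is not the fix that was merged for this issue.

```python
import pandas as pd


def dtype_mismatches(original, sampled):
    """Map each shared column whose dtype changed to (original dtype, sampled dtype)."""
    return {
        col: (original[col].dtype, sampled[col].dtype)
        for col in original.columns
        if col in sampled.columns and original[col].dtype != sampled[col].dtype
    }


def restore_dtypes(original, sampled):
    """Return a copy of `sampled` with mismatched columns cast back to the original dtype.

    The cast raises if a float column holding NaN has to become an integer column.
    """
    fixed = sampled.copy()
    for col in dtype_mismatches(original, sampled):
        fixed[col] = fixed[col].astype(original[col].dtype)
    return fixed


# Training data from the report: three integer columns.
data = pd.DataFrame({'a': range(10), 'b': range(10), 'c': range(10)})

# Stand-in for the reported output: 'b' has become float.
samples = pd.DataFrame({'a': [1, 1, 1], 'b': [1.0, 1.0, 1.0], 'c': [5, 6, 5]})

mismatches = dtype_mismatches(data, samples)
restored = restore_dtypes(data, samples)
print(mismatches)
print(restored.dtypes)
```

With a real model, `samples` would be replaced by the result of `model.sample(...)`. If the sampled column also contains nulls, the integer cast fails, so such rows would need to be handled first.
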

@fealho added the bug and pending review labels on Jul 12, 2021

@onacrame

I also get nulls produced by unique combinations

@onacrame

I'll see if I can produce a simple example. I should note that the proportion of joint nulls across the specific columns was small relative to the valid data

@tjhallum

I just saw this comment above by @onacrame:

> I also get nulls produced by unique combinations

I have an open issue about this same problem: #521

@katxiao self-assigned this on Jul 29, 2021
@csala closed this as completed in #538 on Aug 2, 2021
@csala added this to the 0.12.0 milestone on Aug 2, 2021
@csala assigned csala and fealho on Aug 2, 2021

csala (Contributor) commented Aug 2, 2021

A full explanation of what caused this issue has been added in the PR that closed it.
