How to use inequality constraints with datetime? #1115

Closed
montasIET opened this issue Nov 23, 2022 · 9 comments
Labels
  • bug (Something isn't working)
  • feature:constraints (Related to inputting rules or business logic)
  • resolution:duplicate (This issue or pull request already exists)

Comments

@montasIET

Environment details

If you are already running SDV, please indicate the following details about the environment in
which you are running it:

  • SDV version: 0.17.1
  • Python version: 3.8.15
  • Operating System: Windows 11 Pro

Problem description

I'm using an Inequality constraint with two columns of type 'datetime64[ns]', and I want date1 to be before date2, but I get the error:

Traceback (most recent call last):
  File "minisogei_generation.py", line 41, in <module>
    model = GaussianCopula(constraints = c, table_metadata=metadata.get_table_meta('minisogei'),
  File "C:\Users\luimon\miniconda3\envs\minisogei\lib\site-packages\sdv\tabular\copulas.py", line 171, in __init__
    super().__init__(
  File "C:\Users\luimon\miniconda3\envs\minisogei\lib\site-packages\sdv\tabular\base.py", line 110, in __init__
    'If table_metadata is given {} must be None'.format(arg.__name__))
AttributeError: 'list' object has no attribute '__name__'

How can I fix this? Thanks

montasIET added the labels new (Automatic label applied to new issues) and question (General question about the software) on Nov 23, 2022
@npatki
Contributor

npatki commented Nov 23, 2022

Hi @montasIET, it will be helpful if you could provide more information to us to help us replicate the issue.

  1. Could you provide a snippet of the code where you are defining the constraint and creating the model?
  2. (If you have written it) Could you provide your metadata.json file?

npatki added the labels bug (Something isn't working), feature:constraints (Related to inputting rules or business logic), and under discussion (Issue is currently being discussed), and removed the labels question and new, on Nov 23, 2022
@montasIET
Author

@npatki sure!
1.
data_inizio_fine_attivita = Inequality(low_column_name='DATA_INIZIO_ATTIVITA', high_column_name='DATA_FINE_ATTIVITA')

c = [data_inizio_fine_attivita]

model = GaussianCopula(constraints = c, table_metadata=metadata.get_table_meta('minisogei'), categorical_transformer = 'LabelEncoder_noised', default_distribution = 'gaussian')

  2. https://wtools.io/paste-code/bHuD

@npatki
Contributor

npatki commented Nov 28, 2022

Hi @montasIET, thanks for your reply and details! I might know what the issue is.

Right now, you cannot provide both metadata and constraints in the model. (We’re working on adding better documentation & usage for this feature.)
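
For context, the AttributeError in the traceback appears to hide the message the library intended to show. A minimal sketch, assuming only the format call that is visible in the traceback (the helper name build_message is hypothetical, and the surrounding SDV logic is not reproduced here):

```python
def build_message(arg):
    # Same format call as the last frame of the traceback
    # (sdv/tabular/base.py, line 110). Hypothetical wrapper for illustration.
    return 'If table_metadata is given {} must be None'.format(arg.__name__)


# Any list behaves the same way as the `constraints=c` argument here.
constraints = ['placeholder constraint']

message = None
try:
    build_message(constraints)
except AttributeError as error:
    # A list has no __name__, so formatting fails before the
    # intended message can be raised.
    message = str(error)

print(message)
```

The string being formatted, 'If table_metadata is given {} must be None', is the hint that matters: the extra argument is being rejected because table_metadata was also given.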

For now, you can add the constraint directly inside the metadata, and then you only need to supply the metadata. Below is an example of how to do this:

{  
  "fields": {
    ...
  },
  "constraints": [{
    "constraint": "sdv.constraints.tabular.Inequality",
    "low_column_name": "DATA_INIZIO_ATTIVITA",
    "high_column_name": "DATA_FINE_ATTIVITA"
  }]
}
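
If you would rather add the entry from Python than edit the JSON by hand, something like the following could work. This is a minimal sketch: the helper add_inequality_constraint and the placeholder field definitions are assumptions for illustration, not part of SDV, and your real metadata will have its own fields.

```python
import copy
import json


def add_inequality_constraint(metadata, low_column, high_column):
    """Return a copy of a table metadata dict with an Inequality entry appended.

    Hypothetical helper for illustration; it only manipulates a plain dict.
    """
    updated = copy.deepcopy(metadata)
    updated.setdefault("constraints", []).append({
        "constraint": "sdv.constraints.tabular.Inequality",
        "low_column_name": low_column,
        "high_column_name": high_column,
    })
    return updated


# Placeholder metadata: these field definitions are assumed, not taken
# from the metadata.json shared in this thread.
table_metadata = {
    "fields": {
        "DATA_INIZIO_ATTIVITA": {"type": "datetime"},
        "DATA_FINE_ATTIVITA": {"type": "datetime"},
    }
}

updated = add_inequality_constraint(
    table_metadata, "DATA_INIZIO_ATTIVITA", "DATA_FINE_ATTIVITA"
)
print(json.dumps(updated, indent=2))
```

The resulting dictionary would then be supplied as the metadata on its own, without a separate constraints argument.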

Let me know if that helps!

@montasIET
Author

Hi @npatki, thanks for answering!
So, done this way, it's not possible to define a custom constraint, is it?

@npatki
Contributor

npatki commented Nov 30, 2022

Hi @montasIET, we call the Inequality constraint a predefined constraint because we have already defined its logic. Until we fix #1121, I don't think it will be possible to use it in your case.

However, you can create your own custom constraint at any point. That is, you can write code to implement your own logic for what is allowed and how to achieve it in the synthetic data. This User Guide should help you get started.

@montasIET
Author

montasIET commented Nov 30, 2022

Hi @npatki, yes, I know, but what I mean is: if I create a custom constraint, how can I insert it into the metadata, since it doesn't have a predefined path?
For the Inequality constraint, I added this dict to metadata.json:
{"constraint": "sdv.constraints.Inequality", "low_column_name": "DATA_INIZIO_ATTIVITA", "high_column_name": "DATA_FINE_ATTIVITA"}

But how can I do that for a custom constraint?

@npatki
Contributor

npatki commented Nov 30, 2022

@montasIET oh got it. To use a custom constraint inside the metadata, you'd need to write the logic in a separate file. For example, you can write in a file called my_file.py:

from sdv.constraints import create_custom_constraint

<your logic>

MyCustomConstraint = create_custom_constraint(<your parameters>)

Then you should be able to reference the file and class in the metadata just like a predefined constraint:

{  
  "fields": {
    ...
  },
  "constraints": [{
    "constraint": "my_file.MyCustomConstraint",
    "first_custom_parameter": <custom_value>,
    "second_custom_parameter": <custom_value>
  }]
}
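
One practical consequence: the metadata only stores a dotted path, so my_file.py has to be importable from wherever you run your script. Below is a rough sketch of how such a path can be resolved in plain Python. It illustrates the general technique only; it is an assumption that SDV does something similar internally, and resolve_dotted_path is a hypothetical helper.

```python
import importlib


def resolve_dotted_path(path):
    """Import the module part of 'module.Name' and return the named attribute.

    Hypothetical helper; raises ImportError if the module cannot be found
    on the import path, and AttributeError if the name is missing.
    """
    module_name, _, attribute = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Standard-library stand-in for a path such as "my_file.MyCustomConstraint".
resolved = resolve_dotted_path("collections.OrderedDict")
print(resolved)
```

If a lookup like this fails for your own module, check that the file sits in the working directory or elsewhere on the Python path.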

If you have any questions or feedback about this, please feel free to file a new issue. Since this original issue is about the Inequality constraint, I'll close it in favor of #1121.

@npatki npatki closed this as completed Nov 30, 2022
npatki added the label resolution:duplicate (This issue or pull request already exists) and removed the label under discussion (Issue is currently being discussed) on Nov 30, 2022
@montasIET
Author

@npatki sorry for reopening the thread, but I followed each step and ran into this error:
raise MultipleConstraintsErrors('\n' + '\n\n'.join(map(str, errors)))
sdv.constraints.errors.MultipleConstraintsErrors:
Data is not valid for the 'CustomConstraint' constraint:
FIELD_ONE FIELD_TWO
0 .... ....
1 .... ....
2 .... ....
3 .... ....
4 .... ....
+9995 more

(Sorry, I cannot share the full data; the actual output shows values instead of ....)

@npatki
Contributor

npatki commented Dec 1, 2022

@montasIET can you start a new issue for this? It would be good to keep the current issue focused on the initial topic (using the Inequality constraint).
