NewRowSynthesis: ValueError: multi-line expressions are only valid in the context of data, use DataFrame.eval #275

Closed
darenr opened this issue Nov 28, 2022 · 2 comments
Labels: bug, feature:metrics, resolution:resolved

darenr commented Nov 28, 2022

Environment Details

  • SDV version: sdv==0.17.1
  • Python version: Python 3.9.13
  • Operating System: Linux

Error Description

pandas==1.4.3

ValueError when running NewRowSynthesis

Steps to reproduce

from sdmetrics.single_table import NewRowSynthesis
from sdv.demo import load_tabular_demo
from sdv.tabular import GaussianCopula

metadata_obj, real_data = load_tabular_demo("student_placements_pii", metadata=True)

model = GaussianCopula(
    primary_key="student_id"
)
model.fit(real_data)
synthetic_data = model.sample(250)

new_row_synthesis_score = NewRowSynthesis.compute(
    real_data=real_data, synthetic_data=synthetic_data, metadata=metadata_obj.to_dict()
)

ValueError                                Traceback (most recent call last)
Cell In [43], line 11
      8 model.fit(real_data)
      9 synthetic_data = model.sample(250)
---> 11 new_row_synthesis_score = NewRowSynthesis.compute(
     12     real_data=real_data, synthetic_data=synthetic_data, metadata=metadata_obj.to_dict()
     13 )

File ~/miniconda3/envs/mloperator/lib/python3.9/site-packages/sdmetrics/single_table/new_row_synthesis.py:104, in NewRowSynthesis.compute(cls, real_data, synthetic_data, metadata, numerical_match_tolerance, synthetic_sample_size)
    101     row_filter.append(field_filter)
    103 try:
--> 104     matches = real_data.query(' and '.join(row_filter))
    105 except TypeError:
    106     if len(real_data) > 10000:

File ~/miniconda3/envs/mloperator/lib/python3.9/site-packages/pandas/core/frame.py:4111, in DataFrame.query(self, expr, inplace, **kwargs)
   4109 kwargs["level"] = kwargs.pop("level", 0) + 1
   4110 kwargs["target"] = None
-> 4111 res = self.eval(expr, **kwargs)
   4113 try:
   4114     result = self.loc[res]

File ~/miniconda3/envs/mloperator/lib/python3.9/site-packages/pandas/core/frame.py:4240, in DataFrame.eval(self, expr, inplace, **kwargs)
   4237     kwargs["target"] = self
...
    328     )
    329 engine = _check_engine(engine)
    330 _check_parser(parser)

ValueError: multi-line expressions are only valid in the context of data, use DataFrame.eval
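
The last frames show `DataFrame.query` passing the joined filter string to `eval`, which rejects expressions that span more than one line. A minimal sketch of how that can happen, assuming a string column holds a value with an embedded newline (the `address` column and its values below are made up for illustration, and the filter is built by hand rather than by sdmetrics):

```python
import pandas as pd

# Hypothetical stand-in for real_data: one string column contains a newline.
real_data = pd.DataFrame({
    "student_id": [1, 2],
    "address": ["1 Main St\nSpringfield", "2 Oak Ave\nShelbyville"],
})

# Build a per-row filter by string interpolation, then join with ' and ',
# mirroring the query call visible in the traceback.
value = real_data.loc[0, "address"]
row_filter = ["student_id == 1", f"address == '{value}'"]
expr = " and ".join(row_filter)

# The embedded newline turns the expression into two lines, which
# DataFrame.query does not accept.
try:
    real_data.query(expr)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# The same comparison written with boolean masks never goes through the
# expression parser, so newlines in the data are harmless.
mask = (real_data["student_id"] == 1) & (real_data["address"] == value)
matches = real_data[mask]
print(len(matches))
```

The mask-based version is only an illustration of a comparison that avoids string expressions; it is not taken from the sdmetrics source.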

npatki commented Nov 28, 2022

Hi @darenr, thanks for filing this issue and providing the details. I'm moving it to our SDMetrics library since the problem is isolated to that library.

I can replicate the issue on sdmetrics 0.7.0, but it is resolved once I upgrade to the latest version, 0.8.0. We'll upgrade the dependency in the sdv library, but for now you can install the latest sdmetrics in a fresh environment:

pip install sdmetrics==0.8.0
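
To confirm which version ended up in the environment after installing, something like the following sketch could be used. The helper names `version_tuple` and `installed_at_least` are hypothetical, and the comparison only looks at the first three numeric components, so a pre-release such as `0.8.0.dev0` is treated the same as `0.8.0`.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Return the first three numeric components of a version string as ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def installed_at_least(package, minimum):
    """Return True if `package` is installed at version `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment at all.
        return False
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Versions mentioned in this thread: the fix is reported in sdmetrics 0.8.0.
    print("sdmetrics >= 0.8.0:", installed_at_least("sdmetrics", "0.8.0"))
```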

npatki transferred this issue from sdv-dev/SDV on Nov 28, 2022

npatki commented Dec 9, 2022

Hi @darenr -- Good news! We have just released SDV 0.17.2, which upgrades the SDMetrics requirement to the latest 0.8.0.

With both upgrades now in place, I believe your error should be resolved.

(If you still see the error after upgrading to SDV 0.17.2 and SDMetrics 0.8.0, feel free to reply here and I will reopen the issue for further investigation.)

npatki closed this as completed on Dec 9, 2022