Sampled context columns in PAR must be in the same order #1052

Closed
npatki opened this issue Oct 4, 2022 · 0 comments
Labels: bug (Something isn't working), data:sequential (Related to timeseries datasets)


npatki commented Oct 4, 2022

Environment Details

  • SDV version: 0.17.1

Error Description

When calling the PAR model's sample method with a context, an error is raised if the context columns are not in the same order as in the original data.

Expected Behavior: Column order should not matter. The columns are named, so each one can be identified regardless of its position.

Steps to reproduce

from sdv.demo import load_timeseries_demo
from sdv.timeseries import PAR
import pandas as pd

data = load_timeseries_demo()

model = PAR(
    entity_columns=['Symbol'],
    context_columns=['MarketCap', 'Sector', 'Industry'],
    sequence_index='Date',
    epochs=1
)

model.fit(data)

# The context columns below are in a different order than in the original data,
# which has MarketCap, then Sector, then Industry.
context = pd.DataFrame(data={
    'Symbol': ['Apple', 'Google'],  
    'Sector': ['Technology', 'Health Care'],
    'MarketCap': [1.2345e+11, 4.5678e+10],
    'Industry': ['Electronic Components', 'Medical/Nursing Services']
})

model.sample(context=context)

The above code works if the columns are reordered to match the original data:

context = pd.DataFrame(data={
    'Symbol': ['Apple', 'Google'],  
    'MarketCap': [1.2345e+11, 4.5678e+10],
    'Sector': ['Technology', 'Health Care'],
    'Industry': ['Electronic Components', 'Medical/Nursing Services']
})

model.sample(context=context)
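
As a possible workaround until this is fixed, the context columns can be reordered by name before calling sample. The sketch below uses only pandas. `reorder_context` is a hypothetical helper (not part of SDV), and `reference_columns` is an illustrative stand-in for the column order of the original data; in practice something like `data.columns` could be passed instead.

```python
import pandas as pd


def reorder_context(context, reference_columns):
    """Return `context` with its columns arranged in the order of `reference_columns`.

    Hypothetical helper, not an SDV function. Columns in `reference_columns`
    that are absent from `context` are skipped.
    """
    ordered = [col for col in reference_columns if col in context.columns]
    unknown = [col for col in context.columns if col not in ordered]
    if unknown:
        raise ValueError(f'Columns not found in reference order: {unknown}')
    return context[ordered]


# Context with Sector and MarketCap swapped relative to the original data
context = pd.DataFrame(data={
    'Symbol': ['Apple', 'Google'],
    'Sector': ['Technology', 'Health Care'],
    'MarketCap': [1.2345e+11, 4.5678e+10],
    'Industry': ['Electronic Components', 'Medical/Nursing Services']
})

# Assumed order, taken from the working example above
reference_columns = ['Symbol', 'MarketCap', 'Sector', 'Industry']

fixed_context = reorder_context(context, reference_columns)
print(list(fixed_context.columns))
```

The reordered frame can then be passed as `model.sample(context=fixed_context)`. Selecting by name keeps each value attached to its column, so only the positions change.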

npatki added the bug and data:sequential labels on Oct 4, 2022
npatki added this to the 1.0.0 milestone on Oct 4, 2022
amontanez24 self-assigned this on Mar 24, 2023