Is there a way to deal with categorical feature? #191

Closed
kja815 opened this issue Jul 14, 2021 · 9 comments

@kja815

kja815 commented Jul 14, 2021

I want to train a model with scalar or categorical features,
but I can't find a way to handle categorical features in Informer.

Is there a way to handle categorical features in Informer?

@zhouhaoyi
Owner

Categorical features are important in time-series problems; we may add this to our to-do list.
If you have best practices to share, pull requests are highly welcome.

@kja815
Author

kja815 commented Jul 23, 2021

@zhouhaoyi
Thank you for your answer.
I think the ETTh dataset contains only a single time series.
Can Informer be trained on multiple different time series that share the same features?
For example, suppose there is time-series data (such as electricity consumption) for house1, ..., house100, all with the same features.
Can a single Informer model be trained on this data?

@cookieminions
Collaborator

Hi, if each of these time series has more than one feature, Informer cannot handle this data at the moment.

@777udo

777udo commented Jul 29, 2021

@cookieminions
So Informer can handle data from multiple devices in one training set in the univariate case? Why doesn't it work for multivariate data, and are there requirements on how input data containing multiple devices has to be ordered or preprocessed (for example, multiple identical timestamps for several devices)?

@cookieminions
Collaborator

Excluding the input layer, the Informer model expects input of shape [batch_size, seq_len, dimension]. If your data consists of multiple time series that are each multivariate, the input to the input layer may instead have shape [batch_size, seq_len, num_series, num_features].
If you want to use Informer on multiple time series that each have more than one feature, you need to modify the input layer. One feasible solution is to use an embedding layer for each categorical feature, aggregate the embeddings, and then feed the aggregated embeddings to Informer.
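
A minimal sketch of what such a modified input layer might look like, assuming PyTorch. The class name `SeriesFeatureEmbedding`, its arguments, and the choice to sum the embeddings are illustrative assumptions and are not part of the Informer codebase. Each categorical feature gets its own `nn.Embedding`, the numeric features go through a linear projection, and everything is summed into one vector per time step, which yields the [batch_size, seq_len, dimension] shape mentioned above.

```python
import torch
import torch.nn as nn


class SeriesFeatureEmbedding(nn.Module):
    """Hypothetical input layer (not from the Informer repo): one embedding
    per categorical feature, summed with a projection of numeric features."""

    def __init__(self, num_numeric, cat_cardinalities, d_model):
        super().__init__()
        self.numeric_proj = nn.Linear(num_numeric, d_model)
        self.cat_embeds = nn.ModuleList(
            [nn.Embedding(n_categories, d_model) for n_categories in cat_cardinalities]
        )

    def forward(self, x_num, x_cat):
        # x_num: [batch_size, seq_len, num_numeric], float
        # x_cat: [batch_size, seq_len, num_categorical], integer category ids
        out = self.numeric_proj(x_num)
        for i, embed in enumerate(self.cat_embeds):
            # Aggregate by summation (an assumption; concatenation also works).
            out = out + embed(x_cat[..., i])
        return out  # [batch_size, seq_len, d_model]


# Example: 3 numeric features plus two categorical ones (100 and 7 categories).
layer = SeriesFeatureEmbedding(num_numeric=3, cat_cardinalities=[100, 7], d_model=16)
x_num = torch.randn(4, 96, 3)
x_cat = torch.stack(
    [torch.randint(0, 100, (4, 96)), torch.randint(0, 7, (4, 96))], dim=-1
)
out = layer(x_num, x_cat)
print(out.shape)  # torch.Size([4, 96, 16])
```

Summation keeps the output width fixed regardless of how many categorical features there are; concatenating the embeddings followed by a linear layer would be an equally plausible aggregation.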

@777udo

777udo commented Jul 30, 2021

@cookieminions thanks for your reply.
Suppose I use univariate data sets that have only timestamps and a single feature, but for multiple households, as @kja815 describes. Then there would be multiple identical timestamps referring to different households, but each input sample for the encoder has to be a sequence from one distinct household. Can the model handle this if I simply append the data of the different households into one big CSV file as input, for example Jan-Dec for household one followed by Jan-Dec for household two, and so on?

@cookieminions
Collaborator

Based on your description, can your data be organized as one big CSV containing the timestamp and all households (each household having only 1 feature), with the columns date, household1, household2, ..., householdN? If my understanding is correct, you can feed the data into Informer directly, and the model will treat the multiple series as multiple variates.
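
A small sketch of how appended per-household data could be reshaped into that column layout, assuming pandas. The column names `household` and `consumption` and the sample values are made up for illustration; only the target layout (date, household1, ..., householdN) comes from the description above.

```python
import pandas as pd

# Hypothetical long-format data: one row per (timestamp, household),
# i.e. what you get by appending each household's series to one file.
long_df = pd.DataFrame(
    {
        "date": ["2021-01-01 00:00", "2021-01-01 01:00",
                 "2021-01-01 00:00", "2021-01-01 01:00"],
        "household": ["household1", "household1", "household2", "household2"],
        "consumption": [0.5, 0.7, 1.2, 1.1],
    }
)

# Pivot so that each timestamp appears once and each household is a column.
wide_df = (
    long_df.pivot(index="date", columns="household", values="consumption")
    .sort_index()
    .reset_index()
)
wide_df.columns.name = None

print(list(wide_df.columns))  # ['date', 'household1', 'household2']
```

The resulting frame can be written out with `wide_df.to_csv(..., index=False)`. Note that `pivot` raises an error if a (date, household) pair occurs more than once, so duplicates would need to be resolved first.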

If each household has more than 1 feature, so that the columns are date, household1_feat1, household1_feat2, ..., householdN_feat1, householdN_feat2, ..., householdN_featM, you need to aggregate the features of each household and feed data such as household1_embed, household2_embed, ..., householdN_embed into Informer, where householdN_embed is the aggregate of householdN_feat1 through householdN_featM.
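
One possible way to do this aggregation, sketched in PyTorch under stated assumptions: the columns are ordered household by household as listed above, every household has the same number of features M, and a single linear layer shared across households reduces the M features to one value. The variable names and the use of `nn.Linear` as the aggregator are illustrative choices, not something prescribed by Informer.

```python
import torch
import torch.nn as nn

batch_size, seq_len, num_households, num_feats = 4, 96, 5, 3

# Wide input whose last axis is ordered
# household1_feat1, ..., household1_featM, ..., householdN_featM.
x_wide = torch.randn(batch_size, seq_len, num_households * num_feats)

# Split the flat column axis back into (household, feature).
x = x_wide.reshape(batch_size, seq_len, num_households, num_feats)

# Hypothetical aggregator shared by all households: M features -> 1 value.
aggregate = nn.Linear(num_feats, 1)
household_embed = aggregate(x).squeeze(-1)

print(household_embed.shape)  # torch.Size([4, 96, 5])
```

The result has one column per household, so it matches the [batch_size, seq_len, dimension] input shape with dimension equal to N. The aggregator would have to be trained jointly with the rest of the model.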

Please correct me if I am wrong.

@Lisa-FFY

Lisa-FFY commented Aug 1, 2021

Excuse me, my data is a binary classification problem. What should I do with my labels?

@zhouhaoyi
Owner

Could you please describe your dataset in more detail?
