Manage imbalancing in TFT #1040
Comments
Have you tried the "weight" argument when creating the dataset? You can add a column of weights to your data frame and have it used in training:
ds = TimeSeriesDataSet(
    data=data[train_data_filter],
    time_idx=time_idx_col,
    target=...,
    weight='weight',  # pass the name of a weight column in your df, samples/sampler weight(s)
    group_ids=group_ids,
    ...
)
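One way to fill such a weight column is the inverse-frequency rule that sklearn calls "balanced" class weights, n_rows / (n_groups * rows_in_group). The sketch below is only an illustration under assumptions: the helper name `balanced_weights` and the toy region labels are hypothetical, and it uses plain Python so it runs without any library installed.

```python
from collections import Counter


def balanced_weights(labels):
    """Return one weight per row: n_rows / (n_groups * rows_in_that_group)."""
    counts = Counter(labels)
    n_rows, n_groups = len(labels), len(counts)
    return [n_rows / (n_groups * counts[label]) for label in labels]


# Hypothetical toy data: four rows from US shops, one row from an EU shop.
regions = ["US", "US", "US", "US", "EU"]
weights = balanced_weights(regions)

# Rows from the rare group get a larger weight; the weights sum to n_rows.
for region, weight in zip(regions, weights):
    print(region, weight)
```

With a pandas data frame, the result could be stored as, for example, `data["weight"] = balanced_weights(data["region"].tolist())`, where the `region` column name is an assumption, and then referenced through `weight='weight'` as in the comment above. How exactly the weights enter the loss depends on the library version, so it is worth checking the documentation.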
N.B.: The DeepAR paper empirically shows the benefit of method 2) compared to not using any weights. To the best of my knowledge, they do not present any result based on method 1). That being said, in their setting the issue is the size of the dataset, and the main problem is being able to select the most relevant samples: since the total number of samples is huge, it may not be possible to go over all samples several times during training, and they show that weighting the samples based on their "velocity" greatly improves performance.
First of all, thanks to @RonanFR and @fnavruzov for your replies. Since I am struggling to get an answer and you both seem to be experts, could you kindly have a look at this question I posted quite a few days ago (which I guess will never get an answer): I know it is not good practice to post another question in a different issue, so I apologise in advance, but I cannot get past this problem, even after looking at the source code. Many thanks.
Thanks @RonanFR, @fnavruzov. I am trying to implement what you've suggested using the "
Where the Unfortunately, the described implementation raises the error below: Would you know how to solve it?
Hi @FrancescoFondaco, can you provide a detailed minimal reproducible example that raises this error? (A small toy dataset of only a few lines.)
Have you figured out this issue? I am having the same issue after adding the "weight" parameter. Thx!
Dear @FrancescoFondaco and @QijiaShao, Best wishes,
I have a dataset of several shops. For each shop I have a time series of sales.
The shops are spread unequally around the world (1000 in the US, 100 in the EU), and I need to predict sales based on location and other variables.
However, such a dataset is imbalanced.
Is there a way to manage imbalance in TFT (upsampling, downsampling, applying a weight balance similar to sklearn, or forcing each batch to contain an equal number of examples from each group)?
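The "equal number of examples per batch" option in the question can be approximated by drawing samples with a probability inversely proportional to the size of their group, which also acts as upsampling of the rare group. The sketch below illustrates the idea with the standard library only; the function name `draw_balanced_batch` and the toy labels are hypothetical. In PyTorch, `torch.utils.data.WeightedRandomSampler` is the usual tool for this kind of draw; whether and how it plugs into this library's data loaders is not confirmed here.

```python
import random
from collections import Counter


def draw_balanced_batch(labels, batch_size, rng):
    """Draw sample indices (with replacement) so each group is equally likely."""
    counts = Counter(labels)
    # Every sample gets probability mass 1 / group_size, so each group sums to 1.
    probs = [1.0 / counts[label] for label in labels]
    return rng.choices(range(len(labels)), weights=probs, k=batch_size)


# Hypothetical toy labels mirroring the question: 1000 US samples, 100 EU samples.
regions = ["US"] * 1000 + ["EU"] * 100
rng = random.Random(0)
batch = draw_balanced_batch(regions, 2000, rng)

# Share of EU samples in the draw; roughly one half instead of about 9%.
share_eu = sum(regions[i] == "EU" for i in batch) / len(batch)
print(share_eu)
```

Note that this balances groups only in expectation, not exactly within every batch, and that with time series the units being sampled are typically windows rather than raw rows, so the group label would need to be defined per window.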