Custom Data Usage From Pandas With GluonTS #10
Thank you so much for the code + detailed explanation @turkalpmd! All of us from the team deeply appreciate you sharing these positive results with us - we strive for exactly this kind of real-world impact. We're releasing the fine-tuning scripts soon; I presume your results will be even better with that. Stay tuned, and thank you again!
Also, as an experiment, try the estimator with the linear-RoPE positional embedding via:

```python
estimator = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=prediction_length,
    context_length=context_length,
    # estimator args
    input_size=estimator_args["input_size"],
    n_layer=estimator_args["n_layer"],
    n_embd_per_head=estimator_args["n_embd_per_head"],
    n_head=estimator_args["n_head"],
    scaling=estimator_args["scaling"],
    time_feat=estimator_args["time_feat"],
    rope_scaling={
        "type": "linear",
        "factor": max(
            1.0, (context_length + prediction_length) / estimator_args["context_length"]
        ),
    },
)
```
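For completeness, here is a minimal sketch of how such an estimator is typically turned into a predictor and queried, following the pattern in the repo's demo notebooks. The checkpoint-loading lines and the `estimator_args` lookup are assumptions based on those demos, not part of the comment above:

```python
import torch
from lag_llama.gluon.estimator import LagLlamaEstimator
from gluonts.evaluation import make_evaluation_predictions

# Assumption (mirrors the demo notebooks): estimator_args comes from the
# hyperparameters stored inside the checkpoint itself.
ckpt = torch.load("lag-llama.ckpt", map_location="cpu")
estimator_args = ckpt["hyper_parameters"]["model_kwargs"]

# ... build `estimator` with the rope_scaling arguments shown above, then:
lightning_module = estimator.create_lightning_module()
transformation = estimator.create_transformation()
predictor = estimator.create_predictor(transformation, lightning_module)

forecast_it, ts_it = make_evaluation_predictions(
    dataset=train_ds,  # any GluonTS dataset, e.g. the ListDataset shown later in the thread
    predictor=predictor,
    num_samples=100,
)
forecasts = list(forecast_it)
tss = list(ts_it)
```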
@turkalpmd Thanks for sharing. I see the forecasts are always of shape (100, prediction_length). Does this mean it gives predictions for the next 100 prediction lengths? Second, this uses the train_ds and predicts the last prediction_length steps. How do I enable it to make predictions beyond that? Thanks in advance.
No, it's not 100 points. My own custom data was hourly, which is why I asked for a weekly forecast built from the 24 intervals of a day. I didn't run any benchmark tests on this model; I only compared its forecasts with another Transformer time-series model, which had previously taken 3 days to train on a 24 GB RTX 3090. As far as I understand, the following code is sufficient to set the intervals, i.e. the context window and forecasting horizon:

```python
context_length = 24
prediction_length = 24 * 7
```

Splitting the data:

```python
from gluonts.dataset.common import ListDataset

train_ds = ListDataset(
    [{'target': time_series[:-prediction_length], 'start': start}],
    freq=freq
)
test_ds = ListDataset(
    [{'target': time_series, 'start': start}],
    freq=freq
)
```
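As a side note on the question above about forecasting beyond the end of the series, here is a small sketch (an assumption, not from the original comment), assuming a `predictor` built from the Lag-Llama estimator as in the earlier comment:

```python
# test_ds contains the full series, so the predictor forecasts the
# prediction_length steps after the last observed timestamp, whereas
# predicting on train_ds forecasts the held-out final window.
forecasts = list(predictor.predict(test_ds, num_samples=100))
print(forecasts[0].start_date)  # first out-of-sample timestamp
```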
@kashifmunircshs 100 represents the number of samples drawn from the predictive distribution for each timestep (since this is a probabilistic forecasting model). FYI, we uploaded a new Colab demo with a tutorial on using a CSV dataset. We explain the dimensions of the forecasts tensor there.
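To make those dimensions concrete, a small illustrative snippet using the standard GluonTS SampleForecast attributes (the 168 below assumes prediction_length = 24*7 as in the earlier comment):

```python
# Illustrative sketch of the forecast tensor described above.
forecast = forecasts[0]
print(forecast.samples.shape)        # (num_samples, prediction_length), e.g. (100, 168)
print(forecast.mean.shape)           # (prediction_length,): per-step mean over the samples
print(forecast.quantile(0.9).shape)  # (prediction_length,): per-step 90th percentile
```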
@turkalpmd Can you post the code you used to plot the graphs you reported? Thanks
@warner83 The code for plotting is in the Colab.
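Since the Colab is the canonical reference, the following is only a rough sketch of how GluonTS forecasts are commonly plotted, not the exact Colab code; `forecast` and `ts` are one pair yielded by make_evaluation_predictions:

```python
import matplotlib.pyplot as plt

# Rough sketch, not the exact Colab code: ground truth vs. probabilistic forecast.
plt.figure(figsize=(10, 4))
history = ts[-4 * prediction_length:].to_timestamp()  # ts has a PeriodIndex
plt.plot(history.index, history.values, label="target")
forecast.plot(color="g")  # plots the median and prediction intervals for the forecast window
plt.legend()
plt.show()
```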
So just FYI, the context length is best kept the same (32); the code will work if you change it, but I suggest trying 32 first before other values.
@turkalpmd
In my custom dataset I have only two columns, Time and values. How should I go about this? I'm doing:

```python
for col in df.columns:
    dataset = PandasDataset.from_long_dataframe(df, target='Inverter 1 Active Energy (D1)', item_id='time')
backtest_dataset = dataset
```

but I'm facing an item_id error.
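One way this single-series case is often handled, as a sketch under the assumption that the frame really has just a Time column and a values column (column names are taken from the question and otherwise illustrative): either index by the timestamp and pass the frame directly, or keep from_long_dataframe and add a constant item_id column.

```python
import pandas as pd
from gluonts.dataset.pandas import PandasDataset

# Sketch, not an official answer: for a single series there is no need to loop
# over columns or to use from_long_dataframe at all.
df["Time"] = pd.to_datetime(df["Time"])
single_series = df.set_index("Time").sort_index()
dataset = PandasDataset(single_series, target="values", freq="H")  # freq is an assumption
backtest_dataset = dataset

# Alternative: keep from_long_dataframe, but add a constant item_id column to the
# original long frame first, since item_id must name a column identifying each series.
# df["item_id"] = "series_0"
# dataset = PandasDataset.from_long_dataframe(
#     df, target="values", item_id="item_id", timestamp="Time", freq="H"
# )
```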
Environments
Custom data preparation
Model training and evaluation
The results were so bizarre I couldn't believe it. After 20-30 days on a project, I developed a forecaster in just 1 minute that performed better than the last model I built, which took 3 days to train on an RTX 3090. I'm going to lose my mind, holy shit.
Old TCN model
Lag-Llama model