Feat/Uber TLC Dataset #1003

Merged 14 commits on Jun 20, 2022

@gdevos010 gdevos010 commented Jun 10, 2022

Fixes #.

Summary

New dataset based on the FiveThirtyEight dataset.
Supports both daily and hourly sampling.

Other Information

codecov-commenter commented Jun 10, 2022

Codecov Report

Merging #1003 (ca332e1) into master (abf12da) will decrease coverage by 0.33%.
The diff coverage is 74.35%.

@@            Coverage Diff             @@
##           master    #1003      +/-   ##
==========================================
- Coverage   92.92%   92.58%   -0.34%     
==========================================
  Files          76       76              
  Lines        7628     7704      +76     
==========================================
+ Hits         7088     7133      +45     
- Misses        540      571      +31     
| Impacted Files | Coverage Δ |
|---|---|
| darts/models/forecasting/block_rnn_model.py | 98.14% <ø> (ø) |
| darts/models/forecasting/nbeats.py | 98.06% <ø> (ø) |
| darts/models/forecasting/nhits.py | 99.25% <ø> (ø) |
| darts/models/forecasting/rnn_model.py | 97.46% <ø> (ø) |
| darts/models/forecasting/tcn_model.py | 96.84% <ø> (ø) |
| darts/models/forecasting/tft_model.py | 96.92% <ø> (ø) |
| darts/models/forecasting/transformer_model.py | 100.00% <ø> (ø) |
| darts/datasets/dataset_loaders.py | 91.48% <63.63%> (-3.92%) ⬇️ |
| darts/datasets/__init__.py | 78.26% <65.21%> (-21.74%) ⬇️ |
| darts/models/forecasting/pl_forecasting_module.py | 93.49% <100.00%> (+0.54%) ⬆️ |

... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 14d671a...ca332e1.

@hrzn hrzn left a comment

Once more, a very nice addition, thanks a lot!
I just have a comment concerning leaving it as an option to return a multivariate series, or multiple series. Let me know if you think that's a good idea.

@hrzn hrzn left a comment

Cool, only one minor suggestion left then we can merge :)

gdevos010 commented

I added the functionality to the electricity dataset too

gdevos010 commented

Is something else needed, or can we merge this branch?

hrzn commented Jun 15, 2022

> Is something else needed, or can we merge this branch?

I think we can merge soon. The test coverage seems to have decreased, though. Could you take a look at why?

gdevos010 commented

@hrzn It's probably the _to_multi_series methods. I can add tests for them, but the Electricity dataset's _to_multi_series is very slow.

hrzn commented Jun 16, 2022

> @hrzn It's probably the _to_multi_series methods. I can add tests for them, but the Electricity dataset's _to_multi_series is very slow.

You could maybe add a test that calls _to_multi_series with a small DataFrame, and just have a sanity check that what the function returns has the right format.
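A minimal sketch of what such a sanity check could look like. It does not use the real loader: `to_multi_series` below is a hypothetical stand-in that splits a wide DataFrame into one single-column DataFrame per column, and the column names and values are made up for illustration. The actual `_to_multi_series` in this PR may return a different type, so the assertions would need to be adapted.

```python
import pandas as pd


def to_multi_series(df: pd.DataFrame) -> list:
    """Hypothetical stand-in for a loader's _to_multi_series.

    Splits a wide DataFrame into a list with one single-column
    DataFrame per original column (an assumption, not the PR's code).
    """
    return [df[[col]] for col in df.columns]


def test_to_multi_series_format():
    # Tiny frame (4 daily timestamps, 3 components) so the test stays fast.
    index = pd.date_range("2022-01-01", periods=4, freq="D")
    df = pd.DataFrame(
        {
            "a": [1.0, 2.0, 3.0, 4.0],
            "b": [5.0, 6.0, 7.0, 8.0],
            "c": [0.0, 0.0, 1.0, 1.0],
        },
        index=index,
    )

    result = to_multi_series(df)

    # Format checks only: a list with one entry per input column,
    # each entry univariate and sharing the original time index.
    assert isinstance(result, list)
    assert len(result) == df.shape[1]
    for col, series in zip(df.columns, result):
        assert list(series.columns) == [col]
        assert series.index.equals(df.index)
```

Keeping the input to a handful of rows sidesteps the slowness of the full Electricity dataset while still exercising the conversion path.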

ele_multi_series_dataset = DatasetLoaderCSV(
metadata=DatasetLoaderMetadata(
"Electricity_test.csv",
uri=_DEFAULT_PATH + "/Electricity_test.csv",

Could we put this one in datasets/test/Electricity_test.csv, i.e., create a test subfolder?

hrzn commented Jun 19, 2022

Thanks @gdevos010. I assume the tests should pass once it is merged, right? The last tiny thing I'd ask is to put the test dataset in a subfolder datasets/test/.

gdevos010 commented

@hrzn Yes, the tests passed locally. I moved the file to tests/datasets.

gdevos010 commented

This will be my last PR for about a month. I will be traveling without a laptop.

hrzn commented Jun 20, 2022

> @hrzn Yes, the tests passed locally. I moved the file to tests/datasets.

Thanks! I was referring to datasets/Electricity_test.csv though, which I think is still living under datasets/. But it's fine, I will merge the PR and we can change it later on.

hrzn commented Jun 20, 2022

> This will be my last PR for about a month. I will be traveling without a laptop.

Enjoy :)

@hrzn hrzn merged commit a83b0f7 into unit8co:master Jun 20, 2022