Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOC] add examples for loading data from common tabular csv formats #4612

Merged
merged 7 commits into from
May 24, 2023

Conversation

TonyZhangkz
Copy link
Contributor

@TonyZhangkz TonyZhangkz commented May 18, 2023

Reference Issues/PRs

Attempts to respond to #4581.

What does this implement/fix? Explain your changes.

I added a section 4 to the notebook "AA_datatypes_and_datasets.ipynb" as suggested by @fkiraly to demonstrate how to load & convert from a csv file to a sktime-compatible data container. I showed with the "mtype" function that they are indeed compatible for each case.

Does your contribution introduce a new dependency? If yes, which one?

No.

What should a reviewer concentrate their feedback on?

This is my first PR. I would appreciate feedback on any fundamentals or specifics.

Did you add any tests for the change?

No.

Any other comments?

I have yet to add myself to list of contributors. Will figure this out later.

PR checklist

For all contributions
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG] indicating whether the PR topic is related to enhancement, maintenance, documentation, or bug.

@TonyZhangkz TonyZhangkz requested a review from fkiraly as a code owner May 18, 2023 09:32
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, that's a great start! Welcome to sktime!

One thing that's confusing for the reader is that you are saving the data before you are reading it.

However, some of thet data is already avaiable as csv in datasets/data/Airline etc.
For the panel dataset, you can also save one of the smaller datasets as csv for the sake of demomstration.

Could you change that, so we use (small) csv already shipping with sktime to demonstrate the loaders?

@TonyZhangkz
Copy link
Contributor Author

One thing that's confusing for the reader is that you are saving the data before you are reading it. However, some of thet data is already avaiable as csv in datasets/data/Airline etc. For the panel dataset, you can also save one of the smaller datasets as csv for the sake of demonstration. Could you change that, so we use (small) csv already shipping with sktime to demonstrate the loaders?

Thank you for pointing that out! I didn't know there are already such csv files so I fell back to creating ones on my own. Will update that!

Also, would there be any other use cases that are typically challenging for users? I'm happy to expand this section further when I finish with other interesting issues.

@TonyZhangkz
Copy link
Contributor Author

One thing that's confusing for the reader is that you are saving the data before you are reading it. However, some of thet data is already avaiable as csv in datasets/data/Airline etc. For the panel dataset, you can also save one of the smaller datasets as csv for the sake of demonstration. Could you change that, so we use (small) csv already shipping with sktime to demonstrate the loaders?

Thank you for pointing that out! I didn't know there are already such csv files so I fell back to creating ones on my own. Will update that!

Also, would there be any other use cases that are typically challenging for users? I'm happy to expand this section further when I finish with other interesting issues.

@TonyZhangkz
Copy link
Contributor Author

Update: Done with modifications.

@TonyZhangkz TonyZhangkz reopened this May 18, 2023
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great content!

Minor but blocking request: can you ensure the code formatting checks pass?
Guide is here: https://www.sktime.net/en/latest/developer_guide/coding_standards.html

@TonyZhangkz
Copy link
Contributor Author

@fkiraly Thank you so much for the feedback and for the source! I'm rather new to the community so I did not know the norms. Is there a way I can run the other checks myself to ensure we do not run into these mechanical errors when making a PR? Thanks!

@fkiraly
Copy link
Collaborator

fkiraly commented May 21, 2023

yes, of course! Have you seen the guide? It has instructions how to set up the pre-commit locally.

@TonyZhangkz
Copy link
Contributor Author

Yes (and thanks again!), I'm following the guide for pre-commit right now. For some reason, my commits are not being properly pushed or synced to this GitHub PR. I'll try to debug this myself.

@TonyZhangkz
Copy link
Contributor Author

@fkiraly My pushes have not been updated since last night. I'm contacting support and will try to update this PR soon.

@fkiraly
Copy link
Collaborator

fkiraly commented May 22, 2023

@TonyZhangkz, that's odd, sorry to hear!
Have you checked whether your problem is with GitHub, or with git?

Feel free to ask in the help-desk channel on discord, there might be a simple explanation.

@TonyZhangkz
Copy link
Contributor Author

@fkiraly Thanks a lot! That's actually very good suggestion as I didn't know it could be a local git bug. I'm able to update the PR now. May I know if I could run the tests (e.g. code quality) myself instead of pending them in the PR?

@fkiraly
Copy link
Collaborator

fkiraly commented May 22, 2023

@TonyZhangkz, a maintainer has to run the tests, I've started them now

@fkiraly
Copy link
Collaborator

fkiraly commented May 23, 2023

alright, let's see if it runs through!

fkiraly
fkiraly previously approved these changes May 24, 2023
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great, thanks! This will be really helpful to sktime users!

I've added some section headers, updated the table of contents, straightened some explanations, and added a third (difficult) example, hope you're ok with that.

@fkiraly fkiraly merged commit 6489d3d into sktime:main May 24, 2023
23 checks passed
@fkiraly fkiraly added documentation Documentation & tutorials module:datasets&loaders data sets and data loaders labels May 24, 2023
@TonyZhangkz
Copy link
Contributor Author

I've added some section headers, updated the table of contents, straightened some explanations, and added a third (difficult) example, hope you're ok with that.

Great, and thanks again for everything on my first PR! I've learned quite some from how you'd do the explanations. And I was actually gonna raise the question for whether or not we're doing the .tsv files. So I'm more than OK with the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Documentation & tutorials module:datasets&loaders data sets and data loaders
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants