Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored Dataset Class #576

Merged
merged 3 commits into from Feb 19, 2021

Conversation

DebadityaPal
Copy link
Contributor

@DebadityaPal DebadityaPal commented Feb 17, 2021

This PR is a temporary fix for the refactoring of the Dataset Class.

Since the Dataset class is too heavy and proper refactoring would take a significant amount of time and resources, this is a stopgap solution in which certain functions that convert hub datasets to other frameworks and vice-versa have been relocated from the base Dataset class file dataset.py to a new file integrations.py. These functions are not imported when a Dataset object is created, however, if the user calls these functions, they are subsequently imported and used.
Closes #526

@github-actions
Copy link

Locust summary

Git references

Initial: d9ebc7e
Terminal: a7a9f65

hub/api/integrations.py
Changes:
hub/api/dataset.py
Changes:

@DebadityaPal
Copy link
Contributor Author

Also, @mynameisvinn should I keep the docstrings for the functions in both the files? Or would it be cleaner to remove them from dataset.py

@codecov
Copy link

codecov bot commented Feb 17, 2021

Codecov Report

Merging #576 (6bceea7) into master (d9ebc7e) will increase coverage by 0.08%.
The diff coverage is 88.70%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #576      +/-   ##
==========================================
+ Coverage   89.10%   89.18%   +0.08%     
==========================================
  Files          52       53       +1     
  Lines        3845     3874      +29     
==========================================
+ Hits         3426     3455      +29     
  Misses        419      419              
Impacted Files Coverage Δ
hub/api/integrations.py 88.23% <88.23%> (ø)
hub/api/dataset.py 89.32% <100.00%> (+0.88%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d9ebc7e...6bceea7. Read the comment docs.

@AbhinavTuli
Copy link
Contributor

Looks good to me. I don't think we should remove the docstrings as we need them to show up when the user calls the function from Dataset class. The codecov/patch is a little short, other than that should be good to merge unless @mynameisvinn has any more suggestions.

@DebadityaPal
Copy link
Contributor Author

I'll try to get better coverage for the patch

@mynameisvinn
Copy link
Contributor

@DebadityaPal thanks, will review.

  1. Docstrings. Always, always keep docstrings. This goes for modules, classes, and methods.
  2. When you submit the updated PR, be sure to include a verbose description (why, what) since this is a stopgap solution.

@mynameisvinn
Copy link
Contributor

mynameisvinn commented Feb 18, 2021

@DebadityaPal two minor modifications.

  • Can you rename methods to_pytorch to _to_pytorch? This naming convention indicates that users should invoke to_pytorch from dataset.py, and not directly import to_pytorch from integrations.py.
  • Make sure __init__.py does not automatically import anything from integrations.py.

Then I think we're all set!

@DebadityaPal
Copy link
Contributor Author

DebadityaPal commented Feb 18, 2021

@mynameisvinn, @AbhinavTuli could you help me a bit with the codecov?
Most of the uncovered lines are ModuleNotFoundErrors .
However certain functions like the tf_gen() function in the _to_tensorflow() function, are being labelled as uncovered even though they are executed in the unit tests.
I put certain print statements in them to get a better sense of the situation, and they were printed indeed.

@AbhinavTuli
Copy link
Contributor

Yeah, I'm aware of those functions not being covered. Our hypothesis is that codecov only covers the code that's on the main thread. I just took a look at the codecov and I think we can ignore the codecov/patch drop, it's not really on you, it just dropped because you happened to move the functions that were not covered, the codecov/project is going up anyway.

@mynameisvinn mynameisvinn merged commit 3746cde into activeloopai:master Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Refactoring Dataset Class
3 participants