New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transform repo into using nbdev framework #6
Conversation
|
README.md
Outdated
|
||
|
||
True | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice to have: A little tqdm bar to see how the download progresses? Can be added to the ToDo's if you want it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
Before anything else, please install the git hooks that run automatic scripts during each commit and merge to strip the notebooks of superfluous metadata (and avoid merge conflicts). After cloning the repository, run the following command inside it: | ||
``` | ||
nbdev_install_git_hooks | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to elaborate a bit more on the Dev Cycle
, maybe also add a linter (black?)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add link to: https://github.com/fastai/nbdev/ to get people started ... and add run pip install nbdev
or conda install -c fastai nbdev
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is the standard text from nbdev, but its interesting that they did not include these links :)
The underlying download library allowed to show the downloaded file size as a progress. Added that as an option. |
Google colab fails when loading and transforming the numpy data into a torch array. The main issue is the "slates" data. It is currently stored as int64, but we can try to reduce it to int32. Compressed saving only reduces from 1.2gb to 1.1gb, but the in-memory usage is halfed:
|
I removed the attributes from git lfs to make nbdev work. So a bug now is that git lfs is not working properly. For usage, however, this is less relevant as users can download the files from gdrive using the buildin function. I also created a int32 version of the data.npz file that is uploaded to gdrive, decreasing the memory footprint and makes it usable in google colab. I'll merge with these fixes. |
nbdev is a framework from fast.ai that develops everything in notebooks. It its heavily opinionated, but also gives us documentation and pip packaging "for free". Ive transformed the repo into this format.
Todo: My main issue now is that I removed the following lines from .gitattributes which means git lfs no longer works. But nbdev did not build with these present..