Clarify caching #131
Codecov Report
@@               Coverage Diff               @@
##        492-final-models    #131    +/-   ##
=============================================
+ Coverage        85.7%    87.2%    +1.5%
=============================================
  Files              25       25
  Lines            1462     1472     +10
=============================================
+ Hits             1253     1284     +31
+ Misses            209      188     -21
Force-pushed from f227acf to 5fecc96.
492-final-models changed the final model checkpoint names, so if we PR this branch into v2 the tests will fail for unrelated reasons (e.g. older model files not found). It therefore makes more sense to merge this into 492 rather than into v2. Tests are all passing locally but won't be run on CI since this PR doesn't target v2. Ready for review @r-b-g-b
Just want to think about whether we really want to support loading .env, plus a few small changes.
zamba/__init__.py
@@ -1,3 +1,15 @@
from dotenv import load_dotenv
Are we sure we want to load env variables from .env? I'm not sure I've ever seen that in a library -- I might think that it's more standard to have the user source .env or MYENV=thing python ... explicitly.
One unexpected behavior that could result from using dotenv is that find_dotenv will keep walking up the file hierarchy until it finds something, so if the user hasn't made a .env in their working directory, it may pull in env variables from a .env in a higher directory.
I think given that we cover most (all?) of the configuration in our config.yaml files, we may want to get rid of dotenv entirely.
I guess one case I can imagine is for cache dirs -- since those are machine-specific, a user might want to leave them out of the yaml file and include the machine-specific dir in a .env file on that machine.
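To make the upward-search concern concrete, here is a minimal standard-library sketch of that behavior. It is not python-dotenv's implementation: find_env_upwards and picks_up_parent_env are hypothetical names, and the project/ directory layout is made up for the demo.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_env_upwards(start: Path, filename: str = ".env") -> Optional[Path]:
    """Hypothetical helper: return the nearest `filename` in `start` or any parent."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def picks_up_parent_env() -> bool:
    """Create project/ with a .env only in its parent, then search from project/."""
    with tempfile.TemporaryDirectory() as tmp:
        parent = Path(tmp).resolve()
        project = parent / "project"
        project.mkdir()
        (parent / ".env").write_text("LOG_LEVEL=DEBUG\n")
        found = find_env_upwards(project)
        # The search leaves the working directory and lands on the parent's file.
        return found is not None and found.parent == parent


if __name__ == "__main__":
    print(picks_up_parent_env())
```

The working directory has no .env of its own, yet the search still returns one, which is the surprise described above.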
Oh good call, we don't want .env files from outside the working dir. You can specify logging and video suffixes only through env variables (not in the configs), but agreed, I think we can remove the load_dotenv and just have users source .env as you say.
@r-b-g-b if the user does source .env, then when we do os.getenv("LOG_LEVEL"), it won't find anything. Is there a way to use environment variables without needing to do load_dotenv anywhere in the codebase?
Huh, well I learned something about source, which is that it doesn't export variables to subprocesses started from the current shell. So with
# .env
LOG_LEVEL=DEBUG
$ source .env
$ echo $LOG_LEVEL
DEBUG
# current shell sees the variable
but
import os; os.getenv("LOG_LEVEL")
# None
# Python does not see the variable
Apparently, the way to do this is one of:
- Use an export statement in the .env file:
# .env
export LOG_LEVEL=DEBUG
source .env
- Use export after sourcing:
# .env
LOG_LEVEL=DEBUG
source .env
export LOG_LEVEL
- Use set -a before sourcing, so that every variable assigned afterwards is exported automatically:
set -a
source .env
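The difference between the plain assignment and the three options can be checked end to end from Python. The sketch below assumes a POSIX sh is on the PATH; child_sees is a hypothetical helper, and it uses `. ./.env`, the POSIX spelling of `source .env`, so that it also works in shells without the source builtin.

```python
import os
import shlex
import subprocess
import sys
import tempfile
from pathlib import Path

# One-liner run as the child process; it reports what Python sees.
CHILD_CODE = 'import os; print(os.getenv("LOG_LEVEL"))'


def child_sees(env_file_body: str, shell_commands: str) -> str:
    """Write a .env file, run shell_commands in a fresh sh, then start Python
    from that shell and return the child's view of LOG_LEVEL."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, ".env").write_text(env_file_body)
        child = f"{shlex.quote(sys.executable)} -c {shlex.quote(CHILD_CODE)}"
        script = f"{shell_commands}\n{child}"
        # Start from an environment that definitely lacks LOG_LEVEL.
        clean_env = {k: v for k, v in os.environ.items() if k != "LOG_LEVEL"}
        result = subprocess.run(
            ["sh", "-c", script],
            cwd=tmp, env=clean_env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    # Plain assignment: only a shell variable, so the child prints None.
    print(child_sees("LOG_LEVEL=DEBUG\n", ". ./.env"))
    # Option 1: export inside the .env file.
    print(child_sees("export LOG_LEVEL=DEBUG\n", ". ./.env"))
    # Option 2: export after sourcing.
    print(child_sees("LOG_LEVEL=DEBUG\n", ". ./.env\nexport LOG_LEVEL"))
    # Option 3: set -a before sourcing.
    print(child_sees("LOG_LEVEL=DEBUG\n", "set -a\n. ./.env"))
```

Only the first call leaves the child without LOG_LEVEL; each of the three options makes it visible to the Python process.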
@jayqi do you know of any examples in our open source libraries where environment variables are used? just trying to figure out what the right convention is here
In cloudpathlib, we read the environment variables from os.environ but we don't do load_dotenv or anything like that.
In cloudpathlib, we read the environment variables from os.environ but we don't do load_dotenv or anything like that.
So the expectation is that the user does option 2 or 3 as Robert listed above?
For some exposition on source and .env files:
source just executes all of the commands of the shell script in your current shell. So LOG_LEVEL=DEBUG is just setting a regular shell variable. export is specifically the command for setting an environment variable. That's why you need export in your bash profiles and stuff when setting env vars.
I guess usually libraries expect you to know how to set environment variables and let you do it however you want.
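As a rough sketch of that convention, a library can read its settings straight from os.environ and leave it to the user to decide how the variables get there. The function names and the INFO default below are assumptions for illustration, as is treating MODEL_CACHE_DIR as an environment variable; none of this is zamba's or cloudpathlib's actual code.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_log_level(environ: Mapping[str, str] = os.environ, default: str = "INFO") -> str:
    """Return LOG_LEVEL from the environment, or the (assumed) default if unset."""
    return environ.get("LOG_LEVEL", default).upper()


def read_model_cache_dir(environ: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return MODEL_CACHE_DIR as a Path, or None if it is unset or empty."""
    value = environ.get("MODEL_CACHE_DIR")
    return Path(value) if value else None


if __name__ == "__main__":
    # Whatever the user exported before starting Python is what gets picked up.
    print(read_log_level(), read_model_cache_dir())
```

Taking the mapping as a parameter keeps the functions easy to test without touching the real process environment.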
cool this filled in some knowledge gaps for me around environment variables. thanks @jayqi!
@r-b-g-b this is ready for another review
All good!
* set ability to turn off caching for prediction
* reomve cache dir from cli
* test cache dir is set but not used
* rename to MODEL_CACHE_DIR for consistency
* rename cache_dir to model_cache_dir
* move video cache dir into configs
* remove unneeded code as caching is off by default
* rename to video_cache_dir for clarity
* remove
* do not support setting in configs
* put back in settings
* lint and such
* rebase fix
* put cache_dir and cleanup option on video laoder config
* add tests for caching
* get empty vlc if none is passed
* put within func to avoid writing to real path
* add logging
* bug fix
* reomve old change
* loguru uses warning not warn
* load dotenv in init
* setup logger in init and rename to log_level for simplicity
* cleanup does not change hash
* fix test
* lint
* add regression test
* do not use load_dotenv
<2721979+jayqi@users.noreply.github.com> Co-authored-by: Jay Qi <jayqi@users.noreply.github.com> Co-authored-by: Katie Wetstone <klwetstone@gmail.com> Co-authored-by: Robert Gibboni <robert@drivendata.org>
Main change:
- model cache dir (where weights get downloaded to)
  - rename to `model_cache_dir` for clarity
  - remove `model_cache_dir` from the CLI, given it's easy to set in configs and isn't a main option we expect people to set
- video cache dir (for .npy files)
  - add `cache_dir` to VideoLoaderConfig, which gets used by load_video_frames, so that you can either use your env variable (VIDEO_CACHE_DIR) or specify it on the config
  - rename `LOAD_VIDEO_FRAMES_CACHE_DIR` to `VIDEO_CACHE_DIR`
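For reviewers, here is a rough sketch of how the video cache dir could be resolved from either the config or the `VIDEO_CACHE_DIR` env variable. `VideoLoaderConfigSketch` is a hypothetical stand-in, not the actual `VideoLoaderConfig`, and the precedence (explicit config value first, env variable second, caching off otherwise) is an assumption rather than something confirmed by this PR.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union


@dataclass
class VideoLoaderConfigSketch:
    """Hypothetical stand-in for VideoLoaderConfig; not zamba's actual class."""

    cache_dir: Optional[Union[str, Path]] = None

    def __post_init__(self) -> None:
        if self.cache_dir is not None:
            # Assumed precedence: a value set on the config wins.
            self.cache_dir = Path(self.cache_dir)
            return

        # Otherwise fall back to the env variable; unset or empty means no caching.
        env_value = os.environ.get("VIDEO_CACHE_DIR")
        self.cache_dir = Path(env_value) if env_value else None


if __name__ == "__main__":
    print(VideoLoaderConfigSketch().cache_dir)
    print(VideoLoaderConfigSketch(cache_dir="/data/frames").cache_dir)
```

Leaving both unset keeps `cache_dir` as `None`, which matches the intent that caching is off by default.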
Outstanding:
Bonus fixes:
- loguru uses `warning`, not `warn`