Clarify caching #131

ejm714 · 2021-10-14T18:07:58Z

Main change:

expose the load video frames cache in the configs (set up npy_cache_factory to do so)

model cache dir (where weights get downloaded to)

rename cache_dir to model_cache_dir for clarity
don't expose the model_cache_dir in the CLI given it's easy to set in configs and isn't a main option we expect people to set

video cache dir (for .npy files)

adds a cache_dir to VideoLoaderConfig which gets used by load_video_frames, so that you can either use your env variable (VIDEO_CACHE_DIR) or specify on the config.
adds a cleanup option to VideoLoaderConfig which allows you to delete or keep the cache after training/inference
renames environment variable LOAD_VIDEO_FRAMES_CACHE_DIR to VIDEO_CACHE_DIR
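As a rough illustration of the video cache options listed above, here is a minimal sketch using only the standard library. It is not zamba's implementation: `VideoLoaderConfigSketch`, the `cleanup_cache` field name, the `fps` placeholder option, and `cache_key` are all hypothetical. The idea that the cleanup flag stays out of the cache hash follows the "cleanup does not change hash" commit in this PR; also leaving `cache_dir` out of the hash is an assumption.

```python
import hashlib
import json
import os
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Optional


def _default_cache_dir() -> Optional[Path]:
    # Fall back to the VIDEO_CACHE_DIR environment variable when no value is given.
    value = os.environ.get("VIDEO_CACHE_DIR")
    return Path(value) if value else None


@dataclass
class VideoLoaderConfigSketch:  # hypothetical stand-in for VideoLoaderConfig
    fps: Optional[float] = None  # placeholder for a real frame-loading option
    cache_dir: Optional[Path] = field(default_factory=_default_cache_dir)
    cleanup_cache: bool = False  # hypothetical name for the cleanup option


def cache_key(video_path: str, config: VideoLoaderConfigSketch) -> str:
    """Hash the video path plus the options that affect the loaded frames."""
    options = asdict(config)
    # Where the cache lives and whether it is deleted afterwards
    # do not change the frames, so they are left out of the hash.
    options.pop("cache_dir")
    options.pop("cleanup_cache")
    payload = json.dumps({"video": video_path, "options": options}, sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()
```

With this shape, a value passed explicitly on the config wins over the environment variable, and toggling cleanup does not invalidate previously cached frames.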

Outstanding:

tests for caching
add logging

Bonus fixes:

moves loguru set up and env loading to init so we always have access to these vars
logger uses warning not warn

github-actions · 2021-10-14T18:10:44Z

🚀 Deployed on https://deploy-preview-131--silly-keller-664934.netlify.app

codecov · 2021-10-14T18:25:39Z

Codecov Report

Merging #131 (a258bf8) into 492-final-models (bb08926) will increase coverage by 1.5%.
The diff coverage is 100.0%.

❗ Current head a258bf8 differs from pull request most recent head e1b5729. Consider uploading reports for the commit e1b5729 to get more accurate results

@@                Coverage Diff                 @@
##           492-final-models    #131     +/-   ##
==================================================
+ Coverage              85.7%   87.2%   +1.5%     
==================================================
  Files                    25      25             
  Lines                  1462    1472     +10     
==================================================
+ Hits                   1253    1284     +31     
+ Misses                  209     188     -21

Impacted Files	Coverage Δ
zamba/cli.py	`90.9% <ø> (+1.2%)`	⬆️
zamba/models/model_manager.py	`85.5% <ø> (ø)`
zamba/settings.py	`90.0% <ø> (+8.7%)`	⬆️
zamba/data/video.py	`79.7% <100.0%> (+8.8%)`	⬆️
zamba/models/config.py	`97.0% <100.0%> (ø)`
zamba/pytorch/dataloaders.py	`92.5% <100.0%> (ø)`

ejm714 · 2021-10-16T19:08:01Z

492-final-models changed the final model checkpoint names, meaning if we PR this branch into v2 the tests will fail for unrelated reasons (e.g. older model files not found). Therefore, it makes more sense to merge this into 492 rather than into v2. Tests are all passing locally but won't be run on CI since this isn't into v2. Ready for review @r-b-g-b

zamba/data/video.py

r-b-g-b

Just want to think about whether we really want to support loading .env + a few small changes.

r-b-g-b · 2021-10-18T14:49:27Z

zamba/__init__.py

@@ -1,3 +1,15 @@
+from dotenv import load_dotenv


Are we sure we want to load env variables from .env? I'm not sure I've ever seen that in a library -- I might think that it's more standard to have the user source .env or MYENV=thing python ... explicitly.

One unexpected behavior that could result from using dotenv is that find_dotenv will keep walking up the file hierarchy until it finds something, so if the user hasn't made a .env in their working directory, it may pull in env variables from an .env in a higher directory.

I think given that we cover most (all?) of the configuration in our config.yaml files, we may want to get rid of dotenv entirely.

I guess one case I can imagine is for cache dirs -- since those are machine-specific, a user might want to leave it out of the yaml file and include the machine-specific dir in an .env file on that machine.
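To make the concern about walking up the file hierarchy concrete, here is a small standard-library sketch of that kind of upward search. It is not python-dotenv's code; `find_upwards` is a hypothetical helper that only shows how a file in a parent directory can be picked up when the starting directory has none.

```python
from pathlib import Path
from typing import Optional


def find_upwards(filename: str, start: Path) -> Optional[Path]:
    """Return the closest `filename` found in `start` or any of its parents."""
    start = start.resolve()
    # Check the starting directory first, then each ancestor up to the filesystem root.
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Called from a project directory without its own `.env`, a search like this would return a `.env` from a parent directory if one exists, which is the surprising behavior described above.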

ejm714 · 2021-10-18

oh good call, we don't want .envs from outside the working dir. you can specify logging and video suffixes only through env variables (not on the configs), but agree i think we can remove the load_dotenv and just have users source .env as you say

ejm714 · 2021-10-18

@r-b-g-b if the user does source .env, when we do os.getenv("LOG_LEVEL"), it won't find anything. is there a way to use environment variables without needing to do load_dotenv anywhere in the codebase?

r-b-g-b · 2021-10-18

Huh well I learned something about source, which is that it doesn't export variables to subprocesses of the current shell. So with

# .env
LOG_LEVEL=DEBUG

$ source .env
$ echo $LOG_LEVEL
DEBUG  # current shell sees the env

but

import os; os.getenv("LOG_LEVEL")  # None: Python does not see the env

Apparently, the way to do this is one of:

1. Use an export statement in the .env file:

# .env
export LOG_LEVEL=DEBUG

source .env

2. Use export after sourcing:

# .env
LOG_LEVEL=DEBUG

source .env
export LOG_LEVEL

3. Use the allexport shell option:

set -a
source .env
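The difference between a plain assignment and an exported variable can be checked from Python. The sketch below assumes a POSIX `sh` is available (so it will not work as written on Windows); `python_child_sees` is a hypothetical helper. It inlines the lines that `source .env` would run in the current shell, then starts a Python child process and reports what that child sees for `LOG_LEVEL`.

```python
import os
import subprocess
import sys

PRINT_VAR = 'import os; print(os.getenv("LOG_LEVEL"))'


def python_child_sees(shell_setup: str) -> str:
    """Run `shell_setup` in sh, then launch a Python child and return its view of LOG_LEVEL."""
    # Start from a clean slate so an existing LOG_LEVEL does not leak in.
    env = {k: v for k, v in os.environ.items() if k != "LOG_LEVEL"}
    # Pass the interpreter path and the snippet through the environment to avoid quoting issues.
    env["PY"] = sys.executable
    env["CODE"] = PRINT_VAR
    script = shell_setup + '\n"$PY" -c "$CODE"'
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for setup in (
        "LOG_LEVEL=DEBUG",                   # plain shell variable
        "export LOG_LEVEL=DEBUG",            # option 1
        "LOG_LEVEL=DEBUG\nexport LOG_LEVEL", # option 2
        "set -a\nLOG_LEVEL=DEBUG",           # option 3
    ):
        print(repr(setup), "->", python_child_sees(setup))
```

Only the plain assignment leaves the child process without the variable; each of the three options makes it visible.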

ejm714 · 2021-10-18

@jayqi do you know of any examples in our open source libraries where environment variables are used? just trying to figure out what the right convention is here

jayqi · 2021-10-18

In cloudpathlib, we read the environment variables from os.environ but we don't do load_dotenv or anything like that.

ejm714 · 2021-10-18

> In cloudpathlib, we read the environment variables from os.environ but we don't do load_dotenv or anything like that.

so the expectation is that the user does option 2 or 3 as robert listed above?

jayqi · 2021-10-18

For some exposition on source and .env files:

source just executes all of the commands of the shell script in your current shell. So

LOG_LEVEL=DEBUG

is just setting a regular shell variable. export is specifically the command for setting an environment variable. That's why you need export in your bash profiles and stuff when setting env vars.

https://www.cs.ait.ac.th/~on/O/oreilly/unix/upt/ch06_01.htm#:~:text=The%20difference%20between%20environment%20variables,including%20another%20shell%20(38.4).

jayqi · 2021-10-18

I guess usually libraries expect you to know how to set environment variables and let you do it however you want.
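A minimal sketch of that convention, reading straight from the process environment and leaving it to the user to set the variable however they like. `get_log_level` is a hypothetical helper, and the `"INFO"` default and upper-casing are assumptions rather than anything stated in this thread.

```python
import os
from typing import Mapping


def get_log_level(environ: Mapping[str, str] = os.environ, default: str = "INFO") -> str:
    """Read LOG_LEVEL from the process environment, falling back to `default`."""
    return environ.get("LOG_LEVEL", default).upper()
```

Because the function only consults the environment it is given, it works the same whether the variable came from `export`, `set -a`, or a `LOG_LEVEL=DEBUG python ...` prefix.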

ejm714 · 2021-10-18

cool this filled in some knowledge gaps for me around environment variables. thanks @jayqi!

zamba/data/video.py

zamba/settings.py

ejm714 · 2021-10-18T19:49:18Z

@r-b-g-b this is ready for another review

r-b-g-b

All good!

ejm714 marked this pull request as ready for review October 14, 2021 22:28

klwetstone mentioned this pull request Oct 15, 2021

Update v2 docs with new changes #132

Merged

ejm714 changed the base branch from v2 to 492-final-models October 16, 2021 18:40

ejm714 added 20 commits October 16, 2021 11:51

set ability to turn off caching for prediction

61592b7

reomve cache dir from cli

d623912

test cache dir is set but not used

4e61adf

rename to MODEL_CACHE_DIR for consistency

bc96a27

rename cache_dir to model_cache_dir

240ba24

move video cache dir into configs

e9322b5

remove unneeded code as caching is off by default

59fe563

rename to video_cache_dir for clarity

10d27ae

remove

838d874

do not support setting in configs

531810f

put back in settings

6ded38e

lint and such

980f98d

rebase fix

529cfd2

put cache_dir and cleanup option on video laoder config

ce74a90

add tests for caching

530e27b

get empty vlc if none is passed

ebf42ef

put within func to avoid writing to real path

98f9eda

add logging

252da67

bug fix

b49d374

reomve old change

5fecc96

ejm714 force-pushed the 451-no-caching-predict branch from f227acf to 5fecc96 Compare October 16, 2021 19:00

ejm714 requested a review from r-b-g-b October 16, 2021 19:05

loguru uses warning not warn

4be02c5

ejm714 and others added 4 commits October 17, 2021 00:21

load dotenv in init

6fa7378

setup logger in init and rename to log_level for simplicity

f6174f2

cleanup does not change hash

7093b44

fix test

7f43799

ejm714 requested a review from pjbull October 17, 2021 17:31

r-b-g-b requested changes Oct 18, 2021

ejm714 added 3 commits October 18, 2021 09:47

lint

50c8806

add regression test

8bad7d0

do not use load_dotenv

e1b5729

r-b-g-b approved these changes Oct 18, 2021

ejm714 merged commit 8ef2340 into 492-final-models Oct 18, 2021

ejm714 deleted the 451-no-caching-predict branch October 18, 2021 19:54
