This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

prepare_dutchf3.py vertical sample locations drops patches at the bottom of the volume - leads to worse results in that region #259

Closed
yalaudah opened this issue Apr 16, 2020 · 0 comments · Fixed by #282 or #386
Assignees
Labels
Prior: High Type: Accuracy related to increasing performance accuracy

Comments

@yalaudah
Contributor

In the prepare_dutchf3.py script, the vertical patch locations are computed in such a way that the bottom of the volume is not sampled. The code doesn't pad the volume, so unless the depth of the volume is an integer multiple of patch_size, the bottom part of the volume is never sampled. This significantly affects the results for the deeper classes (Zechstein, Scruff, Chalk).

Here's the line of code that needs to be fixed (either by padding the volume or by manually adding patches from the bottom of the volume):

vert_locations = range(0, depth - patch_size, patch_size)

This is also a problem for the horizontal patch locations, but to a lesser degree, since the data doesn't change much in the horizontal direction.
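
A minimal sketch of the gap described above, assuming each patch spans [start, start + patch_size) along the depth axis. The values depth = 255 and patch_size = 100 are illustrative only, and all helper names are hypothetical. The first function reproduces the quoted line; the other two sketch the two remedies mentioned above (adding a patch anchored at the bottom of the volume, and symmetric padding so that the padded depth is a multiple of patch_size). Neither is the code from the pull requests that closed this issue.

```python
def original_vert_locations(depth, patch_size):
    # The line quoted in this issue, wrapped in a function.
    return list(range(0, depth - patch_size, patch_size))


def covered_depth(locations, patch_size):
    # Deepest sample index (exclusive) reached by any patch, assuming
    # each patch spans [start, start + patch_size).
    return max(locations) + patch_size if locations else 0


def locations_with_bottom_patch(depth, patch_size):
    # Hypothetical remedy 1: keep the regular grid and add one extra
    # patch that ends exactly at the bottom of the volume.
    if depth < patch_size:
        raise ValueError("patch_size must not exceed depth")
    locations = list(range(0, depth - patch_size + 1, patch_size))
    last_start = depth - patch_size
    if locations[-1] != last_start:
        locations.append(last_start)
    return locations


def symmetric_depth_padding(depth, patch_size):
    # Hypothetical remedy 2: amount of padding for the top and bottom so
    # that the padded depth is an integer multiple of patch_size.
    total = -depth % patch_size
    pad_top = total // 2
    return pad_top, total - pad_top


depth, patch_size = 255, 100  # illustrative values only

original = original_vert_locations(depth, patch_size)
print("original:", original, "covers up to", covered_depth(original, patch_size))

extra = locations_with_bottom_patch(depth, patch_size)
print("bottom patch added:", extra, "covers up to", covered_depth(extra, patch_size))

pad_top, pad_bottom = symmetric_depth_padding(depth, patch_size)
padded_depth = depth + pad_top + pad_bottom
padded = list(range(0, padded_depth - patch_size + 1, patch_size))
print("padding:", (pad_top, pad_bottom), "locations in padded volume:", padded)
```

Under the same assumption, the exclusive stop of range also appears to skip the last patch when depth is an exact multiple of patch_size: depth = 200 with patch_size = 100 yields a single start at 0.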

@yalaudah yalaudah self-assigned this Apr 16, 2020
@yalaudah yalaudah added the Prior: ShowStopper Blocks a release label Apr 16, 2020
@yalaudah yalaudah added this to the Ver 1.0 milestone Apr 16, 2020
@yalaudah yalaudah linked a pull request Apr 22, 2020 that will close this issue
2 tasks
@maxkazmsft maxkazmsft added Type: Correctness anything to do with repo being technically correct Type: Enhancement This an enhancement to an existing feature labels May 7, 2020
maxkazmsft added a commit that referenced this issue May 21, 2020
* correctness branch setup (#251)

* created correctness branch, trimmed experiments to Dutch F3 only

* trivial change to re-trigger build

* dummy PR to re-trigger malfunctioning builds

* reducing scope further (#258)

* created correctness branch, trimmed experiments to Dutch F3 only

* trivial change to re-trigger build

* dummy PR to re-trigger malfunctioning builds

* reducing scope of the correctness branch further

* added branch triggers

* 214 Ignite 0.3.0 upgrade (#261)

* upgraded to Ignite 0.3.0 and fixed upgrade compatibility

* added seeds and modified notebook for ignite 0.3.0

* updated code and tests to work with ignite 0.3.0

* made code consistent with Ignite 0.3.0 as much as possible

* fixed iterator epoch_length bug by subsetting validation set

* applied same fix to the notebook

* bugfix in distributed train.py

* increased distributed tests to 2 batches - hoping for one batch per GPU

* resolved rebase conflict

* added seeds and modified notebook for ignite 0.3.0

* updated code and tests to work with ignite 0.3.0

* made code consistent with Ignite 0.3.0 as much as possible

* fixed iterator epoch_length bug by subsetting validation set

* applied same fix to the notebook

* bugfix in distributed train.py

* increased distributed tests to 2 batches - hoping for one batch per GPU

* update docker readme (#262)

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* tagged all TODOs with issues on github (and created issues) (#278)

* created correctness branch, trimmed experiments to Dutch F3 only

* trivial change to re-trigger build

* dummy PR to re-trigger malfunctioning builds

* resolved merge conflict

* flagged all non-contrib TODO with github issues

* resolved rebase conflict

* resolved merge conflict

* cleaned up archaic voxel code

* Refactoring train.py, removing OpenCV, adding training results to Tensorboard, bug fixes (#264)

I think moving forward, we'll use smaller PRs. But here are the changes in this one:

Fixes issue #236 that involves rewriting a big portion of train.py such that:

    All the tensorboard event handlers are organized in tensorboard_handlers.py and only called in train.py to log training and validation results in Tensorboard
    The code logs the same results for training and validation, and also adds the class IoU score.
    All single-use functions (e.g. _select_max, _tensor_to_numpy, _select_pred_and_mask) are lambda functions now
    The code is organized into more meaningful "chunks", e.g. all the optimizer-related code is kept together where possible, and the same goes for logging, configuration, loaders, Tensorboard, etc.

In addition:

    Fixed a visualization bug where the seismic images were not normalized correctly. This solves Issue #217.
    Fixed a visualization bug where the predictions were not masked where the input image was padded. This improves the ability to visually inspect and evaluate the results. This solves Issue #230.
    Fixes a potential issue where Tensorboard can crash when a large training batch size is used. The number of images visualized in Tensorboard from every batch now has an upper limit.
    Completely removed OpenCV as a dependency from the DeepSeismic Repo. It was only used in a small part of the code where it wasn't really necessary, and OpenCV is a huge library.
    Fixes Issue #218 where the epoch number for the images in Tensorboard was always logged as 1 (therefore not allowing us to see the epoch number of the different results in Tensorboard).
    Removes the HorovodLRScheduler class since it is no longer used
    Removes toolz.take from Debug mode, and uses PyTorch's native Subset() dataset class
    Changes default patch size for the HRNet model to 256
    Several other minor changes


Co-authored-by: Yazeed Alaudah <yalaudah@users.noreply.github.com>
Co-authored-by: Ubuntu <yazeed@yaalauda-dsvm-nd24.jsxrnelwp15e1jpgk5vvfmbzyb.bx.internal.cloudapp.net>
Co-authored-by: Max Kaznady <maxkaz@microsoft.com>

* Fixes training/validation overlap #143, #233, #253, and #259 (#282)

* Correctness single GPU switch (#290)

* resolved rebase conflict

* resolved merge conflict

* resolved rebase conflict

* resolved merge conflict

* reverted multi-GPU builds to run on single GPU

* 249r3 (#283)

* resolved rebase conflict

* resolved merge conflict

* resolved rebase conflict

* resolved merge conflict

* wrote the bulk of checkerboard example

* finished checkerboard generator

* resolved merge conflict

* resolved rebase conflict

* got binary dataset to run

* finished first implementation mockup - commit before rebase

* made sure rebase went well manually

* added new files

* resolved PR comments and made tests work

* fixed build error

* fixed build VM errors

* more fixes to get the test to pass

* fixed n_classes issue in data.py

* fixed notebook as well

* cleared notebook run cell

* trivial commit to restart builds

* addressed PR comments

* moved notebook tests to main build pipeline

* fixed checkerboard label precision

* relaxed performance tests for now

* resolved merge conflict

* resolved merge conflict

* fixed build error

* resolved merge conflicts

* fixed another merge mistake

* enabling development on docker (#291)

* 289: correctness metrics and tighter tests (#293)

* resolved rebase conflict

* resolved merge conflict

* resolved rebase conflict

* resolved merge conflict

* wrote the bulk of checkerboard example

* finished checkerboard generator

* resolved merge conflict

* resolved rebase conflict

* got binary dataset to run

* finished first implementation mockup - commit before rebase

* made sure rebase went well manually

* added new files

* resolved PR comments and made tests work

* fixed build error

* fixed build VM errors

* more fixes to get the test to pass

* fixed n_classes issue in data.py

* fixed notebook as well

* cleared notebook run cell

* trivial commit to restart builds

* addressed PR comments

* moved notebook tests to main build pipeline

* fixed checkerboard label precision

* relaxed performance tests for now

* resolved merge conflict

* resolved merge conflict

* fixed build error

* resolved merge conflicts

* fixed another merge mistake

* resolved rebase conflict

* resolved rebase 2

* resolved merge conflict

* resolved merge conflict

* adding new logging

* added better logging - cleaner - debugged metrics on checkerboard dataset

* resolved rebase conflict

* resolved merge conflict

* resolved merge conflict

* resolved merge conflict

* resolved rebase 2

* resolved merge conflict

* updated notebook with the changes

* addressed PR comments

* addressed another PR comment

* uniform colormap and correctness tests (#295)

* correctness code good for PR review

* addressed PR comments

* added data dumps to the code

* all dumps work properly now

* fixed build error, added binary dataset

* done - now need to test

* finished dev build script

* updates to tests to run on local machine as well as on the build

* updated gradient direction in gen_checkerboard

* increased Dutch F3 timeout

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Yazeed Alaudah <yalaudah@users.noreply.github.com>
Co-authored-by: Ubuntu <yazeed@yaalauda-dsvm-nd24.jsxrnelwp15e1jpgk5vvfmbzyb.bx.internal.cloudapp.net>
@maxkazmsft maxkazmsft changed the title from "prepare_dutchf3.py vertical sample locations drops patches at the bottom of the volume, leads to worse results in that region" to "prepare_dutchf3.py vertical sample locations drops patches at the bottom of the volume - leads to worse results in that region" May 26, 2020
@maxkazmsft maxkazmsft added Prior: High Type: Accuracy related to increasing performance accuracy and removed Prior: ShowStopper Blocks a release Type: Correctness anything to do with repo being technically correct Type: Enhancement This an enhancement to an existing feature labels May 28, 2020
@yalaudah yalaudah closed this as completed Jul 1, 2020
maxkazmsft added a commit that referenced this issue Jul 7, 2020
* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mount command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument consistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

* PR to fix #342 (#347)

* intermediate work for normalization

* 1) normalize function runs based on global MIN and MAX 2) has error handling for division by zero, np.finfo 3) decode_segmap normalizes the label/mask based on n_classes

* global normalization added to test.py

* increasing the threshold on timeout

* trigger

* revert

* idk what happened

* increase timeout

* picking up global min and max

* passing config to TrainPatchLoader to facilitate access to global min and max and other attr in low level functions, WIP

* removed print statement

* changed section loaders

* updated test for min and max from config too

* added MIN and MAX to config

* notebook modified for loaders

* another dataloader in notebook

* readme update

* changed the default values for min max, updated the docstring for loaders, removed suppressed lines

* debug

* merging work from CSE team into main staging branch (#357)

* Adding content to interpretation README (#171)

* added sharat, weehyong to authors

* adding a download script for Dutch F3 dataset

* Adding script instructions for dutch f3

* Update README.md

prepare scripts expect root level directory for dutch f3 dataset. (it is downloaded into $dir/data by the script)

* Adding readme text for the notebooks and checking if config is correctly setup

* fixing prepare script example

* Adding more content to interpretation README

* Update README.md

* Update HRNet_Penobscot_demo_notebook.ipynb

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* Updates to prepare dutchf3 (#185)

* updating patch to patch_size when we are using it as an integer

* modifying the range function in the prepare_dutchf3 script to get all of our data

* updating path to logging.config so the script can locate it

* manually reverting back log path to troubleshoot build tests

* updating patch to patch_size for testing on preprocessing scripts

* updating patch to patch_size where applicable in ablation.sh

* reverting back changes on ablation.sh to validate build pass

* update patch to patch_size in ablation.sh (#191)

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>

* TestLoader's support for custom paths (#196)

* Add testloader support for custom paths.

* Add test

* added file name workaround for Train*Loader classes

* adding comments and clean up

* Remove legacy code.

* Remove parameters that dont exist in init() from documentation.

* Add unit tests for data loaders in dutchf3

* moved unit tests

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* select contiguous data splits for val and train (#200)

* select contiguous data splits for test and train

* changed data-dir to data_dir as arg to prepare_dutchf3.py

* update script with new required parameter label_file

* ignoring split_alaudah_et_al_19 as it is not updated

* changed TEST to VALIDATION for clarity in the code

* included job to run scripts unit test

* Fix val/train split and add tests

* adjust to consider the whole horz_lines

* update environment - gitpython version

* Segy Converter Utility (#199)

* Add convert_segy utility script and related notebooks

* add segy files to .gitignore

* readability update

* Create methods for normalizing and clipping separately.

* Add comment

* update file paths

* cleanup tests and terminology for the normalization/clipping code

* update notes to provide more context for using the script

* Add tests for clipping.

* Update comments

* added Microsoft copyright

* Update root README

* Add a flag to turn on clipping in dataprep script.

* Remove hard coded values and fix _filder_data method.

* Fix some minor issues pointed out on comments.

* Remove unused lib.

* Rename notebooks to impose order; set env; move all def functions into utils; improve comments in notebooks; and include code example to run prepare_dutchf3.py

* Label missing data with 255.

* Remove cell with --help command.

* Add notebooks to test pipeline.

* grammar edits

* update notebook output and utils naming

* fix output dir error and cleanup notebook

* fix yaml indent error in notebooks_build.yml

* fix merge issues and job name errors

* debugging the build pipeline

* combine notebook tests for segy converter since they are dependent on each other

Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>

* Azureml train pipeline (#195)

* initial add of azure ml pipeline

* update references and dependencies

* fix integration tests

* remove incomplete tests

* add azureml requirements.txt for dutchf3 local patch and update pipeline config

* add empty __init__.py to cv_lib dutchf3

* Get train.py to run in pipeline

* allow output dir in train.py

* Clean up README and __init__

* only pass output if available and use input dir for output in train.py

* update comment in train.py

* updating azureml_requirements to only pull from /master

* removing windows guidance in azureml_pipelines/README.md

* adding .env.example

* adding azureml config example

* updating documentation in azureml_pipelines README.md

* updating main README.md to refer to AML guidance documentation

* updating AML README.md to include additional guidance to cancel runs

* adding documentation on AzureML pipelines in the AML README.md

* adding files needed section for AML training run

* including hyperlink in format pointing to additional detail on Azure Machine Learning pipelines in AML README.md

* removing the mention of VSCode in the AML README.md

* fixing typo

* modifying config to pipeline configuration in README.md

* fixing typo in README.md

* adding documentation on how to create a blob container and copy data onto it

* adding documentation on blob storage guidance

* adding guidance on how to get the subscription id

* adding guidance to activate environment and then run the kick off train pipeline from ROOT

* adding ability to pass in experiment name and different pipeline configuration to kickoff_train_pipeline.py

* adding Microsoft Corporation Copyright to kickoff_train_pipeline.py

* fixing format in README.md

* adding trouble shooting section in README.md for connection to subscription

* updating troubleshooting title

* adding guidance on how to download the config.json from the Azure Portal in the README.md

* adding additional guidance and information on AzureML compute targets and naming conventions

* changing the configuration file example to only include the train step that is currently supported

* updating config to pipeline configuration when applicable

* adding link to Microsoft docs for additional information on pipeline steps

* updated AML test build definitions

* updated AML test build definitions

* adding job to aml_build.yml

* updating example config for testing

* modifying the test_train_pipeline.py to have appropriate number of pipeline steps and other required modifications

* updating AML_pipeline_tests in aml_build.yml to consume environment variables

* updating scriptType, scriptLocation, and inlineScript in aml_build.yml

* trivial commit to re-trigger broken build pipelines

* fix to aml yml build to use env vars for secrets and everything else

* another yml fix

* another yml fix

* reverting structure format of jobs for aml_build pipeline tests

* updating path to test_train_pipeline.py

* aml_pipeline_tests timed out, extending timeoutInMinutes from 10 to 40

* adding additional pytest

* adding az login

* updating variables in aml pipeline tests

Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* moved contrib contributions around from CSE

* fixed dataloader tests - updated them to work with new code from staging branch

* segyconverter notebooks and tests run and pass; updated documentation

* added test job for segy converter notebooks

* removed AML training pipeline from this release

* fixed training model tolerance precision in the tests - wasn't working

* fixed train.py build issues after the merge

* addressed PR comments

* fixed bug in check_performance

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>

* make tests simpler (#368)

* removed Dutch F3 job from main_build

* fixed a bug in data subset in debug mode

* modified epoch numbers to pass the performance checks, checkedout check_performance from Max's branch

* modified get_data_for_builds.sh to set up checkerboard data for smaller size, minor improvements on gen_checkerboard

* send all the batches, disabled the performance checks for patch_deconvnet

* added comment to enable tests for patch_deconvnet after debugging, renamed gen_checkerboard, added options to new arg per Max's suggestion

* Replace HRNet with SEResNet model in the notebook (#362)

* replaced HRNet with SEResNet model in the notebook

* removed debugging cell info

* fixed bug where resnet_unet model wasn't loading the pre-trained version in the notebook

* fixed build VM problems

* Multi-GPU training support (#359)

* Data flow tests (#375)

* renamed checkerboard job name

* restructured default outputs from test.py to be dumped under output dir and not debug dir

* test.py output re-org

* removed outdated variable from check_performance.py

* intermediate work

* intermediate work

* bunch of intermediate works

* changing args for different trainings

* final to run dev_build

* remove print statements

* removed print statement

* removed suppressed lines

* added assertion error msg

* added assertion error msg, one intentional bug to test

* testing a stupid bug

* debug

* omg

* final

* trigger build

* fixed multi-GPU termination in train.py (#379)

* PR to fix #371 and #372  (#380)

* added learning rate to logs

* changed epoch for patch_deconvnet, and enabled the tests

* removed TODOs

* changed tensorflow pinned version (#387)

* changed tensorflow pinned version

* trigger build

* closes 385 (#389)

* Fixing #259 by adding symmetric padding along depth direction  (#386)

* BYOD Penobscot (#390)

* minor updates to files

* added penobscot conversion code

* docker build test (#388)

* added a new job to test building the docker, for now it is daisy-chained to the end

* this is just a TEST

* test

* test

* remove old image

* debug

* debug

* test

* debug

* enabled all the jobs

* quick fix

* removing non-tagged images

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* added missing license headers and fixed formatting (#391)

* added missing license headers and fixed formatting

* some more license headers

* updated documentation to close 354 and 381 (#392)

* fix test.py and notebook issues (#394)

* resolved conflicts for 0.2 release (#396)

* V00.01.00003 release (#356)

* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mount command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument consistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>

* typos

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
maxkazmsft added a commit that referenced this issue Jul 22, 2020
* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mount command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument consistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

* PR to fix #342 (#347)

* intermediate work for normalization

* 1) normalize function runs based on global MIN and MAX 2) has error handling for division by zero, np.finfo 3) decode_segmap normalizes the label/mask based on n_classes

* global normalization added to test.py

* increasing the threshold on timeout

* trigger

* revert

* idk what happened

* increase timeout

* picking up global min and max

* passing config to TrainPatchLoader to facilitate access to global min and max and other attr in low level functions, WIP

* removed print statement

* changed section loaders

* updated test for min and max from config too

* added MIN and MAX to config

* notebook modified for loaders

* another dataloader in notebook

* readme update

* changed the default values for min max, updated the docstring for loaders, removed suppressed lines

* debug

* merging work from CSE team into main staging branch (#357)

* Adding content to interpretation README (#171)

* added sharat, weehyong to authors

* adding a download script for Dutch F3 dataset

* Adding script instructions for dutch f3

* Update README.md

prepare scripts expect root level directory for dutch f3 dataset. (it is downloaded into $dir/data by the script)

* Adding readme text for the notebooks and checking if config is correctly setup

* fixing prepare script example

* Adding more content to interpretation README

* Update README.md

* Update HRNet_Penobscot_demo_notebook.ipynb

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* Updates to prepare dutchf3 (#185)

* updating patch to patch_size when we are using it as an integer

* modifying the range function in the prepare_dutchf3 script to get all of our data

* updating path to logging.config so the script can locate it

* manually reverting back log path to troubleshoot build tests

* updating patch to patch_size for testing on preprocessing scripts

* updating patch to patch_size where applicable in ablation.sh

* reverting back changes on ablation.sh to validate build pass

* update patch to patch_size in ablation.sh (#191)

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>

* TestLoader's support for custom paths (#196)

* Add testloader support for custom paths.

* Add test

* added file name workaround for Train*Loader classes

* adding comments and clean up

* Remove legacy code.

* Remove parameters that dont exist in init() from documentation.

* Add unit tests for data loaders in dutchf3

* moved unit tests

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* select contiguous data splits for val and train (#200)

* select contiguous data splits for test and train

* changed data-dir to data_dir as arg to prepare_dutchf3.py

* update script with new required parameter label_file

* ignoring split_alaudah_et_al_19 as it is not updated

* changed TEST to VALIDATION for clarity in the code

* included job to run scripts unit test

* Fix val/train split and add tests

* adjust to consider the whole horz_lines

* update environment - gitpython version

* Segy Converter Utility (#199)

* Add convert_segy utility script and related notebooks

* add segy files to .gitignore

* readability update

* Create methods for normalizing and clipping separately.

* Add comment

* update file paths

* cleanup tests and terminology for the normalization/clipping code

* update notes to provide more context for using the script

* Add tests for clipping.

* Update comments

* added Microsoft copyright

* Update root README

* Add a flag to turn on clipping in dataprep script.

* Remove hard coded values and fix _filder_data method.

* Fix some minor issues pointed out on comments.

* Remove unused lib.

* Rename notebooks to impose order; set env; move all def functions into utils; improve comments in notebooks; and include code example to run prepare_dutchf3.py

* Label missing data with 255.

* Remove cell with --help command.

* Add notebooks to test pipeline.

* grammar edits

* update notebook output and utils naming

* fix output dir error and cleanup notebook

* fix yaml indent error in notebooks_build.yml

* fix merge issues and job name errors

* debugging the build pipeline

* combine notebook tests for segy converter since they are dependent on each other

Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>

* Azureml train pipeline (#195)

* initial add of azure ml pipeline

* update references and dependencies

* fix integration tests

* remove incomplete tests

* add azureml requirements.txt for dutchf3 local patch and update pipeline config

* add empty __init__.py to cv_lib dutchf3

* Get train.py to run in pipeline

* allow output dir in train.py

* Clean up README and __init__

* only pass output if available and use input dir for output in train.py

* update comment in train.py

* updating azureml_requirements to only pull from /master

* removing windows guidance in azureml_pipelines/README.md

* adding .env.example

* adding azureml config example

* updating documentation in azureml_pipelines README.md

* updating main README.md to refer to AML guidance documentation

* updating AML README.md to include additional guidance to cancel runs

* adding documentation on AzureML pipelines in the AML README.md

* adding files needed section for AML training run

* including hyperlink in format pointing to additional detail on Azure Machine Learning pipelines in AML README.md

* removing the mention of VSCode in the AML README.md

* fixing typo

* modifying config to pipeline configuration in README.md

* fixing typo in README.md

* adding documentation on how to create a blob container and copy data onto it

* adding documentation on blob storage guidance

* adding guidance on how to get the subscription id

* adding guidance to activate environment and then run the kick off train pipeline from ROOT

* adding ability to pass in experiment name and different pipeline configuration to kickoff_train_pipeline.py

* adding Microsoft Corporation Copyright to kickoff_train_pipeline.py

* fixing format in README.md

* adding troubleshooting section in README.md for connection to subscription

* updating troubleshooting title

* adding guidance on how to download the config.json from the Azure Portal in the README.md

* adding additional guidance and information on AzureML compute targets and naming conventions

* changing the configuration file example to only include the train step that is currently supported

* updating config to pipeline configuration when applicable

* adding link to Microsoft docs for additional information on pipeline steps

* updated AML test build definitions

* updated AML test build definitions

* adding job to aml_build.yml

* updating example config for testing

* modifying the test_train_pipeline.py to have appropriate number of pipeline steps and other required modifications

* updating AML_pipeline_tests in aml_build.yml to consume environment variables

* updating scriptType, scriptLocation, and inlineScript in aml_build.yml

* trivial commit to re-trigger broken build pipelines

* fix to aml yml build to use env vars for secrets and everything else

* another yml fix

* another yml fix

* reverting structure format of jobs for aml_build pipeline tests

* updating path to test_train_pipeline.py

* aml_pipeline_tests timed out, extending timeoutInMinutes from 10 to 40

* adding additional pytest

* adding az login

* updating variables in aml pipeline tests

Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* moved contrib contributions around from CSE

* fixed dataloader tests - updated them to work with new code from staging branch

* segyconverter notebooks and tests run and pass; updated documentation

* added test job for segy converter notebooks

* removed AML training pipeline from this release

* fixed training model tolerance precision in the tests - wasn't working

* fixed train.py build issues after the merge

* addressed PR comments

* fixed bug in check_performance

Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>

* make tests simpler (#368)

* removed Dutch F3 job from main_build

* fixed a bug in data subset in debug mode

* modified epoch numbers to pass the performance checks, checked out check_performance from Max's branch

* modified get_data_for_builds.sh to set up checkerboard data for smaller size, minor improvements on gen_checkerboard

* send all the batches, disabled the performance checks for patch_deconvnet

* added comment to enable tests for patch_deconvnet after debugging, renamed gen_checkerboard, added options to new arg per Max's suggestion

* Replace HRNet with SEResNet model in the notebook (#362)

* replaced HRNet with SEResNet model in the notebook

* removed debugging cell info

* fixed bug where resnet_unet model wasn't loading the pre-trained version in the notebook

* fixed build VM problems

* Multi-GPU training support (#359)

* Data flow tests (#375)

* renamed checkerboard job name

* restructured default outputs from test.py to be dumped under output dir and not debug dir

* test.py output re-org

* removed outdated variable from check_performance.py

* intermediate work

* intermediate work

* bunch of intermediate works

* changing args for different trainings

* final to run dev_build

* remove print statements

* removed print statement

* removed suppressed lines

* added assertion error msg

* added assertion error msg, one intentional bug to test

* testing a stupid bug

* debug

* omg

* final

* trigger build

* fixed multi-GPU termination in train.py (#379)

* PR to fix #371 and #372  (#380)

* added learning rate to logs

* changed epoch for patch_deconvnet, and enabled the tests

* removed TODOs

* changed tensorflow pinned version (#387)

* changed tensorflow pinned version

* trigger build

* closes 385 (#389)

* Fixing #259 by adding symmetric padding along depth direction  (#386)

* BYOD Penobscot (#390)

* minor updates to files

* added penobscot conversion code

* docker build test (#388)

* added a new job to test building the docker, for now it is daisy-chained to the end

* this is just a TEST

* test

* test

* remove old image

* debug

* debug

* test

* debug

* enabled all the jobs

* quick fix

* removing non-tagged images

Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* added missing license headers and fixed formatting (#391)

* added missing license headers and fixed formatting

* some more license headers

* updated documentation to close 354 and 381 (#392)

* fix test.py and notebook issues (#394)

* resolved conflicts for 0.2 release (#396)

* V00.01.00003 release (#356)

* cleaning up files which are no longer needed

* fixes after removing forking workflow (#322)

* PR to resolve merge issues

* updated main build as well

* added ability to read in git branch name directly

* manually updated the other files

* fixed number of classes for main build tests (#327)

* fixed number of classes for main build tests

* corrected DATASET.ROOT in builds

* added dev build script

* Fixes for development inside the docker container (#335)

* Fix the mount command for the HRNet pretrained model in the docker readme

* Properly catch InvalidGitRepository exception

* make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed

* Properly catch InvalidGitRepository exception in train.py

* Readme update (#337)

* README updates

* Removing user specific path from config

Authored-by: Fatemeh Zamanian <Fatemeh.Zamanian@microsoft.com>

* Fixing #324 and #325 (#338)

* update colormap to a non-discrete one -- fixes #324

* fix mask_to_disk to normalize by n_classes

* changes to test.py

* Updating data.py

* bug fix

* increased timeout time for main_build

* retrigger build

* retrigger the build

* increase timeout

* fixes 318 (#339)

* finished 318

* increased checkerboard test timeout

* fix 333 (#340)

* added label correction to train gradient

* changing the gradient data generator to take inline/crossline argument consistent with the patchloader

* changing variable name to be more descriptive


Co-authored-by: maxkazmsft <maxkaz@microsoft.com>

* bug fix to model predictions (#345)

* replace hrnet with seresnet in experiments - provides stable default model (#343)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>

* typos

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>

* tensorboard notebook fix & loading of pre-trained models fix (#397)

Co-authored-by: Max Kaznady <max.kaznady@gmail.com>

* Docker README corrections and pretrained model checking (#398)

* added better instructions to Docker readme; removed HRNet references

* added checking of pre-trained models on startup

* Update docker/README.md

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>

* added more README changes and a video link with overview

* readme tweaks

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>

* finalized performance metrics (#399)

Co-authored-by: yalaudah <yazeed.alaudah@microsoft.com>
Co-authored-by: Fatemeh <fazamani@microsoft.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@microsoft.com>
Co-authored-by: kirasoderstrom <kirasoderstrom@gmail.com>
Co-authored-by: Sharat Chikkerur <sharat.chikkerur@gmail.com>
Co-authored-by: Geisa Faustino <32823639+GeisaFaustino@users.noreply.github.com>
Co-authored-by: Ricardo Squassina Lee <8495707+squassina@users.noreply.github.com>
Co-authored-by: Michael Zawacki <mikezawacki@hotmail.com>
Co-authored-by: Anna Zietlow <annamzietlow@gmail.com>
Co-authored-by: Max Kaznady <max.kaznady@gmail.com>