Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2pt] PR: Updates for acquire, logging and cpu test #1054

Merged
merged 23 commits into from
Feb 2, 2024

Conversation

RobHanna-NOAA
Copy link
Contributor

@RobHanna-NOAA RobHanna-NOAA commented Dec 13, 2023

Recent testing exposed a bug with the acquire_and_preprocess_3dep_dems.py script. It lost the ability to be re-run and look for files that were unsuccessful earlier attempts and try them again. It may have been lost due to confusion of the word "retry". Now "retry" means restart the entire run. A new flag called "repair" has been added meaning fix what failed earlier. This is a key feature it is common for communication failures when calling USGS to download DEMs. And with some runs taking many hours, this feature becomes important.

Also used the opportunity to fix a couple of other minor issues:

  1. Reduce log output
  2. Add a test for ensuring the user does not submit job numbers (num of cpu requests) to exceed the system max cpus. This test exists in a number of places in the code but way later in the processing stack after alot of processing has been done. Now it is done at the start of the fim pipeline stack.
  3. remove arguments for "isaws" which is no longer in use and has not been for a while.
  4. quick upgrade to the tracker log that keeps track of duration of each unit being processed.

Closes Issue 1049, 1053 and 942.

Changes

  • data\usgs\
    • acquire_and_preprocess_3dep_dems.py: Re-add a feature which allowed for restarting and redo missing outputs or partial outputs. System now named as a "repair" system.
  • fim_pipeline.sh: remove the parallel --eta flag to reduce logging. It was not needed, also removed "isaws" flag.
  • fim_pre_processing.sh: Added validation tests for maximum CPU requests (job numbers)
  • fim_post_processing.sh: Added a permissions updated as output folders were being locked due to permissions.
  • fim_process_unit_wb.sh: Fixed a bug with output folders being locked due to permissions, but it was not recursive.
  • src
    • bash_functions.sh: Added function so the unit timing logs would also have a time in percentage so it can easily be used to calculate averages.
    • delineate_hydros_and_produce_HAND.sh: Removed some unnecessary logging. Changed a few gdal calls to be less verbose.
    • derive_level_paths.py: Changed verbose to false to reduce unnecessary logging.
    • run_by_branch.sh: Removed some unnecessary logging. Added a duration system so we know how long the branch took to process.
    • run_unit_by_wb.sh: Removed some unnecessary logging. Changed a few gdal calls to be less verbose.
    • split_flows.py: Removed progress bar which was unnecessary and was adding to logging.

Testing

  • acquire_and_preprocessing_3dep_dems.py was testing in a number of permutations against a small subset of Texas HUC8's, checking combinations of repair. For additional testing, small subsets of HUC boundary / extent files can be created or extracted from previous runs to be run against the tool. In some tests, I ran the acquire tool and let is complete, then delete some of the files, then ran it again. It asked about overwrite and it started redoing all files. I deleted a few files again, but this time added the repair flag and it filled in the missing DEMs. To test partial download, I aborted the program just as it started one of the downloads to get file that is under the 10 MB threshold and tried again.
  • For fim_pipeline.sh - Max CPU test addition was tested in various combinations of the -jh and -jb flag.
  • For Logging... a fim_pipeline.sh test was run against 05030104 before changes, then after. Each line in each log file output was reviewed to ensure no errors or surprise changes to outputs. Output files were also compared for size. Also tested two hucs to ensure the rollup time script logs/unit/total_duration_run_by_unit_all_HUCS.csv to ensure entries were correct and now included the new time percentage colum for seconds.

Issuer Checklist (For developer use)

You may update this checklist before and/or after creating the PR. If you're unsure about any of them, please ask, we're here to help! These items are what we are going to look for before merging your code.

  • Informative and human-readable title, using the format: [_pt] PR: <description>
  • Links are provided if this PR resolves an issue, or depends on another other PR
  • If submitting a PR to the dev branch (the default branch), you have a descriptive Feature Branch name using the format: dev-<description-of-change> (e.g. dev-revise-levee-masking)
  • Changes are limited to a single goal (no scope creep) - no. three small ones were done as one PR.
  • The feature branch you're submitting as a PR is up to date (merged) with the latest dev branch
  • pre-commit hooks were run locally
  • Any change in functionality is tested
  • n/a - Passes all unit tests locally (inside interactive Docker container, at /foss_fim/, run: pytest unit_tests/)
  • New functions are documented (with a description, list of inputs, and expected output)
  • n/a - Placeholder code is flagged / future todos are captured in comments
  • CHANGELOG updated with template version number, e.g. 4.x.x.x
  • Reviewers requested
  • Add yourself as an assignee in the PR as well as the FIM Technical Lead

Merge Checklist (For Technical Lead use only)

  • Update CHANGELOG with latest version number and merge date
  • Update the Citation.cff file to reflect the latest version number in the CHANGELOG
  • If applicable, update README with major alterations

@RobHanna-NOAA RobHanna-NOAA marked this pull request as ready for review December 14, 2023 23:31
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Show resolved Hide resolved
robgpita
robgpita previously approved these changes Jan 8, 2024
Copy link
Contributor

@robgpita robgpita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Added a usage statement to exit the run if jh & jb values are too high. Logging is significantly less verbose. Nice work on this one.

Copy link
Contributor

@mluck mluck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice added (re-)functionality. I assume the repair works but wasn't able to try it as several attempts at downloading a few HUC6s I didn't have any missing tiles to test it. The -j limit was tested and works.

One thing that I'm not sure was working correctly was the retry function. The usage/docs say that if retry is used then any existing DEMs will be skipped, but it seems that retry deletes the outputs folder and does a fresh download of everything, similar to overwrite. Maybe retry means to retry the download if the first attempt fails (I was unable to test this) in which case it may be working correctly but maybe just the documentation should be clarified.

data/usgs/acquire_and_preprocess_3dep_dems.py Outdated Show resolved Hide resolved
data/usgs/acquire_and_preprocess_3dep_dems.py Outdated Show resolved Hide resolved
data/usgs/acquire_and_preprocess_3dep_dems.py Outdated Show resolved Hide resolved
data/usgs/acquire_and_preprocess_3dep_dems.py Outdated Show resolved Hide resolved
data/usgs/acquire_and_preprocess_3dep_dems.py Outdated Show resolved Hide resolved
mluck
mluck previously approved these changes Jan 29, 2024
Copy link
Contributor

@mluck mluck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good -- also tested with an Alaska HUC8.

@RobHanna-NOAA RobHanna-NOAA changed the title [2pt] PR: fix for acquire, logging and cpu test [2pt] PR: Updates for acquire, logging and cpu test Feb 2, 2024
@CarsonPruitt-NOAA CarsonPruitt-NOAA merged commit a14e5e1 into dev Feb 2, 2024
1 check passed
@CarsonPruitt-NOAA CarsonPruitt-NOAA deleted the dev-acquire-3dep-bug branch February 2, 2024 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request FIM4
Projects
None yet
4 participants