Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds the ability to add images to an existing run #443

Merged
merged 145 commits into from
Jan 18, 2021
Merged

Adds the ability to add images to an existing run #443

merged 145 commits into from
Jan 18, 2021

Conversation

ajstewart
Copy link
Contributor

@ajstewart ajstewart commented Dec 17, 2020

This adds the feature of adding an image to an existing run.

  • Rebuilds the initial association data frames from the current parquet files available.
  • Performs pre-run checks to see if the job config has actually changed - will not run if any other parameters have changed.
  • Adding mode is default re-run mode, to force a complete re-run there is the option --complete-rerun when running.
  • Creates backups of parquets for completed runs so that the next adding run can use them.
  • Also creates a config_prev.py for future jobs.
  • Sources are updated using a SQL statement.
  • Ideal and new source stages are left the same (this enables images to be added that are anywhere in time, i.e. not limited to only adding images in the future.)
  • Forced extractions are only run for those measurements needed.
  • At the upload stage the old parquets are used once again to see which ones do not need to be uploaded.
  • UI changes to accommodate the new option.
  • Adds various tests to test the new feature.

I have tested it on a large scale run by adding epoch 12 to epochs 1 - 11 and it works well.

Also fixes issues #444 and #451 that were found while developing this feature.

Fixes #212.
Fixes #444.
Fixes #451.

@dliptai Fixes #461

Ella Wang and others added 30 commits December 9, 2020 14:16
- Run configuration checks.
- Constructing the sources_df ready for new association.
- Outputs need to be checked!
- Parallel currently broken.
Ensured measurement pairs are in date (a < b) order
Relations fix, parallel fix and ERR status fix
Fixed monitoring measurement pairs in add image mode
Copy link
Contributor

@srggrs srggrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajstewart minor things mostly about indentation

@ellawang44 I can see vast_pipeline/tests/test_regression/gen_config.py but why config files are still present? (e.g. vast_pipeline/tests/pipeline-runs/regression/parallel-deruiter/config.py)

vast_pipeline/pipeline/utils.py Show resolved Hide resolved
vast_pipeline/pipeline/main.py Outdated Show resolved Hide resolved
vast_pipeline/pipeline/main.py Outdated Show resolved Hide resolved
vast_pipeline/pipeline/main.py Outdated Show resolved Hide resolved
vast_pipeline/pipeline/main.py Outdated Show resolved Hide resolved
vast_pipeline/pipeline/main.py Outdated Show resolved Hide resolved
vast_pipeline/pipeline/utils.py Outdated Show resolved Hide resolved
vast_pipeline/pipeline/utils.py Show resolved Hide resolved
vast_pipeline/pipeline/utils.py Show resolved Hide resolved
Pipeline Backlog automation moved this from In progress to Review in progress Jan 14, 2021
Changes to template are back compatible. Only the tests give epoch_mode
@ellawang44
Copy link
Contributor

ellawang44 commented Jan 14, 2021

@srggrs Just finished changing over to generating all the config files on the fly in fd9bb23. The config files under vast_pipeline/tests/pipeline-runs/regression should all be gone now.

I had to modify the template slightly because sometimes the inputs are dictionaries but the template assumes lists. This change should not affect previously generated config files.

ajstewart and others added 3 commits January 15, 2021 14:21
Co-authored-by: Serg <34258464+srggrs@users.noreply.github.com>
Co-authored-by: Serg <34258464+srggrs@users.noreply.github.com>
@srggrs
Copy link
Contributor

srggrs commented Jan 15, 2021

@srggrs Just finished changing over to generating all the config files on the fly in fd9bb23. The config files under vast_pipeline/tests/pipeline-runs/regression should all be gone now.

I had to modify the template slightly because sometimes the inputs are dictionaries but the template assumes lists. This change should not affect previously generated config files.

Thanks! I think the test suit should be modified as well... or maybe not? Anyhow the tests fails.... due to missing folders/files.

@ellawang44
Copy link
Contributor

@srggrs Just finished changing over to generating all the config files on the fly in fd9bb23. The config files under vast_pipeline/tests/pipeline-runs/regression should all be gone now.
I had to modify the template slightly because sometimes the inputs are dictionaries but the template assumes lists. This change should not affect previously generated config files.

Thanks! I think the test suit should be modified as well... or maybe not? Anyhow the tests fails.... due to missing folders/files.

The test suite is working on local machines. Something weird is happening with github actions.

srggrs
srggrs previously approved these changes Jan 18, 2021
Copy link
Contributor

@srggrs srggrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready to go, one minor comment on readability of the imports in the regression tests, but not really important.

Thanks for putting this together!

Please use the squash and merge option when merging, as this is a massive PR

vast_pipeline/tests/test_regression/test_add_image.py Outdated Show resolved Hide resolved
Pipeline Backlog automation moved this from Review in progress to Reviewer approved Jan 18, 2021
Co-authored-by: Serg <34258464+srggrs@users.noreply.github.com>
@ellawang44 ellawang44 dismissed stale reviews from srggrs via 1ce0836 January 18, 2021 01:55
Pipeline Backlog automation moved this from Reviewer approved to Review in progress Jan 18, 2021
Pipeline Backlog automation moved this from Review in progress to Reviewer approved Jan 18, 2021
@ajstewart ajstewart requested a review from srggrs January 18, 2021 02:13
Copy link
Contributor

@srggrs srggrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ajstewart ajstewart merged commit 4ab6f12 into master Jan 18, 2021
Pipeline Backlog automation moved this from Reviewer approved to Done Jan 18, 2021
@ajstewart ajstewart deleted the iss212 branch January 18, 2021 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
4 participants