Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid parsing whole config prematurely #504

Merged
merged 2 commits into from
Mar 21, 2023

Conversation

JosephMarinier
Copy link
Contributor

@JosephMarinier JosephMarinier commented Mar 21, 2023

Description:

I recommend reviewing the two commits separately:

  • Split load_azimuth_config into two single-responsibility methods. This is a trivial refactor that makes the actual changes (in the second commit) cleaner (allowing for an early return in AzimuthConfig.load()).
  • Avoid parsing the whole config prematurely by initially only parsing the artifact_path, and parsing the whole config only if it is not loaded from the config history. This avoids irrelevant parsing errors for fields that would be taken from the config history anyway, and it allows for defining required fields in the config.

Checklist:

You should check all boxes before the PR is ready. If a box does not apply, check it to acknowledge it.

  • ISSUE NUMBER. You linked the issue number (Ex: Resolve #XXX).
  • PRE-COMMIT. You ran pre-commit on all commits, or else, you
    ran pre-commit run --all-files at the end.
  • USER CHANGES. The changes are added to CHANGELOG.md and the documentation, if they impact
    our users.
  • DEV CHANGES.
    • Update the documentation if this PR changes how to develop/launch on the app.
    • Update the README files and our wiki for any big design decisions, if relevant.
    • Add unit tests, docstrings, typing and comments for complex sections.

by initially only parsing the `artifact_path`, and parsing the whole config only if it is not loaded from the config history. This avoids irrelevant parsing errors for fields that would be taken from the config history anyway, and it allows for defining required fields in the config.
@JosephMarinier JosephMarinier self-assigned this Mar 21, 2023
class CommonFieldsConfig(ProjectConfig, extra=Extra.ignore):
"""Fields that can be modified without affecting caching."""

class ArtifactsConfig(AzimuthBaseSettings, extra=Extra.ignore):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this renaming - batch_size, use_cuda, large_dask_cluster don't really refer to artifacts, no? Or perhaps I get the artifacts definition wrong? I agree though that CommonFields is not great either. What about GenericConfig, or GeneralSettingsConfig? We should also rename it in the documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My new class ArtifactsConfig only contains artifact_path. The other fields (batch_size, use_cuda, large_dask_cluster and read_only_config ) remain in CommonFieldsConfig. Does that clarify?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry!! I interpreted the diff wrong, looks great then.


return cfg
return cls.parse_file(config_path) if config_path else cls()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this change "allows for defining required fields in the config". Since config_history will always be empty at one point, won't we face the same issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the very first run, you will need to specify a config with the required fields (via CFG_PATH or other env vars), but if you kill Azimuth, you will now be able to restart it with only LOAD_CONFIG_HISTORY=1, since I now get the artifact_path from ArtifactsConfig() instead of AzimuthConfig(). Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ok. Thanks for clarifying!! ☺️

@@ -440,69 +440,64 @@ def dynamic_language_config_values(cls, values):
similarity.faiss_encoder = similarity.faiss_encoder or defaults.faiss_encoder
return values

@classmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm loving these changes!! Awesome, so much cleaner!! I was wondering on my current branch why some of these lines were not methods actually.

Copy link
Contributor

@gabegma gabegma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved!!

@JosephMarinier JosephMarinier merged commit 2d987f7 into main Mar 21, 2023
@JosephMarinier JosephMarinier deleted the joseph/avoid-parsing-whole-config-prematurely branch March 21, 2023 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants