Avoid parsing whole config prematurely #504
This PR avoids parsing the whole config prematurely by initially parsing only the `artifact_path`, and parsing the whole config only if it is not loaded from the config history. This avoids irrelevant parsing errors for fields that would be taken from the config history anyway, and it allows for defining required fields in the config.
class CommonFieldsConfig(ProjectConfig, extra=Extra.ignore):
    """Fields that can be modified without affecting caching."""

class ArtifactsConfig(AzimuthBaseSettings, extra=Extra.ignore):
I'm not sure about this renaming: `batch_size`, `use_cuda`, `large_dask_cluster` don't really refer to artifacts, no? Or perhaps I get the artifacts definition wrong? I agree though that `CommonFields` is not great either. What about `GenericConfig`, or `GeneralSettingsConfig`? We should also rename it in the documentation.
My new class `ArtifactsConfig` only contains `artifact_path`. The other fields (`batch_size`, `use_cuda`, `large_dask_cluster` and `read_only_config`) remain in `CommonFieldsConfig`. Does that clarify?
Yes, sorry!! I interpreted the diff wrong, looks great then.
return cfg
return cls.parse_file(config_path) if config_path else cls()
I don't understand why this change "allows for defining required fields in the config". Since `config_history` will always be empty at one point, won't we face the same issue?
On the very first run, you will need to specify a config with the required fields (via `CFG_PATH` or other env vars), but if you kill Azimuth, you will now be able to restart it with only `LOAD_CONFIG_HISTORY=1`, since I now get the `artifact_path` from `ArtifactsConfig()` instead of `AzimuthConfig()`. Does that make sense?
Yes, ok. Thanks for clarifying!!
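For readers skimming this thread, the behaviour discussed above can be sketched roughly as follows. This is a simplified, stdlib-only illustration and not the code from this PR: it uses dataclasses instead of pydantic, and the helpers `parse`, `history_file`, `save` and `load`, the history file name, and the `name` field are all hypothetical stand-ins. Only `artifact_path`, `batch_size`, `ArtifactsConfig` and `AzimuthConfig` are names taken from the discussion.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ArtifactsConfig:
    """Only the field needed to locate the config history."""
    artifact_path: str = "cache"  # default value is an assumption


@dataclass
class AzimuthConfig(ArtifactsConfig):
    """Stand-in for the full config; `name` plays the role of a required field."""
    name: Optional[str] = None
    batch_size: int = 32

    def __post_init__(self):
        if self.name is None:
            raise ValueError("name is required")


def parse(cls, raw):
    """Build `cls` from `raw`, ignoring unknown keys (mimics extra=Extra.ignore)."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in raw.items() if k in known})


def history_file(artifact_path):
    # Hypothetical file name for the config history.
    return os.path.join(artifact_path, "config_history.jsonl")


def save(cfg):
    """Append the config to the history file."""
    os.makedirs(cfg.artifact_path, exist_ok=True)
    with open(history_file(cfg.artifact_path), "a") as f:
        f.write(json.dumps(asdict(cfg)) + "\n")


def load(raw, load_config_history):
    # Step 1: parse only what is needed to find the history.
    artifacts = parse(ArtifactsConfig, raw)
    path = history_file(artifacts.artifact_path)
    if load_config_history and os.path.exists(path):
        with open(path) as f:
            lines = [line for line in f if line.strip()]
        if lines:
            # Early return: the rest of `raw` is never validated.
            return parse(AzimuthConfig, json.loads(lines[-1]))
    # Step 2: no usable history, so the whole config must be valid.
    return parse(AzimuthConfig, raw)


# First run: required fields must be provided.
tmp = tempfile.mkdtemp()
first = load({"artifact_path": tmp, "name": "demo", "batch_size": 8}, True)
save(first)

# Restart: only the artifact path is known, the rest comes from the history.
restarted = load({"artifact_path": tmp}, True)
```

In this sketch the restart succeeds even though `name` is missing from the raw input, because the full config is only validated when nothing can be taken from the history. Without the history flag, the same input raises a `ValueError`.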
@@ -440,69 +440,64 @@ def dynamic_language_config_values(cls, values):
        similarity.faiss_encoder = similarity.faiss_encoder or defaults.faiss_encoder
        return values

    @classmethod
Yes, I'm loving these changes!! Awesome, so much cleaner!! I was wondering on my current branch why some of these lines were not methods actually.
Approved!!
Description:
I recommend reviewing the two commits separately:
1. Split `load_azimuth_config` into two single-responsibility methods. This is a trivial refactor that makes the actual changes (in the second commit) cleaner (allowing for an early `return` in `AzimuthConfig.load()`).
2. Avoid parsing the whole config prematurely by initially only parsing the `artifact_path`, and parsing the whole config only if it is not loaded from the config history. This avoids irrelevant parsing errors for fields that would be taken from the config history anyway, and it allows for defining required fields in the config.

Checklist:
You should check all boxes before the PR is ready. If a box does not apply, check it to acknowledge it.
- … ran `pre-commit run --all-files` at the end.
- … our users.
- … `README` files and our wiki for any big design decisions, if relevant.