Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embedding config into context #1481

Merged
merged 21 commits into from
Sep 11, 2019
Merged

Embedding config into context #1481

merged 21 commits into from
Sep 11, 2019

Conversation

cicdw
Copy link
Member

@cicdw cicdw commented Sep 10, 2019

Thanks for contributing to Prefect!

Please describe your work and make sure your PR:

  • adds new tests (if appropriate)
  • updates CHANGELOG.md (if appropriate)
  • updates docstrings for any new functions or function arguments, including docs/outline.toml for API reference docs (if appropriate)

Note that your PR will not be reviewed unless all three boxes are checked.

What does this PR change?

This PR is subtle. Superficially, it converts most references of prefect.config to prefect.context.config via embedding the config object into context. In practice, however, it does something deeper: whatever Prefect-specific configuration options are set on the main Flow Runner process through environment variables will be respected on all machines that Tasks are submitted to run on. This is accomplished by embedding config into prefect.context, which is always kept up-to-date with the context present in the main Flow Runner process. Note that:

  • set_temporary_config still works for unit-testing because dictionaries are tracked by-reference in Python
  • this introduces some hidden "gotchas" when referencing config during execution that all developers should be aware of (this isn't something users will encounter, but developers might)

Why is this PR important?

Many Cloud users ran into the following problem: executing Flows against external Dask clusters required all dask workers to have Prefect Cloud specific environment variables set. This is unsatisfactory, as Prefect should handle those execution details internally. This PR ensures that as long as all workers have an up-to-date version of Prefect, no Prefect-specific configurations need to be set on the workers.

Notes for reviewers

@joshmeek / @jlowin : I've tested this using local agents in various forms, and verified that my local setup failed without this PR. Note that, at worst, it preserves current behavior as the first config to populate context will persist. That being said, please review this carefully as it is a somewhat deep change.

cc: @bnaul @joeschmid

attribute access.

I dunno add more as_nested_dict calls

Explicitly create new context object yo

Force convert everything to a dotdict all the way down

All tests pass for the new config -> context embedding

Add test for cloud flow runner initializing a cloud task runner

Special case the config key when updating context
@cicdw cicdw changed the title Embedding config into context [WIP] Embedding config into context Sep 10, 2019
@codecov
Copy link

codecov bot commented Sep 10, 2019

Codecov Report

Merging #1481 into master will increase coverage by 0.05%.
The diff coverage is 100%.

@cicdw cicdw force-pushed the context-config branch 2 times, most recently from 924dcfe to 31f7c5f Compare September 10, 2019 07:19
@cicdw cicdw force-pushed the context-config branch 2 times, most recently from aa465cf to 6a31cae Compare September 10, 2019 07:43
@jlowin
Copy link
Member

jlowin commented Sep 10, 2019

because the logging hooks in Prefect are created / called prior to the actual byte-code payload being deserialized on the worker

Feels like we could explore configuring logging handlers in a context manager, possibly even config itself.

with prefect.utilities.logging.configure_logging():
    do_stuff()
do_stuff_with_regular_logging()

This would solve some of my other frictions from last week, so it's on my mind

Copy link
Member

@jlowin jlowin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a very important change and it's amazing how short it is (other than the prefect.config -> prefect.context.config updates)

I have a couple comments on how the new config is loaded into the existing one (which might break the set_temporary_config because they make a more explicit copy, not sure). Depending on what we want to do there (if anything) I'll give a second more complete review.

src/prefect/utilities/context.py Outdated Show resolved Hide resolved
src/prefect/utilities/logging.py Outdated Show resolved Hide resolved
src/prefect/utilities/context.py Show resolved Hide resolved
@joeschmid
Copy link

👏 Love this change! Much better experience for Prefect Cloud users running their own external Dask clusters.

@cicdw cicdw changed the title [WIP] Embedding config into context Embedding config into context Sep 10, 2019
Copy link
Member

@jlowin jlowin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor suggestions and a comment that doesn't require any action IMO.

Question: should we remove config from the top level of Prefect? Currently prefect.config is a thing, which IMO encourages using it. I'm sure we have documentation that references it as well. Perhaps it should live in prefect.configuration.config and the "first class" recommendation becomes prefect.context.config.

src/prefect/utilities/configuration.py Show resolved Hide resolved
new_context = dict(*args, **kwargs)
if "config" in new_context:
new_config = merge_dicts(self.get("config", {}), new_context["config"])
new_context["config"] = as_nested_dict(new_config, dct_class=Config)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a really subtle detail -- but if a value in the config were itself a dictionary, this would replace that dictionary with a Config. Now the config really shouldn't have any dictionaries in it, as its TOML defined and the only possible dictionaries are keys, not values, but it's just something to call out.

Copy link
Member Author

@cicdw cicdw Sep 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I thought about this some; because we don't hedge against it in our normal config, I wasn't concerned here. Ultimately I'd argue users should only use config by setting a finite list of environment variables which act as levers for affecting cloud functionality (or local flow-building functionality, i.e., default settings). For all other dynamic values, users can always provide context values directly when creating a flow run (whether locally or in Cloud).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree with you.

src/prefect/utilities/context.py Show resolved Hide resolved
Co-Authored-By: Jeremiah Lowin <153965+jlowin@users.noreply.github.com>
@cicdw
Copy link
Member Author

cicdw commented Sep 11, 2019

Whether to remove prefect.config from the top level is a good question; I don't think we should hide it / protect it, but I do think we should document that it is not intended to be interacted with directly and is always placed in context during execution. For all other dynamic data, context is what users should interact with.

@jlowin
Copy link
Member

jlowin commented Sep 11, 2019

👍

@cicdw cicdw merged commit 2ebf197 into master Sep 11, 2019
@cicdw cicdw deleted the context-config branch September 11, 2019 00:46
zanieb pushed a commit that referenced this pull request Apr 13, 2022
Add conditional statement to display docs button in FlowsPageFlowListEmptyState Part 2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants