Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResolvedContext.get_environ() is missing key / values from parent_environ #1316

Closed
ColinKennedy opened this issue Jun 7, 2022 · 11 comments
Closed
Labels

Comments

@ColinKennedy
Copy link

ColinKennedy commented Jun 7, 2022

ResolvedContext.get_environ() doesn't return parent information, even if a variable is explicitly set via context.get_environ(self, parent_environ={"FOO": "bar"}). This behavior differs from execute_command(..., parent_environ={"FOO": "bar"}) and execute_shell(parent_environ={"FOO": "bar"}), which both include FOO in the resulting resolve.

Environment

  • OS: CentOS 7
  • Rez version: 2.109.0
  • Rez python version: 3.7

To Reproduce
Make a Python file like this:

from rez import resolved_context

requests = []
context = resolved_context.ResolvedContext(requests)
environment = context.get_environ(parent_environ={"FOO": "bar})

Expected behavior
For FOO to be included in the environment variable, just like how doing export FOO=bar && rez-env -- echo '$FOO' prints bar.

Actual behavior
environment does not include FOO unless one of the package requests in requests modifies it.

At the moment I'm getting around the problem by doing this

environment = {"FOO": "bar"}
environment.update(context.get_environ(parent_environ=environment))

Rez should be managing the environment variables, so I find this hack to be a bit fragile. I'm not sure if it is "guaranteed" to work and would rather Rez handle it for me.

@instinct-vfx
Copy link
Contributor

I am not sure i understand the environment and what exactly does not work. How are DISPLAY and FOO set in the parent env?

@ColinKennedy
Copy link
Author

ColinKennedy commented Jun 7, 2022

Ah sorry, bad copy-paste. The real situation this occurred was when I was trying to figure out why the DISPLAY variable wasn't getting passed through and it was causing issues when trying to make detached Rez terminal resolves. In the process of troubleshooting the problem I noticed that get_environ was still not returning DISPLAY (FOO), even though I had fixed it elsewhere.

I've updated the issue to remove all references to DISPLAY.

FOO is just any variable defined somewhere in the ~/.profile. So it exists prior to calling rez-env.

@instinct-vfx
Copy link
Contributor

I am not sure how this is handled on Linux platforms. On Windows rez takes environment variables from the registry and only loads system level environment variables, not the ones defined in the scope of the current shell. The reason is to prevent environment variables from a rez created shell into a nested rez created shell. An example would be resolving a python 2.7 shell and then a python 3 shell inside. That would mix up the PYTHONPATHS.
If i see this correctly in the code then on Linux it runs with --norc to yield a similar result. Hence your profile is never executed for the child shell (at least not in the context of environment variable inheritance).
I think the reason why export FOO=bar && rez-env -- echo '$FOO' seems to work is that $FOO is expanded in the parent shell already so the command that is getting executed in the child shell is echo bar rather than echo $FOO. I think it will also break if you escape the dollar so that it gets passed into the child shell.

@ColinKennedy
Copy link
Author

I think the reason why export FOO=bar && rez-env -- echo '$FOO' seems to work is that $FOO is expanded in the parent shell already so the command that is getting executed in the child shell is echo bar rather than echo $FOO

If you rez-env without --, as in run rez-env and then manually type echo $FOO you'll absolutely get bar just the same. Though I used strong quotes (single quotes) to prevent variable expansion so I think that should work in either case.

I'll triple check it later when I have a Linux machine again but I believe also that execute_command(..., parent_environ={"FOO": "bar"}) and execute_shell(parent_environ={"FOO": "bar"}) will both define FOO. It's only get_environ(parent_environ={"FOO": "bar"}) that seems to be different.

@ColinKennedy
Copy link
Author

ColinKennedy commented Jun 7, 2022

On Windows rez takes environment variables from the registry and only loads system level environment variables, not the ones defined in the scope of the current shell.

I wasn't aware of this, that's interesting info. I didn't realize there was that difference

An example would be resolving a python 2.7 shell and then a python 3 shell inside. That would mix up the PYTHONPATHS.

Just want to clarify - In Linux and in my experience, the scenario you mentioned would not mix PYTHONPATHs because the second rez-env's packages would first override PYTHONPATH on first reference and then append to PYTHONPATH for each subsequent call. The scenarios where a mixture could occur are:

  • all_parent_variables is True
  • PYTHONPATH is included in parent_variables
  • env.PYTHONPATH.append(os.environ["PYTHONPATH"]) is defined in a package commands() / pre_commands() or something.

I believe Windows would work the same way. However in Linux there's a 4th scenario where variables which a package in the first resolve defines and then the second resolve does not will carry through. e.g. if the first rez-env defines FOO somewhere but the second rez-env doesn't define FOO, FOO is still in the nested rez-env

@instinct-vfx
Copy link
Contributor

Yesterday was a bank holiday, so today is monday and it seems i am not 100% up to cruise speed yet and somehow skipped beyond the use of parent_variable. Time to sit in a corner and think again

@nerdvegas
Copy link
Contributor

This is by design - get_environ() is returning only those env-vars set by rez. The env-vars that you provide in parent_environ potentially influence the env-vars that rez sets, but that's all, they don't need to be in the result. The result intentionally tells you what env-vars were set by the context - if it included more than that, then you wouldn't know which is which.

For example, consider this package_commands, where USER impacts the resulting environ:

def commands():
    if env.USER == "jbloggs":
        env.PATH.append("{root}/joes_tools")

Here USER affects the resolve, but the resolve doesn't set USER, so it's not returned by get_environ().

If the env-vars that potentially impact the environ are important to you, then manually combining them as per your second example is the correct approach. In a normal rez-env (and following our example here), nothing is amiss - USER affected the resolved environ, and USER is still set within that environ (because rez doesn't alter env-vars unless its packages do, besides various REZ_* vars of course).

"""However in Linux there's a 4th scenario where variables which a package in the first resolve defines and then the second resolve does not will carry through. e.g. if the first rez-env defines FOO somewhere but the second rez-env doesn't define FOO, FOO is still in the nested rez-env"""

Just wanted to mention about this also, as it comes up now and then. This, again, is by design. Rez does not treat the parent environ as a special case just because it also is a rez-resolved env. The behaviour is the same - the second resolve will leave FOO unchanged - regardless of whether a parent rez session set it, or a user set it - unless a package in the resolve changes it specifically.

Hth, closing.

@nerdvegas
Copy link
Contributor

To add one more kinda related point. Having said what I did above, it's also true that we would benefit from having greater control over how env-vars are managed, down to the per-var level ideally. For example, a problem we have at the moment is that PATH doesn't really behave ideally during execution of the context - the system paths are only appended at the end (if you attempted to do an ls in a package's commands for eg, it would probably fail). It would be good to be able to say, "append to PATH, but keep the pre-rez value of PATH always appended at the end." In other words, if 3 packages appended A, B and C to PATH, and if PATH (before any rez-env) was X:Y, then, during the resolve, PATH would be set first to A:X:Y, then A:B:X:Y, and finally A:B:C:X:Y.

Were we to have this level of control, perhaps we could think about giving more control over what to do with extra vars set by rez packages that might 'pollute' a child rez shell (and LD_PRELOAD comes to mind). Honestly though I'm still on the fence about it - who's to say that LD_PRELOAD, whilst set by the parent rez shell, wasn't also set explicitly and intentionally by the user afterward, who is now going to be thoroughly confused when it disappears from their next rez-env?

See the existing ticket on this topic: #737

@ColinKennedy
Copy link
Author

If the env-vars that potentially impact the environ are important to you, then manually combining them as per your second example is the correct approach.

The important thing in this case is to know the environment variables that would exist, via API, as if the user had called rez-env (for the first time) from the terminal. FWIW in my previous reply, I wasn't implying that any of the behavior I'd mentioned were bugs, as you say they're by design. And on several occasions that environment variable passing has been useful (even for nested rez-env calls) so I'd definitely like to that feature to stay.

So just to confirm, the conclusion on this issue is "get_environ() doesn't show parent environment variables because if it did then you wouldn't know which environment variable came from a resolve or which was from the parent?" And the work-around is to

environment = {"FOO": "bar"}
environment.update(context.get_environ(parent_environ=environment))

And I can be sure that this will always be consistent? If the answer is yes then I guess I'll go forward with that. But if there's any doubt that the above may not work, I'd suggest we add a new parameter to get_environ(), maybe include_parent=True or something so Rez can manage and return in the case where I, as the API caller, don't care which variables were set via context as long as the result is consistent with what I'd see from a rez-env.

@nerdvegas
Copy link
Contributor

nerdvegas commented Jun 7, 2022 via email

@ColinKennedy
Copy link
Author

ColinKennedy commented Jun 7, 2022

If you wanted a 100% reliable version of this function (ie exactly what a shell would give), that would be something different

Maybe in the future but FWIW my case for get_environ() is for a debug tool, nothing fancy or critical. Accuracy is ideal but if we can get 99% of the way there and only source() or some other less common features aren't supported, that'd be okay, I think. I'll just add a disclaimer that the view may not show 100% everything. Thank you for the in-depth explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants