Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

:extra-env does not respect case insensitivy of Windows environment variables #79

Closed
lread opened this issue Aug 10, 2022 · 10 comments · Fixed by #80
Closed

:extra-env does not respect case insensitivy of Windows environment variables #79

lread opened this issue Aug 10, 2022 · 10 comments · Fixed by #80

Comments

@lread
Copy link
Contributor

lread commented Aug 10, 2022

Repro

On Windows 10, given a Path environment variable:
image

And a fiddle.clj (which I've plunked in a fresh clone of babashka/process):

(require '[babashka.process :as p])

(-> (p/process ["clojure" "-e" "(println (System/getenv \"PATH\"))"]
               {:extra-env {"PATH" (str "some-path" (System/getProperty "path.separator") (System/getenv "PATH"))}})
    :out
    slurp
    (subs 0 50)
    println)

From your shell, run:

clojure -M fiddle.clj

Expected Behaviour

As a Windows user, I would expect :extra-env to respect Windows environment variable name case insensitivity when overriding existing environment variables.

This means any case of path should override Path (including PATH).

If this works, I expect to see some-path printed as the first path returned by PATH.

Actual Behaviour

We do not see out some-path prefixed:

>clojure -M fiddle.clj
C:\Program Files\Parallels\Parallels Tools\Applica

Diagnosis

If I alter fiddle.clj to change PATH to Path for :extra-env:

(require '[babashka.process :as p])

(-> (p/process ["clojure" "-e" "(println (System/getenv \"PATH\"))"]
               {:extra-env {"Path" (str "some-path" (System/getProperty "path.separator") (System/getenv "PATH"))}})
    :out
    slurp
    (subs 0 50)
    println)

I will get the expected result:

>clojure -M fiddle.clj
some-path;C:\Program Files\Parallels\Parallels Too

(Notice I did not alter PATH for getenv calls).

On Windows, environment variables are case insensitive.

The JDK behaves accordingly for System/getenv on Windows, but ProcessBuilder.environment does not.

Any case variant of path passed to System/getenv returns value for Path.

user=> (= (System/getenv "Path") (System/getenv "PATH") (System/getenv "pAtH"))
true

I think ProcessBuilder.environment holds both Path and PATH. I'm not sure how one gets chosen over the other when the env is set for a Windows process.

Side Note: To keep things confusing, Windows does show Path for cmd.exe set, but PATH for env, but Path is the underlying environment variable name.

Option 1: Do nothing

If we take the stance that:

Then maybe we will do nothing.

Pros:

  • easy for us

Cons:

  • confusing for users who bump into this (it was for me).

Option 2: Document Peculiarity

Add a tip to docs about :extra-env on Windows stating that although Windows environment variable names are case insensitive, an environment variable will not be overridden unless the provided name matches the original case of the variable name.

Questions: Is the original name always Path on Windows? Can it be path or PATH?

Pros:

  • at least we've explained the quirk for users

Cons:

  • We might have to do more work to understand and document peculiarities than fix the issue.

Option 3: Tweak env merging for Windows only

If an :extra-env variable name is a case insensitive match for an existing variable name it is a match and the variable's value's entry should be replaced.

But what if someone specifies an :extra-env of {"PATH" "boo" "path" "foo"}?
Well, that's probably unlikely enough to ignore as undefined behaviour.

Pros:

  • cross-platform support, ex. PATH sets PATH on Linux/macOS but appropriately any case of Path on Windows.
  • :extra-env behaves like the Windows shells and therefore matches what Windows users would expect.

Concerns:

  • what if JDK folks agree this is a bug and finally fix this for Windows? Our fix would still be active but unnecessary for JDK versions with fix.

Cons:

  • can't think of one other than more code, more maintenance, more chance for bugs.

Proposal

I like option 3 the for pros it lists.
But, I'm sure I've not thought of everything, please do chime in with your thoughts.

Next steps

Happy to PR if you decide option 3 is the path (pun intended) as well.

@borkdude
Copy link
Contributor

@lread I'll have to think about this a little bit. What would your fix for 3 look like? Also pinging @bobisageek here since I consider him a Windows expert (relative to me at the very least).

@lread
Copy link
Contributor Author

lread commented Aug 10, 2022

Not sure exactly @borkdude, but I was thinking it would be pretty straightforward alternate handling for Windows around

(defn- add-env
"Adds environment for a ProcessBuilder instance.
Returns instance to participate in the thread-first macro."
^java.lang.ProcessBuilder [^java.lang.ProcessBuilder pb env]
(doto (.environment pb)
(.putAll (as-string-map env)))
pb)

Could do a PR to explore what it could look like.
Might wait to hear what Bob has to say first.

@borkdude
Copy link
Contributor

borkdude commented Aug 10, 2022

@lread So e.g. if :extra-path has "pAth" and environment already has got patH, how would you merge this? Detect that patH exists and then change only that key?

@lread
Copy link
Contributor Author

lread commented Aug 10, 2022

Yeah, but to be super clear: change only the value for patH (name remains the same, value is updated).

@borkdude
Copy link
Contributor

That's what I intended to say, thanks for making this extra clear.

@bobisageek
Copy link
Contributor

Quick skim: other cases of path do all end up in the process's environment on both Windows and Linux, but as said, Windows seems to have some sort of preference. Based on some quick experiments, I think it might prioritize system-level environment variables over user-level, regardless of case, but I'm not confident in that - need a bit more testing (and Path is a weird corner case that happens at both the system and user-level, so it just gets weirder there).

I think a con (but not necessarily a disqualifier) for the merging of environments is that we would need to "process" the underlying environment, which adds a bit of time to those process calls, but obviously that's a trade-off for what might be more desirable behavior.

The merge would be consistent with Windows shell behavior. It would eliminate the possibility of using two environment variables that vary in case only, but that's not really an option on Windows in general (as demonstrated by System/getenv). I'm not sure if there would be there anyone out there "working around" the case-insensitive nature of Windows by using (System/getenv) to get the full environment (which can have multiple keys that vary only in case) - that feels like it'd be a very esoteric edge case. The other edge case that I guess this brings up is that we might need to update multiple environment values, but that would be predicated on the environment already containing 'case-insensitive matching' variables.

One other consideration that might be of interest... this is (sort of) re-createable using clojure.java.shell/sh. Since :extra-env doesn't exist, I have to sort of hand-roll it with merge and getting the environment and so on:

(require '[clojure.java.shell :as shell])

(-> (shell/sh "deps" "-M" "-e" "(println (System/getenv \\\"PATH\\\"))"
              :env 
              (merge (into {} (System/getenv))
                     {"PATH" (str "some-path" (System/getProperty "path.separator") (System/getenv "PATH"))}))
    :out
    (subs 0 50)
    println)

(shutdown-agents) ; lol futures

Running this (I'm using deps instead of clojure here because I find deps.clj more convenient, but it's still running clojure.main for this piece that we're talking about) exhibits the same behavior - if we add "PATH" to the environment, we get the unmodified "Path", but if we add (by way of merging over) "Path", we get the "some-path;..." value. Again, this is not a disqualifier, but it might be something that people are already (ab)using from Clojure's sh function. This could make an argument for saying "it works like Clojure's sh function" (and maybe it's worth documenting in both places, e.g. also on clojuredocs), or it could be "here's a funky edge case where babashka.process works differently (and hopefully better in popular opinion)".

With possible performance implications, should we trial it? As in, we can run some perf testing to make sure it's not adding like a second to each shell call, and then maybe put it out provisionally and solicit user feedback? I don't know if some use cases on Windows might have large numbers of env vars and could experience a noticeable impact. I sort of doubt it, but I'm also not immediately coming up with a more elegant/efficient strategy than a case-insensitive key comparison across the entire environment map.

@lread
Copy link
Contributor Author

lread commented Aug 11, 2022

Wow, thanks for the time and interest @bobisageek!

Disclaimer: I am using deps.exe as well, I use it renamed to clojure.exe. 🙂

Path and PATH in env

You inspired me to have a peek with Sysinternal Process Explorer (to take the JVM out of the equation) and yah... I see both Path and PATH in the environment:
image

Clojure core sh

Thanks for comparing!

And ya (System/getenv) without passing any args, returns the env as is.

user=> (some-> (get (System/getenv) "PATH") (subs 0 50))
nil
user=> (some-> (get (System/getenv) "Path") (subs 0 50))
"C:\\Program Files\\Parallels\\Parallels Tools\\Applica"

So you are saying on Windows Clojure's sh is behaving the same way as bb process.
Good to note.

Performance

I did not expect any real performance hit relative to the cost of spawning out, but you are right, we don't know until we measure.

@lread
Copy link
Contributor Author

lread commented Aug 11, 2022

Observation: we have already gone beyond "Option 1 - Do nothing" by raising and talking about this issue. This issue does serve as a form of documentation for the extra curious.

I'll go ahead and address this Path PATH issue where I bumped into it in babashka/fs test setup. That will give us more info. Maybe it is not so ugly to just compensate for the existing behaviour.

@lread
Copy link
Contributor Author

lread commented Aug 11, 2022

Here's the example compensation from babashka/fs:

:tasks
 {:requires ([babashka.fs :as fs]
             [clojure.string :as str])
  test (clojure {:extra-env {(if (fs/windows?) "Path" "PATH") (str "on-path" fs/path-separator (System/getenv "PATH"))}}
                "-M:test")

So maybe not too horrible? Am willing to PR Option 2 (Document) with a small tip if that's the current preference.

@borkdude
Copy link
Contributor

I think adding docs is a no-brainer. If this issue keeps bugging people we could consider more drastic measures.

lread added a commit to lread/process that referenced this issue Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants