defer to stdlib for path.get_home_dir() #998

Merged
merged 3 commits into from Nov 24, 2011

3 participants

@minrk
Member
minrk commented Nov 14, 2011

We have elaborate and fragile logic for determining
home dir, and it is ultimately less reliable than the stdlib behavior
used for os.path.expanduser('~'). This commit defers to that in
all cases other than a bundled Python in py2exe/py2app environments.
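As a rough illustration of what "deferring to the stdlib" means here, the sketch below wraps `os.path.expanduser('~')`. The function name `get_home_dir_sketch` and the `RuntimeError` are assumptions for illustration, not the code in this PR, and the py2exe/py2app special case is left out. Note that on Windows, Python 3.8 and later no longer consult `HOME` in `expanduser`, so the "HOME first" behaviour discussed in this thread applies to the Python versions of the time and to POSIX.

```python
import os


def get_home_dir_sketch():
    """Return the user's home directory as reported by the standard library.

    Hypothetical helper: it only illustrates the idea of delegating to
    os.path.expanduser and is not the implementation from this PR.
    """
    home = os.path.expanduser("~")
    # expanduser returns the path unchanged when it cannot expand it.
    if home == "~":
        raise RuntimeError("could not determine a home directory")
    return home


if __name__ == "__main__":
    print(get_home_dir_sketch())
```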

The one case where the default guess will not be correct, based on
inline comments, is on WinHPC, where all paths must be UNC (\\foo), and
thus HOMESHARE is the logical first choice. However, HOMESHARE is
the wrong answer in approximately all other cases where it is defined,
and the fix for WinHPC users is the trivial HOME=%HOMESHARE%.

This removes the various tests of our Windows path resolution logic,
which are no longer relevant. Further, $HOME is used by the stdlib
as first priority on all platforms, so tests for this behavior are
no longer posix-specific.

closes gh-970
closes gh-747

@minrk
Member
minrk commented Nov 14, 2011

I also made a small change, allowing IPython to start without requiring that $HOME is actually writable, because we almost never write anything there. Note that we already handle the actually important case of the ipython dir being writable, falling back on a temp dir if not.

If HOME does not exist, the only two references to HOME that I found (%cd with no args, and %logstart global) will raise proper, informative 'no such file' errors, but IPython is fully functional otherwise, so it seemed silly to have a fatal error on something that isn't actually a requirement for reasonably well behaved IPython.
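The distinction between "home must be writable" and "the IPython dir must be writable" can be sketched roughly as follows. The helper names `is_writable_dir` and `choose_config_dir`, the `.ipython` default, and the temp-dir prefix are assumptions made for this example; the real logic lives in `get_ipython_dir()` and may differ.

```python
import os
import tempfile


def is_writable_dir(path):
    """Return True if path is an existing directory the process can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def choose_config_dir(home, name=".ipython"):
    """Pick a config dir under home, falling back to a temp dir.

    Hypothetical helper: home itself only needs to be writable when the
    config dir still has to be created inside it.
    """
    candidate = os.path.join(home, name)
    if os.path.isdir(candidate):
        usable = is_writable_dir(candidate)
    else:
        # Not created yet: usable only if nothing is in the way and
        # the directory could be created under home.
        usable = not os.path.exists(candidate) and is_writable_dir(home)
    if usable:
        return candidate
    # Fall back to a fresh temporary directory rather than failing.
    return tempfile.mkdtemp(prefix="ipython-")
```

With this shape, a missing or read-only home only matters when the config dir has to be created there; everything else keeps working from the fallback location.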

@minrk
Member
minrk commented Nov 14, 2011

pinging @ellisonbg as the individual with the most experience on the one system that will be adversely affected by this change: Windows-based clusters. Do you still have a test system, where we can check what needs to be done when HOMESHARE is not the default choice? Is it difficult to set HOME or IPYTHON_DIR for jobs?

@fperez fperez commented on an outdated diff Nov 20, 2011
IPython/utils/path.py
else:
- raise HomeDirError('No valid home directory could be found for your OS')
+ raise HomeDirError('%s is not a writable dir, set $HOME env to override' % homedir)
@fperez
fperez Nov 20, 2011 Member

Spell out 'environment variable' in full: Windows users may be confused if they only see 'env', while the full term is easy to google for even if they don't know how or where to configure the environment in Windows.

@fperez
Member
fperez commented Nov 20, 2011

I definitely like the simplification this gives, though I'm a little skittish at jettisoning logic that might actually protect a valid corner case perhaps not covered by expanduser('~'). What do you think of #154, for example?

I'd also like to hear @ellisonbg's opinion: after all the pain of getting all that working on the winhpc server he went through, the last thing we want is to break that!

So let's be cautious with this one, but if indeed we can remove those hacks for such a vastly simpler replacement, I'd be delighted! My only other feedback is a tiny fix on a user-facing message.

Finally, this should also (once we settle the policy decision) have a paragraph in the docs explaining how to configure things (and remove any possibly outdated info that could refer to the old logic).

But thanks for the big cleanup, I hope we can indeed merge the whole thing!

@ellisonbg
Member

On Sun, Nov 20, 2011 at 2:28 AM, Fernando Perez
reply@reply.github.com
wrote:

I definitely like the simplification this gives, though I'm a little skittish at jettisoning logic that might actually protect a valid corner case perhaps not covered by expanduser('~').  What do you think of #154, for example?

I definitely don't trust expanduser to get all of the corner cases
correct unless we can verify that it handles everything that our
current code does.

I'd also like to hear @ellisonbg's opinion: after all the pain of getting all that working on the winhpc server he went through, the last thing we want is to break that!

Yes, getting the user's home directory right was a major pain and a
critical part of the winhpc server support. Whatever happens, we
can't lose support for any of the cases our code handles on Windows.
IIRC, the difficult part was handling network mounted user home
directories. In that case, the usual logic didn't work.

I understand it would be nice to simplify our code, but before we do
that, let's figure out exactly what expanduser does.

So let's be cautious with this one, but if indeed we can remove those hacks for such a vastly simpler replacement, I'd be delighted!  My only other feedback is a tiny fix on a user-facing message.

Finally, this should also (once we settle the policy decision) have a paragraph in the docs explaining how to configure things (and remove any possibly outdated info that could refer to the old logic).

But thanks for the big cleanup, I hope we can indeed merge the whole thing!


Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@minrk
Member
minrk commented Nov 21, 2011

I definitely don't trust expanduser to get all of the corner cases
correct unless we can verify that it handles everything that our
current code does.

No, expanduser will not get all of the corner cases, but it does a better job than our current code in almost all normal cases. The problem is that we simply cannot get the right answer on both WinHPC and the rest of Windows, because WinHPC wants HOMESHARE for its UNC path, which is the wrong answer approximately everywhere else that it is defined.

An advantage of expanduser is that HOME is first priority on all platforms, so it is the easiest and most natural to manually specify.

Using expanduser means that every possible corner case can be handled with one line of user code: setting $HOME prior to launching IPython, in all environments on all platforms.

Our choice is between:

A) requiring all WinHPC users to specify HOME=%HOMESHARE% (or just use IPYTHON_DIR, which is enough for the parallel apps)
B) requiring most non-WinHPC users to specify HOMESHARE=something (this is not guaranteed to be possible, and just using IPYTHON_DIR does not cut it for interactive IPython users).
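Option A amounts to a one-line override before launch. A minimal sketch, assuming a hypothetical launcher-side helper named `with_home_from_homeshare` (not part of IPython): it builds an environment in which `HOME` defaults to `HOMESHARE`. Leaving an existing `HOME` untouched is a design choice of this sketch. Whether a given Python honours `HOME` on Windows depends on its version (Python 3.8 and later do not in `expanduser`), so treat this as an illustration of the idea rather than a guaranteed fix.

```python
def with_home_from_homeshare(env):
    """Return a copy of env where HOME falls back to HOMESHARE.

    Hypothetical helper: the input mapping is left unmodified, and an
    existing HOME is never overwritten.
    """
    new_env = dict(env)
    if "HOME" not in new_env and "HOMESHARE" in new_env:
        new_env["HOME"] = new_env["HOMESHARE"]
    return new_env


# Possible use from a job script (command name is illustrative):
#   import os, subprocess
#   subprocess.call(["ipython"], env=with_home_from_homeshare(os.environ))
```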

I understand it would be nice to simplify our code, but before we do
that, let's figure out exactly what expanduser does.

os.path.expanduser priority on Windows:

  1. $HOME
  2. $USERPROFILE
  3. $HOMEDRIVE\$HOMEPATH

and on Unix:

  1. $HOME
  2. query passwd database via pwd module
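
The two priority lists above can be written out as a small function over an explicit environment mapping. This is a sketch of the order as described in this thread, not the stdlib source; the name `resolve_home` and its `windows` flag are assumptions. On Windows, Python 3.8 and later dropped the `HOME` step from `expanduser`.

```python
import ntpath
import os


def resolve_home(env, windows):
    """Resolve a home dir from env using the order listed above.

    Hypothetical helper. Returns None if nothing matches on Windows.
    """
    if "HOME" in env:
        return env["HOME"]
    if windows:
        if "USERPROFILE" in env:
            return env["USERPROFILE"]
        if "HOMEPATH" in env:
            # HOMEDRIVE is optional; HOMEPATH is rooted, e.g. \Users\me
            return ntpath.join(env.get("HOMEDRIVE", ""), env["HOMEPATH"])
        return None
    # Unix: fall back to the passwd database entry for the current user.
    import pwd
    return pwd.getpwuid(os.getuid()).pw_dir
```

Passing the environment in explicitly keeps the lookup order easy to test without touching `os.environ`.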
@ellisonbg
Member

On Mon, Nov 21, 2011 at 2:40 PM, Min RK
reply@reply.github.com
wrote:

I definitely don't trust expanduser to get all of the corner cases
correct unless we can verify that it handles everything that our
current code does.

No, expanduser will not get all of the corner cases, but it does a better job than our current code in almost all normal cases.  The problem is that we simply cannot get the right answer on both WinHPC and the rest of Windows, because WinHPC wants HOMESHARE for its UNC path, which is the wrong answer approximately everywhere else that it is defined.

IIRC, HOMESHARE is not usually set on Windows and the current logic
will ignore it if it is not set and move onto the other options
(HOMEDRIVE+HOMEPATH, USERPROFILE, My Documents, HOME). So the only
time HOMESHARE is used is when it is set and is actually the option
that likely should be used. We know this approach works as this is
how our code has done it for about 2 years now.

An advantage of expanduser is that HOME is first priority on all platforms, so it is the easiest and most natural to manually specify.

Yes, I agree that there is a nice uniformity in having HOME first on
all platforms, but see my note below about env vars on windows.

Using expanduser means that every possible corner case can be handled with one line of user code: setting $HOME prior to launching IPython, in all environments on all platforms.

But there is a problem with this. On multiuser, networked Windows
boxes (managed by Active Directory) environment variables are
particularly difficult to manage on a per user basis. This was a
problem I continually faced when doing the work for Microsoft. I went
in with the unix way of thinking, "I will just set an environment
variable and everything will be happy" and that was almost never the
case. There is no such thing as a .bashrc file that users can use to
set global environment variables that override the system defaults on
all affected systems. IIRC, such environment variables have to be set
by sys-admins in the centralized Active Directory configuration using
Group Policy Objects (ouch, my head hurts just saying this...). We may
still want to consider putting HOME first on Windows, but we can't
require users on Windows to set environment variables to get IPython
to work.

There is part of me that would love to ditch this additional logic for
Win-HPC, but I have a feeling it will haunt me in the future
specifically...

Our choice is between:

A) requiring all WinHPC users to specify HOME=%HOMESHARE% (or just use IPYTHON_DIR, which is enough for the parallel apps)
B) requiring most non-WinHPC users to specify HOMESHARE=something (this is not guaranteed to be possible, and just using IPYTHON_DIR does not cut it for interactive IPython users).

I don't see how these are our only two options. If non-WinHPC users
don't set HOMESHARE, it is ignored and the other options are
attempted, which is what we do today.

I understand it would be nice to simplify our code, but before we do
that, let's figure out exactly what expanduser does.

os.path.expanduser priority on Windows:

  1. $HOME
  2. $USERPROFILE
  3. $HOMEDRIVE\$HOMEPATH

OK, thanks for tracking this down. Quite different logic than we have
though. This makes me wonder (even aside from the HOMESHARE issue)
why our ordering is so different.

and on Unix:

  1. $HOME
  2. query passwd database via pwd module

@minrk
Member
minrk commented Nov 22, 2011

IIRC, HOMESHARE is not usually set on Windows and the current logic
will ignore it if it is not set and move onto the other options
(HOMEDRIVE+HOMEPATH, USERPROFILE, My Documents, HOME). So the only
time HOMESHARE is used is when it is set and is actually the option
that likely should be used. We know this approach works as this is
how our code has done it for about 2 years now.

The fact that we have had multiple bug reports (including #747) of cases where HOMESHARE is the wrong choice would suggest otherwise.

I don't see how these are our only two options. If non-WinHPC users
don't set HOMESHARE, it is ignored and the other options are
attempted, which is what we do today.

The problem is enterprise and computer-lab environments, which regularly do set HOMESHARE even though it is not the right answer there. From the user reports, it would appear that HOMESHARE is never right outside WinHPC even when defined, though obviously we won't get reports from people for whom we are doing the right thing. Sysadmins, not users, set HOMESHARE, but users can set HOME. So if HOMESHARE is wrong and has first priority, there isn't much recourse, whereas if HOME is wrong or undefined, it should be safe to override.
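
To make the recourse argument concrete, here is a small sketch comparing a HOMESHARE-first ordering with a HOME-first ordering on a made-up lab environment. The names `OLD_ORDER`, `NEW_ORDER` and `first_defined` are hypothetical, the paths are invented, and both orderings are simplified (the HOMEDRIVE+HOMEPATH and My Documents steps are omitted).

```python
# Simplified orderings, for illustration only.
OLD_ORDER = ("HOMESHARE", "USERPROFILE", "HOME")
NEW_ORDER = ("HOME", "USERPROFILE")


def first_defined(env, order):
    """Return (name, value) for the first non-empty variable in order."""
    for name in order:
        value = env.get(name)
        if value:
            return name, value
    return None, None


# A lab machine where the sysadmin has set HOMESHARE.
lab = {
    "HOMESHARE": r"\\fileserver\students\alice",
    "USERPROFILE": r"C:\Users\alice",
}
print(first_defined(lab, OLD_ORDER)[0])  # HOMESHARE
print(first_defined(lab, NEW_ORDER)[0])  # USERPROFILE

# The user sets HOME: only the HOME-first ordering respects it.
fixed = dict(lab, HOME=r"C:\Users\alice")
print(first_defined(fixed, OLD_ORDER)[0])  # HOMESHARE
print(first_defined(fixed, NEW_ORDER)[0])  # HOME
```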

So the logical choices for first priority are to match the rest of the Python universe and use HOME, or use HOMESHARE, which has the benefit of guaranteeing UNC paths on WinHPC, but the disadvantages of being the wrong answer everywhere else that it is defined, along with being internally inconsistent with the rest of Python.

The fact is that both choices are guaranteed to give the wrong answer in some cases, so we just have to pick which group is going to need some extra config. WinHPC seems the more logical choice to me, because it is already the more complicated and unconventional use case, and the fix is standard and less intrusive.

OK, thanks for tracking this down. Quite different logic than we have
though. This makes me wonder (even aside from the HOMESHARE issue)
why our ordering is so different.

Yes, I think we are exactly backwards. If we do restore HOMESHARE as first priority, we should use expanduser after that, and can still fall back on My Documents in the end, if we want.

@ellisonbg
Member

On Mon, Nov 21, 2011 at 9:50 PM, Min RK
reply@reply.github.com
wrote:

IIRC, HOMESHARE is not usually set on Windows and the current logic
will ignore it if it is not set and move onto the other options
(HOMEDRIVE+HOMEPATH, USERPROFILE, My Documents, HOME).  So the only
time HOMESHARE is used is when it is set and is actually the option
that likely should be used.  We know this approach works as this is
how our code has done it for about 2 years now.

The fact that we have had multiple bug reports (including #747) of cases where HOMESHARE is the wrong choice would suggest otherwise.

Ahh, I hadn't followed that very closely. In that case, we do need to
change our logic and I agree that the Win-HPC usage case is the least
important.

I don't see how these are our only two options.  If non-WinHPC users
don't set HOMESHARE, it is ignored and the other options are
attempted, which is what we do today.

The problem is the enterprise/computer lab environments, which regularly do set this env, where it is not the right answer.  It would appear from the user reports that HOMESHARE is never right outside WinHPC even when defined, but obviously we won't be getting reports that we are doing the right thing.  Sysadmins, not users, set HOMESHARE, but users can set HOME.  So if HOMESHARE is wrong, there isn't much recourse if it is the first priority, but if HOME is wrong or undefined, it should be safe to override.

Yep.

So the logical choices for first priority are to match the rest of the Python universe and use HOME, or use HOMESHARE, which has the benefit of guaranteeing UNC paths on WinHPC, but the disadvantages of being the wrong answer everywhere else that it is defined, along with being internally inconsistent with the rest of Python.

The fact is that both of these cases are guaranteed to get the wrong answer in some cases, so we just have to pick the one that is going to require some extra config.  WinHPC seems the more logical choice to me, because it is already the more complicated and unconventional use case, and the fix is standard and less intrusive.

OK, thanks for tracking this down.  Quite different logic than we have
though.  This makes me wonder (even aside from the HOMESHARE issue)
why our ordering is so different.

Yes, I think we are exactly backwards.  If we do restore HOMESHARE as first priority, we should use expanduser after that, and can still fall back on My Documents in the end, if we want.

Given the situation, I think that using expanduser makes sense.


@minrk
Member
minrk commented Nov 22, 2011

Given the situation, I think that using expanduser makes sense.

Okay, should I add the My Documents wreg bit back in as a fallback? One case we no longer handle on Windows is the lack of any environment variables, but I don't know when/where that might come up, if ever.

@fperez
Member
fperez commented Nov 22, 2011

On Tue, Nov 22, 2011 at 10:26 AM, Min RK
reply@reply.github.com
wrote:

Okay, should I add the My Documents wreg bit back in as a fallback?

I would. It's just a couple lines of code and if it saves a few
people with weird setups from having problems (and us from having to
answer them) it's worth it.

@minrk
Member
minrk commented Nov 23, 2011

Okay, My Documents fallback is back (along with its test), and @fperez's comment on the error message is addressed.
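
For readers unfamiliar with the registry fallback, a rough sketch of such a lookup follows. The function name `my_documents_dir` is hypothetical, and the key and value names (`Shell Folders`, `Personal`) are an assumption about where Windows records the folder, not something this thread confirms. It uses the Python 3 module name `winreg` (Python 2 called it `_winreg`) and returns `None` on platforms without a registry.

```python
def my_documents_dir():
    """Return the 'My Documents' path from the registry, or None.

    Hypothetical helper; the key and value names below are assumptions.
    """
    try:
        import winreg  # only available on Windows
    except ImportError:
        return None
    key_path = r"Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders"
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "Personal")
            return value
    except OSError:
        # Key or value missing, or access denied.
        return None
```

Used as a last resort after `expanduser`, a lookup like this covers the case where none of the usual environment variables are defined.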

minrk added some commits Nov 14, 2011
@minrk minrk defer to stdlib for path.get_home_dir()
We have elaborate and fragile logic for determining
home dir, and it is ultimately less reliable than the stdlib behavior
used for `os.path.expanduser('~')`.  This commit defers to that in
all cases other than a bundled Python in py2exe/py2app environments.

The one case where the default guess will *not* be correct, based on
inline comments, is on WinHPC, where all paths must be UNC (`\\foo`), and
thus HOMESHARE is the logical first choice.  However, HOMESHARE is
the wrong answer in approximately all other cases where it is defined,
and the fix for WinHPC users is the trivial `HOME=%HOMESHARE%`.

This removes the various tests of our Windows path resolution logic,
which are no longer relevant. Further, $HOME is used by the stdlib
as first priority on *all* platforms, so tests for this behavior are
no longer posix-specific.

closes gh-970
closes gh-747
b5ca646
@minrk minrk allow IPython to run without writable home dir
get_ipython_dir() ensures that the *IPython* dir is writable, which is more relevant, but the home dir need not be writable. Some optional behaviors (e.g. `%logstart global`) will not work if the home dir is not writable, but IPython should not crash.  Approximately no other operations actually depend on writing directly to $HOME.
db42b5f
@minrk minrk restore My Documents fallback for get_home_dir on Windows 9df2cbb
@fperez
Member
fperez commented Nov 24, 2011

@minrk, this looks pretty much ready to merge, no? If so, feel free to go ahead with it, the last commits look fine.

Thanks for the cleanup and being patient to work through the issues on WinHPC with @ellisonbg!

@minrk
Member
minrk commented Nov 24, 2011

Sure, it seems well behaved to me. I'll give it a few iptests, then merge if nothing pops up.

@fperez
Member
fperez commented Nov 24, 2011

On Wed, Nov 23, 2011 at 6:31 PM, Min RK
reply@reply.github.com
wrote:

Sure, it seems well behaved to me.  I'll give it a few iptests, then merge if nothing pops up.

Great, thanks!

@minrk minrk merged commit 351c8fc into ipython:master Nov 24, 2011