Support environments without a home dir or writable file system #1824

Merged
merged 17 commits into from Apr 16, 2013

Conversation

Projects
None yet
5 participants
Contributor

mgiuca-google commented Mar 15, 2013

This patch makes it possible to import and use matplotlib on a system where the user has no $HOME directory, and furthermore, on which there is no writable file system. It does this by:

  • Falling back to creating a temporary config directory instead of raising RuntimeError if the user has no $HOME directory.
  • Falling back to not writing to the config dir if it is not possible to create a temporary directory. (In this case, all caching mechanisms, including texmanager, font_manager and finance, are disabled.)

Partial fix for Issue #1823.

lib/matplotlib/__init__.py
- t.write(ascii('1'))
- finally:
- t.close()
+ return os.access(p, os.W_OK)
except OSError: return False
@pelson

pelson Mar 15, 2013

Member

Would you mind separating this line into 2.

@mdboom

mdboom Mar 15, 2013

Owner

I believe there is a corner case in which os.access will not do the right thing, hence the convoluted approach you see here... just navigated the git history to no avail. At the very least, this should use realpath of p to resolve any symbolic links.

@mgiuca-google

mgiuca-google Mar 17, 2013

Contributor

@pelson: Done.

@mdboom: Yeah the code was written in 2005 and wasn't really changed since. I assume the corner case is what the Python manual for os.access hints at: "Note: I/O operations may fail even when access() indicates that they would succeed, particularly for operations on network filesystems which may have permissions semantics beyond the usual POSIX permission-bit model."

I don't know what this means in practice. Is there a concrete example of access() returning True for a directory but you can't create files in it? Creating and deleting a temporary file seems like a messy thing to do if there is a function that tells you directly whether a directory is writable. But I am happy to put it back. The App Engine corner case is a failure in the other direction: TemporaryFile succeeds even though the file system is not writable (since TemporaryFile just creates a StringIO, ignoring the dir argument). So it should be fine to have both checks: os.access() to check if it is theoretically writable (which works for App Engine), followed up by TemporaryFile to test if it is writable in practice. Should I go ahead and do this?

At the very least, this should use realpath of p to resolve any symbolic links.

I don't think so. From man 2 access:

If pathname is a symbolic link, it is dereferenced.

I tested this on Linux and it seemed to work (created a non-writable directory and a writable symlink to the directory; os.access with W_OK on the symlink returns False).

@mdboom

mdboom Mar 18, 2013

Owner

I think doing both (as you suggest) sounds like a good suggestion.

On Sun, Mar 17, 2013 at 7:50 PM, Matt Giuca notifications@github.comwrote:

In lib/matplotlib/init.py:

@@ -215,13 +215,8 @@ def _is_writable_dir(p):
try: p + '' # test is string like
except TypeError: return False
try:

  •    t = tempfile.TemporaryFile(dir=p)
    
  •    try:
    
  •        t.write(ascii('1'))
    
  •    finally:
    
  •        t.close()
    
  •    return os.access(p, os.W_OK)
    
    except OSError: return False

@pelson https://github.com/pelson: Done.

@mdboom https://github.com/mdboom: Yeah the code was written in 2005
and wasn't really changed since. I assume the corner case is what the
Python manual for os.access hints at: "Note: I/O operations may fail even
when access() indicates that they would succeed, particularly for
operations on network filesystems which may have permissions semantics
beyond the usual POSIX permission-bit model."

I don't know what this means in practice. Is there a concrete example of
access() returning True for a directory but you can't create files in it?
Creating and deleting a temporary file seems like a messy thing to do if
there is a function that tells you directly whether a directory is
writable. But I am happy to put it back. The App Engine corner case is a
failure in the other direction: TemporaryFile succeeds even though the file
system is not writable (since TemporaryFile just creates a StringIO,
ignoring the dir argument). So it should be fine to have both checks:
os.access() to check if it is theoretically writable (which works for App
Engine), followed up by TemporaryFile to test if it is writable in
practice. Should I go ahead and do this?

At the very least, this should use realpath of p to resolve any symbolic
links.

I don't think so. From man 2 access:

If pathname is a symbolic link, it is dereferenced.

I tested this on Linux and it seemed to work (created a non-writable
directory and a writable symlink to the directory; os.access with W_OK on
the symlink returns False).


Reply to this email directly or view it on GitHubhttps://github.com/matplotlib/matplotlib/pull/1824/files#r3406610
.

Michael Droettboom
http://www.droettboom.com/

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

Done. Also I added an extra check that it is actually a directory!

- :see: http://mail.python.org/pipermail/python-list/2005-February/263921.html
+ :see: http://mail.python.org/pipermail/python-list/2005-February/325395.html
lib/matplotlib/__init__.py
try:
- path=os.path.expanduser("~")
+ path = os.path.expanduser("~")
except:
@pelson

pelson Mar 15, 2013

Member

I'd rather remove this bare except. Would you mind adding the specific exception(s) types that might occur?

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

OK. Well I hope I didn't miss any... but the documentation for both os.path.expanduser and pwd do not mention any exceptions. I dug through the code and it doesn't look like any exceptions are possible on normal (non-App-Engine) platforms. (pwd.getpwnam can raise KeyError, but that is caught in expanduser.) So I think we could actually remove that try-except entirely, except on App Engine, the pwd module is sadly not present, so calling expanduser raises ImportError. I'm changing the code to explicitly catch ImportError with a comment that this is for App Engine. I hope this is okay.

lib/matplotlib/__init__.py
+ if os.path.isdir(path):
+ return path
+ for evar in ('HOME', 'USERPROFILE', 'TMP'):
+ try:
@pelson

pelson Mar 15, 2013

Member

The try except seems superfluous to me, I'd change this to:

path = os.environ.get(evar)
if path is not None and os.path.isdir(path):
    return path
+ if not _is_writable_dir(h):
+ return _create_tmp_config_dir()
+ from matplotlib.cbook import mkdirs
+ mkdirs(p)
@pelson

pelson Mar 15, 2013

Member

There must be a subtlety that I'm missing here - I can't see the difference between cbook.mkdirs and os.makedirs. Any ideas?

@mdboom

mdboom Mar 15, 2013

Owner

The difference is that cbook.mkdirs doesn't complain if the directory already exists.

@WeatherGod

WeatherGod Mar 15, 2013

Member

cbook.mkdirs() is even safer than os.makedirs(). Imagine two processes, one needing to make A/B, and the other needing to make A/C. There is a race condition in os.makedirs() where one of the two processes will fail to make their directory because the exception was thrown while dealing with the parent directory.

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

So why does it use os.makedirs above? (It's a little bit above the context shown in this patch, but the first block of get_configdir calls os.makedirs, whereas the second block calls cbook.mkdirs.) Can we change it so it consistently calls cbook.mkdirs?

@mdboom

mdboom Mar 18, 2013

Owner

Yes -- let's change this to cbook.mkdirs in both cases.

lib/matplotlib/__init__.py
- print("""\
+ configdir = get_configdir()
+ if home:
+ oldname = os.path.join( home, '.matplotlibrc')
@pelson

pelson Mar 15, 2013

Member

Please remove the space before home.

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

Done (and the rest).

lib/matplotlib/__init__.py
+ try:
+ shutil.move('.matplotlibrc', 'matplotlibrc')
+ except IOError as e:
+ print("WARNING: File could not be renamed: %s" % e, file=sys.stderr)
@pelson

pelson Mar 15, 2013

Member

Could you turn this into a real warning.

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

Done (and the rest).

lib/matplotlib/__init__.py
+ if os.path.exists(oldname):
+ if configdir is not None:
+ newname = os.path.join(configdir, 'matplotlibrc')
+ print("""\
@pelson

pelson Mar 15, 2013

Member

Again, would you mind turning this into a real warning.

lib/matplotlib/__init__.py
+ try:
+ shutil.move(oldname, newname)
+ except IOError as e:
+ print("WARNING: File could not be renamed: %s" % e,
@pelson

pelson Mar 15, 2013

Member

Ditto.

lib/matplotlib/__init__.py
+ print("WARNING: File could not be renamed: %s" % e,
+ file=sys.stderr)
+ else:
+ print("""\
@pelson

pelson Mar 15, 2013

Member

Ditto.

lib/matplotlib/__init__.py
+ else:
+ print("""\
+WARNING: Could not rename old rc file "%s": a suitable configuration directory
+ could not be found.""" % oldname, file=sys.stderr)
fname = os.path.join( os.getcwd(), 'matplotlibrc')
@pelson

pelson Mar 15, 2013

Member

Would you mind removing a newline before this line & getting rid of the space before os.getcwd.

lib/matplotlib/__init__.py
- if os.path.exists(fname): return fname
+ if configdir is not None:
+ fname = os.path.join(configdir, 'matplotlibrc')
+ if os.path.exists(fname): return fname
@pelson

pelson Mar 15, 2013

Member

Could you remove this newline and turn the if statement into 2 lines.

lib/matplotlib/finance.py
+ mkdirs(cachedir)
+ urlfh = urlopen(url)
+
+ fh = open(cachename, 'wb')
@pelson

pelson Mar 15, 2013

Member

Use the context manager form.

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

Fixed in PR #1834.

Note: I also caught and fixed my own bug here: if cachename is supplied, it should not skip caching if there is no cachedir.

lib/matplotlib/texmanager.py
- print("""\
+ if texcache is not None:
+ # FIXME raise proper warning
+ print("""\
@pelson

pelson Mar 15, 2013

Member

Please turn this into a warning.

lib/matplotlib/texmanager.py
+ try:
+ shutil.move(oldcache, texcache)
+ except IOError as e:
+ print("WARNING: File could not be renamed: %s" % e,
@pelson

pelson Mar 15, 2013

Member

Ditto.

lib/matplotlib/texmanager.py
+ print("WARNING: File could not be renamed: %s" % e,
+ file=sys.stderr)
+ else:
+ print("""\
@pelson

pelson Mar 15, 2013

Member

Ditto.

lib/matplotlib/texmanager.py
@@ -141,6 +157,11 @@ class TexManager:
def __init__(self):
+ if not self.texcache:
@pelson

pelson Mar 15, 2013

Member

Lets be consistent. Could you turn this into if self.texcache is None.

Member

pelson commented Mar 15, 2013

Thanks @mgiuca-google - this looks good to me. Thanks for putting this work in.

I'm guessing you've tried this out on G.A.E.? That would be the best way to give me confidence that this is doing the right thing for those circumstances.

@ghost ghost assigned pelson Mar 15, 2013

Member

pelson commented Mar 15, 2013

Also, would you mind adding details in the whats new section (making a sing and a dance that Google app engine should now work). If you're keen I'd also love to see a write-up at http://matplotlib.org/faq/howto_faq.html#matplotlib-in-a-web-application-server about how to go about setting up your App engine instance for mpl?

lib/matplotlib/texmanager.py
+ if not self.texcache:
+ raise RuntimeError(
+ ('Cannot create TexManager, as there is no cache directory '
+ 'available'))
@mdboom

mdboom Mar 15, 2013

Owner

This probably doesn't affect Google AppEngine (which I suspect doesn't have a TeX installation anyway), but it would be nice to continue to be able to get TeX snippets without them being cached in this case.

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

From my reading of the code, this whole module doesn't work if there is no writable directory. The so-called "texcache" is not really a cache at all, it's also a working directory, where intermediate files are generated and processed with dvipng. So even if we assume a system that does have a TeX installation and the ability to launch external commands, but does not have any writable file system, we won't be able to run these commands. (This isn't a simple matter of disabling caching.)

Correct me if I'm wrong about that.

lib/matplotlib/finance.py
+ fh = open(cachename)
+ verbose.report('Using cachefile %s for %s'%(cachename, ticker))
+ else:
+ mkdirs(cachedir)
@WeatherGod

WeatherGod Mar 15, 2013

Member

Perhaps we need to import cbook.mkdirs() at the top. I think there is a possibility that this function could be not defined at this point.

@mgiuca-google

mgiuca-google Mar 18, 2013

Contributor

cbook.mkdirs is imported at the top.

Owner

mdboom commented Mar 15, 2013

For the font cache: ideally if .matplotlib is not writable, it should fall back to a font cache created at install time and installed in mpl-data/fonts -- that way the user would still get fast access to the fonts that ship with matplotlib. I don't know if we need that level of finesse in this first round of PR, but it would certainly deal with 90% of use cases and have them be as fast as ever.

Contributor

mgiuca-google commented Mar 18, 2013

@pelson:

Thanks @mgiuca-google - this looks good to me. Thanks for putting this work in.

No problem, thanks for looking at these GAE-specific patches. This should make our (GAE Python team's) lives easier in the future.

I'm guessing you've tried this out on G.A.E.? That would be the best way to give me confidence that this is doing the right thing for those circumstances.

Yeah (well, the dev appserver). I basically wrote this patch by running it on the dev appserver and then fixing things until it worked. This, combined with my other two PRs (#1825 and #1826), allows Matplotlib to run on the dev appserver. There is a caveat, though: because of the recent change to Matplotlib to remove pytz and dateutil, and require them as external dependencies, I had to hack the devappserver code to whitelist those modules. I think that is something we'll have to fix when the time comes to upgrade to Matplotlib 1.3.0, not something that you can fix on your end.

Also, would you mind adding details in the whats new section (making a sing and a dance that Google app engine should now work). If you're keen I'd also love to see a write-up at http://matplotlib.org/faq/howto_faq.html#matplotlib-in-a-web-application-server about how to go about setting up your App engine instance for mpl?

Yeah, I'd be happy to write a short "matplotlib with Google App Engine" section there. The general advice in that section is pretty much all you need for App Engine (though obviously you can't use fig.savefig('test.png'); you need to write to stdout instead). I'll address that in a separate PR. Note that Matplotlib already runs on the production servers (since I have already applied similar patches to the above on our local version); this PR just allows it to run on the dev appserver.

@mdboom:

For the font cache: ideally if .matplotlib is not writable, it should fall back to a font cache created at install time and installed in mpl-data/fonts -- that way the user would still get fast access to the fonts that ship with matplotlib. I don't know if we need that level of finesse in this first round of PR, but it would certainly deal with 90% of use cases and have them be as fast as ever.

But as far as I can tell, the font cache is not actually caching the font files, it's caching the font list. I don't think that mpl-data/fonts contains a font list file, does it? (It doesn't on my system.)

Owner

mdboom commented Mar 18, 2013

Yes -- the fontList.cache is just a directory of the font attributes to font paths. I was suggesting that we generate a font directory at build time of only the fonts that ship with matplotlib, and use that as a fallback if we can't write a font directory at import time.

Contributor

mgiuca-google commented Mar 18, 2013

Yes -- the fontList.cache is just a directory of the font attributes to font paths. I was suggesting that we generate a font directory at build time of only the fonts that ship with matplotlib, and use that as a fallback if we can't write a font directory at import time.

OK. Well can we do that in another PR? This PR is not going to make things worse on this front (since all of these cases are going from "won't work at all" to "won't have access to the cache".

Owner

mdboom commented Mar 19, 2013

Right -- I was suggesting that having a default cache could be another PR.

Contributor

mgiuca-google commented Mar 19, 2013

OK. I've rebased the branch to the current master.

Member

dmcdougall commented Mar 23, 2013

@mgiuca-google Looks like you need another rebase. That might be my fault; I just firehosed a bunch of PRs.

Contributor

mgiuca-google commented Mar 24, 2013

Rebased.

(It was my fault: I conflicted with my own change in another branch!)

mgiuca-google added some commits Mar 6, 2013

matplotlib: _is_writable_dir uses os.access instead of TemporaryFile.
This is a more direct approach and avoids strange behaviour on systems that
provide a fake implementation of TemporaryFile. In particular, on Google App
Engine, TemporaryFile is equivalent to StringIO, so this would have always
succeeded, despite the fact that the directory is not actually writable.

It is also not particularly nice to go creating and deleting random files in the
user's home directory.
Matplotlib now works when the user has no home directory.
matplotlib.get_home now returns None if there is no home directory, instead of
raising RuntimeError. Updated code in several places to handle this gracefully.

matplotlib.get_configdir now returns a temporary directory if there is no home
directory, instead of raising RuntimeError.
get_configdir returns None if tempfile.gettempdir() is not available.
Previously, it would raise NotImplementedError. This is necessary on restricted
platforms such as Google App Engine that do not provide gettempdir.
Deal with all cases where get_configdir might return None.
Each case is either handled gracefully, or raises a RuntimeError.
font_manager: Gracefully handle the case of there being no config dir.
Instead of raising RuntimeError, now avoids reading and writing the font cache.
texmanager: Gracefully handle the case of there being no config dir u…
…pon import.

Instead of raising RuntimeError, now avoids creating the cache dir. This permits
texmanager to be imported in a restricted environment such as Google App Engine.

Actually constructing a TexManager object will fail, but it can be imported.
finance: Gracefully handle the case of there being no config dir.
Instead of raising RuntimeError, now avoids reading and writing the cache.
finance: Fixed caching when cachename is supplied.
Previously, it would not cache if there was no cachedir, even if cachename was
supplied.
matplotlib: Remove catch for OSError.
os.access does not raise this exception.
matplotlib: _is_writable_dir tests with os.access and TemporaryFile.
This fails either if the OS tells us it isn't writable, or if a file cannot be
created there.
Contributor

mgiuca-google commented Apr 16, 2013

Hey guys, we haven't had any traction on this issue for about a month. I thought I'd wait until the 1.2.1 release before pinging.

It looked like we were almost at a consensus. Are there any more issues? I have just rebased again against the latest HEAD, and there were no conflicts.

pelson added a commit that referenced this pull request Apr 16, 2013

Merge pull request #1824 from mgiuca-google/no-home-dir
Support environments without a home dir or writable file system

@pelson pelson merged commit 91b1d9f into matplotlib:master Apr 16, 2013

1 check passed

default The Travis build passed
Details
Member

pelson commented Apr 16, 2013

Done. Another excellent piece of work @mgiuca-google - thanks again!

@mgiuca-google mgiuca-google deleted the mgiuca-google:no-home-dir branch Apr 16, 2013

Contributor

mgiuca-google commented Apr 16, 2013

Thanks, Phil. That's all the patches I have lined up for now. It's been great working with you.

Member

pelson commented Apr 16, 2013

That's all the patches I have lined up for now. It's been great working with you.

Shame! Has this been a work-driven or a hobby-driven project? We've plenty of space for people willing to get their hands dirty improving the codebase 😉 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment