
Unicode bug in Itpl when expanding shell variables in syscalls with ! #822

Closed
fperez opened this Issue September 29, 2011 · 22 comments

Fernando Perez
Owner

found this today during a presentation, dumping quickly...

In [35]: files
Out[35]: 
['pics:',
 '83547885.jpg',
 '',
 'ppt:',
 'Aiguille du Midi.pps',
 'Ca\xc3\xb1o Cristales.pps',
 'parejas disparejas.ppt',
 'Underwater.pps',
 '',
 'pub:',
 'image_summary.py',
 'stained_glass_barcelona.png',
 'trapezoid_demo.py',
 'trapezoid.py',
 'trap.py',
 'trap.py~']

In [36]: for f in files:
   ....:     !echo $f
   ....:     
pics:
83547885.jpg

ppt:
Aiguille du Midi.pps
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/home/fperez/tmp/ in ()
      1 for f in files:
----> 2     get_ipython().system(u"echo $f")
      3 

/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.pyc in system_raw(self, cmd)
   1999         # a non-None value would trigger :func:`sys.displayhook` calls.
   2000         # Instead, we store the exit_code in user_ns.
-> 2001         self.user_ns['_exit_code'] = os.system(self.var_expand(cmd, depth=2))
   2002 
   2003     # use piped system by default, because it is better behaved


/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.pyc in var_expand(self, cmd, depth)
   2473                           sys._getframe(depth+1).f_locals # locals
   2474                           )
-> 2475         return py3compat.str_to_unicode(str(res), res.codec)
   2476 
   2477     def mktempfile(self, data=None, prefix='ipython_edit_'):

/home/fperez/usr/lib/python2.6/site-packages/IPython/external/Itpl/_Itpl.pyc in __str__(self)
    240     def __str__(self):
    241         """Evaluate and substitute the appropriate parts of the string."""
--> 242         return self._str(self.globals,self.locals)
    243 
    244     def __repr__(self):

/home/fperez/usr/lib/python2.6/site-packages/IPython/external/Itpl/_Itpl.pyc in _str(self, glob, loc)
    197             if live: app(str(eval(chunk,glob,loc)))
    198             else: app(chunk)
--> 199         out = ''.join(result)
    200         try:
    201             return str(out)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)
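
For anyone reading along later, here is a minimal sketch of what seems to be going on, assuming the file names are UTF-8 byte strings (as the `\xc3\xb1` in the listing suggests) and that the failure is the implicit ASCII decode that Python 2 performs when such a byte string is joined with the unicode template `u"echo $f"`. It is written for Python 3, where the decode has to be spelled out explicitly; `raw_name` and `decode_ascii` are illustrative names, not anything from IPython.

```python
# UTF-8 bytes of the file name from the listing ('ñ' is the pair 0xc3 0xb1).
raw_name = b'Ca\xc3\xb1o Cristales.pps'


def decode_ascii(data):
    """Decode bytes as ASCII, returning the exception instead of raising it."""
    try:
        return data.decode('ascii')
    except UnicodeDecodeError as err:
        return err


error = decode_ascii(raw_name)
print(type(error).__name__, error.start)   # which byte offset was rejected
print(raw_name.decode('utf-8'))            # decoding with the right codec works
```

The rejected offset is the first byte of the encoded `ñ`, which matches the "position 2" in the traceback.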
Min RK
Owner

Isn't it well established that Itpl just doesn't support unicode in the slightest?

Fernando Perez
Owner

Right... I had people in front of me so it was really quick and I didn't have the bandwidth to recall that.

One more reason to push on moving out of Itpl, then. Unicode in the filesystem is more and more common, so this will be more annoying to people as time goes on. I don't know if we'll manage to transition it by 0.12 though.

Thomas Kluyver
Collaborator

Just to mention, the FullEvalFormatter that I'd like to use for this is in PR #507 (prompt manager). So I'd prefer to get that merged in before approaching this. But if the prompt manager is going to be delayed until after 0.12, I can copy FullEvalFormatter across and do this first.

Thomas Kluyver
Collaborator

@fperez: Is the prompt manager stuff likely to get in for 0.12? If not, I can copy across the FullEvalFormatter code and replace Itpl here with string formatting.

Fernando Perez
Owner

Let's see if we can find the time to resolve those. If we can make headway into the harder PRs we have before release, we won't need to. I'd like to tackle them one at a time over the next week, we'll see how it goes.

Min RK
Owner

Itpl is actually quite small, and it was easy to get it working, at least for this particular bug. If we do want to keep it, should I keep it str-only, make it unicode-native, or just leave it, as Thomas says, and move to EvalFormatter (losing '$' in the process)?

EvalFormatter is definitely much better (and more pythonic) code. But I have a feeling that the people who use the '$' expansion will be very sad to see it gone. Especially if we restore the long lost shell profile.

Fernando Perez
Owner

I'd forgotten that itpl is the one that gives us $ expansion, and that is something that is definitely very useful, and that I've demoed multiple times to audiences in addition to using it personally quite regularly.

So that's a good argument for keeping Itpl around then, even if we move to EvalFormatter for most of our internal use. In that case, making it unicode compliant seems like the right way to go, as that will ensure things work OK in py3.

Thomas Kluyver
Collaborator

Making $name and $name.attr work should just be a few-line subclass of FullEvalFormatter, so if it allows us to drop ~300 lines of Itpl (which it seems we now need to maintain ourselves), I think it's worth doing.

More complex expressions will need to be written as ${name['item'](args)}, but I don't think that's a show stopper.
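
To make the two syntaxes concrete, here is a rough, standalone sketch. It is not FullEvalFormatter and not the code that was eventually merged. It assumes a regular expression is enough to find the fields, and `DOLLAR_FIELD`, `expand_dollar`, `Talk` and the sample namespace are all made-up names for illustration.

```python
import re

# Hypothetical pattern: either ${expression} or a bare $name / $name.attr.
# The braces form stops at the first '}', so it cannot hold dict literals.
DOLLAR_FIELD = re.compile(
    r"\$(?:\{(?P<expr>[^}]+)\}"
    r"|(?P<name>[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*))"
)


def expand_dollar(template, namespace):
    """Replace each $field in `template` by evaluating it in `namespace`."""
    def substitute(match):
        source = match.group("expr") or match.group("name")
        return str(eval(source, {}, namespace))
    return DOLLAR_FIELD.sub(substitute, template)


class Talk:
    title = "Caño Cristales"


ns = {"f": "Underwater.pps", "talk": Talk(), "sizes": {"small": lambda n: n * 2}}
print(expand_dollar("echo $f", ns))
print(expand_dollar("echo $talk.title", ns))
print(expand_dollar("echo ${sizes['small'](21)}", ns))
```

Bare `$name.attr` covers the common cases, and anything with brackets or calls goes inside `${...}`, which keeps the pattern small.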

Fernando Perez
Owner
Thomas Kluyver
Collaborator
Fernando Perez
Owner
Min RK
Owner

Sounds good - I'll slow down on the small ones, I was just trying to clean out some of the easy 0.12 issues.

Thomas, how were you thinking that adding '$foo' support would work in FullEval? All the actual parsing is handled by the str._formatter_parser method, so it seems like you would essentially have to rewrite the Itpl parser all over again.
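
One possible shape for this, sketched against the standard library's `string.Formatter` rather than FullEvalFormatter: `Formatter.parse` is an overridable method, so a subclass can tokenize `$name` itself and leave the rest of the formatting machinery untouched. The class name `DollarFormatter` and its regular expression are assumptions made for the example, and it only handles the bare `$name` / `$name.attr` form.

```python
import re
import string


class DollarFormatter(string.Formatter):
    """Hypothetical sketch: '$name' fields fed through string.Formatter."""

    _field = re.compile(r"\$([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")

    def parse(self, format_string):
        # Yield (literal_text, field_name, format_spec, conversion) tuples,
        # the same shape that string.Formatter.parse produces.
        position = 0
        for match in self._field.finditer(format_string):
            yield format_string[position:match.start()], match.group(1), "", None
            position = match.end()
        if position < len(format_string):
            # Trailing literal text with no field attached.
            yield format_string[position:], None, None, None

    def get_field(self, field_name, args, kwargs):
        # Evaluate the field as an expression against the keyword arguments.
        return eval(field_name, {}, kwargs), field_name


fmt = DollarFormatter()
print(fmt.format("echo $f", f="Caño Cristales.pps"))
```

Only the tokenizing step is rewritten here, which is much less than the whole of Itpl.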

Min RK
Owner

I should also note that when I was digging into this (I do already have unicode Itpl working), I discovered a small related bug: os.system doesn't like unicode, so we have to make sure that we encode with unicode_to_str when we pass to it.
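
A small sketch of that encoding step, under the assumption that the job is simply to turn the expanded text into a byte string before it reaches the system call on Python 2. `encode_for_system` is a hypothetical stand-in, not IPython's `unicode_to_str`, and defaulting to the filesystem encoding is a guess at a reasonable choice. On Python 3, `os.system` accepts text directly, so the step is only illustrative there.

```python
import sys


def encode_for_system(command, encoding=None):
    """Return `command` as bytes, encoding text with the given codec."""
    if isinstance(command, bytes):
        return command  # already a byte string, pass it through unchanged
    return command.encode(encoding or sys.getfilesystemencoding())


encoded = encode_for_system(u"echo Ca\xf1o Cristales.pps", "utf-8")
print(repr(encoded))
```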

Fernando Perez
Owner
Thomas Kluyver
Collaborator
Min RK
Owner

okay, makes sense.

Fernando Perez
Owner
Min RK
Owner

So am I correct in understanding that the official plan for this is for Thomas to add $foo support to FullEvalFormatter, and remove Itpl as part of the PromptManager PR?

Fernando Perez
Owner
Fernando Perez fperez closed this in 09c9952 November 20, 2011
Fernando Perez
Owner

@takluyver, let me know if the test I added in 31ab23f causes any issues in py3. Thanks for the PR!

Stefan van der Walt

Should a person have access to environment variables? E.g., I can't do

!echo ${HOME}
Min RK
Owner

@stefanv Yes, they should, though the point of this is to allow $HOME to get HOME from the IPython environment. I think the way it used to work was that you would use $$HOME to pass the string $HOME to the system call, though that does not work with this change.
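
For illustration only, here is how a `$$` escape of that kind could be layered on top of `$name` expansion, so that `$$HOME` reaches the shell as `$HOME`. This is a hypothetical sketch and not a description of what the merged change does; `TOKEN` and `expand` are made-up names, and an undefined name simply raises `KeyError`.

```python
import re

# Hypothetical pattern: '$$' is an escape, '$name' is a Python variable.
TOKEN = re.compile(r"\$\$|\$([A-Za-z_]\w*)")


def expand(template, namespace):
    """Expand $name from `namespace`, turning '$$' into a literal '$'."""
    def substitute(match):
        if match.group(0) == "$$":
            return "$"  # leave a literal '$' for the shell to interpret
        return str(namespace[match.group(1)])
    return TOKEN.sub(substitute, template)


print(expand("echo $$HOME $f", {"f": "trap.py"}))
```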
