Str and Bytes traitlets #462

Merged
merged 7 commits into from

3 participants

@takluyver
Owner

Arranges Str and Bytes traitlets to simplify automatic conversion to Python 3. As with the equivalent builtin names, on Python 2, Str is a synonym for Bytes. In Python 3, Str will be synonymous with Unicode.

So, as with the built-in functions, we can now use:

  • Bytes for traits that should stay as a bytestring in Python 3
  • Unicode for anything that should already be unicode (like filenames)
  • Str for things that should be bytestrings in Python 2 and unicode in Python 3

Casting versions of each follow the same pattern. A quick look at the codebase suggests that we're already using things as we should - I'll need to do some cleaning up on the Python 3 side, because I got it working without much thought for sorting it out like this.

@rkern

In Traits Str allows either str or unicode objects, i.e. it only typechecks for basestring, not str or unicode. Please use different names if you want different semantics.

@minrk
Owner

This looks great. I was actually just thinking about this a few minutes ago.

I think the only place where we will actually want Bytes is in the parallel code, where it is critical.

@takluyver
Owner

@rkern: Str already checked specifically for str, not basestring. CStr on the other hand would cast to either str or unicode. I've added a commit to make this consistent - it will only try to make a str (i.e. bytes). It doesn't cause any test failures, but feel free to find corner cases. For Python 3 compatibility, it's preferable to avoid ambiguity about string types as much as possible.

@minrk: Excellent. I'll leave replacing Str with Bytes in the parallel code to you, if that's alright.

@rkern

Yes, but it was wrong to do so, insofar as compatibility with Traits is concerned. Please follow Traits semantics for the same-named traits.

@takluyver
Owner

It's only described as like traits, not a drop in replacement. And in any case, these refer to Python's own names, so Str should correspond to str. We could have a Basestring type, but I don't think we need it.

@rkern

But it is an explicit goal for traitlets to follow Traits semantics where possible and to deviate only where necessary.

In any case, making Str change semantics between Python 2 and Python 3 only makes your code harder to port, not easier. It means that you will have to make an additional set of modifications outside of the scope of the standard 2to3 script.

@takluyver
Owner

No, precisely the aim is to avoid having to make additional modifications. Str in each case references the str type, which is unicode on Python 3. So where code expects bytestrings in Python 2, but unicode in Python 3, Str will behave correctly in both situations.

For the other possibilities, we'll use Unicode and Bytes in the same way as unicode and bytes in Python 2.6+.

@rkern

But that's the problem: you never write code that has that requirement. There are three cases:

  1. You require a unicode string in both Python 2 and Python 3.
  2. You require a bytes string in both Python 2 and Python 3.
  3. You want a unicode string in both Python 2 and Python 3, but you will accept a str object in Python 2 and let the user make sure it's ASCII.

You never want "whatever the str type is in this Python version."

@minrk
Owner

@takluyver - we do want to maintain compatibility with Traits, despite the fact that Str does not actually correspond to str. As a result, we should leave Str mapping to basestring, and add Bytes for mapping directly to bytes.

In general, we should probably avoid using the Str trait in new code, in favor of Unicode or Bytes, to minimize ambiguity.

@rkern - There isn't a predefined Trait for case 2) that we should match, is there? I don't see one.

I also see in the Traits source that ListStr is in fact List(str), and not List(Str), which would behave differently. This means that Traits itself is not internally consistent with respect to Str/str/basestring.

@rkern

@minrk No, there isn't a Bytes trait, but we could add one. Honestly, you should just ignore ListStr and most of the other abbreviations in that part of trait_types.py. They are useless.

@takluyver
Owner

Unfortunately there are cases where data should be in whatever the str type is. There are plenty of cases where Python 2 appears to accept unicode, but is just attempting to silently encode it as ascii text, and will suddenly produce UnicodeEncodeError when you stray outside the first 128 code points. This affects writing to file handles, calling raw_input, and filling readline history, for example. Not to mention the fact that you can .encode() bytestrings and .decode() unicode, with messy results.

If we have an ambiguous data type, we just end up having to check isinstance whenever we want to use it, which defeats the point of using this system. If we really want unicode data, but can accept ascii bytestrings on Python 2, we have the CUnicode trait type to handle it properly.

@rkern

All of those cases should be converting to/from unicode at their sources/sinks in Python 2. As far as IPython is concerned, it should be using unicode objects for whatever it is passing to and receiving from those APIs; it's just that in Python 2, you have to wrap those APIs to explicitly handle the conversions. IPython should not be storing any data for those APIs in attributes that are str objects, only unicode objects. That's the best practice for Unicode text: encode/decode directly at the I/O interface and only pass around unicode objects internally.
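The "encode/decode directly at the I/O interface" practice described above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not IPython code: the helper names `read_text` and `write_text` and the choice of UTF-8 are assumptions made for the example.

```python
import io


def read_text(stream, encoding="utf-8"):
    """Decode bytes to text as soon as they enter the program (hypothetical helper)."""
    return stream.read().decode(encoding)


def write_text(stream, text, encoding="utf-8"):
    """Encode text to bytes only at the moment it leaves the program (hypothetical helper)."""
    stream.write(text.encode(encoding))


# Stand-ins for a real byte-oriented source and sink.
source = io.BytesIO("prompt: \u20ac".encode("utf-8"))
sink = io.BytesIO()

text = read_text(source)   # unicode text from here on
text = text.upper()        # internal processing never sees bytes
write_text(sink, text)     # bytes again only at the boundary
```

Everything between the two helpers handles a single string type, which is the point of the recommendation.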

I'm not saying that the Traits semantics for Str are perfect or even desirable if we were designing the system now. They were a practical compromise accumulated over various historical accidents, long before Python 3 was anything but a joke. But an explicit goal of IPython's use of traitlets is to be able to drop in Traits in place of it. As long as you still hold that goal (and honestly, you don't have to, but you ought to rename things to avoid confusion), you need to avoid semantic differences where feasible. It is feasible to avoid these differences here.

@takluyver
Owner

OK, if that's what we aim for...

However, if Str traits have ambiguous type, we should aim not to use them at all. So perhaps it's better not to define Str in traitlets, and convert all current uses to either Unicode, CUnicode or Bytes.

@rkern

Quite reasonable.

@takluyver
Owner

Alright, I'll get on that this evening.

@takluyver
Owner

That's taken Str out of most of the codebase, without any obvious problems.

@minrk: If you've got time, could I ask you to sort out Bytes and Unicode traits in the parallel code, since you know that much better than me. If you want to make a pull request against this branch (takluyver/traitlets-str), we should have it all working before we merge in to master.

@minrk
Owner

We are in the middle of rewriting 100% of the application code (in newapp), so that would be tremendously inconvenient right now. I'll make sure that all the traits are the right type there, but I don't think a PR against your PR makes sense right now.

@takluyver
Owner

OK, this can wait until you've done that, then I'll rebase onto it.

@minrk
Owner

I've been working on updating the traits in the parallel code, and there is definitely one case where you do want 'whatever Python str is': imports. importstring.import_item passes arguments along to __import__, which doesn't accept unicode strings. So if you have a configurable that is meant to be an argument passed to an import, a true str trait (that does not allow unicode in Python 2) is really the right type. I don't quite know what the right thing to do about that is, though.

@rkern

Yes, it does:

$ python
Enthought Python Distribution -- http://www.enthought.com
Version: 6.2-2 (32-bit)

Python 2.6.5 |EPD 6.2-2 (32-bit)| (r265:79063, May 28 2010, 15:13:03) 
[GCC 4.0.1 (Apple Inc. build 5488)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> __import__(u'sys')
<module 'sys' (built-in)>
>>> 
@takluyver
Owner
>>> __import__(u"€")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

I've also come across a similar case: where we're storing attribute names to be used with getattr (for the formatters). Unicode works, but only in the "feeble attempt to make it a string" sense.

@takluyver
Owner

Could we call a trait something like NativeString? It's ugly, but it's better than WhateverTypeStrRepresentsOnThisVersionOfPython.

@rkern

Requiring a str object in Python 2 is incorrect. A unicode object of the right value is a valid input and should be allowed in both of these areas. Preventing u'sys' is wrong. Strictly speaking, preventing an arbitrary byte string is also wrong in Python 2, but using non-ASCII (strictly speaking, non-identifier) strings would be an abuse of the system and is unlikely to appear. This is one area where Python 2's implicit conversions between str and unicode for ASCII text work just fine.

This is one reason that we have been reasonably happy with the Str situation in Traits (in Python 2). Python 2 is a transitional form that has areas that accept str and unicode objects of the right value. Strictly checking for either is likely to cause problems. If you like, a reasonably correct trait for these two cases would be one that under Python 3, just accepts Unicode str objects and under Python 2 accepts byte str objects and unicode objects that can be converted to str objects through .encode('ascii') without error.
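The rule in the last sentence above could be sketched as a plain validation function. This is only an illustration of that rule, not the traitlets implementation; the function name `validate_native_string` is hypothetical, and the Python 2 branch is included for reference and is not exercised on Python 3.

```python
import sys

PY3 = sys.version_info[0] >= 3


def validate_native_string(value):
    """Return value as the native str type, or raise (hypothetical helper)."""
    if PY3:
        # Python 3: only unicode str objects are accepted.
        if isinstance(value, str):
            return value
        raise TypeError("expected str, got %r" % type(value))
    # Python 2: byte str objects pass through unchanged.
    if isinstance(value, str):
        return value
    # Python 2: unicode is accepted only if it encodes to ASCII cleanly.
    if isinstance(value, unicode):  # noqa: F821 (name exists only on Python 2)
        try:
            return value.encode("ascii")
        except UnicodeEncodeError:
            raise ValueError("non-ASCII text is not allowed: %r" % value)
    raise TypeError("expected a string, got %r" % type(value))
```

On Python 2 this would accept both 'sys' and u'sys' while rejecting u'€' at validation time rather than at the later call.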

@minrk
Owner

Sorry, I should have specified - the fromlist argument does not allow unicode objects:

In [15]: __import__('numpy', fromlist=[u'ndarray'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/minrk/dev/py/Scale0/playground/<ipython-input-15-3ea93eb4f121> in <module>()
----> 1 __import__('numpy', fromlist=[u'ndarray'])

TypeError: Item in ``from list'' not a string

This is exactly what utils.importstring.import_item(u'numpy.ndarray') does, which will fail.

An ASCIIStr trait should attempt to coerce Unicode, so that the ASCII-ness is checked at assignment time, rather than allow Type (or Value) Errors much later when the trait is used. There are plenty of cases where non-ascii unicode is an invalid value, and having a trait that represents that makes sense.
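To illustrate what "checked at assignment time" means, here is a minimal sketch using a plain Python 3 descriptor. It is not the traitlets API: `AsciiName` and `Formatter` are hypothetical names invented for the example, and the real trait machinery works differently.

```python
class AsciiName:
    """Hypothetical descriptor that rejects non-ASCII text when it is assigned."""

    def __set_name__(self, owner, name):
        # Store the value on the instance under a private attribute name.
        self._attr = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self._attr)

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError("expected str, got %r" % type(value))
        try:
            value.encode("ascii")  # fails here, not when the value is used later
        except UnicodeEncodeError:
            raise ValueError("non-ASCII value: %r" % value) from None
        setattr(obj, self._attr, value)


class Formatter:
    print_method = AsciiName()


f = Formatter()
f.print_method = "_repr_html_"  # accepted
```

Assigning a value such as "\u20ac" would raise ValueError on the assignment line itself, so the traceback points at the faulty configuration rather than at a later getattr or import call.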

@rkern

ASCIIStr will end up being a misnomer in Python 3 for many of those cases since Python 3 does introduce non-ASCII identifiers. If you want to be strict about it, you may be better off using Regex traits that actually specify what values are supposed to be allowed and change the regex between versions.

@takluyver
Owner

What about a specific trait type for object names? ObjectName? It would coerce to whatever str was on each platform. I was also thinking it could check against a regex for valid Python identifiers, but it turns out that's not trivial in Python 3 (http://www.python.org/dev/peps/pep-3131/).

@rkern

In Python 3, you could use all(x.isidentifier() for x in value.split('.')) to validate a dotted name.
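Wrapped in a function, the expression above might look like the following sketch for Python 3. The name `is_dotted_name` is an assumption for illustration; note that `str.isidentifier()` also returns True for keywords such as "class", so a stricter check would need the `keyword` module as well.

```python
def is_dotted_name(value):
    """Return True if value looks like a dotted Python name (hypothetical helper)."""
    if not isinstance(value, str):
        return False
    # Every dot-separated part must be a valid identifier on its own;
    # empty parts (leading, trailing, or doubled dots) fail the check.
    return all(part.isidentifier() for part in value.split("."))
```

For example, "numpy.ndarray" and "os.path.join" pass, while ".abc", "abc." and "a.9g" do not.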

@takluyver
Owner

Hmm, neat. Is that a way forward, then? Is ObjectName a sensible trait name?

@minrk
Owner

I realize that, at least in this particular case, it's really the same as Type. Looking at the source, the Type Trait (in real traits) allows assignment as a string. After testing, the default value of a Type traitlet can be a string, but assignment doesn't work. Fixing that in Type.validate should cover this case.

See the original Type trait here:
https://github.com/enthought/traits/blob/master/traits/trait_types.py#L2872

@takluyver
Owner

But that brings us back round to having a trait type that behaves differently from Enthought traits, which is what we're trying to avoid. Also, neither of these cases is necessarily specifying a type/class: the arguments to fromlist can be arbitrary objects inside a module, and the case I ran into is attribute names.

ObjectName would, I think, cover these two cases. NativeString would be more broadly applicable in cases where we can handle ascii-only unicode. I'm happy to add either to this pull request - what do we prefer? Or has someone got a better idea?

@minrk
Owner

No it doesn't. As I tried to describe, the real Trait already handles this, but the Type traitlet is currently inconsistent - it allows strings as the input default, but not in assignment, which the real Type trait does allow.

Making this change would actually fix a mismatch with Traits.

@rkern

I don't think that's what the Type trait does in Traits. The Type trait is a lot like Instance only instead of checking isinstance() it checks issubclass(). It only accepts strings in order to allow lazy initialization. For the actual values, you must assign types that are subclasses of that base class. You cannot assign strings and expect it to resolve them.

I recommend the ObjectName trait.
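The isinstance()/issubclass() distinction described above can be shown with two small checks. This is only a sketch of the two kinds of test, not code from Traits or traitlets; the function names are hypothetical.

```python
def check_instance(value, klass):
    """Instance-style check: value must be an object of klass."""
    return isinstance(value, klass)


def check_type(value, klass):
    """Type-style check: value must itself be a class derived from klass."""
    return isinstance(value, type) and issubclass(value, klass)
```

Under these checks, `bool` passes `check_type(bool, int)` because it is a subclass of `int`, while the string "int" does not, which matches the point that a string is not resolved on assignment.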

@minrk
Owner

Okay, I must have misread the code, sorry. In that case, I'm fine with ObjectName or Identifier.

@takluyver
Owner

I've added it as ObjectName (Identifier is more accurate, but I think it's a bit ambiguous). There's also a subclass, DottedObjectName, for where you want to store a reference like "numpy.ndarray".

@rkern

+1

@takluyver
Owner

Rebased and tidied up CStr and Str traits that were lurking in newapp.

Min, none of the places the tests failed in parallel looked like they wanted ObjectName traits, so I haven't added any beyond those in IPython.core.formatters. Feel free to point out places that could use them.

@minrk
Owner

Great, thanks!

I left the CStr

ObjectNames:

  • zmq.session.Session.pack/unpack
  • zmq.kernelapp.KernelApp.kernel_class
  • zmq.kernelapp.KernelApp.outstream_class
  • zmq.kernelapp.KernelApp.displayhook_class
  • parallel.apps.ipcluster.IPClusterEngines.engine_launcher_class
  • parallel.apps.ipcluster.IPClusterStart.controller_launcher_class
  • parallel.controller.hub.HubFactory.db_class
@takluyver
Owner

All done!

@minrk
Owner

Looks good to me. As long as tests pass, I say merge.

@takluyver takluyver merged commit 96d15c3 into from
@damianavila damianavila referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
8 IPython/config/tests/test_configurable.py
@@ -28,7 +28,7 @@
)
from IPython.utils.traitlets import (
- Int, Float, Str
+ Int, Float, Unicode
)
from IPython.config.loader import Config
@@ -42,7 +42,7 @@
class MyConfigurable(Configurable):
a = Int(1, config=True, help="The integer a.")
b = Float(1.0, config=True, help="The integer b.")
- c = Str('no config')
+ c = Unicode('no config')
mc_help=u"""MyConfigurable options
@@ -56,11 +56,11 @@ class MyConfigurable(Configurable):
class Foo(Configurable):
a = Int(0, config=True, help="The integer a.")
- b = Str('nope', config=True)
+ b = Unicode('nope', config=True)
class Bar(Foo):
- b = Str('gotit', config=False, help="The string b.")
+ b = Unicode('gotit', config=False, help="The string b.")
c = Float(config=True, help="The string c.")
40 IPython/core/formatters.py
@@ -28,7 +28,7 @@
# Our own imports
from IPython.config.configurable import Configurable
from IPython.lib import pretty
-from IPython.utils.traitlets import Bool, Dict, Int, Str, CStr
+from IPython.utils.traitlets import Bool, Dict, Int, Unicode, CUnicode, ObjectName
#-----------------------------------------------------------------------------
@@ -191,11 +191,11 @@ class BaseFormatter(Configurable):
returned and this format type is not used.
"""
- format_type = Str('text/plain')
+ format_type = Unicode('text/plain')
enabled = Bool(True, config=True)
- print_method = Str('__repr__')
+ print_method = ObjectName('__repr__')
# The singleton printers.
# Maps the IDs of the builtin singleton objects to the format functions.
@@ -337,14 +337,14 @@ def dtype_pprinter(obj, p, cycle):
"""
# The format type of data returned.
- format_type = Str('text/plain')
+ format_type = Unicode('text/plain')
# This subclass ignores this attribute as it always need to return
# something.
enabled = Bool(True, config=False)
# Look for a _repr_pretty_ methods to use for pretty printing.
- print_method = Str('_repr_pretty_')
+ print_method = ObjectName('_repr_pretty_')
# Whether to pretty-print or not.
pprint = Bool(True, config=True)
@@ -356,12 +356,12 @@ def dtype_pprinter(obj, p, cycle):
max_width = Int(79, config=True)
# The newline character.
- newline = Str('\n', config=True)
+ newline = Unicode('\n', config=True)
# format-string for pprinting floats
- float_format = Str('%r')
+ float_format = Unicode('%r')
# setter for float precision, either int or direct format-string
- float_precision = CStr('', config=True)
+ float_precision = CUnicode('', config=True)
def _float_precision_changed(self, name, old, new):
"""float_precision changed, set float_format accordingly.
@@ -454,9 +454,9 @@ class HTMLFormatter(BaseFormatter):
could be injected into an existing DOM. It should *not* include the
```<html>`` or ```<body>`` tags.
"""
- format_type = Str('text/html')
+ format_type = Unicode('text/html')
- print_method = Str('_repr_html_')
+ print_method = ObjectName('_repr_html_')
class SVGFormatter(BaseFormatter):
@@ -471,9 +471,9 @@ class SVGFormatter(BaseFormatter):
```<svg>``` tags, that could be injected into an existing DOM. It should
*not* include the ```<html>`` or ```<body>`` tags.
"""
- format_type = Str('image/svg+xml')
+ format_type = Unicode('image/svg+xml')
- print_method = Str('_repr_svg_')
+ print_method = ObjectName('_repr_svg_')
class PNGFormatter(BaseFormatter):
@@ -487,9 +487,9 @@ class PNGFormatter(BaseFormatter):
The return value of this formatter should be raw PNG data, *not*
base64 encoded.
"""
- format_type = Str('image/png')
+ format_type = Unicode('image/png')
- print_method = Str('_repr_png_')
+ print_method = ObjectName('_repr_png_')
class LatexFormatter(BaseFormatter):
@@ -503,9 +503,9 @@ class LatexFormatter(BaseFormatter):
The return value of this formatter should be a valid LaTeX equation,
enclosed in either ```$``` or ```$$```.
"""
- format_type = Str('text/latex')
+ format_type = Unicode('text/latex')
- print_method = Str('_repr_latex_')
+ print_method = ObjectName('_repr_latex_')
class JSONFormatter(BaseFormatter):
@@ -518,9 +518,9 @@ class JSONFormatter(BaseFormatter):
The return value of this formatter should be a valid JSON string.
"""
- format_type = Str('application/json')
+ format_type = Unicode('application/json')
- print_method = Str('_repr_json_')
+ print_method = ObjectName('_repr_json_')
class JavascriptFormatter(BaseFormatter):
@@ -534,9 +534,9 @@ class JavascriptFormatter(BaseFormatter):
The return value of this formatter should be valid Javascript code and
should *not* be enclosed in ```<script>``` tags.
"""
- format_type = Str('application/javascript')
+ format_type = Unicode('application/javascript')
- print_method = Str('_repr_javascript_')
+ print_method = ObjectName('_repr_javascript_')
FormatterABC.register(BaseFormatter)
FormatterABC.register(PlainTextFormatter)
27 IPython/core/interactiveshell.py
@@ -70,7 +70,7 @@
from IPython.utils.strdispatch import StrDispatch
from IPython.utils.syspathcontext import prepended_to_syspath
from IPython.utils.text import num_ini_spaces, format_screen, LSString, SList
-from IPython.utils.traitlets import (Int, Str, CBool, CaselessStrEnum, Enum,
+from IPython.utils.traitlets import (Int, CBool, CaselessStrEnum, Enum,
List, Unicode, Instance, Type)
from IPython.utils.warn import warn, error, fatal
import IPython.core.hooks
@@ -122,16 +122,16 @@ def get_default_colors():
return 'Linux'
-class SeparateStr(Str):
- """A Str subclass to validate separate_in, separate_out, etc.
+class SeparateUnicode(Unicode):
+ """A Unicode subclass to validate separate_in, separate_out, etc.
- This is a Str based trait that converts '0'->'' and '\\n'->'\n'.
+ This is a Unicode based trait that converts '0'->'' and '\\n'->'\n'.
"""
def validate(self, obj, value):
if value == '0': value = ''
value = value.replace('\\n','\n')
- return super(SeparateStr, self).validate(obj, value)
+ return super(SeparateUnicode, self).validate(obj, value)
class ReadlineNoRecord(object):
@@ -294,9 +294,9 @@ def _exiter_default(self):
"""
)
- prompt_in1 = Str('In [\\#]: ', config=True)
- prompt_in2 = Str(' .\\D.: ', config=True)
- prompt_out = Str('Out[\\#]: ', config=True)
+ prompt_in1 = Unicode('In [\\#]: ', config=True)
+ prompt_in2 = Unicode(' .\\D.: ', config=True)
+ prompt_out = Unicode('Out[\\#]: ', config=True)
prompts_pad_left = CBool(True, config=True)
quiet = CBool(False, config=True)
@@ -307,7 +307,7 @@ def _exiter_default(self):
readline_use = CBool(True, config=True)
readline_merge_completions = CBool(True, config=True)
readline_omit__names = Enum((0,1,2), default_value=2, config=True)
- readline_remove_delims = Str('-/~', config=True)
+ readline_remove_delims = Unicode('-/~', config=True)
# don't use \M- bindings by default, because they
# conflict with 8-bit encodings. See gh-58,gh-88
readline_parse_and_bind = List([
@@ -327,9 +327,9 @@ def _exiter_default(self):
# TODO: this part of prompt management should be moved to the frontends.
# Use custom TraitTypes that convert '0'->'' and '\\n'->'\n'
- separate_in = SeparateStr('\n', config=True)
- separate_out = SeparateStr('', config=True)
- separate_out2 = SeparateStr('', config=True)
+ separate_in = SeparateUnicode('\n', config=True)
+ separate_out = SeparateUnicode('', config=True)
+ separate_out2 = SeparateUnicode('', config=True)
wildcards_case_sensitive = CBool(True, config=True)
xmode = CaselessStrEnum(('Context','Plain', 'Verbose'),
default_value='Context', config=True)
@@ -1670,7 +1670,8 @@ def init_readline(self):
# Remove some chars from the delimiters list. If we encounter
# unicode chars, discard them.
delims = readline.get_completer_delims().encode("ascii", "ignore")
- delims = delims.translate(None, self.readline_remove_delims)
+ for d in self.readline_remove_delims:
+ delims = delims.replace(d, "")
delims = delims.replace(ESC_MAGIC, '')
readline.set_completer_delims(delims)
# otherwise we end up with a monster history after a while:
8 IPython/core/magic.py
@@ -3472,25 +3472,25 @@ def magic_precision(self, s=''):
In [1]: from math import pi
In [2]: %precision 3
- Out[2]: '%.3f'
+ Out[2]: u'%.3f'
In [3]: pi
Out[3]: 3.142
In [4]: %precision %i
- Out[4]: '%i'
+ Out[4]: u'%i'
In [5]: pi
Out[5]: 3
In [6]: %precision %e
- Out[6]: '%e'
+ Out[6]: u'%e'
In [7]: pi**10
Out[7]: 9.364805e+04
In [8]: %precision
- Out[8]: '%r'
+ Out[8]: u'%r'
In [9]: pi**10
Out[9]: 93648.047476082982
18 IPython/core/prefilter.py
@@ -36,7 +36,7 @@
from IPython.core.splitinput import split_user_input
from IPython.core import page
-from IPython.utils.traitlets import List, Int, Any, Str, CBool, Bool, Instance
+from IPython.utils.traitlets import List, Int, Any, Unicode, CBool, Bool, Instance
from IPython.utils.text import make_quoted_expr
from IPython.utils.autoattr import auto_attr
@@ -756,7 +756,7 @@ def check(self, line_info):
class PrefilterHandler(Configurable):
- handler_name = Str('normal')
+ handler_name = Unicode('normal')
esc_strings = List([])
shell = Instance('IPython.core.interactiveshell.InteractiveShellABC')
prefilter_manager = Instance('IPython.core.prefilter.PrefilterManager')
@@ -797,7 +797,7 @@ def __str__(self):
class AliasHandler(PrefilterHandler):
- handler_name = Str('alias')
+ handler_name = Unicode('alias')
def handle(self, line_info):
"""Handle alias input lines. """
@@ -812,7 +812,7 @@ def handle(self, line_info):
class ShellEscapeHandler(PrefilterHandler):
- handler_name = Str('shell')
+ handler_name = Unicode('shell')
esc_strings = List([ESC_SHELL, ESC_SH_CAP])
def handle(self, line_info):
@@ -839,7 +839,7 @@ def handle(self, line_info):
class MacroHandler(PrefilterHandler):
- handler_name = Str("macro")
+ handler_name = Unicode("macro")
def handle(self, line_info):
obj = self.shell.user_ns.get(line_info.ifun)
@@ -850,7 +850,7 @@ def handle(self, line_info):
class MagicHandler(PrefilterHandler):
- handler_name = Str('magic')
+ handler_name = Unicode('magic')
esc_strings = List([ESC_MAGIC])
def handle(self, line_info):
@@ -864,7 +864,7 @@ def handle(self, line_info):
class AutoHandler(PrefilterHandler):
- handler_name = Str('auto')
+ handler_name = Unicode('auto')
esc_strings = List([ESC_PAREN, ESC_QUOTE, ESC_QUOTE2])
def handle(self, line_info):
@@ -924,7 +924,7 @@ def handle(self, line_info):
class HelpHandler(PrefilterHandler):
- handler_name = Str('help')
+ handler_name = Unicode('help')
esc_strings = List([ESC_HELP])
def handle(self, line_info):
@@ -962,7 +962,7 @@ def handle(self, line_info):
class EmacsHandler(PrefilterHandler):
- handler_name = Str('emacs')
+ handler_name = Unicode('emacs')
esc_strings = List([])
def handle(self, line_info):
8 IPython/core/tests/test_magic.py
@@ -448,15 +448,15 @@ def doctest_precision():
In [1]: f = get_ipython().shell.display_formatter.formatters['text/plain']
In [2]: %precision 5
- Out[2]: '%.5f'
+ Out[2]: u'%.5f'
In [3]: f.float_format
- Out[3]: '%.5f'
+ Out[3]: u'%.5f'
In [4]: %precision %e
- Out[4]: '%e'
+ Out[4]: u'%e'
In [5]: f(3.1415927)
- Out[5]: '3.141593e+00'
+ Out[5]: u'3.141593e+00'
"""
15 IPython/frontend/qt/console/ipython_widget.py
@@ -21,7 +21,7 @@
from IPython.core.inputsplitter import IPythonInputSplitter, \
transform_ipy_prompt
from IPython.core.usage import default_gui_banner
-from IPython.utils.traitlets import Bool, Str, Unicode
+from IPython.utils.traitlets import Bool, Unicode
from frontend_widget import FrontendWidget
import styles
@@ -82,19 +82,18 @@ class IPythonWidget(FrontendWidget):
3. IPython: .error, .in-prompt, .out-prompt, etc
""")
-
- syntax_style = Str(config=True,
+ syntax_style = Unicode(config=True,
help="""
If not empty, use this Pygments style for syntax highlighting. Otherwise,
the style sheet is queried for Pygments style information.
""")
# Prompts.
- in_prompt = Str(default_in_prompt, config=True)
- out_prompt = Str(default_out_prompt, config=True)
- input_sep = Str(default_input_sep, config=True)
- output_sep = Str(default_output_sep, config=True)
- output_sep2 = Str(default_output_sep2, config=True)
+ in_prompt = Unicode(default_in_prompt, config=True)
+ out_prompt = Unicode(default_out_prompt, config=True)
+ input_sep = Unicode(default_input_sep, config=True)
+ output_sep = Unicode(default_output_sep, config=True)
+ output_sep2 = Unicode(default_output_sep2, config=True)
# FrontendWidget protected class variables.
_input_splitter_class = IPythonInputSplitter
2  IPython/frontend/terminal/embed.py
@@ -33,7 +33,7 @@
from IPython.frontend.terminal.interactiveshell import TerminalInteractiveShell
from IPython.frontend.terminal.ipapp import load_default_config
-from IPython.utils.traitlets import Bool, Str, CBool, Unicode
+from IPython.utils.traitlets import Bool, CBool, Unicode
from IPython.utils.io import ask_yes_no
2  IPython/frontend/terminal/interactiveshell.py
@@ -31,7 +31,7 @@
from IPython.utils.process import abbrev_cwd
from IPython.utils.warn import warn
from IPython.utils.text import num_ini_spaces
-from IPython.utils.traitlets import Int, Str, CBool, Unicode
+from IPython.utils.traitlets import Int, CBool, Unicode
#-----------------------------------------------------------------------------
# Utilities
7 IPython/parallel/apps/ipclusterapp.py
@@ -37,7 +37,8 @@
from IPython.core.profiledir import ProfileDir
from IPython.utils.daemonize import daemonize
from IPython.utils.importstring import import_item
-from IPython.utils.traitlets import Int, Unicode, Bool, CFloat, Dict, List
+from IPython.utils.traitlets import (Int, Unicode, Bool, CFloat, Dict, List,
+ DottedObjectName)
from IPython.parallel.apps.baseapp import (
BaseParallelApplication,
@@ -198,7 +199,7 @@ def _classes_default(self):
n = Int(2, config=True,
help="The number of engines to start.")
- engine_launcher_class = Unicode('LocalEngineSetLauncher',
+ engine_launcher_class = DottedObjectName('LocalEngineSetLauncher',
config=True,
help="The class for launching a set of Engines."
)
@@ -330,7 +331,7 @@ def _classes_default(self,):
delay = CFloat(1., config=True,
help="delay (in s) between starting the controller and the engines")
- controller_launcher_class = Unicode('LocalControllerLauncher',
+ controller_launcher_class = DottedObjectName('LocalControllerLauncher',
config=True,
help="The class for launching a Controller."
)
2  IPython/parallel/client/client.py
@@ -318,6 +318,8 @@ def __init__(self, url_or_file=None, profile='default', profile_dir=None, ipytho
if os.path.isfile(exec_key):
extra_args['keyfile'] = exec_key
else:
+ if isinstance(exec_key, unicode):
+ exec_key = exec_key.encode('ascii')
extra_args['key'] = exec_key
self.session = Session(**extra_args)
14 IPython/parallel/controller/hub.py
@@ -30,7 +30,7 @@
# internal:
from IPython.utils.importstring import import_item
from IPython.utils.traitlets import (
- HasTraits, Instance, Int, Unicode, Dict, Set, Tuple, CStr
+ HasTraits, Instance, Int, Unicode, Dict, Set, Tuple, Bytes, DottedObjectName
)
from IPython.parallel import error, util
@@ -111,10 +111,10 @@ class EngineConnector(HasTraits):
heartbeat (str): identity of heartbeat XREQ socket
"""
id=Int(0)
- queue=CStr()
- control=CStr()
- registration=CStr()
- heartbeat=CStr()
+ queue=Bytes()
+ control=Bytes()
+ registration=Bytes()
+ heartbeat=Bytes()
pending=Set()
class HubFactory(RegistrationFactory):
@@ -179,8 +179,8 @@ def _notifier_port_default(self):
monitor_url = Unicode('')
- db_class = Unicode('IPython.parallel.controller.dictdb.DictDB', config=True,
- help="""The class to use for the DB backend""")
+ db_class = DottedObjectName('IPython.parallel.controller.dictdb.DictDB',
+ config=True, help="""The class to use for the DB backend""")
# not configurable
db = Instance('IPython.parallel.controller.dictdb.BaseDB')
2  IPython/parallel/controller/scheduler.py
@@ -39,7 +39,7 @@
# local imports
from IPython.external.decorator import decorator
from IPython.config.loader import Config
-from IPython.utils.traitlets import Instance, Dict, List, Set, Int, Str, Enum
+from IPython.utils.traitlets import Instance, Dict, List, Set, Int, Enum
from IPython.parallel import error
from IPython.parallel.factory import SessionFactory
55 IPython/utils/tests/test_traitlets.py
@@ -22,12 +22,14 @@
# Imports
#-----------------------------------------------------------------------------
+import sys
from unittest import TestCase
from IPython.utils.traitlets import (
- HasTraits, MetaHasTraits, TraitType, Any, CStr,
- Int, Long, Float, Complex, Str, Unicode, TraitError,
- Undefined, Type, This, Instance, TCPAddress, List, Tuple
+ HasTraits, MetaHasTraits, TraitType, Any, CBytes,
+ Int, Long, Float, Complex, Bytes, Unicode, TraitError,
+ Undefined, Type, This, Instance, TCPAddress, List, Tuple,
+ ObjectName, DottedObjectName
)
@@ -700,13 +702,13 @@ class TestComplex(TraitTestBase):
_bad_values = [10L, -10L, u'10L', u'-10L', 'ten', [10], {'ten': 10},(10,), None]
-class StringTrait(HasTraits):
+class BytesTrait(HasTraits):
- value = Str('string')
+ value = Bytes('string')
-class TestString(TraitTestBase):
+class TestBytes(TraitTestBase):
- obj = StringTrait()
+ obj = BytesTrait()
_default_value = 'string'
_good_values = ['10', '-10', '10L',
@@ -725,11 +727,42 @@ class TestUnicode(TraitTestBase):
_default_value = u'unicode'
_good_values = ['10', '-10', '10L', '-10L', '10.1',
- '-10.1', '', u'', 'string', u'string', ]
+ '-10.1', '', u'', 'string', u'string', u""]
_bad_values = [10, -10, 10L, -10L, 10.1, -10.1, 1j,
[10], ['ten'], [u'ten'], {'ten': 10},(10,), None]
+class ObjectNameTrait(HasTraits):
+ value = ObjectName("abc")
+
+class TestObjectName(TraitTestBase):
+ obj = ObjectNameTrait()
+
+ _default_value = "abc"
+ _good_values = ["a", "gh", "g9", "g_", "_G", u"a345_"]
+ _bad_values = [1, "", u"", "9g", "!", "#abc", "aj@", "a.b", "a()", "a[0]",
+ object(), object]
+ if sys.version_info[0] < 3:
+ _bad_values.append(u"þ")
+ else:
+ _good_values.append(u"þ") # þ=1 is valid in Python 3 (PEP 3131).
+
+
+class DottedObjectNameTrait(HasTraits):
+ value = DottedObjectName("a.b")
+
+class TestDottedObjectName(TraitTestBase):
+ obj = DottedObjectNameTrait()
+
+ _default_value = "a.b"
+ _good_values = ["A", "y.t", "y765.__repr__", "os.path.join", u"os.path.join"]
+ _bad_values = [1, u"abc.€", "_.@", ".", ".abc", "abc.", ".abc."]
+ if sys.version_info[0] < 3:
+ _bad_values.append(u"t.þ")
+ else:
+ _good_values.append(u"t.þ")
+
+
class TCPAddressTrait(HasTraits):
value = TCPAddress()
@@ -781,7 +814,7 @@ class TestTupleTrait(TraitTestBase):
def test_invalid_args(self):
self.assertRaises(TypeError, Tuple, 5)
self.assertRaises(TypeError, Tuple, default_value='hello')
- t = Tuple(Int, CStr, default_value=(1,5))
+ t = Tuple(Int, CBytes, default_value=(1,5))
class LooseTupleTrait(HasTraits):
@@ -798,12 +831,12 @@ class TestLooseTupleTrait(TraitTestBase):
def test_invalid_args(self):
self.assertRaises(TypeError, Tuple, 5)
self.assertRaises(TypeError, Tuple, default_value='hello')
- t = Tuple(Int, CStr, default_value=(1,5))
+ t = Tuple(Int, CBytes, default_value=(1,5))
class MultiTupleTrait(HasTraits):
- value = Tuple(Int, Str, default_value=[99,'bottles'])
+ value = Tuple(Int, Bytes, default_value=[99,'bottles'])
class TestMultiTuple(TraitTestBase):
64 IPython/utils/traitlets.py
@@ -50,6 +50,7 @@
import inspect
+import re
import sys
import types
from types import (
@@ -951,30 +952,29 @@ def validate (self, obj, value):
except:
self.error(obj, value)
-
-class Str(TraitType):
+# We should always be explicit about whether we're using bytes or unicode, both
+# for Python 3 conversion and for reliable unicode behaviour on Python 2. So
+# we don't have a Str type.
+class Bytes(TraitType):
"""A trait for strings."""
default_value = ''
info_text = 'a string'
def validate(self, obj, value):
- if isinstance(value, str):
+ if isinstance(value, bytes):
return value
self.error(obj, value)
-class CStr(Str):
+class CBytes(Bytes):
"""A casting version of the string trait."""
def validate(self, obj, value):
try:
- return str(value)
+ return bytes(value)
except:
- try:
- return unicode(value)
- except:
- self.error(obj, value)
+ self.error(obj, value)
class Unicode(TraitType):
@@ -986,7 +986,7 @@ class Unicode(TraitType):
def validate(self, obj, value):
if isinstance(value, unicode):
return value
- if isinstance(value, str):
+ if isinstance(value, bytes):
return unicode(value)
self.error(obj, value)
@@ -999,6 +999,50 @@ def validate(self, obj, value):
return unicode(value)
except:
self.error(obj, value)
+
+
+class ObjectName(TraitType):
+ """A string holding a valid object name in this version of Python.
+
+ This does not check that the name exists in any scope."""
+ info_text = "a valid object identifier in Python"
+
+ if sys.version_info[0] < 3:
+ # Python 2:
+ _name_re = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*$")
+ def isidentifier(self, s):
+ return bool(self._name_re.match(s))
+
+ def coerce_str(self, obj, value):
+ "In Python 2, coerce ascii-only unicode to str"
+ if isinstance(value, unicode):
+ try:
+ return str(value)
+ except UnicodeEncodeError:
+ self.error(obj, value)
+ return value
+
+ else:
+ # Python 3:
+ isidentifier = staticmethod(lambda s: s.isidentifier())
+ coerce_str = staticmethod(lambda _,s: s)
+
+ def validate(self, obj, value):
+ value = self.coerce_str(obj, value)
+
+ if isinstance(value, str) and self.isidentifier(value):
+ return value
+ self.error(obj, value)
+
+class DottedObjectName(ObjectName):
+ """A string holding a valid dotted object name in Python, such as A.b3._c"""
+ def validate(self, obj, value):
+ value = self.coerce_str(obj, value)
+
+ if isinstance(value, str) and all(self.isidentifier(x) \
+ for x in value.split('.')):
+ return value
+ self.error(obj, value)
class Bool(TraitType):
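
The dotted-name check added above can be tried outside the traitlets machinery. The following is a minimal sketch for Python 3 only, assuming nothing beyond the standard library; `is_dotted_name` is a hypothetical helper name, not part of IPython. It mirrors the `isinstance` test plus `all(... for x in value.split('.'))` logic in `DottedObjectName.validate`, using `str.isidentifier()` as the Python 3 branch of `ObjectName` does.

```python
def is_dotted_name(value):
    """Return True if value is a str whose dot-separated parts are all identifiers.

    Hypothetical standalone helper; it mirrors the check in
    DottedObjectName.validate for the Python 3 code path only.
    """
    if not isinstance(value, str):
        return False
    # An empty part (from a leading, trailing, or doubled dot) is not an
    # identifier, so ".abc", "abc." and "a..b" are all rejected.
    return all(part.isidentifier() for part in value.split("."))


if __name__ == "__main__":
    for candidate in ["os.path.join", "y765.__repr__", ".abc", "abc.", "_.@", 1]:
        print(repr(candidate), is_dotted_name(candidate))
```

Note that `str.isidentifier()` also returns True for reserved words such as `class`, so a check of this kind says only that a name is syntactically well formed, not that it can be imported.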
13 IPython/zmq/kernelapp.py
@@ -30,7 +30,8 @@
)
from IPython.utils import io
from IPython.utils.localinterfaces import LOCALHOST
-from IPython.utils.traitlets import Any, Instance, Dict, Unicode, Int, Bool
+from IPython.utils.traitlets import (Any, Instance, Dict, Unicode, Int, Bool,
+ DottedObjectName)
from IPython.utils.importstring import import_item
# local imports
from IPython.zmq.heartbeat import Heartbeat
@@ -75,7 +76,7 @@ class KernelApp(BaseIPythonApplication):
flags = Dict(kernel_flags)
classes = [Session]
# the kernel class, as an importstring
- kernel_class = Unicode('IPython.zmq.pykernel.Kernel')
+ kernel_class = DottedObjectName('IPython.zmq.pykernel.Kernel')
kernel = Any()
poller = Any() # don't restrict this even though current pollers are all Threads
heartbeat = Instance(Heartbeat)
@@ -93,10 +94,10 @@ class KernelApp(BaseIPythonApplication):
# streams, etc.
no_stdout = Bool(False, config=True, help="redirect stdout to the null device")
no_stderr = Bool(False, config=True, help="redirect stderr to the null device")
- outstream_class = Unicode('IPython.zmq.iostream.OutStream', config=True,
- help="The importstring for the OutStream factory")
- displayhook_class = Unicode('IPython.zmq.displayhook.DisplayHook', config=True,
- help="The importstring for the DisplayHook factory")
+ outstream_class = DottedObjectName('IPython.zmq.iostream.OutStream',
+ config=True, help="The importstring for the OutStream factory")
+ displayhook_class = DottedObjectName('IPython.zmq.displayhook.DisplayHook',
+ config=True, help="The importstring for the DisplayHook factory")
# polling
parent = Int(0, config=True,
11 IPython/zmq/session.py
@@ -47,7 +47,8 @@
from IPython.config.configurable import Configurable, LoggingConfigurable
from IPython.utils.importstring import import_item
from IPython.utils.jsonutil import extract_dates, squash_dates, date_default
-from IPython.utils.traitlets import CStr, Unicode, Bool, Any, Instance, Set
+from IPython.utils.traitlets import (Bytes, Unicode, Bool, Any, Instance, Set,
+ DottedObjectName)
#-----------------------------------------------------------------------------
# utility functions
@@ -211,7 +212,7 @@ class Session(Configurable):
debug=Bool(False, config=True, help="""Debug output in the Session""")
- packer = Unicode('json',config=True,
+ packer = DottedObjectName('json',config=True,
help="""The name of the packer for serializing messages.
Should be one of 'json', 'pickle', or an import name
for a custom callable serializer.""")
@@ -225,7 +226,7 @@ def _packer_changed(self, name, old, new):
else:
self.pack = import_item(str(new))
- unpacker = Unicode('json', config=True,
+ unpacker = DottedObjectName('json', config=True,
help="""The name of the unpacker for unserializing messages.
Only used with custom functions for `packer`.""")
def _unpacker_changed(self, name, old, new):
@@ -238,7 +239,7 @@ def _unpacker_changed(self, name, old, new):
else:
self.unpack = import_item(str(new))
- session = CStr('', config=True,
+ session = Bytes(b'', config=True,
help="""The UUID identifying this session.""")
def _session_default(self):
return bytes(uuid.uuid4())
@@ -247,7 +248,7 @@ def _session_default(self):
help="""Username for the Session. Default is your system username.""")
# message signature related traits:
- key = CStr('', config=True,
+ key = Bytes(b'', config=True,
help="""execution key, for extra authentication.""")
def _key_changed(self, name, old, new):
if new:
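
Switching `session` and `key` from `CStr` to `Bytes` replaces a casting check with a strict one. The difference can be sketched as follows, assuming Python 3 and the standard library only; `validate_bytes` and `cast_bytes` are hypothetical stand-ins for the `validate` methods of `Bytes` and `CBytes`, and they raise `TypeError` where the real traits would call `self.error`.

```python
def validate_bytes(value):
    """Strict check: accept only bytes, as Bytes.validate does."""
    if isinstance(value, bytes):
        return value
    raise TypeError("expected bytes, got %r" % (value,))


def cast_bytes(value):
    """Casting check: try bytes(value), as CBytes.validate does."""
    try:
        return bytes(value)
    except Exception:
        raise TypeError("cannot cast %r to bytes" % (value,)) from None


if __name__ == "__main__":
    print(validate_bytes(b"abc"))         # passes through unchanged
    print(cast_bytes(bytearray(b"abc")))  # converted to bytes
    # On Python 3, bytes(3) builds three zero bytes rather than b"3",
    # so casting an int does not produce its decimal text.
    print(cast_bytes(3))
```

On Python 3, `bytes("abc")` without an encoding raises `TypeError`, so in this sketch neither function accepts a text string.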
16 docs/source/config/overview.txt
@@ -126,7 +126,7 @@ subclass::
# Sample component that can be configured.
from IPython.config.configurable import Configurable
- from IPython.utils.traitlets import Int, Float, Str, Bool
+ from IPython.utils.traitlets import Int, Float, Unicode, Bool
class MyClass(Configurable):
name = Unicode(u'defaultname', config=True)
@@ -150,8 +150,8 @@ After this configuration file is loaded, the values set in it will override
the class defaults anytime a :class:`MyClass` is created. Furthermore,
these attributes will be type checked and validated anytime they are set.
This type checking is handled by the :mod:`IPython.utils.traitlets` module,
-which provides the :class:`Str`, :class:`Int` and :class:`Float` types. In
-addition to these traitlets, the :mod:`IPython.utils.traitlets` provides
+which provides the :class:`Unicode`, :class:`Int` and :class:`Float` types.
+In addition to these traitlets, the :mod:`IPython.utils.traitlets` provides
traitlets for a number of other types.
.. note::
@@ -237,14 +237,14 @@ Sometimes, your classes will have an inheritance hierarchy that you want
to be reflected in the configuration system. Here is a simple example::
from IPython.config.configurable import Configurable
- from IPython.utils.traitlets import Int, Float, Str, Bool
+ from IPython.utils.traitlets import Int, Float, Unicode, Bool
class Foo(Configurable):
- name = Str('fooname', config=True)
+ name = Unicode(u'fooname', config=True)
value = Float(100.0, config=True)
class Bar(Foo):
- name = Str('barname', config=True)
+ name = Unicode(u'barname', config=True)
othervalue = Int(0, config=True)
Now, we can create a configuration file to configure instances of :class:`Foo`
@@ -253,7 +253,7 @@ and :class:`Bar`::
# config file
c = get_config()
- c.Foo.name = 'bestname'
+ c.Foo.name = u'bestname'
c.Bar.othervalue = 10
This class hierarchy and configuration file accomplishes the following:
@@ -460,7 +460,7 @@ Here are the main requirements we wanted our configuration system to have:
hierarchical data structures, namely regular attribute access
(``Foo.Bar.Bam.name``). Third, using Python makes it easy for users to
import configuration attributes from one configuration file to another.
- Forth, even though Python is dynamically typed, it does have types that can
+ Fourth, even though Python is dynamically typed, it does have types that can
be checked at runtime. Thus, a ``1`` in a config file is the integer '1',
while a ``'1'`` is a string.