basestring fixer is extremely disruptive #127

eevee · 2014-12-13T02:27:06Z

futurize --stage2 replaces all uses of basestring with str (and also makes str always be the text type), which breaks isinstance checks on native strings on Python 2.

If there were a string_types available, maybe futurize could use that instead :)

The text was updated successfully, but these errors were encountered:

QuLogic · 2014-12-13T04:15:59Z

Wouldn't stage 2 imply that you're properly following Python 3 recommendations, and correctly segregating str from bytes? You wouldn't test for basestring then, because they shouldn't be mixed.

eevee · 2014-12-13T07:26:00Z

The code in question is a library that explicitly (and sloppily, and incorrectly... and unfixably) allows both native strings and text strings. For Python 2 that means basestring; for Python 3 it means only str.

I ran stage 2 because it includes a lot of fairly mundane fixes that are necessary for running on Python 3, e.g. fixing iterables that became lazier or imports that moved around. At worst, those just make the code slower. It's only a handful of fixes like basestring (and long, come to think of it) that actively sabotages Python 2 code. And because both basestring and unicode are converted to str, it's difficult to go through the resulting code and figure out what the original intent was.

QuLogic · 2014-12-13T07:33:09Z

Instead of stage 2, you could specify fixers explicitly, with -f. You can list them with -l, or for just stage 2 -l -2. Then you could probably skip the basestring change.

eevee · 2014-12-13T07:35:39Z

I'll definitely try that next time, thanks.

Seems a shame to not make any distinction between fixers that will break working Python 2 code. Perhaps a stage 1½ is in order. :)

edschofield · 2015-02-03T03:04:34Z

Thanks for your comments. @eevee: I see your point. It would be better if stage 2 didn't cause previously working code to break on Py2.

If stage 2 imported basestring from past.builtins, the resulting code would no longer break on Py2 but it would be unlikely to run on Py3. Would this be an improvement? I think it might -- because, as you say, it would preserve the original intent of the code.

Do you have any reason to prefer something like six's string_types tuple over the basestring object in past.builtins?

@QuLogic: would you agree to this change to fix_basestring.py?

The recent change in fix_division_safe.py was somewhat similar.

eevee · 2015-02-06T05:09:42Z

I'm pretty sure I use basestring for the same reason as 99% of anyone uses basestring. I have a Python 2 API that wants text, and I would love for it to be unicode, but the world is a cold and bitter place full of legacy concerns and a bad default string type, so I accept str as well. In Python 2 you can largely get away with this thanks to the nearly-identical APIs and silent conversions.

I don't think I've ever seen basestring used in any way other than to test whether a value is text-ish, and Python 3 only has one text-ish type: str. Thus I've always changed basestring to six.string_types when porting a codebase.

past's basestring would work as a substitute (and be an improvement over producing code that doesn't work at all), but it would also preserve semantics that don't really make any sense in Python 3. Importing past is probably against the spirit of a tool called futurize, anyway. :)

I can think of one other case where basestring is useful: for detecting whether a value is a container but not string-like. But that's only necessary on Python 3 anyway, since str and unicode don't have __iter__ on Python 2.

eestrada · 2015-03-12T01:31:01Z

Wouldn't it make more sense to replace isinstance(var, basestring) with:

from builtins import bytes, str

isinstance(var, (bytes, str))

The intention is clear, works in both versions, and doesn't require anything from past. As has been pointed out, it still may break since bytes and str can't mix in Py3 as they can in Py2, but at least it would continue to work in Py2 until the code base could be updated to work with Py3.

Perhaps this could be added as an alternative fixer for basestring.

…and #156)

edschofield · 2015-07-25T13:46:09Z

@eevee and @eestrada : thanks for your input.

I've taken the conservative route to fixing this in v0.15.0 by importing the basestring class from past.builtins. This preserves the same behaviour on Py2.x and at least marks the code as needing attention on Py3 through the reference to past. I suppose the best translation of basestring on Py3 will depend on the circumstances: sometimes str or bytes, other times just str.

The main reason I chose this versus replacing basestring with the sensible suggestion of (bytes, str) was robustness and simplicity of implementation of the libfuturize fixer. I'm open to changing it to (bytes, str) in the future if I see enough of a need.

I believe the issue is resolved now, but, if not, please feel free to reopen it.

eevee · 2015-07-28T23:50:23Z

I'm cool with that, thank you :)

edschofield added a commit that referenced this issue Jul 25, 2015

Enable conservative basestring fixer from past.builtins (issues #127 …

c865899

…and #156)

edschofield closed this as completed Jul 25, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

basestring fixer is extremely disruptive #127

basestring fixer is extremely disruptive #127

eevee commented Dec 13, 2014

QuLogic commented Dec 13, 2014

eevee commented Dec 13, 2014

QuLogic commented Dec 13, 2014

eevee commented Dec 13, 2014

edschofield commented Feb 3, 2015

eevee commented Feb 6, 2015

eestrada commented Mar 12, 2015

edschofield commented Jul 25, 2015

eevee commented Jul 28, 2015

basestring fixer is extremely disruptive #127

basestring fixer is extremely disruptive #127

Comments

eevee commented Dec 13, 2014

QuLogic commented Dec 13, 2014

eevee commented Dec 13, 2014

QuLogic commented Dec 13, 2014

eevee commented Dec 13, 2014

edschofield commented Feb 3, 2015

eevee commented Feb 6, 2015

eestrada commented Mar 12, 2015

edschofield commented Jul 25, 2015

eevee commented Jul 28, 2015