Skip to content
This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Conversation

@cclauss
Copy link
Contributor

@cclauss cclauss commented Aug 23, 2017

@martinpopel
Copy link
Contributor

@cclauss: Have you benchmarked this edit? In the original code the condition (if six.PY2) is evaluated just once, but in this PR the condition/exception is evaluated in each method call and there may be many such calls.

@cclauss
Copy link
Contributor Author

cclauss commented Aug 23, 2017

@martinpopel Your thought based on these benchmarks on Python 2 and Python 3?

Call each a million times on Python 2.7.13 (default, Jul 18 2017, 09:17:00) 
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Function               With str      With unicode
old native_to_unicode: 1.90199708939 0.395492076874
new native_to_unicode: 1.53925609589 0.414881944656
old unicode_to_native: 0.75088596344 0.701120853424
new unicode_to_native: 0.85029101371 0.793823003769

Call each a million times on Python 3.6.2 (default, Jul 17 2017, 16:44:45) 
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Function               With str           With unicode
old native_to_unicode: 0.1422363230085466 0.11030975799076259
new native_to_unicode: 0.7890558139770292 0.7672536290192511
old unicode_to_native: 0.1131693619827274 0.10763867301284336
new unicode_to_native: 0.1800369980046525 0.17342810999252833
#!/usr/bin/env python2  # change between python2, python 3, pypy, pypy3
# -*- coding: utf-8 -*-

from __future__ import print_function
import six
import sys
import timeit


def native_to_unicode_py2(s):
    """Python 2: transform native string to Unicode."""
    return s if isinstance(s, unicode) else s.decode("utf8")


# Conversion between Unicode and UTF-8, if required (on Python2)
if six.PY2:
    native_to_unicode = native_to_unicode_py2
    unicode_to_native = lambda s: s.encode("utf-8")
else:
    # No conversion required on Python3
    native_to_unicode = lambda s: s
    unicode_to_native = lambda s: s


def new_native_to_unicode(s):
    """Transform native string to Unicode."""
    try:  # Python 2
        return s if isinstance(s, unicode) else s.decode("utf8")
    except NameError:  # Python 3: unicode() was dropped
        return s


def new_unicode_to_native(s):
    """Transform Unicode to native string."""
    return s.encode("utf-8") if six.PY2 else s


print('Call each a million times on Python', sys.version)
print('Function               With str      With unicode')
print('old native_to_unicode:',
      timeit.timeit(
          "native_to_unicode('string')",
          setup="from __main__ import native_to_unicode"),
      timeit.timeit(
          "native_to_unicode(u'Unicöde')",
          setup="from __main__ import native_to_unicode"))
print('new native_to_unicode:',
      timeit.timeit(
          "new_native_to_unicode('string')",
          setup="from __main__ import new_native_to_unicode"),
      timeit.timeit(
          "new_native_to_unicode(u'Unicöde')",
          setup="from __main__ import new_native_to_unicode"))
print('old unicode_to_native:',
      timeit.timeit(
          "unicode_to_native('string')",
          setup="from __main__ import unicode_to_native"),
      timeit.timeit(
          "unicode_to_native(u'Unicöde')",
          setup="from __main__ import unicode_to_native"))
print('new unicode_to_native:',
      timeit.timeit(
          "new_unicode_to_native('string')",
          setup="from __main__ import new_unicode_to_native"),
      timeit.timeit(
          "new_unicode_to_native(u'Unicöde')",
          setup="from __main__ import new_unicode_to_native"))

@vthorsteinsson
Copy link
Contributor

As the author of the original PR that is being modified here, I'm a bit sceptical about this change, as it degrades performance significantly under Python3. However, to avoid the lambda assignment to a variable, a simple change like the following would be easy to do:

if six.PY2:
  def native_to_unicode(s): return s if isinstance(s, unicode) else s.decode("utf8")
  def unicode_to_native(s): return s.encode("utf-8")
else:
  # No conversion required on Python >= 3
  def native_to_unicode(s): return s
  def unicode_to_native(s): return s

@cclauss cclauss closed this Aug 23, 2017
@cclauss cclauss reopened this Aug 23, 2017
Copy link
Contributor

@lukaszkaiser lukaszkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's get it in, but after Google code-style changes it won't look much better than before.

@lukaszkaiser lukaszkaiser merged commit 860fe0a into tensorflow:master Aug 25, 2017
@cclauss cclauss deleted the patch-2 branch August 25, 2017 03:09
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants