Skip to content

webob1.2b2 breaks appengine SDK 1.6.0 (python 2.7 runtime) #14

Closed
twillis opened this Issue Dec 9, 2011 · 20 comments

5 participants

@twillis
twillis commented Dec 9, 2011

dev_appserver_blobstore.py is referencing webob.byterange.Range.parse_bytes which I guess was removed in webob1.2?

So here's my concern. Not sure if you agree or disagree.

the python2.7 runtime is currently experimental but everyone is hoping to use it when it "goes gold" I'm already stuck on pyramid 1.0 on the 2.5 runtime due to appengine only providing and depending on web0.9

getting on 2.7 gets me back in line with the latest and greatest pyramid, until pyramid 1.3 when it requires >webob1.2dev. It's going to suck, I want to play with the cool kids.

I would think it would suck for pyramid to have to state that "we just released another cool version, but you can't use it on appengine, you have to stick with 1.2"

Anyway, honestly I think this is an appengine bug, but I bet it would take them longer to decide whether to reply to a bug report than it would for you add parse_bytes back in.

thanks for reading.

@maluke
maluke commented Dec 9, 2011

Can you please provide a traceback with the error? py2.7 runtime on gae has support for library versions, so it should be possible to switch to 1.2 in production once they add it (after we make the 1.2 final release of course).

@twillis
twillis commented Dec 9, 2011

Sure... as you can see, it happens before the app being tested is even loaded, at least that's how I interpret it.

cd ~/projects/hydrant_27_bug/ ; ./bin/nosetests -x --with-gae --gae-application=parts/hydrant-app/ src/hydrant-app/hydrantapp/tests/test_newapi.py
WARNING:root:The rdbms API is not available because the MySQLdb library could not be loaded.
Traceback (most recent call last):
File "./bin/nosetests", line 56, in
nose.run_exit()
File "/home/twillis/projects/hydrant_27_bug/eggs/nose-1.1.2-py2.7.egg/nose/core.py", line 118, in init
*extra_args)
File "/usr/lib/python2.7/unittest/main.py", line 94, in init
self.parseArgs(argv)
File "/home/twillis/projects/hydrant_27_bug/eggs/nose-1.1.2-py2.7.egg/nose/core.py", line 135, in parseArgs
self.config.configure(argv, doc=self.usage())
File "/home/twillis/projects/hydrant_27_bug/eggs/nose-1.1.2-py2.7.egg/nose/config.py", line 338, in configure
self.plugins.configure(options, self)
File "/home/twillis/projects/hydrant_27_bug/eggs/nose-1.1.2-py2.7.egg/nose/plugins/manager.py", line 271, in configure
cfg(options, config)
File "/home/twillis/projects/hydrant_27_bug/eggs/nose-1.1.2-py2.7.egg/nose/plugins/manager.py", line 94, in call
return self.call(
arg, *kw)
File "/home/twillis/projects/hydrant_27_bug/eggs/nose-1.1.2-py2.7.egg/nose/plugins/manager.py", line 162, in simple
result = meth(
arg, **kw)
File "/home/twillis/projects/hydrant_27_bug/eggs/NoseGAE-0.2.0-py2.7.egg/nosegae.py", line 84, in configure
from google.appengine.tools import dev_appserver
File "/home/twillis/projects/hydrant_27_bug/parts/google_appengine/google/appengine/tools/dev_appserver.py", line 137, in
from google.appengine.tools import dev_appserver_blobstore
File "/home/twillis/projects/hydrant_27_bug/parts/google_appengine/google/appengine/tools/dev_appserver_blobstore.py", line 105, in
ParseRange = byterange.Range.parse_bytes
AttributeError: type object 'Range' has no attribute 'parse_bytes'

@maluke
maluke commented Dec 9, 2011

Here's the context in the SDK

_BYTESRANGE_IS_EXCLUSIVE = not hasattr(byterange.Range, 'serialize_bytes')

if _BYTESRANGE_IS_EXCLUSIVE:

  ParseRange = byterange.Range.parse_bytes

That's quite odd, (and the comments are helpfully stripped) but what I think they are trying to do is to support with very old broken WebOb versions that parsed and acted on byte-ranges in a non-consistent manner (the range indexes were treated as exclusive or not in different places).

The quality of code in the GAE Python SDK is horrible, but this is a new low. I understand why you'd suggest to try a quick workaround like this, but honestly, I don't think it's a good idea. The problem is that is might not even help in the first place, it might break in other ways now or later and we're not even sure what the hell are they doing.

Please do file a bug in the GAE tracker, because what they are doing there is inexcusable, they are apparently maintaining two code paths (both dependent on a private API) where one of the paths is to support buggy code removed from webob about 3 years ago and the condition on which the codepaths are chosen is a check for existence of some freaking attribute? How did that shit pass code review and make it into a public SDK?

Also, current ranges parser is included below, if someone from the GAE team reading this, please, just copy it into your code and mutilate it as you wish over there. (The old parser is a bit of a monster which is another reason I don't want it to go back 4a69f25#L3L94 )

    @classmethod
    def parse(cls, header):
        """
            Parse the header; may return None if header is invalid
        """
        m = _rx_range.match(header or '')
        if not m:
            return None
        start, end = m.groups()
        if not start:
            return cls(-int(end), None)
        start = int(start)
        if not end:
            return cls(start, None)
        end = int(end) + 1 # return val is non-inclusive
        if start >= end:
            return None
        return cls(start, end)

_rx_range = re.compile('bytes=(\d*)-(\d*)')
@twillis
twillis commented Dec 9, 2011

Thanks for the explanation sir. I have filed an issue for appengine.

http://code.google.com/p/googleappengine/issues/detail?id=6507

I may bring it up on the appengine mailing list as well.

@ianb
ianb commented Dec 9, 2011
@maluke
maluke commented Dec 12, 2011

This seems to be a devserver issue, at least the code in the reported traceback should not be touched (or available to) deployed apps.

@bquinlan-google

I've included the code without the comment stripping at the end of this issue and I agree that App Engine should have just done its own range parsing.

Also, we do use parse_bytes in production. See google/appengine/ext/webapp/blobstore_handlers.py

# Before WebOb 1.0, its byterange API was buggy: the parse_bytes()
# function would return inclusive ranges (matching the HTTP spec, but
# not the expectations of Python users), and the ContentRange.parse
# function would return a stop value that was one *lower* than the
# value found in the header, i.e., the header "0-9/20" would be
# represented by start=0, stop=8 (!), length=20.
#
# WebOb 1.0 and later fix this by always returning (Pythonic)
# half-open ranges, so that e.g. "0-9" is represented by the tuple (0,
# 10) and "0-9/20" by start=0, stop=10, length=20.
#
# To make our code easier to read, we define some helper functions
# here that always use Pythonic half-open ranges.
#
# Because WebOb < 1.1 doesn't have a __version__ attribute, we
# recognize the buggy versions by the presence of the serialize_bytes
# method in the webob.byterange module.
_BYTESRANGE_IS_EXCLUSIVE = not hasattr(byterange.Range, 'serialize_bytes')

if _BYTESRANGE_IS_EXCLUSIVE:  # WebOb >= 1.0

  ParseRange = byterange.Range.parse_bytes

  MakeContentRange = byterange.ContentRange

  def GetContentRangeStop(content_range):
    return content_range.stop

  # Alas, WebOb >= 1.0 isn't perfect for us.  Its ContentRange.parse()
  # doesn't like "0-9/5" (i.e. stop > length).  Unfortunately we rely
  # on being able to parse this.  Fortunately we can temporarily
  # monkey-patch the internal function byterange._is_content_range_valid
  # to accept this.  (See the WebOb source code for more details.)

  _orig_is_content_range_valid = byterange._is_content_range_valid

  def _new_is_content_range_valid(start, stop, length, response=False):
    return _orig_is_content_range_valid(start, stop, length, False)

  def ParseContentRange(content_range_header):
    # For the duration of this function, we monkey-patch
    # _is_content_range_valid to always accept an out-of-range
    # stop value, by forcing the response argument to False.
    try:
      byterange._is_content_range_valid = _new_is_content_range_valid
      return byterange.ContentRange.parse(content_range_header)
    finally:
      byterange._is_content_range_valid = _orig_is_content_range_valid

else:  # WebOb < 1.0

  def ParseRange(range_header):
    # Need to hide stdout from calls to the byterange.parse_bytes
    # because it prints to stdout on error.
    original_stdout = sys.stdout
    sys.stdout = cStringIO.StringIO()
    try:
      parse_result = byterange.Range.parse_bytes(range_header)
    finally:
      sys.stdout = original_stdout
    if parse_result is None:
      return None
    else:
      ranges = []
      for start, end in parse_result[1]:
        if end is not None:
          end += 1  # Convert inclusive range to exclusive range.
        ranges.append((start, end))
      return parse_result[0], ranges

  class _FixedContentRange(byterange.ContentRange):
    # The old WebOb code is so broken that it cannot parse a
    # Content-Range header whose start and stop values are the same
    # (e.g. "1-1/10"), since it subtracts 1 from the parsed end value
    # and tries to pass that to its own constructor, which rejects
    # stop < start.  We work around this by creating a subclass whose
    # constructor doesn't check the argument ranges.

    def __init__(self, start, stop, length):
      # Bypass the base constructor, its asserts are broken.
      self.start = start
      self.stop = stop
      self.length = length

  # Amazingly, the difference between the old and the new WebOb
  # Content-Range header parsing code is *2*.  For example, the string
  # "0-9/20" used to be represented by start=0, stop=8, length=20,
  # wheras the new code represents this as start=0, stop=10,
  # length=20.  We like the new representation better, so our custom
  # API translates between the new and the old representation by
  # subtracting or adding 2.

  def MakeContentRange(start, stop, length):
    if stop is not None:
      stop -= 2
    content_range = _FixedContentRange(start, stop, length)
    return content_range

  def GetContentRangeStop(content_range):
    stop = content_range.stop
    if stop is not None:
      stop += 2
    return stop

  def ParseContentRange(content_range_header):
    return _FixedContentRange.parse(content_range_header)
@maluke
maluke commented Dec 13, 2011

Yup, pre-1.0 range parsing was very buggy and basically invalid (in some cases the range serving bugs did cancel out the off-by-one errors, so it sometimes it appeared to work I guess).

@maluke
maluke commented Mar 3, 2012

Is this still a problem with the webob 1.2b3 and current gae sdk?

@twillis
twillis commented Mar 3, 2012

seems to still be a problem. the sdk still comes with 0.9, I've been patching it with 1.1.1 locally even as late as sdk 1.6.3 so that stuff runs smoothly on the 2.7 locally. I just tried it with 1.2b3 and devappserver won't even start up

(webob_new) twillis@twillis-MacBookPro:~/projects/buttercup_webob$ ./bin/devappserver parts/hydrant-app
WARNING  2012-03-03 14:41:44,156 rdbms_mysqldb.py:74] The rdbms API is not available because the MySQLdb library could not be loaded.
Traceback (most recent call last):
  File "./bin/devappserver", line 26, in <module>
    dev_appserver.run_file('lib/google_appengine/dev_appserver.py', locals())
  File "/home/twillis/projects/buttercup_webob/lib/google_appengine/dev_appserver.py", line 121, in run_file
    execfile(script_path, globals_)
  File "/home/twillis/projects/buttercup_webob/lib/google_appengine/google/appengine/tools/dev_appserver_main.py", line 159, in <module>
    from google.appengine.tools import dev_appserver
  File "/home/twillis/projects/buttercup_webob/lib/google_appengine/google/appengine/tools/dev_appserver.py", line 141, in <module>
    from google.appengine.tools import dev_appserver_blobstore
  File "/home/twillis/projects/buttercup_webob/lib/google_appengine/google/appengine/tools/dev_appserver_blobstore.py", line 105, in <module>
    ParseRange = byterange.Range.parse_bytes
AttributeError: type object 'Range' has no attribute 'parse_bytes'
(webob_new) twillis@twillis-MacBookPro:~/projects/buttercup_webob$ 

this is kind of not awesome since they closed my issue as fixed with no comments what so ever. Supposedly as a premier account holder I have some developer advocate, I should try that route.

@bquinlan-google

Hey guys,

Are you talking about http://code.google.com/p/googleappengine/issues/detail?id=2788 ?
I closed the issue because the production Python 2.7 runtime includes WebOb 1.1, which is the latest released version.

There is still an issue with correct SDK support that we are working on (it is somewhat complex because WebOb is used internally by the SDK and we don't know what runtime the user is using until late in the startup process).

It is not likely that we will ever go out of our way to support beta versions of third-party libraries - it would just be too much work to keep track of them all.

Also, it might be more appropriate to move this discussion to an App Engine-related forum since this isn't really a WebOb problem (except in so far as as WebOb makes significant incompatible changes between versions, which makes it harder to support).

I hope that helps.

Cheers,
Brian

@twillis
twillis commented Mar 5, 2012

Hey Brian, I do ask on the email list and related issues in the tracker , and I don't get responses usually. At least discussing here I get an answer. :)

@maluke
maluke commented Mar 6, 2012

I don't think the issue is about supporting the beta version of WebOb, it's about removing dependency on a private interface. And I don't think it's fair to blame WebOb for fixing the bugs and doing cleanups, it's the GAE code that's fragile and incorrect here.

@maluke maluke closed this Mar 6, 2012
@bquinlan-google

I didn't know that webob.byterange.Range.parse_bytes was a private interface. Is there some sort of signal that I missed that indicated that it was?

@maluke
maluke commented Mar 6, 2012

I guess it's not communicated clearly, but all the parsing infrastructure is a private API, the public API is the Request / Response constructors and various access methods, but not the constructors / parsers for headers.

@maluke
maluke commented Mar 6, 2012

And um.. checking for presence of an attribute is definitely not an API (which is what broke here).

@bquinlan-google

The only reason that the attribute was checked for was because WebOb < 1.1 didn't have a version attribute so we needed some way to decide what version we are working with (because of incompatible semantics).

But the history isn't very important - we are in the process of removing all internal dependancies on webob so this kind of problem won't occur in the future.

@maluke
maluke commented Mar 6, 2012

What should have been done in the first place is the parser for the header should have been copied into the GAE codebase. It would take less code that this juggling around, it would take far less developer time and it wouldn't create this problem.

@twillis
twillis commented Mar 7, 2012

Relevant appengine issue. status: started woohoo

http://code.google.com/p/googleappengine/issues/detail?id=6507

@akmistry

Update: Range.parse_bytes is no longer used in the dev_appserver as of 1.6.4. appengine.ext.webapp.blobstore_handlers still uses it, but that will be removed in the next release (1.6.5).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.