sqs write_batch creates garbled messages #831

Open
sbailey opened this Issue · 10 comments

7 participants

@sbailey

When using sqs.queue.Queue.write_batch(), some of my messages come back garbled. I do not have this problem with normal Queue.write().

boto version 2.5.1

Example code:

import boto.sqs
sqs = boto.sqs.connect_to_region('us-west-1')
q = sqs.create_queue('blat')

messages = list()
messages.append( (1, 'echo "hello"', 0) )
messages.append( (2, 'echo "hi"', 0) )
messages.append( (3, 'echo "goodbye"', 0) )
messages.append( (4, 'echo "done"', 0) )    #- Garbled return message
messages.append( (5, 'done', 0) )           #- Garbled return message
messages.append( (6, 'don', 0) )            #- OK
messages.append( (7, 'echo "hello"', 0) )

q.write_batch(messages)

while True:
    m = q.read()
    if m is not None:
        q.delete_message(m)
        print m.get_body()
    else:
        break

Results:

FF:qdo $ python test_boto.py 
echo "goodbye"
v??
echo "hello"
echo "hello"
y?hv??
echo "hi"
don

Messages with "done" in the text come back garbled. In a separate test yesterday, messages with "goodbye" would fail, but I can't reproduce that today.

@garnaat
Owner

I am able to reproduce this. I'm looking into it now and will report back when I find more information.

@pasc

This is happening because boto.sqs.queue.Queue.read() assumes that the messages are base 64 encoded, and tries to decode them as follows. When decoding fails, the original value is returned:

def decode(self, value):
    try:
        value = base64.b64decode(value)
    except:
        boto.log.warning('Unable to decode message')
        return value
    return value

Out of all the sample strings, only 'done' and 'echo "done"' are valid base-64 strings:

>>> base64.b64decode('echo "done"')
'y\xc8hv\x89\xde'
>>> base64.b64decode('done')
'v\x89\xde'

This matches the garbled output above: the decoded value is returned instead of the original value.
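To check every sample string from the report in one pass, here is a minimal Python 3 sketch. It only uses the standard library, and `lenient_decode` is a hypothetical helper that imitates the try/except fallback quoted above; it is not boto's code.

```python
import base64
import binascii

# The message bodies from the original report.
samples = ['echo "hello"', 'echo "hi"', 'echo "goodbye"',
           'echo "done"', 'done', 'don']


def lenient_decode(value):
    """Imitate the fallback above: decoded bytes, or the input on failure."""
    try:
        # By default b64decode discards characters outside the base64
        # alphabet (spaces, quotes) before checking length and padding.
        return base64.b64decode(value)
    except (binascii.Error, ValueError):
        return value


results = {s: lenient_decode(s) for s in samples}
for original, decoded in results.items():
    print(repr(original), '->', repr(decoded))
```

Only the two strings containing "done" are left with a multiple of four alphabet characters once spaces and quotes are discarded, so only those two decode "successfully" into bytes.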

Fixing this is going to be a pain. If people are already using the existing API they are either encoding their values or sometimes losing messages. Maybe we should just update the documentation to indicate that these messages should be base-64 encoded already?

@pasc

I have also just realised the same issue exists in the existing integration test as none of the test messages are valid base64-encoded messages. If a test is done with "This is message 100" instead of something like "This is message 1" then the same error will occur.
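A rough way to see which test messages of that shape would be affected, assuming the bodies follow the "This is message N" pattern mentioned above. This is a Python 3 sketch using only the standard library, and `decodes_as_base64` is a hypothetical helper name.

```python
import base64
import binascii


def decodes_as_base64(text):
    """Return True if text happens to pass a lenient base64 decode."""
    try:
        base64.b64decode(text)
    except (binascii.Error, ValueError):
        return False
    return True


# Message numbers whose body would be silently "decoded" on read.
colliding = [n for n in range(1, 1001)
             if decodes_as_base64('This is message %d' % n)]
print(colliding[0], colliding[-1], len(colliding))
```

With the spaces discarded, "This is message" contributes 13 alphabet characters, so only three-digit numbers bring the total to 16, a multiple of four. A test that stops at message 99 never hits the problem.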

@garnaat
Owner

What's happening is that the default message class associated with a Queue is the Message class which always encodes the payload using base64 and then decodes it when read from the queue. Since the original messages are not written using a Message object, they are not encoded prior to writing to SQS. And then the base64 decode fails.

One way to get around this is to set the Queue message class to be RawMessage:

from boto.sqs.message import RawMessage
q.set_message_class(RawMessage)

Then, the messages will be read correctly. We could document that in the method.

Another way to fix it would be to do the base64 encoding in the batch write method. Or, we could require that the messages be base64 encoded and document that.

The base64 encoding is kind of an anachronism. In the early days of the SQS service, any non-ASCII characters in a message would cause problems so the recommendation was to base64-encode all messages. So, that became the default behavior of boto. Somewhere along the way, however, this changed on the service side. The range of acceptable characters is much more accommodating now and the default behavior of encoding the content probably no longer makes sense. But then we start worrying about backwards-compatibility. So, it remains.

I think the best solution is to document the fact that the messages are not being encoded in any way by boto and that the user must remember to set the appropriate message class on the queue when reading the messages.

@kopertop
Owner
@sbailey

From the user standpoint, the asymmetry between Queue.write() and Queue.write_batch() is confusing. Here's an updated example:

import boto.sqs
import base64
sqs = boto.sqs.connect_to_region('us-west-1')
q = sqs.create_queue('blat')

commands = list()
commands.append( 'echo "hello"' )
commands.append( 'echo "goodbye"' )
commands.append( 'echo "done"' )    #- Garbled return message
commands.append( 'done' )           #- Garbled return message
commands.append( 'don' )            #- OK
commands.append( 'echo "hello"' )

def read_queue(q):
    while True:
        m = q.read()
        if m is not None:
            q.delete_message(m)
            print m.get_body()
        else:
            break

print "Let's try using batch write"
batch = list()
for i, cmd in enumerate(commands):
    batch.append( (i+1, cmd, 0) )   #- This fails for some strings
    # batch.append( (i+1, base64.b64encode(cmd), 0) )  #- This works
q.write_batch(batch)
read_queue(q)

print "Let's try using single write"
for cmd in commands:
    q.write(q.new_message(cmd))  #- This works
read_queue(q)

The same string which works with q.write(q.new_message(cmd)) may not work when it is put in a list and passed to write_batch. Shouldn't write_batch just be a faster version of write, with the same encoding requirements for both? And as a non-expert user, I would certainly prefer the default to "just work" on Python strings of ASCII characters without requiring the user to do any special encoding/decoding/set_message_class stuff. But as mentioned above, whatever is done, the most important thing is to document it clearly.

@kopertop
Owner
@spulec spulec referenced this issue in spulec/moto
Closed

SQS doesn't round trip messages #4

@mholt

Ah. This explains why I'm having troubles... I'm publishing messages to the queue in batches of 10 using Queue.write_batch(), but I'm consuming them one at a time using read(). I figure, from reading this issue, that read() is decoding, but write_batch() isn't encoding... hence why most of my messages are garbled.

Are there plans to standardize this behavior? It probably ought to happen soon.

@saintsjd

Hi all. Thank you for the helpful discussion. I just ran into this issue today using boto 2.23.0

The solution to my issue seems to be:
from boto.sqs.message import RawMessage
q.set_message_class(RawMessage)

I no longer see a recommendation from Amazon that messages be base64 encoded. Our .write_batch() and other Amazon SDKs, such as the C# one, appear not to encode by default. This incompatibility with the rest of our boto SDK would seem to be very confusing to users.

Is it time we do away with the b64 encoding by default?

This would standardize our own SDK and make us more compatible with other SDKs. We would need to document this change very clearly and provide a clear path for backwards compatibility.

@salopge salopge referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@jessedhillon

IMO the Queue.write_batch method should create an instance of Queue.message_class for each entry and then call Message.get_encoded_body, passing the return of that to SQSConnection.send_message_batch
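A minimal sketch of that idea, written in Python 3 against stand-in classes rather than boto itself. `SketchMessage` and `encode_batch` are hypothetical names for illustration only, and the (id, body, delay_seconds) tuple layout is assumed from the examples earlier in this thread.

```python
import base64


class SketchMessage:
    """Stand-in for a base64-encoding message class. Not boto's Message."""

    def __init__(self, body=''):
        self.body = body

    def get_encoded_body(self):
        # Encode the body the way a single-message write would.
        return base64.b64encode(self.body.encode('utf-8')).decode('ascii')

    @staticmethod
    def decode(value):
        # What the reading side applies to every received body.
        return base64.b64decode(value).decode('utf-8')


def encode_batch(message_class, entries):
    """Run each (id, body, delay_seconds) tuple through the message class."""
    return [(msg_id, message_class(body).get_encoded_body(), delay)
            for msg_id, body, delay in entries]


batch = [(1, 'echo "done"', 0), (2, 'done', 0), (3, 'don', 0)]
encoded = encode_batch(SketchMessage, batch)
round_tripped = [SketchMessage.decode(body) for _, body, _ in encoded]
print(round_tripped)
```

Because the batch path now applies the same encoding as the message class, the reader's decode step recovers every body unchanged, including the ones that previously came back garbled. With a raw (non-encoding) message class the same hook would pass bodies through untouched, so writes and reads would stay symmetric either way.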
