GAX API hangs during (second) 'publish' #1869

Closed
tseaver opened this issue Jun 20, 2016 · 19 comments
Labels
api: pubsub Issues related to the Pub/Sub API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

@tseaver
Contributor

tseaver commented Jun 20, 2016

/cc @bjwatson

======================================================================
ERROR: test_message_pull_mode_e2e (pubsub.TestPubsub)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/system_tests/pubsub.py", line 157, in test_message_pull_mode_e2e
    topic.publish(MESSAGE_1, extra=EXTRA_1)
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/gcloud/pubsub/topic.py", line 246, in publish
    message_ids = api.topic_publish(self.full_name, [message_data])
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/gcloud/pubsub/_gax.py", line 171, in topic_publish
    event.wait()
AttributeError: 'exceptions.AttributeError' object has no attribute 'message_ids'

tseaver added the type: bug and api: pubsub labels Jun 20, 2016

@tseaver
Contributor Author

tseaver commented Jun 20, 2016

To reproduce:

$ git clone git@github.com:tseaver/gcloud
$ cd gcloud
$ git checkout 1855-moar_gax_paging_fixes
$ tox -e system-tests --notest
$ PYTHONPATH= .tox/system-tests/bin/python system_tests/run_system_test.py --package=pubsub

@bjwatson

It seems the PublishResponse object is being returned with neither message_ids nor an exception.

Have you gotten the API publish(...) call working in any context? IOW, is this a problem in general, or a problem in this specific context?

@tseaver
Contributor Author

tseaver commented Jun 20, 2016

@bjwatson I had an earlier error in my system test masking this problem, so I have never actually seen publish() succeed with GAX.

@tseaver
Contributor Author

tseaver commented Jun 20, 2016

@bjwatson One thing that differs from the other Pubsub APIs (which are working AFAICS, except pull(), which is masked by this one) is that publish() gets bundled: the AttributeError gets set by the code which unpacks the response during event.wait().
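
A minimal sketch of how such a confusing message can arise, assuming the bundling layer stores whatever the call produced (a response or an exception) on the event and the caller unpacks it unconditionally. `FakeEvent` and `unpack_message_ids` are hypothetical stand-ins for illustration, not gax-python classes.

```python
import threading


class FakeEvent(object):
    """Hypothetical stand-in for the event returned by a bundled call."""

    def __init__(self):
        self._flag = threading.Event()
        self.result = None

    def set_result(self, result):
        # The result may be a response object or an exception instance.
        self.result = result
        self._flag.set()

    def wait(self, timeout=None):
        return self._flag.wait(timeout)


def unpack_message_ids(event):
    """Return message IDs, re-raising a stored exception instead of
    failing on a missing attribute."""
    event.wait()
    if isinstance(event.result, Exception):
        raise event.result
    return list(event.result.message_ids)


event = FakeEvent()
event.set_result(AttributeError("Assignment not allowed to repeated field"))

# Unguarded access reproduces the shape of the error in the traceback.
try:
    event.result.message_ids
except AttributeError as exc:
    unguarded_error = str(exc)

# Guarded access surfaces the original exception instead.
try:
    unpack_message_ids(event)
except AttributeError as exc:
    guarded_error = str(exc)
```

With the guard in place, the caller sees the underlying error rather than a complaint that an exception object has no `message_ids`.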

@tseaver
Contributor Author

tseaver commented Jun 20, 2016

@bjwatson could it be an issue that the event unpacking code isn't handling repeated elements (e.g., PublishResponse.message_ids) properly?

@tseaver
Contributor Author

tseaver commented Jun 20, 2016

@bjwatson To reproduce (updated):

$ git clone git@github.com:GoogleCloudPlatform/gcloud
$ cd gcloud
$ tox -e system-tests --notest
$ GCLOUD_ENABLE_GAX=1 PYTHONPATH= .tox/system-tests/bin/python system_tests/run_system_test.py --package=pubsub

@tseaver
Contributor Author

tseaver commented Jun 20, 2016

cc @tbetbetbe

@bjwatson

Hi @tseaver. Sorry for the delay; I was looking into a few other things. I'll try what you suggested and let you know what I find.

@bjwatson

I have successfully reproduced this issue on my workstation and am looking into it.

@bjwatson

So what I've discovered so far is that event.result is set to an exception: AttributeError: Assignment not allowed to repeated field "message_ids" in protocol message object.

Based on code analysis, I believe this is being thrown by https://github.com/googleapis/gax-python/blob/master/google/gax/bundling.py#L193. I'm not sure why we haven't discovered this before. Is this the first time the bundle_descriptor.subresponse_field is a repeated field, or should that always be the case? (I don't understand bundling that well, yet)
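
A minimal sketch of that error class, assuming the failing code path assigns to the subresponse field with a plain `setattr`. `FakeResponse` is a hypothetical pure-Python stand-in that mimics the restriction named in the message; it is not a real protobuf class, and `set_subresponse` is an illustrative helper, not necessarily how gax-python handles it.

```python
class RepeatedField(list):
    """Hypothetical stand-in for a protobuf repeated-field container."""


class FakeResponse(object):
    """Hypothetical stand-in for a message with a repeated ``message_ids``
    field; it rejects direct assignment as the error above describes."""

    _REPEATED = ("message_ids",)

    def __init__(self):
        # Bypass our own __setattr__ to install the container once.
        object.__setattr__(self, "message_ids", RepeatedField())

    def __setattr__(self, name, value):
        if name in self._REPEATED:
            raise AttributeError(
                'Assignment not allowed to repeated field "%s" in protocol '
                'message object.' % name)
        object.__setattr__(self, name, value)


def set_subresponse(response, field, values):
    """Replace the contents of a repeated field in place."""
    container = getattr(response, field)
    del container[:]
    container.extend(values)


response = FakeResponse()

try:
    # Direct assignment, as a naive setattr-based unpacker would do.
    setattr(response, "message_ids", ["1", "2"])
    assignment_error = None
except AttributeError as exc:
    assignment_error = str(exc)

# Mutating the existing container works where assignment does not.
set_subresponse(response, "message_ids", ["1", "2"])
```

The point of the sketch is only that a repeated field has to be mutated in place (clear, then extend) rather than reassigned.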

The bundling configuration for the publish(...) call is:

    bundling:
      thresholds:
        element_count_threshold: 10
        element_count_limit: 1000 # TO BE REMOVED LATER
        request_byte_threshold: 1024 # 1 Kb
        request_byte_limit: 10485760 # TO BE REMOVED LATER
        delay_threshold_millis: 10
      bundle_descriptor:
        bundled_field: messages
        discriminator_fields:
        - topic
        subresponse_field: message_ids
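
For readers unfamiliar with bundling, here is a rough sketch of what the two `*_threshold` values above imply, assuming a bundle is flushed as soon as either the element count or the byte size reaches its threshold. `SketchBundler` is hypothetical and is not the gax-python implementation; it ignores the `*_limit` values and `discriminator_fields`, and it replaces the `delay_threshold_millis` timer with an explicit `flush()` call.

```python
class SketchBundler(object):
    """Hypothetical bundler: buffers messages and flushes when a threshold
    is reached. Not the gax-python implementation."""

    def __init__(self, send, element_count_threshold=10,
                 request_byte_threshold=1024):
        self._send = send
        self._count_threshold = element_count_threshold
        self._byte_threshold = request_byte_threshold
        self._pending = []
        self._pending_bytes = 0

    def add(self, message):
        """Buffer one message; flush if either threshold is reached."""
        self._pending.append(message)
        self._pending_bytes += len(message)
        if (len(self._pending) >= self._count_threshold or
                self._pending_bytes >= self._byte_threshold):
            self.flush()

    def flush(self):
        """Send everything buffered so far as one request."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._pending_bytes = 0
            self._send(batch)


sent = []
bundler = SketchBundler(sent.append)
for i in range(25):
    bundler.add(b"msg-%d" % i)
bundler.flush()  # stands in for the delay_threshold_millis timer firing
batch_sizes = [len(batch) for batch in sent]
```

Here the messages are small, so the element count threshold of 10 triggers each flush and the final partial batch goes out only when the stand-in timer fires.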

I'm about to sign off for the night. @tbetbetbe Do you have some time to look into this during your morning shift?

@bjwatson

@tseaver This is a bug in gax-python. I created issue googleapis/gax-python#110 to track it, and plan to fix it today.

@tseaver
Contributor Author

tseaver commented Jun 21, 2016

@bjwatson Thanks very much!

@bjwatson

@tseaver No problem. I have a fix in code review now: googleapis/gax-python#111

Once this is merged, I will publish new versions of gax-python, pubsub, and logging. And then I will verify that this issue is closed. This should be finished tomorrow.

@bjwatson

bjwatson commented Jun 23, 2016

@tseaver I have fixed this issue with https://pypi.python.org/pypi/google-gax/0.12.1 and https://pypi.python.org/pypi/gax-google-pubsub-v1/0.7.10.

Now the behavior I'm seeing is that the test_message_pull_mode_e2e test hangs indefinitely on the second call to publish(...). I'm not sure where in the stack the problem is occurring.

I'm about to wrap up for the day. Can you take a look and see what you think?

@bjwatson

I also updated logging, although I have not yet looked into #1889

@bjwatson

@tseaver I got caught up in some other things today. Can you or @daspecster look into the indefinite hang on the second call to publish(...), and let me know whether you think it's at the gcloud, gapic, or service layer?

@bjwatson

bjwatson commented Jun 25, 2016

This seems to be a gax-python (https://github.com/googleapis/gax-python) issue. The hang is happening at https://github.com/GoogleCloudPlatform/gcloud-python/blob/master/gcloud/pubsub/_gax.py#L170, but only happens for the second call.

I'll look at it some more on Monday, unless @tbetbetbe or @geigerj get to it first.
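
One way to turn an indefinite hang like this into a diagnosable failure is to wait with a timeout. The sketch below uses the standard library's `threading.Event`; whether the event returned by the bundled call accepts a timeout in the same way is an assumption, and `wait_for_result` is a hypothetical helper.

```python
import threading


def wait_for_result(event, timeout_seconds):
    """Wait on a threading.Event-like object, failing loudly on timeout."""
    if not event.wait(timeout_seconds):
        raise RuntimeError(
            "no response after %.1f s; the bundled request may never "
            "have been sent" % timeout_seconds)


# An event that is never set simulates the call that hangs.
never_set = threading.Event()
try:
    wait_for_result(never_set, 0.1)
    timed_out = False
except RuntimeError:
    timed_out = True

# An event that is already set returns immediately.
already_set = threading.Event()
already_set.set()
wait_for_result(already_set, 0.1)
```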

@geigerj
Contributor

geigerj commented Jun 25, 2016

try:
    # Bypass bundling for this call.
    result = self._gax_api.publish(
        topic_path, message_pbs, options=CallOptions(is_bundling=False))
    return result.message_ids
except:
    ...

tseaver changed the title from "GAX API raises AttributeError during 'publish'" to "GAX API hangs during (second) 'publish'" Jun 26, 2016

@tseaver
Contributor Author

tseaver commented Jun 26, 2016

@geigerj Thanks for the pointer! #1910 disables bundling as you suggest. We can think about re-enabling it after a fix for googleapis/gax-python#113 is released.
