Datastore: infrequent operations always fail first time, requires retry #899

Closed
timanovsky opened this issue Sep 15, 2016 · 17 comments

Comments

@timanovsky

Once we moved to the v1 API we saw a significant slowdown in one particular operation. Investigation suggested that only a particular type of operation is affected: infrequent reads (once every few tens of minutes), and unfortunately the servers performing these ops are not doing any other kind of Datastore operations. I believe it is due to the reason mentioned in grpc/grpc-java#1648: the Google load balancer shuts down inactive TCP connections after 10 minutes. So if the previous operation was further back than that, the new operation fails with an EOF error (I mentioned that in another issue). Effectively, what we see is that when an operation follows shortly after the previous one it takes 120-180 ms, but if a retry is involved it takes 1200 ms.

I think some kind of keep-alive / ping should be configured on the connection to prevent this. I'm not sure gRPC provides such a configuration option, though.

In the worst case, if I have to implement this keepalive myself in a background thread, what would be a good Datastore endpoint to hit so that it does not depend on any data being present?
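A minimal sketch of the background-keepalive workaround asked about above, assuming the google-cloud-datastore gem; the project id, kind name, and 5-minute interval are arbitrary placeholders, and a lookup of a nonexistent key is used only because it is a cheap round trip that does not depend on any stored data:

require "google/cloud/datastore"

datastore = Google::Cloud::Datastore.new project: "my-project"  # hypothetical project id

# Ping the Datastore API periodically so the underlying gRPC channel never
# sits idle long enough for the load balancer to drop it.
Thread.new do
  loop do
    begin
      datastore.find "KeepAlive", "ping"  # returns nil; the round trip is what matters
    rescue => e
      warn "keepalive ping failed: #{e.message}"  # a failed ping still exercises the connection
    end
    sleep 300  # 5 minutes, comfortably under the ~10 minute idle timeout
  end
end

This only keeps the channel busy; a gRPC-level keepalive (see the grpc releases discussed below) would be the cleaner fix.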

@blowmage
Contributor

@murgatroid99 does the GRPC client have a keep alive feature for non-streaming requests?

@timanovsky what is the operation that is affected in V1?

@timanovsky
Copy link
Author

@blowmage It is just a query (with an ancestor).

@quartzmo quartzmo added api: datastore Issues related to the Datastore API. grpc labels Sep 15, 2016
@quartzmo
Member

@timanovsky Is this still an issue? If not, can you close?

@danoscarmike danoscarmike added this to Cloud Datastore in First 4 (GA) Feb 21, 2017
@blowmage
Contributor

@timanovsky Is this still happening with grpc 1.1.2?

@timanovsky
Author

timanovsky commented Feb 22, 2017 via email

@blowmage
Contributor

The grpc 1.1.0 release made improvements to networking. If you install the latest gem you will get that version. Curious if it improves your situation.
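For anyone checking which grpc version is actually loaded at runtime, the gem exposes a version constant:

require "grpc"
puts GRPC::VERSION  # should print 1.1.x or newer after updating the gem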

@blowmage
Contributor

@murgatroid99 can you or someone else comment on keep alive in the GRPC lib?

@timanovsky
Author

timanovsky commented Feb 24, 2017 via email

@blowmage
Contributor

Thanks @timanovsky!

@landrito landrito added priority: p0 Highest priority. Critical issue. P0 implies highest priority. status: blocked Resolving the issue is dependent on other work. release blocking Required feature/issue must be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Mar 1, 2017
@landrito landrito self-assigned this Mar 1, 2017
@Rob117

Rob117 commented Mar 2, 2017

I'm experiencing the same issue in a Rails app with the Vision library.

In my config/initializers folder, I have

# config/initializers/vision.rb
require 'google/cloud/vision'

project_id = '<valid project here>'  # replace with your project id
VisionApi = Google::Cloud::Vision.new project: project_id, timeout: 10  # 10 second timeout

Then in the controller, I simply have:

class Api::Services::OcrController < Api::BaseController
  skip_before_action :verify_authenticity_token

  def generate_text
    # Run OCR on the raw request body and return the detected locale and text.
    text = VisionApi.image(request.body).text
    result = {
      locale: text.locale,
      text: text.text
    }
    respond_json result: result
  end
end

This works, but if I wait 4 minutes, I get:

Google::Cloud::InternalError (13:{"created":"@1488451071.891371320","description":"Transport closed","file":"src/core/ext/transport/chttp2/transport/chttp2_transport.c","file_line":1072}):

And similar errors as a result. Subsequent requests made within that time frame still work.
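A hypothetical workaround sketch (not from this comment): rescue the InternalError and retry the call once, since only the first request after an idle period fails; the body is rewound so the retry re-reads the payload:

class Api::Services::OcrController < Api::BaseController
  skip_before_action :verify_authenticity_token

  def generate_text
    attempts = 0
    begin
      request.body.rewind                      # so a retry re-reads the payload
      text = VisionApi.image(request.body).text
      respond_json result: { locale: text.locale, text: text.text }
    rescue Google::Cloud::InternalError
      attempts += 1
      retry if attempts < 2                    # only the first call after idle tends to fail
      raise
    end
  end
end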

@landrito
Contributor

grpc/grpc#9986 should be able to help.

@swcloud
Contributor

swcloud commented Mar 14, 2017

@landrito can you work with @apolcyn to find out the grpc 1.2 release plan, i.e. whether we can get the 1.2 release before March 17?

@apolcyn

apolcyn commented Mar 23, 2017

The grpc-1.2.1.pre1 pre-release gem was just pushed, which should fix this (it includes grpc/grpc#9986). Can you please test with this pre-release gem and verify?
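Note that pre-release gems are not picked up by default; the version has to be requested explicitly, for example by pinning it in a Gemfile:

# Gemfile
gem "grpc", "1.2.1.pre1"  # pre-release version, as published above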

@swcloud
Contributor

swcloud commented Mar 23, 2017

@Rob117 @timanovsky Can you please give it a try?

@swcloud
Contributor

swcloud commented Mar 29, 2017

@Rob117 @timanovsky This issue is blocking our release. If there are no updates from you, we will close this issue. You may reopen it if you still run into the issue later.

@blowmage
Contributor

FWIW, I have not been able to reproduce the behavior described in this issue. I've left a process idle for hours and it connects again without error.
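For context, a sketch of the kind of idle-then-query loop one might use to try to reproduce this, assuming the google-cloud-datastore gem; the project id, kinds, and interval are arbitrary placeholders:

require "google/cloud/datastore"

datastore = Google::Cloud::Datastore.new project: "my-project"  # hypothetical project id

loop do
  started = Time.now
  begin
    # An ancestor query, similar to the operation reported as affected.
    query = datastore.query("Task").ancestor(datastore.key("List", "default"))
    datastore.run query
    puts format("query ok in %d ms", (Time.now - started) * 1000)
  rescue => e
    puts format("query failed after %d ms: %s", (Time.now - started) * 1000, e.message)
  end
  sleep 15 * 60  # stay idle longer than the ~10 minute load balancer timeout
end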

@swcloud
Contributor

swcloud commented Mar 31, 2017

Closing now since there are no updates from the original reporters. The fix was added in the grpc 1.2.1.pre1 gem.
