@sadovnychyi
Contributor

We are getting lots of Broken pipe errors, and it is only a matter of luck whether a write succeeds. This has been happening for months.

Partial stack trace:

  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py", line 225, in commit
    response = datastore.commit(request)
  File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py", line 140, in commit
    datastore_pb2.CommitResponse)
  File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py", line 199, in _call_method
    method='POST', body=payload, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/transport.py", line 169, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1609, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1351, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1273, in _conn_request
    conn.request(method, request_uri, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1042, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1082, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1038, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 882, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 858, in send
    self.sock.sendall(data)
  File "/usr/lib/python2.7/ssl.py", line 753, in sendall
    v = self.send(data[count:])
  File "/usr/lib/python2.7/ssl.py", line 719, in send
    v = self._sslobj.write(data)
RuntimeError: error: [Errno 32] Broken pipe [while running 'Groups to datastore/Write Mutation to Datastore']
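
The failure in the trace is a socket-level `[Errno 32] Broken pipe` (`errno.EPIPE`) raised while sending the commit request. A minimal sketch of a retry filter that treats it as transient is shown below. It is an illustration, not the code changed in this PR: the names `RETRYABLE_ERRNOS` and `is_retryable_socket_error` are hypothetical, and including `ECONNRESET` and `ETIMEDOUT` alongside `EPIPE` is an assumption.

```python
import errno
import socket

# Hypothetical set of errno values treated as transient (an assumption,
# not taken from the Beam source). EPIPE is the "[Errno 32] Broken pipe"
# seen in the stack trace above.
RETRYABLE_ERRNOS = frozenset([errno.EPIPE, errno.ECONNRESET, errno.ETIMEDOUT])


def is_retryable_socket_error(exc):
    """Return True if exc is a socket error with a transient errno.

    socket.error is its own class on Python 2.7 and an alias of OSError
    on Python 3, so this check works on both.
    """
    return isinstance(exc, socket.error) and exc.errno in RETRYABLE_ERRNOS
```

A filter like this only classifies exceptions; whatever retry loop wraps the write decides how many attempts to make and how long to wait between them.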

@sadovnychyi
Contributor Author

R: @tvalentyn

@tvalentyn
Contributor

LGTM, thank you @sadovnychyi.
R: @chamikaramj, who could merge.
Also cc: @udim, who has been working on Datastore recently.

@udim
Member

udim commented Apr 18, 2019

LGTM
@sadovnychyi Sorry to hear about that. Could you also open a JIRA issue so we can track this?

@udim
Member

udim commented Apr 18, 2019

run python postcommit

@chamikaramj
Contributor

Run Python PreCommit

@tvalentyn
Contributor

tvalentyn commented Apr 18, 2019

Looks like there is a preexisting post-commit failure; looking into it at https://issues.apache.org/jira/browse/BEAM-7063.

@sadovnychyi changed the title from "Retry Datastore writes on [Errno 32] Broken pipe" to "[BEAM-7476] Retry Datastore writes on [Errno 32] Broken pipe" on Jun 1, 2019
@sadovnychyi
Contributor Author

Created an issue here: https://issues.apache.org/jira/browse/BEAM-7476

We have applied this workaround internally and haven't seen this error since then.
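
For readers unfamiliar with this kind of workaround, the idea can be sketched as a small retry loop with exponential backoff around the write call. This is a hedged illustration under stated assumptions, not the change merged here: `call_with_retry`, `is_broken_pipe`, and every parameter and default are hypothetical names chosen for the example.

```python
import errno
import socket
import time


def is_broken_pipe(exc):
    """Return True for a socket error carrying errno 32 (EPIPE)."""
    return isinstance(exc, socket.error) and exc.errno == errno.EPIPE


def call_with_retry(fn, is_retryable, max_attempts=5, initial_delay=0.1,
                    sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on retryable errors.

    All names and defaults here are illustrative assumptions. The sleep
    function is injectable so the loop can be exercised without waiting.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on a non-retryable error.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            sleep(delay)
            delay *= 2  # Double the wait before the next attempt.
```

In use, the commit call would be passed in as `fn`, for example `call_with_retry(lambda: datastore.commit(request), is_broken_pipe)`, where `datastore.commit(request)` is the call shown in the stack trace. Retrying a commit is only safe if the mutations can be applied more than once without harm, which is worth checking for the write in question.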

@udim
Member

udim commented Jun 3, 2019

@sadovnychyi Have you seen this issue in Python 3? We are using a different Datastore library for Py3 (google-cloud-datastore), which may or may not have the same issue.

@sadovnychyi
Contributor Author

@udim I actually assumed that Datastore IO isn't supported in Python 3 (since datastoreio uses that ancient client), so I have not tried it yet. I will definitely check, but it could take weeks. I'll update here and/or in the issue.

@tvalentyn
Contributor

run python postcommit

@stale

stale bot commented Aug 2, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

@stale bot added the stale label on Aug 2, 2019
@udim
Member

udim commented Aug 2, 2019

run python 2 postcommit

@stale bot removed the stale label on Aug 2, 2019
@udim
Member

udim commented Aug 2, 2019

run python 3.5 postcommit

@udim merged commit fef16e9 into apache:master on Aug 2, 2019