
Got exception: Connection closed prematurely #12

Closed
motymichaely opened this issue Oct 1, 2015 · 1 comment

motymichaely commented Oct 1, 2015

Hey there,

We've recently started hitting errors when reading files from Cloud Storage:

2015-09-30 23:15:17,334 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:u_1661 cause:java.io.IOException: Error reading gs://example-bucket/some-file.gz at position 20971520
java.io.IOException: Error reading gs://example-bucket/some-file.gz at position 20971520
  at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.openStreamAndSetMetadata(GoogleCloudStorageReadChannel.java:667)
  at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.performLazySeek(GoogleCloudStorageReadChannel.java:555)
  at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.read(GoogleCloudStorageReadChannel.java:289)
  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFSInputStream.read(GoogleHadoopFSInputStream.java:158)
  at java.io.DataInputStream.read(DataInputStream.java:149)
  at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:151)
  at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:135)
  at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:77)
  at java.io.InputStream.read(InputStream.java:101)
  at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:205)
  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:169)
  at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:139)
  at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:259)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:530)
  at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:363)
  at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
  at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.net.SocketTimeoutException: Read timed out
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.read(SocketInputStream.java:152)
  at java.net.SocketInputStream.read(SocketInputStream.java:122)
  at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
  at sun.security.ssl.InputRecord.read(InputRecord.java:480)
  at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
  at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:884)
  at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
  at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
  at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
  at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
  at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
  at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
  at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
  at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
  at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:37)
  at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:94)
  at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:972)
  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
  at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeMedia(AbstractGoogleClientRequest.java:380)
  at com.google.api.services.storage.Storage$Objects$Get.executeMedia(Storage.java:4680)
  at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.openStreamAndSetMetadata(GoogleCloudStorageReadChannel.java:651)
  ... 23 more

It also seems like there's a bug in the log output of the current retry.

Any idea why this can occur? It seems like an intermittent issue, but I want to make sure.

Thanks

medb (Contributor) commented Jun 22, 2018

We recently added new configuration properties to control reads from GCS; please use them to work around any intermittent issues while reading from GCS.

medb closed this as completed Jun 22, 2018