
not trying to hold leases on WAL files if we are holding them already. #142

Merged
merged 8 commits into confluentinc:master on Aug 17, 2017

Conversation

@skyahead (Contributor) commented Oct 21, 2016

This PR is for: #141

@ConfluentJenkins (Collaborator)

Can one of the admins verify this patch?

@lakeofsand

@skyahead We hit the same issue recently.
But, from the traceback in the logs, we found that our issue is triggered by a failed close:

ERROR Error closing hdfs://192.168.101.55:8020/logs/*_/1/log.
org.apache.kafka.connect.errors.ConnectException: Error closing hdfs://192.168.101.55:8020/logs/_/1/log
at io.confluent.connect.hdfs.wal.FSWAL.close(FSWAL.java:156)
at io.confluent.connect.hdfs.TopicPartitionWriter.close(TopicPartitionWriter.java:325)
at io.confluent.connect.hdfs.DataWriter.onPartitionsRevoked(DataWriter.java:318)
at io.confluent.connect.hdfs.HdfsSinkTask.onPartitionsRevoked(HdfsSinkTask.java:108)
...
Caused by: java.io.InterruptedIOException: Interrupted while waiting for data to be acknowledged by pipeline
at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2151)
at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:2132)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2230)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at io.confluent.connect.hdfs.wal.WALFile$Writer.close(WALFile.java:329)
at io.confluent.connect.hdfs.wal.FSWAL.close(FSWAL.java:150)

@skyahead (Contributor Author)

@lakeofsand I do not see exactly the same exceptions, but I do see lots of similar ones :-)

But I think after all those 'Error closing' exceptions, there is a 'topicPartitionWriters.remove(tp)' call in DataWriter.java's close() method, which will remove this writer, and a new one will be recreated for the current partition. I.e., the connect code should survive these exceptions. Does your setup keep generating these errors forever?

Also, it seems you are using a version that is older than this commit: b2b1c61#diff-d4f63c72e615f6185c4d472918ba1e95, and the close() method is called in onPartitionsRevoked() in your version.

@lakeofsand

@skyahead Our issue seems to be caused by a bug in the HDFS client:

org.apache.hadoop.hdfs.DFSOutputStream.java:

@Override
public void close() throws IOException {
  synchronized (this) {
    TraceScope scope = dfsClient.getPathTraceScope("DFSOutputStream#close", src);
    try {
      closeImpl();  // ==> the "FSWAL.close" exception above is thrown here, so dfsClient.endFileLease() is never reached
    } finally {
      scope.close();
    }
  }
  dfsClient.endFileLease(fileId);
}

@skyahead (Contributor Author)

@lakeofsand When your errors happen, do you see anything wrong in your HDFS namenode log file?

@lakeofsand

@skyahead
The "close exception" leaves a lease renewer alive in the Connect process, and the HDFS namenode does not know about it, so the file lease keeps being renewed by Connect periodically.

Operations on this file, whether from the same process or from another one, then fail because the lease is still owned by the previous client:

"
Failed to APPEND_FILE /logs/beaver_http_response/1/log for DFSClient_NONMAPREDUCE_450119871_31 on 192.168.101.101 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1501121577_33 on 192.168.101.102
"

@cotedm (Contributor) commented Jan 6, 2017

@skyahead it's definitely dangerous to parse an exception message (these can change without warning if we, say, upgrade the HDFS client dependency). However, the symptom you describe in issue #141 looks like the writer isn't set to null, which happens here
https://github.com/confluentinc/kafka-connect-hdfs/blob/master/src/main/java/io/confluent/connect/hdfs/wal/FSWAL.java#L154

It seems possible that the writer didn't close properly in your scenario because you cut the network. In that case we throw an exception and never null out the writer, so we never create a new writer (and thus never obtain a new lease). Maybe a better approach is to null out the writer and reader in a finally block in the close() method presented above.
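
A minimal sketch of that idea, using FSWAL's field names for illustration only (this is not the actual patch):

public void close() throws ConnectException {
  try {
    if (writer != null) {
      writer.close();
    }
    if (reader != null) {
      reader.close();
    }
  } catch (IOException e) {
    throw new ConnectException("Error closing " + logFile, e);
  } finally {
    // Always drop the references, even if closing throws, so a later call
    // creates a fresh writer/reader and obtains a new lease.
    writer = null;
    reader = null;
  }
}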

@cotedm (Contributor) left a comment:

@skyahead left you a few comments and suggestions. I think there are a couple of accidental changes in here, and I think we can be a bit cleaner by modifying close() instead of adding a hard reset option.

@@ -96,6 +97,11 @@ public void acquireLease() throws ConnectException {
}
}

private void reset() {
Contributor:

Instead of adding a new method, could we just do this in the close() method in a finally block so that we attempt to close the reader/writer gracefully first before nulling them out? Then replace calls to reset() with close() in this file instead?

Contributor Author:

Great point! Will do.

out = streamOption.getValue();

init(conf, out, ownStream);
} catch (RemoteException re) {
Contributor:

Can you explain this part a bit more? I'm not sure why we need to look for this exception here but I might just be missing it.

Contributor Author:

This is how the leaseException = "org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException" can be caught. See here: https://github.com/confluentinc/kafka-connect-hdfs/blob/master/src/main/java/io/confluent/connect/hdfs/wal/FSWAL.java#L78.

When this exception is seen, it means we are trying to create a new lease from the same DFSClient. Previously, we gave up and kept the original lease, which may lead to the forever-waiting issue. This change closes the open files and clears the FileSystem cache; a new lease will be regenerated by future writes and reads.
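
Roughly, the pattern described above looks like the following sketch (a hypothetical helper, not the actual constructor code; RemoteException.getClassName() is the hook that identifies the server-side exception type):

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.ipc.RemoteException;

class WalLeaseSketch {
  // Compared as a string because the exception class lives inside HDFS internals.
  private static final String LEASE_EXCEPTION =
      "org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException";

  static FSDataOutputStream openWalStream(FileSystem fs, Path logFile) throws IOException {
    try {
      return fs.create(logFile, false);
    } catch (RemoteException re) {
      if (LEASE_EXCEPTION.equals(re.getClassName())) {
        // A lease from the same DFSClient already exists: close the cached
        // FileSystem so a later retry starts with a fresh client and lease.
        fs.close();
      }
      throw re;
    }
  }
}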

Contributor:

Yeah, I can't remember the details but I think the issue is which jar the class is defined in or something like that. So we had to go down the hacky route of checking the class name.

@@ -73,6 +73,7 @@ public void run() {
});
thread.start();

Thread.sleep(3001);
Contributor:

this seems unrelated?

Contributor Author:

I was hoping to fix the test but it seems I was wrong. Will commit something new for this.

long start = startOpt == null ? 0 : startOpt.getValue();
// really set up
initialize(filename, file, start, len, conf, headerOnly != null);
} catch (RemoteException re) {
Contributor:

Same as the other RemoteException, can you explain why we need to catch this?

@skyahead (Contributor Author) left a comment:

WALTest fixed, please review

@@ -60,12 +61,16 @@ public void append(String tempFile, String committedFile) throws ConnectExceptio
writer.append(key, value);
writer.hsync();
} catch (IOException e) {
close();
Contributor:

Hmm, so just for the sake of debuggability, if we're doing work in the exception handler that could itself throw exceptions, I think we might want to at least log that something went wrong. Because with the current code, if close() throws an exception, then all the info about the original exception is lost.

This is true for a couple of the other similar changes below. The only alternative I could think of is allowing close() to take an extra parameter that's the original cause of calling close (or null if there isn't one) and attaching that as the cause if it fails, but then we also lose info about the other exception. I think logging this one proactively is probably fine since these should all be exceptional conditions.
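
Applied to the append() fragment above, the suggestion could look roughly like this (the acquireLease()/WALEntry lines are assumed from context; the point is simply logging the failure before calling close()):

public void append(String tempFile, String committedFile) throws ConnectException {
  try {
    acquireLease();
    WALEntry key = new WALEntry(tempFile);
    WALEntry value = new WALEntry(committedFile);
    writer.append(key, value);
    writer.hsync();
  } catch (IOException e) {
    // Log the original failure first; if close() below throws as well,
    // the root cause is not lost.
    log.error("Error appending WAL entry to " + logFile, e);
    close();
    throw new ConnectException(e);
  }
}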

Contributor Author:

Please have a look now.


init(conf, out, ownStream);
} catch (RemoteException re) {
log.error("Failed creating a WAL Writer: " + re.getMessage());
Contributor:

Can we do

log.error("Failed creating a WAL Writer: ", re);

instead so the log will include stack trace info?

Contributor:

In fact, is the log even going to be useful since it gets logged again by FSWAL which calls this constructor? I think this might be the case in the other case below as well, though I haven't extensively checked all callers. I guess worst case we're just logging a bit more information, so perhaps being conservative about logging the details isn't a bad idea.

Contributor Author:

Yes, I was hoping to log a bit more information so that it is easier to read the debug log which is VERY long.

So re.getMessage() is the short version that I like :-) If we log the stack trace, the same info will be logged again all the way up at https://github.com/confluentinc/kafka-connect-hdfs/blob/master/src/main/java/io/confluent/connect/hdfs/TopicPartitionWriter.java#L324.

log.error("Failed creating a WAL Writer: " + re.getMessage());
if (re.getClassName().equals(leaseException)) {
if (fs != null) {
fs.close();
Contributor:

Does this need protection from a possible IOException?
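
For instance (a sketch only; the log message is illustrative), the secondary failure could be logged so the original RemoteException still propagates:

if (re.getClassName().equals(leaseException)) {
  if (fs != null) {
    try {
      fs.close();
    } catch (IOException ioe) {
      // Secondary failure while clearing the cached FileSystem; log it so the
      // original RemoteException can still propagate to the caller.
      log.warn("Error closing FileSystem after lease conflict", ioe);
    }
  }
}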

@@ -27,4 +27,5 @@
void truncate() throws ConnectException;
void close() throws ConnectException;
String getLogFile();
long getSleepIntervalMs();
Contributor:

Hmm, a bit of a nit, but since this is only used for testing I'm not sure about making it public for testing. In fact, we separately have some work going on to refactor some of this code to make it more reusable and this seems like an odd addition to the interface given that it's really just public for testing. /cc @kkonstantine

I'm wondering if just casting to the more specific FSWAL class in the test would be a better solution (and label the FSWAL public API as public for testing)? Keeps the interface clean but allows the test to access the info that it needs.

Contributor Author:

great idea

@cotedm (Contributor) commented Feb 7, 2017

I don't have any other comments, what do you think @ewencp ?

@kkonstantine (Member) left a comment:

Thanks for your PR. I got a pointer to review an addition to the WAL interface that's not present any more, so I thought I'd give it a look.

Looks good in general. I've added some nitpicks that would help clean up the code a little bit.

throw new ConnectException(e);
}
}

public long getSleepIntervalMs() {
Member:

nit: Could have "package private" scope instead and a non-javadoc comment above to show it's destined for testing.
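
i.e., something along the lines of (sketch):

// Visible for testing.
long getSleepIntervalMs() {
  return sleepIntervalMs;
}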

@@ -143,6 +151,8 @@ public void truncate() throws ConnectException {
close();
Member:

Redundant call, since close has been added in the finally block.

@@ -159,6 +169,9 @@ public void close() throws ConnectException {
}
} catch (IOException e) {
throw new ConnectException("Error closing " + logFile, e);
} finally {
writer = null;
Member:

I can't point to the actual lines, but nullifying reader and writer above is redundant now that it's done within the finally block.

@@ -56,6 +57,8 @@
private static final int SYNC_HASH_SIZE = 16; // number of bytes in hash
private static final int SYNC_SIZE = 4 + SYNC_HASH_SIZE; // escape + hash

private static final String leaseException = "org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException";
Member:

nit: It'd be nice for variable naming to be consistent with our code style for static final member fields.
Thus, it should be something like: LEASE_EXCEPTION_CLASS_NAME
(I added more to the name to show it's a class name string and not the actual class. Your call).

I know we don't run checkstyle currently, but we will soon.
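
i.e., roughly:

private static final String LEASE_EXCEPTION_CLASS_NAME =
    "org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException";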

@@ -43,7 +43,7 @@ public void testWALMultiClient() throws Exception {
Storage storage = StorageFactory.createStorage(storageClass, conf, url);

final WAL wal1 = storage.wal(topicsDir, TOPIC_PARTITION);
final WAL wal2 = storage.wal(topicsDir, TOPIC_PARTITION);
final FSWAL wal2 = (FSWAL)storage.wal(topicsDir, TOPIC_PARTITION);
Member:

A single white space is needed between the cast and storage.

// holding the lease for awhile
Thread.sleep(3000);
// holding the lease for time that is less than wal2's retry interval, which is 1000 ms.
Thread.sleep(wal2.getSleepIntervalMs()-100);
Member:

White space needed around -
(Again checkstyle coming soon :) )

@@ -73,6 +73,8 @@ public void run() {
});
thread.start();

// AcquireLease will try to acquire the same lease that wal1 is holding and fail
Member:

nit: I'd use the actual method name (acquireLease) if I wanted to refer to it, instead of the concept ("Acquiring the lease")

@skyahead (Contributor Author) commented Feb 9, 2017

@kkonstantine Thanks for the review! How about now?

@kkonstantine (Member)

Great, thanks. I see you also pushed the declaration of the initial and max intervals as constants outside the method; I forgot to mention that.

Ok by me, I'd wait for final comments from @ewencp.

@skyahead (Contributor Author) commented Feb 9, 2017

@kkonstantine thanks for replying so promptly :-)

@skyahead (Contributor Author)

@ewencp Can you have a final look at this PR please?

@Perdjesk commented May 18, 2017

@ewencp @kkonstantine
I hope everyone is doing well. Is there anything that could help move this PR forward?
#143 (comment)

@pronvis commented Aug 15, 2017

Good day, everyone!
I have the same issue in my cluster and need to execute hdfs debug recoverLease -path ${path} every day...
It looks like you fixed this issue a long time ago, so why is it still not merged?

@ConfluentJenkins (Collaborator)

Can one of the admins verify this patch?

@ewencp (Contributor) commented Aug 15, 2017

ok to test

@ewencp (Contributor) left a comment:

This LGTM. I merged w/ master to clean up the conflicts. @kkonstantine any further thoughts?

@kkonstantine (Member) left a comment:

Seems ok overall, but one change needs justification (see comment)


private WALFile.Writer writer = null;
private WALFile.Reader reader = null;
private String logFile = null;
private HdfsSinkConnectorConfig conf = null;
private HdfsStorage storage = null;
private long sleepIntervalMs = WALConstants.INITIAL_SLEEP_INTERVAL_MS;
Member:

Why is this one upgraded from a local variable to a member field? The behavior definitely changes on repeated calls on the same object. Notice that this variable is mutated iteratively as part of the while loop in acquireLease. Even if it happens that there's only one call of acquireLease per FSWAL object, I think it's better practice to keep it local if that's what we intend to do here.
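
A sketch of keeping it local (the WALConstants members, the createWriter() helper, and the exponential back-off are assumed here for illustration, not taken from the actual patch):

public void acquireLease() throws ConnectException {
  // Local backoff: each call starts from the initial interval instead of
  // whatever a previous call on the same FSWAL object left behind.
  long sleepIntervalMs = WALConstants.INITIAL_SLEEP_INTERVAL_MS;
  while (sleepIntervalMs < WALConstants.MAX_SLEEP_INTERVAL_MS) {
    try {
      if (writer == null) {
        writer = createWriter();  // hypothetical helper wrapping WALFile.createWriter(...)
      }
      break;
    } catch (RemoteException e) {
      if (WALConstants.LEASE_EXCEPTION_CLASS_NAME.equals(e.getClassName())) {
        try {
          Thread.sleep(sleepIntervalMs);
        } catch (InterruptedException ie) {
          throw new ConnectException(ie);
        }
        sleepIntervalMs *= 2;
      } else {
        throw new ConnectException(e);
      }
    } catch (IOException e) {
      throw new ConnectException("Error creating writer for log file " + logFile, e);
    }
  }
}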

Contributor Author:

Good catch!

@@ -0,0 +1,23 @@
/**
* Copyright 2015 Confluent Inc.
Member:

nit: If you apply any changes based on the comment above, you may change this one too to be 2017 instead. Otherwise, never mind.

@kkonstantine (Member) left a comment:

Great! Thanks for the fix @skyahead!
LGTM

@kkonstantine kkonstantine changed the base branch from master to 3.4.x August 16, 2017 20:49
@kkonstantine kkonstantine changed the base branch from 3.4.x to master August 16, 2017 20:53
@kkonstantine kkonstantine merged commit cd0656f into confluentinc:master Aug 17, 2017