Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't suppress AlreadyClosedException #19975

Merged
merged 8 commits into from Aug 23, 2016
Expand Up @@ -591,7 +591,7 @@ public final boolean refreshNeeded() {
the store is closed so we need to make sure we increment it here
*/
try {
return !getSearcherManager().isSearcherCurrent();
return getSearcherManager().isSearcherCurrent() == false;
} catch (IOException e) {
logger.error("failed to access searcher manager", e);
failEngine("failed to access searcher manager", e);
Expand Down
Expand Up @@ -59,9 +59,8 @@ public void close() {
} catch (IOException e) {
throw new IllegalStateException("Cannot close", e);
} catch (AlreadyClosedException e) {
/* this one can happen if we already closed the
* underlying store / directory and we call into the
* IndexWriter to free up pending files. */
// This means there's a bug somewhere: don't suppress it
throw new AssertionError(e);
} finally {
store.decRef();
}
Expand Down
Expand Up @@ -562,8 +562,8 @@ public void refresh(String source) throws EngineException {
ensureOpen();
searcherManager.maybeRefreshBlocking();
} catch (AlreadyClosedException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to catch this and then immediately rethrow otherwise it will get eaten by the general catch block below?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, note that this refresh call can come from, for example, InternalEngine#writeIndexingBuffer from IndexShard#writeIndexingBuffer which has a general catch block which will swallow the AlreadyClosedException!

ensureOpen();
maybeFailEngine("refresh", e);
failOnTragicEvent(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the removal of the ensureOpen calls; can you explain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensureOpen() can throw its own exception, masking the (unexpected) AlreadyClosedException we just hit. Also, except for disaster (failing engine), we should never have hit an AlreadyClosedException because the engine was closed in that try block since we hold the readLock. I think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the code that I mean, is it ok to concurrently close SM and call SM#maybeRefreshBlocking? if so we can ignore the ACE if ensureOpen throws an exception so I thin the code is good?

throw e;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we still be failing the engine here?

} catch (EngineClosedException e) {
throw e;
} catch (Exception e) {
Expand Down Expand Up @@ -610,8 +610,8 @@ public void writeIndexingBuffer() throws EngineException {
indexWriter.flush();
}
} catch (AlreadyClosedException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as above: do we still need to catch this and then immediately rethrow otherwise it will get eaten by the general catch block below?

ensureOpen();
maybeFailEngine("writeIndexingBuffer", e);
failOnTragicEvent(e);
throw e;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question: shouldn't we still be failing the engine here?

} catch (EngineClosedException e) {
throw e;
} catch (Exception e) {
Expand Down Expand Up @@ -835,6 +835,14 @@ public void forceMerge(final boolean flush, int maxNumSegments, boolean onlyExpu
} finally {
store.decRef();
}
} catch (AlreadyClosedException ex) {
/* in this case we first check if the engine is still open. If so this exception is just fine
* and expected. We don't hold any locks while we block on forceMerge otherwise it would block
* closing the engine as well. If we are not closed we pass it on to failOnTragicEvent which ensures
* we are handling a tragic even exception here */
ensureOpen();
failOnTragicEvent(ex);
throw ex;
} catch (Exception e) {
try {
maybeFailEngine("force merge", e);
Expand Down Expand Up @@ -869,26 +877,35 @@ public IndexCommit acquireIndexCommit(final boolean flushFirst) throws EngineExc
}
}

private void failOnTragicEvent(AlreadyClosedException ex) {
// if we are already closed due to some tragic exception
// we need to fail the engine. it might have already been failed before
// but we are double-checking it's failed and closed
if (indexWriter.isOpen() == false && indexWriter.getTragicException() != null) {
final Exception tragedy = indexWriter.getTragicException() instanceof Exception ?
(Exception) indexWriter.getTragicException() :
new Exception(indexWriter.getTragicException());
failEngine("already closed by tragic event on the index writer", tragedy);
} else if (translog.isOpen() == false && translog.getTragicException() != null) {
failEngine("already closed by tragic event on the translog", translog.getTragicException());
} else {
// this smells like a bug - we only expect ACE if we are in a fatal case ie. either translog or IW is closed by
// a tragic event or has closed itself. if that is not the case we are in a buggy state and raise an assertion error
throw new AssertionError("Unexpected AlreadyClosedException", ex);
}
}

@Override
protected boolean maybeFailEngine(String source, Exception e) {
boolean shouldFail = super.maybeFailEngine(source, e);
if (shouldFail) {
return true;
}

// Check for AlreadyClosedException
// Check for AlreadyClosedException -- ACE is a very special
// exception that should only be thrown in a tragic event. we pass on the checks to failOnTragicEvent which will
// throw and AssertionError if the tragic event condition is not met.
if (e instanceof AlreadyClosedException) {
// if we are already closed due to some tragic exception
// we need to fail the engine. it might have already been failed before
// but we are double-checking it's failed and closed
if (indexWriter.isOpen() == false && indexWriter.getTragicException() != null) {
final Exception tragedy = indexWriter.getTragicException() instanceof Exception ?
(Exception) indexWriter.getTragicException() :
new Exception(indexWriter.getTragicException());
failEngine("already closed by tragic event on the index writer", tragedy);
} else if (translog.isOpen() == false && translog.getTragicException() != null) {
failEngine("already closed by tragic event on the translog", translog.getTragicException());
}
failOnTragicEvent((AlreadyClosedException)e);
return true;
} else if (e != null &&
((indexWriter.isOpen() == false && indexWriter.getTragicException() == e)
Expand All @@ -914,6 +931,7 @@ protected final void writerSegmentStats(SegmentsStats stats) {

@Override
public long getIndexBufferRAMBytesUsed() {
// We don't guard w/ readLock here, so we could throw AlreadyClosedException
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to caught and wrapped then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, where IndexShard calls this method, it expects/handles the AlreadyClosedException (handleRefreshException in IndexShard.java), and an ACE thrown from here can happen under normal usage (does not necessarily mean there's a bug).

super's javadocs also state that it can throw AlreadyClosedException.

We could alternatively acquire the readLock, in this method (and re-throw to AssertionError), but that's somewhat scary since writeLock can be held for quite some time, blocking IndexingMemoryController from polling the shards when one engine is flushing.

return indexWriter.ramBytesUsed() + versionMap.ramBytesUsedForRefresh();
}

Expand Down Expand Up @@ -963,8 +981,9 @@ protected final void closeNoLock(String reason) {
logger.trace("rollback indexWriter");
try {
indexWriter.rollback();
} catch (AlreadyClosedException e) {
// ignore
} catch (AlreadyClosedException ex) {
failOnTragicEvent(ex);
throw ex;
}
logger.trace("rollback indexWriter done");
} catch (Exception e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catch will now swallow an AlreadyClosedException!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think IW.rollback throws AlreadyClosedException.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ähm, it could?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think it would? It's not supposed to? If it does I think it's a Lucene bug? We have this comment in IndexWriter.rollback:

    // don't call ensureOpen here: this acts like "close()" in closeable.

Expand Down
Expand Up @@ -191,7 +191,8 @@ public void refresh(String source) throws EngineException {
ensureOpen();
searcherManager.maybeRefreshBlocking();
} catch (AlreadyClosedException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here: do we still need to catch this and then immediately rethrow otherwise it will get eaten by the general catch block below?

ensureOpen();
// This means there's a bug somewhere: don't suppress it
throw new AssertionError(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one in the shadow engine shouldn't happen indeed

} catch (EngineClosedException e) {
throw e;
} catch (Exception e) {
Expand Down