Skip to content

Fix #358 - Fixes deleteRecursive fails with ConnectionPoolTimeoutException#359

Merged
dekobon merged 24 commits intoTritonDataCenter:masterfrom
dekobon:fix-358
Oct 13, 2017
Merged

Fix #358 - Fixes deleteRecursive fails with ConnectionPoolTimeoutException#359
dekobon merged 24 commits intoTritonDataCenter:masterfrom
dekobon:fix-358

Conversation

@dekobon
Copy link
Copy Markdown
Contributor

@dekobon dekobon commented Oct 7, 2017

No description provided.

@dekobon dekobon requested review from cburroughs and tjcelaya October 7, 2017 03:38

this.uriSigner = new UriSigner(this.config, keyPair, signer);

this.beanSupervisor = new MantaMBeanSupervisor();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure how this connects to the fork-join related changes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I messed up while fixing the merge conflict that was present this morning, there's now an extra assignment that needs to be removed.

/* We repetitively run the find() -> delete() stream operation and check
* the diretory targeted for deletion by attempting to delete it this
* deals with unpredictable directory contents changes and occasional
* delays in Manta where it will return from a file delete operation,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intended server side behavior? If not, is there a MANTA issue?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intermittent observed behavior. I honestly don't have a lot of time to go deep and do the reproduction steps needed to file an issue that can actually be acted upon. Past work has shown me that is a multi-day process.
I'm just going to delete the comment so that it doesn't lead anyone down a bad path.


/* If we get a directory not empty error we sleep
* every retry in the hopes that the subdirectory
* objects have been deleted by another process. */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other process could there be? Did you mean the other fork-join threads?

}
}

/* We choose the small of the maximum number of connections minus two or
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did 2 come from?

/* We choose the small of the maximum number of connections minus two or
* the number of available processors as the size of our fork join pool
* for find() parallelism. */
final int parallelism = Math.min(config.getMaximumConnections() - 2,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to guarantee that the result is a positive integer (or that the max connections are > 2)?


/**
* Factory class that returns a {@link ForkJoinPool} instance configured with
* the proper number of maximum threads.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"proper" here, "correct" below. I think this javadoc would be more direct if it explained that the pool size is base on the number of connections, and that threads > connections is a problem.


int processors = Runtime.getRuntime().availableProcessors();

// Overwrite the default if the user has explicitly set a parallelism value
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the case when having threads > connections would be an intentional and desirable configuration? I'm not sure I see what I user can do with this besides accidentally shoot themselves in the foot.

* to make sure that the number of concurrent threads will not exceed
* the maximum number of available connections.
*/
private final ForkJoinPool findForkJoinPool;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per some of the discussion in #316, should this class be package private?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reread #316, but I'm not understanding why we would want to change the scope from private to package private. Wouldn't that increase the surface area for compatibility problems?

this.delete(obj.getPath());
/* We repetitively run the find() -> delete() stream operation and check
* the diretory targeted for deletion by attempting to delete it this
* deals with unpredictable directory contents changes and occasional
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"unpredictable directory contents"

Is this referring to concurrent modifications?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is. I will write that specifically.

* this should be an empty directory. */
delete(obj.getPath());

if (LOG.isTraceEnabled()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think obj.getPath() is so expensive that it merits a double level guard.

Copy link
Copy Markdown
Contributor Author

@dekobon dekobon Oct 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right it is ugly. I prefer the guard clause because obj.getPath() might become more expensive and then someone would have to remember to add guard clauses to all usages.
Honestly, that is a weakly held position that I hold with no passion.

@tjcelaya
Copy link
Copy Markdown
Contributor

tjcelaya commented Oct 10, 2017

There seems to be an edge case that's causing some calls to deleteRecursive to hang indefinitely. I'll need to add some logging before I can pinpoint the cause but running just MantaClientFindIT hangs indefinitely in the cleanUp method. MantaClientIT#testRecursiveDeleteObject terminates just fine so it'll take a bit of digging to figure out what's going on.

EDIT: Looks like this is just a matter of performance, redacting comment

@dekobon
Copy link
Copy Markdown
Contributor Author

dekobon commented Oct 11, 2017

This is great feedback and exactly what I was looking for. I'll be making improvements and pinging you both soon.

@tjcelaya
Copy link
Copy Markdown
Contributor

This PR seems to share its first commit with #360. Would it make sense to incorporate remaining commits in that PR into this one so we can avoid a merge conflict?

@dekobon
Copy link
Copy Markdown
Contributor Author

dekobon commented Oct 12, 2017

I made a mistake initially and branched off of the branch that contains #360. As changes to #360 have been pushed, I've been merging them into this PR to keep it in sync. I would like to push #360 first, so then this one will layer over it with no problems.

Copy link
Copy Markdown
Contributor

@tjcelaya tjcelaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests seem to indicate that there are no more ConnectionPoolTimeoutExceptions are being triggered by deleteRecursive but now we're seeing a lot of 500s instead. I'll file a separate ticket for those but I think this PR is ready to merge.

@dekobon dekobon merged commit e2677b1 into TritonDataCenter:master Oct 13, 2017
@dekobon dekobon deleted the fix-358 branch October 13, 2017 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants