Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #7060 direct download causes RedirectException #7085

Merged
merged 3 commits into from Jul 17, 2020

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented Jul 14, 2020

What this PR does / why we need it: Allows download redirects

Which issue(s) this PR closes:

Closes #7060 direct download causes RedirectException

Special notes for your reviewer: Don't know why this is needed now and wasn't needed in 4.20

Suggestions on how to test this: Use S3, turn on download redirects - it should work (and won't without this PR)

Does this PR introduce a user interface change? If mockups are available, please link/include them here: no

Is there a release notes update needed for this change?: no

Additional documentation:

@coveralls
Copy link

coveralls commented Jul 14, 2020

Coverage Status

Coverage decreased (-0.004%) to 19.651% when pulling 70ba37c on GlobalDataverseCommunityConsortium:IQSS/7060 into e7f5448 on IQSS:develop.

@pdurbin
Copy link
Member

pdurbin commented Jul 14, 2020

@qqmyers pull request #6974 is the one I mentioned in tech hours just now.

"ThrowableHandler" has this as a comment: "Produces a generic 500 message for the API, being a fallback handler for not specially treated exceptions."

In addition a "JsonParseExceptionHandler" was added.

I have no idea if this is what changed things or not.

@qqmyers
Copy link
Member Author

qqmyers commented Jul 14, 2020

@pdurbin - ah - you've got it. There was no ThrowableHandler in 4.20! So that one is probably cutting off the default handling of the RedirectionException. I'm not sure what the need for it is other than to add logging info for other errors - perhaps it should be more selective? In any case, I think this makes it clearer why this PR works/is needed. Could be a new issue to question whether the ThrowableHandler is needed? Or should we try to resolve now?

@scolapasta
Copy link
Contributor

Good find @pdurbin.

@qqmyers I'd suggest we resolve this now - if we don't beed it we don't need this PR. If we do, this PR is fine.

@qqmyers
Copy link
Member Author

qqmyers commented Jul 14, 2020

@poikilotherm - looks like your addition of src/main/java/edu/harvard/iq/dataverse/api/errorhandlers/ThrowableHandler.java in #6936 broke the handling for the RedirectException used in direct S3 downloads. Can you shed some light on if/why it's needed? Is it just to provide additional logging versus the default? The PR here adds a handler specifically for redirects, but it's only needed if ThrowableHandler blocks the default handling, so we're trying to figure out whether it's better to remove ThrowableHandler, to constrain it to not catch RedirectionExceptions somehow, or to use this PR. Any thoughts?

@poikilotherm
Copy link
Contributor

poikilotherm commented Jul 15, 2020

tl;dr: My vote: 1) Add the exception handler in this PR, completing the gang. 🧑‍🤝‍🧑 2) Refactor some other places to avoid throwing unspecific WebApplicationExceptions. ✌️ 3) Get everyone happy. 😸

🐛 🐛 🐛 🐛 🐛 🐛 🐛

Hey folks, let me shed some light on this. 💡

It's a good common practice to catch unknown exceptions, log them properly and present a meaningful error message to the user. This is what ThrowableHandler does:

  • Avoid sending appserver default HTML error pages when users expect JSON errors from a JSON REST API.
  • Log exceptions with details about when and where this happened, what was the Request URL, etc etc.
  • Avoid sending possible sensible information over the wire (you can't know).
  • Use proper HTML statuses instead of firing meaningless exception messages to a user that has no idea what's going on.

I took this discussion as reason to scan the complete codescan for used JAX RS exceptions: 🔍

rg -NI "javax.ws.rs.*Exception;" src/main | sort | uniq -c
      3 import javax.ws.rs.BadRequestException;
      2 import javax.ws.rs.ForbiddenException;
      1 import javax.ws.rs.NotAllowedException;
      5 import javax.ws.rs.NotFoundException;
      1 import javax.ws.rs.RedirectionException;
      4 import javax.ws.rs.ServiceUnavailableException;
      5 import javax.ws.rs.WebApplicationException;

All of those are subclasses of javax.ws.rs.WebApplicationException. Except for RedirectException and WebApplicationException all already feature a proper error handler in package api.errorhandlers from the time before #6974 happened.

Let's take a look at the usages of the generic top class: 🔍 🔍

src/main/java/edu/harvard/iq/dataverse/api/BundleDownloadInstanceWriter.java
199:            throw new WebApplicationException(Response.Status.INTERNAL_SERVER_ERROR);
202:        throw new WebApplicationException(Response.Status.NOT_FOUND);

src/main/java/edu/harvard/iq/dataverse/api/DownloadInstanceWriter.java
85:                    //throw new WebApplicationException(Response.Status.SERVICE_UNAVAILABLE);
225:                        //throw new WebApplicationException(Response.Status.SERVICE_UNAVAILABLE);
247:                            throw new WebApplicationException(new ServiceUnavailableException());
277:                        throw new WebApplicationException(new ServiceUnavailableException());
371:        throw new WebApplicationException(Response.Status.NOT_FOUND);

Those occasions look like they could use a proper refactoring to their more specific exception variants. ♻️
I'd be happy to create a PR against @qqmyers branch if you don't want to do this yourself. 😉

I vote for keeping ThrowableHandler and taking care of exceptions. 👍

@qqmyers
Copy link
Member Author

qqmyers commented Jul 15, 2020

@poikilotherm -

My understanding is that the framework already supplies exception handlers for these cases and one only has to override them if you need non-standard results. We don't for RedirectionException - that appears to be the way you are supposed to cause a redirect. Similarly, the docs for WebApplicationException say it does what you want - returns a 500 error with no additional info 'leaked'. Or it uses the status and or whole response you send if you want to specify those.

That said, I'm not opposed to wrapping unexpected Throwables to improve logging about them, but my guess is that catching them is also breaking some of the WebApplicationExceptions you note above (and any others), e.g. the WebApplicationException(new ServiceUnavailableException()); is throwing a 500 error now instead of 503. (The same would be true for any NotAcceptableException, NotAuthorizedException, NotSupportedException, but I don't see any of those in the code). So the refactoring is not optional (for anything returning a non-500 response) and developers need to be aware that they can't use the other three classes.

Perhaps a useful follow-up would be to refactor the ExceptionHandler classes so they all extend from a base(s) that standardize the internal logging and support more of the default behaviors (a good student project?)

@djbrooke djbrooke added this to the Dataverse 5 milestone Jul 15, 2020
The ThrowableExceptionHandler overrides default handling of
WebApplicationException, so we either need to handle all the relevant
subclasses directly, or re-implement the logic for
WebApplicationException handling that queries the status or response
supplied to find the right error code to send.
@qqmyers
Copy link
Member Author

qqmyers commented Jul 15, 2020

I've updated the branch but I don't see the PR automatically updating as it usually does. Anyone have ideas?

Regardless - @poikilotherm - check the branch itself before making any additional changes. I added an InternalServerErrorExceptionHandler and removed the generic WebApplicationExceptions that were no longer correctly handled. I have not tested all the underlying error conditions. My guess is that the return codes will now be correct but the exact message returned for errors could be different from what it was, which won't be a problem unless we have tests checking that or are worried about that type of change to the API.

@mheppler mheppler changed the title Fix for #7060 Fix for #7060 direct download causes RedirectException Jul 16, 2020
@djbrooke
Copy link
Contributor

Hey @poikilotherm - we think this is good to go for Dataverse 5. Please let us know if you have any additional comment. If not, we'll plan to move this ahead on the board and get it merged. Thanks!

@landreev
Copy link
Contributor

Just confirming that the extra redirect from under api/REST in my zipper branch is throwing the same exception under recent Payara versions.

Copy link
Contributor

@landreev landreev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qqmyers Kevin has just merged my zipper branch - so there's now one more place where "RedirectionException" is thrown, in Access.java. So, just to confirm, no changes are needed there - the exception handler you are adding in this PR is going to take care of it - ?

@qqmyers
Copy link
Member Author

qqmyers commented Jul 16, 2020

@landreev - yes. It catches all RedirectionExceptions and triggers an actual redirect to the URL in the response you put in the exception.

Copy link
Contributor

@poikilotherm poikilotherm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks perfect! Thank you @qqmyers for refactoring and cleaning up messy places! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

direct download causes RedirectException
8 participants