New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bitstream download page #1004
Bitstream download page #1004
Conversation
This pull request introduces 1 alert when merging 1de6f1c into a64cb63 - view on LGTM.com new alerts:
|
@YanaDePauw : Now that #975 has been merged, please rebase this PR so that it only shows the changes that are new/unique. Thanks! |
1de6f1c
to
f504530
Compare
This pull request introduces 3 alerts when merging f504530 into 942dd2e - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@YanaDePauw : I tried testing this with SSR enabled (by running yarn start
). The redirects work at a basic level, except when a short-lived token is necessary. Here's a basic example of what I'm seeing:
- I created an Item with 3 attached Bitstreams: 2 are public, and 1 is Admin only
- The public ones work great (whether logged in or not)
- If I login as an Admin, and attempt to access the Admin-only bitstream, I see the redirect occur (in Chrome Dev Tools), but I get a 401 response from the REST API (which shows both in the browser & in Chrome Dev Tools)
- At that same time, on the commandline (where I ran
yarn start
), I see this message after the redirect:
GET /bitstreams/4d7277e5-87f6-441f-8746-eca425053c86/download 302 345.070 ms - 230
Error in SSR, serving for direct CSR.
Error details : Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client
at ServerResponse.setHeader (_http_outgoing.js:470:11)
at ServerResponse.header (D:\programming\dspace-angular\dist\server.js:33055:10)
at ServerResponse.send (D:\programming\dspace-angular\dist\server.js:32454:12)
at res.render (D:\programming\dspace-angular\dist\server.js:232:21)
at engine.render.then.html (D:\programming\dspace-angular\dist\server\main.js:187484:31)
at ZoneDelegate.invoke (D:\programming\dspace-angular\dist\server.js:981:30)
at Zone.run (D:\programming\dspace-angular\dist\server.js:738:47)
at D:\programming\dspace-angular\dist\server.js:1486:38
at ZoneDelegate.invokeTask (D:\programming\dspace-angular\dist\server.js:1016:35)
at Zone.runTask (D:\programming\dspace-angular\dist\server.js:783:51)
In this scenario I don't see the short-lived token appear as a URL param or Header in the final request to the REST API. So, maybe it's being sent as a Header & ends up being blocked in some way?
(While I'm not sure if it has anything to do with this...there's also a small chance that the new CSRF/XSRF token functionality from #977 is trying to modify your request, as that does attempt to add the latest XSRF token to a header in every request. However, if that fails, it results in a 403
error, and not a 401
, as the latter implies the auth token isn't making it in the request. I just want to note this as something else we should look out for in this PR. At this point though, I don't think this is the cause -- but I could be wrong.)
@tdonohue I finally had a chance to investigate this. It is indeed an XSRF issue. I have a commit here that adds some logging and fixes at least one part of the problem: If you replace that line with |
@artlowel : The I'm a bit surprised that this core Angular method wouldn't work server side, but you could be correct. In any case, pulling in the cookie by name directly should be "good enough" I believe. As for the behavior you are seeing, it's very possible you are seeing a side effect of one of these two known bugs:
Generally speaking here, the Angular SSR should be allowed to make XSRF requests on behalf of the Angular UI. IIRC, the current login process may already prove this, as login results in a hard refresh (triggering SSR) and login also requires a valid XSRF token. So, I'd be surprised if this was the issue, but maybe I'm overlooking something. |
A lot of core Angular methods don't work server-side unfortunately. Looking at the source code, you can see that on the server
I think only the initial login POST request requires a valid XSRF token, and that happens from the client. Status checks work fine without it. It wouldn't surprise me if no requests requiring the XSRF header have been tried server side so far, as nearly all requests made server side are GETs |
@artlowel : I think you may be on to something...here's what I tried today:
There's a few things interesting here... First, we know the CSRF issue is happening on I'm not exactly sure why this would be happening, but I'm also not sure if this is specific to CSRF. It might be, but it also might be an issue with sending HTTP headers in general. One thing we could try here is sending the XSRF token as a parameter (that's perfectly OK with Spring Security). So, there are two ways to send the XSRF Token: You can send it either via the So, if we are having trouble sending HTTP Headers in general from the Angular server-side, we could look into adding some custom code (perhaps in our interceptor) to take the token & put it into a parameter only when the code runs server-side (this StackOverflow example may be what we need to detect whether the code is running server-side). But, first, we probably should just run a quick sanity check to see if this PR works fine if we pass the XSRF Token via that |
@tdonohue I set up a local rest api to do some debugging, and the problem seems not to be that the token header is empty (I mentioned last meeting that they'll always look empty when you print the object). It's that rest also needs the cookie.
In Later on It's likely that this is how CSRF protection is supposed to work, but it proves that it can't have worked for requests sent from the node server, as it doesn't send the client's cookies along. I tried setting the So I'm not sure where to go from here. It seems to me the best way forward would be if we could drop the requirement for the cookie on the rest side, but I don't know if that's feasible. |
We discussed the issues @artlowel mentioned into today's meeting. Here's a brief summary: First, I had noted that storing the CSRF token in a Cookie is expected behavior, and it's the behavior used by Spring Security in their own CookieCsrfTokenRepository. [Option 1] Prior to the meeting, I found a few possible "workarounds" to this issue on the web. The workaround essentially seems to be to configure
[Option 2] Mentioned in today's meeting, since this issue only affects the [Option 3] We could disable CSRF protection just for this endpoint (possibly even limited to if the request comes from the UI). This is somewhat similar to Option 2, but would keep the endpoint as a [Option 4] @abollini mentioned using a JWT Token Repository instead of a cookie-based on. He linked to https://stormpath.com/blog/csrf-protection-jwt-spring-security as an example. However, this example describes storing the generated JWT (which acts as the CSRF Token) in the user's session (see section "CSRF Token Repository"). So, I'm not sure we'd want to do this as this is less "stateless" than a cookie approach. [Option 5] I had mentioned storing the CSRF token in the database...but, @benbosman and @abollini both pointed out that would not work if the user attempted to login from two different browsers (as the second login would overwrite the first login's CSRF token). So, we don't want to do this either. Overall, I'm thinking the best options here are Option 2 or Option 1. Retrieving a Thoughts @artlowel , @abollini or @benbosman ? |
hi @tdonohue in short my preference goes for option 4 (see below) or option 2. Option 1. I'm worried about concurrent issues, how the SSR will take care to don't mess the cookie from different requests? this could make much weigh the angular side to deal with just a single use case Option 2. There is no risk to use this auth endpoint in GET as far as I see, we don't pass any credential as request parameters Option 3. It would be ok but since 2 is easier... Option 4. I still think that it is possible to use JWT token and the signature in it to just verify that it is valid... eventually the problem could be that for good security it should be short lived and this would cause issue as it was already necessary to switch to "login session duration". This blog post seems to suggest this direction https://www.nedmcclain.com/better-csrf-protection/ |
@abollini : That article you shared is very interesting (I think I was missing your point about using JWTs, but this article clarifies it much better). That said, even that article still says
What we'd really be talking about in Option 4 is a completely custom CSRF protection technique, as Spring Security doesn't support this idea of an "HMAC Based CSRF Token" yet (as far as I can tell). So, simply put, this seems like a lot of custom work & we'd have to do more extensive security testing of the technique to ensure we are not creating security vulnerabilities (since there's no standard library that uses this technique). In my opinion this would not be possible to implement for Beta 5 or 7.0. That said, I do see that OWASP also says an HMAC Based Token Pattern is a valid CSRF protection technique. In summary, while I now see your point about Option 4, I don't think this is a quick fix & it'd require refactoring a lot of code and thoroughly testing our custom CSRF Token technique to ensure we don't accidentally create security issues for ourselves. Whereas, as you pointed out Option 2 is a very simple, quick fix that we could likely do in just 1-2 hours. We may want to go with Option 2 for now, but create a discussion ticket around whether to consider an "HMAC Based CSRF Token" in the future -- if we continually hit issues with the current cookie-based CSRF approach, we may be forced into that direction anyways. But for now, this PR seems to be the only remaining problem with our cookie-based CSRF approach. @artlowel or @benbosman , I'd also appreciate your feedback here if you have time |
@tdonohue I've reviewed the solutions currently on the table in detail:
|
f504530
to
3c93777
Compare
@tdonohue This PR has been updated to use GET for requests to the shortlivedtoken endpoint on the server side, as we discussed on Thursday. While it isn't directly related to this PR, I decided against putting it in a separate PR, as that would have made it more difficult for reviewers to test. The REST PR is still in progress, but I tested this PR with the current wip REST version and it works. I'll add an update here with a link as soon as it is ready. |
This pull request introduces 3 alerts when merging 3c93777 into 9586428 - view on LGTM.com new alerts:
|
The REST PR is ready now: DSpace/DSpace#3223 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@YanaDePauw and @artlowel : I'm still seeing a 401 error when I test this updated PR with the new backend PR DSpace/DSpace#3223 But, I'll admit, the behavior of the PR is now slightly different (so I don't think this is CSRF related anymore)
Here's what I'm doing...
- I'm running the frontend using
yarn start
- Create an Item with a single access-restricted Bitstream
- Logout
- Attempt to download Bitstream.
- Prompted to Login. Login as a user with access rights... Result: 401 error
- Logout & Login again as user with access rights. Now, while logged in attempt to download that file. Result: 401 error.
- Attempt to download an unrestricted bitstream. Result: works.
When I hit that 401 error, I also see this logged in the command-line (where I ran yarn start
):
Redirecting from /bitstreams/650618e9-7047-4445-944d-25619ffa72a9/download to http://localhost:8080/server/api/core/bitstreams/650618e9-7047-4445-944d-25619ffa72a9/content with 302
GET /bitstreams/650618e9-7047-4445-944d-25619ffa72a9/download 302 496.465 ms - 230
Warning [ERR_HTTP_HEADERS_SENT]: Tried to set headers after they were sent to the client
So, at least for me, I'm still having issues getting this working. I'm not seeing (in the code of this PR or in the REST PR) anything that is logically incorrect.
That said, I do notice that, in Chrome DevTools, neither of those requests (the first request to GET /download
and the redirect request to GET /content
) include any authentication headers. So, in both the Authorization
header is missing, and I don't see any sort of shortlivedtoken being passed in the querystring or similar. Since these are both GET requests, this is no longer a CSRF issue, but seems to do with passing the proper authentication information. (As a sidenote, in Chrome DevTools, I never see a request to the /shortlivedtokens
endpoint at all...but, maybe that's happening from SSR?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not review the code, but tested it in a few different ways:
Single server setup:
- I've used a test server with CSRF short lived token GET endpoint DSpace#3223 on REST and this branch on Angular
- I've verified that downloading private files without authentication remains forbidden
- I've verified that downloading private files with authentication causes Angular to use the GET on shortlivedtokens, and the user is redirected to REST with this token
Separate servers setup 1:
- I've used a test server with CSRF short lived token GET endpoint DSpace#3223 on REST and this branch on Angular. The Angular server is configured to connect to the other REST server. The REST server is configured to use the other Angular server in
dspace.ui.url
andrest.cors.allowed-origins
- I've verified that downloading private files without authentication remains forbidden
- I've verified that downloading private files with authentication causes Angular to use the GET on shortlivedtokens, and the user is redirected to REST with this token
Separate servers setup 2:
- I've used a test server with CSRF short lived token GET endpoint DSpace#3223 on REST and this branch on Angular. The Angular server is configured to connect to the other REST server. The REST server is configured to use the other Angular server in
rest.cors.allowed-origins
and to use the other Angular server's IP inproxies.trusted.ipranges
. It does not include that Angular server indspace.ui.url
- I've verified that downloading private files without authentication remains forbidden
- I've verified that downloading private files with authentication causes Angular to use the GET on shortlivedtokens, and the user is redirected to REST with this token
Separate servers setup 3:
- I've used a test server with CSRF short lived token GET endpoint DSpace#3223 on REST and this branch on Angular. The Angular server is configured to connect to the other REST server. The REST server is configured to use the other Angular server in
rest.cors.allowed-origins
. It does not include that Angular server indspace.ui.url
nor the other Angular server's IP inproxies.trusted.ipranges
. - I've verified that downloading private files without authentication remains forbidden
- I've verified that downloading private files with authentication causes Angular to use the GET on shortlivedtokens, REST denies it (because the server is not trusted) and the download remains forbidden
@tdonohue I can't seem to reproduce your issue under normal circumstances. The only way I was able to reproduce it was when running the UI and REST api on different domains, and when
Indeed, this PR ensures that the shortlived token is retrieved by the node server. That's the only way we could make download links behave like regular links, that you can right click and open in a new tab, or share with someone etc. |
@artlowel : Huh, this makes me wonder if I did something wrong in my testing. It sounds like the problem is on my end. I'll rebuild now & make sure I do the full |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 After debugging this a bit today with @artlowel via Slack, I found my previous errors were related to a configuration issue when using a Docker environment (I was deploying the REST PR via Docker).
I found that, when running the Backend in Docker, any Angular SSR requests were seen as coming from IP address 172.18.0.1
(this was the IP reported in Apache for the /shortlivedtoken
request). Because this IP is not trusted by default, the REST API was blocking the GET /shortlivedtoken
request from Angular SSR, returning a 403. This then resulted in no shortlivedtoken
being passed back to the REST API during the download, which resulted in the 401 reported above.
So, I added this to my local.cfg
for the Docker Backend:
proxies.trusted.ipranges = 172.18.0.1
Then, everything worked perfectly!
So, in summary, both the backend & frontend PRs are working perfectly. However, we have an issue with auto-determining trusted IPs when running in Docker. I'll log that as a separate known issue, so that we can determine whether it's fixable in our codebase, or if it's something we need to document better for Docker-based environments.
Thanks @YanaDePauw and @artlowel for this work!
Apologies, it seems merging the backend PR accidentally closed this PR as it was linked up as "fixes" this PR. Reopened this so that I can actually merge it. |
References
Description
This PR creates a component to facilitate the downloading of bitstreams. This component will handle all the logic concerning access and authorization.
Depending on the authorization and the authentication state of the user this component will handle the download differently:
Instructions for Reviewers
This PR seems a lot bigger than it is, because it depends on #975 You can see only the changes in this PR here
To test the component the different usecases need to be tested.
Case 1: Authorized and not authenticated
Case 2: Authorized and authenticated
Case 3: Not authorized and not authenticated
Case 4: Not authorized and authenticated
Checklist
This checklist provides a reminder of what we are going to look for when reviewing your PR. You need not complete this checklist prior to creating your PR (draft PRs are always welcome). If you are unsure about an item in the checklist, don't hesitate to ask. We're here to help!
yarn run lint
package.json
), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.