DS-3651: Range header support #1884

Merged
merged 24 commits into from
Dec 3, 2017

Contributor

@tomdesair tomdesair commented Nov 10, 2017

@Frederic-Atmire and I implemented a solution for https://jira.duraspace.org/browse/DS-3651 and https://jira.duraspace.org/browse/DS-3527 (for bitstreams).

Frederic created a SAF archive you can import and use to test this PR:

  • https://www.dropbox.com/s/jsaoe9bd5o5biq2/DS-3651-SAF.zip?dl=0 (Note: this file will only remain available while the PR is open).
  • Make sure the bitstream format registry is up to date: bin/dspace registry-loader -bitstream config/registries/bitstream-formats.xml
  • bin/dspace import -a -s ~/Downloads/ -z DS-3651-SAF.zip -c 123465789/2 -m mapfile -e admin@mail.com

The SAF archive contains 3 items: one with a PDF, one with an MP4 file and one with an MP3 file. Any modern browser should open the files without problems. The MP4 and MP3 files should be downloaded in parts, and you should be able to jump to sections of the media file that have not been downloaded yet.

Alternatively, you can test using curl, as illustrated in the integration tests, or with a download manager that can pause and resume downloads.
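
For readers unfamiliar with what the browser or download manager sends, here is a minimal sketch of resolving a single-range `Range` header (`bytes=0-499`, `bytes=500-`, `bytes=-200`) against a known content length. The class `ByteRangeSketch` and its `parse` method are hypothetical names for illustration; this is not the Range header utility class added in this PR, and it deliberately ignores multi-range requests.

```java
import java.util.Optional;

/** Hypothetical helper for illustration only; not the utility class from this PR. */
final class ByteRangeSketch {
    final long start; // first byte of the range, inclusive
    final long end;   // last byte of the range, inclusive

    ByteRangeSketch(long start, long end) {
        this.start = start;
        this.end = end;
    }

    long length() {
        return end - start + 1;
    }

    /**
     * Resolves a single-range header against a resource of the given length.
     * Returns empty when the header is malformed, lists several ranges, or
     * cannot be satisfied. A real server would distinguish these cases
     * (ignore the header vs. answer 416); this sketch does not.
     */
    static Optional<ByteRangeSketch> parse(String header, long contentLength) {
        if (header == null || !header.startsWith("bytes=") || contentLength <= 0) {
            return Optional.empty();
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            return Optional.empty();
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            long start;
            long end;
            if (first.isEmpty()) {
                // Suffix form "bytes=-N": the last N bytes of the resource.
                if (last.isEmpty()) {
                    return Optional.empty();
                }
                long suffix = Long.parseLong(last);
                if (suffix <= 0) {
                    return Optional.empty();
                }
                start = Math.max(0, contentLength - suffix);
                end = contentLength - 1;
            } else {
                start = Long.parseLong(first);
                // Open-ended form "bytes=N-" runs to the end; an explicit end is clamped.
                end = last.isEmpty()
                        ? contentLength - 1
                        : Math.min(Long.parseLong(last), contentLength - 1);
            }
            if (start < 0 || start > end) {
                return Optional.empty();
            }
            return Optional.of(new ByteRangeSketch(start, end));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        parse("bytes=500-", 1000)
                .ifPresent(r -> System.out.println(r.start + "-" + r.end + " (" + r.length() + " bytes)"));
    }
}
```

A satisfiable range would typically be answered with status 206 and a `Content-Range` header built from `start`, `end` and the total length.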

@coveralls

Coverage Status

Coverage increased (+0.5%) to 21.305% when pulling b84c2183283511c1e8a252bb647aeb7cc12aeba1 on atmire:DS-3651_Range-Header-support into 716912f on DSpace:master.

@coveralls

Coverage Status

Coverage increased (+0.5%) to 21.341% when pulling 4afa0f240893de74e4a26aa7bcd2a738e5f26426 on atmire:DS-3651_Range-Header-support into 716912f on DSpace:master.

@artlowel added the label "interface: REST API v7+" (REST API for v7 and later, dspace-server-webapp module) on Nov 13, 2017

<!--<bean class="org.dspace.discovery.SolrServiceIndexOutputPlugin" id="solrServiceIndexOutputPlugin"/>-->

<!-- Statistics services are both lazy loaded (by name), as you are likely just using ONE of them and not both -->
<bean id="elasticSearchLoggerService" class="org.dspace.statistics.ElasticSearchLoggerServiceImpl" lazy-init="true"/>
Member

Is this really required? The plan is to withdraw ES support in DSpace 7.

Contributor Author

I moved these services from here to this new file so that we can override them in the integration tests.

I suggest we remove the elasticSearchLoggerService bean once we remove the complete Elastic Search implementation. I don't want to partially clean up Elastic Search now, with the risk of forgetting things.

Member

ok, I agree

import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

/**
* This is a specialized controller to provide access to the bitstream binary content
*
* @author Andrea Bollini (andrea.bollini at 4science.it)
Member

Please add the names of the main authors of the class.

try {
authorizeService.authorizeAction(context, bit, Constants.READ);
} catch (AuthorizeException e) {
response.sendError(HttpServletResponse.SC_UNAUTHORIZED);
Member

As I said, I'm unsure about adding the auth check here. In any case, we should not deal with the response directly but throw a new REST runtime AuthorizeException that can be annotated; see https://github.com/DSpace/DSpace/blob/master/dspace-spring-rest/src/main/java/org/dspace/app/rest/exception/RepositoryNotFoundException.java

Contributor Author

We can also implement a @ControllerAdvice class that has a handler method for AuthorizeExceptions. That way we don't have to manually convert this exception to a REST runtime exception. What do you think?

}

} catch(IOException ex) {
log.debug("Client aborted the request before the download was completed. Client is probably switching to a Range request.", ex);
Member

Are you sure about that? An IOException can also occur if we have a problem reading from the underlying storage. If the client aborts the request, the exception should occur in the servlet container stack, as writing to the response is in some way "buffered" by Tomcat, etc. (I admit I haven't checked.)

Contributor Author

We'll make the check more specific on the "Client aborted" use case.
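
One possible way to narrow the check, sketched under assumptions: wrap only the write to the client in its own `try` block, so that failures while reading from storage stay plain `IOException`s and failures while writing to the response get a dedicated type. `AbortAwareCopySketch` and `ClientWriteException` are hypothetical names, not code from this PR. Whether the servlet container surfaces a client abort at write time at all is exactly the open question raised above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch: tell storage read failures apart from client write failures. */
final class AbortAwareCopySketch {

    /** Hypothetical marker type: writing to the client side failed. */
    static final class ClientWriteException extends IOException {
        ClientWriteException(IOException cause) {
            super("Writing to the client failed", cause);
        }
    }

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // A failure in in.read(...) propagates unchanged: that is a storage problem.
        while ((read = in.read(buffer)) != -1) {
            try {
                out.write(buffer, 0, read);
            } catch (IOException e) {
                // Only failures on the output side are tagged as client-related.
                throw new ClientWriteException(e);
            }
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(new byte[] {1, 2, 3}), out, 2);
        System.out.println("copied " + copied + " bytes");
    }
}
```

A caller could then log `ClientWriteException` at debug level and treat any other `IOException` as a genuine error.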


} catch(IOException ex) {
log.debug("Client aborted the request before the download was completed. Client is probably switching to a Range request.", ex);
} catch (Exception e) {
Member

No, please avoid catch-all blocks. If we have other non-runtime exceptions, they should be wrapped in a REST exception so that a single piece of cross-cutting code deals with logging and response codes. We should move to the Spring MVC way of handling exceptions: https://spring.io/blog/2013/11/01/exception-handling-in-spring-mvc

Contributor Author

Ok
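
To illustrate the idea behind this thread, here is a framework-free sketch of mapping exception types to response codes in one place instead of in each controller method. In Spring MVC the equivalent would be a `@ControllerAdvice` class with `@ExceptionHandler` methods; the class and exception names below are hypothetical and only stand in for the REST exceptions discussed here. The 401 mapping mirrors the `SC_UNAUTHORIZED` used in the reviewed snippet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, framework-free sketch of central exception-to-status mapping. */
final class ExceptionMappingSketch {

    /** Hypothetical stand-in for a REST runtime authorization exception. */
    static final class NotAuthorizedException extends RuntimeException {
        NotAuthorizedException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a REST runtime not-found exception. */
    static final class NotFoundException extends RuntimeException {
        NotFoundException(String message) {
            super(message);
        }
    }

    // One table decides the response code; controllers only throw.
    private static final Map<Class<? extends Throwable>, Integer> STATUS = new LinkedHashMap<>();

    static {
        STATUS.put(NotAuthorizedException.class, 401);
        STATUS.put(NotFoundException.class, 404);
    }

    /** Returns the HTTP status for the given exception, defaulting to 500. */
    static int statusFor(Throwable t) {
        for (Map.Entry<Class<? extends Throwable>, Integer> entry : STATUS.entrySet()) {
            if (entry.getKey().isInstance(t)) {
                return entry.getValue();
            }
        }
        return 500;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(new NotAuthorizedException("no READ permission")));
    }
}
```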

}
}

private boolean isNotAnErrorResponse(HttpServletResponse response) {
Member

Once you switch to the MVC exception handling, this should no longer be required.

Contributor Author

Our Range header utility class also sets some error codes. So we cannot remove this method.

import org.dspace.app.rest.model.EPersonRest;
import org.dspace.app.rest.model.GroupRest;
Member

I know that we have an open conversation about setting a code style to avoid this kind of issue, but can you exclude from the PR the files that only contain import fixes? That should be done in a single dedicated PR once we have settled our code style. This PR doesn't really need to touch more than 5-10 files, I guess.

@@ -1972,6 +1963,9 @@ mail.helpdesk = ${mail.admin}
# Should all Request Copy emails go to the helpdesk instead of the item submitter?
request.item.helpdesk.override = false

# 2 MB
Member

Can you add more details about the intent of this parameter? Is it a limit for range requests? Is it discoverable in some way?

Contributor Author

@tomdesair tomdesair Nov 20, 2017

This is the size of the byte buffer used for copying data from the bitstream InputStream to the response OutputStream. We can also omit the configuration property and give it a static value.
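
As a rough illustration of what such a buffer is used for, the following sketch copies a byte range from an `InputStream` to an `OutputStream` through a fixed-size buffer. `RangeCopySketch`, `copyRange` and the `BUFFER_SIZE` constant are hypothetical; the 2 MB value is assumed from the `# 2 MB` comment in the configuration diff, and the actual property name is not shown in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch of a buffered range copy; not the PR's implementation. */
final class RangeCopySketch {

    // Assumed default, taken from the "# 2 MB" comment in the configuration diff.
    static final int BUFFER_SIZE = 2 * 1024 * 1024;

    /**
     * Skips to start, then copies up to length bytes using a buffer of the given size.
     * Returns the number of bytes actually copied, which is smaller than length
     * when the stream ends early.
     */
    static long copyRange(InputStream in, OutputStream out, long start, long length, int bufferSize)
            throws IOException {
        long toSkip = start;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                // skip() may legitimately return 0; fall back to reading one byte.
                if (in.read() == -1) {
                    throw new EOFException("Range start lies beyond the end of the stream");
                }
                skipped = 1;
            }
            toSkip -= skipped;
        }
        byte[] buffer = new byte[bufferSize];
        long remaining = length;
        while (remaining > 0) {
            int read = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
            if (read == -1) {
                break;
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
        return length - remaining;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[10];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copyRange(new ByteArrayInputStream(source), out, 3, 4, BUFFER_SIZE);
        System.out.println("copied " + copied + " bytes");
    }
}
```

The buffer size only affects how many read and write calls are needed, not the result, which is one argument for giving it a static value rather than a configuration property.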

@@ -0,0 +1 @@
You should only add Spring XML definition files here if there is really no way to load them through automatic component scanning.
Member

A good use case for placing XML here is to allow overriding the provided configuration, such as the enabled usage event listeners that you have included.

Contributor Author

I hope that in the future we also find an easy way (using XML?) to override the default bean implementations we now have in dspace-spring-rest. For example, what if I want to use a custom ItemConverter bean in my own custom repository?


Context context = ContextUtil.obtainContext(request);

Bitstream bit = getBitstreamIfAuthorized(context, uuid, response);
Member

We should throw the authorize exception and use the Spring MVC exception handling here.

Contributor Author

Is it OK if we take the global exception handling approach using @ControllerAdvice?

* @author Andrea Bollini (andrea.bollini at 4science.it)
* @author Atmire NV (info at atmire dot com)
Member

While we haven't really "policed" @author tags closely in recent years, I feel we should likely keep them pointing at individuals (who want credit for creating/refactoring specific classes) instead of organizations. I'm a bit worried that labeling classes with organizational contact info is a bit misleading, as it no longer refers to the specific creator of the class, and also could become an advertising gimmick/battle within our codebase (over which company gets more credit). The only company that "owns" the code is DuraSpace, and we just own the copyright in order to ensure the code remains open source, etc.

So, personally, I feel we should avoid this sort of change in our codebase. Either leave off the @author altogether, or reference the individual(s) who created / refactored the class.

However, if others disagree strongly here, I'll gladly bring this to discussion within the Committers group and/or Steering Group, so that we can formalize a policy on @author tags

Member

I only noted this in the first file I found with this company @author tag. There are a few others

Contributor Author

While Frederic and I did most of the development, there are also other people at Atmire who provided input and ideas for this (and other) pull-requests. And it is because other colleagues are working full-time on other projects, that Frederic and I have time allocated to work on DSpace 7. So it doesn't feel fair to only put my and Frederic's name as this PR would not have been possible without the support of Atmire.

And while I understand your concerns on the "advertising gimmick/battle", our codebase already contains a lot of company names/websites and contacts as people tend to use their professional e-mail address. There already are other examples of general company authors.

Member

@tomdesair: Thanks for pointing out the Lyncode @author tags. I suspect those slipped in during the massive OAI refactor (i.e. creation of XOAI) many years back. It's simply hard to police this sort of thing. Had I noticed these at the time, I also would have questioned them. And I'd rather remove them than add in more.

FWIW, other major projects (Apache Foundation and Gradle are two, see this blog post for example) do not allow any @author tags whatsoever. Their reasoning seems to be that such tags quickly get out of date or become misleading, and simply cause clutter. While I understand your reasoning, I think individuals should be attributed via commits or via name, and not via a blanket generic statement. As-is, I think "Atmire NV (info at atmire dot com)" gives us no information about who actually worked on this code at Atmire, so there's no way to attribute those individuals in our Contributor list, or even (eventually) consider those individuals for Committership.

@coveralls

Coverage Status

Coverage increased (+0.5%) to 21.564% when pulling 41a59ce on atmire:DS-3651_Range-Header-support into b578b03 on DSpace:master.

@coveralls

Coverage Status

Coverage increased (+0.5%) to 21.565% when pulling ad0187f on atmire:DS-3651_Range-Header-support into b578b03 on DSpace:master.

@tomdesair
Contributor Author

@tdonohue I've corrected the author tags to unblock this PR. Once there is a decision on the usage of @author I'll update them accordingly in a separate PR.

@coveralls

Coverage Status

Coverage increased (+0.5%) to 21.565% when pulling 71b0bc4 on atmire:DS-3651_Range-Header-support into b578b03 on DSpace:master.

@abollini abollini merged commit 71b0bc4 into DSpace:master Dec 3, 2017
@abollini
Member

abollini commented Dec 3, 2017

I have solved a minor conflict in the dspace-spring-rest/src/test/java/org/dspace/app/rest/test/AbstractControllerIntegrationTest.java
and fixed dspace-spring-rest/src/test/java/org/dspace/app/rest/AuthenticationRestControllerIT.java#L120 (to switch to the static way to create builder for test as the constructor is now protected)

@tomdesair tomdesair deleted the DS-3651_Range-Header-support branch December 4, 2017 10:49
@tdonohue tdonohue modified the milestones: 7.0, 7.0preview Jan 26, 2021