DS-4257: DSpace 7 backend as one webapp (RESTv7, SWORD, SWORDv2, OAI, RDF) #2265

Merged
merged 33 commits into from
May 21, 2019

Conversation

tdonohue
Member

@tdonohue tdonohue commented Nov 16, 2018

https://jira.duraspace.org/browse/DS-4257

This is a replacement PR for #2231.

The goal of this PR is to revisit ways to simplify the DSpace 7 install/deployment process by turning the backend into a single webapp. See also early notes at: https://wiki.duraspace.org/display/DSPACE/DSpace+Backend+as+One+Webapp

What this PR currently achieves (will keep this list up-to-date)

  • Fixes DS-3492 - Keep DSpace Configs out of Spring Boot's application.properties. Fixed in commit 939d4b8, which refactors Application startup to load our DSpace Configs early in the Spring Boot startup process
  • SWORDv1 module is auto-discovered / auto-deployed by Spring Boot & completely configurable.
    • SWORDv1 can be enabled/disabled by a new sword-server.enabled configuration
    • SWORDv1 can be deployed on a specified path (see new sword-server.path configuration)
    • SWORDv1 has very basic integration tests to prove that it is responding properly.
    • NOTE: As SWORD does URL validation, to test SWORDv1 in this new deployment, you must currently override the SWORD URL configs in sword-server.cfg. This URL configuration issue has been noted as a TODO below.
    • To test SWORDv1 Service Document functionality, you must set sword-server.servicedocument.url = ${dspace.restUrl}/${sword-server.path}/servicedocument in your local.cfg
  • SWORDv2 module is auto-discovered / auto-deployed by Spring Boot & completely configurable.
    • SWORDv2 can be enabled/disabled by a new swordv2-server.enabled configuration
    • SWORDv2 can be deployed on a specified path (see new swordv2-server.path configuration)
    • SWORDv2 has very basic integration tests to prove that it is responding properly.
    • NOTE: SWORDv2 also requires proper URL configs to test it out, just like SWORDv1 above.
    • To test SWORDv2 Service Document functionality, you must set swordv2-server.servicedocument.url = ${dspace.restUrl}/${swordv2-server.path}/servicedocument in your local.cfg
  • OAI-PMH module is auto-discovered / auto-deployed by Spring Boot & completely configurable.
    • OAI can be enabled/disabled by a new oai.enabled configuration
    • OAI can be deployed on a specified path (see new oai.path configuration)
    • OAI has basic integration tests to prove that it is responding properly. Some of these integration tests had to be moved/migrated from the old OAI webapp into Spring Boot.
    • To test OAI-PMH functionality, you must set oai.url = ${dspace.restUrl}/${oai.path} in your local.cfg
  • RDF module is auto-discovered / auto-deployed by Spring Boot & completely configurable.
    • RDF can be enabled/disabled by a new rdf.enabled configuration
    • RDF can be deployed on a specified path (see new rdf.path configuration)
    • RDF has basic integration tests to prove that it is responding properly.
    • To test RDF functionality, you must set rdf.contextPath = ${dspace.restUrl}/${rdf.path} in your local.cfg
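The local.cfg settings listed above all follow the same pattern: each module URL is the REST base URL plus that module's *.path value. As a rough illustration of how those ${...} expressions expand, here is a minimal sketch. It is not DSpace's actual configuration code; the base URL and the four path values are assumptions chosen for the example, and only the key names and ${...} expressions are taken from the list above.

```python
import re

# Key names and ${...} expressions come from the PR description.
# The concrete base URL and path values are ASSUMPTIONS for illustration.
config = {
    "dspace.restUrl": "http://localhost:8080/spring-rest",
    "oai.path": "oai",
    "rdf.path": "rdf",
    "sword-server.path": "sword",
    "swordv2-server.path": "swordv2",
    "oai.url": "${dspace.restUrl}/${oai.path}",
    "rdf.contextPath": "${dspace.restUrl}/${rdf.path}",
    "sword-server.servicedocument.url":
        "${dspace.restUrl}/${sword-server.path}/servicedocument",
    "swordv2-server.servicedocument.url":
        "${dspace.restUrl}/${swordv2-server.path}/servicedocument",
}

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(key, cfg, seen=()):
    """Expand ${other.key} references in cfg[key], following nested references."""
    if key in seen:
        raise ValueError(f"circular reference involving {key}")

    def substitute(match):
        return resolve(match.group(1), cfg, seen + (key,))

    return PLACEHOLDER.sub(substitute, cfg[key])


for name in (
    "oai.url",
    "rdf.contextPath",
    "sword-server.servicedocument.url",
    "swordv2-server.servicedocument.url",
):
    print(f"{name} = {resolve(name, config)}")
```

With these assumed values, every module URL ends up below the single webapp's context path rather than at the server root.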

FINAL RESULT: The final result of this PR is that it just ensures all old Webapps (except dspace-rest and dspace-solr) run from a single (Spring Boot) webapp in the existing dspace-spring-rest module. While this module is not the most appropriately named (yet), merging this PR as-is should not have major conflicts with other development efforts within the current dspace-spring-rest module.

A FOLLOW-UP PR (coming soon) will be submitted that includes the following. This work will be submitted as a follow-up PR since it has a higher likelihood of causing conflicts in ongoing development efforts:

  • RENAME the "dspace-spring-rest" webapp to "dspace-server"
  • URL configuration cleanup required. All URL configurations in dspace.cfg will be standardized across all modules. Primarily this will involve creating a new dspace.server.url configuration to replace dspace.url, dspace.restUrl and dspace.baseUrl

@tdonohue tdonohue added the "work in progress" (PR is still being worked on & is not currently ready for review) and "code task" (Code cleanup task) labels Nov 16, 2018
@tdonohue tdonohue changed the title from "DSpace 7 backend as one webapp (REST, SWORD, OAI, RDF, etc)" to "DSpace 7 backend as one webapp (RESTv7, SWORD, SWORDv2, OAI, RDF)" Jan 21, 2019
@KevinVdV
Member

Hi Tim,

This is a great idea and I took a quick look at what is already there, see my comments below:

I noticed that you deleted the "sword" folder from the "modules" directory. This "maven module" was used for creating a webapp that could be deployed, so I understand why it was removed. But I think we might want to consider changing its purpose instead of removing it outright.
I would suggest refactoring it to act like the "additions" module does now, as a placeholder to add your local overrides for the sword module. So basically altering the pom to create a "jar" file and adding this jar as a dependency to "spring-rest".
This will allow people who want to modify the codebase of the sword module to place their code here, as it doesn't really belong in "additions" or in "spring-rest".

I noticed that the dspace-sword pom.xml still contains the "maven-war-plugin"; this can probably be left out (nitpicking now).

@tdonohue
Member Author

tdonohue commented Jan 30, 2019

Hi @KevinVdV : thanks for the review.

A note on the ./dspace/modules/ folders. I don't believe it's possible to do what you are requesting (override the sword module with a new JAR project). The ./dspace/modules/additions/ folder does not override any specific module, but simply acts as a place for custom Java classes (packaged in their own separate JAR) that are auto-included in all webapps (and in [dspace]/lib). Maven doesn't support overrides for JAR modules in the same way that you can use WAR Overlays, which is why the additions folder isn't able to actually override or overlay dspace-api.jar directly. However, as additions is loaded first, classes of the same name will take precedence over any in dspace-api.

If we were to create a "sword-additions" folder, this would similarly end up packaged as a separate JAR (from both sword.jar and additions.jar), which doesn't seem like it serves any real purpose. Why would we need multiple types of "additions" folders? If we need to customize SWORD behavior, why not just use the single "additions" folder, or use the Maven Overlay in the "spring-rest" WAR?

I feel that, with this one webapp approach, we logically will be forced into only supporting ./dspace/modules/additions (for custom code to package in a JAR), and ./dspace/modules/spring-rest/ (for WAR overlays, which allows overlaying/overriding any of the modules packaged into our single WAR). I don't see a clear path to continuing to support separate overlays per module (what we currently have).

UPDATE: As a sidenote, conceptually, I agree that overlaying SWORD in something called "spring-rest" sounds odd. But, keep in mind that, as this one webapp project moves forward, I feel we will need to rename spring-rest. It will no longer be only a REST webapp, it'll be the entire DSpace backend.

@benbosman
Member

Hi @tdonohue, @KevinVdV

If there would be a need to make a customization of e.g. a SWORD class which is not implemented as a service, this is supported in the sword module in e.g. DSpace 6. Any changes in this module would end up in the WEB-INF/classes dir of the SWORD webapp. And the sword module is dependent on the standard DSpace SWORD webapp.

Using a single webapp approach, this would only fit in a module which is dependent on the standard DSpace SWORD webapp. So the additions module would not be a good fit. Such customizations would also not need to end up in [dspace]/lib

I think your sidenote is very relevant here.
Overlaying SWORD in something called "spring-rest" is indeed not ok. But, if spring-rest would be renamed to e.g. webapp (not the most original name), it would make sense to me.

Would you both agree this solves the issues in a nice and clean way?

@tdonohue
Member Author

tdonohue commented Feb 7, 2019

@benbosman and @KevinVdV : Agreed, during this whole process I've always felt this PR would require us to rename the "spring-rest" webapp. I've updated the description of this PR to make that clear (by adding a checkbox at the end that we'll need to rename it).

If I'm understanding @benbosman's comments correctly, it sounds like you are saying I need not change anything in the current strategy for how I'm merging these webapps in this PR?

For what it's worth, I think there are three options that will still be available (and developers can choose whichever one they feel best aligns with their needs). For the descriptions below, I'm calling the "one-webapp" the "Backend Webapp":

  1. Developer could add custom classes to the Additions overlay ([src]/dspace/modules/additions). These custom classes would be pulled into the Backend Webapp, but they'd also be copied into [dspace]/lib/
  2. Developer could add custom classes to the Backend Webapp overlay (e.g. [src]/dspace/modules/[backend-webapp]). This would ensure the custom classes are only pulled into the backend Webapp (and NOT applied to [dspace]/lib/).
  3. Developer could add custom classes to a single module (e.g. SWORDv1) by just updating that module's code directly (e.g. for SWORDv1 updating [src]/dspace-sword/src/main/java). These custom classes would end up embedded in the JAR for that module, and therefore would be used wherever the module's JAR is used.

So, I think we'll still have plenty of options for developers to apply custom code. The only one that we lose is the ability to use per-webapp overlays -- as we no longer have multiple webapps, we only have one.

@benbosman
Member

@tdonohue that sounds good, and backend-webapp is indeed suitable since none of the webapps is designed for end users.

@tdonohue tdonohue force-pushed the one_webapp_backend_redux branch 2 times, most recently from 6971655 to 748b14c March 1, 2019 21:25
@tdonohue tdonohue removed the "work in progress" (PR is still being worked on & is not currently ready for review) label Mar 5, 2019
@tdonohue
Member Author

tdonohue commented Mar 5, 2019

I've removed the "work-in-progress" label from this PR, as all (non-deprecated) webapps are now running in Spring Boot (with ITs to prove they work). This includes REST API v7, OAI-PMH, SWORDv1, SWORDv2 and RDF.

Reviewers are welcome. However, before merging, I think we should consider renaming the dspace-spring-rest (Spring Boot) webapp. Suggestions welcome. Some renaming options include "dspace-backend", "dspace-server", "dspace-web-services", "dspace-web-api".

@tdonohue tdonohue added this to the 7.0 milestone Mar 6, 2019
@mwoodiupui
Member

[tangent] Regarding local overrides: we are going to have to re-think the Additions mechanism anyway. It is dependent on Tomcat behavior which was never part of the Servlet specification, and which was removed during Tomcat 8 development. Tomcat 8+ will search the JARs in lib/ in whatever order readdir() returns them, which depends on the filesystem implementation and the history of mutations of the directory. So our current method might or might not work on Day 1 in any particular instance, and might stop (or resume) working at any time. We need to find some way either to order the classpath ourselves, or to make actual replacements in dspace-*.jar from the custom classes at assembly time.
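To make the ordering concern concrete, here is a minimal sketch of "first JAR that contains the class wins" lookup. It is a simplified model rather than Tomcat's or DSpace's real class loading, and the class names are hypothetical. Sorting by file name appears only as one illustrative way of fixing the order, not as a proposal from this thread.

```python
def winning_jar(class_name, jars_in_search_order):
    """Return the name of the first JAR in the search order that holds class_name."""
    for jar_name, classes in jars_in_search_order:
        if class_name in classes:
            return jar_name
    return None


# Hypothetical class names; both JARs define the same customized class.
customized = "org.dspace.example.CustomizedService"
additions = ("additions.jar", {customized})
dspace_api = ("dspace-api.jar", {customized, "org.dspace.example.OtherService"})

# If the additions JAR happens to be searched first, the local override is used.
print(winning_jar(customized, [additions, dspace_api]))

# If the directory listing returns the JARs the other way round,
# the stock class silently wins instead.
print(winning_jar(customized, [dspace_api, additions]))

# Imposing an explicit order (here: by file name) makes the outcome repeatable.
explicit_order = sorted([dspace_api, additions], key=lambda jar: jar[0])
print(winning_jar(customized, explicit_order))
```

The point of the sketch is only that the result depends entirely on the search order, so an order that is left to the filesystem cannot be relied on.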

There is probably a better place to discuss this.

@tdonohue
Member Author

tdonohue commented Mar 6, 2019

@mwoodiupui : Point taken. I suspect longer term questions about Overlays/Additions should likely move to the JIRA ticket on Tomcat 8+ support: https://jira.duraspace.org/browse/DS-3092

As a more general note, I'd rather not muddy up this PR discussion any further regarding Overlays or the "Additions" module. I only wanted to note that this PR (obviously) must remove the Maven WAR Overlay concept from any merged webapps...as Overlays are only applicable to WARs (and all merged webapps, like OAI, SWORDv1/v2, RDF, become JAR modules now). Longer term discussions of Overlays/Additions should move elsewhere, as this PR is not attempting to solve that problem.

@tdonohue
Member Author

In order to simplify the review process of this PR, I've decided to move the module renaming (and corresponding configuration changes) to a separate, follow-up PR. As I began that process, I realized it involves touching a large number of files -- and I don't want to make this PR impossible to review.

So, this PR should be considered complete and ready to review.

The final result of this PR is that it just ensures all old Webapps (except dspace-rest and dspace-solr) run from a single (Spring Boot) webapp in the existing dspace-spring-rest module. While this module is not the most appropriately named (yet), merging this PR as-is should not have major conflicts with other development efforts within the current dspace-spring-rest module.

I will be submitting a separate, follow-up PR to rename the dspace-spring-rest module more appropriately, and correct any URL configurations in *.cfg files (based on that renaming). That work is in-progress, and an early PR should be expected in the coming days. This followup PR obviously will have a very high likelihood of conflicting with other development work on dspace-spring-rest and therefore may need to be delayed for an appropriate time.

@terrywbrady
Contributor

terrywbrady commented Mar 15, 2019

I am testing this change in Docker. The spring-rest service came up nicely. Angular was able to connect.

Here are some preliminary observations. I will post more as I continue my testing and review.

I am deploying this app to /spring-rest. I initially presumed that setting oai.path or rdf.path would make those services available at /oai or /rdf. Looking at the code, I see that these paths will still fall below the one webapp. That might be useful to clarify in the documentation.

I set oai.enabled=true and I am able to access the service at /spring-rest/oai. All of the links on the oai page point to http://localhost:8080/oaixnull?verb=Identify.

I am attempting to look at config properties. I am seeing a logging library error when I attempt to run the CLI command. The command does succeed.

$ winpty docker exec -it dspace //dspace/bin/dspace dsprop -p dspace.name
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/dspace/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/dspace/lib/logback-classic-1.1.9.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

I started up the RDF service. I am able to run rdfizer, but I am unable to access the rdf web service at /spring-rest/rdf.

Performance

With the one webapp, I saw the initialization time drop by 30% - 50%!

One Webapp

$ docker-compose -p d7 -f docker-compose.yml -f d7.override.yml -f src.override.yml up -d; time curl http://localhost:8080/spring-rest > /dev/null
Creating network "d7_dspacenet" with the default driver
Creating dspacedb ... done
Creating dspace   ... done
Creating dspace-angular ... done

real    1m5.339s
user    0m0.000s
sys     0m0.031s

Without One Webapp

$ docker-compose -p d7 -f docker-compose.yml -f d7.override.yml up -d; time curl http://localhost:8080/spring-rest > /dev/null
Creating network "d7_dspacenet" with the default driver
Creating dspacedb ... done
Creating dspace   ... done
Creating dspace-angular ... done

real    2m10.657s
user    0m0.000s
sys     0m0.015s
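For reference, the two `real` timings above can be compared directly. The sketch below only does the arithmetic on the two posted values; it assumes a single run of each setup is representative, which the thread does not establish.

```python
def to_seconds(real):
    """Convert a `time` value such as '1m5.339s' into seconds."""
    minutes, seconds = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


one_webapp = to_seconds("1m5.339s")         # with this PR
separate_webapps = to_seconds("2m10.657s")  # without this PR

# Fraction of the original startup time that was saved.
reduction = (separate_webapps - one_webapp) / separate_webapps
print(f"one webapp: {one_webapp:.3f}s, separate webapps: {separate_webapps:.3f}s")
print(f"reduction: {reduction:.1%}")
```

For this pair of measurements the reduction comes out at the upper end of the 30% - 50% range quoted above.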

Error Page

A user-friendly error page needs to be displayed when a URL is invalid. Here is the text that currently displays.

Whitelabel Error Page
This application has no explicit mapping for /error, so you are seeing this as a fallback.

Fri Mar 15 22:45:00 UTC 2019
There was an unexpected error (type=Not Found, status=404).
No message available

@tdonohue
Member Author

tdonohue commented Mar 20, 2019

@terrywbrady : Thanks for the quick test/review. A few responses to your comments:

RE: deployment question: You are correct, all the *.path configurations are now relative to the webapp location (e.g. /spring-rest/oai is where OAI-PMH will respond). This will become much clearer in the follow-up PR, when I collapse all the URL configs down to a single dspace.server.url (and then all *.path configurations will be relative to that configuration).

RE: OAI configuration issues: The OAI-PMH links are all driven from the existing oai.url at this time. This is one of the URL oddities I'm pointing to that needs cleaning up in the follow-up PR. To test OAI-PMH with this PR, you need to set oai.url = ${dspace.restUrl}/${oai.path} in your local.cfg (I've added this note to the PR description above). In the follow-up PR, I will default oai.url = ${dspace.server.url}/${oai.path}, which will ensure it defaults to the configured oai.path setting. (As a sidenote, other merged webapps will show some similar configuration issues, e.g. SWORDv1/v2 currently require the SWORD URL configurations to be set properly; see the notes in this PR's description. Again, this will be fixed with sane defaults in the follow-up PR.)

The configuration issues are definitely a pain at this point. But, I will clean them all up in the follow-up PR. I'm worried about lumping them into this PR, as it will require touching a large number of files, as I'll be removing the existing dspace.url, dspace.baseUrl, and dspace.restUrl configurations (which are used all over the place), and replacing them with a single dspace.server.url config.

RE: Performance: Glad to hear the performance is better!

RE: Error Page: This is a general issue for the /spring-rest webapp (i.e. this generic 404 page already exists, e.g. https://dspace7.4science.it/dspace-spring-rest/blah ). I'd recommend we treat this as a separate ticket, as it's out-of-scope for this particular PR.

@tdonohue
Member Author

@terrywbrady : On second thought, I think I might do some basic "sane" updates to existing configurations, to ensure testing this PR is a bit easier. So, while the notes in the PR description will provide info on what to add to your local.cfg for immediate testing, I'll add a commit soon to set those as defaults (until I can clean up configs in the next PR).

@tdonohue
Member Author

Based on @terrywbrady's feedback/fixes, I've just fixed a minor bug in the OAI-PMH module. It was not correctly determining which OAI context you were using (which broke some links in the browsable UI).

I've also defaulted dspace.baseUrl = http://localhost:8080/spring-rest and set dspace.url and dspace.restUrl equal to dspace.baseUrl. This ensures you can now easily test any of the merged webapp modules just by enabling them (no further configuration should be needed), e.g.:

oai.enabled = true
rdf.enabled = true
sword-server.enabled = true
swordv2-server.enabled = true

@tdonohue
Member Author

This PR was just rebased on the latest master after the Solr upgrade was merged (#2058). So, testing this PR will now require installing an external Solr (like latest master).

@terrywbrady
Contributor

I created a docker image to facilitate testing of this PR.

DSPACE_VER=one-webapp docker-compose -p d7 -f docker-compose.yml -f d7.override.yml up -d

Note that we do not yet have a mechanism to auto-build images for a specific PR, so this image would need to be rebuilt manually if the image changes.

@tdonohue , I confirmed that the OAI issues I had reported are resolved.

@terrywbrady terrywbrady self-requested a review March 26, 2019 22:18
Contributor

@terrywbrady terrywbrady left a comment

@tdonohue, The Dockerfile creates symlinks for the DSpace webapps. The list should be pruned.

Options

  • Only provide /spring-rest
  • Provide both /spring-rest and /rest

The override for the solr service is no longer needed. Depending on the answer above, the override for /rest may not be needed. Based on that decision, we may be able to consolidate to only one Dockerfile.

Additional Notes. I have attempted to test services.

  • While I can access the OAI service, the links within that service are not valid. I am unclear if the code is wrong or if I need to pass a different variable.
  • I can rdfize the content, but I cannot access the rdf service.
  • I can deposit with SWORD v2 but not with v1.

@tdonohue
Member Author

@terrywbrady : I think I'm going to need help in correcting any Docker related issues here. I definitely want to ensure this is working properly with Docker. But, as you saw, I made no Docker related changes in this PR as I'm not familiar enough with where the changes should be made.

It also sounds like some of the problems you noted might be related to configuration in the Docker setup (e.g. SWORDv1 does a lot of URL validation and actively blocks deposits if they are POSTed to a URL that doesn't exactly match the known SWORD URL). So, it's possible that SWORDv1 is blocking your requests if the URL configurations are not as expected.
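As a rough picture of how that kind of strict URL matching could reject requests in a Docker setup, consider the sketch below. It is not the actual SWORDv1 validation logic; the function, the configured deposit URL, the container hostname and the collection handle are all hypothetical and exist only to show the failure mode.

```python
def matches_configured_url(request_url, configured_url):
    """Hypothetical strict check: accept only URLs at or below the configured one."""
    base = configured_url.rstrip("/")
    return request_url == base or request_url.startswith(base + "/")


# ASSUMED configuration value, for illustration only.
configured_deposit_url = "http://localhost:8080/spring-rest/sword/deposit"

# A request addressed exactly as configured.
via_localhost = "http://localhost:8080/spring-rest/sword/deposit/123456789/2"

# The same request addressed through a (hypothetical) container hostname.
via_container_name = "http://dspace:8080/spring-rest/sword/deposit/123456789/2"

print(matches_configured_url(via_localhost, configured_deposit_url))
print(matches_configured_url(via_container_name, configured_deposit_url))
```

Under this model the second request is refused even though it reaches the same service, which is the sort of mismatch worth ruling out before treating a failed SWORDv1 deposit as a code bug.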

Is there a Docker setup you are using to perform these tests? Do you have a Docker setup working on master that is able to successfully show/test OAI, RDF, SWORDv1 and SWORDv2? If there's a setup that works well for master, I suspect I can help adapt that to this new PR, so that we can better see whether I've created new bugs/issues in this PR. Maybe we can talk more on Slack today/tomorrow to figure out a way to better "sync" this PR with the Docker setup to make it easier to verify/test.

tdonohue and others added 21 commits May 10, 2019 15:01
…F. Add note about upgrading Jena for future.
@terrywbrady
Contributor

@tdonohue , the latest changes resolved the issue that I was seeing with the RDF service. Thanks for resolving the issue!

@tdonohue
Member Author

As noted in last week's DSpace 7 Meeting, this can now be merged. It's at +2, and we wanted to merge this immediately after Preview Release (which came out yesterday).

@tdonohue tdonohue changed the title from "DSpace 7 backend as one webapp (RESTv7, SWORD, SWORDv2, OAI, RDF)" to "DS-4257: DSpace 7 backend as one webapp (RESTv7, SWORD, SWORDv2, OAI, RDF)" May 21, 2019
@tdonohue tdonohue merged commit 73c2e9d into DSpace:master May 21, 2019
@tdonohue tdonohue deleted the one_webapp_backend_redux branch May 21, 2019 16:03
@tdonohue tdonohue modified the milestones: 7.0, 7.0beta1 Jan 26, 2021
Labels
code task Code cleanup task
5 participants