[DS-1696] DSpace REST API built in JERSEY #323

Merged
merged 48 commits into from Oct 22, 2013

3 participants
@peterdietz
Member

commented Oct 7, 2013

I have built a REST API for DSpace using JAX-RS 1 (JERSEY).

DS-1696 DSpace REST API built in JERSEY
DS-1657 Request for Official DSpace REST API

  • It is READ-ONLY
  • Respects DSpace Authorization
    • Only lists or shows objects that anonymous user has READ access to.
    • Hides hidden metadata from Items, e.g. provenance
  • Should support either DB

It has support for endpoints:

  • / - Index of Endpoints
  • /communities - Community (list and specific)
  • /collections - Collection (list and specific)
  • /items - Item (specific)
  • /bitstreams - Bitstream (specific, and specific retrieve)
  • /handle - Handles (can look up a handle to see the internal ID)

It gives responses in JSON, XML, and, for a few endpoints, HTML. (Set your Accept header accordingly.)
In testing I have used Accept: application/xml;q=0.5,application/json;q=0.6, and tweaked the q-values to get different response types.
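
To illustrate how the q-values in that Accept header steer the response type, here is a minimal sketch that picks the media type with the highest q-value. The class and method names are hypothetical and not part of dspace-rest; a real JAX-RS runtime does fuller negotiation (wildcards, specificity, server-side preferences), so treat this only as an approximation of the idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of dspace-rest: approximates how the q-values
// in an Accept header decide which representation the client prefers.
final class AcceptPreference {

    /** Parses "type;q=0.5,other;q=0.6" into media type -> q (q defaults to 1.0). */
    static Map<String, Double> parse(String acceptHeader) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (String part : acceptHeader.split(",")) {
            String[] pieces = part.trim().split(";");
            String mediaType = pieces[0].trim();
            if (mediaType.isEmpty()) {
                continue;
            }
            double q = 1.0;
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2).trim());
                }
            }
            weights.put(mediaType, q);
        }
        return weights;
    }

    /** Returns the media type with the highest q-value; the first listed wins ties. */
    static String preferred(String acceptHeader) {
        String best = null;
        double bestQ = -1.0;
        for (Map.Entry<String, Double> e : parse(acceptHeader).entrySet()) {
            if (e.getValue() > bestQ) {
                best = e.getKey();
                bestQ = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The header used in testing above prefers JSON (0.6 > 0.5).
        System.out.println(preferred("application/xml;q=0.5,application/json;q=0.6"));
        // Raising the XML weight flips the preference.
        System.out.println(preferred("application/xml;q=0.7,application/json;q=0.6"));
    }
}
```

With the header from testing, JSON wins because 0.6 is greater than 0.5; raising the XML weight above 0.6 flips the result.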

What's missing?

  • Pagination
    • I'm thinking I'll use limit and offset
  • Search
    • Can't use deprecated lucene, need to use Discovery/SOLR, as per #dspace discussion.
  • root-level Item list
  • root-level Bitstream list
  • Expand is not complete for Item / Bitstream
  • Going from an object to its Parent is not consistently implemented
  • No way to login or authenticate as a specified user
  • Can't recursively view every sub-sub-resource of a resource; you can only see its direct children.
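
As a rough idea of what the limit/offset pagination mentioned in the list above could look like, here is a small sketch. Everything in it is an assumption: the `Paging` class is hypothetical, it slices an in-memory list purely for illustration (a real implementation would pass limit and offset to the database query), and rejecting negative values up front is a design choice of this sketch rather than something the PR prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of limit/offset paging. The in-memory slice only
// demonstrates the semantics; it is not how dspace-rest queries the database.
final class Paging {

    /** Returns at most `limit` elements, starting at zero-based index `offset`. */
    static <T> List<T> page(List<T> all, int limit, int offset) {
        // Reject bad input before any further work is attempted.
        if (limit < 0 || offset < 0) {
            throw new IllegalArgumentException("limit and offset must be non-negative");
        }
        if (offset >= all.size()) {
            return new ArrayList<>();
        }
        // Use long arithmetic so offset + limit cannot overflow an int.
        int end = (int) Math.min((long) offset + limit, all.size());
        return new ArrayList<>(all.subList(offset, end));
    }

    public static void main(String[] args) {
        List<String> collections = Arrays.asList("A", "B", "C", "D", "E");
        System.out.println(page(collections, 2, 0)); // first page
        System.out.println(page(collections, 2, 4)); // last, partial page
        System.out.println(page(collections, 2, 9)); // past the end: empty
    }
}
```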

What are the surprises?

  • If you don't have READ access to an object it does not show up in list
  • Looking up a handle does not give you EVERYTHING about that object, but gives you some data, and link to get more data.
  • Every object has a "link", a very weak attempt at HATEOAS; you can always jump to children/parent objects
  • Items metadata is a list of key/value pairs
  • I have for some reason touched a few core dspace-api classes to add some sugar or simplify accomplishing things. Perhaps this is unnecessary, but I'd rather do things a right way, as opposed to yet-another-wrong-way.
  • Item metadata is using the deprecated DCValue[], which I saw no alternative to.
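
To make the "link" and key/value metadata points above more concrete, here is a hedged sketch of what such a response object might look like. The class and field names (`ItemSketch`, `MetadataEntry`, `link`) and the sample values are hypothetical and are not taken from the dspace-rest source. A list of pairs is used instead of a map because a key could plausibly repeat (for example, several authors).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical response shape, not the actual dspace-rest classes: an item
// that carries a followable "link" plus metadata as ordered key/value pairs.
final class ItemSketch {
    final int id;
    final String name;
    final String link; // includes the context path, e.g. /rest
    private final List<MetadataEntry> metadata = new ArrayList<>();

    ItemSketch(String contextPath, int id, String name) {
        this.id = id;
        this.name = name;
        this.link = contextPath + "/items/" + id;
    }

    /** Appends one key/value pair; repeated keys are allowed. */
    ItemSketch add(String key, String value) {
        metadata.add(new MetadataEntry(key, value));
        return this;
    }

    List<MetadataEntry> metadata() {
        return Collections.unmodifiableList(metadata);
    }

    public static void main(String[] args) {
        ItemSketch item = new ItemSketch("/rest", 42, "Sample item")
                .add("dc.title", "Sample item")
                .add("dc.contributor.author", "First Author")
                .add("dc.contributor.author", "Second Author");
        System.out.println(item.link);
        for (MetadataEntry entry : item.metadata()) {
            System.out.println(entry.key + " = " + entry.value);
        }
    }
}

// A single metadata pair; immutable once created.
final class MetadataEntry {
    final String key;
    final String value;

    MetadataEntry(String key, String value) {
        this.key = key;
        this.value = value;
    }
}
```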

This is all in [dspace-source]/dspace-rest.

As this is still code that is being improved from time-to-time, feel free to review, give feedback, or fix it yourself.

peterdietz added some commits Sep 19, 2013

Add a "link" element so that client, can follow to resource endpoint.
Still needs to include the contextPath i.e. /rest/
Cursory attempt at looking up object by its handle.
/rest/handle/1811/12345 returns a "DSpaceObject" with basic attributes,
client needs to follow the "link" to the real object...
@peterdietz

Member Author

commented Oct 8, 2013

Just fixed it to pass the tests (license header, and then a non-existent dspace-rest.jar was being requested...).

For manual testing, I am using the Play! app I wrote, customized for this flavor of the API. It's essentially a mock UI that gets all of its data solely via REST.
https://github.com/peterdietz/dspace-rest-play/tree/jersey

Once I figure out where I can host this JERSEY-REST-API, then I'll put up both this API, and the Play! app.

Also, Chrome Extension "Advanced Rest Client" has been useful in testing endpoints and response types.

@peterdietz

Member Author

commented Oct 9, 2013

I have set up an Amazon EC2 instance to host this PR for testing.

See the REST endpoint at: http://ec2-75-101-213-28.compute-1.amazonaws.com:8080/rest/
(I've imported the same AIP-restore package that demo.dspace.org uses)

For kicking the tires, I have modified the Play! app to work as a client to the API, and have pushed that to Heroku.
http://dspace-rest-client-play.herokuapp.com/

Lastly, the "Advanced Rest Client" Chrome extension is: https://chrome.google.com/webstore/detail/advanced-rest-client/hgmloofddffdnphfgcellkdfbfbjeloo?hl=en-US

A good endpoint to start with is:
http://ec2-75-101-213-28.compute-1.amazonaws.com:8080/rest/communities?expand=all
Set header: Accept: application/xml;q=0.5,application/json;q=0.6
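
For anyone who prefers code to a browser extension, the request above could be built roughly as follows with the JDK's `java.net.http` API (Java 11 or later). This is only a sketch: the `CommunitiesRequest` class is hypothetical, the base URL is passed in as a parameter because the test host may not stay up, and the request is built but deliberately not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side sketch: builds (but does not send) a GET request
// for the communities endpoint with the Accept header suggested above.
final class CommunitiesRequest {

    /** baseUrl is the REST root, e.g. "http://localhost:8080/rest" (an assumed value). */
    static HttpRequest build(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/communities?expand=all"))
                .header("Accept", "application/xml;q=0.5,application/json;q=0.6")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:8080/rest");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse(""));
    }
}
```

Sending it would be a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, left out here so the sketch runs without network access.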

@mwoodiupui

Member

commented Oct 9, 2013

At first glance this looks very good. +1

peterdietz added some commits Oct 14, 2013

Each of these "READ" endpoints should "autocommit", which prevents transaction abort issues (at least with postgres).

If you got an error, (i.e. negative limit is bad), then PG will barf always.
current transaction is aborted, commands ignored until end of transaction block
Support limit/offset paging of Collections and Collection.Items
new method: Collection[] Collection.findAll(limit, offset)
new method: ItemIterator Collection.getItems(limit, offset)
Add expands for Bitstream, to get to its parentDSO
Note: ParentDSO is either Item or Comm/Coll as logo
@peterdietz

Member Author

commented Oct 22, 2013

I have load-tested the JERSEY REST API with JMETER, 5 threads (concurrent users) and forever loop against each endpoint concurrently/continuously. It had zero errors, and isn't spewing postgres errors in the logs. You can see the gist of my jmeter file: https://gist.github.com/peterdietz/7091412

Performance with only one thread per endpoint (hitting 7 endpoints at same time) is pretty good, with a throughput of:

  • 23 requests/second on an Amazon EC2 t1.micro instance (low single-core cpu, 613MB ram)
  • 566 requests/second on my laptop (MBP i7 8-core, 16GB ram, SSD disk)

Endpoints in test were: COMMUNITIES, COMMUNITIES/1, COLLECTIONS, COLLECTIONS/1, ITEMS/1, BITSTREAMS/5, BITSTREAMS/5/RETRIEVE

I'm satisfied with this testing, and am ready to merge this in.

@hardyoyo

Member

commented Oct 22, 2013

I'm +1 on merging this as well

peterdietz added a commit that referenced this pull request Oct 22, 2013

Merge pull request #323 from peterdietz/rest-jersey
[DS-1696] DSpace REST API built in JERSEY

@peterdietz peterdietz merged commit d595481 into DSpace:master Oct 22, 2013

1 check passed

default The Travis CI build passed

peterdietz added a commit to osulibraries/DSpace that referenced this pull request Oct 30, 2013

peterdietz added a commit to peterdietz/DSpace that referenced this pull request Nov 13, 2013

Adding DSpace REST API from DSpace#323 to our 1.8.x
See also: DSpace#323

Conflicts:
	dspace-api/src/main/java/org/dspace/content/Collection.java

artlowel pushed a commit to atmire/DSpace that referenced this pull request Jun 13, 2014

Merge pull request DSpace#323 from peterdietz/rest-jersey
[DS-1696] DSpace REST API built in JERSEY

kosarko pushed a commit to kosarko/DSpace that referenced this pull request Aug 11, 2015

Merge pull request DSpace#323 from ufal/issue_309
fixing DSpace#309 by reverting variable name change to lindat specific one