This repository has been archived by the owner on Aug 28, 2020. It is now read-only.

REST API Design #28

Open
Imran31 opened this issue Mar 19, 2016 · 5 comments

@Imran31
Contributor

Imran31 commented Mar 19, 2016

This issue is to brainstorm the design of the API endpoints and responses. I'll start with a couple of points on shorter URLs, HATEOAS and the folder structure.

Shorter URLs

I propose maintaining numeric IDs for each author, corpus, text, etc. and using those to construct the REST endpoints.

So, for example, endpoint GET /lang/latin/corpus/perseus/author/tacitus/text/germania becomes GET /lang/latin/corpus/1/author/6/text/8.

This keeps the URLs short while letting the names behind the IDs be as long as needed.
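As a rough illustration of the mapping proposed here, the sketch below rewrites a name-based path into its ID-based form. The `REGISTRY` table and the `shorten` helper are hypothetical and not part of the existing API; the IDs simply mirror the example above.

```python
# Hypothetical name-to-ID registries; the IDs mirror the example in this comment.
REGISTRY = {
    "corpus": {"perseus": 1},
    "author": {"tacitus": 6},
    "text": {"germania": 8},
}


def shorten(path):
    """Replace names with numeric IDs for each (kind, name) pair in the path."""
    parts = path.strip("/").split("/")
    out = []
    # Walk the path two segments at a time: lang/latin, corpus/perseus, ...
    for kind, name in zip(parts[0::2], parts[1::2]):
        ids = REGISTRY.get(kind, {})
        # Kinds without a registry (e.g. "lang") keep their name unchanged.
        out += [kind, str(ids.get(name, name))]
    if len(parts) % 2:
        # A trailing collection segment such as ".../corpus" is kept as-is.
        out.append(parts[-1])
    return "/" + "/".join(out)


print(shorten("/lang/latin/corpus/perseus/author/tacitus/text/germania"))
```

Going the other way (ID to name) would need the reverse lookup, which is exactly the discoverability problem raised next.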

A problem with this (assuming an external API consumer) is figuring out the ID of a specific author/corpus/text.

API Discoverability

The formal term for this is HATEOAS (Hypermedia as the Engine of Application State). It implies that a user should be able to browse and discover all the endpoints of the REST API using the REST API itself.

Towards this, we should define endpoints like GET /lang/latin/corpus/ that return a response such as:

{"corpora": [ {"name": "perseus", "id": "1"}, ... ]}

This way, the user will be able to query for all the available corpora and figure out the ID.
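A minimal sketch of both sides of that exchange, using only the standard library. The `CORPORA` data, `list_corpora` and `find_corpus_id` are hypothetical names for illustration; only the "perseus" entry with ID "1" comes from the example response.

```python
import json

# Hypothetical data store; only "perseus" with ID "1" appears in this thread.
CORPORA = {"latin": [{"name": "perseus", "id": "1"}]}


def list_corpora(lang):
    """Server side: build the JSON body for GET /lang/<lang>/corpus/."""
    return json.dumps({"corpora": CORPORA.get(lang, [])})


def find_corpus_id(response_body, name):
    """Client side: look up a corpus ID by name, or return None if absent."""
    for corpus in json.loads(response_body)["corpora"]:
        if corpus["name"] == name:
            return corpus["id"]
    return None


body = list_corpora("latin")
print(find_corpus_id(body, "perseus"))
```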

Another example of this is from my POS tagger implementation. It is possible to view the list of languages and POS tagging methods they support via GET /core/pos, and perform the actual POS tagging for a string via POST /core/pos.
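The GET/POST split described for /core/pos could look roughly like the sketch below. It is framework-free and not the actual implementation: `POS_METHODS`, `fake_tag`, the request field names and the response shapes are all assumptions made for illustration.

```python
# Hypothetical capability table; the real languages and tagging methods
# are not listed in this thread.
POS_METHODS = {"latin": ["unigram"]}


def fake_tag(text):
    """Placeholder tagger: pairs each whitespace token with a dummy tag."""
    return [(token, "UNK") for token in text.split()]


def handle_core_pos(method, body=None):
    """Return (status_code, payload) for a request to /core/pos."""
    if method == "GET":
        # Discovery: list the languages and the methods each one supports.
        return 200, {"languages": POS_METHODS}
    if method == "POST":
        body = body or {}
        if body.get("lang") not in POS_METHODS or "string" not in body:
            return 400, {"error": "expected 'lang' and 'string'"}
        # Processing: tag the submitted string.
        return 200, {"tagged": fake_tag(body["string"])}
    return 405, {"error": "method not allowed"}


print(handle_core_pos("GET"))
print(handle_core_pos("POST", {"lang": "latin", "string": "arma virumque cano"}))
```

Keeping discovery and processing on the same URL means a consumer only needs to know one endpoint per function.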

In general, adding a GET request handler to endpoints like /lang, /lang/<int:lang_id>/corpus, etc. should make the API discoverable.

Folder Structure

Right now all the resources are defined in a single file (api_json.py), and so are tests (tests.py). There is also no distinction between files containing utility functions and actual REST resources.

I briefly mentioned this in my #20 (comment).

An example of my proposed organisation is in #27. Inside the folder for a specific function (/pos), the resources will be in views.py, the database stuff (if any) in models.py, utility functions in utils.py and parameters in constants.py.

(It may be better to keep constants.py at the root of the API folder structure, so that it is easy to find and change.)
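To make the proposed layout concrete, here is a small sketch that scaffolds it in a temporary directory and prints the resulting files. It follows the parenthetical suggestion of a root-level constants.py; the `api` root folder name and the `__init__.py` files are assumptions, not something specified in #27.

```python
import tempfile
from pathlib import Path

# File names follow the proposal; __init__.py is an added assumption so that
# each function folder is an importable package.
MODULE_FILES = ["__init__.py", "views.py", "models.py", "utils.py"]


def scaffold(root, functions):
    """Create one package per function plus a shared constants.py at the root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    (root / "constants.py").touch()
    for name in functions:
        package = root / name
        package.mkdir(exist_ok=True)
        for filename in MODULE_FILES:
            (package / filename).touch()
    # Return the created files relative to the root, with "/" separators.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))


with tempfile.TemporaryDirectory() as tmp:
    for path in scaffold(Path(tmp) / "api", ["pos"]):
        print(path)
```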

@kylepjohnson
Member

@Imran31 Thanks for sharing your thoughts. Here are a few initial responses:

> So, for example, endpoint GET /lang/latin/corpus/perseus/author/tacitus/text/germania becomes GET /lang/latin/corpus/1/author/6/text/8.

The API is intentionally explicit. I prefer this because the URL is instantly recognizable. "Author 6, text 8" means nothing, but "Tacitus, Germania" is universally recognizable.

There is something to be said for keeping URLs short, but we are very, very far from what I would consider long.

> a user should be able to browse and discover all the endpoints of the REST API using the REST API itself.

We have this already, though I think it could be made more intuitive. For example:

I'm open to hearing other ways of doing this.

About your POS addition to API, I'll need to look into this further. I will probably want to see an API which accounts for all "core" processing, not just individual parts.

@Imran31
Contributor Author

Imran31 commented Mar 20, 2016

> There is something to be said for keeping URLs short, but we are very, very far from what I would consider long.

Yeah, IDs won't be more helpful then; I had thought the growing URL length was a problem. I also think the existing URLs are much easier to recognise.

> About your POS addition to API, I'll need to look into this further. I will probably want to see an API which accounts for all "core" processing, not just individual parts.

Sure! Does it make sense to list out all the /core/* endpoints and how they will respond to different HTTP methods?

I'll start with a list of endpoints and their associated classes:

  • /core/jvreplacer: JVReplacer
  • /core/stem: Stemmer
  • /core/lemmatize: LemmaReplacer
  • /core/syllabify: Syllabifier
  • /core/ner: ner
  • /core/tokenize: PunktLanguageVars, TokenizeSentence, word_tokenize
  • /core/distance: TextReuse, Levenshtein
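One way to pin this list down is as a routing table that the API can also serve for discovery. The sketch below copies the endpoints and class names from the list above (with leading slashes normalised) and keeps them as strings so it runs without CLTK installed. The `describe` and `index` helpers, and the convention that GET describes an endpoint while POST runs it, are assumptions modelled on /core/pos rather than an agreed design.

```python
# Endpoint-to-class mapping taken from the list above; class names are plain
# strings so the sketch has no dependency on CLTK.
CORE_ENDPOINTS = {
    "/core/jvreplacer": ["JVReplacer"],
    "/core/stem": ["Stemmer"],
    "/core/lemmatize": ["LemmaReplacer"],
    "/core/syllabify": ["Syllabifier"],
    "/core/ner": ["ner"],
    "/core/tokenize": ["PunktLanguageVars", "TokenizeSentence", "word_tokenize"],
    "/core/distance": ["TextReuse", "Levenshtein"],
}

# Assumed convention: GET describes the endpoint, POST performs the processing.
METHODS = {"GET": "describe the endpoint", "POST": "process the submitted text"}


def index():
    """Payload for a hypothetical GET /core: every available endpoint."""
    return {"endpoints": sorted(CORE_ENDPOINTS)}


def describe(endpoint):
    """Payload for GET on one endpoint, or None if the endpoint is unknown."""
    if endpoint not in CORE_ENDPOINTS:
        return None
    return {
        "endpoint": endpoint,
        "backed_by": CORE_ENDPOINTS[endpoint],
        "methods": METHODS,
    }


print(index())
print(describe("/core/tokenize"))
```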

@lukehollis
Member

I think this (#28 (comment)) looks good for the first iteration of the project, and we can revise it in the future as it makes sense for the more complex tasks.

@kylepjohnson
Member

@lukehollis If you're comfortable with this API, then let's go for it. Just so long as everyone knows that the specifics will be subject to revision at some point.

Thanks to all on this.

@Imran31
Contributor Author

Imran31 commented Mar 24, 2016

Thanks @lukehollis, I have extended this discussion into my proposal (I just shared it with the organisation via the GSoC website). I look forward to your comments there.


3 participants