API endpoints #11

stefanvermaas · 2014-10-20T16:44:02Z

Currently we need to call the API twice if we want both the column names and the column data. What's the reason behind this? Speed?

Cheers

stefanvermaas · 2014-10-20T19:55:02Z

In addition to this issue:

Keep two separate query calls is ok, though the "preview" call should have the column name in the data object.

Current situation when querying the preview data:

{
  0: 'some value',
  1: 'other value'
}

Preferred situation when querying the preview data:

{
  name: 'some value',
  someKey: 'other value'
}

On this way we can stop doing big data manipulations in the frontend.

stefanvermaas · 2014-10-27T19:07:37Z

Hi @andykram

As we discussed I would come up with some ideas about a more consistent API for AirPal, so here are some ideas.

Table endpoints:
Table search endpoints

[GET] api/tables?q=string (search for tables)
[GET] api/tables/:id/partitions?q=string (search for specific partitions in a specific table)

Table information endpoints

[GET] api/tables/:id/columns (returns all the columns of a table)
[GET] api/tables/:id/data (returns all the data of a table)

Query endpoints:

[GET] api/queries
This could return all the queries done by the current user or just the last 20 or so.
[POST] api/queries
This creates a new query based on the input of the user. It's the same as the current execute endpoint, but more RESTful, because this is the way new resource should be created.

Currently this is done by a PUT request, which doesn't make any sense (in my eyes), because a PUT (or PATCH) request should only be fired to the API on an existing resource. Though this resource doesn't exists yet, and should be created.

[GET] api/queries/:id (get info about a specific query)
This can be used to show some more information about a specific query. This could be quiet handy if we want to show the in-browser-result of a specific query (cc: @jamesmay). And it's RESTful.
[PATCH] api/queries/:id (update a query)
This request should be used to update the current status of a query, for example: this endpoint should be called to kill/cancel a specific query.
[DELETE] api/queries/:id
This should destroy a registered query, but I'm not so sure about this. I think you will always want to keep the older queries for reference, but maybe there is a use-case for this?!

This should not be used to kill/cancel a query, I would say we should use the PATCH request for this use-case.

If we make the API endpoints like the proposed endpoints, we've got a more consistent and RESTful API for the AirPal project.

I'm not sure about the last two endpoints for the Table yet. This is because I'm not sure how we currently use the schema and table. Can you explain some more about what the difference between a table and the schema is? And is it really needed to keep the schema and table name in an url? Or would only the table ID (or name?) be enough to get the right data?

@jamesmay: As I understood from the user interviews, most of the users don't need to see the other queries, right?! Is this something we want to keep or do you think it's more reasonable to only show your own queries?

What do you guys think?

stefanvermaas · 2014-11-03T09:00:56Z

I proposed that the [GET] api/query should return the users results or the last 20 queries, but I think we should go with the last results (which defaults to the last 20) and a parameter that indicates how many results we should show.

For the personal results I would like to propose a new API endpoint.

Query Endpoints

[GET] api/queries(/?results=integer)
Returns the last twenty or more (?) results.

User Endpoints

[GET] api/users/:id/queries
Returns the queries of the current user

stefanvermaas · 2014-11-03T09:23:22Z

Also would like to propose a new endpoint for the permissions API. This should be part of the user endpoints.

[GET] api/users/:id/permissions

And maybe also an endpoint for the logged in user.

[GET] api/users/:id

andykram · 2014-11-05T18:46:31Z

I think the recent suggestions make sense and I'll start on implementing them.

andykram · 2014-11-05T23:23:40Z

@stefanvermaas: In the proposal for the /api/users/:id endpoint, how does the frontend determine what the users ID is in order to make these queries?

stefanvermaas · 2014-11-06T10:42:34Z

I'm open for suggestions, because I haven't figured this out yet.. We could set the user id in the sessionsStorage when the user logs in? But I doubt how safe this is..

@jamesmay could you provide some feedback on the earlier questions?

andykram · 2014-11-06T20:51:23Z

@stefanvermaas: I propose an /api/user endpoint that will return something like:

{
  'name': 'andykram',
  'id': 'andykram',
  'permissions': {
  }
}

stefanvermaas · 2014-11-06T21:18:37Z

Sounds fine to me :)! 👍

—
Stefan
Verstuurd vanaf mijn iPhone

On Thu, Nov 6, 2014 at 9:51 PM, Andy Kramolisch notifications@github.com
wrote:

@stefanvermaas: I propose an /api/user endpoint that will return something like:
{
  'name': 'andykram',
  'id': 'andykram',
  'permissions': {
  }
}
Reply to this email directly or view it on GitHub:
#11 (comment)

jamesmay · 2014-11-20T00:01:51Z

Hey Stefan, sorry for the delay. Here is my feedback after reading the thread:

"[PATCH] api/queries/:id (update a query)
This request should be used to update the current status of a query, for example: this endpoint should be called to kill/cancel a specific query."

[James Mayfield] Love this! We should also allow users to delete queries from the interface. There are times when I am running a series of analysis, and mess up a query, and don't want to see it in my history while I'm building new queries that rely on previous work. It was very common for people to delete these things from their history.

"As I understood from the user interviews, most of the users don't need to see the other queries, right?! Is this something we want to keep or do you think it's more reasonable to only show your own queries?"

[James Mayfield] Users definitely want to see their own queries in Airpal by default. But, there are times when it is super useful to see how other people are querying a table, what they are joining it to, and if they are filtering out certain values. Additionally, it’s great to be able to look up specific functions and regex that co-workers are authoring. We should include this.

stefanvermaas · 2014-11-22T11:58:41Z

Awesome! Thanks for the feedback @jamesmay

stefanvermaas · 2014-11-22T15:55:37Z

Hi @andykram

Could you look into this: When a query is not found (GET api/queries/:id) it would be better to return a 404 status code, instead of an empty body. This way I can catch the error :)

Also: do you've got a example of the JSON data that will be returned for every resource?

Cheers!

andykram · 2014-11-26T17:51:36Z

/api/queries/:id/ isn't currently implemented, but I'll work on getting that supported.

JSON data examples:

`/api/queries[?results=n&table=y]`

[
  {
    "user": "username",
    "query": "SELECT COUNT(*) FROM some_table",
    "uuid": "AFEU-4f85-FGHE-8345",
    "output": {
      "type": "csv",
      "description": "",
      "location": "https://some.url/link/to.csv"
    },
    "queryStats": {
      "analysisTime": "372.76ms",
      "completedDrivers": 5013,
      "completedTasks": 99,
      "createTime": 1417020450821,
      "distributedPlanningTime": "1.14ms",
      "elapsedTime": "2.26m",
      "endTime": 1417020586667,
      "executionStartTime": 1417020451197,
      "lastHeartbeat": 1417020586771,
      "outputDataSize": "5.86kB",
      "outputPositions": 188,
      "processedInputDataSize": "438.12MB",
      "processedInputPositions": 13238777,
      "queuedDrivers": 0,
      "queuedTime": "189.92us",
      "rawInputDataSize": "1.22GB",
      "rawInputPositions": 13247834,
      "runningDrivers": 0,
      "runningTasks": 0,
      "totalBlockedTime": "1.03d",
      "totalCpuTime": "0.00ns",
      "totalDrivers": 5013,
      "totalMemoryReservation": "0B",
      "totalPlanningTime": "374.19ms",
      "totalScheduledTime": "2.52h",
      "totalTasks": 99,
      "totalUserTime": "0.00ns"
    },
    "state": "FINISHED",
    "columns": [],
    "tablesUsed": [
      {
        "connectorId": "hive",
        "schema": "default",
        "table": "some_table"
      }
    ],
    "queryStarted": 1417020587000,
    "queryFinished": 1417020587001,
    "error": null,
  }
]

`/api/user`

{
    "name": "andy_kram",
    "executionPermissions": {
        "canCreateTable": true,
        "canCreateCsv": true,
        "accessLevel": "top_secret"
    }
}

`/api/users/:id/permissions`

{
    "canCreateTable": true,
    "canCreateCsv": true,
    "accessLevel": "top_secret"
}

`/api/users/:id/queries?results=1`

[
  {
    "user": "username",
    "query": "SELECT COUNT(*) FROM some_table",
    "uuid": "AFEU-4f85-FGHE-8345",
    "output": {
      "type": "csv",
      "description": "",
      "location": "https://some.url/link/to.csv"
    },
    "queryStats": {
      "analysisTime": "372.76ms",
      "completedDrivers": 5013,
      "completedTasks": 99,
      "createTime": 1417020450821,
      "distributedPlanningTime": "1.14ms",
      "elapsedTime": "2.26m",
      "endTime": 1417020586667,
      "executionStartTime": 1417020451197,
      "lastHeartbeat": 1417020586771,
      "outputDataSize": "5.86kB",
      "outputPositions": 188,
      "processedInputDataSize": "438.12MB",
      "processedInputPositions": 13238777,
      "queuedDrivers": 0,
      "queuedTime": "189.92us",
      "rawInputDataSize": "1.22GB",
      "rawInputPositions": 13247834,
      "runningDrivers": 0,
      "runningTasks": 0,
      "totalBlockedTime": "1.03d",
      "totalCpuTime": "0.00ns",
      "totalDrivers": 5013,
      "totalMemoryReservation": "0B",
      "totalPlanningTime": "374.19ms",
      "totalScheduledTime": "2.52h",
      "totalTasks": 99,
      "totalUserTime": "0.00ns"
    },
    "state": "FINISHED",
    "columns": [],
    "tablesUsed": [
      {
        "connectorId": "hive",
        "schema": "default",
        "table": "some_table"
      }
    ],
    "queryStarted": 1417020587000,
    "queryFinished": 1417020587001,
    "error": null,
  }
]

andykram · 2014-11-26T18:16:19Z

Some thoughts on the /api/queries/:id endpoint. I think DELETE /api/queries/:id should cancel a query. Are there any additional use cases for modifying a query (PATCH /api/queries/:id) other than canceling a query?

stefanvermaas · 2014-11-30T11:28:15Z

Sorry for the delayed answer :(..

I guess a PATCH request would be more appropriate, because you're actually not deleting the query, but just canceling it. The query must stay in the personal history of a person, until the user removes the query.

As @jamesmay said, it might be handy to delete queries, so if we want to implement that feature, we should use the DELETE request for this and the PATCH request for updating/canceling the query.

Thanks BTW for the JSON data :)! Looks awesome!

Just some thoughts on the /api/users/:id/permissions: You're already sending the results of this query inside the user object (/api/user), so I guess we could omit it from the API? What do you think?

And I have to say: Fantastic work ^^! It's looking great. I've reserved this week a couple of days to work on the final version for the new front-end application.

andykram · 2014-12-02T20:14:31Z

I think the issue here is that there are two different entities, a Query History Item and a Query Execution Item, and the API doesn't entirely distinguish between them. I think DELETE is fine for now, since the query history is essentially immutable for reasons such as auding. Additionally, it seems that Jersey (the Java REST framework we use) lacks support for PATCH out of the box, and I don't really want to add it in. Although it may be conceptually impure, I feel strongly that we should stick to DELETE, at least for now.

stefanvermaas · 2014-12-02T21:44:06Z

Yeah indeed I noticed when I was implementing the new API tonight.. Indeed we need different endpoints for executing and for saving. I forgot about this difference while working out the API endpoints.

What do you think? Should we keep /queries/execute for running the query just once, without saving? On the other hand; is executing really different to saving a query? What makes it different, because when we execute a query, we're saving it, but without name and description right? Or am I missing something?

Would it make a difference to use PUT instead of PATCH? Otherwise we definitely stick with DELETE.

andykram · 2014-12-02T23:15:43Z

When you "create a query" I view it as issuing a query that will start running. When it is completed, regardless of success or failure, a record of it being ran is created. When you "save a query" you create a separate entity that remembers the query for you and will display it on subsequent page loads. For these reasons I view the DELETE verb as appropriate, since you're deleting the run, but not actually deleting the record of it. Does this make more sense?

Currently you "create a query" via PUT /api/execute, but I think this should ideally be POST /api/queries. I think /api/query/saved for "saving a query" should probably be moved to /api/queries/saved, unless you have a better suggestion for this?

stefanvermaas · 2014-12-03T09:53:32Z

Ah this totally makes sense! Shouldn't we separate these API calls then? Maybe one for runs and one for queries? This way we can make it more restfull and I guess this could be more descriptive than now.

andykram · 2014-12-05T00:46:00Z

Sorry for the delayed response. I think we keep it as it is for now unless it's blocking you. We can always do something more traditionally RESTful once the project is out.

stefanvermaas added the question label Oct 20, 2014

stefanvermaas added enhancement and removed question labels Oct 20, 2014

stefanvermaas changed the title ~~Two requests for some data~~ API endpoints Oct 27, 2014

andykram mentioned this issue Nov 11, 2014

[api] API updates #14

Merged

andykram mentioned this issue Nov 26, 2014

[bugfix][cancel-query] Bugfixes and cancel query feature #22

Merged

spikebrehm mentioned this issue Feb 18, 2015

Remove updateQuery action & related code #42

Merged

andykram closed this as completed Mar 5, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API endpoints #11

API endpoints #11

stefanvermaas commented Oct 20, 2014

stefanvermaas commented Oct 20, 2014

stefanvermaas commented Oct 27, 2014

stefanvermaas commented Nov 3, 2014

stefanvermaas commented Nov 3, 2014

andykram commented Nov 5, 2014

andykram commented Nov 5, 2014

stefanvermaas commented Nov 6, 2014

andykram commented Nov 6, 2014

stefanvermaas commented Nov 6, 2014

jamesmay commented Nov 20, 2014

stefanvermaas commented Nov 22, 2014

stefanvermaas commented Nov 22, 2014

andykram commented Nov 26, 2014

andykram commented Nov 26, 2014

stefanvermaas commented Nov 30, 2014

andykram commented Dec 2, 2014

stefanvermaas commented Dec 2, 2014

andykram commented Dec 2, 2014

stefanvermaas commented Dec 3, 2014

andykram commented Dec 5, 2014

API endpoints #11

API endpoints #11

Comments

stefanvermaas commented Oct 20, 2014

stefanvermaas commented Oct 20, 2014

stefanvermaas commented Oct 27, 2014

stefanvermaas commented Nov 3, 2014

stefanvermaas commented Nov 3, 2014

andykram commented Nov 5, 2014

andykram commented Nov 5, 2014

stefanvermaas commented Nov 6, 2014

andykram commented Nov 6, 2014

stefanvermaas commented Nov 6, 2014

jamesmay commented Nov 20, 2014

stefanvermaas commented Nov 22, 2014

stefanvermaas commented Nov 22, 2014

andykram commented Nov 26, 2014

/api/queries[?results=n&table=y]

/api/user

/api/users/:id/permissions

/api/users/:id/queries?results=1

andykram commented Nov 26, 2014

stefanvermaas commented Nov 30, 2014

andykram commented Dec 2, 2014

stefanvermaas commented Dec 2, 2014

andykram commented Dec 2, 2014

stefanvermaas commented Dec 3, 2014

andykram commented Dec 5, 2014

`/api/queries[?results=n&table=y]`

`/api/user`

`/api/users/:id/permissions`

`/api/users/:id/queries?results=1`