This repository has been archived by the owner on Jan 23, 2019. It is now read-only.

Chainable query class for FeatureService #33

Closed
patrickarlt opened this issue Jun 4, 2013 · 40 comments

Comments

@patrickarlt
Contributor

I was talking with @mpriour about this. It would be great to have a class that abstracts the complexities of querying a feature service.

Here was the example I gave @mpriour.

Query().within(extent).where({"city": "Portland"})

We could also hang this off FeatureService.

query = new client.FeatureService.Query().within(extent).where({"city": "Portland"});

myFeatureService.query(query, callback);
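
A minimal sketch of how such a chainable class might work, assuming each method simply records its argument and returns `this`. The `Query`, `within` and `where` names follow the example above; `toJSON` is a hypothetical helper added only to inspect the accumulated state, and the extent values are made up.

```javascript
// Hypothetical sketch: each method stores its input and returns `this`
// so that calls can be chained.
class Query {
  constructor() {
    this.geometry = null;
    this.conditions = null;
  }

  // Record the geometry to search within.
  within(geometry) {
    this.geometry = geometry;
    return this;
  }

  // Record attribute conditions as a plain object.
  where(conditions) {
    this.conditions = conditions;
    return this;
  }

  // Hypothetical helper for inspecting the accumulated state.
  toJSON() {
    return { geometry: this.geometry, where: this.conditions };
  }
}

// Illustrative extent, not taken from a real service.
const extent = { xmin: -123, ymin: 45, xmax: -122, ymax: 46 };
const query = new Query().within(extent).where({ city: "Portland" });
console.log(JSON.stringify(query));
```

Nothing is sent to the server until the finished query object is handed to something like `myFeatureService.query(query, callback)`, which keeps the builder itself free of network code.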
@chelm
Contributor

chelm commented Jun 4, 2013

This would be sweet. Makes total sense

@nixta
Member

nixta commented Jun 4, 2013

Agreed, would be brilliant.

Is there a standardised/conventional way to express where clauses like this? I'm thinking wildcards, AND vs OR, etc.?

@patrickarlt
Contributor Author

@nixta I don't think so. You could do something like this:

Query().within(extent).where({
  "city": "Portland",
  "score": "> 50",
  "type": "Food Cart"
})

I think each key/value pair in the "where" could be joined by an "AND". We might need to think of a way to do "OR"s though. .where() could also just take a string directly.

Query().within(extent).where("WHERE city IS portland AND score > 50 AND type IS food cart").

^^ Probably not valid SQL
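
A rough sketch of how the object form could be turned into a where string, joining each key/value pair with AND. The `buildWhere` name is hypothetical. It assumes that a string value starting with a comparison operator (such as "> 50") is passed through as operator plus operand, and that any other string is compared with `=` and single-quoted. The operand after an operator is not escaped here, so real code would need to validate it.

```javascript
// Hypothetical helper: turn a conditions object into a SQL-style where string.
function buildWhere(conditions) {
  // Matches a leading comparison operator followed by its operand.
  const operator = /^(<=|>=|<>|<|>|=)\s*(.+)$/;
  return Object.keys(conditions)
    .map((field) => {
      const value = conditions[field];
      const match = typeof value === "string" ? value.match(operator) : null;
      if (match) {
        // A value like "> 50" keeps its own operator.
        return `${field} ${match[1]} ${match[2]}`;
      }
      if (typeof value === "number") {
        return `${field} = ${value}`;
      }
      // Quote strings and escape embedded single quotes by doubling them.
      return `${field} = '${String(value).replace(/'/g, "''")}'`;
    })
    .join(" AND ");
}

console.log(buildWhere({ city: "Portland", score: "> 50", type: "Food Cart" }));
// prints: city = 'Portland' AND score > 50 AND type = 'Food Cart'
```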

@andygup
Member

andygup commented Jun 4, 2013

I have a suggestion. Extents are challenging to work with and figure out, especially for non-GIS and newbie GIS developers. What about a radius based search?

@patrickarlt
Contributor Author

@andygup that sounds great. I think within could accept any geometry; extent was just an example. I don't see why we couldn't also do this:

Query().within({
  x: -122.6764,
  y: 45.5165,
  radius: 500
}).where({
  "city": "Portland",
  "score": "> 50",
  "type": "Food Cart"
})

@andygup
Member

andygup commented Jun 4, 2013

Nice. Having that as an option would be awesome.

@mpriour
Member

mpriour commented Jun 4, 2013

For OR I would have a chainable or method that you could include after where, e.g.:

.where({ city: "Portland" }).or({ state: "TX" })

@patrickarlt
Contributor Author

@mpriour that looks great to me.

For clarity since the email stripped it

.where ( { city: "Portland" }).or({ state:"TX"})

@mpriour
Member

mpriour commented Jun 4, 2013

I also think a convention of using an array for a value within a 'where' or
'or' clause would create an OR for those values


@patrickarlt
Contributor Author

Like

.where ({
  state:["TX", "OR"]
})

@mpriour
Member

mpriour commented Jun 4, 2013

Yep


@patrickarlt
Contributor Author

Ok looks like I get to tackle this and try to get it working in time for UC.

@nixta
Member

nixta commented Jun 4, 2013

I found a couple of things that might be useful.

nodejsdb opts for chaining at the operator level. So you'd get Query().where("city = 'Portland'").or("state = 'TX'") but also allows a second parameter of an array of values to inject into the string like Query().where("city = '?' OR state = '?'", ['Portland','TX']).

Also some interesting ideas in Fluent Query Builder but some strange ones too.

@patrickarlt and @mpriour Agree re: array (see jsondb's last where example) but it would translate to an IN statement:

e.g. state IN ('TX', 'OR')
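
A small sketch of the array convention, assuming an array value is translated to an IN list. The `inClause` and `quote` names are hypothetical, and string values are single-quoted with embedded quotes doubled.

```javascript
// Hypothetical helper: quote a single value for use in a where string.
function quote(value) {
  if (typeof value === "number") {
    return String(value);
  }
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Hypothetical helper: translate an array of values into an IN clause.
function inClause(field, values) {
  if (!Array.isArray(values) || values.length === 0) {
    throw new Error("inClause expects a non-empty array of values");
  }
  return `${field} IN (${values.map(quote).join(", ")})`;
}

console.log(inClause("state", ["TX", "OR"]));
// prints: state IN ('TX', 'OR')
```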

@nixta
Member

nixta commented Jun 4, 2013

Don't want to hijack the conversation but… :) @andygup The radius search would need more server-side help - still need something like Query().closest(200).within({x,y,radius}), which (I think?) really can only be done efficiently at the server side. @patrickarlt I think you piped up on a DevStorm email a while back about that. I still agree that we need it. Would be a powerful part of the query chain too.

@ungoldman

Chaining AND and OR operators as functions might get hairy very quickly -- each function in the chain would have to be aware of the preceding function, we'd have to think about operator precedence as well, for example:

Query().within(extent).and(extent).where({"city": "Portland"}).or({"state": "TX"}).and({"type": "foo"})

@andygup
Member

andygup commented Jun 4, 2013

@nixta it's easy to generate the radius-based polygon geometry on the client. No need for a server round trip.

@nixta
Member

nixta commented Jun 4, 2013

@ngoldman You wouldn't have within(extent).and(extent2). You would have to determine your geometry union before the query, I should think.

Also, I think you would have to insist that the above be where({"city":"Portland"}).or("state = '?' AND type='?'", ["TX", "foo"]) if we were to adopt anything like the nodejsdb approach. But yes, it gets messy quickly.

@ungoldman

https://github.com/brianc/node-sql might be worth a look, they're building a "sql string builder for node" that supports chainable operators. They even have support for patterns like this:

.where(
  user.name.equals('boom').and(user.id.equals(1))
).or(
  user.name.equals('bang').and(user.id.equals(2))
)

@patrickarlt
Contributor Author

@ngoldman you can't have more than one geometry in a query. Look at the spatialRel, geometry, and geometryType properties on http://resources.arcgis.com/en/help/arcgis-rest-api/#/Query_Feature_Service/02r3000000w5000000/ so .and(extent) would probably just throw an error.

Looking at the query docs, we would have .within(), .intersects(), .contains(), .crosses(), .overlaps(), .touches(), and only one of them could be used per query to define a geometry-based query. If you tried to pass a geometry into where or or, we would just throw an error.

@nixta it wasn't me but that would be cool to do in a future release.

I think nodejsdb is harder to read than raw SQL. The fluent query building stuff looks a little like what we are doing here, but in PHP. I think we need to go for something that feels really good in JavaScript.
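
The one-geometry-per-query rule described above could be sketched like this. The `SpatialQuery` class and its `params` property are hypothetical. The `spatialRel` strings are the values used by the ArcGIS REST query operation. Setting `geometryType` is left out to keep the sketch short.

```javascript
// Chainable method names mapped to ArcGIS REST spatialRel values.
const SPATIAL_RELS = {
  within: "esriSpatialRelWithin",
  intersects: "esriSpatialRelIntersects",
  contains: "esriSpatialRelContains",
  crosses: "esriSpatialRelCrosses",
  overlaps: "esriSpatialRelOverlaps",
  touches: "esriSpatialRelTouches",
};

// Hypothetical query builder that accepts exactly one geometry.
class SpatialQuery {
  constructor() {
    this.params = {};
  }
}

// Generate one chainable method per spatial relationship.
for (const [method, spatialRel] of Object.entries(SPATIAL_RELS)) {
  SpatialQuery.prototype[method] = function (geometry) {
    if (this.params.geometry !== undefined) {
      // A second geometry method on the same query is an error.
      throw new Error("A query can only have one geometry");
    }
    this.params.geometry = geometry;
    this.params.spatialRel = spatialRel;
    return this;
  };
}
```

With this shape, `new SpatialQuery().within(a).intersects(b)` throws on the second call instead of silently overwriting the first geometry.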

@patrickarlt
Contributor Author

@nixta I think what @andygup was talking about is generating a quick circle on the client and using the resulting polygon for the query.

@andygup
Member

andygup commented Jun 4, 2013

@nixta @patrickarlt yep, simple radius search is the use case. Just drawing a basic circle polygon in either wgs84 or web mercator and using that as the input geometry for the query.
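
A minimal sketch of drawing such a circle on the client, assuming the center and radius are already in a projected coordinate system such as web mercator (so the radius is in map units, which are distorted away from the equator). The `circlePolygon` name and the `segments` parameter are hypothetical. The result uses the Esri JSON polygon shape with a clockwise, closed outer ring.

```javascript
// Hypothetical helper: approximate a circle as an Esri JSON polygon.
function circlePolygon(x, y, radius, segments = 64) {
  const ring = [];
  for (let i = 0; i < segments; i++) {
    // A negative angle step yields clockwise order for the outer ring.
    const angle = (-2 * Math.PI * i) / segments;
    ring.push([x + radius * Math.cos(angle), y + radius * Math.sin(angle)]);
  }
  // Close the ring by repeating the first point.
  ring.push(ring[0].slice());
  return { rings: [ring], spatialReference: { wkid: 102100 } };
}

// Illustrative coordinates only, roughly web mercator units.
const circle = circlePolygon(-13656274, 5703158, 500);
console.log(circle.rings[0].length);
```

More segments give a closer approximation of a true circle at the cost of a longer geometry parameter in the request.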

@JerrySievert
Contributor

my fear is that someone will expect a circle search from a point and radius where we would actually be building a bounding box and searching on that.

@patrickarlt
Contributor Author

@JerrySievert you can search on any geometry, so we could search on the resulting polygon itself, not its bounding box. See the geometry param on http://resources.arcgis.com/en/help/arcgis-rest-api/#/Query_Feature_Service/02r3000000w5000000/

@nixta
Member

nixta commented Jun 4, 2013

@patrickarlt @andygup OK. If that's the use case, then yes, absolutely.

I was thinking more about the reason one would do a query like that. You have a focal point and want stuff nearby. It kind of pokes a hole in FeatureService queries. Combined with the feature limit on single queries on a FeatureService, that can get inaccurate as soon as the outline gets biggish compared with your data density. I.e. if there are 2000 features within the radius you've requested but the FeatureService only returns 1000 per query, you won't get the 1000 closest, you'll get whatever comes back first in the R-Tree (probably). Again, true for any query, but I was thinking of a different use case. Like I said, I feel I'm hijacking this conversation as this is a side issue to do with Server.

@JerrySievert I think we can search within the radius fairly well (which is what @andygup meant). No need to degrade to bounding box.

@patrickarlt
Contributor Author

@nixta is it impossible to page through a feature service query? I don't see any options to do it on the docs page.

It sucks that we can never get all the features for a query and that this is a limitation of feature services.

@andygup
Member

andygup commented Jun 4, 2013

@nixta I'm fairly certain when you create the feature service you can bump up the number of results that can be returned. I don't have any links handy that I can post on it.

@patrickarlt would you really want to return all features? Could cause performance issues for browser-based apps, especially mobile ones.

@JerrySievert
Contributor

@andygup web workers help deal with some of those performance issues. and don't forget, this is ostensibly a library for node that "just happens" to have a browser build - at least until we rename it something like arcgis-sdk-javascript :)

@nixta
Member

nixta commented Jun 4, 2013

@patrickarlt Correct (AFAIK, though I'll be over the moon to be wrong). I suppose you could use the returnCountOnly query parameter and then make multiple requests to the service using some client-side breakdown of the desired search geometry until you've retrieved all the unique records, but that sounds painful.

@andygup You would at least want a way to get them all even if it was up to you to page through them to build the complete set. As it stands, you don't even know that you've hit the limit and you haven't got them all. And yes, I think you can up the return count when you create the feature service (although IIRC current advice is that you shouldn't do it without good reason).

EDIT: See my response below - you can page (from the client) and you can tell you've hit the limit in 10.1 services upwards.

@nixta
Member

nixta commented Jun 4, 2013

Ah! I was wrong(ish). You can do paging, but it has to be client-side paging. So, you make a request using returnIdsOnly and then make multiple requests for subsets of the unique ObjectIds. That is, there is no concept of pages on the server side (which makes sense from a session-less REST perspective). See the documentation on returnIdsOnly.

However, @patrickarlt, I don't see that a FeatureService declares what its max feature count is, so one would have to know that on a per-service basis.

@patrickarlt
Contributor Author

@andygup I want all the features because clients are powerful enough now to index 10000k+ points. clustering and indexing on the client is very fast, DOM manipulation is the limiting factor but can be worked around.

Also since this is for node you might want all the features to do your own processing on them.

@nixta
Member

nixta commented Jun 5, 2013

OK. I was extra-wrong. As of 10.1, you can tell if your query would have returned more records than the FeatureService allows. You will get exceededTransferLimit returned with your feature set: http://resources.arcgis.com/en/help/rest/apiref/query.html

Also as of 10.1, you can read maxRecordCount from the FeatureService: http://resources.arcgis.com/en/help/rest/apiref/featureserver.html

@patrickarlt
Contributor Author

Great, so we know that we exceeded the limit, what the limit is, and how much we exceeded it by. But we still can't figure out how to make another request to get ALL the features.

@nixta
Member

nixta commented Jun 5, 2013

Seems simple enough. I see two possible options:

  1. Make a first request using the full query and returnIdsOnly = true. This will give you a full set of result Ids. Then make a series of queries passing X Ids at a time where X is maxRecordCount until you've passed all the Ids you got back from the returnIdsOnly request. Could just be one more request. Min Requests: 2. Max Requests: (Ceiling TotalResults/maxRecordCount) + 1.

  2. Make a first request using the full query and returnIdsOnly = false (or don't set it - false is the default). This may give you a full set of results. If it's not the full set, it'll also return exceededTransferLimit and then you know you need to make more requests. Then make a request with returnIdsOnly and remove from the returned set of Ids the Ids from the original request. The remaining Ids are the ones you want to request with additional calls (may just be one more) as in option 1). Min Requests: 1. Max Requests: (Ceiling TotalResults/maxRecordCount) + 1.

Option 2 seems better unless I've missed something.

All dependent on the service being 10.1 or greater of course.

Update: Apparently, on a pre-10.1 service we should use option 1 and request batches of 500 objectIds at a time.
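
Option 1 can be sketched roughly as below. This is an illustration under assumptions, not a tested client: `runQuery(params)` is a hypothetical function that sends the query parameters to the service and resolves with the parsed JSON response, and `chunkIds` and `fetchAllFeatures` are made-up names. It relies on the returnIdsOnly response carrying an `objectIds` array and on each feature response carrying a `features` array.

```javascript
// Split an array of object IDs into batches of at most `size`.
function chunkIds(ids, size) {
  const batches = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}

// Option 1: fetch every matching ID first, then request features in batches.
// `runQuery` is a hypothetical transport function supplied by the caller.
async function fetchAllFeatures(runQuery, baseParams, maxRecordCount) {
  const idResult = await runQuery({ ...baseParams, returnIdsOnly: true });
  // Treat a missing or null ID list as "no results".
  const ids = idResult.objectIds || [];
  const features = [];
  for (const batch of chunkIds(ids, maxRecordCount)) {
    const page = await runQuery({
      objectIds: batch.join(","),
      outFields: "*",
      returnGeometry: true,
    });
    features.push(...page.features);
  }
  return features;
}
```

The number of calls is one for the IDs plus one per batch, which matches the (Ceiling TotalResults/maxRecordCount) + 1 upper bound given above. For a pre-10.1 service the batch size would be passed in explicitly (for example 500) rather than read from maxRecordCount.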

@nixta
Member

nixta commented Jun 5, 2013

Oh, to clarify, you would pass the objectIds parameter to the query (in this case I'm passing Object IDs 1,2,3,4): http://services.arcgis.com/OfH668nDRN7tbJh0/ArcGIS/rest/services/Philadelphia_Healthy_Corner_Stores/FeatureServer/0/query?objectIds=1%2C2%2C3%2C4&outFields=*&returnGeometry=true

Hope this is making some sense.
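
A short sketch of building that kind of request URL, assuming only the three parameters shown in the link above. The `buildObjectIdsUrl` name is hypothetical; `URLSearchParams` handles the comma encoding.

```javascript
// Hypothetical helper: build a query URL for a batch of object IDs.
function buildObjectIdsUrl(layerUrl, objectIds) {
  const params = new URLSearchParams({
    objectIds: objectIds.join(","),
    outFields: "*",
    returnGeometry: "true",
  });
  return `${layerUrl}/query?${params}`;
}

const layerUrl =
  "http://services.arcgis.com/OfH668nDRN7tbJh0/ArcGIS/rest/services/Philadelphia_Healthy_Corner_Stores/FeatureServer/0";
console.log(buildObjectIdsUrl(layerUrl, [1, 2, 3, 4]));
```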

@mpriour
Member

mpriour commented Jun 5, 2013

@nixta - those 2 paging options are the same ones that I found when I last researched it as well.
However, the default max records for pre 10.1 (or at least 9.3, 9.4) is 1,000

@patrickarlt
Contributor Author

What is the reason to limit queries to 1000? Technically you could set it to whatever you want when you create a service so why limit it?

@mpriour
Member

mpriour commented Jun 5, 2013

@patrickarlt It's a server setting. It helps prevent people from swamping the server with excessively large responses, etc...
Very annoying though

Basically in pre 10.1, if you truly want ALL the features, you HAVE to do a returnIdsOnly query and then page through the results. Best case scenario you do 2 requests. @nixta already wrote down the worst case formula.

@nixta
Member

nixta commented Jun 5, 2013

@mpriour Yep. Thanks. I was a little surprised I was given the number 500 too. Have asked for clarification.

@ajturner
Member

I'm curious where this ended up. Seems to have derailed on pagination.

I'm interested in the original topic of a chainable, relational algebra, type of api. Very similar to Arel.

This type of API would be easier to work with and also enable currying. Anyone want to help move it forward?

9 participants