Fewer than total number of rows returned #21
Hmm works locally: |
Oddly:
|
@dmfenton koop.dc is running 0.1.2 of koop-socrata. I'll PR making it show up in the status |
Anddd, even after the file is served my node process seems to get permanently pegged. Lots of postgres action too. I guess that's the geohash indexing on Opendata-koop? |
Now it's just 504's at http://koop.dc.esri.com/socrata/seattle/3k2p-39jp.csv |
@dmfenton geohash is very low impact, I doubt that has any impact here... |
I've implemented a new way to page large datasets like this. However, we are not currently using externally managed queues, so paging over Socrata data is held only in memory and is not very stable. If a koop process dies (because of an issue with another provider, or for any other reason), the paging state is lost and the dataset gets stuck in a processing state. To actually use this provider in production we'll need a request worker strategy that is more durable and persistent. |
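For anyone following along, here is a rough sketch of what paging a large Socrata dataset can look like, using SODA's `$limit` and `$offset` query parameters. This is not the code from the provider: the function name `buildPageUrls`, the example resource URL, and the page size are all assumptions for illustration.

```javascript
// Hypothetical helper: build one request URL per page of a SODA resource.
// SODA pages results with the $limit and $offset query parameters.
function buildPageUrls(resourceUrl, totalRows, pageSize) {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const urls = [];
  for (let offset = 0; offset < totalRows; offset += pageSize) {
    const url = new URL(resourceUrl);
    url.searchParams.set("$limit", String(pageSize));
    url.searchParams.set("$offset", String(offset));
    urls.push(url.toString());
  }
  return urls;
}

// Example (placeholder host and dataset id, not a real endpoint):
// 5830 rows at 1000 rows per page needs six requests.
const exampleUrls = buildPageUrls(
  "https://example.org/resource/abcd-1234.json",
  5830,
  1000
);
console.log(exampleUrls.length);
```

The list of URLs lives only in process memory here, which is exactly the durability problem described above.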
Is there a way to abstract the process so that it's easier for other providers to tap in to worker resources? |
@dmfenton potentially |
We could create a centralized request worker that takes an array of page URLs, makes the requests, and inserts the data into tables without knowing what specific type of work it is doing. |
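A minimal sketch of that idea, assuming the provider hands the worker two functions of its own: `fetchPage(url)` to request and parse one page, and `insertRows(rows)` to write to its table. The names `runPageWorker`, `fetchPage`, and `insertRows` are hypothetical and not part of koop. The queue here is still a plain in-memory array, so this only illustrates the provider-agnostic shape; a durable version would keep the queue in an external store.

```javascript
// Hypothetical provider-agnostic worker: it only knows about URLs and two
// callbacks supplied by the provider, not what kind of data it is moving.
async function runPageWorker(pageUrls, fetchPage, insertRows, concurrency = 2) {
  const queue = pageUrls.slice(); // in-memory only; lost if the process dies
  const failed = [];
  let inserted = 0;

  // Each drain loop pulls URLs off the shared queue until it is empty.
  async function drain() {
    while (queue.length > 0) {
      const url = queue.shift();
      try {
        const rows = await fetchPage(url);
        await insertRows(rows);
        inserted += rows.length;
      } catch (err) {
        // Record the failure so the caller can retry or report it.
        failed.push({ url, message: err.message });
      }
    }
  }

  const loops = Array.from({ length: Math.max(1, concurrency) }, () => drain());
  await Promise.all(loops);
  return { inserted, failed };
}
```

Returning the failed URLs instead of throwing lets a caller decide whether to retry them or mark the dataset as incomplete rather than leaving it stuck in a processing state.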
I had trouble getting this particular dataset from socrata to cache locally. Are there rate limits that I am not aware of @dmfenton ? |
Throttling and Application Tokens: Hold on a second! Before you go storming off to make the next great open data app, you should understand how SODA handles throttling. You can make a certain number of requests without an application token, but they come from a shared pool and you’re eventually going to get cut off. If you want more requests, register for an application token (http://dev.socrata.com/register) and your application will be granted up to 1000 requests per rolling hour period. If you need even more than that, special exceptions are made by request. Use the Help! tab on the right of this page to file a trouble ticket.
On Tue, May 19, 2015 at 12:13 PM -0700, "Christopher Helm" <notifications@github.com> wrote: I had trouble getting this particular dataset from socrata to cache locally. Are there rate limits that I am not aware of @dmfenton? |
so we probably need to add support for sending an app key/token with requests - this would be a config param |
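Roughly, that could look like the sketch below. SODA accepts an application token in the `X-App-Token` request header; the config key `appToken` and the helper name `buildSodaRequest` are made up here and are not existing koop-socrata options.

```javascript
// Hypothetical helper: attach a Socrata app token from config, if present.
// Without a token the request still works but draws on the shared pool.
function buildSodaRequest(resourceUrl, config = {}) {
  const headers = { Accept: "application/json" };
  if (config.appToken) {
    headers["X-App-Token"] = config.appToken;
  }
  return { url: resourceUrl, headers };
}

// Example with placeholder values (not a real endpoint or token):
const exampleRequest = buildSodaRequest(
  "https://example.org/resource/abcd-1234.json",
  { appToken: "YOUR_APP_TOKEN" }
);
console.log(exampleRequest.headers);
```

Keeping the token optional means existing deployments without the config param keep behaving as they do today.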
@chelm I am having the same issue here. When I used to access this: http://koop.dc.esri.com/socrata/wastate/9ubz-5r4b/FeatureServer/0/query?where=1=1 I would get over 5000 records and now it is limited to 1000. Is there something I need to do to make it get all of the records? |
ahh so the FeatureServices respect the maxRecordCount. I'm surprised it ever returned more than 1000 features though. The koop-socrata provider needs to support setting a limit and offset before it goes to the DB to get the data. koop-agol does this https://github.com/Esri/koop-agol/blob/master/controller/index.js#L585-L586 |
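As a sketch of what "setting a limit and offset" might mean here: the FeatureServer query parameters `resultRecordCount` and `resultOffset` can be turned into a limit and offset for the DB query, capped at the layer's `maxRecordCount`. The function name `pagingFromQuery` and the default of 1000 are assumptions, and this is not the koop-agol code linked above.

```javascript
// Hypothetical helper: derive DB paging options from a FeatureServer query.
// Invalid or missing values fall back to the first page of maxRecordCount rows.
function pagingFromQuery(query, maxRecordCount = 1000) {
  const requested = Number.parseInt(query.resultRecordCount, 10);
  const offset = Number.parseInt(query.resultOffset, 10);
  return {
    limit:
      Number.isInteger(requested) && requested > 0
        ? Math.min(requested, maxRecordCount)
        : maxRecordCount,
    offset: Number.isInteger(offset) && offset > 0 ? offset : 0,
  };
}

// Example: a client asking for the third page of 1000 features.
console.log(pagingFromQuery({ resultOffset: "2000", resultRecordCount: "1000" }));
```

With this in place a client can walk a 5830-row service in six requests (offsets 0 through 5000) instead of only ever seeing the first 1000.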
FYI @sirws I just added this locally and it works great. I'll PR it. |
I implemented the PR and it does not seem to work at all for me. Thu, 28 May 2015 19:02:48 GMT express deprecated res.send(body, status): Use res.status(status).send(body) instead at node_modules\koop-socrata\controller\index.js:54:17 |
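Side note on that log line: it is a deprecation warning from Express rather than an error, so on its own it should not stop the route from responding. The replacement it suggests looks like the sketch below; the helper name `sendWithStatus` is hypothetical and the actual code at controller/index.js:54 is not shown in this thread.

```javascript
// Hypothetical helper showing the non-deprecated Express 4 call order.
function sendWithStatus(res, status, body) {
  // Deprecated form: res.send(body, status)
  return res.status(status).send(body);
}
```

In a route handler this would be called as `sendWithStatus(res, 404, { error: "not found" })` in place of the two-argument `res.send`.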
@sirws hmmm how would that happen? What URL are you trying? |
Weird, http://geodata.wa.gov/koop/socrata/wa/9ubz-5r4b/FeatureServer/0 seems to be working now. But it is still only returning 1000 records: http://geo.wa.gov/datasets/405e3ffff86b4de48bb4ade6b57c8054_0?filterByExtent=false&uiTab=table How do I get it to return more records? |
@sirws I think the service only has 1000 features in the DB... that is odd http://geodata.wa.gov/koop/socrata/wa/9ubz-5r4b/FeatureServer/0/query?returnCountOnly=true -> 1000 Did you drop the cache? |
Ok, I thought I dropped the cache. It is now saying 5830, but the OD app is still saying 1000. Do I need to reindex on the opendata site? |
@sirws will you try re-indexing it? |
I kicked off the re-index. It could take a while. |
FYI @sirws, you can reindex individual datasets. But indeed this could take a while. |
But there should be 5830 |
http://geodata.wa.gov/koop/socrata/wa/9ubz-5r4b |
Something goofy is going on with my indexes. I check them and they show 5830, then 1148. Not sure what is going on. I dropped them again and it is now showing 5830. And I can download them all, but the OD app says there are only 1000 records. http://geo.wa.gov/datasets/405e3ffff86b4de48bb4ade6b57c8054_0?filterByExtent=false&uiTab=table |
Well the OD app is not going to update until reharvest. And we’re having some delays with that right now. |
Ok. I will let that go for a while then. Will check later on. |
I'm going to close this. Locally I get the correct results and the original bug has been fixed. |
-> 1000 rows
cc @pholleran