ForEach only returns 100 nodes #718

Closed
gokhanckgz opened this issue May 10, 2018 · 10 comments
@gokhanckgz

gokhanckgz commented May 10, 2018

When iterating over data with more than 100 nodes, the ForEach function only gets 100 of them.

g.V().ForEach(function(d) { g.Emit(d) })

and

g.V().All()

should be equivalent, I think. But g.Emit(d) inside ForEach always returns only 100 nodes.

Cayley version: 0.7.3, backend: MongoDB

@dennwc dennwc self-assigned this May 10, 2018
@dennwc dennwc added the bug label May 10, 2018
@dennwc dennwc added this to the v0.7.4 milestone May 10, 2018
@steffansluis

I think this is because of the limit query parameter defined here. Try sending the query to /api/v1/query/gizmo?limit=10000.
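For example, a minimal sketch in Go of sending that query over the v1 HTTP API (the address localhost:64210 is Cayley's default and only an assumption here; adjust it for your deployment):

// Sketch: POST the Gizmo query to the v1 HTTP endpoint with an explicit ?limit=10000.
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

func main() {
	query := `g.V().ForEach(function(d) { g.Emit(d) })`
	resp, err := http.Post(
		"http://localhost:64210/api/v1/query/gizmo?limit=10000",
		"text/plain",
		strings.NewReader(query),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // JSON response containing the emitted nodes
}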

@dennwc
Member

dennwc commented May 11, 2018

@steffansluis Still, it's unexpected that it affects ForEach but doesn't affect All.

@steffansluis

If that is indeed the case, then it is a bug. Although I'm not sure whether by "should be equal, I think" @gokhanckgz means that both only return 100 results or that one of them behaves differently.

@gokhanckgz
Author

Yes, they behave differently. In my case All returns all nodes (about 800), but ForEach does not iterate over all of them (it always iterates 100).

@gokhanckgz
Author

gokhanckgz commented May 11, 2018

I was using the v2 HTTP API. But with /api/v1/query/gizmo?limit=10000 and ForEach I can now iterate all nodes, just like you said, @steffansluis.

@dennwc dennwc closed this as completed in 924a8c2 Jun 3, 2018
@dennwc
Member

dennwc commented Jun 3, 2018

It was a bug: g.All() was not honoring the query limit. Now it also returns only the first 100 nodes. To iterate all nodes, set ?limit=-1.
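For example (again only a sketch, assuming the default HTTP address localhost:64210), disabling the limit for an All() query over the v1 API could look like this:

// Sketch: run a Gizmo query through Cayley's v1 HTTP API with the result limit disabled.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// runGizmo is a hypothetical helper: it POSTs the query text and returns the raw JSON response.
func runGizmo(addr, query string, limit int) (string, error) {
	u := fmt.Sprintf("http://%s/api/v1/query/gizmo?%s", addr,
		url.Values{"limit": {fmt.Sprint(limit)}}.Encode())
	resp, err := http.Post(u, "text/plain", strings.NewReader(query))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	// limit=-1 disables the server-side result limit, so All() returns every node.
	out, err := runGizmo("localhost:64210", "g.V().All()", -1)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}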

@mojoex

mojoex commented Jun 20, 2018

Why does g.All() limit results, and where is the default limit set? Shouldn't it default to returning all edges? This seems like an unnecessary change, given that the All() statement literally means "get me all things".
This was a breaking change in our application, where we have around 50 strongly typed queries using All().

@dennwc
Member

dennwc commented Jun 20, 2018

@mojoex Imagine executing All() on a large database with millions of nodes. Returning them all shouldn't be the default behavior. But if you know that the number of nodes is limited, just pass ?limit=-1 with your queries and it will disable the limit.

Note that this does not affect Go applications that use Cayley as a library.
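For reference, here is roughly what iterating over all nodes looks like when using Cayley as a Go library (a sketch against the 0.7.x API in the style of the project README; import paths and callback signatures may differ in later releases):

// Sketch: iterate every node via the Go path API; the HTTP query limit does not apply here.
package main

import (
	"fmt"
	"log"

	"github.com/cayleygraph/cayley"
	"github.com/cayleygraph/cayley/quad"
)

func main() {
	// In-memory store for the sketch; a real application would open its own backend.
	store, err := cayley.NewMemoryGraph()
	if err != nil {
		log.Fatalln(err)
	}
	store.AddQuad(quad.Make("alice", "follows", "bob", nil))
	store.AddQuad(quad.Make("bob", "follows", "carol", nil))

	// StartPath with no fixed nodes matches every node, like g.V() in Gizmo.
	p := cayley.StartPath(store)
	err = p.Iterate(nil).EachValue(nil, func(v quad.Value) {
		fmt.Println(quad.NativeOf(v))
	})
	if err != nil {
		log.Fatalln(err)
	}
}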

@iddan
Collaborator

iddan commented Sep 17, 2019

I think this is counter-intuitive.
Other systems handle large datasets differently:

  • SQL doesn't limit by default but some REPLs do.
  • BigQuery gives a memory warning if a query is executed without a limit.
  • Mongo returns the first batch of a cursor with a default batch size of 1000 but all results can be iterated seamlessly.
  • Most UIs paginate results.

Maybe the Cayley UI should paginate results, libraries should use a cursor, and HTTP should return all results by default?

@dennwc
Member

dennwc commented Sep 18, 2019

Yes, pagination is the way to go and #121 now tracks this.
