AQL: function to get collection by name? #1138

Closed
CoDEmanX opened this issue Dec 1, 2014 · 8 comments
Labels
1 Feature 3 AQL Query language related

Comments

@CoDEmanX
Contributor

CoDEmanX commented Dec 1, 2014

There is COLLECTIONS() to retrieve all collections, but I'm not aware of a function to get a single collection by name / id.

I would like the following to work:

FOR coll IN COLLECTIONS()
    FOR doc IN COLLECTION(coll.name)
        RETURN doc.attr

@jsteemann
Contributor

You are right, there is currently no such function in AQL.

@jsteemann jsteemann added 3 AQL Query language related 1 Feature labels Dec 1, 2014
@fceller fceller added this to the Backlog milestone Dec 2, 2014
@jsteemann
Contributor

Thinking a bit about this:
There are no such things as late-bound collections in AQL at the moment. It would be easy to add an AQL function that returns all documents (as a list) from a late-bound collection at once, but it would be neither efficient nor able to benefit from any optimizations.
I think what we would first need for this is an extra node type, like an EnumerateCollection node, but with a late-bound collection name.

@sbakiu

sbakiu commented Apr 10, 2015

Has this feature been implemented yet?
I need something like:

LET c = 'vertex'
FOR f IN FULLTEXT(c, 'text', 'keyword') RETURN f

but I get: [1542] invalid argument type in call to function 'AQL_FULLTEXT()' (while executing)

@jsteemann
Contributor

No, this is not implemented yet, because of the above concerns.

Note that you could use a bind parameter for the collection name, but that also requires the collection name to be known already when executing the query. So it won't be dynamic in the sense that the collection name is calculated from intermediate query results.

@jsteemann
Contributor

If all that is required is access to a single collection, but the collection name should not be hard-coded in the AQL query, then a bind parameter can be used as follows:

FOR f IN FULLTEXT(@@c, 'text', 'keyword') RETURN f

or

FOR f IN @@c FILTER f.something == 'foo' RETURN f

(with bind parameter @c containing the name for the collection)
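
A minimal sketch of how the bind parameter might be supplied from JavaScript. It assumes `db` is an arangosh-style database handle whose `db._query(query, bindVars)` returns a cursor with `toArray()`; the function name `findInCollection` and the `@value` parameter are hypothetical and only there for illustration. Note that the query text uses `@@c`, while the key in the bind variables object is `@c`.

```javascript
// Sketch only: run one AQL query against a collection chosen by the caller.
// Assumption: `db` behaves like the arangosh `db` object.
function findInCollection(db, collectionName, value) {
  const query = "FOR f IN @@c FILTER f.something == @value RETURN f";
  // "@@c" in the query text corresponds to the bind-variable key "@c".
  const bindVars = { "@c": collectionName, value: value };
  // The collection name is fixed before the query is compiled,
  // so it cannot depend on intermediate query results.
  return db._query(query, bindVars).toArray();
}
```

Called as `findInCollection(db, "vertex", "foo")`, this runs the second query above against the `vertex` collection.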

What also works in 2.8 is injecting the collection name into AQL functions like this:

LET c = 'vertex'
FOR f IN FULLTEXT(c, 'text', 'keyword') RETURN f

What will not work is something like this:

LET c = [ 'vertex', 'test' ]
FOR f IN c /* do something with f */

The above will simply iterate over an array with two strings, but not over an array with 2 collections. Having a function such as COLLECTION(name) as suggested would fix this, but this would require us to make execution plans fully run-time dependent.

The reason is that we need to create query execution plans directly after parsing the query, and at that point it would be unclear which collections will take part in the query. Parts of the execution plan may even be shipped around in a cluster to start the query, and making all of this dynamic, so that it happens at run-time while the query is running, would require a completely new implementation.

@KirilOkun

Any update on this since 2015? It would be very helpful to keep all of the logic in AQL rather than having to bring it out to js to get and bind a collection reference.

@jsteemann
Contributor

Sorry, no update on this since 2014/2015.
AQL requires all the collection names to be present when the query is compiled.

The execution of an AQL query works as follows:

  1. parse query
  2. create query execution plan
  3. optimize query execution plan
  4. distribute execution plan to cluster nodes
  5. execute query execution plan

Steps 2 to 5 need to know which collections are involved. For example, the optimizing step selects the indexes to use, which is obviously collection-dependent. The distribution step also needs some knowledge about where shards are actually located.
If the collection names varied inside the query at execution time, then the above processing pipeline could not be used for AQL queries. So honestly, I currently don't see any good way to achieve this other than to use the results from an initial query and pass them to another query.
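
The two-query workaround described above could be sketched roughly as follows. This is an illustration, not an official recipe: it assumes an arangosh-style `db` handle with `db._query(query, bindVars)` returning a cursor with `toArray()`, the function name `collectAttrFromAllCollections` is hypothetical, and skipping collections whose names start with `_` (system collections) is a choice made for this sketch. The `doc.attr` projection is taken from the original request.

```javascript
// Sketch only: emulate "FOR doc IN COLLECTION(coll.name)" with two query phases.
// Assumption: `db` behaves like the arangosh `db` object.
function collectAttrFromAllCollections(db) {
  // Phase 1: collection names are ordinary result data here.
  const names = db._query("FOR c IN COLLECTIONS() RETURN c.name").toArray();

  const results = [];
  for (const name of names) {
    // Skip system collections (names starting with "_"); a choice for this sketch.
    if (name.startsWith("_")) {
      continue;
    }
    // Phase 2: one query per collection. Each name is known before the
    // query is compiled, so planning and optimization work as usual.
    const docs = db
      ._query("FOR doc IN @@coll RETURN doc.attr", { "@coll": name })
      .toArray();
    for (const value of docs) {
      results.push(value);
    }
  }
  return results;
}
```

Each per-collection query gets its own execution plan, which is the trade-off: the logic no longer lives in a single AQL statement, but every individual query satisfies the requirement that collection names are known at compile time.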

@CoDEmanX
Contributor Author

CoDEmanX commented Nov 28, 2018 via email
