Support for variables wrapped around RDD #79
Comments
This is being addressed.
I have just debugged the following nearly identical query (the file locations are updated and the json-file function is used). In the current state of master, this query is evaluated completely locally. Is anything else missing to fully address this issue?
@CanBerker what is the status: are the two counts now evaluated in parallel? Thanks!
The current result:
I remember testing this at the time and verifying that the local API was used to evaluate this query, since the let clause is capable of storing RDDs as variables. Unless I'm missing something, I think this issue is addressed.
Thanks! :)
In the medium to distant future, we will want to bind a variable not to a materialized sequence of items but to an "RDD wrapper" that acts as a proxy to Spark in local expressions. This requires adapting the code of the dynamic context.
Example:
let $a := json-text("hdfs://.../file.json")
let $b := json-text("hdfs://.../file2.json")
return { a: count($a), b: count($b) }
The above FLWOR expression is local: the let clauses are executed locally, but each one wraps the RDD returned by json-text as if it were a local value in a black box. Support for local FLWORs is therefore a prerequisite.
Note that this feature will be incompatible with FLWORs running on Spark: in that setting, only "materialized" dynamic contexts can be used inside RDD operations, because Spark forbids nesting.
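To make the idea of binding a variable to a wrapper rather than to materialized items more concrete, here is a minimal sketch in plain Python. It does not use Spark and is not the project's implementation: `LazySequence`, `DynamicContext`, and `fake_json_file` are hypothetical names invented for illustration, and the "RDD" is simulated by a function that produces items on demand. The point it shows is only that `count($a)` can be delegated to the wrapper without the dynamic context ever holding the items.

```python
from typing import Callable, Dict, Iterable, List, Union


class LazySequence:
    """Hypothetical stand-in for an "RDD wrapper": holds a recipe, not the items."""

    def __init__(self, source: Callable[[], Iterable[dict]]) -> None:
        self._source = source
        self.materialized = False  # set to True only if the items are forced

    def count(self) -> int:
        # Delegated aggregate: streams through the source without storing items.
        # With a real RDD, this is where the call would be handed to Spark.
        return sum(1 for _ in self._source())

    def items(self) -> List[dict]:
        # Fallback for expressions that genuinely need the items locally.
        self.materialized = True
        return list(self._source())


Binding = Union[List[dict], LazySequence]


class DynamicContext:
    """Maps variable names to either materialized items or a lazy wrapper."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Binding] = {}

    def bind(self, name: str, value: Binding) -> None:
        self._bindings[name] = value

    def lookup(self, name: str) -> Binding:
        return self._bindings[name]

    def count(self, name: str) -> int:
        value = self._bindings[name]
        if isinstance(value, LazySequence):
            return value.count()  # pushed down to the wrapper
        return len(value)  # ordinary materialized sequence


def fake_json_file(n: int) -> LazySequence:
    # Assumption: stands in for json-text(...); yields n objects on demand.
    return LazySequence(lambda: ({"id": i} for i in range(n)))


# Mirrors the shape of the example query: two let bindings, two counts.
ctx = DynamicContext()
ctx.bind("a", fake_json_file(3))
ctx.bind("b", fake_json_file(5))
result = {"a": ctx.count("a"), "b": ctx.count("b")}
print(result)
```

In this sketch the let clauses stay local and cheap: binding stores only the wrapper, and the work happens when an expression such as count consumes it. Whether the two counts could then run in parallel is a separate question that this sketch does not address.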