sequences, using subqueries #523
There will be a subsequent commit to make the server and worker control structures handle this well.
Switch from long queryID to QueryTaskId taskId, as prep work for subqueries. This commit also refactors some logging statements to not construct strings, but rather use the printf-style formatting that is more efficient. The LOGGER.ifBlahEnabled() checks are also removed when they are not more efficient (i.e., the logging command does no complex operations and hence no work). Also remove QUERY_PAUSE and QUERY_REMOVE from Myria, as these are never called. In particular, the IPC utility to generate a pause message from the server is never called.
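A minimal sketch of why format-string logging can make the enabled-check unnecessary. `TinyLogger` and `CountingTask` are hypothetical stand-ins written for this illustration, not the logging library Myria uses; the point is only that concatenation builds the string before the call, while a format string defers that work until the logger knows the level is enabled.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyLoggingSketch {
  /** Hypothetical logger: only records (and formats) when debug is enabled. */
  static final class TinyLogger {
    private final boolean debugEnabled;
    final List<String> lines = new ArrayList<>();

    TinyLogger(boolean debugEnabled) {
      this.debugEnabled = debugEnabled;
    }

    /** Takes an already-built message; the caller has paid for building it. */
    void debugMessage(String message) {
      if (debugEnabled) {
        lines.add(message);
      }
    }

    /** Takes a format string; arguments are rendered only if debug is enabled. */
    void debugFormat(String format, Object... args) {
      if (debugEnabled) {
        lines.add(String.format(format, args));
      }
    }
  }

  /** Hypothetical argument whose toString() counts how often it is rendered. */
  static final class CountingTask {
    int toStringCalls = 0;

    @Override
    public String toString() {
      toStringCalls++;
      return "task-1";
    }
  }

  public static void main(String[] args) {
    TinyLogger logger = new TinyLogger(false); // debug disabled
    CountingTask task = new CountingTask();

    logger.debugMessage("Finished " + task); // concatenation renders the task anyway
    System.out.println(task.toStringCalls);

    logger.debugFormat("Finished %s", task); // nothing is rendered while disabled
    System.out.println(task.toStringCalls);
  }
}
```

With debug disabled, the first call still renders the task once and the second call does not render it at all, so a surrounding enabled-check buys nothing when the arguments are cheap to pass.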
This enables operators to advertise that they read or write relations.
always require a QueryStatusEncoding; make users construct one if they want hand coded strings.
If not started, return null for elapsed. If started and not finished, return elapsed since start.
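The rule above could be sketched as follows. `QueryTiming` and its fields are hypothetical names for illustration, and the finished case (elapsed from start to finish) is an assumption, since the commit message only states the not-started and still-running cases.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical holder for a query's timing; not the actual Myria class. */
final class QueryTiming {
  private final Instant start;  // null until the query starts
  private final Instant finish; // null until the query finishes

  QueryTiming(Instant start, Instant finish) {
    this.start = start;
    this.finish = finish;
  }

  /**
   * Returns null if the query has not started. Otherwise returns the time from
   * start to finish, or from start to `now` if the query is still running.
   */
  Duration elapsed(Instant now) {
    if (start == null) {
      return null;
    }
    Instant end = (finish != null) ? finish : now; // assumed behavior once finished
    return Duration.between(start, end);
  }

  public static void main(String[] args) {
    Instant t0 = Instant.parse("2014-05-27T00:00:00Z");
    Instant now = t0.plusSeconds(5);
    System.out.println(new QueryTiming(null, null).elapsed(now));           // null
    System.out.println(new QueryTiming(t0, null).elapsed(now));             // PT5S
    System.out.println(new QueryTiming(t0, t0.plusSeconds(2)).elapsed(now)); // PT2S
  }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the sketch deterministic and easy to test.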
It actually does the right thing with enums.
Option 1: plans and options come in at the same time (QueryEncoding)
Option 2: plans and options come in separately (Fragments, then options)
So we can turn a JSON string into a DatasetStatus if need be.
…ponse And then use this to extract query id rather than hard coding it
It's not used at all by the OperationFuture classes, and it was implemented wrong by the AttachmentableAdaptor.
- refactor the way the server manages queries into QueryState, QueryTask
- add MetaTask operators for Fragment, Sequence, JsonFragment
- make the server issue multiple tasks one after the other. That is, it submits a task and then gets the next one. If null, success, otherwise it submits the next one and repeats.
- refactor a bunch of work around how queries are submitted or exposed to the Catalog
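The "submit a task, then get the next one; if null, success" loop described above might look roughly like this. The `QueryState` class here is a hypothetical stand-in that only hands out task names from a queue; it borrows the name from the commit message but does not reflect the real class's API, and "submitting" is reduced to recording the task.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class SequentialSubmitSketch {
  /** Hypothetical stand-in for a query that hands out its tasks one at a time. */
  static final class QueryState {
    private final Deque<String> pending;

    QueryState(List<String> tasks) {
      this.pending = new ArrayDeque<>(tasks);
    }

    /** Returns the next task to run, or null when nothing is left. */
    String nextTask() {
      return pending.poll();
    }
  }

  /** Runs tasks strictly one after another and returns them in execution order. */
  static List<String> runToCompletion(QueryState query) {
    List<String> executed = new ArrayList<>();
    String task = query.nextTask();
    while (task != null) {
      executed.add(task);      // stand-in for submitting the task and waiting for it
      task = query.nextTask(); // a null result means the whole query succeeded
    }
    return executed;
  }

  public static void main(String[] args) {
    QueryState query = new QueryState(Arrays.asList("store A", "scan A, store B"));
    System.out.println(runToCompletion(query)); // [store A, scan A, store B]
  }
}
```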
This is a refactoring from the old OperationFuture to Google's ListenableFuture. The goal is to have simpler code and not reinvent the wheel. I believe, at least for this particular task, this future is fully functional with very little code.
In reality, this is dead code though, I think.
Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>
merging master. You know, since github doesn't handle that well. |
willing to try a rebase on master if that will help review. |
I will use No need to rebase. |
Nice option. But maybe you meant |
Yes, you are right. |
I should clearly go home. Can't even type a git command without an error any more. Also, for evening entertainment: http://git-man-page-generator.lokaltog.net/ |
I feel that I should look at changes and ask dumb questions as well :) First two dumb questions about the overall design before going into details:
|
If a subquery fails, the query fails. However, queries are not transactional. Completed subqueries stay completed. |
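To make the semantics concrete, here is a small sketch under stated assumptions: `SubQuery` and `Outcome` are hypothetical types invented for this illustration, and failure is simulated with a flag. The sequence stops at the first failure and reports the query as failed, but nothing already completed is rolled back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SequenceFailureSketch {
  /** Hypothetical subquery: a name plus whether running it fails. */
  static final class SubQuery {
    final String name;
    final boolean fails;

    SubQuery(String name, boolean fails) {
      this.name = name;
      this.fails = fails;
    }

    void run() {
      if (fails) {
        throw new IllegalStateException(name + " failed");
      }
    }
  }

  /** Overall status plus the subqueries whose effects remain in place. */
  static final class Outcome {
    final boolean succeeded;
    final List<String> completed;

    Outcome(boolean succeeded, List<String> completed) {
      this.succeeded = succeeded;
      this.completed = completed;
    }
  }

  /** Runs subqueries in order; the first failure fails the query, with no rollback. */
  static Outcome runSequence(List<SubQuery> sequence) {
    List<String> completed = new ArrayList<>();
    for (SubQuery subQuery : sequence) {
      try {
        subQuery.run();
      } catch (RuntimeException e) {
        return new Outcome(false, completed); // earlier results stay completed
      }
      completed.add(subQuery.name);
    }
    return new Outcome(true, completed);
  }

  public static void main(String[] args) {
    Outcome outcome = runSequence(Arrays.asList(
        new SubQuery("store B", false),
        new SubQuery("store C", true),
        new SubQuery("store D", false)));
    System.out.println(outcome.succeeded + " " + outcome.completed); // false [store B]
  }
}
```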
We could imagine adding restrictions to Myria to make queries more transactional. E.g., you may never append to a relation. (Then we can just tag each relation with which query it was created by, and we get automatic versioning too!) |
I'd really like to just rewrite the operators so that the operators themselves are able to be JSON-ed directly. Would simplify our lives. When I looked at the JsonQueryBuilder there were several things I really didn't like. E.g., it's written in frickin' Scala, so it's a pain to maintain. Also, it's not very fleshed out. |
There is nothing wrong with the current semantics. In the long term, consider the following program (I tried very hard not to use the word query)
If we consider MyriaL a normal programming language, we should even keep B there if sub-query1 succeeds. However, consider the following program.
We should delete the intermediate result. (I just tried this on demo; raco is very smart and will combine these two queries into one. But let's say this program does translate to a sequence with 2 sub-queries.) So on the Myria side, I guess some design that distinguishes materialized intermediate results from the actual result would be great. We can surely do this in another PR or when we feel it is needed, though. |
It's thoroughly unused.
If the same user executes the same query twice, can we not just run it from Fault tolerance features to save work is a good idea, but that's a (Plus, Dominik's plans for a materialization-aware optimizer may handle On Tue, May 27, 2014 at 11:14 PM, Shumo Chu notifications@github.com wrote:
|
It already is a SubQuery
They are literally only used in outdated tests. We have the ability to submit multiple queries and will soon have sequence operators; we don't need these any more. Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>
public class SequenceTest extends SystemTestBase {

  @Test
  public void testSequence() throws Exception {
@stechu this is the test I was alluding to. Below, a similar test but submitted as JSON.
@dhalperi another dumb question, is there any design for nested sequence? From what I have understood so far, it is not supported yet? If it is, we should add tests. |
@stechu I believe a sequence inside a sequence should be fine. I will add a test. |
@stechu Looks like we asked the same question. |
yeah! On Monday, June 2, 2014, Coveralls notifications@github.com wrote:
fyi, I found a small race condition and pushed the fix directly to master. Yay testing! Somehow it seems to have been exposed and/or triggered when updating the null analysis branch. |
Enable the client to submit a sequence of queries and have them be executed in series.