Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding tpl #371

Closed
wants to merge 59 commits into from

Conversation

@amlatyrngom
Copy link
Contributor

amlatyrngom commented May 6, 2019

Adding folder.
Work is still in progress (rename namespace, add test files, pass CI checks, ...).

TODO (FEATURES)

  • Seq scan to stdout (tested to work for arbitrary columns picked out of a single table)
  • Index scan to stdout
  • @insert/update/delete(table, tuple) extension

TODO (QUALITY)

  • clang-tidy
  • cpplint
  • DOXYGEN <-- this will be painful
  • Final cleanup pass, after which we run format

NICE TO HAVE

  • The interpreter should grab some hardcoded output_struct struct as its outputAlloc format
  • Name conflict resolution probably will have bugs, e.g. if two tables have the same name. Aliasing issues in general.
lmwnshn and others added 10 commits May 6, 2019
# Conflicts:
#	src/execution/ast/context.cpp
#	src/execution/ast/type.cpp
#	src/execution/compiler/consumer_context.cpp
#	src/execution/compiler/function_builder.cpp
#	src/execution/parsing/parser.cpp
#	src/execution/sema/sema.cpp
#	src/execution/sema/sema_expr.cpp
#	src/execution/sema/sema_stmt.cpp
#	src/execution/sql/aggregation_hash_table.cpp
#	src/execution/sql/generic_hash_table.cpp
#	src/execution/sql/join_hash_table.cpp
#	src/execution/sql/join_hash_table_vector_lookup.cpp
#	src/execution/sql/table_vector_iterator.cpp
#	src/execution/util/region.cpp
#	src/include/execution/ast/ast.h
#	src/include/execution/ast/builtins.h
#	src/include/execution/ast/context.h
#	src/include/execution/ast/type.h
#	src/include/execution/compiler/code_context.h
#	src/include/execution/compiler/codegen.h
#	src/include/execution/compiler/compilation_context.h
#	src/include/execution/compiler/execution_consumer.h
#	src/include/execution/compiler/function_builder.h
#	src/include/execution/compiler/operator/operator_translator.h
#	src/include/execution/compiler/pipeline.h
#	src/include/execution/sema/error_message.h
#	src/include/execution/sema/sema.h
#	src/include/execution/sql/generic_hash_table.h
#	src/include/execution/sql/join_hash_table_vector_lookup.h
#	src/include/execution/sql/table_vector_iterator.h
#	src/include/execution/sql/value.h
#	src/include/execution/sql/value_functions.h
#	src/include/execution/util/macros.h
#	test/execution/CMakeLists.txt
@lmwnshn lmwnshn force-pushed the amlatyrngom:adding-tpl branch from 8a97726 to 8e08692 May 8, 2019
lmwnshn added 2 commits May 8, 2019
tanujnay112 and others added 13 commits May 14, 2019
…n from InsertPlanNode. We aren't using a lot of stuff in the insert plan node though. Need to ask Wen/Gus where the multiple values for bulk insert are being stored. Should we treat the vec<TransientValue> as a flattened array? The fix for this is just a few lines of code. Testing is currently a very painful experience since we're testing against hardcoded tables...
@lmwnshn

This comment has been minimized.

Copy link
Contributor

lmwnshn commented May 19, 2019

Some questions for @GustavoAngulo and @wenxuanqiu , possibly for binder/optimizer related group meeting in general @apavlo :

What will the binder/optimizer do, for:

  1. Insert-from-select in Peloton was modeled as a InsertPlanNode with a SelectPlanNode child, this is the only time an insert plan node had a child and so it could be special cased. Will we go this route again?
  2. An InsertPlanNode should definitely be able to insert multiple values at once, but right now the mechanism for modeling the multiple values is not clear. Will we have a vec<vec>, should I treat vec as an array with (size / bulk_insert_count) elements?
  3. UpdatePlanNode currently does not contain the values that we are updating it to be. It has the target of the update and that's it.

In general, the current status is we should be good for having all the (TPL to storage to some kind of callback) part; we think we understand this well and have dummy stuff in place. In particular, single inserts work; updates and deletes are basically copy-paste logic except with a different call to SQLTable wrapper. But the input to TPL doesn't seem stable yet.

@wenxuanqiu

This comment has been minimized.

Copy link
Contributor

wenxuanqiu commented May 19, 2019

Some questions for @GustavoAngulo and @wenxuanqiu , possibly for binder/optimizer related group meeting in general @apavlo :

What will the binder/optimizer do, for:

  1. Insert-from-select in Peloton was modeled as a InsertPlanNode with a SelectPlanNode child, this is the only time an insert plan node had a child and so it could be special cased. Will we go this route again?
  2. An InsertPlanNode should definitely be able to insert multiple values at once, but right now the mechanism for modeling the multiple values is not clear. Will we have a vec, should I treat vec as an array with (size / bulk_insert_count) elements?
  3. UpdatePlanNode currently does not contain the values that we are updating it to be. It has the target of the update and that's it.

In general, the current status is we should be good for having all the (TPL to storage to some kind of callback) part; we think we understand this well and have dummy stuff in place. In particular, single inserts work; updates and deletes are basically copy-paste logic except with a different call to SQLTable wrapper. But the input to TPL doesn't seem stable yet.

I am encountering similar concerns when trying to create operators for INSERT, UPDATE, etc. I am not completely certain that whether the operators/plan_nodes should have ownership of values to be inserted or updated. I think these issues arise from building upstream and downstream objects from a basic version of plan nodes. I feel that we need to solve them at a higher level by figuring out where the values for INSERT and UPDATE need to be from the parser to the execution engine. I think this is somewhat related to the Context objects we briefly touched on during the meetings. Otherwise, we might have too many moving parts added over time that will bite back at us later.

…ng code, and we should talk about how "no predicates" are represented in scans.
// Run pipeline 1
pipeline_1(&state)
// Build table

This comment has been minimized.

Copy link
@apavlo

apavlo May 21, 2019

Member

Is this function really building the hash table? If yes, then what is is pipeline_1 doing?

@mbutrovich mbutrovich added do-not-merge and removed in-progress labels Jul 29, 2019
@mbutrovich

This comment has been minimized.

Copy link
Collaborator

mbutrovich commented Jul 29, 2019

There will be a new PR for this after #398 merges, since @amlatyrngom has a different branch that has been kept more up to date.

@mbutrovich mbutrovich closed this Jul 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.