-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for inline-ing metastore over Flight #531
Conversation
tests/flight/mod.rs
Outdated
let config_text = format!( | ||
r#" | ||
[object_store] | ||
type = "memory" | ||
{object_store_section} | ||
|
||
[catalog] | ||
type = "sqlite" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the only thing left in this PR is making this catalog optional (expect the metastore to be provided inline if it isn't specified) and erroring out if the user tries to run DDL (creating/deleting tables), which we might be getting automatically because of the not_impl()
thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about making it optional and default to in-mem sqlite?
One can still provide the inline metastores, but we also support the full DDL spectrum, albeit ephemerally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible, but kind of awkward? Since if you first create a table with the inline metastore in use (e.g. use some CREATE TABLE AS statement), you wouldn't be able to query it unless you stopped providing the inline metastore, so I don't yet see how useful that would be? It's the responsibility of the caller to be the metastore/catalog if the inline metastore is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, well you can't create a table with the inline metastore at all, only reads are supported. This is because we implement only the basic SchemaStore
for the MemoryStore
, so we the table creation metastore calls would error out (due to not_impl
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, you meant just defaulting to the in-memory metastore if the inline metastore isn't provided. I don't really have strong feelings about it. I guess in the Seafowl configuration where you're using it with the inline metastore it'd surface errors to the developer faster if they forgot to provide it on some code path instead of quietly using the in-mem SQLite one -- that would be my only argument for erroring out.
let cmd = InlineMetastoreCommandStatementQuery { | ||
query, | ||
schemas: Some(schemas), | ||
}; | ||
get_flight_batches_inner(client, cmd.as_any()).await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice to see that the usage example for this is very straightforward
@@ -74,6 +77,14 @@ impl SeafowlFlightHandler { | |||
self.context.clone() | |||
}; | |||
|
|||
if let Some(memory_store) = memory_store { | |||
// If metastore was inlined with the query use it for resolving tables/locations | |||
ctx = ctx.with_metastore(Arc::new(Metastore::new_from_memory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, we might need to "merge" the in-memory and the real from-config metastore and fall back to the latter metastore if there's a table we can't find in the former, but we can do this in the separate PR if need be
What
Adds support for discrete, temporary, metastores inlined with queries called over arrow flight. These can be used to resolve identifiers parsed from the query in an ad-hoc manner.
How
MemoryStore
and implement all store traits for itListSchemaResponse
from regular clade metastore in a new message calledInlineMetastoreCommandStatementQuery
that also holds the queryget_sql_info
is called arrow fails to pattern match against known commands and uses the fallback methodMemoryStore
for a one-off context that serves the query