Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL support #133

Closed
turbo opened this issue Sep 5, 2019 · 7 comments
Closed

SQL support #133

turbo opened this issue Sep 5, 2019 · 7 comments

Comments

@turbo
Copy link

turbo commented Sep 5, 2019

Sorry if I missed an obvious doc link somewhere, but I wasn't able to find comprehensive documentation on the extend of SQL support in either noria and or noria-mysql.

I'm looking to replace some part of a 1bn+ row (postgres-based) query system with Noria, since a lot of it is written in Rust and queries are already using mat views in most cases, so here goes my feature list:

  • recursive and non-recursive CTE, where side-effect-free non-recursive CTE are not meant to be optimization fences (Postgres 12)
  • subqueries
  • UPSERT-type queries with a lot of rows (1-15 million)
  • all kinds of JOINs, but SQL-default JOINs would be fine for now
@TrustTheRust
Copy link

Interesting project. +1 for Postgres adapter for Noria. Good Luck.

@jonhoo
Copy link
Contributor

jonhoo commented Sep 6, 2019

Hi! This is very much still a research prototype, so I probably wouldn't recommend trying to use it in production unless you are willing to put significant effort into maintenance and development. Since it is a research system, SQL support is also very much driven by what our benchmark applications need, rather than aiming for complete support. Of what you mention:

  • We do not support CTE
  • We do support subqueries in certain positions, but only as long as they are not implicit joins (i.e., name fields or tables from the outer query)
  • UPSERTs aren't really supported, though we do have INSERT OR UPDATE
  • We do have support for "standard" JOINs, but so far only if the join condition is equality on a single field. This isn't fundamental to Noria's approach, but to our research prototype :)

Note also that writes in Noria aren't generally done through SQL (though they are if you run the MySQL protocol shim).

@turbo
Copy link
Author

turbo commented Sep 6, 2019

Thanks for the detailed answer,

wouldn't recommend trying to use it in production unless you are willing to put significant effort into maintenance and development

That's fine, my day job is juggling fire utilizing a lot of research-quality projects to drive innovation internally. E.g. I also maintain a similar solution (not-public) in another language, which is based on the SQLite VM and query planner, with in-memory mat views. I'd probably rip the useful parts out of Noria, but the approach struck me as useful.

UPSERTs aren't really supported, though we do have INSERT OR UPDATE

No entirely familiar with mysql, does this include noop conflict resolution? I'm using that to dedupe and normalize tables in PG in one step with DO NOTHING. Reminds me of another question: A lot of concurrent conflict resolutions on tables with unique constraints might cause deadlocks because the index tuples of two transactions can conflict in a way that causes the txs to wait for each other's completion based on the transaction visibility rules in MVCC. Is this a problem in Noria? Or on a more basic level: What does Noria's ACID and (TX/stmt) concurrency story look like? What kind of constraints are supported? Can columns be computed, too?

We do have support for "standard" JOINs, but so far only if the join condition is equality on a single field.

Nice. Multiple, single-field JOINs are supported then I take it?

Note also that writes in Noria aren't generally done through SQL

I understand, however the SQL adapter is essential when external systems are connecting to Noria for blending and analysis (Redash, DataGrip, Drill etc.)

@jonhoo
Copy link
Contributor

jonhoo commented Sep 6, 2019

No entirely familiar with mysql, does this include noop conflict resolution?

I'm not sure I follow? The feature I am thinking of is essentially this, which is "if the primary key already exists, do an update instead". Not sure what you mean by noop conflict resolution.

Or on a more basic level: What does Noria's ACID and (TX/stmt) concurrency story look like? What kind of constraints are supported? Can columns be computed, too?

Noria does not currently provide any transaction support. Reads are eventually consistent with exactly-once guarantees. Writes are durable once they enter the graph, but their effects are not necessarily visible immediately. Noria does not support constraints (yet). We are looking at schemes to add support for reading your own writes, but do not have anything finalized on that yet.

Not sure what you mean by computed columns?

Multiple, single-field JOINs are supported then I take it?

Yes indeed!

@turbo
Copy link
Author

turbo commented Sep 6, 2019

Not sure what you mean by noop conflict resolution.

It's the ON CONFLICT DO NOTHING resolution strategy in Postgres. Let's say you have a lot of names in lotta_names and a table names with a unique constraint on it's column name, then this would be the quickest way of inserting names not already known:

INSERT INTO names(name)
SELECT ln.name
FROM lotta_names ln
ON CONFLICT DO NOTHING

I guess since there's no constraint support, this doesn't matter, as Noria can't enforce e.g. uniqueness, right? So it would be on the "client" to enforce that prior to insertion.

Not sure what you mean by computed columns?

Computed columns are virtual colums in a table which represent the result of an expression, here's some pseudocode:

create table foo {
  a text
  b text
  c text computed(a + ' ' + b)
}

computed columns are computed on access by default, but can also be stored (computed on row mutation).

@turbo
Copy link
Author

turbo commented Sep 6, 2019

I think most of my questions were answered, though. So I'll close this, but discussion might continue 👍

@turbo turbo closed this as completed Sep 6, 2019
@ms705
Copy link
Member

ms705 commented Sep 6, 2019

Regarding the conflict resolution on insert: Noria does enforce uniqueness of primary keys, and it's default (and only) behavior is actually what you describe (ignore duplicates).

See the code here.

(Noria does not support insert queries that directly select from other tables, like in your example. The data would have to go through the client first.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants