
graph,graphql,server,store: Subgraph Sql Service #5382

Open · gusinacio wants to merge 8 commits into master

Conversation

@gusinacio (Contributor) commented Apr 30, 2024

Semiotic Labs is thrilled to present the Subgraph SQL Service in partnership with TheGuild and Edge & Node. This exposes a GraphQL type to any graph-node user, providing a secure SQL interface to directly query a subgraph's entities.

SQL Service

We introduce a new `sql` field on the `Query` type whose inputs allow graph-node to receive raw SQL.

We also implemented a number of safe SQL rewrites that strip functions we believe could break the Postgres instance, along with some schema safety checks.

We also expose each of the tables via Common Table Expressions (CTEs), which give us granular control over what is and is not exposed, plus a workaround that exposes only the latest block's data.
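
For illustration, here is roughly what a query against the new `sql` field could look like. This is a hypothetical sketch only: the input object and field names (`input`, `query`, `format`, `SqlJSONOutput`, `rowCount`, `rows`) and the `swap` table are assumptions for the sake of example, not the exact API introduced by this PR.

```graphql
# Hypothetical shape: argument and result field names are illustrative.
{
  sql(
    input: {
      # Raw SQL passed through graph-node's rewriting and safety checks
      query: "select id, amount from swap limit 10"
      format: JSON
    }
  ) {
    ... on SqlJSONOutput {
      rowCount
      rows
    }
  }
}
```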

What we cover on this PR

  • New sql query type
  • Defined API for inputs for backward compatibility with future work
  • Environment variable/Config to enable or disable the SQL service
  • Safety measures for SQL threats via complete AST SQL scan.
  • Latest block data
  • Single subgraph query
  • JSON and CSV union output

What we do not cover on this PR

  • Cross subgraph joins
  • Query on specific block/snapshot
  • Query optimizations
  • Query parameters binding
  • Query/Join between two different blocks/snapshots

@lutter (Collaborator) commented May 1, 2024

Wow .. that's a really nice addition!

Before I start reviewing this in detail, I noticed a few things from a quick look at the PR:

  • there is no end-user documentation. Can you add something to docs/ that explains how to run queries? Obvious questions it should answer are: how do I send a query (an example)? How do I use bind params? How does the GraphQL schema relate to the tables this can query? Which attributes/columns can be queried? How much of SQL does this cover? I haven't looked, but does this allow e.g. aggregations, window functions, CTEs, etc.?
  • for operators: since SQL queries can be arbitrarily complex and take an arbitrary amount of time, how do they guard against long-running queries?
  • a minor nit is that commit messages should follow the guidelines here. This won't cause us to not approve the PR, but it would be nice to keep with the current scheme.
  • this might be too much work, but I see that a sql crate appears and is then folded into the store - would it be possible to squash various commits so that there are fewer commits that fix up previous ones, not just for that crate but other fix commits, too? It'll make reviewing a lot easier

gusinacio and others added 8 commits May 8, 2024 08:53
Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>
Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>

executor: implement methods to execute arbitrary sql in query store and
deployment store
Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>
Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>

create store/postgres/src/sql

validator and formatter: full refactor

Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>

refactor: move sql to store/postgres/src/sql

Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>

parser: test for array of byte fixed

Signed-off-by: Tümay Tuzcu <tumay@semiotic.ai>

parser: block_range and block$ added as available columns for tables,
sequence functions moved to blacklist

Signed-off-by: Tümay Tuzcu <tumay@semiotic.ai>

parser: use latest block as cte filter

parser: fix escape columns and tables

Fixes STU-217

Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>
…able_type

Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>
Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>
Signed-off-by: Gustavo Inacio <gustavo@semiotic.ai>

docs: add sql service docs section
… prelude CTE

Fixes SQL-266

Signed-off-by: Tümay Tuzcu <tumay@semiotic.ai>
@gusinacio (Contributor, Author) commented May 9, 2024

> Wow .. that's a really nice addition!
>
> Before I start reviewing this in detail, I noticed a few things from a quick look at the PR:
>
> • there is no end-user documentation. Can you add something to docs/ that explains how to run queries? Obvious questions it should answer are: how do I send a query (an example)? How do I use bind params? How does the GraphQL schema relate to the tables this can query? Which attributes/columns can be queried? How much of SQL does this cover? I haven't looked, but does this allow e.g. aggregations, window functions, CTEs, etc.?
> • for operators: since SQL queries can be arbitrarily complex and take an arbitrary amount of time, how do they guard against long-running queries?
> • a minor nit is that commit messages should follow the guidelines here. This won't cause us to not approve the PR, but it would be nice to keep with the current scheme.
> • this might be too much work, but I see that a sql crate appears and is then folded into the store - would it be possible to squash various commits so that there are fewer commits that fix up previous ones, not just for that crate but other fix commits, too? It'll make reviewing a lot easier

Hey @lutter, thank you for the information. Here are some notes:

  • we added end-user documentation about the SQL service to /docs as requested
  • graph-node has some built-in mechanisms to cancel long-running queries, and we are relying on those
  • thanks for pointing out the guidelines; we rebased and reworded all the commits to make them easier to review
  • while rebasing, we restructured the history into "happy-path" commits: the standalone SQL crate no longer appears only to be folded into the store later, and the fix-up commits were squashed

Theodus pushed a commit to edgeandnode/gateway that referenced this pull request May 15, 2024
## Issue

As work takes place to [add support for SQL
queries](graphprotocol/graph-node#5382) the
Gateway should invalidate such queries in the meantime.

## Solution

The Gateway should reject queries containing SQL fields.

We want to reuse this code in the `sql-gateway`. Therefore we use the
`SqlFieldBehavior` enum to inject the desired behavior.

---

Signed-off-by: Joseph Livesey <joseph@semiotic.ai>

Co-authored-by: Gustavo Inacio <gustavo@semiotic.ai>
@lutter (Collaborator) left a comment

I am not entirely through reviewing this, but didn't want to hold the review back for even longer.

Besides the comments I made inline, one major missing piece are end-to-end tests. For GraphQL queries we have a pretty comprehensive test suite in store/test-store/tests/core/interfaces.rs and store/test-store/tests/graphql/query.rs. I would want a similarly comprehensive test suite for SQL queries - basically if a change to graph-node breaks SQL queries, I want to find out about that from a failing test similar to the ones for GraphQL.

@@ -30,6 +30,8 @@ git-testament = "0.2.5"
itertools = "0.12.1"
hex = "0.4.3"
pretty_assertions = "1.4.0"
sqlparser = { version = "0.40.0", features = ["visitor"] }
thiserror = "1.0.25"
@lutter (Collaborator):

We try to avoid importing crates more than once as much as possible (and we're far from perfect on that).

There are two ways in which we do that:

  • (the old way) Import the crate in graph/src/Cargo.toml and reexport it in graph/src/lib.rs. Other crates then say use graph::<a crate>
  • (the new way) Import the crate in the workspace Cargo.toml and add <a crate>.workspace = true in crates that use it

I just opened a PR #5403 to update sqlparser to the latest. Please use the workspace one and update this code for it.


use crate::data::graphql::ext::{
camel_cased_names, DefinitionExt, DirectiveExt, DocumentExt, ValueExt,
};
use crate::prelude::s::*;
use crate::prelude::*;
@lutter (Collaborator):

Please avoid * imports - they become a huge headache later in some refactorings. Just list everything you actually need.

Also, the single letter crates, like s, q and r should always be used as such, i.e., code should always say s::Document instead of just Document

@@ -414,7 +413,7 @@ impl Resolver for StoreResolver {
return self.lookup_meta(field).await;
}

if object_type.is_sql() {
if ENV_VARS.graphql.enable_sql_service && object_type.is_sql() {
return self.handle_sql(field);
}
@lutter (Collaborator):

The ENV_VARS check shouldn't be necessary - when SQL is turned off, a query that uses it shouldn't even make it here as it would have been rejected during validation.

@gusinacio (Contributor, Author):

The reason we have this check here is the remote case where the schema contains a type called Sql, which a graph-node with the SQL service disabled would otherwise still allow.

name if is_introspection_field(name) => intro_set.push(field)?,
META_FIELD_NAME | "__typename" => meta_items.push(field),
SQL_FIELD_NAME => sql_items.push(field),
_ => data_set.push(field)?,
@lutter (Collaborator):

I think the comment above should say something why sql_items gets split out here (it's also missing an explanation why meta_items are split out). In a nutshell:

  • intro_set needs to be handled differently because we use a different resolver (IntrospectionResolver)
  • meta_items need to be handled separately because prefetch can not handle them
  • sql_items need to be handled separately for the same reason (?) If so, maybe we can just put them into meta_items and rename that variable to something more descriptive (maybe noprefetch_items though I don't love that either)

let result = result.into_iter().map(|q| q.0).collect::<Vec<_>>();
let row_count = result.len();
// columns should be available even if there's no data
// diesel doesn't support "dynamic query" so it doesn't return column names
@lutter (Collaborator):

There's now a DynamicSelectStatement; PR #5372 uses that, but it requires quite a bit of ceremony. For now, this is fine, but would be great to avoid the whole to_jsonb dance at a later date.

@gusinacio (Contributor, Author):

This is good to know. I got to this crate when I was developing, but at the time graph-node wasn't using diesel v2. Gonna take a look at it.

field accepts values of this type.

To enable the Subgraph:SQL Service, set the `GRAPH_GRAPHQL_ENABLE_SQL_SERVICE` environment
variable to `true` or `1`. This allows clients to execute SQL queries using the
@lutter (Collaborator):

I would just say to set this to true

environment variable.

- **Environment Variable:** `GRAPH_GRAPHQL_ENABLE_SQL_SERVICE`
- **Default State:** Off (Disabled)
@lutter (Collaborator):

If you set this to true to turn it on, it would make sense to say that the default is false

permitted to be used within SQL queries executed by the Subgraph:SQL Service, while `POSTGRES_BLACKLISTED_FUNCTIONS`
serves as a safety mechanism to restrict the usage of certain PostgreSQL functions within SQL
queries. These blacklisted functions are deemed inappropriate or potentially harmful to the
system's integrity or performance. Both constants are defined in `store/postgres/src/sql/constants.rs`.
@lutter (Collaborator):

It's sorta icky, but it would be nicer for users to just list what's on the whitelist. My understanding is that anything not on the whitelist is forbidden. Mentioning a blacklist here just makes users wonder what happens for functions that are on neither the whitelist nor the blacklist.


The GraphQL schema provided by the Subgraph:SQL Service reflects the structure of the SQL queries
it can execute. It does not directly represent tables in a database. Users need to
construct SQL queries compatible with their database schema.
@lutter (Collaborator):

This does not explain at all how I go from the GraphQL subgraph schema to the schema against which I can execute queries. If I have an entity type DailyPositionSnapshot, what's the table I query? Does it matter if that type is mutable, immutable, or a timeseries?

To avoid locking ourselves into how data is currently stored, I would suggest the following:

  • the table for an @entity type is the name of the type in snakecase
  • the columns of the table are all the non-derived attributes of the type, also snakecased
  • for aggregations, the table name is the name of the type snakecased together with the interval, i.e. type Stats @aggregation(intervals: ["hour", "day"], source: "Data") { .. } becomes stats('hour') in a SQL query
  • queries are always executed at a specific block, determined by the block constraint on the sql element (a sketch of this mapping follows below)
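
A sketch of what that mapping could look like in practice, assuming the naming rules above (the `DailyPositionSnapshot` entity becomes a `daily_position_snapshot` table with snake_cased columns). The column names, the `input`/`query` arguments, and the `SqlJSONOutput` member are hypothetical and used only for illustration.

```graphql
# Hypothetical query shape; table and column names follow the
# snake_casing rule suggested above and are not taken from a real schema.
{
  sql(input: {
    query: """
      select id, block_number, total_value
      from daily_position_snapshot
      order by block_number desc
      limit 5
    """
  }) {
    ... on SqlJSONOutput {
      rows
    }
  }
}
```

Under the aggregation rule above, an hourly `Stats` rollup would then be addressed as `stats('hour')` in the SQL text.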


Thanks, you (@lutter) have already summed up some part of it. Let's expand on this with examples.
Can you give a subgraph example with aggregation?

@lutter (Collaborator):

This subgraph is kinda silly because it just aggregates block numbers, but it explains how to use aggregations in various ways. The aggregation docs also have some more realistic examples.
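
As a rough illustration of the shape such a subgraph takes (written from memory rather than copied from the linked example, so treat the names and scalar types as assumptions and consult the aggregation docs for the exact requirements):

```graphql
# Minimal sketch: a timeseries entity plus an aggregation over block numbers.
type Block @entity(timeseries: true) {
  id: Int8!
  timestamp: Timestamp!
  number: BigInt!
}

type Stats @aggregation(intervals: ["hour", "day"], source: "Block") {
  id: Int8!
  timestamp: Timestamp!
  max: BigInt! @aggregate(fn: "max", arg: "number")
}
```

Following the naming suggestion earlier in this thread, the hourly rollup of `Stats` would then be reachable from SQL as `stats('hour')`.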

use super::{DocumentExt, ObjectTypeExt};

#[derive(Copy, Clone, Debug)]
pub enum QueryableType<'a> {
@lutter (Collaborator):

I like the name QueryableType much better than ObjectOrInterface; as I understand it, adding union here is necessary because of the return type of the sql query element, right?

@gusinacio (Contributor, Author):

Exactly! We want the user to select which type of response they get: JSON or CSV. This is also future-proof for possible new formats.
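
For example, a client might pick the output format and then select on the corresponding union member, along these lines (the member and field names `SqlJSONOutput`, `SqlCSVOutput`, `rows`, `csv`, and the `format` argument are assumptions, not the PR's final schema):

```graphql
# Hypothetical union handling: select on whichever member the server returns.
{
  sql(input: { query: "select count(*) as swaps from swap", format: CSV }) {
    ... on SqlJSONOutput { rows }
    ... on SqlCSVOutput { csv }
  }
}
```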
