Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: How should we handle joins? #6426

Closed
Emyrk opened this issue Mar 2, 2023 · 4 comments · Fixed by #6429
Closed

Discussion: How should we handle joins? #6426

Emyrk opened this issue Mar 2, 2023 · 4 comments · Fixed by #6429
Assignees
Labels
chore Non-customer facing refactors, cleanup, or technical debt. design Request for more beauty

Comments

@Emyrk
Copy link
Member

Emyrk commented Mar 2, 2023

Problem

Our current use of SQLc uses almost 0 joins throughout all of our queries. This causes excessive db round trips. The function workspaceData does 9 database calls to completely populate the returned workspace from the api. This can be reduced with joins.

When dbauthz is enabled by default, this compounds the issue as many objects require fetching related objects to run authorization. Examples: workspace builds, template versions, jobs, build parameters, ... .

Solutions (this PR talks about both)

Allowing queries to leverage joins can reduce db round trips. The issue is how to handle these. We currently use SQLc, which would create a new model for each query and it becomes cumbersome. Additionally, we often have multiple SELECT queries for each datatype.

I have 2 proposals.

Keeping SQLc (originally investigated here)

If we want to keep SQLc, the best way we can do this is with views. I am suggesting non-materialized views. So essentially these are saved queries in our postgres database that we can reference from SQLc.

Because views are saved in postgres, they require the same migration maintenance a table would.

BEGIN;
-- workspace_builds_rbac includes the linked workspace information
-- required to perform RBAC checks on workspace builds without needing
-- to fetch the workspace.
CREATE VIEW workspace_builds_rbac AS
SELECT
workspace_builds.*,
workspaces.organization_id AS organization_id,
workspaces.owner_id AS workspace_owner_id
FROM
workspace_builds
INNER JOIN
workspaces ON workspace_builds.workspace_id = workspaces.id;
COMMIT;

A view is a "table" for SQLc, so a model is generated for it. In our .sql files, we just reference the view.

-- name: GetWorkspaceBuildByID :one
SELECT
*
FROM
workspace_builds_rbac
WHERE
id = $1
LIMIT
1;

Abandon SQLc and use SQLx + Go Templates (an implementation here)

We use SQLc to generate Golang code from our sql. But this code isn't actually that complex in Golang if we use SQLx. And then we can use Go templates to build dynamic queries.

A very basic implementation of this makes our *.sql files look like this. Note the highlighting will be a bit off as we are mixing SQL and Go templates.

{{ define "workspace_builds_rbac" }}
(
SELECT
	workspace_builds.*,
	workspaces.organization_id AS organization_id,
	workspaces.owner_id AS workspace_owner_id
FROM
	workspace_builds
INNER JOIN
	workspaces ON workspace_builds.workspace_id = workspaces.id
)
{{ end }}

-- To use the template above
{{ define "GetWorkspaceBuildByID" }}
SELECT
	*
FROM 
	{{ template "workspace_builds_rbac" }}
WHERE
	id = @build_id
{{ end }}

I investigated IDE Highlighting, but the tl;dr is that just using sql highlighting is likely the best. It will never be perfect with Go templates 😢.

How to use template query in golang?

I made a package called sqxqueriers that handles templates and keeping our @param named parameters for easier to read sqlc. You can see that here.

To call a query you made, you can use a generic function called sqlxGet for fetching one row:

func sqlxGet[RT any](ctx context.Context, q *sqlQuerier, queryName string, argument interface{}) (RT, error) {

The type embeds the sqlc WorkspaceBuild type. We would keep SQLc for generating models. We'd just move queries to this SQLx. The db struct tags are used for matching to columns. Obviously the query can be handled more manually if the tags do not match.

type WorkspaceBuildWithOwners struct {
WorkspaceBuild
OrganizationID uuid.UUID `db:"organization_id" json:"organization_id"`
OwnerID uuid.UUID `db:"owner_id" json:"owner_id"`
}
func (q *sqlQuerier) GetWorkspaceBuildByID(ctx context.Context, id uuid.UUID) (WorkspaceBuildWithOwners, error) {
return sqlxGet[WorkspaceBuildWithOwners](ctx, q, "GetWorkspaceBuildByID", id)
}

Comparison

SQLc Views SQLx + Templates
SQL highlighting ⚠️ (kinda)
Golang Code
Golang Simple Types
Golang Joined Types No embeds, duplicated Uses Embeds
Supports Dynamic Queries ❌ (gross strings replace)
No Migrations ❌ (migration to maintain view)

Highligting

SQL template highlighting in vscode isn't that bad. In Goland it's pretty terrible.

Golang Code

SQLx + Templates requires more infrastructure code to support, but this also gives us opportunity to add in features. SQLc is slow to update and add features we need.

Joined Types

SQLc joined types are duplicated structs with identical fields. We can do anonymous embeds for template joined types.

I would really like to see dbauthz not adhere to db.Store and then we can do some better type handling at this layer to make types more consistent. You cannot insert into a view, so when updating or inserting data, you cannot return the joined data. Meaning 2 types will exist. A thin and a joined.

Dynamic Queries

We currently use CASE WHEN statements for dynamic where clauses. This works fine, but is only supported in WHERE. Things not supported currently:

  • Conditional sort order (ASC vs DESC)
  • Conditional sort column
  • Conditional updates, only update fields provided. Caused bug in v1
@Emyrk Emyrk self-assigned this Mar 2, 2023
@Emyrk Emyrk added chore Non-customer facing refactors, cleanup, or technical debt. design Request for more beauty labels Mar 2, 2023
@deansheather
Copy link
Member

Templates seem better to me, it sucks that the highlighting is sorta janky but from the screenshots you sent me in DMs (you should post those in this thread) it looked okay. It could probably also be improved if we wrote our own syntax highlighting grammar for vscode.

@Emyrk
Copy link
Member Author

Emyrk commented Mar 2, 2023

Goland

Screenshot from 2023-03-02 16-49-41

VSCode

Screenshot from 2023-03-02 16-49-56

I think making our own grammer would be quite difficult.

I want to merge one of these PRS:

@Emyrk
Copy link
Member Author

Emyrk commented Mar 6, 2023

This came up as a feature request, negating search params.

It can be done with the existing code by defining 2 CASE WHEN blocks, or we could do it with templates with templates instead of CASE WHEN

{{ if .TemplateName }}
	template_id {{ if .TemplateName.Not }}!{{end}}= ANY(SELECT id FROM templates WHERE lower(name) = lower(@template_name) AND deleted = false)
{{ end }

OR mix and match

AND CASE
	WHEN @template_name :: text != '' THEN
		template_id {{ if .TemplateName.Not }}!{{ end }}= ANY(SELECT id FROM templates WHERE lower(name) = lower(@template_name) AND deleted = false)
	ELSE true
END

@spikecurtis
Copy link
Contributor

Template looks better to me as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chore Non-customer facing refactors, cleanup, or technical debt. design Request for more beauty
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants