Track unique ids for each scope #54

crisptrutski · 2024-06-11T22:39:19Z

In order to track the provenance of query outputs we will need to follow the flow of data from the source table columns, through the various scopes, to the final top-level expression.

In essence, we replace each string label in the :context vector with a [label, integer-id] pair, so that we can tell when two scopes are the same and, in future, use these references to build a DAG of the data dependencies.
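As a rough illustration of why the pair helps, here is a minimal sketch. It is not Macaw's actual code: `new-scope`, the shared `counter` atom, and the way ids are allocated are all assumptions made for the example. It only shows that bare labels collapse inside a set while [label id] pairs stay distinct.

```clojure
;; Hypothetical helper (not Macaw's actual code): pair a scope label with a
;; fresh integer id taken from a shared counter.
(defn new-scope
  [counter label]
  [label (swap! counter inc)])

(def counter (atom 0))

;; Two sub-selects that carry the same label.
(def scope-a (new-scope counter "SELECT"))
(def scope-b (new-scope counter "SELECT"))

;; Bare labels collapse into one set element; [label id] pairs stay distinct.
(def by-label (set (map first [scope-a scope-b])))
(def by-pair (set [scope-a scope-b]))

(println (count by-label) (count by-pair)) ; prints "1 2"
```

Two references to the same scope would carry the same pair and therefore compare equal, which is the property a later dependency graph could rely on.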

The test here is very basic: it just checks that our set representation does not de-duplicate the contents of the identical sub-selects. This satisfies the TDD gods¹, but isn't that useful on its own.

It'll be superseded when we get to more complex queries, but we're quite a way off from supporting the examples I have in mind.

¹ Actually, I lied: I didn't write the test first. When I went back to check my claim, I discovered that the sub-selects already wouldn't have been de-duplicated, because Macaw tracks the second CTE's scope as nested within the first one. Still, I think it's a useful minimal test for establishing which IDs should match and which shouldn't.

crisptrutski · 2024-06-12T09:11:26Z

test/macaw/core_test.clj

@@ -485,6 +503,21 @@
    (components (query-fixture :snowflakelet))
    (components (query-fixture :snowflake))))

+(defn sorted-vec


Oops, this doesn't return a vec anymore. All the usage sites compare the result to vectors, which works fine, and that is why I dropped the explicit conversion. In the next PR I simply rename it to sorted, so I think this is OK to slip through.
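For readers wondering why the comparison still passes, here is a small sketch. It assumes the helper is built on `sort` (the body of `sorted-vec` is not shown in this diff) and relies on Clojure comparing sequential collections element by element, regardless of concrete type.

```clojure
;; `sort` returns a seq rather than a vector.
(def sorted-seq (sort [3 1 2]))

;; Not a vector...
(println (vector? sorted-seq))   ; prints "false"

;; ...but still equal to a vector with the same elements in the same order.
(println (= [1 2 3] sorted-seq)) ; prints "true"
```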

crisptrutski · 2024-06-12T09:11:59Z

test/resources/duplicate_scopes.sql

+SELECT
+    b.x,
+    c.x
+FROM b, c;


I add a trailing line-break in a later PR.

crisptrutski · 2024-06-12T09:16:36Z

test/macaw/core_test.clj

+  (is (=? [{:component {:column "x"}, :scope ["SELECT" (=?/same :subselect-1)]}
+           {:component {:column "x"}, :scope ["SELECT" (=?/same :subselect-2)]}


I don't think that Hawk will care if these two values are actually the same as each other, or as :top-level, but we certainly do! Maybe we need something new like (=?/unique :blah) which enforces uniqueness across all the captures?
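Until something like that exists, one possible stopgap is a plain predicate over the analysed components. The sketch below is hypothetical: `distinct-scope-ids?` is not part of Hawk or Macaw, and the ids 1 and 2 are made-up sample values. It assumes each :scope is a [label id] pair, as in the expectation above.

```clojure
;; Hypothetical predicate: true when every component's scope id is different.
(defn distinct-scope-ids?
  [components]
  (let [ids (map (comp second :scope) components)]
    ;; `distinct?` needs at least one argument, so guard the empty case.
    (or (empty? ids) (apply distinct? ids))))

;; Made-up sample data in the shape the test expects.
(def components
  [{:component {:column "x"} :scope ["SELECT" 1]}
   {:component {:column "x"} :scope ["SELECT" 2]}])

(println (distinct-scope-ids? components)) ; prints "true"
```

An extra `(is (distinct-scope-ids? ...))` assertion next to the `=?` check would then catch the case where both captures resolve to the same id.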

tsmacdonald · 2024-06-13T10:45:26Z

src/macaw/core.clj

  (collect/query->components statement (merge {:preserve-identifiers? true} opts)))

 (defn replace-names
-  "Given a SQL query, apply the given table, column, and schema renames.
+  "Given an SQL query, apply the given table, column, and schema renames.


And now we know that I say "sequel" and you say "ess queue ell" :)

tsmacdonald

a lot to chew on, but it looks good to me

Track unique scope ids

deeab0a

crisptrutski added the .Team/BackendComponents label Jun 11, 2024

crisptrutski requested review from piranha and tsmacdonald June 11, 2024 22:39

crisptrutski self-assigned this Jun 11, 2024

Give up figuring out why hawk is sad

b0359e0


crisptrutski mentioned this pull request Jun 12, 2024

Add a bunch of compoud query fixtures, with dubious analysis #56

Merged

tsmacdonald reviewed Jun 13, 2024

tsmacdonald approved these changes Jun 13, 2024

crisptrutski merged commit 5924980 into master Jun 13, 2024
4 checks passed

crisptrutski deleted the unique-scopes branch June 13, 2024 12:05
