SQL changes workover #169

a-tze · 2018-04-28T09:34:14Z

See issue #166 .

Functionality testing is still outstanding. Some minor, mostly cosmetic differences from current master exist.

Feel free to add commits to this branch, I'll squash them in before removing WIP status. Accordingly, expect history rewrites in this branch as long as it is in WIP status.

The current last migration file will be replaced in a final commit when removing WIP state. If there are other migrations in the meantime, then a suitable change will be added.

pegro · 2018-05-14T16:05:22Z

First of all: thanks for (re-)working (on) this.
Second: I would prefer keeping all these commits separate in the history rather than squashing them. I think we aren't really bisectable anyway, so that shouldn't be a concern.

I'll try to comment on the other things inline.

pegro · 2018-05-14T16:17:49Z

src/Application/Migrations/17_function_ticket_dependency_satisfied.sql

@@ -27,11 +27,20 @@ BEGIN
 		t.id = param_ticket_id;

 	SELECT
-		state >= p.dependent_ticket_trigger_state INTO satisfaction
+		dependent_ticket_state.sort >= configured_trigger_state.sort INTO satisfaction


I think with some careful migration patches it would be possible to get the ticket states into a useful native order.
But not sure if it'd be worth the effort.

Not sure if this breaks the possibility of later changes without deleting all tickets, e.g. not being able to REPLACE the enum when there are references to it.
It is quite some code to use the sort property, but may be better in the long run.

Inplace replacing is difficult, that's true. But one could introduce the new enum with a temporary name, add temporary columns to all related tables, copy the value from the old column, remove the old one and rename the enum column back to the old name. Should™ work ;)

Sounds good, but please create an issue for that. I like the idea, but don't want to do it in this PR.

Okay, see #173
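
A rough sketch of the temporary-name approach described above, assuming PostgreSQL. The type name enum_ticket_state, the value list and the _new suffixes are placeholders and are not taken from the actual schema; views, functions and indexes that reference the column would have to be dropped and recreated around these steps.

```sql
-- Hypothetical sketch: type name, enum values and column names are assumptions.
BEGIN;

-- 1. Create the new enum under a temporary name, values listed in the desired order.
CREATE TYPE enum_ticket_state_new AS ENUM ('staged', 'recording', 'recorded', 'encoding', 'encoded');

-- 2. Add a temporary column of the new type.
ALTER TABLE tbl_ticket ADD COLUMN ticket_state_new enum_ticket_state_new;

-- 3. Copy the values; the cast through text maps labels by name.
UPDATE tbl_ticket SET ticket_state_new = ticket_state::text::enum_ticket_state_new;

-- 4. Remove the old column and give the new one the old name.
ALTER TABLE tbl_ticket DROP COLUMN ticket_state;
ALTER TABLE tbl_ticket RENAME COLUMN ticket_state_new TO ticket_state;

-- 5. Once no column uses the old type anymore, drop it and rename the new type.
DROP TYPE enum_ticket_state;
ALTER TYPE enum_ticket_state_new RENAME TO enum_ticket_state;

COMMIT;
```

Steps 2 to 4 would need to be repeated for every table that uses the old type before step 5 can succeed.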

pegro · 2018-05-14T16:27:58Z

src/Application/Migrations/17_function_ticket_dependency_satisfied.sql

+	JOIN
+		tbl_ticket_state configured_trigger_state ON
+			t.ticket_type = 'encoding' AND
+			configured_trigger_state.ticket_state = p.dependent_ticket_trigger_state


It always takes a while for me to understand that code.
I think what confuses me is the use of dependent and depending "tickets". One could read "dependent_ticket_state" as "the state of the ticket the given ticket is depending on" or as "the state of the given ticket which is depending on another ticket" (making it a dependent ticket).

I would propose to use "dependee" (https://en.wiktionary.org/wiki/dependee) and "depender" (https://en.wiktionary.org/wiki/depender), or just use "master ticket", whenever you mean the master ticket.
If you absolutely detest that, maybe I'm ok with more comments explaining the relationships.

Oh and you should probably mention in the commit message, that you switched from using the param_ticket_id as the depender-ticket-id to the master (or dependee) ticket-id.
Which is confusing, since ticket_depending_encoding_ticket_state still asks for the depender-ticket-id.

Or did I miss anything?

> or just use "master ticket", whenever you mean the master ticket.

I'm probably the reason we're not currently doing this. As we set depends_on for encoding profiles, we have no notion of a master or an informal subticket anywhere else in the UI or code. Therefore I suggested something like dependent. I like dependee and maybe dependent? The latter is a little more common. This would also free us from the language issues around master/slave.

Dependee and dependent is fine with me.
Regarding the "master" notion, in #171 it's about to be called that, if I saw that correctly.

> in #171 it's about to be called that, if I saw that correctly.

I think this is only a comment and a UI label – I already requested changing this some time ago.

The semantic change of the parameter is likely an error; I will investigate that.

pegro · 2018-05-14T16:29:02Z

src/Application/Migrations/22_view_serviceable_tickets.sql

+	LEFT JOIN
+		tbl_ticket_state masterstate ON
+			masterstate.ticket_type = 'encoding' AND
+			masterstate.ticket_state = COALESCE(ticket_depending_encoding_ticket_state(t.id),pj.dependent_ticket_trigger_state)


Now that I see the use of "master" here, I'd be in favor of using it above as well. See comment above.

In the last commit in this PR, the view is reworked again and the selection of the master ticket is much clearer. So I guess the same should be done in ticket_depending_encoding_ticket_state_satisfied, if the change of parameter semantics is reverted.

pegro · 2018-05-14T17:06:09Z

src/Application/Migrations/22_view_serviceable_tickets.sql

@@ -36,7 +36,7 @@ CREATE OR REPLACE VIEW view_serviceable_tickets AS

 	WHERE
 		pj.read_only = false AND
-		t.ticket_type != 'meta' AND
+		t.ticket_type IN ('recording','encoding','ingest') AND


I'm just curious: why is that faster?

Because for some reason Postgres doesn't recognize that the two conditions are actually the same. The newer line results in an ANY operation, which in turn makes use of a hash index, while using != results in a "simple filter" on a full index scan or even a table scan.
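
One way to check this locally is to compare the plans of both variants. This is only a sketch against tbl_ticket; the chosen plan depends on the PostgreSQL version, the available indexes and the table statistics, so the output may differ from what is described above.

```sql
-- Variant with the inequality filter.
EXPLAIN
SELECT id FROM tbl_ticket WHERE ticket_type != 'meta';

-- Variant with the explicit list of ticket types.
EXPLAIN
SELECT id FROM tbl_ticket WHERE ticket_type IN ('recording', 'encoding', 'ingest');
```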

pegro · 2018-05-14T17:09:01Z

src/Application/Controller/XMLRPC/Handler.php

@@ -616,7 +616,7 @@ public function assignNextUnassignedForState($ticketType = '', $ticketState = ''
 					->scoped([
 						'virtual_property_filter' => [$propertyFilters]
 					])
-					->orderBy('ticket_priority(id) DESC');
+					->orderBy('calculated_priority DESC');


Is there a particular reason why we order here again, when the view already returns ordered results?

This is probably just legacy and/or a case of "better safe than sorry".

pegro · 2018-05-14T18:24:05Z

src/Application/Migrations/06_tickets.sql

@@ -136,6 +136,9 @@ WITHOUT OIDS;
 CREATE INDEX tbl_ticket_project_id_idx ON tbl_ticket USING btree(project_id);
 CREATE INDEX tbl_ticket_fahrplan_id_idx ON tbl_ticket USING btree(fahrplan_id);
 CREATE INDEX tbl_ticket_handle_id_idx ON tbl_ticket USING btree(handle_id);
+CREATE INDEX tbl_ticket_parent_id_idx ON tbl_ticket USING hash(parent_id);
+CREATE INDEX tbl_ticket_project_id_idx ON tbl_ticket USING hash(project_id);
+CREATE INDEX tbl_ticket_view_servicable_idx ON tbl_ticket USING btree(failed, service_executable, ticket_type);


Those indexes are missing in the migration file.

Mh I guess the whole migration patch file is not updated.
Edit: Oh you mentioned that in the PR comment. Sorry!

Oh, and tbl_ticket_project_id_idx already exists. See line 136.

Good catch with the duplicate index, will change the name.

a-tze · 2018-05-15T06:34:54Z

@pegro The "squashing in" did not mean I want to squash the commits or the PR once it is approved. It's the sole purpose of this PR to introduce multiple commits with more explanation.
What I would like to squash into other commits are the corrections that are done now, e.g. renaming the duplicate index or stuff like that, just to have a super clean history.
Does that make sense?

pegro · 2018-05-15T11:55:06Z

Yes, makes sense. ;)

a-tze · 2018-05-15T21:59:23Z

Ok.. rebased to current master, added migration and did some testing. The branch now includes a somewhat consistent renaming to dependee and depender.

@pegro @jjeising please do a final review. Note: GitHub seems to get the order of commits wrong.

pegro

Looks good.

pegro · 2018-05-15T22:14:31Z

src/Application/Migrations/06_tickets.sql

@@ -137,7 +137,6 @@ CREATE INDEX tbl_ticket_project_id_idx ON tbl_ticket USING btree(project_id);
 CREATE INDEX tbl_ticket_fahrplan_id_idx ON tbl_ticket USING btree(fahrplan_id);
 CREATE INDEX tbl_ticket_handle_id_idx ON tbl_ticket USING btree(handle_id);
 CREATE INDEX tbl_ticket_parent_id_idx ON tbl_ticket USING hash(parent_id);


The existing indexes were btree indexes, the added ones are hash indexes. According to the PostgreSQL documentation they are of limited use: Hash indexes can only handle simple equality comparisons [...]. Maybe change them to btree indexes?!

This is intentional. You usually do not query "less than" or "greater than" for things like IDs. Unfortunately in my tests those indexes were not used according to the execution plan when they were of type btree. Makes some sense in my head for JOINS. I can test that again if you think btree should be used by the query optimizer in every case a hash index is used.

Ok, so this was a conscious choice. Just wanted to make sure.
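
To re-test the btree variant without changing the schema permanently, something like the following could be used. The temporary index name and the literal 42 are made up for illustration. Note that DROP INDEX locks the table until the transaction ends, so this is meant for a test database only.

```sql
BEGIN;

-- Hide the hash index on parent_id for the duration of this transaction.
DROP INDEX tbl_ticket_parent_id_idx;

-- Hypothetical btree replacement to compare against.
CREATE INDEX tmp_ticket_parent_id_btree_idx ON tbl_ticket USING btree(parent_id);

-- Check whether the planner picks the btree index for an equality lookup.
EXPLAIN ANALYZE
SELECT id FROM tbl_ticket WHERE parent_id = 42;

-- Undo both changes; the original hash index is restored.
ROLLBACK;
```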

pegro · 2018-05-15T22:19:14Z

src/Application/Migrations/17_function_ticket_dependency_satisfied.sql

-			wantedstate.ticket_type = t2.ticket_type AND
-			wantedstate.ticket_state = p.dependent_ticket_trigger_state
+		tbl_ticket_state configured_trigger_state ON
+			depender_ticket.ticket_type = 'encoding' AND


I get that the JOIN conditions in lines 24 and 28 filter for encoding tickets, but since they don't refer to the table joined, there might be multiple ticket states with the same ticket_state value?!

But ticket_state is an enum and therefore unique, isn't it? The JOIN is meant to deliver the state attributes for encoding tickets and NULLs for other ticket types. So is your concern that these JOINs produce more tuples? I cannot think of a case where this might happen.

If your concern is that dependee_ticket_state and configured_trigger_state might contain the same tuple of the state table, then yes, this is possible of course.

ticket_state is an enum, but tbl_ticket_state is a table (which you are joining here) and it contains multiple rows with e.g. ticket_state = 'removing'. I know, it's not really likely that the trigger state would be one of those, but technically it could happen.

> The JOIN is meant to deliver the state attributes for encoding tickets

But the ticket_type of tbl_ticket_state is not part of any condition... That's what I'm referring to.

Now I know what you mean. I remembered the states as being unique, but they aren't. So this clash could happen for example with the state "gone". You are absolutely right, there needs to be an additional or changed condition. For easy review, I will put a commit for this on top of the branch (and another one for truncating the old migration file).

Looks good, although I'd have used dependee_ticket_state.ticket_type = depender_ticket.ticket_type just to have the hard-coded value only once.

And since you don't do any left joins and such, I'd suggest moving the depender_ticket.ticket_type = 'encoding' condition to the WHERE clause, to only have it once. The function should still return NULL if the ticket is not an encoding ticket.
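
The clash discussed in this thread can be reproduced in isolation. The sketch below uses a throwaway table with invented rows and sort values, not the real contents of tbl_ticket_state; it only shows why matching on ticket_state alone is ambiguous.

```sql
-- Hypothetical stand-in for tbl_ticket_state; rows and sort values are made up.
CREATE TEMPORARY TABLE demo_ticket_state (
	ticket_type text,
	ticket_state text,
	sort integer
);

INSERT INTO demo_ticket_state VALUES
	('recording', 'gone', 5),
	('encoding', 'gone', 9),
	('encoding', 'encoded', 3);

-- Matching on the state alone finds one row per ticket type that has this state.
SELECT count(*) FROM demo_ticket_state s
WHERE s.ticket_state = 'gone';

-- Adding the ticket type narrows the match to exactly one row.
SELECT count(*) FROM demo_ticket_state s
WHERE s.ticket_type = 'encoding' AND s.ticket_state = 'gone';
```

With these rows the first query counts two matches and the second counts one, which is the difference the additional JOIN condition is meant to enforce.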

pegro · 2018-05-15T22:21:43Z

src/Application/Migrations/__2018-04-15_add_variable_dependent_ticket_state.sql

@@ -3,17 +3,17 @@ BEGIN;
 SET ROLE TO postgres;


I guess this migration file could just be removed.

No, that's a bigger change, because it contains ADD COLUMN dependent_ticket_trigger_state, while the newer migration file contains a RENAME of that column.

I wanted to leave it in there in this PR because some installations are in that state currently. But it could be shrunk to the ADD COLUMN part.

Yes, I'd be in favor of shrinking.

a-tze · 2018-05-16T21:00:51Z

@pegro please have a look at the last 3 commits from the latest batch. Another round of testing is still outstanding.

pegro · 2018-05-16T21:26:23Z

src/Application/Migrations/17_function_ticket_dependency_satisfied.sql

-			wantedstate.ticket_type = t2.ticket_type AND
-			wantedstate.ticket_state = p.dependent_ticket_trigger_state
+		tbl_ticket_state configured_trigger_state ON
+			depender_ticket.ticket_type = 'encoding' AND


Looks good, although I'd have used dependee_ticket_state.ticket_type = depender_ticket.ticket_type just to have the hard-coded value only once.

pegro · 2018-05-16T21:30:57Z

src/Application/Migrations/17_function_ticket_dependency_satisfied.sql

-			wantedstate.ticket_type = t2.ticket_type AND
-			wantedstate.ticket_state = p.dependent_ticket_trigger_state
+		tbl_ticket_state configured_trigger_state ON
+			depender_ticket.ticket_type = 'encoding' AND


And since you don't do any left joins and such, I'd suggest moving the depender_ticket.ticket_type = 'encoding' condition to the WHERE clause, to only have it once. The function should still return NULL if the ticket is not an encoding ticket.

pegro · 2018-05-16T21:38:53Z

src/Application/Migrations/22_view_serviceable_tickets.sql

-	LEFT JOIN
-		tbl_ticket_state wantedstate ON wantedstate.ticket_type = 'encoding' AND wantedstate.ticket_state = pj.dependent_ticket_trigger_state
+		tbl_ticket_state configured_trigger_state ON
+			configured_trigger_state.ticket_type = 'encoding' AND


I guess configured_trigger_state.ticket_type = t.ticket_type would do the same, but it will always be encoding as long as we don't allow ticket types other than encoding to have an encoding_profile_version_id set.
I'm ok with this. It's just that I like to avoid as much hard-coding as possible without performance impact.

a-tze · 2018-05-25T13:28:40Z

@pegro good idea, I made this change for the function. Omitting the change for the view right now.

Also @jjeising: If no objections exist, I will merge this in the next 3 days so that the other PRs can move forward.

pegro · 2018-05-25T15:14:01Z

Fine with me

a-tze added this to To do in Documentation via automation Apr 28, 2018

jjeising removed this from To do in Documentation Apr 30, 2018

pegro reviewed May 14, 2018

a-tze added 12 commits May 15, 2018 23:54

Revert commits to start over. See #166

fb11b48

This reverts commits: 0a1412f 7c1b961 154cee0 fa337c8 b9fd435 46d6fb9

SQL: bugfix: use sort property of ticket state instead of direct enum…

617a3b1

… comparison Belongs to feature #157. Enums do not have correct native order, instead a dedicated sort property exists, so this must be used to check if the configured state is reached by the dependent ticket.

SQL: remove unused part of function

c65d44d

SQL: clarify table aliases

0232dff

SQL: optimize view, change subselect to join for room property

8d5af3b

SQL: optimize view, omit tickets from archived projects

a7832de

SQL: optimize view performance by adding suitable indexes and using t…

68aea60

…hem in view

SQL: optimize view: incorporate logic of ticket_priority

0850876

Using a function for ordering seems to be not optimized by Postgres and therefore very expensive. Additionally, it seems with older Postgres versions this makes the use of LIMIT impossible when querying the view.

SQL: remove redundant columns from view

aaf10e8

SQL: bugfix: exclude tickets whose dependee is a failed ticket

1184576

SQL: optimize view

f13ff3f

Eliminate another function call in a JOIN clause by incorporating the logic. Limit performance impact by using LATERAL JOIN.

SQL: clarify names in view, remove CASE that can never reach its else…

a6d8550

… branch

a-tze force-pushed the 166-workover branch from 5feeb93 to bf662d1 Compare May 15, 2018 21:54

a-tze changed the title ~~WIP: SQL changes workover~~ SQL changes workover May 15, 2018

pegro reviewed May 15, 2018

a-tze added 7 commits May 16, 2018 20:55

SQL, Model: refactor DB function name

bf4086b

SQL: refactor table names to clarify meaning

34b81c8

SQL: Refactor column name in tbl_project to clarify relationship

c7acb74

SQL: add migration for last changes:

f402327

* refactor names to clarify dependency relationship * add indexes for performance * copy logic of functions into view_servicable_tickets to improve query times

SQL: remove unused JOIN introduced in 0a1412f

2f052df

SQL: add JOIN condition to match on exact entry in ticket_state table

85506c9

SQL: truncate superfluous parts of previous migration file

ee56bc4

a-tze force-pushed the 166-workover branch from bf662d1 to ee56bc4 Compare May 16, 2018 18:56

pegro reviewed May 16, 2018

SQL: reduce occurences of hard coded values

ccbb19b

a-tze merged commit 65188b0 into master May 28, 2018

a-tze deleted the 166-workover branch May 28, 2018 16:39

a-tze mentioned this pull request Jun 12, 2018

Comment view_serviceable_tickets #166

Closed

jjeising mentioned this pull request Jun 26, 2018

Failed release of master ticket blocks dependent encodings #180

Open
