New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #4337 #4338

Merged
merged 2 commits into from Oct 1, 2017

Conversation

Projects
None yet
4 participants
@Sesquipedalian
Member

Sesquipedalian commented Sep 29, 2017

Adds t.id_topic to the GROUP BY clause in the check_query of the poll_options_missing_poll test in loadForumTests(). This makes the query compatible with ONLY_FULL_GROUP_BY mode.

If @albertlast has any feedback, that'd be appreciated.

Fixes #4337
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
@albertlast

This comment has been minimized.

Show comment
Hide comment
@albertlast

albertlast Sep 29, 2017

Collaborator

In my eyes your missing the other columns?
t.id_board, t.id_member_started, m.member_name

Also this query: https://github.com/Sesquipedalian/SMF2.1/blob/23c1bc281ca90cdd06f7c58ed04d64a56d5fa34f/Sources/RepairBoards.php#L1857
doesn't work under pg and also maybe not for innodb, but it create no error...

Collaborator

albertlast commented Sep 29, 2017

In my eyes your missing the other columns?
t.id_board, t.id_member_started, m.member_name

Also this query: https://github.com/Sesquipedalian/SMF2.1/blob/23c1bc281ca90cdd06f7c58ed04d64a56d5fa34f/Sources/RepairBoards.php#L1857
doesn't work under pg and also maybe not for innodb, but it create no error...

@Sesquipedalian

This comment has been minimized.

Show comment
Hide comment
@Sesquipedalian

Sesquipedalian Sep 29, 2017

Member

In my eyes your missing the other columns?
t.id_board, t.id_member_started, m.member_name

I don't get any errors even if those are not included. I was surprised by that, too. Do you think it would be better to include them anyway?

Also this query: https://github.com/Sesquipedalian/SMF2.1/blob/23c1bc281ca90cdd06f7c58ed04d64a56d5fa34f/Sources/RepairBoards.php#L1857
doesn't work under pg and also maybe not for innodb, but it create no error...

Would you mind putting together a PR to fix that?

Member

Sesquipedalian commented Sep 29, 2017

In my eyes your missing the other columns?
t.id_board, t.id_member_started, m.member_name

I don't get any errors even if those are not included. I was surprised by that, too. Do you think it would be better to include them anyway?

Also this query: https://github.com/Sesquipedalian/SMF2.1/blob/23c1bc281ca90cdd06f7c58ed04d64a56d5fa34f/Sources/RepairBoards.php#L1857
doesn't work under pg and also maybe not for innodb, but it create no error...

Would you mind putting together a PR to fix that?

@sbulen

This comment has been minimized.

Show comment
Hide comment
@sbulen

sbulen Sep 30, 2017

Contributor

ALTER TABLE ORDER BY is a proprietary MySQL command. There is no equivalent elsewhere. It physically rewrites the table in the order specified. (Wow... ?!?!?!?)

A quick search reveals it is known to be problematic. This article describes it & some of the wrinkles:
https://dev.mysql.com/worklog/task/?id=3681

There is a special pg identifier for "order_by_board_order" referenced, but it doesn't appear to be defined in subs_db_postgresql. I believe they can provide pg hints or unique SQL substitutions. (The special pg identifiers are supposed to allow for differences between pg & mysql.)

So... If I understand this correctly (big if...) there are two options - (1) fix the pg identifier and leave the ALTER TABLE alone, OR, (2) do not use the ALTER TABLE ORDER BY and instead add the index on board_order everywhere.

I suspect (1) is safer, since most users are on MySQL. (2) is more purist, and is likely the better long-term solution, but we'd have to look very carefully at board queries to ensure no performance hit.

Contributor

sbulen commented Sep 30, 2017

ALTER TABLE ORDER BY is a proprietary MySQL command. There is no equivalent elsewhere. It physically rewrites the table in the order specified. (Wow... ?!?!?!?)

A quick search reveals it is known to be problematic. This article describes it & some of the wrinkles:
https://dev.mysql.com/worklog/task/?id=3681

There is a special pg identifier for "order_by_board_order" referenced, but it doesn't appear to be defined in subs_db_postgresql. I believe they can provide pg hints or unique SQL substitutions. (The special pg identifiers are supposed to allow for differences between pg & mysql.)

So... If I understand this correctly (big if...) there are two options - (1) fix the pg identifier and leave the ALTER TABLE alone, OR, (2) do not use the ALTER TABLE ORDER BY and instead add the index on board_order everywhere.

I suspect (1) is safer, since most users are on MySQL. (2) is more purist, and is likely the better long-term solution, but we'd have to look very carefully at board queries to ensure no performance hit.

@albertlast

This comment has been minimized.

Show comment
Hide comment
@albertlast

albertlast Sep 30, 2017

Collaborator

I don't get any errors even if those are not included. I was surprised by that, too. Do you think it would be better to include them anyway?

That is no surprised because the other columns are functionally dependent, https://dev.mysql.com/doc/refman/5.7/en/group-by-functional-dependence.html
but when you look at the doc you see mysql below 5.7 doesn't support this,
that is the reason that you had to maintain the complet list.

Would you mind putting together a PR to fix that?

when you don't do it than i had to do it.

ALTER TABLE ORDER BY is a proprietary MySQL command. There is no equivalent elsewhere. It physically rewrites the table in the order specified. (Wow... ?!?!?!?)

commonly known as cluster: https://www.postgresql.org/docs/current/static/sql-cluster.html

So... If I understand this correctly (big if...) there are two options - (1) fix the pg identifier and leave the ALTER TABLE alone, OR, (2) do not use the ALTER TABLE ORDER BY and instead add the index on board_order everywhere.

Maybe i'm alone but i ask my self why we cluster the table by board_order,
in my eyes it's make no sense in worst case it's make everything slower.
Since the command make no sense to me would be solution for pg to skip this,
or to cluster this for a reasonable col.

Collaborator

albertlast commented Sep 30, 2017

I don't get any errors even if those are not included. I was surprised by that, too. Do you think it would be better to include them anyway?

That is no surprised because the other columns are functionally dependent, https://dev.mysql.com/doc/refman/5.7/en/group-by-functional-dependence.html
but when you look at the doc you see mysql below 5.7 doesn't support this,
that is the reason that you had to maintain the complet list.

Would you mind putting together a PR to fix that?

when you don't do it than i had to do it.

ALTER TABLE ORDER BY is a proprietary MySQL command. There is no equivalent elsewhere. It physically rewrites the table in the order specified. (Wow... ?!?!?!?)

commonly known as cluster: https://www.postgresql.org/docs/current/static/sql-cluster.html

So... If I understand this correctly (big if...) there are two options - (1) fix the pg identifier and leave the ALTER TABLE alone, OR, (2) do not use the ALTER TABLE ORDER BY and instead add the index on board_order everywhere.

Maybe i'm alone but i ask my self why we cluster the table by board_order,
in my eyes it's make no sense in worst case it's make everything slower.
Since the command make no sense to me would be solution for pg to skip this,
or to cluster this for a reasonable col.

@sbulen

This comment has been minimized.

Show comment
Hide comment
@sbulen

sbulen Sep 30, 2017

Contributor

Ah! I wasn't aware of pg's cluster. So this may be an opportunity for an smcFunc, where pg uses the cluster & mysql uses the alter table order by.

What I think the rationale is this... I believe the underlying assumption is that getting all board records in board_order is going to happen extremely often (at least every time you load the board-index???). Having the table already in physical order is quicker. Especially where the number of boards is relatively low (less than a couple hundred), and you need to read the whole table, already sequenced, in just one or two physical I/Os. In these cases, even an index adds unnecessary overhead.

I've seen similar structures used in the past. Extremely high volume reads, low volume data, low volatility. We used to worry about such things... Making it so a full table scan + already in the desired order in just a couple I/O = the perfect solution.

And you are correct @albertlast if these conditions are NOT met, just indexing it like everything else is definitely the way to go.

The real question is whether those base assumptions stand. If so, I'd lean towards an smcFunc. If not, no need for clustering, and a normal index would work.

Contributor

sbulen commented Sep 30, 2017

Ah! I wasn't aware of pg's cluster. So this may be an opportunity for an smcFunc, where pg uses the cluster & mysql uses the alter table order by.

What I think the rationale is this... I believe the underlying assumption is that getting all board records in board_order is going to happen extremely often (at least every time you load the board-index???). Having the table already in physical order is quicker. Especially where the number of boards is relatively low (less than a couple hundred), and you need to read the whole table, already sequenced, in just one or two physical I/Os. In these cases, even an index adds unnecessary overhead.

I've seen similar structures used in the past. Extremely high volume reads, low volume data, low volatility. We used to worry about such things... Making it so a full table scan + already in the desired order in just a couple I/O = the perfect solution.

And you are correct @albertlast if these conditions are NOT met, just indexing it like everything else is definitely the way to go.

The real question is whether those base assumptions stand. If so, I'd lean towards an smcFunc. If not, no need for clustering, and a normal index would work.

@albertlast

This comment has been minimized.

Show comment
Hide comment
@albertlast

albertlast Sep 30, 2017

Collaborator

Since you got sub-boards and subs-subs-boards etc. and not all level are shown on "board-index" ther is no guarantee that this order is good.
Also you could had hav user how are not allowed to see all boards...
bring me get to my question if it realy a good idea to cluster the table by board_order...

When you want to to use the pg cluster function than you need a index how describe the cluster and
for board_order didn't exists one.

Collaborator

albertlast commented Sep 30, 2017

Since you got sub-boards and subs-subs-boards etc. and not all level are shown on "board-index" ther is no guarantee that this order is good.
Also you could had hav user how are not allowed to see all boards...
bring me get to my question if it realy a good idea to cluster the table by board_order...

When you want to to use the pg cluster function than you need a index how describe the cluster and
for board_order didn't exists one.

Includes all selected columns in GROUP BY clause
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
@Sesquipedalian

This comment has been minimized.

Show comment
Hide comment
@Sesquipedalian

Sesquipedalian Sep 30, 2017

Member

when you don't do it than i had to do it.

Since (1) it isn't the problem this PR is trying to correct, and (2) you know SQL better than I do, I think I will just leave it alone in this PR.

Member

Sesquipedalian commented Sep 30, 2017

when you don't do it than i had to do it.

Since (1) it isn't the problem this PR is trying to correct, and (2) you know SQL better than I do, I think I will just leave it alone in this PR.

@jdarwood007

This comment has been minimized.

Show comment
Hide comment
@jdarwood007

jdarwood007 Sep 30, 2017

Member

The query also doesn't work with innodb if I remember correctly. I have a feeling this code is from the MyISAM times and when this provided a performance boost. Nowadays I don't think it provides any performance benefits and innodb is now standard.

Member

jdarwood007 commented Sep 30, 2017

The query also doesn't work with innodb if I remember correctly. I have a feeling this code is from the MyISAM times and when this provided a performance boost. Nowadays I don't think it provides any performance benefits and innodb is now standard.

@albertlast

This comment has been minimized.

Show comment
Hide comment
@albertlast

albertlast Sep 30, 2017

Collaborator

When i had to provide a pr for this,
than would be content of my pr the delete of this query nothing more.

Collaborator

albertlast commented Sep 30, 2017

When i had to provide a pr for this,
than would be content of my pr the delete of this query nothing more.

@albertlast

This comment has been minimized.

Show comment
Hide comment
@albertlast

albertlast Oct 1, 2017

Collaborator

The pr #4347 is in place

Collaborator

albertlast commented Oct 1, 2017

The pr #4347 is in place

@Sesquipedalian Sesquipedalian merged commit 2c7756a into SimpleMachines:release-2.1 Oct 1, 2017

2 checks passed

Scrutinizer No new issues
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@Sesquipedalian Sesquipedalian deleted the Sesquipedalian:repair_boards branch Oct 21, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment