Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Session table improvements #19708

Closed
wants to merge 1 commit into from

Conversation

csthomas
Copy link
Contributor

@csthomas csthomas commented Feb 16, 2018

Summary of Changes

This PR provides an improvement in the performance of high-traffic sites.

  1. Change type of column session_id from varchar to varbinary.
  2. Change type of column time from varchar to integer.

Why I changed time column to integer?
Before this PR the value of time column is stored as text.
Comparing text to text such as "a" <'b' can not use the index key.

When joomla deletes all expired sessions then need to scan whole table because the index (of time column) is useless.

Drawback of varchar:

  • varchar type use language processing on comparison, it is slower
  • varchar takes up to 4 bytes per character when it is loaded into RAM
  • when we compare strings like '21' < '110' then the result returns false (instead true), see Can't login after setting lifetime to 15000000 #13933

Why I changed session_id column to varbinary?
A good explanation is at https://github.com/symfony/http-foundation/blob/master/Session/Storage/Handler/PdoSessionHandler.php#L214

Testing Instructions

  1. Install joomla 3.8.6 or newer
  2. Login and apply this PR 19708, for example by com_patchtester. For PostgreSQL database you should apply additional PR Repair the update of database schema changes on postgreSQL #19707 - it is merged.
  3. Go to backend -> extension -> manage -> database -> and click on fix button.
  4. Everything should go without errors. You should be still logged in.
  5. Your session works as before. You can login/logout without any problems.

Expected result

All works as before.

@@ -76,7 +76,7 @@ public function write($id, $data)
$query = $db->getQuery(true)
->update($db->quoteName('#__session'))
->set($db->quoteName('data') . ' = ' . $db->quote($data))
->set($db->quoteName('time') . ' = ' . $db->quote((int) time()))
->set($db->quoteName('time') . ' = ' . (int) time())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time() returns an integer. Do we need (int)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I should do this as in php5 there is an extension apd, pecl install apd and there is a function override_function(), which can change time() function to something else.
For now, I leave it as it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By that argument you can't rely on any of the core PHP functions to do as they document. Should someone be running one of those types of override tools or custom compiling PHP and changing C level internals, it's a problem for them and it's not something we can account for at our level without not using the language features.

@alikon
Copy link
Contributor

alikon commented Feb 17, 2018

we should take care of the update path, of course, but a good move

@joeforjoomla
Copy link
Contributor

Is this something that may happen again switching to VARBINARY?
#9423

@csthomas
Copy link
Contributor Author

@joeforjoomla

I do not know yet.
If someone use join to session table using session_id column then it would be good to check that.

@mbabker
Copy link
Contributor

mbabker commented Feb 17, 2018

If someone use join to session table using session_id column then it would be good to check that.

There are people joining the session table on that column for reasons I do not comprehend (I didn't in 2015 and I don't in 2018, I have not found one legitimate use case where my database schema needs a FK to a table storing the session identifier).

@joeforjoomla
Copy link
Contributor

A common use case to join the session table is for example when dealing with a chat extension allowing communication between guest users.

@csthomas
Copy link
Contributor Author

I did simple tests with success:

SELECT s.session_id, old.session_id FROM `j38_session` s INNER JOIN `j38_session_old` old ON s.`session_id` = old.`session_id`;
SELECT * FROM `j38_session` WHERE session_id = '4m82kgvd3nr49s3melhnmejqlg';

@joeforjoomla
Copy link
Contributor

joeforjoomla commented Feb 17, 2018

I did tests too, using MySQL 5.1, 5.5 and 5.7
In all cases with success, no errors are thrown because of the different field type/charset/collation so defintely there should not be any problem.

@Quy
Copy link
Contributor

Quy commented Feb 17, 2018

@joeforjoomla Please mark your test at Issue Tracker. Thanks.

@csthomas
Copy link
Contributor Author

It is too early to mark the test as a success.

This PR probably has to wait until #19687 will be merged.

`client_id` tinyint(3) unsigned DEFAULT NULL,
`guest` tinyint(4) unsigned DEFAULT 1,
`time` varchar(14) DEFAULT '',
`guest` tinyint(3) unsigned DEFAULT 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not be changed from 4 to 3 since it is not done during updates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Quy
Copy link
Contributor

Quy commented Feb 23, 2018

@csthomas #19750 mentioned adding an index on guest column. Would that be another improvement to consider?

@csthomas
Copy link
Contributor Author

Yes, we can probably use the userid column instead of using guest. Column userid is already indexed.
If there is a relation like: if guest==0 then userid!=0 and if guest==1 then userid==0 then it should be ok.

* Add missing alter table for guest
* Remove cast to int for the result of time()
@csthomas csthomas changed the title [WIP] Session table improvements Session table improvements Feb 26, 2018
@csthomas
Copy link
Contributor Author

csthomas commented Feb 26, 2018

I updated the code. Now this PR is ready for testing.

@csthomas
Copy link
Contributor Author

@joeforjoomla Could you re-test this PR.

@csthomas
Copy link
Contributor Author

@BrainforgeUK Because you know the subject well, could you test it?

@joeforjoomla
Copy link
Contributor

@csthomas I re-tested this PR again up to MySQL 8.0 without any issue when joining between #__session.session_id AS varinary(192) and #__customtable.session_id AS varchar(191)

However i can't ensure that everything will work correctly on other RDBMS such as MariaDB, Postgresql, etc

@csthomas
Copy link
Contributor Author

@joeforjoomla thanks,

It is enough, great, now we need one more test and one of the maintainers will decide.

@ghost
Copy link

ghost commented Aug 14, 2018

@joeforjoomla please mark your Test as successfully:

  • open Issue Tracker
  • Login with your github-Account
  • Click on blue "Test this"-Button above Authors-Picture
  • mark your Test as successfully
  • hit "submit test result"

@joeforjoomla
Copy link
Contributor

I have tested this item ✅ successfully on 104f824


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/19708.

ALTER TABLE "#__session" ALTER COLUMN "session_id" TYPE bytea USING "session_id"::bytea;
ALTER TABLE "#__session" ALTER COLUMN "session_id" SET NOT NULL;
ALTER TABLE "#__session" ALTER COLUMN "time" DROP DEFAULT,
ALTER COLUMN "time" TYPE integer USING "time"::integer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above will work when executing the file on upgrade,
but will it work with the Database "Fix" button ?

The SQL parsing code for every ALTER TABLE statement seems to handle only 1 altering sub-clause case per ALTER TABLE statement
(i have not tested i just reviewed the following file and i think remember similar case for Mysql in the past)

libraries\src\Schema\ChangeItem\PostgresqlChangeItem.php
(same limitation i think applies for MysqlChangeItem too class)

So I think it should become

ALTER TABLE "#__session" ALTER COLUMN "time" DROP DEFAULT;
ALTER TABLE "#__session" ALTER COLUMN "time" TYPE integer USING "time"::integer;

If it can handle multiple sub-clauses then please ignore this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above will work when executing the file on upgrade,
but will it work with the Database "Fix" button ?

Yes

@csthomas
Copy link
Contributor Author

@ggppdk

The SQL parsing code for every ALTER TABLE statement seems to handle only 1 altering sub-clause case per ALTER TABLE statement

In short, joomla will see only the last sub-clause (to compare difference with db), but db did all clauses.

Explanation for PostgreSQL,

ALTER TABLE "#__session" ALTER COLUMN "time" DROP DEFAULT,
                         ALTER COLUMN "time" TYPE integer USING "time"::integer;
ALTER TABLE "#__session" ALTER COLUMN "time" SET DEFAULT 0;

** The first line ** removes the default value of the time column, but joomla did not see it because there is a second line (after the comma) - joomla should not know that the default value is removed, because it back in the third line,
in the second line, we change the column type, and now joomla sees it,
in the third line we set a new default column value and joomla also sees it.

We use two operations in one "ALTER TABLE" so that joomla can not see the change in the first line. This is useful when we can not directly change the default value (or change the INDEX of column), but first we need to remove the default value and then add another value.

I have tested it again a few minutes ago (by update) and in pgAdmin3 now I have:

  "time" integer NOT NULL DEFAULT 0,

@csthomas
Copy link
Contributor Author

Joomla on PostreSQL before click on "Database Fix"

screenshot_20180814_145204

@alikon
Copy link
Contributor

alikon commented Aug 14, 2018

I have tested this item ✅ successfully on 104f824

can't say i was able to meter performance gain, but it's ok for me


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/19708.

@ghost
Copy link

ghost commented Aug 14, 2018

Ready to Commit after two successful tests.

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label Aug 14, 2018
@csthomas
Copy link
Contributor Author

Thank you all for your interest.

@mbabker mbabker added this to the Joomla 3.9.0 milestone Aug 21, 2018
mbabker pushed a commit that referenced this pull request Aug 28, 2018
@mbabker
Copy link
Contributor

mbabker commented Aug 28, 2018

Merged to 3.9-dev via 6d1e90f

@mbabker mbabker closed this Aug 28, 2018
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Aug 28, 2018
@csthomas csthomas deleted the session_id_and_time branch August 29, 2018 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants