MDEV-11039 - Add new scheduling algorithm for reducing tail latencies (for 10.2) #248

Merged
merged 19 commits into MariaDB:10.2 on Oct 24, 2016

Contributor

sensssz commented Oct 22, 2016

This branch introduces a new scheduling algorithm, Variance-Aware Transaction Scheduling (VATS), for the InnoDB record lock manager, based on MariaDB 10.2. Instead of granting locks First-Come-First-Served (FCFS), the new algorithm prefers the oldest transaction. A configuration parameter, innodb_lock_schedule_algorithm, lets users choose between VATS and FCFS (the default). We have tested this algorithm extensively on many workloads. The algorithm is very simple and the changes are very local, but it significantly improves performance (average latency and throughput) and predictability (lower tail and quantile latencies). For more details, please refer to this paper: http://arxiv.org/abs/1602.01871
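
As an illustration of the difference described above, here is a minimal C++ sketch of how the next waiting request might be chosen under each policy. It is not the InnoDB code: `WaitingRequest`, `ScheduleAlgorithm`, and `pick_next` are hypothetical names introduced for this example, and lock-mode compatibility checks are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a waiting record-lock request.
struct WaitingRequest {
	int      trx_id;         // illustrative transaction identifier
	uint64_t trx_start_us;   // transaction start time; smaller means older
	uint64_t enqueue_order;  // position at which the request arrived
};

enum class ScheduleAlgorithm { FCFS, VATS };

// Return the index of the request to grant next, or -1 if nothing waits.
// FCFS picks the earliest arrival; VATS picks the oldest transaction.
int pick_next(const std::vector<WaitingRequest>& waiters,
	      ScheduleAlgorithm algo)
{
	int best = -1;
	for (int i = 0; i < static_cast<int>(waiters.size()); ++i) {
		if (best < 0) {
			best = i;
			continue;
		}
		const WaitingRequest& cand = waiters[i];
		const WaitingRequest& cur  = waiters[best];
		const bool better = (algo == ScheduleAlgorithm::VATS)
			? cand.trx_start_us < cur.trx_start_us
			: cand.enqueue_order < cur.enqueue_order;
		if (better) {
			best = i;
		}
	}
	return best;
}
```

With three waiters that arrived in the order T1, T2, T3 but whose transactions started in the order T2, T3, T1, the FCFS policy would pick T1 and the VATS policy would pick T2.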

@janlindstrom

The changes look correct. There are small differences between the style used here and the normal InnoDB style, because InnoDB uses tabs for indentation, not spaces. Could you please fix the indentation before I merge this? Also, there is currently no XtraDB in 10.2.

sql/rpl_parallel.cc
@@ -1410,7 +1410,7 @@ handle_rpl_parallel_thread(void *arg)
mysql_mutex_unlock(&rpt->LOCK_rpl_thread);
my_thread_end();

janlindstrom Oct 23, 2016

Contributor

This is unnecessary.

storage/innobase/include/trx0trx.h
@@ -1096,6 +1096,8 @@ struct trx_t {
time_t start_time; /*!< time the state last time became
TRX_STATE_ACTIVE */
clock_t start_time_micro; /*!< start time of the transaction
in microseconds. */

janlindstrom Oct 23, 2016

Contributor

Indentation does not look correct in all places; InnoDB uses tabs, not spaces.

storage/innobase/lock/lock0lock.cc
if (trx_is_high_priority(lock1->trx)) {
return true;
}
if (trx_is_high_priority(lock2->trx)) {

janlindstrom Oct 23, 2016

Contributor

Indentation does not look correct here.

storage/innobase/lock/lock0lock.cc
ulint space;
ulint page_no;
ulint rec_fold;
hash_table_t* hash;

janlindstrom Oct 23, 2016

Contributor

Indentation in this function does not look correct; use tab characters, not spaces.

storage/innobase/lock/lock0lock.cc
@@ -4823,7 +5197,7 @@ lock_release(
ut_d(lock_check_dict_lock(lock));
if (lock_get_type_low(lock) == LOCK_REC) {

janlindstrom Oct 23, 2016

Contributor

This looks unnecessary.

storage/innobase/lock/lock0lock.cc
@@ -52,12 +53,16 @@ Created 5/7/1996 Heikki Tuuri
#include "row0mysql.h"
#include "pars0pars.h"
#include <inttypes.h>

janlindstrom Oct 23, 2016

Contributor

Where do you use this?

Contributor

sensssz commented Oct 23, 2016

@janlindstrom Hi, Jan. I've replaced all space indentation with tab indentation in my code. I also undid all changes to XtraDB. Could you kindly take a look?

Contributor

svoj commented Oct 24, 2016

Hi Jiamin,

Thanks for your contribution. These changes were originally proposed for 10.1 in PR #245. Normally there's no need to create a separate pull request for 10.2, because it will be merged from 10.1 anyway.

But I guess the target version of this PR has not been decided yet, so I'll keep both PRs open.

Thanks,
Sergey

@svoj svoj changed the title from Add new scheduling algorithm for reducing tail latencies (for 10.2) to MDEV-11039 - Add new scheduling algorithm for reducing tail latencies (for 10.2) Oct 24, 2016

@janlindstrom janlindstrom merged commit b09b316 into MariaDB:10.2 Oct 24, 2016

1 check failed

continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Contributor

sensssz commented Oct 24, 2016

@svoj Hi, Sergey. I created a separate pull request for 10.2 because the locking system has undergone significant refactoring in that version, so the pull request for 10.1 would not be compatible with 10.2. Also, as we understand it, it would be easier for you to accept new features in version 10.2.

This patch contains several fixes that haven't been applied to the previous patch yet. Do you want me to update that patch as well?

Contributor

janlindstrom commented Oct 24, 2016

Actually, I accepted this for 10.1 as well. So if some fixes are needed, could you create a new pull request for them and make VATS the default?

Contributor

svoj commented Oct 24, 2016

@sensssz, you're right. It makes perfect sense if the code is too different between versions. Please follow up with Jan regarding additional patches.

Contributor

sensssz commented Oct 24, 2016

@janlindstrom Sure, I will create a pull request for 10.1. For that version, should I also implement it in XtraDB, or should I undo all the changes to XtraDB?

barzan commented Oct 24, 2016

Why not keep the XtraDB changes (for 10.1)?

Member

vuvova commented Oct 25, 2016

@janlindstrom I don't think this should go into 10.1 at all. 10.1 has been GA for quite a while, so no new features.

Contributor

janlindstrom commented Oct 25, 2016

No undo for XtraDB and for 10.1 for now.

@svoj svoj added this to the 10.2 milestone Jul 11, 2017

Contributor

dr-m commented Jul 5, 2018

@sensssz Hi Jiamin,
The parameter innodb_lock_schedule_algorithm=vats seems to be causing a debug assertion failure in MariaDB, reported as MDEV-16664.

I see that you also contributed this to MySQL, and it was introduced there under WL#10793, MySQL Bug#84266, mysql/mysql-server#115, mysql/mysql-server@fb056f4

In MySQL 8.0, I found the following bug fixes related to this:
mysql/mysql-server@f395242 (a test case refers to MySQL Bug #89829, which I cannot access).
mysql/mysql-server@2248a3e, mysql/mysql-server@c917825 (these could be addressing a similar issue as MDEV-16664)
mysql/mysql-server@c1990af (memory management issue, apparently specific to the MySQL 8.0 implementation; in MariaDB lock_grant_and_move_on_page() which corresponds to the MySQL 8.0 lock_grant_vats() does not allocate memory)

It looks like MySQL 8.0 employs a threshold based on lock_sys->n_waiting < LOCK_VATS_THRESHOLD (defined as 32), so it would be dynamically choosing between the algorithms FCFS and VATS (also called CATS by them) based on load. This could make it harder to repeat the MDEV-16664 failure scenario in MySQL 8.0.

I would appreciate it if you could comment on how the implementations of the algorithm differ between MariaDB 10.2 and MySQL 8.0.
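
The load-based switch mentioned above can be sketched roughly as follows. This is only an illustration of the quoted condition, not the MySQL 8.0 source: `GrantPolicy`, `kVatsThreshold`, and `choose_policy` are hypothetical names, and the sketch assumes that the condition `n_waiting < LOCK_VATS_THRESHOLD` selects FCFS under light load.

```cpp
#include <cassert>
#include <cstddef>

enum class GrantPolicy { FCFS, VATS };

// Value quoted in the comment above for LOCK_VATS_THRESHOLD.
constexpr std::size_t kVatsThreshold = 32;

// Assumption: with few waiting lock requests the plain FCFS order is kept,
// and the age-based policy is used only once contention builds up.
GrantPolicy choose_policy(std::size_t n_waiting)
{
	return n_waiting < kVatsThreshold ? GrantPolicy::FCFS
					  : GrantPolicy::VATS;
}
```

Under this reading, a test that produces only a handful of lock waits would never leave the FCFS path, which is consistent with the remark that the failure may be harder to repeat in MySQL 8.0.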

Contributor

sensssz commented Jul 5, 2018

Hi, Marko,

The algorithm we merged into MariaDB is actually different from the one we merged into MySQL 8.0. They are based on two of our papers (VATS, CATS). VATS grants locks to transactions according to their age, while CATS does so according to each transaction's dependency set size (the number of transactions blocked on it). CATS is therefore much more complicated to implement, because it also needs to maintain each transaction's dependency set size.

For VATS, the idea is very simple: it inserts transactions into the queue according to their age, so that locks are granted in that order. One additional thing I did was to keep the granted locks at the front of the queue.

Is there a specific test case that I can use to reproduce the problem? I will look into this problem as well.
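
The queue discipline described for VATS can be sketched as follows. This is a simplified illustration under stated assumptions, not the code merged in this PR: `QueuedLock`, `vats_enqueue`, and `vats_grant_next` are hypothetical names, age is represented by a start timestamp, and lock-mode conflicts are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical, simplified entry in a record's lock queue.
struct QueuedLock {
	int      trx_id;        // illustrative transaction identifier
	uint64_t trx_start_us;  // transaction start time; smaller means older
	bool     granted;       // true once the lock has been granted
};

// Insert a waiting request so that granted locks stay at the front and
// waiting requests behind them are ordered from oldest to youngest.
void vats_enqueue(std::list<QueuedLock>& queue, const QueuedLock& waiter)
{
	auto pos = queue.begin();
	while (pos != queue.end()
	       && (pos->granted
		   || pos->trx_start_us <= waiter.trx_start_us)) {
		++pos;
	}
	queue.insert(pos, waiter);
}

// Grant the first waiting request, which is the oldest waiter because of
// the insertion order above. Returns its trx_id, or -1 if none is waiting.
int vats_grant_next(std::list<QueuedLock>& queue)
{
	for (QueuedLock& lock : queue) {
		if (!lock.granted) {
			lock.granted = true;
			return lock.trx_id;
		}
	}
	return -1;
}
```

Because the newly granted entry sits directly behind the already granted ones, granting from the head of the waiting section preserves the "granted locks first" layout without moving any entries.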
