Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-11039 - Add new scheduling algorithm for reducing tail latencies (for 10.2) #248

Merged
merged 19 commits into from
Oct 24, 2016

Conversation

sensssz
Copy link
Contributor

@sensssz sensssz commented Oct 22, 2016

This branch introduces a new scheduling algorithm (Variance-Aware-Transaction-Scheduling, VATS) for the record lock manager of InnoDB based on MariaDB 10.2. Instead of using First-Come-First-Served (FCFS), the newly introduced algorithm prefers the eldest transaction. A configuration parameter (innodb_lock_schedule_algorithm) is introduced for users to choose between VATS and FCFS (the default one). We've extensively tested this algorithm in many workloads. The algorithm is very simple, and the changes are very local, but it significantly improves performance (in terms of average latency and throughput) and predictability (in terms of reduction of tail and quantile latencies) For more details, please refer to this paper http://arxiv.org/abs/1602.01871

Copy link
Contributor

@janlindstrom janlindstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look correct and fine. There is small differences on used style and normal InnoDB style because InnoDB style uses tab for indention not spaces. Could you please fix the indention before I will merge this. Currently on 10.2 there is no xtradb.

@@ -1410,7 +1410,7 @@ handle_rpl_parallel_thread(void *arg)
mysql_mutex_unlock(&rpt->LOCK_rpl_thread);

my_thread_end();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary.

@@ -1096,6 +1096,8 @@ struct trx_t {

time_t start_time; /*!< time the state last time became
TRX_STATE_ACTIVE */
clock_t start_time_micro; /*!< start time of the transaction
in microseconds. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indention does not look correct in all places, InnoDB uses tabs not spaces.

if (trx_is_high_priority(lock1->trx)) {
return true;
}
if (trx_is_high_priority(lock2->trx)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indention does not look correct here.

ulint space;
ulint page_no;
ulint rec_fold;
hash_table_t* hash;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indention in this function does not look correct, use tab-characters not spaces.

@@ -4823,7 +5197,7 @@ lock_release(
ut_d(lock_check_dict_lock(lock));

if (lock_get_type_low(lock) == LOCK_REC) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks unnecessary.

@@ -52,12 +53,16 @@ Created 5/7/1996 Heikki Tuuri
#include "row0mysql.h"
#include "pars0pars.h"

#include <inttypes.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where you use this ?

@sensssz
Copy link
Contributor Author

sensssz commented Oct 23, 2016

@janlindstrom Hi, Jan. I've replaced all space indention with tab indention in my code. I also undid all changes to XtraDB. Could you kindly take a look?

@svoj
Copy link

svoj commented Oct 24, 2016

Hi Jiamin,

Thanks for your contribution. These changes were originally proposed for 10.1 in PR#245, Normally there's no need to create separate pull request for 10.2, because it will be merged from 10.1 anyway.

But I guess target version of this PR is not decided currently, so I keep both PR's open.

Thanks,
Sergey

@svoj svoj changed the title Add new scheduling algorithm for reducing tail latencies (for 10.2) MDEV-11039 - Add new scheduling algorithm for reducing tail latencies (for 10.2) Oct 24, 2016
@janlindstrom janlindstrom merged commit b09b316 into MariaDB:10.2 Oct 24, 2016
@sensssz
Copy link
Contributor Author

sensssz commented Oct 24, 2016

@svoj Hi, Sergey. I created a separate pull request for 10.2 because the locking system has undergone some significant code refactor in it, so the pull request for 10.1 would not be compatible with 10.2. Also from our information it would be easier for you to accept new features in version 10.2.

This patch contains several fixes that hasn't been adopted in the previous patch yet. Do you want me to update that patch as well?

@janlindstrom
Copy link
Contributor

Actually, I did accept this also for 10.1. So if there is some fixed needed could you create new pull request for them and make VATS as default.

@svoj
Copy link

svoj commented Oct 24, 2016

@sensssz, you're right. It perfectly makes sense if code is too different between versions. Please follow up with Jan regarding additional patches.

@sensssz
Copy link
Contributor Author

sensssz commented Oct 24, 2016

@janlindstrom Sure I will create a pull request for 10.1. For that version should I also implement it in XtraDB or should I undo all the changes to XtraDB?

@barzan
Copy link

barzan commented Oct 24, 2016

Why not keep the XtraDB changes (for 10.1)?

@vuvova
Copy link
Member

vuvova commented Oct 25, 2016

@janlindstrom I don't think this should go into 10.1 at all, 10.1 is GA for quite a while, no new features.

@janlindstrom
Copy link
Contributor

No undo for XtraDB and for 10.1 for now.

@svoj svoj added this to the 10.2 milestone Jul 11, 2017
@dr-m
Copy link
Contributor

dr-m commented Jul 5, 2018

@sensssz Hi Jiamin,
The parameter innodb_lock_schedule_algorithm=vats seems to be causing a debug assertion failure in MariaDB, reported as MDEV-16664.

I see that you also contributed this to MySQL, and it was introduced there under WL#10793, MySQL Bug#84266, mysql/mysql-server#115, mysql/mysql-server@fb056f4

In MySQL 8.0, I found the following bug fixes related to this:
mysql/mysql-server@f395242 (a test case refers to MySQL Bug #89829, which I cannot access).
mysql/mysql-server@2248a3e, mysql/mysql-server@c917825 (these could be addressing a similar issue as MDEV-16664)
mysql/mysql-server@c1990af (memory management issue, apparently specific to the MySQL 8.0 implementation; in MariaDB lock_grant_and_move_on_page() which corresponds to the MySQL 8.0 lock_grant_vats() does not allocate memory)

It looks like MySQL 8.0 employs a threshold based on lock_sys->n_waiting < LOCK_VATS_THRESHOLD (defined as 32), so it would be dynamically choosing between the algorithms FCFS and VATS (also called CATS by them) based on load. This could make it harder to repeat the MDEV-16664 failure scenario in MySQL 8.0.

I would appreciate it if you could comment how the implementations of the algorithm differ between MariaDB 10.2 and MySQL 8.0.

@sensssz
Copy link
Contributor Author

sensssz commented Jul 5, 2018

Hi, Marko,

The algorithms we merged into MariaDB is actually different from what we merged into MySQL 8.0. They are based on two of our papers (VATS, CATS). VATS grants locks to transactions according to their age, while CATS does so according to each transaction's dependency set size (the number of transactions being blocked on it). Therefore, CATS's implementation is much more complicated because it also needs to maintain each transaction's dependency set size.

For VATS, the idea is very simple: it inserts transactions to the queue according to their ages so that they are granted in this order. An additional thing I did was to keep the granted locks at the front of the queue.

Is there a specific test case that I can use to reproduce the problem? I will look into this problem as well.

@vuvova
Copy link
Member

vuvova commented Aug 17, 2020

Hi, @sensssz,

sorry that you didn't get a reply earlier.

MDEV-16664 is still not closed. May be you'd like to check the issue itself, it's https://jira.mariadb.org/browse/MDEV-16664

There are few ways to reproduce the problem, in issue comments. Either with an RQG (random query generator, https://github.com/MariaDB/randgen) or simply by running one of the mysql-test cases on repeat.

@janlindstrom
Copy link
Contributor

One additional problem with VATS is the fact that it is not compatible with Galera replication where if lock request is done by brute force thread (BF) no other lock request may granted before it as BF transactions may not wait for locks (if there is already granted lock request its holder thread is selected as victim and killed). This fact naturally can be fixed with, if wsrep_thd_is_BF(trx->mysql_thd) we set transaction age so high that no other transaction lock request can be granted before it.

@dr-m
Copy link
Contributor

dr-m commented Sep 18, 2020

@sensssz, I posted in MDEV-16664 some information from an rr replay trace that I analyzed. It looks like two LOCK_X were wrongly granted on the same record for two concurrently running DELETE statements. Can you please check that analysis and suggest what to check further? The GDB commands watch and break are rather powerful in rr replay, but only when you know what to look for. Somehow the conflicting lock is not being reached via the hash table.

@dr-m
Copy link
Contributor

dr-m commented Oct 2, 2020

@sensssz Due to lack of response, I am going to deprecate the configuration parameter innodb_lock_schedule_algorithm in MariaDB Server 10.2 (issue a warning if it is set to the value VATS) and remove the parameter in MariaDB Server 10.6.

@dr-m
Copy link
Contributor

dr-m commented Oct 5, 2020

I added a deprecation and corruption warning in 295e2d5
and removed this code from the MariaDB Server 10.6 branch in b4fb15c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
6 participants