Skip to content

Commit

Permalink
MDEV-25121: innodb_flush_method=O_DIRECT fails on compressed tables
Browse files Browse the repository at this point in the history
Tests with 4096-byte sector size confirm that it is
safe to use O_DIRECT with page_compressed tables.
That had been disabled on Linux, in an attempt to fix MDEV-21584
which had been filed for the O_DIRECT problems earlier.

The fil_node_t::block_size was being set mostly correctly until
commit 10dd290 (MDEV-17380)
introduced a regression in MariaDB Server 10.4.4.

fil_node_t::read_page0(): Initialize fil_node_t::block_size.
This will probably make similar code in fil_space_extend_must_retry()
redundant, but we play it safe and will not remove that code.

Thanks to Vladislav Vaintroub for testing this on Microsoft Windows
using an old-fashioned rotational hard disk with 4KiB sector size.

Reviewed by: Vladislav Vaintroub
  • Loading branch information
dr-m committed Mar 18, 2021
1 parent 00f620b commit 6505662
Showing 1 changed file with 27 additions and 13 deletions.
40 changes: 27 additions & 13 deletions storage/innobase/fil/fil0fil.cc
Original file line number Diff line number Diff line change
Expand Up @@ -566,6 +566,10 @@ bool fil_node_t::read_page0(bool first)

this->size = ulint(size_bytes / psize);
space->committed_size = space->size += this->size;

if (block_size == 0) {
block_size = os_file_get_block_size(handle, name);
}
} else if (space->id != TRX_SYS_SPACE || space->size_in_header) {
/* If this is not the first-time open, do nothing.
For the system tablespace, we always get invoked as
Expand Down Expand Up @@ -605,12 +609,15 @@ static bool fil_node_open_file(fil_node_t* node)

const bool first_time_open = node->size == 0;

bool o_direct_possible = !FSP_FLAGS_HAS_PAGE_COMPRESSION(space->flags);
if (const ulint ssize = FSP_FLAGS_GET_ZIP_SSIZE(space->flags)) {
compile_time_assert(((UNIV_ZIP_SIZE_MIN >> 1) << 3) == 4096);
if (ssize < 3) {
o_direct_possible = false;
}
ulint type;
compile_time_assert(((UNIV_ZIP_SIZE_MIN >> 1) << 3) == 4096);
switch (FSP_FLAGS_GET_ZIP_SSIZE(space->flags)) {
case 1:
case 2:
type = OS_DATA_FILE_NO_O_DIRECT;
break;
default:
type = OS_DATA_FILE;
}

if (first_time_open
Expand All @@ -632,9 +639,7 @@ static bool fil_node_open_file(fil_node_t* node)
? OS_FILE_OPEN_RAW | OS_FILE_ON_ERROR_NO_EXIT
: OS_FILE_OPEN | OS_FILE_ON_ERROR_NO_EXIT,
OS_FILE_AIO,
o_direct_possible
? OS_DATA_FILE
: OS_DATA_FILE_NO_O_DIRECT,
type,
read_only_mode,
&success);

Expand Down Expand Up @@ -668,9 +673,7 @@ static bool fil_node_open_file(fil_node_t* node)
? OS_FILE_OPEN_RAW | OS_FILE_ON_ERROR_NO_EXIT
: OS_FILE_OPEN | OS_FILE_ON_ERROR_NO_EXIT,
OS_FILE_AIO,
o_direct_possible
? OS_DATA_FILE
: OS_DATA_FILE_NO_O_DIRECT,
type,
read_only_mode,
&success);
}
Expand Down Expand Up @@ -3601,11 +3604,22 @@ fil_ibd_create(
return(err);
}

ulint type;
compile_time_assert(((UNIV_ZIP_SIZE_MIN >> 1) << 3) == 4096);
switch (FSP_FLAGS_GET_ZIP_SSIZE(flags)) {
case 1:
case 2:
type = OS_DATA_FILE_NO_O_DIRECT;
break;
default:
type = OS_DATA_FILE;
}

file = os_file_create(
innodb_data_file_key, path,
OS_FILE_CREATE | OS_FILE_ON_ERROR_NO_EXIT,
OS_FILE_NORMAL,
OS_DATA_FILE,
type,
srv_read_only_mode,
&success);

Expand Down

0 comments on commit 6505662

Please sign in to comment.