Commit 50324ce

MDEV-21351 Replace recv_sys.heap with list of buf_block_t
InnoDB crash recovery used a special type of mem_heap_t that allocates backing store from the buffer pool. That incurred a significant overhead, leading to underutilization of memory, and limiting the maximum contiguous allocated size of a log record.

recv_sys_t::blocks: A linked list of buf_block_t that are allocated by buf_block_alloc() for redo log records. Replaces recv_sys_t::heap. We repurpose buf_block_t::unzip_LRU for linking the elements.

recv_sys_t::max_log_blocks: Renamed from recv_n_pool_free_frames.

recv_sys_t::max_blocks(): Accessor for max_log_blocks.

recv_sys_t::alloc(): Allocate memory from the current recv_sys_t::blocks element, or allocate another block. In debug builds, various free() member functions must be invoked, because we repurpose buf_page_t::buf_fix_count for tracking allocations.

recv_sys_t::free_corrupted_page(): Renamed from recv_recover_corrupt_page()

recv_sys_t::is_memory_exhausted(): Renamed from recv_sys_heap_check()

recv_sys_t::pages and its elements are allocated directly by the system memory allocator.

recv_parse_log_recs(): Remove the parameter available_memory.

We rename some variables 'store_to_hash' to 'store', because recv_sys.pages is not actually a hash table.

This is joint work with Thirunarayanan Balathandayuthapani.
1 parent a983b24 commit 50324ce
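
To illustrate the allocation scheme the commit message describes (carve memory out of the current block, or take another block when it does not fit), here is a minimal standalone sketch. It is not the InnoDB code: `Block`, `BlockArena`, the 16 KiB block size, and the use of `std::list` are assumptions made for illustration, and only the names `alloc()` and `get_free_len()` are borrowed from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical stand-in for a buffer pool block; the real buf_block_t differs.
struct Block {
  static constexpr std::size_t SIZE = 16384;  // assumed block (page) size
  std::unique_ptr<unsigned char[]> frame{new unsigned char[SIZE]};
  std::size_t used = 0;  // bytes handed out from this block so far
};

// Hypothetical arena: a list of fixed-size blocks with bump allocation.
class BlockArena {
  std::list<Block> blocks_;  // the newest block is kept at the front

public:
  // Free bytes remaining in the most recently added block.
  std::size_t get_free_len() const {
    return blocks_.empty() ? 0 : Block::SIZE - blocks_.front().used;
  }

  // Return len bytes from the current block, or from a fresh block
  // when the current one cannot hold the request.
  unsigned char *alloc(std::size_t len) {
    assert(len > 0 && len <= Block::SIZE);
    if (get_free_len() < len) blocks_.emplace_front();
    Block &b = blocks_.front();
    unsigned char *p = b.frame.get() + b.used;
    b.used += len;
    return p;
  }

  std::size_t block_count() const { return blocks_.size(); }
};
```

In this sketch the tail of a block is simply left unused when a request does not fit, and a single allocation can be as large as one whole block.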

6 files changed: +218 -124 lines changed

extra/mariabackup/xtrabackup.cc

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ MariaBackup: hot backup tool for InnoDB
 Originally Created 3/3/2009 Yasufumi Kinoshita
 Written by Alexey Kopytov, Aleksandr Kuzminsky, Stewart Smith, Vadim Tkachenko,
 Yasufumi Kinoshita, Ignacio Nin and Baron Schwartz.
-(c) 2017, 2019, MariaDB Corporation.
+(c) 2017, 2020, MariaDB Corporation.
 Portions written by Marko Mäkelä.
 
 This program is free software; you can redistribute it and/or modify
@@ -2680,7 +2680,7 @@ static lsn_t xtrabackup_copy_log(lsn_t start_lsn, lsn_t end_lsn, bool last)
 
 store_t store = STORE_NO;
 
-if (more_data && recv_parse_log_recs(0, &store, 0, false)) {
+if (more_data && recv_parse_log_recs(0, &store, false)) {
 
 msg("Error: copying the log failed");
 

storage/innobase/buf/buf0buf.cc

Lines changed: 1 addition & 1 deletion
@@ -5984,7 +5984,7 @@ buf_page_io_complete(buf_page_t* bpage, bool dblwr, bool evict)
 buf_corrupt_page_release(bpage, space);
 
 if (recv_recovery_is_on()) {
-recv_recover_corrupt_page(corrupt_page_id);
+recv_sys.free_corrupted_page(corrupt_page_id);
 }
 
 space->release_for_io();

storage/innobase/buf/buf0rea.cc

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 /*****************************************************************************
 
 Copyright (c) 1995, 2017, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2015, 2019, MariaDB Corporation.
+Copyright (c) 2015, 2020, MariaDB Corporation.
 
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -766,7 +766,7 @@ buf_read_recv_pages(
 ulint count = 0;
 
 buf_pool = buf_pool_get(cur_page_id);
-while (buf_pool->n_pend_reads >= recv_n_pool_free_frames / 2) {
+while (buf_pool->n_pend_reads >= recv_sys.max_blocks() / 2) {
 
 os_thread_sleep(10000);
 

storage/innobase/include/log0recv.h

Lines changed: 51 additions & 16 deletions
@@ -1,7 +1,7 @@
 /*****************************************************************************
 
 Copyright (c) 1997, 2016, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2017, 2019, MariaDB Corporation.
+Copyright (c) 2017, 2020, MariaDB Corporation.
 
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -47,11 +47,6 @@ dberr_t
 recv_find_max_checkpoint(ulint* max_field)
 MY_ATTRIBUTE((nonnull, warn_unused_result));
 
-/** Remove records for a corrupted page.
-This function should called when srv_force_recovery > 0.
-@param[in] page_id page id of the corrupted page */
-ATTRIBUTE_COLD void recv_recover_corrupt_page(page_id_t page_id);
-
 /** Apply any buffered redo log to a page that was just read from a data file.
 @param[in,out] bpage buffer pool page */
 ATTRIBUTE_COLD void recv_recover_page(buf_page_t* bpage);
@@ -106,14 +101,12 @@ bool recv_sys_add_to_parsing_buf(const byte* log_block, lsn_t scanned_lsn);
 to wait merging to file pages.
 @param[in] checkpoint_lsn the LSN of the latest checkpoint
 @param[in] store whether to store page operations
-@param[in] available_memory memory to read the redo logs
 @param[in] apply whether to apply the records
 @return whether MLOG_CHECKPOINT record was seen the first time,
 or corruption was noticed */
 bool recv_parse_log_recs(
 lsn_t checkpoint_lsn,
 store_t* store,
-ulint available_memory,
 bool apply);
 
 /** Moves the parsing buffer data left to the buffer start */
@@ -223,6 +216,10 @@ struct page_recv_t
 iterator end() { return NULL; }
 bool empty() const { ut_ad(!head == !tail); return !head; }
 inline void clear();
+#ifdef UNIV_DEBUG
+/** Declare the records as freed; @see recv_sys_t::alloc() */
+inline void free() const;
+#endif
 } log;
 
 /** Ignore any earlier redo log records for this page. */
@@ -284,8 +281,6 @@ struct recv_sys_t{
 record, or 0 if none was parsed */
 /** the time when progress was last reported */
 time_t progress_time;
-mem_heap_t* heap; /*!< memory heap of log records and file
-addresses*/
 
 using map = std::map<const page_id_t, page_recv_t,
 std::less<const page_id_t>,
@@ -314,6 +309,26 @@ struct recv_sys_t{
 /** Last added LSN to pages. */
 lsn_t last_stored_lsn;
 
+private:
+/** Maximum number of buffer pool blocks to allocate for redo log records */
+ulint max_log_blocks;
+
+/** Base node of the redo block list (up to max_log_blocks)
+List elements are linked via buf_block_t::unzip_LRU. */
+UT_LIST_BASE_NODE_T(buf_block_t) blocks;
+public:
+/** @return the maximum number of buffer pool blocks for log records */
+ulint max_blocks() const { return max_log_blocks; }
+/** Check whether the number of read redo log blocks exceeds the maximum.
+Store last_stored_lsn if the recovery is not in the last phase.
+@param[in,out] store whether to store page operations
+@return whether the memory is exhausted */
+inline bool is_memory_exhausted(store_t *store);
+
+#ifdef UNIV_DEBUG
+/** whether all redo log in the current batch has been applied */
+bool after_apply= false;
+#endif
 /** Initialize the redo log recovery subsystem. */
 void create();
 
@@ -352,6 +367,32 @@ struct recv_sys_t{
 progress_time = time;
 return true;
 }
+
+/** Get the memory block for storing recv_t and redo log data
+@param[in] len length of the data to be stored
+@param[in] store_recv whether to store recv_t object
+@return pointer to len bytes of memory (never NULL) */
+inline byte *alloc(size_t len, bool store_recv= false);
+
+#ifdef UNIV_DEBUG
+private:
+/** Find the buffer pool block that is storing a redo log record.
+@param[in] data pointer to buffer returned by alloc()
+@return redo list element */
+inline buf_block_t *find_block(const void *data) const;
+public:
+/** Declare a redo log record freed from a buffer pool block.
+@param[in] data pointer to buffer returned by alloc() */
+inline void free(const void *data) const;
+#endif
+
+/** @return the free length of the latest alloc() block, in bytes */
+inline size_t get_free_len() const;
+
+/** Remove records for a corrupted page.
+This function should only be called when innodb_force_recovery is set.
+@param page_id corrupted page identifier */
+ATTRIBUTE_COLD void free_corrupted_page(page_id_t page_id);
 };
 
 /** The recovery system */
@@ -392,10 +433,4 @@ times! */
 roll-forward */
 #define RECV_SCAN_SIZE (4U << srv_page_size_shift)
 
-/** This many frames must be left free in the buffer pool when we scan
-the log and store the scanned log records in the buffer pool: we will
-use these free frames to read in pages when we start applying the
-log records to the database. */
-extern ulint recv_n_pool_free_frames;
-
 #endif
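
The debug-only find_block() and free() declarations above suggest per-block bookkeeping of outstanding allocations. The following standalone sketch shows one way such tracking could work: each block keeps a counter of live allocations, and a pointer is mapped back to its block by address range. This is an assumption-laden illustration, not the InnoDB implementation: `TrackedBlock`, `TrackedArena`, the `live` counter (standing in for the repurposed buf_page_t::buf_fix_count), and `all_freed()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

// Hypothetical block with a live-allocation counter.
struct TrackedBlock {
  std::vector<unsigned char> frame;
  std::size_t used = 0;  // bytes handed out so far
  unsigned live = 0;     // allocations not yet declared freed
  explicit TrackedBlock(std::size_t size) : frame(size) {}
};

class TrackedArena {
  std::list<TrackedBlock> blocks_;  // newest block at the front
  std::size_t block_size_;

public:
  explicit TrackedArena(std::size_t block_size) : block_size_(block_size) {}

  // Bump-allocate len bytes and count the allocation against its block.
  unsigned char *alloc(std::size_t len) {
    assert(len > 0 && len <= block_size_);
    if (blocks_.empty() || block_size_ - blocks_.front().used < len)
      blocks_.emplace_front(block_size_);
    TrackedBlock &b = blocks_.front();
    unsigned char *p = b.frame.data() + b.used;
    b.used += len;
    ++b.live;
    return p;
  }

  // Find the block whose frame contains data, or nullptr if none does.
  TrackedBlock *find_block(const void *data) {
    std::less<const unsigned char *> before;  // total order on pointers
    const unsigned char *p = static_cast<const unsigned char *>(data);
    for (TrackedBlock &b : blocks_) {
      const unsigned char *begin = b.frame.data();
      if (!before(p, begin) && before(p, begin + b.frame.size())) return &b;
    }
    return nullptr;
  }

  // Declare one allocation freed; false if the pointer is unknown or the
  // block has no outstanding allocations left.
  bool free(const void *data) {
    TrackedBlock *b = find_block(data);
    if (!b || b->live == 0) return false;
    --b->live;
    return true;
  }

  // True when every allocation has been declared freed.
  bool all_freed() const {
    for (const TrackedBlock &b : blocks_)
      if (b.live) return false;
    return true;
  }
};
```

A check like all_freed() is the kind of assertion a debug build could run before returning the blocks, which is why every allocation must be matched by a free() call in that configuration.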

storage/innobase/include/mem0mem.h

Lines changed: 1 addition & 2 deletions
@@ -1,7 +1,7 @@
 /*****************************************************************************
 
 Copyright (c) 1994, 2016, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2017, 2019, MariaDB Corporation.
+Copyright (c) 2017, 2020, MariaDB Corporation.
 
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -59,7 +59,6 @@ buffer pool; the latter method is used for very big heaps */
 /** Different type of heaps in terms of which datastructure is using them */
 #define MEM_HEAP_FOR_BTR_SEARCH (MEM_HEAP_BTR_SEARCH | MEM_HEAP_BUFFER)
 #define MEM_HEAP_FOR_PAGE_HASH (MEM_HEAP_DYNAMIC)
-#define MEM_HEAP_FOR_RECV_SYS (MEM_HEAP_BUFFER)
 #define MEM_HEAP_FOR_LOCK_HEAP (MEM_HEAP_BUFFER)
 
 /** The following start size is used for the first block in the memory heap if
