Skip to content

Commit

Permalink
WT-8372 Enhance test/format to mirror writes across tables and verify…
Browse files Browse the repository at this point in the history
… correctness (#7662)

* Add a callback to the Btree search routine so that format can dump pages before they're evicted.
* Move the timestamp code into its own file.
* Change the next/prev code to quit doing any key order comparisons at all. The problem was that
nextprev() changed TINFO.keyno, which borked mirror operations, but that was fixable -- on the
other hand, key order checking slows things down, and the WiredTiger library (in HAVE_DIAGNOSTIC
mode), has better checking.
* Remove g.rts_no_check; because format no longer does multiple runs, it's never going to fire.
* If transaction.timestamps are explicitly configured off, turn off ops.prepare.
* Fix wrong printf(3) types.
* Don't trace repeated reads unless configured for all trace records.
* Update the usage message.
* FLCS fix: silently allow modifications in FLCS objects by converting them to updates.

FLCS fix: fix paths where no bit value was generated for FLCS objects.

Promote global values to per-table values before selecting a run source, otherwise we can pick
one that doesn't match when re-opening a database.

Rework trace support using configuration options instead of the -T command-line option, it
gives us more flexibility. Expand trace configuration, separating bulk, reads, cursor and TS
operations out, add a new option to configure the number of log files that are retained.

* Add a README.

* If we can't insert a new column-store record, any enclosing transaction must be rolled back.

* In configurations with no inserts, we can end a mirror check before the end of the original
rows (when the original rows have been removed). Handle early not-found from the cursors.

* Don't configure more than 10 tables for in-memory runs, it can blow out the cache.

* Don't attempt a checkpoint on an in-memory run, it's not allowed.

* Add a comment, the cache is configured last.

* Fix a comment.

* Alphabetize fields.

* Clarify.

* More VLCS/FLCS caution around the end of the mirrored rows. It's possible for FLCS/VLCS to
move from within the original rows to within the inserted rows based on delete and insert
patterns.

* Remove a debugging printf.

* Use a non-regex character as the modify pad character so it's easier to search the output.

* Log library verbose messages along with format trace messages.

Use the library verbose functions to create the message prefix for format's tracing.

Allow reconfiguration of verbose options, the bug is that any reconfiguration of the
connection clears the existing verbose configurations, regardless of whether or not
the configuration includes verbose settings.

Rework the general error functions to handle big error messages, necessary to avoid
truncating long verbose messages.

Remove the option to trace format operations in the primary database, it's no longer
needed now that primary database verbose messages are logged in the secondary.

Revert the change to move trace options into the CONFIG file, it's more trouble than
it's worth.

* Use the ap_copy on error if it's been set.

* update

* It's OK to mirror in-memory, let that configuration proceed.

* Initialize the locks earlier, bulk-load calls functions that use them.

* Add -T option to turn on tracing in format.

* Fix a bug introduced in the develop merge.

* Add trace option to format.sh recovery test.

* Evict pages when resetting operation cursors.
Minor clarification to truncate trace message.

* typo

* Replace testutil_checksys() calls with testutil_assert_errno().

* Don't output MongoDB verbose messages in standalone builds.

* Overwrite the '=' sign on retain=xx

* Turn off the buffer overflow error message, format easily exceeds the buffer in trace mode.

* txn_timestamp.c:760:33: error: variable 'use_pinned_ts' set but not used
[-Werror=unused-but-set-variable]

* Another try and making both standalone & not-standalone builds compile cleanly.

* Remove the variable entirely.

* WT-8487 FLCS reconciliation zeroes the entire disk buffer in advance (#7264)

Don't zero the whole FLCS disk buffer up front.

* WT-8502 Remove duplicate key in Evergreen file (#7274)

* When there's a checkpoint thread running in format, check for mirror object consistency after
each checkpoint.
Add trace messages for checkpoints.

* Add tracing in backup and expand mirror verify output.

* WT-8466 Move btree I/O functionality into the block-cache implementation (#7248)

Combine the btree I/O functions and the block-manager block-cache functionality into a new layer between the two.

* WT-8518 Tidy the FLCS verify code to iterate the page only once. (#7284)

Tidy the FLCS verify code to iterate the page only once.

* WT-8514 Apply an explicit page size limit to FLCS pages (#7281)

Apply an explicit page size limit to FLCS pages.

* WT-8497 Initial riscv64 support (#7269)

Initial riscv64 support

* lint

* Put the additional transaction-oriented verbose messages on a separate trace variable.
Format the format.i file.

* Add "multi_fail" trace option to keep going after the first failure (up to 20 total).
Simplify verify trace output, add base/table URIs to failure output.

* Fix up configurations with timestamps and salvage or prepare, flag why we can't mix salvage
and timestamps.

* Fix a comment.

* Dump the cursor pages on mirror failures.
Don't include the mirror-fail trace option in the "all" option.

* Missing newline.

* Fix transactional configuration in case nothing at all was explicitly specified.

* If transaction.timestamps explicitly specified, still have to deal with prepare/logging choice.

* Don't verify the metadata when opening the backup as discussed in WT-8561.  When opening the
backup/recovery directory with verify_metadata=true, the verify code checkpoints the metadata
after running the verify. That checkpoint, resets the LSN for the metadata to WT_MAX_LSN and
causes recovery of the backup database to skip applying the log records.

* Protect cursor-dump-page functions with HAVE_DIAGNOSTIC.

* snap.c:284:18: error: format specifies type 'unsigned char' but the argument has type 'unsigned
int' [-Werror,-Wformat]

* Another attempt to fix the compiler complaint.

* iutil.c:300:29: error: unused parameter 'cursor' [-Werror,-Wunused-parameter]

* Configure file_close_sync=0 if mirrors are enabled.

* Don't try to display the keys unless object is type ROW.

* Make __wt_debug_cursor_tree_hs() match the commen description, it takes a cursor handle
not a session handle (choosing cursor for no particular reason).

* Make __wt_debug_cursor_tree_hs() match the commen description, it takes a cursor handle
not a session handle (choosing cursor for no particular reason).

* Remove the code to dump the HS file -- it just hangs all the time because we can't get a lock
to open the HS handle. We need to find a caching solution (which will in turn fail once we
have a HS file per object). My preference is to leave this problem up to the btree diagnostic
routines, they should know how to get a HS handle and dump only the parts of the HS file that
are relevant.

* Clear the return value, it might have been set and we'll quit doing any work in that case.

* typo

* Add the format operations ID to the trace log, update the README file to explain it.
Don't trace the stable TS at RTS, it's not useful.
Rework FLCS output on failure so we get specific information on remove operations and
not-found returns, rather than just a value of 0x0.

* Flag SIGKILL as the Linux OOM signal.

* Use a barrier instead of an atomic instruction, it's sufficent and more efficient.

* Get the stable timestamp after calling rollback-to-stable and update the local copy.

We no longer call RTS with active worker threads, we can use a single WT_SESSION when
replaying RTS operations.

* Add trace messages around alter call.

* Add RTS key debugging.

* Add home_backup to know if we're reopening from a backup.

* Hack the format-mirror debugging code in RTS.

* Finish the merge.

* Remove Sue's original RTS printf's, Hari has replaced them in the develop branch.

* FLCS fixes

* Print the checkpoint name (if any) on a mirroring failure.

* unused variable.

* snap.c:183:45: error: implicit conversion loses integer precision: 'unsigned
int' to 'uint8_t' (aka 'unsigned char') [-Werror,-Wconversion]

* file_close_sync is gone.

* Display failure information for FLCS.

* Pick the access method before promoting any "already set" values from the base configuration,
that allows type configurations like "row,var" to work for all tables, rather than using the
base value chosen.

* When doing an update for an FLCS modify operation, the FLCS value has to come from the
tinfo.value buffer, not the tinfo.new_value buffer.

* Print out FLCS mismatch in hex.

* Get rid of tinfo.new_value, we only need a single value buffer for each thread.

* Revert "Get rid of tinfo.new_value, we only need a single value buffer for each thread."

This reverts commit 39c4013.

* Make FLCS work: put the "new" value for modify, update and insert operations into the new_value
buffer, otherwise our cursor next/prev ops can overwrite the value we need for upcoming FLCS
operations.

We do blind modifies, but FLCS does updates instead of modifies. If the first blind modify doesn't
find a row, skip the rest of the operation, else the FLCS operation won't match a deleted row.

* Update format to use the faster set-timestamp functions.

* Minor shuffling, more TS stuff in ts.c

* Configure stress testing of FLCS.

* Fix a problem with FLCS non-overwrite insert.

The value returned for tombstones and implicit rows was uninitialized rubbish rather than 0.

* Turn on checkpoint cursors, they should work now.

* The fast-truncate code no longer pins pages into memory, and we've significantly increased the
number of files and rows in a stress run over time. Increase the number of concurrent truncates
to 4, but decrease the number of rows truncated to a max of 2% of the object.

* Dump the last successful match for RS/VLCS so it's easier to spot truncate runs.

* Fix the drop checkpoint loop, just use the testutil one.

* Add a space between the message and the checkpoint name.

* format

* Loop until we get a matching set of checkpoint cursors.

* Verify named checkpoints just like default checkpoints.

Fail if we see EBUSY returned from a checkpoint.

Minor trace message spacing.

* message format

* Note how to sort the LOG to get the ops you want.

* The page dump chunks weren't quite identical, merge them into one.

* WT-9091 Allow customisation of the cppsuite tracking table (#7872)

Co-authored-by: Etienne Petrel <etienne.petrel@mongodb.com>

* WT-9271 Clean up use of hard-coded numbers for object ids. (#7878)

* WT-9059 Improve error handling for S3 extension classes (#7871)

* Implemented error code mapping between HTTP response code expected by calls to aws and errno codes in S3Connection.

* WT-9164 Add a comment explaning why the exact value could be non-zero in __curhs_insert. (#7890)

* WT-9252 instantiated fast-truncate pages in read-only trees fail eviction tests (#7865)

Instantiated fast-truncate pages in read-only trees fail eviction tests.

* WT-9275 Update s_style to avoid matching on words with punctuation (#7895)

* Updating the regex to avoid matching on punctuation

* Update regex to match  spaces and fullstops at the end of a word

* Use \s to denote spaces in the regex

* Put the checkpoint name on a line by itself.

* Remove "assert.write_timestamp" configuration, it's no longer meaningful in WT.
Remove "write_timestamp_usage" configuration, it's no longer meaningful in WT.
Run format.

* If mirroring is already configured for the tables, don't redo it.

* Rename maximum_read_ts() to maximum_committed_ts(), we're going to need it for more general
purposes.

* Quit using all_durable as the basis for oldest/stable timestamp calculations (it's generally more
expensive/slow than the alternative, and requires locking). Instead, take all read, oldest and
stable timestamps from outside the namespace of the timestamps used for uncommitted transactions.

* Don't increment the global TS, it can cause commits to get behind active readers.

* Clean up global timestamp processing -- use fetch_add where we intend to use the value as
a component of the future commit/prepare (and, therefore, a blocker for oldest/stable).

* We can't calculate a useful minimum in-use timestamp if a thread hasn't yet set its
last-used commit timestamp, that thread might be about to use a commit timestamp in
the range we'd return.

* Don't attempt to set the oldest/stable timestamps if we don't have a valid maximum in-use
timestamp.

* When skipping over deleted FLCS entries, don't move out of the initial range and
into appended items.

Fix a typo.

* In-memory runs don't support compaction, check for that configuration and optionally turn it off.

* Don't copy the base mirror setting into the individual table settings, it's more complicated to
configure mirrors than that.

* WT-9371 Add tiered support to random_directio (#8003)

* WT-9371 Add tiered support to random_directio

* Add -B option to testutils general args.

* Only configure tiered when set for the program.

* Add tiered to main wiredtiger_open

* Add more locks. Remove ifdef'd code.

* One more level of indirection for the extension.

* Make copy_directory recursive for bucket subdirectory usage.

* Add bucket extension to recovery wiredtiger_open

* Fixes for using extension on recovery.

* Use system to clean out prior directories to catch subdirs.

* Remove debugging line.

* Fix bug in getopt string. Add support for preserve.

* Remove references to timestamps. That's not applicable here.

* WT-9149 Missing sanitizer flags in Evergreen (#8005)

Moved the following tests to use "compile wiredtiger address sanitizer" so that it compiles with ASan enabled, in this function ASan is part of the default compiler flags used.
  * race-condition-stress-sanitizer-test
  * race-condition-stress-sanitizer-test-no-barrier

* Adding DHAVE_BUILTIN_EXTENSION_ZSTD=1 flag to compile wiredtiger address sanitizer function.

* WT-9438 Skip cppsuite files in s_string (#8010)

* WT-9439 Don't attempt to evict using a checkpoint-cursor snapshot (#8011)

Don't use checkpoint-cursor transactions for eviction.

* When mirror checks fail, dump all of the pages from all the objects.

* Update FLCS matching.

* Wrap the session argument in parenthesis.

* Rename wiredtiger_XXX to wt_wrap_XXX for clarity.

* WT-9561 Unify usage of -t and -T parameters in test format binary and format.sh (#8103)

* WT-9563 Improve code style and readability (#8112)

* WT-9584 Fixed verification logic in table_verify_mirror (#8122)

* WT-9653 Fix typo in comment (#8159)

* WT-9654 Use flags instead of booleans to store tracing options (#8163)

* WT-9555 Add dedicated mirror config and trace arguments to test/format (#8187)

* WT-9736 Change the way format.sh stores and prints tracing options (#8206)

* WT-9735 Change mirroring probability to match comment + docs (#8207)

Co-authored-by: Keith Bostic <keith.bostic@mongodb.com>
Co-authored-by: David A. Holland <dholland+wt@sauclovia.org>
Co-authored-by: Etienne Petrel <77254506+etienneptl@users.noreply.github.com>
Co-authored-by: Andrew C. Morrow <acm@mongodb.com>
Co-authored-by: Andrew Morton <andrew.morton@mongodb.com>
Co-authored-by: Etienne Petrel <etienne.petrel@mongodb.com>
Co-authored-by: Sean <41533874+Sean04@users.noreply.github.com>
Co-authored-by: Siddhartha Mahajan <55784503+Siddhartha8899@users.noreply.github.com>
Co-authored-by: clarissecheah <65272339+clarissecheah@users.noreply.github.com>
Co-authored-by: Ruby Chen <ruby.chen@mongodb.com>
Co-authored-by: Monica Ng <30430618+mm-ng@users.noreply.github.com>
Co-authored-by: Alexey Anisimov <alexey.anisimov@mongodb.com>
Co-authored-by: Alexey Anisimov <84690529+AlexeyAnisimovAU@users.noreply.github.com>
Co-authored-by: Will Korteland <korteland@users.noreply.github.com>
Co-authored-by: Will Korteland <will.korteland@mongodb.com>
Co-authored-by: Monica Ng <monica.ng@mongodb.com>
  • Loading branch information
17 people committed Aug 15, 2022
1 parent ffba1bc commit addbbad
Show file tree
Hide file tree
Showing 41 changed files with 2,594 additions and 1,534 deletions.
4 changes: 2 additions & 2 deletions dist/s_string.ok
Expand Up @@ -53,8 +53,7 @@ BerkeleyDB
Bitfield
Bitstring
Bitwise
BlqR
BqRt
BqRv
Brueckner
Bsearch
Btree
Expand Down Expand Up @@ -572,6 +571,7 @@ aspell
assertfmt
async
atomicity
atou
auth
autocommit
autoconf
Expand Down
8 changes: 8 additions & 0 deletions src/btree/bt_cursor.c
Expand Up @@ -870,6 +870,14 @@ __wt_btcur_search(WT_CURSOR_BTREE *cbt)
} else
ret = WT_NOTFOUND;

/*
* The format test program does repeatable reads testing, and wants to dump the cursor page on
* failure. It sets up a callback for that purpose, and we pay a cache miss per search to make
* that work.
*/
if (session->format_private != NULL && (ret == 0 || ret == WT_NOTFOUND))
session->format_private(ret, session->format_private_arg);

#ifdef HAVE_DIAGNOSTIC
if (ret == 0)
WT_ERR(__wt_cursor_key_order_init(cbt));
Expand Down
4 changes: 2 additions & 2 deletions src/btree/bt_debug.c
Expand Up @@ -1026,15 +1026,15 @@ __wt_debug_cursor_page(void *cursor_arg, const char *ofile)
* Dump the history store tree given a user cursor.
*/
int
__wt_debug_cursor_tree_hs(void *session_arg, const char *ofile)
__wt_debug_cursor_tree_hs(void *cursor_arg, const char *ofile)
WT_GCC_FUNC_ATTRIBUTE((visibility("default")))
{
WT_BTREE *hs_btree;
WT_CURSOR *hs_cursor;
WT_DECL_RET;
WT_SESSION_IMPL *session;

session = (WT_SESSION_IMPL *)session_arg;
session = CUR2S(cursor_arg);
WT_RET(__wt_curhs_open(session, NULL, &hs_cursor));
hs_btree = __wt_curhs_get_btree(hs_cursor);
WT_WITH_BTREE(session, hs_btree, ret = __wt_debug_tree_all(session, NULL, NULL, ofile));
Expand Down
8 changes: 8 additions & 0 deletions src/include/connection.h
Expand Up @@ -30,6 +30,14 @@ struct __wt_process {
bool fast_truncate_2022; /* fast-truncate fix run-time configuration */

WT_CACHE_POOL *cache_pool; /* shared cache information */

/*
* WT_CURSOR.modify operations set unspecified bytes to space in 'S' format and to a nul byte in
* all other formats. It makes it easier to debug format test program stress failures if strings
* are printable and don't require encoding to trace them in the log; this is a hook that allows
* format to set the modify pad byte to a printable character.
*/
uint8_t modify_pad_byte;
};
extern WT_PROCESS __wt_process;

Expand Down
2 changes: 1 addition & 1 deletion src/include/extern.h
Expand Up @@ -636,7 +636,7 @@ extern int __wt_debug_addr_print(WT_SESSION_IMPL *session, const uint8_t *addr,
WT_GCC_FUNC_DECL_ATTRIBUTE((warn_unused_result));
extern int __wt_debug_cursor_page(void *cursor_arg, const char *ofile) WT_GCC_FUNC_DECL_ATTRIBUTE(
(visibility("default"))) WT_GCC_FUNC_DECL_ATTRIBUTE((warn_unused_result));
extern int __wt_debug_cursor_tree_hs(void *session_arg, const char *ofile)
extern int __wt_debug_cursor_tree_hs(void *cursor_arg, const char *ofile)
WT_GCC_FUNC_DECL_ATTRIBUTE((visibility("default")))
WT_GCC_FUNC_DECL_ATTRIBUTE((warn_unused_result));
extern int __wt_debug_disk(WT_SESSION_IMPL *session, const WT_PAGE_HEADER *dsk, const char *ofile)
Expand Down
5 changes: 4 additions & 1 deletion src/include/session.h
Expand Up @@ -31,7 +31,7 @@ struct __wt_hazard {
};

/* Get the connection implementation for a session */
#define S2C(session) ((WT_CONNECTION_IMPL *)(session)->iface.connection)
#define S2C(session) ((WT_CONNECTION_IMPL *)((WT_SESSION_IMPL *)(session))->iface.connection)

/* Get the btree for a session */
#define S2BT(session) ((WT_BTREE *)(session)->dhandle->handle)
Expand Down Expand Up @@ -62,6 +62,9 @@ struct __wt_session_impl {

void *lang_private; /* Language specific private storage */

void (*format_private)(int, void *); /* Format test program private callback. */
void *format_private_arg;

u_int active; /* Non-zero if the session is in-use */

const char *name; /* Name */
Expand Down
3 changes: 2 additions & 1 deletion src/support/modify.c
Expand Up @@ -168,7 +168,8 @@ __modify_apply_one(WT_SESSION_IMPL *session, WT_ITEM *value, WT_MODIFY *modify,
*/
if (value->size <= offset) {
if (value->size < offset)
memset((uint8_t *)value->data + value->size, sformat ? ' ' : 0, offset - value->size);
memset((uint8_t *)value->data + value->size,
sformat ? ' ' : __wt_process.modify_pad_byte, offset - value->size);
memcpy((uint8_t *)value->data + offset, data, data_size);
value->size = offset + data_size;
return (0);
Expand Down
2 changes: 1 addition & 1 deletion test/csuite/wt3338_partial_update/main.c
Expand Up @@ -206,7 +206,7 @@ modify_run(TEST_OPTS *opts)
modify_build();
testutil_check(__wt_buf_set(session, &cursor->value, localA->data, localA->size));
testutil_check(__wt_modify_apply_api(cursor, entries, nentries));
testutil_modify_apply(localA, &modtmp, entries, nentries);
testutil_modify_apply(localA, &modtmp, entries, nentries, '\0');
compare(localB, localA, &cursor->value);

/*
Expand Down
12 changes: 11 additions & 1 deletion test/evergreen.yml
Expand Up @@ -449,7 +449,7 @@ functions:
set -o errexit
set -o verbose
for i in $(seq ${times|1}); do
./t -c ${config|../../../test/format/CONFIG.stress} ${extra_args|} || ( [ -f RUNDIR/CONFIG ] && cat RUNDIR/CONFIG ) 2>&1
./t -c ${config|../../../test/format/CONFIG.stress} ${trace_args|-T bulk,txn,retain=100} ${extra_args|} || ( [ -f RUNDIR/CONFIG ] && cat RUNDIR/CONFIG ) 2>&1
done
"format test script":
command: shell.exec
Expand Down Expand Up @@ -2610,6 +2610,15 @@ tasks:
RW_LOCK_FILE=$(pwd)/../../cmake_build/test/csuite/rwlock/test_rwlock ./time_shift_test.sh /usr/local/lib/faketime/libfaketimeMT.so.1 0-1 2>&1
- name: format-mirror-test
commands:
- func: "get project"
- func: "compile wiredtiger"
- func: "format test"
vars:
config: ../../../test/format/CONFIG.mirror
trace_args: -T all

- name: format-stress-pull-request-test
tags: ["pull_request"]
commands:
Expand Down Expand Up @@ -3981,6 +3990,7 @@ buildvariants:
- name: static-wt-build-test
- name: linux-directio
distros: ubuntu2004-build
- name: format-mirror-test
- name: format-smoke-test
- name: format-failure-configs-test
- name: data-validation-stress-test-checkpoint
Expand Down
2 changes: 2 additions & 0 deletions test/format/CMakeLists.txt
Expand Up @@ -18,7 +18,9 @@ set(format_sources
snap.c
t.c
trace.c
timestamp.c
util.c
verify.c
wts.c
)

Expand Down
8 changes: 8 additions & 0 deletions test/format/CONFIG.mirror
@@ -0,0 +1,8 @@
# Similar to the stress config, with the addition of mirroring.
btree.huffman_value=0
cache.minimum=20
runs.mirror=1
runs.rows=1000000:5000000
runs.tables=3:10
runs.threads=4:32
runs.timer=6:30
1 change: 0 additions & 1 deletion test/format/CONFIG.stress
Expand Up @@ -5,4 +5,3 @@ runs.rows=1000000:5000000
runs.tables=3:10
runs.threads=4:32
runs.timer=6:30
runs.type=row,var
84 changes: 84 additions & 0 deletions test/format/README
@@ -0,0 +1,84 @@
Tracing format operations can be useful in debugging.

Turn on tracing with the "-T" flag:

-T Trace operations into a WiredTiger log

That causes format to write trace messages into a WiredTiger log, using WT_SESSION.log_printf()
to track operations. Operations are logged into a separate database from the run, "OPS.TRACE"
in the run's home directory.

By default only modifications and transactions are traced to limit how much data is stored. The
-T option takes an argument, so to configure the default use an empty string, '-T ""'.

You can trace more operations by adding comma or white-space separated arguments to the -T option:

all trace everything
bulk trace bulk loads
cursor trace cursor operations
mirror_fail trace multiple failures in mirror checks (up to a maximum of 20).
read trace read operations
retain=N log files are retained by default, retain N instead.
timestamp trace timestamp operations (setting oldest, stable TS)
txn add transactional information to the trace log on operations

For example, to trace the usual operations plus bulk load, and retain 25 log files, use:

-T "bulk,retain=25"

Additionally if tracing is configured, verbose messages written by the database itself will be
logged as well. For example, to trace the usual operations and include verbose block manager
operations, use:

-C 'verbose=(block)' -T ""

After the run, you can dump the trace records from the WiredTiger log files with the wt utility,
for example:

wt -h RUNDIR/OPS.TRACE printlog -um > LOG

The first step is to determine which record failed and then look for transactions involving
that record. For example, imagine that record number 1063233 failed, you can grep for it in
the LOG and the output will look something like this:

% grep 1063233 LOG
[1641602344:13689][3028368:0x7fa21affd700], t: [WT_VERB_TEMPORARY][INFO]: 63842 table:wt: insert 1063233 {0x7e}
[1641602357:730642][3028368:0x7fa21a7fc700], t: [WT_VERB_TEMPORARY][INFO]: 134675 commit read-ts=1062901, commit-ts=1063233
[1641602377:413530][3028368:0x7fa2197fa700], t: [WT_VERB_TEMPORARY][INFO]: 216631 table:wt: remove 1063233
[1641602399:300000][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: reserve 1063233
[1641602399:300014][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: update 1063233 {0x5e}

The white-space separated fields are a timestamp, a thread and session ID, the program name,
the debug message classification, the format operation ID and then the operation. When the
operation is per record (like an update or remove), there's additionally the object name,
an operation and a row number, and perhaps the updated value.

The next step is to examine the complete format operation. For example, to find the transaction for
the update at timestamp [1641602399:300014], you can grep for the operation ID. Note that threads
use the same operation numbers, so you should grep for the session ID followed by the operation ID:

% egrep '0x7fa221ffb700.*: 224868' LOG
[1641602399:299982][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 begin snapshot read-ts=1797649 (not repeatable)
[1641602399:300000][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: reserve 1063233
[1641602399:300014][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: update 1063233 {0x5e}
[1641602399:300149][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: remove 189864
[1641602399:312617][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: remove 903917
[1641602399:312737][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 table:wt: update 175251 {0x7e}
[1641602399:312859][3028368:0x7fa221ffb700], t: [WT_VERB_TEMPORARY][INFO]: 224868 commit read-ts=1797649, commit-ts=1797672

And that gives you the complete transaction.

Or, something like:

% grep -w 224868 | sort -T: -k2,3 LOG

will list all of the operations ID appearances in the log, sorted by session ID.

When format detects a mismatch in the row that's about to be returned and the row the read
expected, it will dump the cursor page that was the source of the row that didn't match the
expected value, as well as the history store. The dumps are in format's run directory, in files
named "FAIL.pagedump" and "FAIL.HSdump". Alternatively, if you have a debugger attached to the
failure, you can dump the underlying Btree pages the cursor references, using either a WT_CURSOR
or WT_CURSOR_BTREE cursor:

(gdb) print __wt_debug_cursor_page(cursor, "/tmp/XX")
13 changes: 10 additions & 3 deletions test/format/alter.c
Expand Up @@ -35,15 +35,17 @@
WT_THREAD_RET
alter(void *arg)
{
SAP sap;
TABLE *table;
WT_CONNECTION *conn;
WT_DECL_RET;
WT_SESSION *session;
u_int period;
u_int counter, period;
char buf[32];
bool access_value;

(void)(arg);

conn = g.wts_conn;

/*
Expand All @@ -53,7 +55,9 @@ alter(void *arg)
access_value = false;

/* Open a session */
testutil_check(conn->open_session(conn, NULL, NULL, &session));
memset(&sap, 0, sizeof(sap));
wt_wrap_open_session(conn, &sap, NULL, &session);
counter = 0;

while (!g.workers_finished) {
period = mmrand(NULL, 1, 10);
Expand All @@ -64,14 +68,17 @@ alter(void *arg)

/* Alter can return EBUSY if concurrent with other operations. */
table = table_select(NULL);
trace_msg(session, "Alter #%u URI %s start %s", ++counter, table->uri, buf);

while ((ret = session->alter(session, table->uri, buf)) != 0 && ret != EBUSY)
testutil_die(ret, "session.alter");
trace_msg(session, "Alter #%u URI %s stop", counter, table->uri);
while (period > 0 && !g.workers_finished) {
--period;
__wt_sleep(1, 0);
}
}

testutil_check(session->close(session, NULL));
wt_wrap_close_session(session);
return (WT_THREAD_RET_VALUE);
}
25 changes: 17 additions & 8 deletions test/format/backup.c
Expand Up @@ -37,7 +37,6 @@ check_copy(void)
{
WT_CONNECTION *conn;
WT_DECL_RET;
WT_SESSION *session;
size_t len;
char *path;

Expand Down Expand Up @@ -65,12 +64,12 @@ check_copy(void)

/* Now setup and open the path for real. */
testutil_check(__wt_snprintf(path, len, "%s/BACKUP", g.home));
wts_open(path, &conn, &session, false);
wts_open(path, &conn, false);

/* Verify the objects. */
tables_apply(wts_verify, conn);
wts_verify(conn, true);

wts_close(&conn, &session);
wts_close(&conn);

free(path);
}
Expand Down Expand Up @@ -462,11 +461,12 @@ WT_THREAD_RET
backup(void *arg)
{
ACTIVE_FILES active[2], *active_now, *active_prev;
SAP sap;
WT_CONNECTION *conn;
WT_CURSOR *backup_cursor;
WT_DECL_RET;
WT_SESSION *session;
u_int incremental, period;
u_int counter, incremental, period;
uint64_t src_id, this_id;
const char *config, *key;
char cfg[512];
Expand All @@ -475,15 +475,17 @@ backup(void *arg)
(void)(arg);

conn = g.wts_conn;

/* Open a session. */
testutil_check(conn->open_session(conn, NULL, NULL, &session));
memset(&sap, 0, sizeof(sap));
wt_wrap_open_session(conn, &sap, NULL, &session);

__wt_seconds(NULL, &g.backup_id);
active_files_init(&active[0]);
active_files_init(&active[1]);
active_now = active_prev = NULL;
incr_full = true;
incremental = 0;
counter = incremental = 0;
/*
* If we're reopening an existing database and doing incremental backup we reset the initialized
* variables based on whatever they were at the end of the previous run. We want to make sure
Expand Down Expand Up @@ -579,6 +581,11 @@ backup(void *arg)
/*
* open_cursor can return EBUSY if concurrent with a metadata operation, retry in that case.
*/
if (config == NULL)
trace_msg(session, "Backup #%u start", ++counter);
else
trace_msg(session, "Backup #%u start: (%s)", ++counter, config);

while (
(ret = session->open_cursor(session, "backup:", NULL, config, &backup_cursor)) == EBUSY)
__wt_yield();
Expand All @@ -605,6 +612,8 @@ backup(void *arg)
testutil_check(session->truncate(session, "log:", backup_cursor, NULL, NULL));

testutil_check(backup_cursor->close(backup_cursor));
trace_msg(session, "Backup #%u stop%s%s%s", counter, config == NULL ? "" : ": (",
config == NULL ? "" : config, config == NULL ? "" : ")");
lock_writeunlock(session, &g.backup_lock);
active_files_sort(active_now);
active_files_remove_missing(active_prev, active_now);
Expand Down Expand Up @@ -637,7 +646,7 @@ backup(void *arg)

active_files_free(&active[0]);
active_files_free(&active[1]);
testutil_check(session->close(session, NULL));
wt_wrap_close_session(session);

return (WT_THREAD_RET_VALUE);
}

0 comments on commit addbbad

Please sign in to comment.