New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WT-8497 Initial riscv64 support #7269
Conversation
A few notes on this:
I'm also going to leave a few more detailed notes on the changes themselves, where interesting. |
@@ -64,6 +64,10 @@ AS_CASE([$host_cpu], | |||
[s390x*], [wt_cv_zseries="yes"], | |||
[wt_cv_zseries="no"]) | |||
AM_CONDITIONAL([ZSERIES_HOST], [test "$wt_cv_zseries" = "yes"]) | |||
AS_CASE([$host_cpu], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pure copy paste from the nearby arm stanza, but I did an autoreconf and configure / make and it all seemed to work OK.
@@ -18,6 +18,7 @@ config_choice( | |||
"aarch64;WT_AARCH64;" | |||
"ppc64le;WT_PPC64;" | |||
"s390x;WT_S390X;" | |||
"riscv64;WT_RISCV64;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By analogy with WT_AARCH64 case.
@@ -0,0 +1,23 @@ | |||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By analogy with the aarch64 one.
set(WT_BUFFER_ALIGNMENT_DEFAULT "4096" CACHE STRING "") | ||
|
||
# ARMv8-A is the 64-bit ARM architecture, turn on the optional CRC instructions. | ||
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=rv64imafdc -mabi=lp64d" CACHE STRING "" FORCE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://www.sifive.com/blog/all-aboard-part-1-compiler-args for background on the mafdc
and lp64d
arguments here.
@@ -60,6 +60,8 @@ def prototypes_extern(): | |||
continue | |||
if fnmatch.fnmatch(name, '*/checksum/power8/*'): | |||
continue | |||
if fnmatch.fnmatch(name, '*/checksum/riscv64/*'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By analogy, untested.
One real concern with this of course is testing. I don't have a great answer there. I did the development work for this in the mongodb tree where I used SCons to cross compile to riscv64 on Ubuntu, and then I did runtime testing with a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Andrew!
@alisonfel, I ignored all of the cmake changes.
If you want something to stress your barriers a bit, you can run test/format
on a standalone build, also there are some heavily threaded programs in the standalone tests that will fail if there are barrier problems.
@acmorrow, if you run locally The only error I can see so far is:
You will need to update the |
@keithbostic - Since my testing platform is |
@keithbostic, @etienneptl - Updated, PTAL. @alisonfel - Would love your thoughts on build system stuff. |
@acmorrow, the PR system runs basic scripting over the source tree, and some of the checks are failing. If you run |
@keithbostic - Yeah, I just did something silly updating the spelling file. I've pushed an update that I think will fix those items. |
@keithbostic - I don't know how to get |
Sure, will do. |
# Linux requires buffers aligned to 4KB boundaries for O_DIRECT to work. | ||
set(WT_BUFFER_ALIGNMENT_DEFAULT "4096" CACHE STRING "") | ||
|
||
# ARMv8-A is the 64-bit ARM architecture, turn on the optional CRC instructions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to update this comment, which seems to have been pasted from the arm file :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
All seems ok to me; don't see any reason the cmake stuff shouldn't work, but I'm not a cmake expert. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @acmorrow, thanks for these changes.
I had a look at the CMake changes and they all make sense. Using the other architecture configs as an analogy is more or less the correct approach here :)
This LGTM. I have a comment below, but I don't want it to be blocking (as is it could be an issue on my end)
I wanted to ask if cross-compiling with the gcc toolchain has worked on your end? I gave it a go and wound up with linker errors, having undefined symbols for the atomic functions:
libwiredtiger.so.10.0.1: undefined reference to `__atomic_compare_exchange_2'
libwiredtiger.so.10.0.1: undefined reference to `__atomic_compare_exchange_1'
This does seem a little odd, I would've thought these were builtins. Manually passing through a linker flag to the atomics library fixes the issue -latomic
.
The initial cmake command I ran: cmake -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/gcc.cmake -DCMAKE_CROSSCOMPILING=TRUE -DWT_ARCH=riscv64 -DWT_OS=linux -DCMAKE_PREFIX_PATH=/usr/riscv64-linux-gnu/lib -G Ninja ../.
(riscv64-linux-gnu-gcc -v
-> gcc version 7.5.0
)
I don't want this to block the PR on this as cross-compiling is not a major use case (its sometimes non-perfect for PPC and ZSeries also). This could easily be an issue with the toolchain setup on my end (as cross-compiling setups can be a bit finicky).
More so curious if you encountered this? I'm guessing this is a non-issue when compiling on an actual (or emulated) RISCV host?
# Linux requires buffers aligned to 4KB boundaries for O_DIRECT to work. | ||
set(WT_BUFFER_ALIGNMENT_DEFAULT "4096" CACHE STRING "") | ||
|
||
# ARMv8-A is the 64-bit ARM architecture, turn on the optional CRC instructions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems to come from the aarch64 config. Might be worth rewording to quickly state what the rv64imafdc
and lp64d
flags are for.
(Thanks for the background link!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@alisonfel - I did cross compile, but I did so via the server's SCons-driven build of WT (the fourth build system!). The fact that you needed to manually add |
Thanks all for the review, I think this is ready to merge. What is the process? |
I'll merge it as soon as evergreen confirms. |
Initial riscv64 support
Initial riscv64 support
… correctness (#7662) * Add a callback to the Btree search routine so that format can dump pages before they're evicted. * Move the timestamp code into its own file. * Change the next/prev code to quit doing any key order comparisons at all. The problem was that nextprev() changed TINFO.keyno, which borked mirror operations, but that was fixable -- on the other hand, key order checking slows things down, and the WiredTiger library (in HAVE_DIAGNOSTIC mode), has better checking. * Remove g.rts_no_check; because format no longer does multiple runs, it's never going to fire. * If transaction.timestamps are explicitly configured off, turn off ops.prepare. * Fix wrong printf(3) types. * Don't trace repeated reads unless configured for all trace records. * Update the usage message. * FLCS fix: silently allow modifications in FLCS objects by converting them to updates. FLCS fix: fix paths where no bit value was generated for FLCS objects. Promote global values to per-table values before selecting a run source, otherwise we can pick one that doesn't match when re-opening a database. Rework trace support using configuration options instead of the -T command-line option, it gives us more flexibility. Expand trace configuration, separating bulk, reads, cursor and TS operations out, add a new option to configure the number of log files that are retained. * Add a README. * If we can't insert a new column-store record, any enclosing transaction must be rolled back. * In configurations with no inserts, we can end a mirror check before the end of the original rows (when the original rows have been removed). Handle early not-found from the cursors. * Don't configure more than 10 tables for in-memory runs, it can blow out the cache. * Don't attempt a checkpoint on an in-memory run, it's not allowed. * Add a comment, the cache is configured last. * Fix a comment. * Alphabetize fields. * Clarify. * More VLCS/FLCS caution around the end of the mirrored rows. It's possible for FLCS/VLCS to move from within the original rows to within the inserted rows based on delete and insert patterns. * Remove a debugging printf. * Use a non-regex character as the modify pad character so it's easier to search the output. * Log library verbose messages along with format trace messages. Use the library verbose functions to create the message prefix for format's tracing. Allow reconfiguration of verbose options, the bug is that any reconfiguration of the connection clears the existing verbose configurations, regardless of whether or not the configuration includes verbose settings. Rework the general error functions to handle big error messages, necessary to avoid truncating long verbose messages. Remove the option to trace format operations in the primary database, it's no longer needed now that primary database verbose messages are logged in the secondary. Revert the change to move trace options into the CONFIG file, it's more trouble than it's worth. * Use the ap_copy on error if it's been set. * update * It's OK to mirror in-memory, let that configuration proceed. * Initialize the locks earlier, bulk-load calls functions that use them. * Add -T option to turn on tracing in format. * Fix a bug introduced in the develop merge. * Add trace option to format.sh recovery test. * Evict pages when resetting operation cursors. Minor clarification to truncate trace message. * typo * Replace testutil_checksys() calls with testutil_assert_errno(). * Don't output MongoDB verbose messages in standalone builds. * Overwrite the '=' sign on retain=xx * Turn off the buffer overflow error message, format easily exceeds the buffer in trace mode. * txn_timestamp.c:760:33: error: variable 'use_pinned_ts' set but not used [-Werror=unused-but-set-variable] * Another try and making both standalone & not-standalone builds compile cleanly. * Remove the variable entirely. * WT-8487 FLCS reconciliation zeroes the entire disk buffer in advance (#7264) Don't zero the whole FLCS disk buffer up front. * WT-8502 Remove duplicate key in Evergreen file (#7274) * When there's a checkpoint thread running in format, check for mirror object consistency after each checkpoint. Add trace messages for checkpoints. * Add tracing in backup and expand mirror verify output. * WT-8466 Move btree I/O functionality into the block-cache implementation (#7248) Combine the btree I/O functions and the block-manager block-cache functionality into a new layer between the two. * WT-8518 Tidy the FLCS verify code to iterate the page only once. (#7284) Tidy the FLCS verify code to iterate the page only once. * WT-8514 Apply an explicit page size limit to FLCS pages (#7281) Apply an explicit page size limit to FLCS pages. * WT-8497 Initial riscv64 support (#7269) Initial riscv64 support * lint * Put the additional transaction-oriented verbose messages on a separate trace variable. Format the format.i file. * Add "multi_fail" trace option to keep going after the first failure (up to 20 total). Simplify verify trace output, add base/table URIs to failure output. * Fix up configurations with timestamps and salvage or prepare, flag why we can't mix salvage and timestamps. * Fix a comment. * Dump the cursor pages on mirror failures. Don't include the mirror-fail trace option in the "all" option. * Missing newline. * Fix transactional configuration in case nothing at all was explicitly specified. * If transaction.timestamps explicitly specified, still have to deal with prepare/logging choice. * Don't verify the metadata when opening the backup as discussed in WT-8561. When opening the backup/recovery directory with verify_metadata=true, the verify code checkpoints the metadata after running the verify. That checkpoint, resets the LSN for the metadata to WT_MAX_LSN and causes recovery of the backup database to skip applying the log records. * Protect cursor-dump-page functions with HAVE_DIAGNOSTIC. * snap.c:284:18: error: format specifies type 'unsigned char' but the argument has type 'unsigned int' [-Werror,-Wformat] * Another attempt to fix the compiler complaint. * iutil.c:300:29: error: unused parameter 'cursor' [-Werror,-Wunused-parameter] * Configure file_close_sync=0 if mirrors are enabled. * Don't try to display the keys unless object is type ROW. * Make __wt_debug_cursor_tree_hs() match the commen description, it takes a cursor handle not a session handle (choosing cursor for no particular reason). * Make __wt_debug_cursor_tree_hs() match the commen description, it takes a cursor handle not a session handle (choosing cursor for no particular reason). * Remove the code to dump the HS file -- it just hangs all the time because we can't get a lock to open the HS handle. We need to find a caching solution (which will in turn fail once we have a HS file per object). My preference is to leave this problem up to the btree diagnostic routines, they should know how to get a HS handle and dump only the parts of the HS file that are relevant. * Clear the return value, it might have been set and we'll quit doing any work in that case. * typo * Add the format operations ID to the trace log, update the README file to explain it. Don't trace the stable TS at RTS, it's not useful. Rework FLCS output on failure so we get specific information on remove operations and not-found returns, rather than just a value of 0x0. * Flag SIGKILL as the Linux OOM signal. * Use a barrier instead of an atomic instruction, it's sufficent and more efficient. * Get the stable timestamp after calling rollback-to-stable and update the local copy. We no longer call RTS with active worker threads, we can use a single WT_SESSION when replaying RTS operations. * Add trace messages around alter call. * Add RTS key debugging. * Add home_backup to know if we're reopening from a backup. * Hack the format-mirror debugging code in RTS. * Finish the merge. * Remove Sue's original RTS printf's, Hari has replaced them in the develop branch. * FLCS fixes * Print the checkpoint name (if any) on a mirroring failure. * unused variable. * snap.c:183:45: error: implicit conversion loses integer precision: 'unsigned int' to 'uint8_t' (aka 'unsigned char') [-Werror,-Wconversion] * file_close_sync is gone. * Display failure information for FLCS. * Pick the access method before promoting any "already set" values from the base configuration, that allows type configurations like "row,var" to work for all tables, rather than using the base value chosen. * When doing an update for an FLCS modify operation, the FLCS value has to come from the tinfo.value buffer, not the tinfo.new_value buffer. * Print out FLCS mismatch in hex. * Get rid of tinfo.new_value, we only need a single value buffer for each thread. * Revert "Get rid of tinfo.new_value, we only need a single value buffer for each thread." This reverts commit 39c4013. * Make FLCS work: put the "new" value for modify, update and insert operations into the new_value buffer, otherwise our cursor next/prev ops can overwrite the value we need for upcoming FLCS operations. We do blind modifies, but FLCS does updates instead of modifies. If the first blind modify doesn't find a row, skip the rest of the operation, else the FLCS operation won't match a deleted row. * Update format to use the faster set-timestamp functions. * Minor shuffling, more TS stuff in ts.c * Configure stress testing of FLCS. * Fix a problem with FLCS non-overwrite insert. The value returned for tombstones and implicit rows was uninitialized rubbish rather than 0. * Turn on checkpoint cursors, they should work now. * The fast-truncate code no longer pins pages into memory, and we've significantly increased the number of files and rows in a stress run over time. Increase the number of concurrent truncates to 4, but decrease the number of rows truncated to a max of 2% of the object. * Dump the last successful match for RS/VLCS so it's easier to spot truncate runs. * Fix the drop checkpoint loop, just use the testutil one. * Add a space between the message and the checkpoint name. * format * Loop until we get a matching set of checkpoint cursors. * Verify named checkpoints just like default checkpoints. Fail if we see EBUSY returned from a checkpoint. Minor trace message spacing. * message format * Note how to sort the LOG to get the ops you want. * The page dump chunks weren't quite identical, merge them into one. * WT-9091 Allow customisation of the cppsuite tracking table (#7872) Co-authored-by: Etienne Petrel <etienne.petrel@mongodb.com> * WT-9271 Clean up use of hard-coded numbers for object ids. (#7878) * WT-9059 Improve error handling for S3 extension classes (#7871) * Implemented error code mapping between HTTP response code expected by calls to aws and errno codes in S3Connection. * WT-9164 Add a comment explaning why the exact value could be non-zero in __curhs_insert. (#7890) * WT-9252 instantiated fast-truncate pages in read-only trees fail eviction tests (#7865) Instantiated fast-truncate pages in read-only trees fail eviction tests. * WT-9275 Update s_style to avoid matching on words with punctuation (#7895) * Updating the regex to avoid matching on punctuation * Update regex to match spaces and fullstops at the end of a word * Use \s to denote spaces in the regex * Put the checkpoint name on a line by itself. * Remove "assert.write_timestamp" configuration, it's no longer meaningful in WT. Remove "write_timestamp_usage" configuration, it's no longer meaningful in WT. Run format. * If mirroring is already configured for the tables, don't redo it. * Rename maximum_read_ts() to maximum_committed_ts(), we're going to need it for more general purposes. * Quit using all_durable as the basis for oldest/stable timestamp calculations (it's generally more expensive/slow than the alternative, and requires locking). Instead, take all read, oldest and stable timestamps from outside the namespace of the timestamps used for uncommitted transactions. * Don't increment the global TS, it can cause commits to get behind active readers. * Clean up global timestamp processing -- use fetch_add where we intend to use the value as a component of the future commit/prepare (and, therefore, a blocker for oldest/stable). * We can't calculate a useful minimum in-use timestamp if a thread hasn't yet set its last-used commit timestamp, that thread might be about to use a commit timestamp in the range we'd return. * Don't attempt to set the oldest/stable timestamps if we don't have a valid maximum in-use timestamp. * When skipping over deleted FLCS entries, don't move out of the initial range and into appended items. Fix a typo. * In-memory runs don't support compaction, check for that configuration and optionally turn it off. * Don't copy the base mirror setting into the individual table settings, it's more complicated to configure mirrors than that. * WT-9371 Add tiered support to random_directio (#8003) * WT-9371 Add tiered support to random_directio * Add -B option to testutils general args. * Only configure tiered when set for the program. * Add tiered to main wiredtiger_open * Add more locks. Remove ifdef'd code. * One more level of indirection for the extension. * Make copy_directory recursive for bucket subdirectory usage. * Add bucket extension to recovery wiredtiger_open * Fixes for using extension on recovery. * Use system to clean out prior directories to catch subdirs. * Remove debugging line. * Fix bug in getopt string. Add support for preserve. * Remove references to timestamps. That's not applicable here. * WT-9149 Missing sanitizer flags in Evergreen (#8005) Moved the following tests to use "compile wiredtiger address sanitizer" so that it compiles with ASan enabled, in this function ASan is part of the default compiler flags used. * race-condition-stress-sanitizer-test * race-condition-stress-sanitizer-test-no-barrier * Adding DHAVE_BUILTIN_EXTENSION_ZSTD=1 flag to compile wiredtiger address sanitizer function. * WT-9438 Skip cppsuite files in s_string (#8010) * WT-9439 Don't attempt to evict using a checkpoint-cursor snapshot (#8011) Don't use checkpoint-cursor transactions for eviction. * When mirror checks fail, dump all of the pages from all the objects. * Update FLCS matching. * Wrap the session argument in parenthesis. * Rename wiredtiger_XXX to wt_wrap_XXX for clarity. * WT-9561 Unify usage of -t and -T parameters in test format binary and format.sh (#8103) * WT-9563 Improve code style and readability (#8112) * WT-9584 Fixed verification logic in table_verify_mirror (#8122) * WT-9653 Fix typo in comment (#8159) * WT-9654 Use flags instead of booleans to store tracing options (#8163) * WT-9555 Add dedicated mirror config and trace arguments to test/format (#8187) * WT-9736 Change the way format.sh stores and prints tracing options (#8206) * WT-9735 Change mirroring probability to match comment + docs (#8207) Co-authored-by: Keith Bostic <keith.bostic@mongodb.com> Co-authored-by: David A. Holland <dholland+wt@sauclovia.org> Co-authored-by: Etienne Petrel <77254506+etienneptl@users.noreply.github.com> Co-authored-by: Andrew C. Morrow <acm@mongodb.com> Co-authored-by: Andrew Morton <andrew.morton@mongodb.com> Co-authored-by: Etienne Petrel <etienne.petrel@mongodb.com> Co-authored-by: Sean <41533874+Sean04@users.noreply.github.com> Co-authored-by: Siddhartha Mahajan <55784503+Siddhartha8899@users.noreply.github.com> Co-authored-by: clarissecheah <65272339+clarissecheah@users.noreply.github.com> Co-authored-by: Ruby Chen <ruby.chen@mongodb.com> Co-authored-by: Monica Ng <30430618+mm-ng@users.noreply.github.com> Co-authored-by: Alexey Anisimov <alexey.anisimov@mongodb.com> Co-authored-by: Alexey Anisimov <84690529+AlexeyAnisimovAU@users.noreply.github.com> Co-authored-by: Will Korteland <korteland@users.noreply.github.com> Co-authored-by: Will Korteland <will.korteland@mongodb.com> Co-authored-by: Monica Ng <monica.ng@mongodb.com>
No description provided.