This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

database memory locking & hugepage support #6854

Merged 2 commits on Mar 1, 2019

@spoonincode spoonincode commented Feb 28, 2019

Change Description

nodeos has historically had only one built-in way to use its database state files: as a memory mapped file. This PR adds two new modes of operation along with hugepage support on Linux. Users can switch between the three modes on nodeos startup, and all modes are compatible with each other, so no replay is required when switching modes. The motivation for this change is that database performance increases by more than 10% when using 1GB hugepages. There are some other interesting use cases too, such as heap mode making it more practical to run nodeos on an HDD.

The modes of operation are:

  • mapped: The classic behavior and still the default.
  • heap: Preloads the entire database into swappable memory.
  • locked: Preloads the entire database into non-swappable memory and optionally allows use of hugepages on Linux.

Both heap and locked modes require nodeos to load the entire database file on startup and to write the entire database file back out on shutdown. These operations could take considerable time depending on database size and speed of storage.
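To make the startup and shutdown cost concrete, here is a rough Python sketch of the heap-style lifecycle: read the whole file into process memory, work on the in-memory copy, then write the whole thing back. This is only an illustration and not chainbase's actual code; the helper names `load_heap`, `flush_heap` and `roundtrip_demo` are made up for this example.

```python
import os
import tempfile

def load_heap(path):
    """Startup: read the entire database file into ordinary (swappable) memory."""
    with open(path, "rb") as f:
        return bytearray(f.read())

def flush_heap(path, buf):
    """Shutdown: write the entire in-memory copy back to the file."""
    with open(path, "wb") as f:
        f.write(buf)

def roundtrip_demo():
    """Create a tiny stand-in database file, modify it in memory, flush it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x00" * 16)      # hypothetical 16-byte "database"
        db = load_heap(path)           # cost proportional to file size
        db[0:4] = b"eos!"              # all access hits memory, not the disk
        flush_heap(path, db)           # cost proportional to file size again
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

Because the file on disk has the same contents before and after, a sketch like this is consistent with the modes being interchangeable without a replay. Both the load and the flush touch every byte of the file, which is where the extra startup and shutdown time comes from.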

Hugepage usage

By far the most interesting aspect of this change is hugepage support. Remember that hugepage support is only available on Linux and only in the locked mode of operation. Although hugepages on Linux are (at the moment) non-swappable anyway, nodeos still expects the mlock() call on the hugepages to succeed, so the locked-memory ulimit of the process will need to be set accordingly.
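As a quick sanity check before starting nodeos in locked mode, something like the following Python sketch can compare the process's locked-memory limit (RLIMIT_MEMLOCK, what `ulimit -l` reports) against the configured database sizes. The helper names are hypothetical, and the sketch assumes the `*-size-mb` options are in units of 1024*1024 bytes and that the lockable memory needed is roughly the sum of the two database sizes; the real requirement may be somewhat higher.

```python
import resource

MIB = 1024 * 1024

def total_locked_bytes(state_mb, reversible_mb):
    """Approximate bytes to lock: both databases (assumes MB means MiB)."""
    return (state_mb + reversible_mb) * MIB

def memlock_limit_allows(required_bytes, soft_limit):
    """True if an RLIMIT_MEMLOCK soft limit permits locking required_bytes."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return required_bytes <= soft_limit

if __name__ == "__main__":
    # Sizes taken from the example configuration in this description.
    needed = total_locked_bytes(4096, 340)
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if memlock_limit_allows(needed, soft):
        print("RLIMIT_MEMLOCK looks sufficient for", needed, "bytes")
    else:
        print("RLIMIT_MEMLOCK too low: need", needed, "bytes, soft limit is", soft)
```

The check is run from the same shell or service environment that launches nodeos, since the limit is inherited per process.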

nodeos will make use of the largest hugepage size it can without "over-committing". Some example configurations:

database-map-mode = locked 
chain-state-db-size-mb = 4096
reversible-blocks-db-size-mb = 340
database-hugepage-path = /dev/hugepages     #contains hugetlbfs with 1GB page size

On startup, nodeos will print the output below because the 340MB size of the reversible database is not a multiple of the 1GB hugepage size:

CHAINBASE: Database "state" using 1073741824 byte pages
CHAINBASE: Database "reversible" not using huge pages

If we were to change this configuration to

database-map-mode = locked 
chain-state-db-size-mb = 4096
reversible-blocks-db-size-mb = 1024
database-hugepage-path = /dev/hugepages     #contains hugetlbfs with 1GB page size

Then, since the size of the reversible database is now a multiple of 1GB, nodeos will print on startup:

CHAINBASE: Database "state" using 1073741824 byte pages
CHAINBASE: Database "reversible" using 1073741824 byte pages 

Another option would be to specify multiple hugepage sizes:

database-map-mode = locked 
chain-state-db-size-mb = 4096
reversible-blocks-db-size-mb = 340
database-hugepage-path = /dev/hugepages1G     #contains hugetlbfs with 1GB page size
database-hugepage-path = /dev/hugepages2M     #contains hugetlbfs with 2MB page size

Upon startup, nodeos will report different page sizes being used:

CHAINBASE: Database "state" using 1073741824 byte pages
CHAINBASE: Database "reversible" using 2097152 byte pages
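The selection behavior in the three examples above can be approximated with a small Python sketch. It assumes that "without over-committing" means the database size must be an exact multiple of the page size, which is what the examples suggest; the actual rule inside chainbase may differ. `pick_page_size` and `describe` are hypothetical names, and the sizes are assumed to be in units of 1024*1024 bytes.

```python
MIB = 1024 * 1024

def pick_page_size(db_size_mb, hugepage_sizes):
    """Largest available hugepage size (bytes) that evenly divides the database.

    Returns None when no configured hugepage size fits exactly.
    """
    db_bytes = db_size_mb * MIB
    fitting = [p for p in hugepage_sizes if p > 0 and db_bytes % p == 0]
    return max(fitting, default=None)

def describe(name, db_size_mb, hugepage_sizes):
    """Format a line in the style of the startup output shown above."""
    page = pick_page_size(db_size_mb, hugepage_sizes)
    if page is None:
        return f'CHAINBASE: Database "{name}" not using huge pages'
    return f'CHAINBASE: Database "{name}" using {page} byte pages'

if __name__ == "__main__":
    one_gb, two_mb = 1024 * MIB, 2 * MIB
    # Third example: both 1GB and 2MB hugetlbfs mounts are configured.
    print(describe("state", 4096, [one_gb, two_mb]))
    print(describe("reversible", 340, [one_gb, two_mb]))
```

With only a 1GB page size available, the 340MB reversible database gets no hugepages under this rule; adding a 2MB page size lets it use 2097152 byte pages, since 340MB is exactly 170 pages of 2MB.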

This PR is waiting on EOSIO/chainbase#33 first.

A note on the unit test: While I'm generally against POSIX-specific code being added to the code base, the impression I got from looking at some of the Python unit tests is that they already depend considerably on POSIX behaviors. The unit test here is a shell script, which of course won't run on Win32, but many of the Python unit tests won't either.

Closes out #6694

Consensus Changes

API Changes

Documentation Additions

  • Documentation Additions

See description above.

@spoonincode spoonincode merged commit abd7b1a into develop Mar 1, 2019
@spoonincode spoonincode deleted the use_chainbase_db_modes branch March 1, 2019 22:11