Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: Prune leveldb headers #28186

Merged
merged 16 commits into from
Aug 7, 2023

Conversation

TheCharlatan
Copy link
Contributor

Leveldb headers are currently included in the dbwrapper.h file and thus available to many of Bitcoin Core's source files. However, leveldb-specific functionality should be abstracted by the dbwrapper and does not need to be available to the rest of the code. Having leveldb included in a widely-used header such as dbwrapper.h bloats the entire project's header tree.

The dbwrapper is a key component of the libbitcoinkernel library. Future users of this library would not want to contend with having the leveldb headers exposed and potentially polluting their project's namespace.

For these reasons, the leveldb headers are removed from the dbwrapper by moving leveldb-specific code to the implementation file and creating a pimpl where leveldb member variables are indispensable. As a final step, the leveldb include flags are removed from the BITCOIN_INCLUDES and moved to places where the dbwrapper is compiled.


This pull request is part of the libbitcoinkernel project, and more specifically its stage 1 step 3 "Decouple most non-consensus headers from libbitcoinkernel".

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 30, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK stickies-v, MarcoFalke
Concept ACK hebasto, dergoegge
Stale ACK Sjors

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@hebasto
Copy link
Member

hebasto commented Jul 30, 2023

Concept ACK.

@dergoegge
Copy link
Member

Concept ACK

@maflcko
Copy link
Member

maflcko commented Jul 31, 2023

The dbwrapper is a key component of the libbitcoinkernel library. Future users of this library would not want to contend with having the leveldb headers exposed and potentially polluting their project's namespace.

Not sure about the motivation. I don't think any user should include the dbwrapper either? No user should be directly writing or reading from the db. The only reason why they'd have to include the header is to get access to DBOptions, which would alternatively and trivially be achieved by moving DBOptions into a separate header, no?

And if you are worried about the txdb.h include, maybe do a pimpl there? IIRC #22242 did that, or something like that.

@TheCharlatan
Copy link
Contributor Author

Re #28186 (comment)

I don't think any user should include the dbwrapper either? No user should be directly writing or reading from the db.

It's hard to speculate on how this will be used in the future. Something calling the db directly for e.g. data science use cases would not be unheard of.

And if you are worried about the txdb.h include, maybe do a pimpl there? IIRC #22242 did that, or something like that.

I am also thinking about the base.h include. Though not in the kernel, it pollutes a very significant portion of source files. Implementing a pimpl in base.h could work, but I don't think that would really be preferable to this PR here. Getting rid of a big, unused besides a single source file include with a bunch of mostly move-only changes seems like a win to me. If this is not sufficient motivation for you though, I'd be happy to work with a reopened #22242, and separate PR for splitting the DBOptions and DBParams.

@maflcko
Copy link
Member

maflcko commented Aug 1, 2023

Thanks, makes sense. I initially wondered about the 15 commits and +100 LOC, but having more commits for this template-heavy code makes it easier to review, and I found a way to reduce the number of LOC. Just a style nit though, feel free to ignore.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review ACK 0280dc4 🔨

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: review ACK 0280dc44d227ae34e8fcbb708833bfe9d66c7b5f 🔨
EwQOJEQzugff502JSeduHe1I8VbH7Du0bAvETIvbAxyKfRuGdPQJ9HgpXRv4wHXJz97qTnvIhTIdbkKbnucEAg==

src/dbwrapper.h Outdated Show resolved Hide resolved
src/dbwrapper.h Show resolved Hide resolved
src/dbwrapper.h Outdated Show resolved Hide resolved
src/dbwrapper.cpp Outdated Show resolved Hide resolved
src/dbwrapper.h Outdated Show resolved Hide resolved
src/dbwrapper.h Show resolved Hide resolved
Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 0280dc4 modulo 6005165.

I also checked that all the intermediate commits compile and pass the tests (on Ubuntu 23.04).

I'm not familiar with the LevelBD code, but the refactor makes sense to me.

I'll look at 6005165 later since it might get simplified based on @MarcoFalke's suggestion.

src/dbwrapper.h Outdated Show resolved Hide resolved
src/dbwrapper.h Show resolved Hide resolved
@maflcko
Copy link
Member

maflcko commented Aug 1, 2023

I'd be happy to work with a reopened #22242

I guess that pull is mostly orthogonal (shouldn't (silently) conflict with this one). Rebased and reopened in #28195

@TheCharlatan
Copy link
Contributor Author

Thank you for the reviews @Sjors and @MarcoFalke,

Updated 0280dc4 -> e8a3a0a (cleaveLeveldbHeaders_0 -> cleaveLeveldbHeaders_1, compare)

@maflcko
Copy link
Member

maflcko commented Aug 1, 2023

re-ACK e8a3a0a 🗞

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK e8a3a0a0d761c8544e20ffc3ee3d4c1029c44518  🗞
H6bBOIgZy20U7oO/DsSwT8YNb6GfT92ONblBYOrZTkfGEfWVP/VaQ7KBD5xXg9+vmO9WNcCQBEkDecHYcykwDg==

@DrahtBot DrahtBot requested a review from Sjors August 1, 2023 15:52
Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK e8a3a0a

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, did a first run-through of the code but will dig into it deeper.

src/dbwrapper.cpp Outdated Show resolved Hide resolved
src/dbwrapper.h Outdated Show resolved Hide resolved
src/dbwrapper.h Outdated Show resolved Hide resolved
src/dbwrapper.h Outdated
@@ -237,6 +236,7 @@ class CDBWrapper
bool m_is_memory;

bool ReadImpl(std::function<void(DataStream&)> write_key_to_stream, std::function<void(CDataStream&)> read_value_from_stream) const;
bool ExistsImpl(DataStream& ssKey) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const DataStream& ssKey?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: const DataStream& seems fine here, but if you want to pass an immutable view of raw bytes, Span<const std::byte> may be better. (Span{ssKey} should do the conversion, but it will likely happen implicitly by the compiler already).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, CharCast can be moved to the cpp file in the last commit, if you want

Wrap leveldb::DestroyDB in a helper function without exposing
leveldb-specifics.

Also, add missing optional include.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
@TheCharlatan
Copy link
Contributor Author

Thank you for the review @stickies-v,

Updated e8a3a0a -> 8c4481e (cleaveLeveldbHeaders_1 -> cleaveLeveldbHeaders_2, compare)

@TheCharlatan
Copy link
Contributor Author

TheCharlatan commented Aug 3, 2023

Thank you for the re-review @MarcoFalke,

Updated 8c4481e -> f263967 (cleaveLeveldbHeaders_2 -> cleaveLeveldbHeaders_3, compare)

  • Addressed @MarcoFalke's comment, moved CharCast to implementation file
  • Addressed @MarcoFalke's comment, using Span now instead of DataStream types for passing immutable raw bytes.

size_estimate = 0;
}

void CDBBatch::WriteImpl(const Span<const std::byte>& key, CDataStream& ssValue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void CDBBatch::WriteImpl(const Span<const std::byte>& key, CDataStream& ssValue)
void CDBBatch::WriteImpl(Span<const std::byte> ssKey, CDataStream& ssValue)

Span is cheap to copy, so there is no need to add const&. Also, it would be better to do the renames in a separate commit. (Maybe at the end for all touched lines in this pull, or not at all). Otherwise, this breaks review options such as --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space

@TheCharlatan
Copy link
Contributor Author

Updated f263967 -> 3fb2dac (cleaveLeveldbHeaders_3 -> cleaveLeveldbHeaders_4, compare)

  • Addressed @MarcoFalke's comment, changed const Span<const std::byte>& to Span<const std::byte>.
  • Split out renaming of some of the key variables into a separate commit at the end.

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

light crACK 3fb2dac

src/dbwrapper.cpp Outdated Show resolved Hide resolved
src/dbwrapper.cpp Show resolved Hide resolved
src/dbwrapper.cpp Show resolved Hide resolved
src/dbwrapper.cpp Outdated Show resolved Hide resolved
src/dbwrapper.h Outdated Show resolved Hide resolved
options = GetOptions(params.cache_bytes);
options.create_if_missing = true;
m_db_context = std::make_unique<LevelDBContext>();
DBContext().penv = nullptr;
Copy link
Contributor

@stickies-v stickies-v Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would this be an improvement, to highlight the reference nature of db_context, as well as I think generally it's slightly cleaner to not call DBContext() every time (I know it's a very cheap call)? Not sure though, would be a lot of change, happy to keep as is.

Suggested change
DBContext().penv = nullptr;
auto& db_context{DBContext()};
db_context.penv = nullptr;

Comment on lines +279 to +289
delete DBContext().pdb;
DBContext().pdb = nullptr;
delete DBContext().options.filter_policy;
DBContext().options.filter_policy = nullptr;
delete DBContext().options.info_log;
DBContext().options.info_log = nullptr;
delete DBContext().options.block_cache;
DBContext().options.block_cache = nullptr;
delete DBContext().penv;
DBContext().options.env = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it make sense to move this to the LevelDBContext destructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, but then we'd also have to forward-declare a destructor class for it and I don't think it's really worth doing that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Can easily improve later if/when necessary.

@DrahtBot DrahtBot requested review from maflcko and Sjors August 3, 2023 14:48
@DrahtBot DrahtBot removed the CI failed label Aug 3, 2023
Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3fb2dac

I would however strongly prefer to simplify the lambda logic, and probably this duplication should also be removed.

src/dbwrapper.h Show resolved Hide resolved
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Hide the leveldb::WriteBatch member variable with a pimpl in order not
to expose it directly in the header.

Also move CDBBatch::Clear to the dbwrapper implementation to use the new
impl_batch.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Hide the leveldb::Iterator member variable with a pimpl in order not to
expose it directly in the header.

Also, move CDBWrapper::NewIterator to the dbwrapper implementation to
use the pimpl for CDBIterator initialziation.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
These were uncovered as missing by the next commit.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Make it a static function in dbwrapper.cpp, since it is not used
elsewhere and when left in the header, would expose a leveldb type.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Keep the generic serialization in the header, while moving
leveldb-specifics to the implementation file.

Since CharCast is no longer needed in the header, move it to the
implementation file.

The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
The context of this commit is an effort to decouple the dbwrapper header
file from leveldb includes. To this end, the includes are moved to the
dbwrapper implementation file. This is done as part of the kernel
project to reduce the number of required includes for users of the
kernel.
Since leveldb is no longer in our header tree, move its include flags to
whereever dbwrapper.cpp is built.
The ss- prefix should connotate a DataStream variable. Now that these
variables are byte spans, drop the prefix.
@TheCharlatan
Copy link
Contributor Author

Thank you for the review @stickies-v,

Updated 3fb2dac -> 1be0472 (cleaveLeveldbHeaders_4 -> cleaveLeveldbHeaders_5, compare)

  • Addressed @stickies-v's comment_1 and comment_2, added some line breaks in constructor lists.
  • Addressed @stickies-v's comment, further split up the CDBWrapper::Read implementation. The ReadImpl now returns an option of the read database value string.
  • Addressed @stickies-v's comment, removed left over m_db_context initializer.
  • Addressed @stickies-v's comment, added LIFETIMEBOUND to DBContext reference getter.
  • Addressed @stickies-v's comment, added docstring to m_db_contex CDBWrapper member variable.

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 1be0472

nit: these don't seem to actually have been addressed

Addressed @stickies-v's #28186 (comment) and #28186 (comment), added some line breaks in constructor lists.

src/dbwrapper.h Outdated Show resolved Hide resolved
@TheCharlatan
Copy link
Contributor Author

TheCharlatan commented Aug 7, 2023

Updated 1be0472 -> d8f1222 (cleaveLeveldbHeaders_5 -> cleaveLeveldbHeaders_6, compare)

Sorry for the re-push @stickies-v, did not include two changes in my previous push by mistake that should have addressed your latest comments. Pushed now.

@stickies-v
Copy link
Contributor

re-ACK d8f1222

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d8f1222 🔠

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: ACK d8f1222ac50f089a0af29eaf8ce0555bad8366ef  🔠
9kmrmmglu0pLTAnlkFVLGRaV07BiskXJWhuBAXoJ8FtylkMHfRojj+S77g6nhGaMuZlTOzX7YETCJ4ZtX4yEAw==

Comment on lines +145 to +147
CDBBatch::CDBBatch(const CDBWrapper& _parent) : parent(_parent),
m_impl_batch{std::make_unique<CDBBatch::WriteBatchImpl>()},
ssValue(SER_DISK, CLIENT_VERSION){};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit in ea8135d: May be better to add a newline before : to reduce the leading whitespace bloating block.

@@ -365,11 +388,12 @@ struct CDBIterator::IteratorImpl {
explicit IteratorImpl(leveldb::Iterator* _iter) : iter{_iter} {}
};

CDBIterator::CDBIterator(const CDBWrapper& _parent, std::unique_ptr<IteratorImpl> _piter) : parent(_parent), m_impl_iter(std::move(_piter)) {}
CDBIterator::CDBIterator(const CDBWrapper& _parent, std::unique_ptr<IteratorImpl> _piter) : parent(_parent),
m_impl_iter(std::move(_piter)) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c95b37d: same

@fanquake fanquake merged commit b565485 into bitcoin:master Aug 7, 2023
15 checks passed
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done or Closed or Rethinking
Development

Successfully merging this pull request may close these issues.

None yet

8 participants