Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(tm2): split pkg/db into sub-packages #1602

Merged
merged 20 commits into from
Mar 1, 2024

Conversation

thehowl
Copy link
Member

@thehowl thehowl commented Jan 30, 2024

Changes mostly involve shuffling code around.

The db package now has its basic types and helper functions in pkg/db, and then sub-packages for each DB which is actually used. This allows a package using it to only import the database code it needs.

The end goal I wanted was to remove the transitive dependency of cmd/gno on goleveldb. Turns out this is more complicated than I thought. In any case, I already did it in another branch; will make a second PR shortly after this one.

Notes for reviewing:

  • pkg/db has also the helper packages _all and _tags.
    • _all imports all the databases while respecting the cgo build tag.1 (This is automatically set by the go compiler depending on whether cgo is enabled).
    • The _tags package, instead, imports the databases specified by the build tags. I meant this for end-user binaries, like gno.land eventually (currently it only supports GoLevelDB), so a user can compile with the given build tags.
  • I removed badger and gorocksdb. They were so important that they didn't compile and nobody noticed.
    • badger probably makes sense to be re-added and tried out eventually, but it's outside of the scope of this PR.
    • gorocksdb is unmaintained and will not compile with the latest version of RocksDB. In fact, we already had a DB implementation for grocksdb, the maintained fork. I fixed some code there too to actually make it compile.
  • Removing the two dependencies made our go.mods a bunch lighter. πŸŽ‰
  • Some changes also involve the CI.
    • I removed splitting the build for each database being tested, as the tests on the package run quickly anyway so there is no real need to split them.
    • I also severely downgraded grocksdb so that its version is compatible with the version of RocksDB present on ubuntu 22.04's repositories. There could be a variety of different changes to the CI to allow building on a more recent version of the database; but after having seen that the compilation of RocksDB on a beefy machine with make -j4 is already beyond 5 mintues, I decided we don't need a slow CI for a feature literally no-one uses, so here we are.
  • This PR also fixes the benchmarks in tm2/pkg/iavl/benchmarks (fixes BenchmarkMedium panicking on memdbΒ #908). The fix itself is stupid.

Footnotes

  1. A while ago I was upset that my go compilation was taking long and then found out that it was using cgo for no apparent reason -- it's actually because a few places in the standard library will by default use cgo unless it's disabled. For this reason I actually set CGO_ENABLED=0 in my .profile. ↩

@thehowl thehowl added the πŸ“¦ 🌐 tendermint v2 Issues or PRs tm2 related label Jan 30, 2024
@thehowl thehowl self-assigned this Jan 30, 2024
@github-actions github-actions bot added πŸ“¦ πŸ€– gnovm Issues or PRs gnovm related πŸ“¦ ⛰️ gno.land Issues or PRs gno.land package related labels Jan 30, 2024
@thehowl thehowl requested review from a team and gfanton as code owners January 30, 2024 14:08
Copy link

codecov bot commented Jan 30, 2024

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 47.46%. Comparing base (b965e5e) to head (b0252a5).
Report is 3 commits behind head on master.

Files Patch % Lines
gno.land/pkg/gnoland/node_inmemory.go 0.00% 2 Missing ⚠️
gno.land/pkg/gnoland/app.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1602      +/-   ##
==========================================
- Coverage   47.47%   47.46%   -0.01%     
==========================================
  Files         385      384       -1     
  Lines       61377    61356      -21     
==========================================
- Hits        29140    29124      -16     
+ Misses      29802    29797       -5     
  Partials     2435     2435              

β˜” View full report in Codecov by Sentry.
πŸ“’ Have feedback on the report? Share it here.

@kristovatlas
Copy link
Contributor

Just reading the PR comment, I don't see any obvious security implications. πŸ‘

Copy link
Member

@zivkovicmilos zivkovicmilos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for cleaning up the db sub-packages πŸ™

Pinging @ajnavarro to give a second round of reviews, and also to discuss the comment about using a single DB implementation for now, for the sake of time and simplicity

tm2/pkg/crypto/keys/lazy_keybase.go Show resolved Hide resolved
tm2/pkg/db/_all/all.go Show resolved Hide resolved
tm2/pkg/db/db.go Show resolved Hide resolved
tm2/pkg/db/internal/test_common.go Show resolved Hide resolved
tm2/pkg/db/types.go Show resolved Hide resolved
Copy link
Member

@gfanton gfanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good πŸ‘

tm2/pkg/db/common_test.go Show resolved Hide resolved
tm2/pkg/db/goleveldb/go_level_db_test.go Show resolved Hide resolved
tm2/pkg/db/boltdb/boltdb.go Outdated Show resolved Hide resolved
tm2/pkg/db/cleveldb/no_cgo_test.go Show resolved Hide resolved
tm2/pkg/db/_all/all.go Show resolved Hide resolved
@thehowl
Copy link
Member Author

thehowl commented Feb 29, 2024

After discussing with @zivkovicmilos, placing this in the review meeting so we can discuss, following this discussion: #1602 (comment)

I've changed the PR to remove all databases except GoLevelDB and MemDB (the only ones used in real code), so we can discuss with the team whether it makes sense to keep in our codebase implementations for databases no-one uses, or to pick one database for our use case and expect to eventually leverage even specific features for our usecase, as @ajnavarro suggests.

@thehowl
Copy link
Member Author

thehowl commented Feb 29, 2024

cc/ @jaekwon

With relation to the above comment, we haven't fully reached a consensus in the review meeting on what we should do here, so we'd be asking you for guidance: see #1714 for the PR that splits away from this and tries to sum up the two different takes. Thanks!

@thehowl thehowl merged commit 2bb80df into master Mar 1, 2024
190 of 192 checks passed
@thehowl thehowl deleted the dev/morgan/split-db-package branch March 1, 2024 11:53
thehowl added a commit that referenced this pull request Mar 1, 2024
![image](https://github.com/gnolang/gno/assets/4681308/761f1109-3e0d-4846-b331-3c60b6e803c0)


![image](https://github.com/gnolang/gno/assets/4681308/312b9226-49fa-48ac-b280-f4d704a341df)

The PR depends on #1602 but it's good for review. For ease of reviewing
I put that PR's branch as the base branch; once it is merged I will
rebase on master.

A bunch of changes to get there:

- Split `proxy` into `appconn` and `proxy`. `proxy` has a dependency on
`kvstore`, which then depends on pkg/db and goleveldb.
- Tools for inspecting dependency chains:
- `go list -f '{{ join .Deps "\n" }}'
github.com/gnolang/gno/tm2/your/pkg`
  - [`depth`](https://github.com/KyleBanks/depth)

The original point of the PR was to not include an entire key-store
database as part of `cmd/gno`, but really the better side-effect is that
the dependency is no longer in the rpc package. Sadly, `gnoclient` still
imports `goleveldb`, but I'll leave understanding why that is the case
for another refactor-investigation :)
leohhhn pushed a commit to leohhhn/gno that referenced this pull request Mar 6, 2024
Changes mostly involve shuffling code around.

The `db` package now has its basic types and helper functions in
`pkg/db`, and then sub-packages for each DB which is actually used. This
allows a package using it to only import the database code it needs.

The end goal I wanted was to remove the transitive dependency of
`cmd/gno` on `goleveldb`. Turns out this is more complicated than I
thought. In any case, I already did it in another branch; will make a
second PR shortly after this one.

Notes for reviewing:

- `pkg/db` has also the helper packages `_all` and `_tags`.
- `_all` imports all the databases while respecting the `cgo` build
tag.[^1] (This is automatically set by the go compiler depending on
whether cgo is enabled).
- The `_tags` package, instead, imports the databases specified by the
build tags. I meant this for end-user binaries, like gno.land eventually
(currently it only supports GoLevelDB), so a user can compile with the
given build tags.
- I removed `badger` and `gorocksdb`. They were so important that they
didn't compile and nobody noticed.
- `badger` probably makes sense to be re-added and tried out eventually,
but it's outside of the scope of this PR.
- `gorocksdb` is unmaintained and will not compile with the latest
version of RocksDB. In fact, we already had a DB implementation for
[grocksdb](https://github.com/linxGnu/grocksdb), the maintained fork. I
fixed some code there too to actually make it compile.
- Removing the two dependencies made our `go.mods` a bunch lighter.
:tada:
- Some changes also involve the CI. 
- I removed splitting the build for each database being tested, as the
tests on the package run quickly anyway so there is no real need to
split them.
- I also severely downgraded `grocksdb` so that its version is
compatible with the version of RocksDB present on ubuntu 22.04's
repositories. There could be a variety of different changes to the CI to
allow building on a more recent version of the database; but after
having seen that the compilation of RocksDB on a beefy machine with
`make -j4` is already beyond 5 mintues, I decided we don't need a slow
CI for a feature literally no-one uses, so here we are.
- This PR also fixes the benchmarks in `tm2/pkg/iavl/benchmarks` (fixes
gnolang#908). [The fix itself is
stupid.](gnolang@87ea39d#diff-112673453ecf482fb6bfd8004ebc11eff7536b812cc48bd6dff3328767550908R227-R229)

[^1]: A while ago I was upset that my go compilation was taking long and
then found out that it was using cgo for no apparent reason -- it's
actually because a few places in the standard library will by default
use cgo unless it's disabled. For this reason I actually set
`CGO_ENABLED=0` in my `.profile`.

---------

Co-authored-by: Antonio Navarro <antnavper@gmail.com>
leohhhn pushed a commit to leohhhn/gno that referenced this pull request Mar 6, 2024
![image](https://github.com/gnolang/gno/assets/4681308/761f1109-3e0d-4846-b331-3c60b6e803c0)


![image](https://github.com/gnolang/gno/assets/4681308/312b9226-49fa-48ac-b280-f4d704a341df)

The PR depends on gnolang#1602 but it's good for review. For ease of reviewing
I put that PR's branch as the base branch; once it is merged I will
rebase on master.

A bunch of changes to get there:

- Split `proxy` into `appconn` and `proxy`. `proxy` has a dependency on
`kvstore`, which then depends on pkg/db and goleveldb.
- Tools for inspecting dependency chains:
- `go list -f '{{ join .Deps "\n" }}'
github.com/gnolang/gno/tm2/your/pkg`
  - [`depth`](https://github.com/KyleBanks/depth)

The original point of the PR was to not include an entire key-store
database as part of `cmd/gno`, but really the better side-effect is that
the dependency is no longer in the rpc package. Sadly, `gnoclient` still
imports `goleveldb`, but I'll leave understanding why that is the case
for another refactor-investigation :)
thehowl added a commit that referenced this pull request Mar 7, 2024
This PR sets `CGO_ENABLED=0` in all top-level makefiles. This comes as a
consequence of a rare occurrence where [bft is actually failing for a
valid
reason](https://github.com/gnolang/gno/actions/runs/8111419409/job/22170651604).
This is the relevant part of the logs:

```
2024-03-01T12:36:50.7818357Z ok  	github.com/gnolang/gno/tm2/pkg/bft/rpc/lib/types	0.004s	coverage: 33.3% of statements
2024-03-01T12:36:50.7853384Z 	github.com/gnolang/gno/tm2/pkg/bft/rpc/test		coverage: 0.0% of statements
2024-03-01T12:36:51.5524659Z # github.com/jmhodges/levigo
2024-03-01T12:36:51.5556601Z ##[error]../../../../go/pkg/mod/github.com/jmhodges/levigo@v1.0.0/batch.go:4:11: fatal error: leveldb/c.h: No such file or directory
2024-03-01T12:36:51.5567246Z     4 | // #include "leveldb/c.h"
2024-03-01T12:36:51.5567886Z       |           ^~~~~~~~~~~~~
2024-03-01T12:36:51.5568424Z compilation terminated.
2024-03-01T12:36:59.3179828Z FAIL	github.com/gnolang/gno/tm2/pkg/bft/state [build failed]
```

This is likely a consequence of
#1602. I've additionally made it so
that `bft/state` doesn't require any database it doesn't need in its
tests, but personally I have been repeatedly bitten by finding out CGO
was enabled and compiling in unexpected places (including the Go stdlib)
that I think it makes sense to keep it disabled by default in our
makefiles.

This PR tackles that, and takes the chance to re-organise the overall
structure of makefiles to document prominently the available variables
that can be changed.
albttx pushed a commit to albttx/gno that referenced this pull request Mar 15, 2024
This PR sets `CGO_ENABLED=0` in all top-level makefiles. This comes as a
consequence of a rare occurrence where [bft is actually failing for a
valid
reason](https://github.com/gnolang/gno/actions/runs/8111419409/job/22170651604).
This is the relevant part of the logs:

```
2024-03-01T12:36:50.7818357Z ok  	github.com/gnolang/gno/tm2/pkg/bft/rpc/lib/types	0.004s	coverage: 33.3% of statements
2024-03-01T12:36:50.7853384Z 	github.com/gnolang/gno/tm2/pkg/bft/rpc/test		coverage: 0.0% of statements
2024-03-01T12:36:51.5524659Z # github.com/jmhodges/levigo
2024-03-01T12:36:51.5556601Z ##[error]../../../../go/pkg/mod/github.com/jmhodges/levigo@v1.0.0/batch.go:4:11: fatal error: leveldb/c.h: No such file or directory
2024-03-01T12:36:51.5567246Z     4 | // #include "leveldb/c.h"
2024-03-01T12:36:51.5567886Z       |           ^~~~~~~~~~~~~
2024-03-01T12:36:51.5568424Z compilation terminated.
2024-03-01T12:36:59.3179828Z FAIL	github.com/gnolang/gno/tm2/pkg/bft/state [build failed]
```

This is likely a consequence of
gnolang#1602. I've additionally made it so
that `bft/state` doesn't require any database it doesn't need in its
tests, but personally I have been repeatedly bitten by finding out CGO
was enabled and compiling in unexpected places (including the Go stdlib)
that I think it makes sense to keep it disabled by default in our
makefiles.

This PR tackles that, and takes the chance to re-organise the overall
structure of makefiles to document prominently the available variables
that can be changed.
@jaekwon
Copy link
Contributor

jaekwon commented Mar 25, 2024

Thank you!

thehowl added a commit that referenced this pull request May 15, 2024
Split from #1602. Context:

> @zivkovicmilos:[^1] Unpopular opinion: we should probably just drop
these DBs entirely from the codebase, they are never gonna be used in
place of LevelDB in the near future
>
> @ajnavarro: IMHO the needed database is defined by the application
needs, not from any other external usecases. Sticking to one of the
existing key/value databases, we can use their special capabilities to
make it more performant for our use case (yes, even if they are all
key/value DBs, they differ in a lot of low-level functionality). \
> I would say to use one and add support in the future for more if
needed.

After a discussion in the review meeting w/ @jaekwon, this PR is now
changed to maintain another reference implementation with boltdb. It
also makes sure that it is selectable, when available, from the tm2 node
configuration.

After this PR, there are no databases in the codebase using cgo.

[^1]: #1602 (comment)
leohhhn pushed a commit to leohhhn/gno that referenced this pull request May 21, 2024
Split from gnolang#1602. Context:

> @zivkovicmilos:[^1] Unpopular opinion: we should probably just drop
these DBs entirely from the codebase, they are never gonna be used in
place of LevelDB in the near future
>
> @ajnavarro: IMHO the needed database is defined by the application
needs, not from any other external usecases. Sticking to one of the
existing key/value databases, we can use their special capabilities to
make it more performant for our use case (yes, even if they are all
key/value DBs, they differ in a lot of low-level functionality). \
> I would say to use one and add support in the future for more if
needed.

After a discussion in the review meeting w/ @jaekwon, this PR is now
changed to maintain another reference implementation with boltdb. It
also makes sure that it is selectable, when available, from the tm2 node
configuration.

After this PR, there are no databases in the codebase using cgo.

[^1]: gnolang#1602 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
πŸ“¦ 🌐 tendermint v2 Issues or PRs tm2 related πŸ“¦ ⛰️ gno.land Issues or PRs gno.land package related πŸ“¦ πŸ€– gnovm Issues or PRs gnovm related
Projects
Status: Done
Status: No status
Archived in project
Development

Successfully merging this pull request may close these issues.

BenchmarkMedium panicking on memdb
6 participants