cleveldb socket #1162

Closed
ebuchman opened this issue Jan 26, 2018 · 25 comments
Labels
C:consensus Component: Consensus C:sync Component: Fast Sync, State Sync T:perf Type: Performance
Milestone

Comments

@ebuchman
Contributor

We need to use cleveldb instead of goleveldb because it's drastically better.

But there are issues around cross compiling Go+C.

The solution is to talk to CLevelDB running in a separate process; we only need to build the binaries for that once (or at least rarely) for each platform.

Alternatively, there might be some magic we can do with Go plugins ...
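
To make the separate-process idea concrete, here is a minimal sketch, not the project's implementation. It swaps in Go's standard `net/rpc` package for whatever transport is eventually chosen, and a plain map stands in for CLevelDB. The names `Store`, `KV`, `serve`, and `roundTrip` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
	"sync"
)

// KV is a hypothetical key/value pair sent over the wire.
type KV struct {
	Key, Value []byte
}

// Store stands in for the C-backed database. A real service would wrap
// CLevelDB here instead of a map.
type Store struct {
	mu   sync.RWMutex
	data map[string][]byte
}

// Set stores kv.Value under kv.Key and reports success through ok.
func (s *Store) Set(kv KV, ok *bool) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[string(kv.Key)] = kv.Value
	*ok = true
	return nil
}

// Get writes the value stored under key into value.
func (s *Store) Get(key []byte, value *[]byte) error {
	s.mu.RLock()
	defer s.mu.RUnlock()
	*value = s.data[string(key)]
	return nil
}

// serve registers a fresh Store and answers RPCs on l in the background.
func serve(l net.Listener) error {
	srv := rpc.NewServer()
	if err := srv.Register(&Store{data: make(map[string][]byte)}); err != nil {
		return err
	}
	go srv.Accept(l)
	return nil
}

// roundTrip starts the service on a loopback port, then sets and reads
// back one key through the client, the way a Go-only node binary would.
func roundTrip(key, value string) (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()
	if err := serve(l); err != nil {
		return "", err
	}

	client, err := rpc.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer client.Close()

	var ok bool
	if err := client.Call("Store.Set", KV{Key: []byte(key), Value: []byte(value)}, &ok); err != nil {
		return "", err
	}
	var got []byte
	if err := client.Call("Store.Get", []byte(key), &got); err != nil {
		return "", err
	}
	return string(got), nil
}

func main() {
	got, err := roundTrip("Project", "Tmlibs-on-RPC")
	if err != nil {
		log.Fatalf("round trip: %v", err)
	}
	fmt.Printf("read back %q\n", got)
}
```

In this sketch the client side is pure Go, so only the server binary would ever need a C toolchain; here both run in one process purely to keep the example runnable.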

@ebuchman ebuchman added C:sync Component: Fast Sync, State Sync T:perf Type: Performance C:consensus Component: Consensus priority? labels Jan 26, 2018
@greg-szabo
Contributor

How about BadgerDB? - Frey
Let's try that too. - Bucky
There's a PR in tmlibs that is related to this: tendermint/tmlibs#58

@adrianbrink
Contributor

The issue with including C is that it screws with deterministic builds, right?

I ask because I think in the long run it makes sense to be able to include C in our builds.

@greg-szabo greg-szabo added this to the launch milestone Feb 6, 2018
odeke-em added a commit to tendermint/tmlibs that referenced this issue Mar 12, 2018
Fixes tendermint/tendermint#1162

Databases as a service!

Can now access Databases as a remote service
via gRPC for performance and easy deployment.

The caveat is that each service is stateful in regards to
the DB i.e. each unique service uses only one unique DB
but nonetheless multiple clients can access it.

A full standalone example

```go
package main

import (
  "bytes"
  "context"
  "log"

  grpcdb "github.com/tendermint/tmlibs/grpcdb"
  protodb "github.com/tendermint/tmlibs/proto"
)

func main() {
  addr := ":8998"
  go func() {
    if err := grpcdb.BindRemoteDBServer(addr); err != nil {
      log.Fatalf("BindRemoteDBServer: %v", err)
    }
  }()

  client, err := grpcdb.NewClient(addr, false)
  if err != nil {
    log.Fatalf("Failed to create grpcDB client: %v", err)
  }

  ctx := context.Background()
  // 1. Initialize the DB
  in := &protodb.Init{
    Type: "leveldb",
    Name: "grpc-uno-test",
    Dir:  ".",
  }
  if _, err := client.Init(ctx, in); err != nil {
    log.Fatalf("Init error: %v", err)
  }

  // 2. Now it can be used!
  query1 := &protodb.Entity{Key: []byte("Project"), Value: []byte("Tmlibs-on-gRPC")}
  if _, err := client.SetSync(ctx, query1); err != nil {
    log.Fatalf("SetSync err: %v", err)
  }

  query2 := &protodb.Entity{Key: []byte("Project")}
  read, err := client.Get(ctx, query2)
  if err != nil {
    log.Fatalf("Get err: %v", err)
  }
  if g, w := read.Value, []byte("Tmlibs-on-gRPC"); !bytes.Equal(g, w) {
    log.Fatalf("got= (%q ==> % X)\nwant=(%q ==> % X)", g, g, w, w)
  }
}
```
@melekes
Contributor

melekes commented Mar 19, 2018

@odeke-em just curious, have you looked at go plugins or other means of registering components (with C code)?

@odeke-em
Contributor

odeke-em commented Mar 23, 2018

@odeke-em just curious, have you looked at go plugins or other means of registering components (with C code)?

@melekes great question. I see so many problems with plugins that I'm not sure it's a maintenance road I'd want to take. It would also require passing around the *.so files, which are another dependency; at that point I'd rather just ask the customer to install gcc on their computer. Using a socket, or rather RPC, is IMHO a better option: it decouples dependencies, since now we'll just need a URL, and we can also vendor a Docker container with the CLevelDB service that anyone can run locally or in the cloud.

@srmo
Contributor

srmo commented Mar 24, 2018

The solution is to talk to CLevelDB running in a separate process

Hm, we as core users would violate one of the main restrictions/selling points of our services with that, namely that there are no additional service dependencies that require additional maintenance, etc.
Do you consider keeping an embedded/in-process hook?

@srmo
Contributor

srmo commented Mar 24, 2018

PS:
Otherwise, we would very much appreciate it if this RPC/socket interface to an external DB were abstracted away such that we can use whatever DB we like.
Are the following discussions related to this?
#674
#803
#1161

IIRC, there was a PR not long ago that replaced leveldb with MongoDB (not great performance-wise), but it was a start. I think that PR/proposal was rejected.

@odeke-em
Contributor

PS:
Otherwise, we would very much appreciate it if this RPC/socket interface to an external DB were abstracted away such that we can use whatever DB we like.

@srmo indeed, already done in tendermint/tmlibs#162, and for brevity there is a sample test/usage: https://github.com/tendermint/tmlibs/blob/d5d68050977ca4d44308b078b977aa7de65fd4d2/remotedb/remotedb_test.go#L24-L40
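
As a rough illustration of what "abstracted away" can look like, the sketch below hides the backend behind a small interface so calling code never learns whether it talks to an in-process store or a wrapper around something else. It is not the tmlibs interface: `DB`, `memDB`, `prefixDB`, and `saveAndLoad` are hypothetical names, and a real interface would also need deletes, iterators, and batches.

```go
package main

import "fmt"

// DB is a hypothetical, pared-down key/value interface.
type DB interface {
	Get(key []byte) ([]byte, error)
	Set(key, value []byte) error
}

// memDB is an in-process backend kept in a map.
type memDB struct{ data map[string][]byte }

func newMemDB() *memDB { return &memDB{data: make(map[string][]byte)} }

func (m *memDB) Get(key []byte) ([]byte, error) { return m.data[string(key)], nil }

func (m *memDB) Set(key, value []byte) error {
	// Copy the value so callers can safely reuse their slice.
	m.data[string(key)] = append([]byte(nil), value...)
	return nil
}

// prefixDB wraps any DB and namespaces its keys. It shows that wrappers
// compose without knowing which backend sits underneath.
type prefixDB struct {
	prefix []byte
	inner  DB
}

func (p *prefixDB) full(key []byte) []byte {
	return append(append([]byte(nil), p.prefix...), key...)
}

func (p *prefixDB) Get(key []byte) ([]byte, error) { return p.inner.Get(p.full(key)) }
func (p *prefixDB) Set(key, value []byte) error    { return p.inner.Set(p.full(key), value) }

// saveAndLoad depends only on the DB interface, never on a concrete backend.
func saveAndLoad(db DB, key, value string) (string, error) {
	if err := db.Set([]byte(key), []byte(value)); err != nil {
		return "", err
	}
	got, err := db.Get([]byte(key))
	return string(got), err
}

func main() {
	backends := []DB{
		newMemDB(),
		&prefixDB{prefix: []byte("app/"), inner: newMemDB()},
	}
	for _, db := range backends {
		got, err := saveAndLoad(db, "Project", "Tmlibs")
		fmt.Printf("%T: %q, err=%v\n", db, got, err)
	}
}
```

A remote client would be one more type satisfying `DB`, which is why the same calling code can serve both the embedded and the external deployment.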

@srmo
Contributor

srmo commented Mar 25, 2018

Thanks for the heads up. Will you keep the embedded option?

@odeke-em
Contributor

Will you keep the embedded option?

@srmo what do you mean? What's going to happen is that the server will be deployed remotely, e.g. via a Docker container or by a database broker, i.e. as a separate process from the client. I'm not sure "embedding" applies here.

@srmo
Contributor

srmo commented Mar 25, 2018

„Embedded“ as it is know. In our use cases with tendermint core, deploying additional components is not an option.

@srmo
Contributor

srmo commented Mar 25, 2018

*Edit (on mobile) : as it is now, i.e. in process DB

@odeke-em
Contributor

„Embedded“ as it is know. In our use cases with tendermint core, deploying additional components is not an option.

Yeah, running within the same process doesn't fly, and that's the point of the isolation: some machines might not have gcc installed. Do your requirements not permit IPC either? In that case, on the same machine, it could perhaps be a binary cross-compiled for your system (built on a system with gcc) and then deployed as a server on the same machine. It depends on the urgency and use case, though. We certainly could explore plugins/shared object libraries that get linked at runtime to run in the same process. However, the use case and urgency are what warrant the complexity and prioritization of solutions. Which brings me to the question: what's your use case? What are your constraints? Hope this makes sense.

@srmo
Contributor

srmo commented Mar 25, 2018

Our stack revolves around the following:

  • A) a node, consisting of tendermint + abci app
  • B) a special functional node, connecting to A

This provides us with a deployment scenario where customers only have to install
A) the node
B) the functional node

In total 3 services. That's also our main selling point, you only have to install and maintain min 2, max 3 services, Tendermint + ABCI app (+ functional node).

This works because tendermint core uses an in-process/embedded DB, namely leveldb.
If this possibility is removed by requiring installation of an additional service, we lose one of the main selling points. Why? Because this additional service requires additional maintenance, backup strategies, and all the administrative overhead that is not necessary in the current setup.

So as long as this solution works as a backup, we would be fine. Otherwise the whole selling point has to be reworked/redesigned.

Sure, we also wondered how an external DB with powerful query options would help in certain use cases but this brings us back to the points raised above.

Does this help?

@srmo
Contributor

srmo commented Mar 25, 2018

PS: I'm not sure if IPC helps, it might but I'm not very familiar with deployment processes for such a setup.

@ebuchman
Contributor Author

We will continue to support the in-process, Go-based LevelDB for folks that just want the simple build/ops pipeline. Folks with stricter requirements around the robustness and performance of the underlying database will be able to use the SocketDB.
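
One way to picture keeping both options side by side is a small registry that maps a configured backend name to a constructor and falls back to the in-process store when nothing is configured. This is only a sketch under assumptions: the backend names `goleveldb` and `socket`, the `Backend` struct, and `defaultRegistry` are hypothetical and do not reflect the actual tmlibs API.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a stand-in for an opened database handle.
type Backend struct {
	Name      string
	Target    string // data directory or service address
	InProcess bool
}

// creator opens one kind of backend for the given target.
type creator func(target string) (*Backend, error)

// registry maps a backend name to its constructor.
type registry map[string]creator

// defaultRegistry lists the two hypothetical options discussed here:
// an embedded pure-Go store and a remote socket-based one.
func defaultRegistry() registry {
	return registry{
		"goleveldb": func(dir string) (*Backend, error) {
			return &Backend{Name: "goleveldb", Target: dir, InProcess: true}, nil
		},
		"socket": func(addr string) (*Backend, error) {
			if addr == "" {
				return nil, errors.New("socket backend needs a service address")
			}
			return &Backend{Name: "socket", Target: addr, InProcess: false}, nil
		},
	}
}

// open picks a backend by name, defaulting to the in-process store so
// that deployments with no extra services keep working unchanged.
func (r registry) open(name, target string) (*Backend, error) {
	if name == "" {
		name = "goleveldb"
	}
	create, ok := r[name]
	if !ok {
		return nil, fmt.Errorf("unknown db backend %q", name)
	}
	return create(target)
}

func main() {
	reg := defaultRegistry()
	for _, cfg := range []struct{ name, target string }{
		{"", "./data"},
		{"socket", "127.0.0.1:8998"},
		{"mongodb", ""},
	} {
		b, err := reg.open(cfg.name, cfg.target)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s at %s (in-process: %v)\n", b.Name, b.Target, b.InProcess)
	}
}
```

Defaulting to the embedded backend means operators who never touch the setting see no new service to deploy or back up.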

Closing this in favor of a new issue to track this work: tendermint/tmlibs#203

cwgoes pushed a commit to tendermint/tmlibs that referenced this issue May 7, 2018
Fixes tendermint/tendermint#1162

@ackratos
Contributor

ackratos commented Feb 28, 2019

We need to use cleveldb instead of goleveldb because it's drastically better.

will we use cleveldb at cosmos mainnet launch? @ebuchman

@melekes
Contributor

melekes commented Feb 28, 2019

will we use cleveldb at cosmos mainnet launch?

Choice of the database is up to validators & other people running full-nodes. Tendermint does not enforce it.

@ackratos
Contributor

Choice of the database is up to validators & other people running full-nodes. Tendermint does not enforce it.

I see, thank you:)
Does Tendermint have any recommendations/preferences regarding performance and stability (if you were to run some nodes yourselves)?
As goleveldb is the default implementation, how confident are you about the stability of cleveldb?

@melekes
Contributor

melekes commented Feb 28, 2019

I'd definitely use cleveldb 🦇 Although, we have some reports of possible memory leaks (make sure to use latest version).

@odeke-em
Contributor

I'd definitely use cleveldb 🦇 Although, we have some reports of possible memory leaks (make sure to use latest version)

@melekes long time man! Please share with me reports of memory leaks if these reports are public. Thanks.

@ackratos
Contributor

ackratos commented Mar 1, 2019

make sure to use latest version

Thank you very much!
you mean latest tendermint right? I presume leveldb (compiled lib) itself doesn't leak

@melekes
Contributor

melekes commented Mar 1, 2019

you mean latest tendermint right? I presume leveldb (compiled lib) itself doesn't leak

I meant latest leveldb.

@ackratos
Contributor

I meant latest leveldb.

okay, thank you. we are using latest release https://github.com/google/leveldb/releases/tag/v1.20
would you suggest latest master branch?

@melekes
Contributor

melekes commented Mar 15, 2019

No, v1.20 is fine.

@melekes
Contributor

melekes commented Mar 15, 2019

⚠️ We've fixed a memory leak in Tendermint connected to CLevelDB https://github.com/tendermint/tendermint/blob/master/CHANGELOG.md#v0302

Cashmaney pushed a commit to scrtlabs/tendermint that referenced this issue Aug 2, 2023