feat: add more telemetry #2059

Merged: 21 commits, May 24, 2024
6 changes: 4 additions & 2 deletions gno.land/cmd/gnoland/start.go
@@ -247,8 +247,10 @@
 	// Wrap the zap logger
 	logger := log.ZapLoggerToSlog(zapLogger)

-	// Initialize telemetry
-	telemetry.Init(*cfg.Telemetry)
+	// Initialize the telemetry
+	if err := telemetry.Init(*cfg.Telemetry); err != nil {
+		return fmt.Errorf("unable to initialize telemetry, %w", err)
+	}

 	// Write genesis file if missing.
 	// NOTE: this will be dropped in a PR that resolves issue #1886:
66 changes: 47 additions & 19 deletions gno.land/pkg/sdk/vm/handler.go
@@ -1,12 +1,17 @@
 package vm

 import (
+	"context"
 	"fmt"
 	"strings"

 	abci "github.com/gnolang/gno/tm2/pkg/bft/abci/types"
 	"github.com/gnolang/gno/tm2/pkg/sdk"
 	"github.com/gnolang/gno/tm2/pkg/std"
+	"github.com/gnolang/gno/tm2/pkg/telemetry"
+	"github.com/gnolang/gno/tm2/pkg/telemetry/metrics"
+	"go.opentelemetry.io/otel/attribute"
+	"go.opentelemetry.io/otel/metric"
 )

 type vmHandler struct {
@@ -51,14 +56,6 @@
 	}
 	res.Data = []byte(resstr)
 	return
-	/* TODO handle events.
-	ctx.EventManager().EmitEvent(
-		sdk.NewEvent(
-			sdk.EventTypeMessage,
-			sdk.NewAttribute(sdk.AttributeKeyXXX, types.AttributeValueXXX),
-		),
-	)
-	*/
 }

 // Handle MsgRun.
@@ -71,7 +68,7 @@
 	return
 }

-//----------------------------------------
+// ----------------------------------------
 // Query

 // query paths
@@ -84,27 +81,58 @@
 	QueryFile = "qfile"
 )

-func (vh vmHandler) Query(ctx sdk.Context, req abci.RequestQuery) (res abci.ResponseQuery) {
-	switch secondPart(req.Path) {
+func (vh vmHandler) Query(ctx sdk.Context, req abci.RequestQuery) abci.ResponseQuery {
+	var (
+		res  abci.ResponseQuery
+		path = secondPart(req.Path)
+	)

+	switch path {
 	case QueryPackage:
-		return vh.queryPackage(ctx, req)
+		res = vh.queryPackage(ctx, req)
 	case QueryStore:
-		return vh.queryStore(ctx, req)
+		res = vh.queryStore(ctx, req)
 	case QueryRender:
-		return vh.queryRender(ctx, req)
+		res = vh.queryRender(ctx, req)
 	case QueryFuncs:
-		return vh.queryFuncs(ctx, req)
+		res = vh.queryFuncs(ctx, req)
 	case QueryEval:
-		return vh.queryEval(ctx, req)
+		res = vh.queryEval(ctx, req)
 	case QueryFile:
-		return vh.queryFile(ctx, req)
+		res = vh.queryFile(ctx, req)
 	default:
-		res = sdk.ABCIResponseQueryFromError(
+		return sdk.ABCIResponseQueryFromError(
 			std.ErrUnknownRequest(fmt.Sprintf(
 				"unknown vm query endpoint %s in %s",
 				secondPart(req.Path), req.Path)))
 	}

+	// Log the telemetry
+	logQueryTelemetry(path, res.IsErr())

+	return res
+}

+// logQueryTelemetry logs the relevant VM query telemetry
+func logQueryTelemetry(path string, isErr bool) {
+	if !telemetry.MetricsEnabled() {
+		return
+	}

+	metrics.VMQueryCalls.Add(
+		context.Background(),
+		1,
+		metric.WithAttributes(
+			attribute.KeyValue{
+				Key:   "path",
+				Value: attribute.StringValue(path),
+			},
+		),
+	)

+	if isErr {
+		metrics.VMQueryErrors.Add(context.Background(), 1)
+	}
 }

 // queryPackage fetch a package's files.
@@ -187,7 +215,7 @@
 	return
 }

-//----------------------------------------
+// ----------------------------------------
 // misc

 func abciResult(err error) sdk.Result {
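As far as the `Query` hunk above shows, the `default` branch returns directly, so `logQueryTelemetry` is not reached for unknown paths. One possible alternative is to record in a deferred call that runs on every return path. The sketch below assumes this would be acceptable for the handler; `queryStats` and `handleQuery` are hypothetical in-memory stand-ins, not the PR's code or the OpenTelemetry API.

```go
package main

import (
	"fmt"
	"sync"
)

// queryStats is a hypothetical in-memory stand-in for the two counters used
// in the PR (VMQueryCalls and VMQueryErrors).
type queryStats struct {
	mu     sync.Mutex
	calls  map[string]int64 // number of queries per path
	errors int64            // total number of failed queries
}

func newQueryStats() *queryStats {
	return &queryStats{calls: make(map[string]int64)}
}

// record counts one query for the given path and, if it failed, one error.
func (s *queryStats) record(path string, isErr bool) {
	s.mu.Lock()
	defer s.mu.Unlock()

	s.calls[path]++
	if isErr {
		s.errors++
	}
}

// handleQuery is a hypothetical dispatcher. The deferred closure reads the
// named result after it has been set, so it runs for every branch,
// including the unknown-path one.
func handleQuery(stats *queryStats, path string) (err error) {
	defer func() { stats.record(path, err != nil) }()

	switch path {
	case "qfile":
		return nil
	default:
		return fmt.Errorf("unknown vm query endpoint %s", path)
	}
}

func main() {
	stats := newQueryStats()

	_ = handleQuery(stats, "qfile")
	_ = handleQuery(stats, "nope")

	fmt.Println(stats.calls["qfile"], stats.calls["nope"], stats.errors)
}
```

The trade-off is that unknown paths become attribute values on the counter, which can raise label cardinality if clients send arbitrary paths.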
71 changes: 71 additions & 0 deletions gno.land/pkg/sdk/vm/keeper.go
@@ -4,6 +4,7 @@

 import (
 	"bytes"
+	"context"
 	"fmt"
 	"os"
 	"strings"
@@ -16,6 +17,10 @@
 	"github.com/gnolang/gno/tm2/pkg/sdk/bank"
 	"github.com/gnolang/gno/tm2/pkg/std"
 	"github.com/gnolang/gno/tm2/pkg/store"
+	"github.com/gnolang/gno/tm2/pkg/telemetry"
+	"github.com/gnolang/gno/tm2/pkg/telemetry/metrics"
+	"go.opentelemetry.io/otel/attribute"
+	"go.opentelemetry.io/otel/metric"
 )

 const (
@@ -215,6 +220,17 @@
 		}
 	}()
 	m2.RunMemPackage(memPkg, true)

+	// Log the telemetry
+	logTelemetry(
+		m2.GasMeter.GasConsumed(),
+		m2.Cycles,
+		attribute.KeyValue{
+			Key:   "operation",
+			Value: attribute.StringValue("m_addpkg"),
+		},
+	)

 	return nil
 }

@@ -312,7 +328,19 @@
 			res += "\n"
 		}
 	}

+	// Log the telemetry
+	logTelemetry(
+		m.GasMeter.GasConsumed(),
+		m.Cycles,
+		attribute.KeyValue{
+			Key:   "operation",
+			Value: attribute.StringValue("m_call"),
+		},
+	)

 	res += "\n\n" // use `\n\n` as separator to separate results for single tx with multi msgs

 	return res, nil
 	// TODO pay for gas? TODO see context?
 }
@@ -418,6 +446,17 @@
 	}()
 	m2.RunMain()
 	res = buf.String()

+	// Log the telemetry
+	logTelemetry(
+		m2.GasMeter.GasConsumed(),
+		m2.Cycles,
+		attribute.KeyValue{
+			Key:   "operation",
+			Value: attribute.StringValue("m_run"),
+		},
+	)

 	return res, nil
 }

@@ -636,3 +675,35 @@
 		return res, nil
 	}
 }

+// logTelemetry logs the VM processing telemetry
+func logTelemetry(
+	gasUsed int64,
+	cpuCycles int64,
+	attributes ...attribute.KeyValue,
+) {
+	if !telemetry.MetricsEnabled() {
+		return
+	}

+	// Record the operation frequency
+	metrics.VMExecMsgFrequency.Add(
+		context.Background(),
+		1,
+		metric.WithAttributes(attributes...),
+	)

+	// Record the CPU cycles
+	metrics.VMCPUCycles.Record(
+		context.Background(),
+		cpuCycles,
+		metric.WithAttributes(attributes...),
+	)

+	// Record the gas used
+	metrics.VMGasUsed.Record(
+		context.Background(),
+		gasUsed,
+		metric.WithAttributes(attributes...),
+	)
+}
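The shape of `logTelemetry` above (an enabled guard, then one counter increment and two value recordings that share the same attributes) can be sketched without the OpenTelemetry dependency. Everything below is a hypothetical in-memory stand-in: `label` plays the role of `attribute.KeyValue`, and `vmMetrics` replaces the three instruments from the `metrics` package.

```go
package main

import (
	"fmt"
	"strings"
)

// label is a hypothetical stand-in for attribute.KeyValue.
type label struct{ Key, Value string }

// vmMetrics is a hypothetical in-memory stand-in for the three instruments:
// a frequency counter plus two value recorders (kept here as raw samples).
type vmMetrics struct {
	enabled bool
	freq    map[string]int64
	cycles  map[string][]int64
	gas     map[string][]int64
}

func newVMMetrics(enabled bool) *vmMetrics {
	return &vmMetrics{
		enabled: enabled,
		freq:    make(map[string]int64),
		cycles:  make(map[string][]int64),
		gas:     make(map[string][]int64),
	}
}

// seriesKey flattens the labels into one key, e.g. "operation=m_call".
func seriesKey(labels []label) string {
	parts := make([]string, 0, len(labels))
	for _, l := range labels {
		parts = append(parts, l.Key+"="+l.Value)
	}
	return strings.Join(parts, ",")
}

// logTelemetry records nothing when metrics are disabled; otherwise every
// measurement is stored under the same label set.
func (m *vmMetrics) logTelemetry(gasUsed, cpuCycles int64, labels ...label) {
	if !m.enabled {
		return
	}

	k := seriesKey(labels)
	m.freq[k]++
	m.cycles[k] = append(m.cycles[k], cpuCycles)
	m.gas[k] = append(m.gas[k], gasUsed)
}

func main() {
	m := newVMMetrics(true)
	m.logTelemetry(1200, 340, label{"operation", "m_call"})

	fmt.Println(m.freq, m.cycles, m.gas)
}
```

Passing the labels once and reusing them for every measurement keeps the three series joinable by `operation` in a dashboard query.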
11 changes: 11 additions & 0 deletions misc/telemetry/Makefile
@@ -0,0 +1,11 @@
.PHONY: up
up:
	docker compose up -d --build

.PHONY: down
down:
	docker compose down

.PHONY: clean
clean:
	docker compose down -v
56 changes: 56 additions & 0 deletions misc/telemetry/README.md
@@ -0,0 +1,56 @@
## Overview

This document showcases the node metrics that the Gno node exposes through OpenTelemetry, without requiring any
extra setup.

The containerized setup consists of the following:

- Grafana dashboard
- Prometheus
- OpenTelemetry collector (separate service that needs to run)
- Single Gnoland node, with 1s block times and configured telemetry (enabled)
- Supernova process that simulates load periodically (generates network traffic)

## Starting the containers

### Step 1: Spinning up Docker

Make sure you have Docker installed and running on your system. After that, within the `misc/telemetry` folder run the
following command:

```shell
make up
```

This builds the Docker images required for this simulation and starts the services.

### Step 2: Open Grafana

When you've verified that the `telemetry` containers are up and running, head on over to http://localhost:3000 to open
the Grafana dashboard.

Default login details:

```
username: admin
password: admin
```

After you've logged in (you can skip setting a new password), on the left-hand side, click on
`Dashboards -> Gno -> Gno Node Metrics`:
![Grafana](assets/grafana-1.jpeg)

This will open the predefined Gno Metrics dashboard (added for ease of use):
![Metrics Dashboard](assets/grafana-2.jpeg)

These metrics update periodically as the `supernova` process simulates network traffic.

### Step 3: Stopping the cluster

To stop the cluster, you can run:

```shell
make down
```

which will stop the Docker containers. Additionally, you can delete the Docker volumes with `make clean`.
Binary file added misc/telemetry/assets/grafana-1.jpeg
Binary file added misc/telemetry/assets/grafana-2.jpeg
22 changes: 22 additions & 0 deletions misc/telemetry/collector/collector.yaml
@@ -0,0 +1,22 @@
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:

exporters:
  prometheus:
    endpoint: collector:8090

service:
  telemetry:
    logs:
      level: "debug"
  pipelines:
    metrics:
      receivers: [ otlp ]
      processors: [ batch ]
      exporters: [ prometheus ]
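The `metrics` pipeline above chains a receiver, the `batch` processor, and an exporter. As a conceptual illustration only, the sketch below shows the batching idea in isolation: items are buffered and handed to an export function in groups. `batcher` is a hypothetical, size-only simplification and not the collector's implementation, which can also flush on a timeout.

```go
package main

import "fmt"

// batcher is a hypothetical, simplified stand-in for a batch processor:
// it buffers items and passes them to export in groups of up to size.
type batcher struct {
	size   int
	buf    []string
	export func(batch []string)
}

// add buffers one item and flushes once the batch is full.
func (b *batcher) add(item string) {
	b.buf = append(b.buf, item)
	if len(b.buf) >= b.size {
		b.flush()
	}
}

// flush exports whatever is buffered and starts a fresh buffer.
func (b *batcher) flush() {
	if len(b.buf) == 0 {
		return
	}
	b.export(b.buf)
	b.buf = nil
}

func main() {
	b := &batcher{
		size: 2,
		export: func(batch []string) {
			fmt.Println("export:", batch)
		},
	}

	for _, item := range []string{"a", "b", "c"} {
		b.add(item)
	}
	b.flush() // export the remaining partial batch
}
```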