Structured logs for relayer logic #1491

adizere · 2021-10-25T16:23:21Z

Closes: #1537

Description

For contributor use:

Added a changelog entry, using unclog.
If applicable: Unit tests written, added test to CI.
Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
Updated relevant documentation (docs/) and code comments.
Re-reviewed Files changed in the Github PR explorer.

relayer-cli/src/commands.rs

relayer/src/connection.rs

config.toml

relayer/src/chain/cosmos.rs

relayer/src/channel.rs

relayer/src/chain/tx.rs

relayer/src/link/relay_path.rs

relayer/src/supervisor/spawn.rs

mzabaluev

Looks good, but I've made some changes that may need another look.

adizere

Thanks for the adjustments, Mikhail! I think we should shorten the tracking ids a bit, because they can make the output lines very long. I left suggestions for that.

Can we also signal the changes in the changelog? https://github.com/informalsystems/ibc-rs/blob/master/CONTRIBUTING.md#examples

adizere · 2022-01-05T09:08:46Z

relayer/src/chain/cosmos.rs

@@ -1111,7 +1112,7 @@ impl ChainEndpoint for CosmosSdkChain {
    ) -> Result<Vec<Response>, Error> {
        crate::time!("send_messages_and_wait_check_tx");

-        let span = span!(Level::DEBUG, "send", id = %tracked_msgs);
+        let span = span!(Level::DEBUG, "send_tx_check", id = %tracked_msgs.tracking_id());


Thanks, this is clearer!

relayer/src/channel.rs

relayer/src/connection.rs

adizere · 2022-01-05T09:23:53Z

relayer/src/connection.rs

@@ -828,7 +828,7 @@ impl<ChainA: ChainHandle, ChainB: ChainHandle> Connection<ChainA, ChainB> {
            .map_err(|e| ConnectionError::chain_query(self.dst_chain().id(), e))?;
        let client_msgs = self.build_update_client_on_src(src_client_target_height)?;

-        let tm = TrackedMsgs::new(client_msgs, "create connection");
+        let tm = TrackedMsgs::new(client_msgs, "update client on source for ConnectionOpenTry");


I can't think of a shorter way to formulate this without losing context. I guess we can keep it as it is.

relayer/src/connection.rs

relayer/src/link/relay_path.rs

adizere

Once we do the release (merge #1712), let's merge this. Thanks Mikhail!

adizere · 2022-01-10T13:49:03Z

Post-merge artifact that might need our attention: when we encounter an account sequence number mismatch error, we see the following error message missing any tracking id:

2022-01-10T13:45:14.341974Z WARN ThreadId(1723) task PacketWorker(ibc-1:transfer/channel-0 -> ibc-0) encountered ignorable error: link errror: failed with underlying error: gRPC call failed with status: status: InvalidArgument, message: "account sequence mismatch, expected 86, got 85: incorrect account sequence: invalid request", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }

mzabaluev · 2022-01-10T19:04:47Z

@adizere The problem with that tracing entry is that it is produced higher up in the call stack than the specific task that creates the tracking id.

relayer/src/util/task.rs

mzabaluev · 2022-01-11T13:21:33Z

relayer/src/worker/client.rs

+        span!(
+            tracing::Level::ERROR,
+            "DetectMisbehaviorWorker",
+            client = %client.id,
+            src_chain = %client.src_chain.id(),
+            dst_chain = %client.dst_chain.id(),
+        ),


This is a bit more repetitive than the previous approach, but this allows filtering logs by any of the individual ID fields.

If this level of detail is not desirable, we could implement tracing::Value on the objects that were logged with Display (removing the issue of implementing a general-purpose formatting trait just for logging purposes), and resurrect the previous compact string representation.

mzabaluev · 2022-01-11T13:23:34Z

relayer/src/worker/client.rs

+        let _span = span!(
+            tracing::Level::DEBUG,
+            "DetectMisbehaviorFirstCheck",
+            client = %client.id,
+            src_chain = %client.src_chain.id(),
+            dst_chain = %client.dst_chain.id(),
+        )


This could be folded into the "DetectMisbehaviorWorker" span below, but I thought it would be useful to distinguish these contexts since the other one is performed in a background worker task.

adizere · 2022-01-17T13:53:42Z

Mikhail and I had one more sync session on this PR, and we both agree that it is ready for merging. It's a great first step in making the relayer's logs less grating on the eyes (and the codebase more idiomatic). The last nit is the changelog.

Cool stuff.

* Printing tx hashes from SendPacket events.
  This was inspired from commit b63335b (see rpc.rs therein).
* log the tx hashes in ibc_channel event SendPacket
* Redo displaying for `OperationalData`
  Add `OperationalInfo` that can hold the displayable data on the batch, either borrowed or owned with transforming from first to the other. Implement `Display` on the `OperationalInfo` instead of `OperationalData` for clarity.
* Improve logging of operational data
  Use the `odata` key in tracing.
* Use a tracing span for task log messages
  The task lifetime is better tracked with a tracing span, which also reduces code repetition in tracking macros by putting in the task name once. spawn_background_task receives a Span constructed by the caller. This allows embedding contextual information for the task, which is used to reduce repetition in logging macros for the workers.
* Erase Display impl on RelayPath, use spans instead
  With spans injecting all of the information that was formatted using the RelayPath Display impl, this is redundant.
* Erase [rest] prefixes from log messages
  We have the span now, these are redundant and bad style.

Co-authored-by: Mikhail Zabaluev <mikhail@informal.systems>
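The `OperationalInfo` item in the commit message describes a displayable summary that can be either borrowed or owned. A minimal sketch of that idea follows, using `std::borrow::Cow`. The field names (`tracking_id`, `batch_len`), the `borrowed` constructor, and the output format are assumptions made for illustration and are not taken from the relayer code.

```rust
use std::borrow::Cow;
use std::fmt;

/// Hypothetical displayable summary of a batch of operational data.
/// The ID is borrowed when possible and owned when the summary must
/// outlive the batch it describes.
#[derive(Debug, Clone)]
pub struct OperationalInfo<'a> {
    tracking_id: Cow<'a, str>,
    batch_len: usize,
}

impl<'a> OperationalInfo<'a> {
    /// Builds a summary that borrows the tracking ID, so nothing is copied.
    pub fn borrowed(tracking_id: &'a str, batch_len: usize) -> Self {
        Self {
            tracking_id: Cow::Borrowed(tracking_id),
            batch_len,
        }
    }

    /// Converts the summary into one that owns its data; the string is
    /// copied only if it was still borrowed.
    pub fn into_owned(self) -> OperationalInfo<'static> {
        OperationalInfo {
            tracking_id: Cow::Owned(self.tracking_id.into_owned()),
            batch_len: self.batch_len,
        }
    }
}

impl fmt::Display for OperationalInfo<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} ({} msgs)", self.tracking_id, self.batch_len)
    }
}
```

Putting `Display` on a small summary type, rather than on the data type itself, keeps the logging format separate from the data it describes. This appears to be the motivation given in the commit message.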

adizere added 4 commits October 25, 2021 11:49

More structure in logs, pass 1

a29c9c7

Pass 2

4df293f

Pass 3

5ae8e42

Resolving todos, refactoring

50ff524

adizere requested review from ancazamfir and romac as code owners October 25, 2021 16:23

adizere marked this pull request as draft October 25, 2021 16:23

Better config.toml comment

794a7bf

adizere linked an issue Nov 23, 2021 that may be closed by this pull request

Structured logs: add identifiers & tx hashes in log output #1537

Closed

9 tasks

adizere commented Nov 25, 2021

relayer-cli/src/commands.rs (resolved)

adizere commented Nov 25, 2021

relayer/src/connection.rs (outdated, resolved)

adizere added 2 commits November 26, 2021 10:24

Merge branch 'master' into adi/structured_logs

cc5e158

Post-merge fixes

d75c832

mzabaluev self-assigned this Dec 17, 2021

mzabaluev and others added 6 commits December 17, 2021 18:19

Merge branch 'master' into adi/structured_logs

9d17f44

Post-merge fix

a2c3311

Sketch: printing tx hashes from SendPacket events.

5906035

This was inspired from commit b63335b (see rpc.rs therein).

log the tx hashes in ibc_channel event SendPacket

74913b7

Improve code to print out the tx hash

dcf662a

Just examine the hash map, do not allocate any copies.

Actually enter the tracing span

f9ca4db
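The "Improve code to print out the tx hash" commit above says it examines the hash map without allocating copies. A minimal sketch of such a lookup is shown here, assuming the event attributes arrive as a `HashMap<String, Vec<String>>` and that the hash is stored under a `tx.hash` key. The map type, the key name, and the `tx_hash` helper are assumptions for illustration and are not taken from the PR.

```rust
use std::collections::HashMap;

/// Returns the first value stored under the `tx.hash` key, if any.
/// The result borrows from the map, so no `String` is cloned.
pub fn tx_hash(events: &HashMap<String, Vec<String>>) -> Option<&str> {
    events.get("tx.hash")?.first().map(String::as_str)
}
```

Because the returned `&str` borrows from the map, it can be passed straight to a logging macro while the map is still alive.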

adizere marked this pull request as ready for review December 22, 2021 14:53

adizere mentioned this pull request Dec 22, 2021

CLI create channel can fail and spam with unused clients/connections #1421

Closed

7 tasks

mzabaluev reviewed Dec 22, 2021

adizere and others added 7 commits December 23, 2021 09:25

Apply suggestions from code review

5de3f2b

Co-authored-by: Mikhail Zabaluev <mikhail@informal.systems>

Comment explaining TrackedMsgs

1d15085

Removed use of TrackedEvents Display impl

60e66e6

Merge branch 'master' into adi/structured_logs

4d94d54

Erase Display impl for TrackedMsgs

d189d2c

Add tracking_id method to use in tracing instead. Also, use less cryptic span names in send_messages_*.

Allow passing IDs without copy in TrackedMsgs

3166f76

Make the tracking ID construction parameter generic to accept any Into<String> parameter.

Different tracking ids for creation flows

c93249b

Identify TrackedMsgs batches for creating channels and connections with the specific message that is used.
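The `TrackedMsgs` commits above describe a generic `Into<String>` constructor parameter and a `tracking_id` accessor that replaces the `Display` impl. A minimal sketch of that shape follows. The `Msg` type is a hypothetical stand-in for the relayer's real message type, and the field layout and the `messages` accessor are assumptions.

```rust
/// Hypothetical stand-in for the relayer's encoded message type.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg(pub String);

/// A batch of messages tagged with an ID used to correlate log lines.
#[derive(Debug, Clone)]
pub struct TrackedMsgs {
    msgs: Vec<Msg>,
    tracking_id: String,
}

impl TrackedMsgs {
    /// Accepts `&str`, `String`, or anything else convertible into a
    /// `String`; an owned `String` is moved in without being copied.
    pub fn new(msgs: Vec<Msg>, tracking_id: impl Into<String>) -> Self {
        Self {
            msgs,
            tracking_id: tracking_id.into(),
        }
    }

    /// The ID to record in a tracing field.
    pub fn tracking_id(&self) -> &str {
        &self.tracking_id
    }

    /// The messages in this batch.
    pub fn messages(&self) -> &[Msg] {
        &self.msgs
    }
}
```

With an accessor like this, a call site can record the ID as a span field, as in the `id = %tracked_msgs.tracking_id()` form shown in the review diff, without the whole batch needing a `Display` impl.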

mzabaluev added 2 commits December 23, 2021 15:07

Deabbreviate an info level log message

6df6b63

Improve logging of operational data

85c60fa

Use the `odata` key in tracing.

mzabaluev approved these changes Dec 23, 2021

adizere commented Jan 5, 2022

mzabaluev added 2 commits January 5, 2022 13:30

Merge branch 'master' into adi/structured_logs

b8e9c7c

Remove verbose wording on TrackedMsgs IDs

d84bccd

The operation name is usually enough.

adizere commented Jan 6, 2022

mzabaluev added 3 commits January 10, 2022 20:26

Merge branch 'master' into adi/structured_logs

c5f7d87

Fix typos in descriptions of RunError variants

0507161

Use a tracing span for task log messages

914373c

The task lifetime is better tracked with a tracing span, which also reduces code repetition in tracking macros by putting in the task name once.

adizere commented Jan 11, 2022

relayer/src/util/task.rs (outdated, resolved)

Rework tracing spans for background tasks

d1ae093

spawn_background_task receives a Span constructed by the caller. This allows embedding contextual information for the task, which is used to reduce repetition in logging macros for the workers.

mzabaluev force-pushed the adi/structured_logs branch from 4f67e34 to d1ae093 Compare January 11, 2022 13:11

Merge branch 'master' into adi/structured_logs

108d081

mzabaluev reviewed Jan 11, 2022

mzabaluev and others added 5 commits January 11, 2022 17:34

Erase Display impl on RelayPath, use spans instead

e6b41f9

With spans injecting all of the information that was formatted using the RelayPath Display impl, this is redundant.

Shorten or remove span IDs for supervisor tasks

73d135e

For the batch worker tasks, we should have a span with the task name and other details, so the top-level span is not needed. For other supervisor tasks, the span name is shortened.

Erase [rest] prefixes from log messages

6cb9023

We have the span now, these are redundant and bad style.

Merge branch 'master' into adi/structured_logs

c69aa81

Simplification & consolidation w/ Mikhail

7ae8880

mzabaluev added 2 commits January 17, 2022 16:10

Changelog entry for #1491

bfec7d6

Merge branch 'master' into adi/structured_logs

45a9761

mzabaluev merged commit 2757031 into master Jan 17, 2022

mzabaluev deleted the adi/structured_logs branch January 17, 2022 14:42

mzabaluev mentioned this pull request Jan 19, 2022

Fast start for chains configured with an allow list #1705

Merged

6 tasks

mzabaluev mentioned this pull request Feb 8, 2022

Reverse PendingTx chain id args and remove its Display impl #1843

Merged

6 tasks
