Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(iroh, iroh-gossip): introduce log-self feature #1544

Closed
wants to merge 7 commits into from

Conversation

divagant-martian
Copy link
Contributor

@divagant-martian divagant-martian commented Sep 28, 2023

Description

Aff a feature flag to turn on logging our own peer_id in short form and plug this in in sync, gossip and the downloader. This is enabled by default (not sure if we want this for releases thought. Having it as a feature flag at least gives us the option to chose.

This is what I did to debug #1531 and it was really useful

Notes & open questions

Log sample:

Sync

2023-09-28T20:51:18.908126Z DEBUG sync{me=oywqb57ovmdry243}: iroh::sync_engine::live: sync[dial]: start namespace=NamespaceId(q47zmyim5r6isz22…) peer=PublicKey(2jnygwapdm26wwa2) reason=DirectJoin last_state=None

Gossip

2023-09-28T20:58:42.854730Z DEBUG gossip{me=zooj7iazcl7rsoal}: iroh_gossip::net: handle out_event EmitEvent(TopicId(wno5nwxtkhhtnqsm…), Received(GossipEvent { content: <33b>, delivered_from: PublicKey(7a7kkzndvbt6eulu), scope: Neighbors }))

Downloader

2023-09-28T21:00:11.107411Z DEBUG downloader{me=4vabufwku3wselbl}: iroh::downloader: download completed peer=2jnygwapdm26wwa26tvcprm3m5vqajmhwz7lx5xizwx637ad5sea kind=Blob { hash: Hash(224746ea89d286220e0770f89cda2ab138143b00e384dd795d5a13b77b094822) }

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.

@divagant-martian divagant-martian changed the title feat(iroh, iroh-gossip0: introduce log-self feature feat(iroh, iroh-gossip): introduce log-self feature Sep 28, 2023
Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not always do this instead of the extra feature and conditional compilation? If it's useful, which it clearly is, then let's just always do it.

if let Err(err) = actor.run().await {
let fut = actor.run();
#[cfg(feature = "log-self")]
let fut = fut.instrument(trace_span!("gossip", %me));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is a spawned future I'd strongly recommend to use an info_span. And also to not make it depend on the feature. Every task should have it's own info_span attached as a general rule.

The span should also cover the bit of error handing in the lines below that are still inside this spawned task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why info_span? looking throughout the code we still have trace_span once in baomap, debug_span 16 times. Looking at what Frando did in his own branch for his needs he went with trace as well and it seems to work well. What's the difference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My general guideline is to use info_span for spawned tasks, motivation:

  • If something is spawned it usually is a "top level task" of some sort (with a serious pinch of salt).
  • In this case you want to see most log messages with that span attached
  • and if you set the level to info, debug or trace you will see that attached span by default
  • (bonus: this matches the "info is for operators" guidance we just discussed)

This does not mean we already do this everywhere consistently. I've done a few PRs in the past making this more consistent but it certainly isn't complete.

Now I understand your reluctance to always having this logging on and that having this as debug or trace makes it even easier to hide. but i do really think this is more useful at info.

Copy link
Contributor Author

@divagant-martian divagant-martian Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok just to be clear about my question I don't understand what this changes, not sure what the effect of this is.

Is the effect to add the fields only from info and above? if so then don't think this will help in debugging tests
if the effect is to limit to info and below then it's probably better

but then it's confusing that trace still gives me debug logs with peer_ids.

TLDR: I'm confused about what the levels do here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also https://discord.com/channels/949724860232392765/950683937661935667/1156506832567808060 where @Frando is also asked me this question and I (hopefully!) gave a similar answer.

@@ -237,6 +241,12 @@ impl Downloader {

let service = Service::new(getter, dialer, concurrency_limits, msg_rx);

#[cfg(feature = "log-self")]
{
service.run().instrument(trace_span!("downloader", %me))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, it should always attach the span and be info_span as it's run in a task. Again try to do it around the entire spawned future and not just the service.run() method.

if let Err(err) = actor.run().instrument(span).await {
let fut = actor.run();
#[cfg(feature = "log-self")]
let fut = fut.instrument(trace_span!("sync", %me));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same note, info_span and cover entire future that's spawned into the task.

@Arqu
Copy link
Collaborator

Arqu commented Sep 29, 2023

This might / will be PII eventually, so having the option to scrub it is nice.

@dignifiedquire
Copy link
Contributor

This might / will be PII eventually

unlikely, peerids are sent over the network and shared as identifiers.
if we really need to scrub PII we should do so on themlog storage side, as there is already much more in the logs, like ip addresses

@b5 b5 added this to the v0.7.0 milestone Sep 29, 2023
@Frando
Copy link
Member

Frando commented Oct 2, 2023

Yeah, let's add this. Test logs with multiple nodes are basically unusable without. I did parts of this in #1530 but let's get this in here!

@flub has more experience with tracing then I do so don't have much to add on the implementation front.

In addition to the places where you added the span, I'd recommend to...

  • also add a similar span to the main node actor loop
  • change the span in iroh-net to use the same format (%me) with me = peer_id.fmt_short() instead of the magicsock name that is used currently
    but those could also be followups

@divagant-martian
Copy link
Contributor Author

can people comment on deciding to make this a compile time flag? @Frando @dignifiedquire @flub (heck we are four here let's just bring @ramfox and @b5 ) my thinking is that logs are already very hard to read. Your own peer id is useful only when you are dealing with multiple nodes in the single run.

This is not the case of users in general, and I would not include this in release builds by default for example. Basically I think it's very nice to be able to have cleaner logs by just removing a feature because this adds "a lot" to already noisy logs

@divagant-martian
Copy link
Contributor Author

agreed feature is not necessary

@flub
Copy link
Contributor

flub commented Oct 2, 2023

agreed feature is not necessary

for completeness: everyone agreed the logging added here is super useful and we should totally have it. but we prefer it always-on, so without the crate-level feature to control it.

i don't mind if you want to re-open this and remove the crate-feature or re-create the PR. though I think it's a reasonable refactor to do to this PR so re-opening might be easier.

@divagant-martian divagant-martian mentioned this pull request Oct 2, 2023
3 tasks
github-merge-queue bot pushed a commit that referenced this pull request Oct 2, 2023
## Description

second take at #1544 since forced pushed closed prs can't be reopened

#### Magicsock
```
2023-10-02T18:46:34.455601Z DEBUG magicsock{me=2luekswh7o3a5tz4}:derp.actor:recv_detail:client-connect{key=2luekswh7o3a5tz4enymovsoksgnpb2qpmxlvifp6ywwjnacihya}:connect_0: rustls::client::tls13: TLS1.3 encrypted extensions: [ServerNameAck]
```
#### Sync
```log
2023-09-28T20:51:18.908126Z DEBUG sync{me=oywqb57ovmdry243}: iroh::sync_engine::live: sync[dial]: start namespace=NamespaceId(q47zmyim5r6isz22…) peer=PublicKey(2jnygwapdm26wwa2) reason=DirectJoin last_state=None
```
#### Gossip
```logs
2023-09-28T20:58:42.854730Z DEBUG gossip{me=zooj7iazcl7rsoal}: iroh_gossip::net: handle out_event EmitEvent(TopicId(wno5nwxtkhhtnqsm…), Received(GossipEvent { content: <33b>, delivered_from: PublicKey(7a7kkzndvbt6eulu), scope: Neighbors }))
```

#### Downloader
```
2023-09-28T21:00:11.107411Z DEBUG downloader{me=4vabufwku3wselbl}: iroh::downloader: download completed peer=2jnygwapdm26wwa26tvcprm3m5vqajmhwz7lx5xizwx637ad5sea kind=Blob { hash: Hash(224746ea89d286220e0770f89cda2ab138143b00e384dd795d5a13b77b094822) }
```

## Notes & open questions

probably we will add this in other places or change the logging level
but would be good to get this in to at least have something over which
we can iterate

## Change checklist

- [x] Self-review.
- [ ] Documentation updates if relevant.
- [ ] Tests if relevant.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

6 participants