Discovery background processing feedback #162

amydevs · 2024-03-25T05:01:49Z

Specification

There should be status indicators for TaskManager tasks related to the Discovery domain.

This command should provide feedback on the progress of discovery tasks of gestalts, and what gestalts have been recently discovered.

Ideally there should also be a way to cancel some of these tasks.

Discovery events

Discovery is already evented since it's async-init decorated. But we need to updated to have it's usual domain specific events along with discovery related events. We need an event for the following. More events may be added if any seem relevant.

Vertex queued
Vertex Processed
Vertex failed to process
Vertex culled from future processing due to failing to process for too long

~`audit discovery` command~

NOTE: this section is being split out to a new issue/PR that addresses the audit domain generally.

We need to add the audit domain and the audit discovery command. It will function similar to connection auditing where we provide an endless stream of the above events.

Part of the seeking will include a start point and an end point. Both of these are optional and can be set in the past or the future. When outputting events, it will stream the history from the start timestamp to the end timestamp. If the endpoint is in the past then the command will yield the history till that point and then end. If the start point or end point is into the future then the command will wait until the end point or cancellation to end. Future events will be yielded as they happen.

We want to add some degree of filtering to the events that are yielded. We could allow filtering for a specific or collection of nodes/identities. But also possibly filter on a vertex's parent or gestalt. If filtering is added then we'd need to work out how to specify filters as CLI options and how they interact.`

`identities discover` command

This command needs to be modified to provide feedback. Normally this command is non-blocking where it just adds a vertex to the queue and returns. But we need to add a --blocking flag or some flag that indicates feedback. When enabled this command will output events related to the vertex it queued and it's children vertices. Then command will end when all child vertices are processed. It could also be cancelled the normal way by aborting the command.

When events are printed out, they should be consistent between the two commands.

Additional context

MatrixAI/Polykey#691

Tasks

1. Add relevant discovery events to be emitted by the discovery domain.
1. ~Implement audit discovery command.(Name TBD)~ - Being addressed in a new PR for the audit domain
3. Implement feedback for identities discover command. Command will default to non-blocking with no feedback unless flagged for blocking.

The text was updated successfully, but these errors were encountered:

linear · 2024-03-25T05:02:22Z

ENG-28 `identities status` Command

tegefaulkes · 2024-03-26T00:35:37Z

I feel like this should be an endless stream command much like the audit domain. Where it in real time outputs steps the discovery tasks are taking. It should be pretty useful for debug too.

I'm also thinking the manual discovery command will do the same but only output the events relevant to it.

tegefaulkes · 2024-03-28T02:19:33Z

I think this is more of a generic feedback for discovery issue. To fully address it we'd need a command that returns the events of all discovery happening. But also identities discover should be updated to provide feedback as well.

`identities discover` command

The identities discover command is currently non-blocking. All it does is add a vertex to the discovery queue. So for it to provide the feedback it needs a blocking version of the logic. Where it queues a vertex. but also waits for all discovery events relating to that first vertex. But given that discovery functions as a search algorithm, the discovery doesn't keep good track of what triggered it. So we'll need to maintain a tree of what triggered each event so we can track and do some detailed filtering of the events.

For example, we queue a vertex A, this discovers B and C. both B,C are queued for the next step, Say B discovers D. We'd form a tree of A -> B -> D, A - C. If we tracked the parents for each step we'd be able to simply filter the events for a general group of discovery. This will be very useful for the other command. Also note that we could have a vertex E that links to C as well that was started separately. This would make E and A a parent to C.

But does that make sense? A link is bi-directional, so its really just and edge and not a parent. By that logic the filtering on a per gestalt basis. I suppose that makes sense since gestalts is the only real way we group vertices anyway.

discovery auditing.

Right now this issue is called identities status Command. But in reality the functionality is an auditing feature very similar to the connections auditing we added a while back. This can just be an endless stream of discovery events including the all of the history. We can include the filtering mentioned above as well. We can use the connection auditing command as a template here, since it's identical but for discovery events rather than connection events.

So the auditing command will be the big picture history and version of the feedback. While the identities discover command will provide feedback just for the discovery tasks it initiated.

Any thoughts about this @amydevs ?

amydevs · 2024-03-28T02:28:52Z

I originally thought that identities discover should still be non-blocking, but identities status would instead just show indicators for all discovery tasks as to their completion progress. Whilst, audit events would show the raw events that the indicators would derive from.

tegefaulkes · 2024-03-28T03:06:13Z

For the most case yeah, identities discover would be non-blocking. But we can add a --block flag that waits for discovery to complete. Or a feedback flag that makes it long-running with the feedback.

amydevs · 2024-03-28T03:06:59Z

ah alright

tegefaulkes · 2024-04-10T01:02:33Z

Along with this, we also need to list out queued discovery tasks. So I suppose it will list queued, then move on to emitting events when they are processed? Not sure, needs more spec.

tegefaulkes · 2024-04-23T00:49:17Z

The GestaltIdentityIdEncoded uses " marks in it's formatting. This means it needs escape characters when used in the JSON format or if we wanted to pass it through a command flag. We may want to look into using an alternative format for it.

CMCDragonkai · 2024-04-23T02:21:03Z

Can you give examples of this?

tegefaulkes · 2024-04-23T02:34:03Z

Human readable

    queued         identity-["test-provider","abc"]
    queued         node-v6np16s1ui7m70sm802lne5pc923uk7ln1shoesujjg9h0are4r2g
    processed      identity-["test-provider","abc"]
    queued         node-v6k1uedkaqogphvm4lv5af1i8q8ofclvv45223ou9tm3jtk8ggntg
    processed      node-v6np16s1ui7m70sm802lne5pc923uk7ln1shoesujjg9h0are4r2g
    processed      node-v6k1uedkaqogphvm4lv5af1i8q8ofclvv45223ou9tm3jtk8ggntg

Formatted as JSON

   {"event":"queued","vertex":"identity-[\"test-provider\",\"abc\"]"}
    {"event":"queued","vertex":"node-vm6p5coet30377qql6vpm0oc6l9i5v4q6bn3e80digsv3ii4s9mm0"}
    {"event":"processed","vertex":"identity-[\"test-provider\",\"abc\"]"}
    {"event":"queued","vertex":"node-v4kjv3139hij4kv609e1dc53ruujhhlhmbuirh8uu8vh06j6kcbi0"}
    {"event":"processed","vertex":"node-vm6p5coet30377qql6vpm0oc6l9i5v4q6bn3e80digsv3ii4s9mm0"}
    {"event":"processed","vertex":"node-v4kjv3139hij4kv609e1dc53ruujhhlhmbuirh8uu8vh06j6kcbi0"}

So the format of identity-["test-provider","abc"] is not ideal in JSON Or if we had to provide it as a CLI option. I'm not sure how we should format it though. Since the ProviderId and IdentityId can be pretty permissive, likely URL valid format names, We're pretty limited on how to delimit it. Possibly using |? as in identity|test-provider|abc or identity|abc@test-provider?

CMCDragonkai · 2024-04-23T07:19:20Z

The encoded version looks fine to me. Why is it a problem? I don't really want to introduce custom delimitation. It should be easily machine parseable.

tegefaulkes · 2024-04-24T02:31:48Z

I'm just thinking if we had to provide an encodedGestaltId via a command flag, having to include escape characters when writing that out would be clunky.

CMCDragonkai · 2024-04-24T07:24:31Z

It is fine because CLI commands use the providerId:identityId format.

This is the case with polykey identities claim and polykey identities discover. When it is possible to share to a gestalt, then polykey vaults share providerId:identityId should be possible too. And : is a restricted character for hostnames. https://stackoverflow.com/a/3523068/582917

CMCDragonkai · 2024-04-24T07:25:11Z

It looks like what you're describing is logging output/auditing output. In that case, it may continue to work… since its more diagnostic. But we reserve the right to change it the more we test it.

CMCDragonkai · 2024-04-24T07:59:43Z

So if it is now:

polykey audit discovery - this should be going into a new audit issue - this is whole new polykey CLI command
polykey identities discover provideId:identityId --feedback - where a flag makes the command blocking… and shows the status of this. But I'm not sure about --feedback. In a recent discussion we have --force, --follow for streaming the data, I think we can use --follow here even if --seek and --limit doesn't apply here.

CMCDragonkai · 2024-04-24T08:01:17Z

Actually since --feedback still terminates, --follow doesn't make sense.

CMCDragonkai · 2024-04-24T08:01:30Z

How about --monitor?

CMCDragonkai · 2024-04-24T08:02:01Z

Link up the new issue for audit commands then here too @brian.botha. Especially polykey audit P1 P2 P3 … where P1 can be topic.topic.topic.

tegefaulkes · 2024-04-24T08:02:39Z

Some minor details that need to be addressed

--future-events should be called --follow
--feedback should be called --monitor
flags specific to a command should not have a short option.
Update the spec of the identities status issue.

CMCDragonkai · 2024-04-24T08:03:18Z

The title of this issue isn't accurate, since there is not identities status command anymore.

tegefaulkes · 2024-04-24T08:03:56Z

An additional Issue and PR needs to be made for the audit domain.

The current audit discovery changes need to be split out into a new PR. A new issue for a generic audit domain command needs to be made and speced out. Paths can be specified as a list of dot paths a.b.c, a.b. Multiple of these paths can be specified. Some optimization needs to be done to reduce work done when iterating over multiple paths. We're opting for a multiple iterator zipper method to achieve this.

CMCDragonkai · 2024-04-24T08:08:21Z

Yes you would want to filter out all subpaths of polykey audit P1 P2 P3 …to eliminate unnecessary work.

tegefaulkes · 2024-05-03T00:45:58Z

I accidentally linked to the wrong linear issue when creating a new PR. As a result this got automatically re-opened. Closing it now.

amydevs added the development Standard development label Mar 25, 2024

CryptoTotalWar mentioned this issue Mar 25, 2024

Failure to Update Gestalt with New Node ID and Issues with Discovery Mechanism Transparency #163

Closed

tegefaulkes mentioned this issue Mar 26, 2024

CLI Beta Launch #40

Closed

amydevs mentioned this issue Mar 28, 2024

General Discovery fixes and features MatrixAI/Polykey#692

Closed

tegefaulkes closed this as not planned Won't fix, can't repro, duplicate, stale Apr 11, 2024

tegefaulkes self-assigned this Apr 11, 2024

tegefaulkes reopened this Apr 11, 2024

tegefaulkes changed the title ~~identities status Command~~ Discovery background processing feedback Apr 26, 2024

tegefaulkes mentioned this issue Apr 26, 2024

Audit CLI commands for inspecting audit event history #177

Closed

tegefaulkes closed this as completed Apr 29, 2024

tegefaulkes reopened this May 3, 2024

tegefaulkes closed this as completed May 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Discovery background processing feedback #162

Discovery background processing feedback #162

amydevs commented Mar 25, 2024 •

edited by tegefaulkes

Loading

linear bot commented Mar 25, 2024

tegefaulkes commented Mar 26, 2024

tegefaulkes commented Mar 28, 2024

amydevs commented Mar 28, 2024

tegefaulkes commented Mar 28, 2024

amydevs commented Mar 28, 2024

tegefaulkes commented Apr 10, 2024

tegefaulkes commented Apr 23, 2024

CMCDragonkai commented Apr 23, 2024

tegefaulkes commented Apr 23, 2024

CMCDragonkai commented Apr 23, 2024

tegefaulkes commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024 •

edited

Loading

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024 •

edited

Loading

tegefaulkes commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

tegefaulkes commented Apr 24, 2024 •

edited

Loading

CMCDragonkai commented Apr 24, 2024

tegefaulkes commented May 3, 2024

Discovery background processing feedback #162

Discovery background processing feedback #162

Comments

amydevs commented Mar 25, 2024 • edited by tegefaulkes Loading

Specification

Discovery events

~audit discovery command~

identities discover command

Additional context

Tasks

linear bot commented Mar 25, 2024

tegefaulkes commented Mar 26, 2024

tegefaulkes commented Mar 28, 2024

identities discover command

discovery auditing.

amydevs commented Mar 28, 2024

tegefaulkes commented Mar 28, 2024

amydevs commented Mar 28, 2024

tegefaulkes commented Apr 10, 2024

tegefaulkes commented Apr 23, 2024

CMCDragonkai commented Apr 23, 2024

tegefaulkes commented Apr 23, 2024

CMCDragonkai commented Apr 23, 2024

tegefaulkes commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024 • edited Loading

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024 • edited Loading

tegefaulkes commented Apr 24, 2024

CMCDragonkai commented Apr 24, 2024

tegefaulkes commented Apr 24, 2024 • edited Loading

CMCDragonkai commented Apr 24, 2024

tegefaulkes commented May 3, 2024

amydevs commented Mar 25, 2024 •

edited by tegefaulkes

Loading

~`audit discovery` command~

`identities discover` command

`identities discover` command

CMCDragonkai commented Apr 24, 2024 •

edited

Loading

CMCDragonkai commented Apr 24, 2024 •

edited

Loading

tegefaulkes commented Apr 24, 2024 •

edited

Loading