Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kinds 10 and 11 to prevent race conditions when updating contact lists #349

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

staab
Copy link
Member

@staab staab commented Mar 10, 2023

A very common experience on Nostr is that of "losing follows" due to race conditions when sending kind 3 events. Earlier this week someone signed in to Coracle, their contact list failed to fully sync before they followed someone, and they ended up deleting all their follows. The only solution a client can implement to avoid this currently is to disallow certain functionality if a person's account isn't fully synced, but of course, you don't know what you don't have, it's always possible a user hasn't yet followed anyone.

This change is backwards compatible, and simply introduces two new event kinds to solve the above problem. There's no reason to remove kind 3, since there are valid use cases when a user might want to definitively say, "this is my contacts list".

The same problem exists for relay lists to a less severe degree (because relay list cardinality is lower), and for #183, but I figured we'd have the conversation here first.

Sources:

nostr:nevent1qyt8wumn8ghj7etyv4hzumn0wd68ytnvv9hxgtcpz4mhxue69uhhyetvv9ujuerpd46hxtnfduhszythwden5te0dehhxarj9emkjmn99uq3kamnwvaz7tmjv4kxz7fwdaexzmn8v4cxjmrv9ejx2a30qy2hwumn8ghj7un9d3shjtn4w3ux7tnrdakj7qghwaehxw309aex2mrp0yhxummnw3ezu6twvehj7qgewaehxw309aex2mrp0yhxummnw3exzarf9e3k7mf0qyt8wumn8ghj7un9d3shjtnddaehgu3wwp6kytcqyr9yfkqe2yuky5lyl6wgexn6s6wjc9erntn9zp20j5cqdlk6qywsgnfgvxn)

@staab
Copy link
Member Author

staab commented Mar 10, 2023

One potential implementation difficulty is that if someone follows/unfollows a single user multiple times, when streaming the events back in out of order they'll have to keep track of when each pubkey's follow status was last updated.

@fiatjaf
Copy link
Member

fiatjaf commented Mar 10, 2023

Interesting. This looks useful for more than what you describe.

@mikedilger
Copy link
Contributor

mikedilger commented Mar 10, 2023

I like to think of the ContactList sync problem fundamentally being that you create an event that doesn't account for past events you didn't know about. This can be solved as suggested here with "diff" type events.

But most software that deals with merging things that have gone out of sync (git, palm OS sync, AvantGo's thing) has over time been done by storing different states, and computing differences from a common ancestor, and not storing the differences themselves. The reasons are subtle I think and have to do with missing diffs just like you might miss contact lists.

In any case, all we are missing from doing that kind of diff algorithm is the link back to the previous contact list. If every contact list had a link back to the previous contact list that it thought it was replacing, we could find a common ancestor, do the diff algorithms, and merge without needing kinds 10 and 11, and I think it would avoid future issues with diffs themselves going missing.

Using the motivating example of this PR: User starts coracle and adds a new person to follow. Coracle generates a ContactList event B with just 1 person, pointing back to NULL (somehow indicating it has no prior). Then Coracle receives a contact list A pointing back to NULL that was created earlier. This triggers it to create a merged contact list C which it then pushes out, and that merged list points back to both A and B.

This handles deletions just as well as insertions. DIFF(NULL,A) may have both deletions and insertions. Same with DIFF(NULL,B). Then you apply all those deletions and insertions. In the event of a collision, you just pick one, probably the latest one.

If you have a ContactList and didn't have the previous one, you don't know the full set of diffs to apply, but at least you know they are missing, whereas with diff events you don't.

@mikedilger
Copy link
Contributor

Having thought a bit more on this, I'm not sure about my idea. It could be better or worse and I can't decide. You would have to keep old ContactLists to find a common ancestor so it couldn't be replaceable meaning we'd need a new kind. And I'm not actually sure what the downsides of using patches (kind 10/11 as suggested) are, I just have a belief that the industry learned not to do that over time, but my understanding is not nuanced enough to really make any determination without doing some research.

@monlovesmango
Copy link
Member

monlovesmango commented Mar 11, 2023

in astral I just created a warning every time someone creates their first follow to confirm that they don't actually have any follows.

my gut feeling about this pr is that its is trying to solve an edge case that is only an issue bc of our lack of targeted relay strategy. as we adopt @mikedilger 's proposal for relay metadata lists this should really cease to be an issue. generally it feels wrong and overly complex to have 3 different kinds just for your follows list.

without fixing the relay strategy first, we are likely to have the same issues we see with kind 3 with kind 10 and 11 as well. what in this proposal makes kind 10 and 11 more reliably retrievable than kind 3? and once relay strategy is more targeted, will kind 3 really be so unreliable to find that we still need kind 10 and 11?

if we do move forward on this pr, can we have more recommendations around when kind 3, 10, and 11 should actually be created? should all three be created with every contact list update? if only 10 and 11 should be updated regularly, when should we be updating kind 3? and if kind 3 isn't updated every time, is this really backwards compatible for older clients? when kind 10 is initially created should it contain all existing follows from kind 3?

@vitorpamplona
Copy link
Collaborator

vitorpamplona commented Mar 12, 2023

I am in favor of killing the idea of a contact list altogether. I know this breaks current implementations and makes it heavier to assemble a user's home feed, but here's the thing: the list shouldn't even exist. It's not how people think about their follows. The current spec forces a consensus (order of events) that is not required.

This is equivalent to Nostr not needing a blockchain: the consensus of an order of events is not required for a social protocol.

Think about contacts as likes (kind 7). If you like a user, you are following it. If you delete the like, you stop following. Same for relay lists. Follows is a simple type filter with authors=me and my followers is a similar type filter, but with p == me.

Then we also kill the aberration of a sub-10000 replaceable event.

@monlovesmango
Copy link
Member

at first glance, I really like this idea @vitorpamplona . making an individual follow event for each person you follow seems pretty straightforward and eliminates the possibility of wiping existing contact lists.

@staab
Copy link
Member Author

staab commented Mar 13, 2023

As I work more with it, I think Nostr is basically an event-sourcing database with domain objects implicitly layered on top. In this PR I started by trying to define new "edit operations" on our implicit "contact list" domain object because the current ones don't represent what the user wants to do. But we should be thinking about what the user wants to do and modeling those operations instead. I think @mikedilger's proposal is more in line with "editing implicit domain objects directly", and @vitorpamplona's observation gets at the heart of the issue (although I think there are use cases for saying "forget everything I said before, THIS is my contact list", so I don't see a reason to deprecate kind 3's).

I looked briefly into whether CRDTs would help with this problem, and I don't think they're really relevant, except to say that a list with independent set, add, and remove operations is a CRDT (CRDTs are usually more complex because they deal with order or dependency of elements). The problem we currently have is that we're trying to map add/remove onto the set operation, resulting in apparent conflicts.

Edit: another way to think of this problem is by re-framing "kind" as an operation rather than a data type. Kind 3 is not a "contacts list", it is a "set contacts list operation".

Some relevant conversation also exists on the NIP 65 discussion at #218 (comment) and following.

@vitorpamplona
Copy link
Collaborator

vitorpamplona commented Mar 13, 2023

The problem we currently have is that we're trying to map add/remove onto the set operation, resulting in apparent conflicts.

The real problem is the thinking that there is a single contact list shared among the multiple relays a user has access to. You are assuming there is only ONE list to add/remove to. Which is incorrect. The majority of active users have outdated lists in old relays, either on purpose (because in that relay, they follow more people) or not. Any implementation of edit operators must consider diverging lists.

@fabianfabian
Copy link

@vitorpamplona no need to kill contact lists, they are useful for building a web of trust, they should be shared as a courtesy to others so they can be used as input against spam or other things.

Seperate from that It's up to the clients to decide how they implement their following feed, using whatever is coming from the relays as the source of truth is probably not a good idea.

@vitorpamplona
Copy link
Collaborator

no need to kill contact lists, they are useful for building a web of trust

A list is not required to build the web of trust. Individual events to demonstrate follows work just as well as an ordered list. My point is simply that the following list doesn't need to be ordered, or packed together in the same event.

@fiatjaf
Copy link
Member

fiatjaf commented Mar 13, 2023

I like Vitor's idea, which is more-or-less what this PR proposes. We'll have to live with both approaches, but eventually one may win against the other.

Is there a difference between deleting a "follow" event and publishing an "unfollow" event though? What is better?

@staab
Copy link
Member Author

staab commented Mar 13, 2023

To delete an event, you must know its id. For lots of reasons, you might not have that id, but especially if the follow came from a kind 3 or 10002.

@fiatjaf
Copy link
Member

fiatjaf commented Mar 13, 2023

Wait, kind 10002 has follows too?

@vitorpamplona
Copy link
Collaborator

vitorpamplona commented Mar 13, 2023

To delete an event, you must know its id.

From a UI perspective, you have to know that the user is following another to hit unfollow. You will have to have the event.

@staab
Copy link
Member Author

staab commented Mar 13, 2023

Wait, kind 10002 has follows too?

Sorry, I get confused between follow lists and relay lists, ignore that

From a UI perspective, you have to know that the user is following another to hit unfollow. You will have to have the event.

If the follow came from a kind 3, you don't want to delete the entire list and re-create it. More broadly, I think event deletions are a mistake, because they introduce a dependency graph between events. They also don't represent a user's intention in many cases.

@staab
Copy link
Member Author

staab commented Mar 16, 2023

It occurred to me this morning that another use case this supports is allowing clients to show notifications when you are followed/unfollowed by someone. Currently, since 3's are replaceable, there's no way to do a diff without keeping a cache of kind 3 for all users.

@Egge21M
Copy link
Contributor

Egge21M commented Mar 16, 2023

@staab can you give an example of how a kind 10 event would look like? If i understand this PR right, then clients would have to gather the latest kind3, and all kind 10 and 11 events that are older than the kind 3 in order to build a list, right? If I follow 2-5 keys per day, that would add up quite quickly, no?

@staab
Copy link
Member Author

staab commented Mar 16, 2023

I'm thinking something along the lines of: {kind: 10, content: "", tags: [["p", <pubkey1>], ["p", <pubkey2>]]}

all kind 10 and 11 events that are older than the kind 3 in order to build a list

All 10 and 11 events that are newer, but I expect that's what you meant. Pulling a few thousand events doesn't seem that bad to me, especially since the alternative is pulling a single event with thousands of tags instead.

@vitorpamplona
Copy link
Collaborator

vitorpamplona commented Mar 16, 2023

Practically speaking this proposal's performance and kind-3s alone are quite the same. For reference, Amethyst takes 5-10 seconds to parse 4000 kind-3s of your followers (to do the follower list/count).

Processing 4000 individual reaction events (kind 7) to do the same work (follower list/count) happens in less than 0.5 seconds.

@Egge21M
Copy link
Contributor

Egge21M commented Mar 16, 2023

Thanks for clarifying. I definitely agree that we need a less-destructive approach for follower-lists. Now I agree that the overhead of an event is quite slime, so having multiple events instead of a single one with a long tag-array does not matter performance wise, but I am worried about the connection.

Sometimes realys are slow and because of the nature of this approach I would have to wait on EOSE before beginning to construct a follower list, as otherwise we would be introducing a new race condition between kind 10 and 11 events.

Additionally clients would need to verify the signature verfication of every single event.

@arthurfranca
Copy link
Contributor

arthurfranca commented Mar 16, 2023

edit: I may have mixed follow/follower, I think it is fixed now

  • kind 3 CL only
    ⬇ [bad] slower followers counting cause it is big
    ⬇ [bad] may have race condition problem (maybe cause of bad client logic maybe not)
    ⬇ [bad] unknown max follows due to relay event size limit
    ⬆ [good] faster for getting all follows for loading follows feed
    ⬆ [good] easy to migrate event to other relay

  • 3 CL + 10 follow + 11 unfollow (by @staab)
    ⬇ [bad] slow followers counting cause it still uses CL along with extra events
    ⬇ [bad] legacy clients may still wipe CL;
    ⬇ [bad] slower for getting all follows, having to deal with follow+unfollow+follow_again_same_person
    ⬇ [bad] hard to migrate events to other relay (can get rate limited)
    ⬇ [bad] some relays may have incomplete set of events
    ⬇ [bad] not clear when to merge 10 and 11 into updated CL.
    ⬆ [good] may ease race condition problem
    ⬆ [good] "someone has followed you" notification
    ⬆ [good] "someone has stopped following you" notification

  • kind X follow (by @vitorpamplona)
    ⬇ [bad] slower follows list building/counting but faster than 3+10+11 kinds
    ⬇ [bad] clients may ignore and not migrate out from CL
    ⬇ [bad] hard to migrate events to other relay (can get rate limited)
    ⬇ [bad] some relays may have incomplete set or events that should be already deleted
    ⬇ [bad] some old events may get deleted by relay policy since they won't be refreshed like CL updates
    ⬆ [good] no race condition
    ⬆ [good] faster followers counting
    ⬆ [good] "someone has followed you" notification

Now I think kind 3 CL isn't that bad xD. Maybe @staab idea should be just a client-side logic between CL edit moments.

As an addition, we could keep kind 3 CL and just add follow events with expiration 3 days ahead with the sole purpose of notifying like "someone has followed you". Or maybe it's not worth it.

@staab
Copy link
Member Author

staab commented Mar 16, 2023

That list seems pretty unfair tbh, most of those downsides on my proposal are marginal performance differences, general criticisms of nostr itself (relays may have incomplete set of events), or restatements of how kind 3 is already broken and the proposal doesn't automatically fix kind 3. Whereas the importance of the race condition is understated, it's one of the most common and annoying problems people experience with nostr.

@vitorpamplona
Copy link
Collaborator

vitorpamplona commented Mar 17, 2023

Additional point, after loading up all contact lists, if your client just wants to listen to contact lists updates (to keep things up to date) the amount of events is significant and they are all super large events. If we switch to a lighter model, clients could keep listening and just receiving the new follows/unfollows as opposed to an entire list over and over again.

Keep in mind Battery life and networking use (most people will use NOSTR in their phones, on mobile data plans).

@staab
Copy link
Member Author

staab commented Mar 21, 2023

It occurs to me a difficulty with this is counting followers, since if these aren't replaceable events, there can be duplicates. A "d" tag would probably fix this.

@mikedilger
Copy link
Contributor

While there are these other reasons to not have monolithic full contact list events (and those battles need to continue to be fought), I've never been convinced that the current state of affairs actually creates race condition problems of any real significance. I think if you do the following you should be good (and I must admit gossip isn't doing this currently):

  1. Clients need to remember the date of the contact list they are using, and never update from an older contact list than what they already have. They may still go from very old to old, but that's better than going from newest to older.
  2. That leaves the problem of clients editing and writing back from an old contact list and thus clobbering the newer one. But that can't happen if they first read from the relay they intend to write to (except for rapid race condition which is quite unlikely). If they find a newer contact list there, they should use that one and merge as necessary. They may write back a newer contact list to the relay that isn't the newest of all, but only to relays that already had old lists so it shouldn't actually clobber anything important.

@erskingardner
Copy link
Contributor

What do clients do if they receive a new kind-3 event, that they don't have the previous one too, and the relays doesn't have it either? Also how would clients handle a new kind-3 event without any e/prev tag?

They can do whatever they want. That's the point. Clients decide how to handle situations like this.

Btw instead of an e tag pointing to the old event version, it should be a prev tag so it can work in any NIP-51 list, including bookmarks and pins (which already have e tags).

I like this. prev tag also doesn't get indexed so doesn't add much, if any, overhead to relays. Obviously, again, it's up to clients to handle this in more complex ways if they want. The naive approach of just accepting the newest event as the current is still fine in most cases.

@Egge21M
Copy link
Contributor

Egge21M commented Jan 2, 2024

The naive approach of just accepting the newest event as the current is still fine in most cases.

This is what I do not like about this approach. It does not fix the issue at hand. If a client does not have the latest state, it will overwrite the old state, and once relays receive the new event, they will also drop and overwrite. Nothing is gained vs. regular caching of kind-3 events and chaining by timestamp.

@mikedilger
Copy link
Contributor

mikedilger commented Jan 2, 2024

Is this is a fundamentally hard problem? Do we need some caffeinated University graduate students in the prime of their mental capacity working on it under a research grant? None of the solutions proposed actually work.

My mental capacity has suffered recently due to unknown factors (I think perhaps it is sleep apnea and am getting tested soon).

EDIT: Yes I had severe sleep apnea! But I'm using a CPAP now and I'm feeling more awake.

@alexgleason
Copy link
Member

This is one of those weird cases where putting data on a blockchain-like structure (really just a linked-list) would kind of make sense.

@alltheseas
Copy link
Contributor

@jb55 how do we fix broken contact lists via NIP

@mikedilger
Copy link
Contributor

As for Vitor's suggestion: #349 (comment) relays could filter out the 'bad people' you are following and you would never know. Having the entire list together provides assurance that it wasn't truncated maliciously.

I still like my merge idea, which @alexgleason has restated more briefly. It would have to be a new kind, not replaceable, and have links back to the prior contact list (or multiple if merged) that it was modifying.

@jb55
Copy link
Contributor

jb55 commented Jan 29, 2024 via email

@staab
Copy link
Member Author

staab commented Jan 29, 2024

relays could filter out the 'bad people' you are following and you would never know

Isn't the entire point of nostr that you don't have to trust a single relay to give you a complete list?

@jb55 how are you envisioning handling missing events? The original PR here is basically exactly diffs, but with no prev tag, so I assume you're talking about the same thing Mike is. Here's how it could go wrong:

  1. SET follows = [a]
  2. DIFF -a
  3. SET follows = [a, b]
  4. DIFF +c

Nostr is not consistent in terms of the CAP theorem. If a client fails to load the second contact list published (step 3), kind 4 will be interpreted in the wrong context, resulting in [c] rather than [a, b, c]. If you then SET it you'll be in trouble. You could avoid setting a new list if you're missing a prev, but since kind 3's are replaceable you might never find the precedent. My original proposal has the same problem, it just seems to me that establishing dependencies between events only adds complexity without solving any problems.

@jb55
Copy link
Contributor

jb55 commented Jan 29, 2024

My original proposal has the same problem, it just seems to me that establishing dependencies between events only adds complexity without solving any problems.

all I can do is show how the new list is different from the local version, it will not be a consistent history like git.

@staab
Copy link
Member Author

staab commented Jan 29, 2024

all I can do is show how the new list is different from the local version, it will not be a consistent history like git.

So is it essentially the same as this PR? Or is there some significant difference?

@jb55
Copy link
Contributor

jb55 commented Jan 29, 2024

it doesn't require any changes to the protocol, it just delta-encodes the contact list so it can store a revision history efficiently.

@staab
Copy link
Member Author

staab commented Jan 29, 2024

I see, you're just talking an implementation detail to detect large deltas?

@jb55
Copy link
Contributor

jb55 commented Jan 29, 2024

yeah I was thinking it could just be an implementation detail of the contact list storage. the biggest issue is that contact lists are huge, so most people don't store each revision. if they were smaller it would be more viable: start with a base contact list and store deltas.

@staab
Copy link
Member Author

staab commented Jan 29, 2024

I can see how that would be useful, but not how it helps with the race condition problem.

@mikedilger
Copy link
Contributor

mikedilger commented Jan 29, 2024

On Mon, Jan 29, 2024 at 08:42:37AM -0800, alltheseas wrote: @jb55 how do we fix broken contact lists via NIP
I'm planning on resolving this by creating a delta-contact list format, so each change can be represented by "diff" of the previous one you have stored. This will allow us to show every change in a compact way, very much like "git diff" for your contact list, and it's backwards compatible with everything. Then you will be able to review and rollback to any previous revision, perhaps damus can do this automatically if it sees large drops of contacts at once.

How about two new event kinds: One is the full list, one is a diff. A diff points back to one event of either kind. A full list points back to 1 or more events of either kind. Full lists can consolidate long sequences of diffs, or merge multiple divergent histories. You could have a full list point at a full list and clients can compute the diff... or you can publish a diff and clients can compute the full list. I think having both of these has benefits.

EDIT: Maybe I should open my own PR instead of polluting this one with my divergent ideas.

@jb55
Copy link
Contributor

jb55 commented Jan 29, 2024 via email

@jb55
Copy link
Contributor

jb55 commented Jan 29, 2024 via email

@alexgleason
Copy link
Member

@staab This idea is still legit I think. But now it's invalid because NIP-29 reserved kinds 10 and 11.

@alexgleason
Copy link
Member

@staab I wonder if it should just be labels. You can label people you follow.

@mleku
Copy link

mleku commented May 11, 2024

this seems like a logical solution for follow list hell and makes a path towards sanity, ack, pls resolve conflicten

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.