
idxInSync is incorrect #251

Closed
KyleMaas opened this issue Dec 12, 2022 · 20 comments · Fixed by #301
Labels
bug Something isn't working

Comments

@KyleMaas
Contributor

See #250

idxInSync is supposed to be set when indexes are not up-to-date. And that's true when processing existing data. But when new data is injected after the indexes are marked as "live", idxInSync is never set again, which means you can never do a synchronous operation like publishing and then asking the indexes for results (which at least one of the broken tests tries to do), because you have no way of knowing when the indexes are done processing their new data. We really need some method of tracking new changes to the receiveLog: when a message comes in, s.idxInSync.Add(1), and once it has gone through all of the indexes, clear it with s.idxInSync.Done(). That would allow the existing WaitUntilIndexesAreSynced() function to actually work properly.
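
Just to make the intended semantics concrete, here's a minimal sketch, assuming idxInSync stays a plain sync.WaitGroup and that we could hook both the receive log append and the end of index processing (both hook names are made up, they don't exist in go-ssb):

```go
package main

import (
	"fmt"
	"sync"
)

// One WaitGroup unit per message that has been appended to the receive log
// but not yet seen by all indexes.
type indexTracker struct {
	idxInSync sync.WaitGroup
}

// would be called when a message lands in the receive log (hypothetical hook)
func (t *indexTracker) onReceiveLogAppend() { t.idxInSync.Add(1) }

// would be called once every index has processed that message (hypothetical hook)
func (t *indexTracker) onIndexesProcessed() { t.idxInSync.Done() }

// WaitUntilIndexesAreSynced blocks until the counter drains to zero.
func (t *indexTracker) WaitUntilIndexesAreSynced() { t.idxInSync.Wait() }

func main() {
	var t indexTracker
	t.onReceiveLogAppend()    // publish a message
	go t.onIndexesProcessed() // indexes finish in the background
	t.WaitUntilIndexesAreSynced()
	fmt.Println("indexes are in sync")
}
```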

@decentral1se
Member

@KyleMaas Nice.

I saw in https://dev.scuttlebutt.nz/#/golang/?id=database-conceptual-overview that there are two entrypoints for message appending, and I wonder if that's the right place to intervene 🤔 It's not clear whether the message will always end up in an index (so potentially wasteful to Add(1)), or whether idxInSync will even be available at that level of abstraction, though.

Feels like directly inside serveIndexFrom would be best, since there the messages are constrained by the margaret.Query and are definitely due to end up in the index. Perhaps via the https://pkg.go.dev/github.com/ssbc/go-luigi#Pump call? I'm not quite sure how to intervene there, but the implementation looks fairly simple: https://github.com/ssbc/go-luigi/blob/324065b9a7c6/stream.go#L54. Perhaps this can be re-worked and the calls to s.idxInSync.Add(1) threaded in?
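
Very rough sketch of what I mean, assuming each index ends up behind a luigi.Sink we control somewhere in serveIndexFrom (the wrapper type is entirely hypothetical and the Add/Done placement may well be wrong):

```go
package main

import (
	"context"
	"fmt"
	"sync"

	"github.com/ssbc/go-luigi"
)

// countingSink wraps the sink an index pumps into, counting each message as
// in-flight while the index processes it. (Hypothetical wrapper, not how
// go-ssb is wired today.)
type countingSink struct {
	inner luigi.Sink
	add   func(int) // e.g. s.idxInSync.Add
	done  func()    // e.g. s.idxInSync.Done
}

func (cs countingSink) Pour(ctx context.Context, v interface{}) error {
	cs.add(1)
	defer cs.done()
	return cs.inner.Pour(ctx, v)
}

func (cs countingSink) Close() error { return cs.inner.Close() }

func main() {
	var wg sync.WaitGroup

	// stand-in for a real index update function
	index := luigi.FuncSink(func(ctx context.Context, v interface{}, err error) error {
		if err != nil {
			return err // stream closed
		}
		fmt.Println("indexed:", v)
		return nil
	})

	snk := countingSink{inner: index, add: wg.Add, done: wg.Done}
	_ = snk.Pour(context.Background(), "a message")
	wg.Wait()
}
```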

@KyleMaas
Contributor Author

@decentral1se

The best I could work out is to put a goroutine with a Luigi pump right about here (before the indexes are attached), with the sole purpose of Add(1)ing idxInSync:

// get(msgRef) -> rxLog sequence index

And one here, after the indexes are initialized, to Done() idxInSync:

if s.Replicator == nil {

That way, each ingested message would be processed once before and once after all of the other message processing hooks for the indexes. Plus, it's scoped within sbot where idxInSync is, rather than being handled in a plugin or somewhere further downstream which might not have access to it.

Anyway, those were my thoughts. I'm just not familiar enough yet with Go and the codebase to know how to reliably pump a Margaret query that selects all messages into a single simple callback, so I can't implement it this way myself. I figure I could stumble through it if you can't, but it's probably quicker and easier for you to build it.
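
For what it's worth, here's roughly the shape I have in mind, as an untested sketch. I'm assuming the receive log is a margaret.Log, that a live query plus luigi.Pump is the right way to tail it, and that the two pumps could somehow be made to run strictly before and after the index hooks (which is exactly the part I don't know how to do); all of the names are made up:

```go
package example

import (
	"context"

	"github.com/ssbc/go-luigi"
	"github.com/ssbc/margaret"
)

// Hypothetical sketch: tail the receive log with two live queries, one pumped
// before the indexes are attached (Add) and one after they are set up (Done).
// Nothing here enforces that the "before" pump actually runs first.
func trackIndexWork(ctx context.Context, rxLog margaret.Log, add func(int), done func()) error {
	before, err := rxLog.Query(margaret.Live(true))
	if err != nil {
		return err
	}
	after, err := rxLog.Query(margaret.Live(true))
	if err != nil {
		return err
	}

	go luigi.Pump(ctx, luigi.FuncSink(func(ctx context.Context, v interface{}, err error) error {
		if err != nil {
			return err
		}
		add(1) // a new message entered the pipeline
		return nil
	}), before)

	go luigi.Pump(ctx, luigi.FuncSink(func(ctx context.Context, v interface{}, err error) error {
		if err != nil {
			return err
		}
		done() // ...and has (hopefully) been indexed by now
		return nil
	}), after)

	return nil
}
```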

@KyleMaas
Contributor Author

@decentral1se

Expanding on my earlier comment, the disadvantage I see with doing it at the serveIndexFrom layer is that, since there are multiple indexes, there would be multiple Add(1) and Done() calls per message. I tend to think it would be less wasteful to either:

  1. Run it on every message regardless of whether it will be indexed or not, but only do it at the start and end of the pipeline so it's only done once per message, or
  2. Do it similarly to that, but limit it to a Margaret query which somehow encompasses the selection criteria for every other index.

But, to me, the processing cost of the now two extra Margaret queries (one at the beginning and one at the end of the pipeline) seems like it would vastly surpass the overhead of a single increment and decrement operation per message.

@decentral1se
Member

@KyleMaas I'm just kinda getting into making code changes here and have been mostly focusing on "easy to do" maintenance tasks otherwise, so I don't think I'm any better placed to do this than you! Feel free to dive in with a PR and we'll see what we can do.

Thanks for these thoughts, will aim to come back on this later today. I feel like we're getting closer to a solid fix plan 🤞

@KyleMaas
Contributor Author

@decentral1se

So... that approach isn't working at all. The Luigi pumps are running in parallel instead of in a pipeline, which means that, since the increments and decrements are the fastest, they generally both execute before the index processing completes. Doing it that way would require a definitive blocking first and last task to perform the increment and decrement operations synchronously. Unfortunately, I'm not familiar enough with Margaret and Luigi to figure that out, so I think for the moment I'm stuck. You can take a look at what doesn't work over on #255.

decentral1se added the bug ("Something isn't working") label and removed the testing label on Dec 18, 2022
@decentral1se
Member

decentral1se commented Dec 18, 2022

@KyleMaas Hmmmm, I see 🤔 We have indexStateMu for per-index state locking; is there a way to lock more generally above that? If you're using Matrix, I'm lurking in #golang-ssb-general:autonomic.zone for more back-and-forth random ideas. No worries if not. Also, not sure which identity you're using on SSB 🙃

@KyleMaas
Contributor Author

@decentral1se

Hmm...don't know on that one. I'll have to take a closer look.

Not on Matrix. On SSB, my accounts tend to be transitory, but I do have a semi-permanent testing account:

@MI0aKQls1HIrZlyl4qSAW4ErLqb4xVUN+diSbG0j7Fc=.ed25519

I'm not particularly active on it most of the time, however, because I tend to wipe it quite frequently as I'm working on stuff. But if you'd like to do more off-topic communication, you're welcome to e-mail me.

I can't say I'm really part of the SSB "community", per se. I'm not a very social person, so the entire concept of social media is not particularly attractive to me, bordering on distasteful. What I love about SSB is the capability to do disconnected, asynchronous communication among real-world known peers, more along the lines of @dominictarr's sailboat use case. I had actually designed a protocol very similar to SSB before discovering SSB, and it was so similar I figured I'd try to help rather than reinvent the wheel. My available time to work on open source stuff tends to come in waves, so I'm not particularly consistent, but I do like to help when and where I can because I love the concept.

@decentral1se
Member

I'll message on SSB then 😸

@KyleMaas
Contributor Author

@decentral1se

Okay, so it looks like indexStateMu would not be helpful in this case, since it isn't touched during the Luigi pumps but only during:

  1. Construction of indexes
  2. Initial message processing
  3. The shutdown process

It couldn't really be patched into the Luigi pump, either, since those are run concurrently, so you'd still have a race condition: a message could come in, and if it wasn't processed immediately, the indexes would still show as in sync. We really need something that triggers any time a message is added to the receive log and then triggers something else when all of the Luigi pumps are done processing it, but I'm not sure of a good way to do that with all of the concurrency.

One way I thought you might be able to do this would be to have each index record the last seen message's sequence number. Then you could convert idxInSync into a function which waits until all of the indexes catch up with the latest sequence number at that point. That way, it would be entirely non-blocking: the rest of the sbot could continue to operate, and you'd know that the indexes were at least in sync up to the point where the sync function was called. The problem is that there's no way to do this, either, because not all indexes receive all messages. If the latest message in the receive log had a sequence number of 100 and it wasn't, for example, an about message, then you'd be waiting forever (or at least until another about message arrived in one of the other threads), because it would be filtered out by the Margaret query, so the about index would never catch up and would never reach sequence number 100.
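
To make that idea concrete, here's a sketch under the assumption that each index could expose the last receive-log sequence number it has observed (none of these types exist in go-ssb today), and it shows exactly the problem described above:

```go
package example

import (
	"context"
	"time"
)

// Hypothetical: each index reports the highest receive-log sequence it has
// observed so far.
type seqReporter interface {
	ObservedSeq() int64
}

// waitUntilSynced blocks until every index has caught up to target, polling.
// This only works if every index observes every sequence number, which is the
// problem: a filtered index (e.g. the about index) might never reach the
// target, and this loop would spin until the context is cancelled.
func waitUntilSynced(ctx context.Context, target int64, indexes []seqReporter) error {
	for {
		caughtUp := true
		for _, idx := range indexes {
			if idx.ObservedSeq() < target {
				caughtUp = false
				break
			}
		}
		if caughtUp {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(10 * time.Millisecond):
		}
	}
}
```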

@decentral1se
Member

Damnit gotta get a fix out for this 😅 Got some cycles free tomorrow, gonna give it another shot.

@decentral1se
Member

decentral1se commented Dec 29, 2022

Some notes from my investigation... 👀

@decentral1se
Member

I'm gonna try to raise cryptix for a lead on a fix for this one! It might turn into a call; lemme know @KyleMaas if you wanna join.

@KyleMaas
Contributor Author

KyleMaas commented Jan 2, 2023

@decentral1se

Depends on the time. My schedule this week is rather chaotic. But if you want to e-mail me the info (I won't have access to an SSB client with my test account on it), I'd be happy to join if I can.

@KyleMaas
Contributor Author

KyleMaas commented Jan 2, 2023

It's definitely a very thorny problem for the way this is architected. Concurrency's all well and good until you combine it with anything sequential.

@KyleMaas
Contributor Author

KyleMaas commented Jan 2, 2023

@decentral1se

Maybe this could be of use? If you look through some of the tests, it looks like there might be a way to use this to test for an end-of-stream condition where the stream isn't actually terminated:

https://github.com/ssbc/go-luigi/blob/324065b9a7c66588b6a42896de8da1b8cff6668d/stream.go#L21
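
Something like this is what I was picturing, assuming the ssbc fork keeps luigi.EOS and luigi.IsEOS from upstream (the sink itself and its callbacks are hypothetical):

```go
package example

import (
	"context"

	"github.com/ssbc/go-luigi"
)

// Sketch only: a FuncSink receives the EOS error when the stream ends, so it
// can tell "caught up / closed" apart from a real failure. onMsg and onDone
// are made-up callbacks, not go-ssb APIs.
func newIndexSink(onMsg func(interface{}) error, onDone func()) luigi.Sink {
	return luigi.FuncSink(func(ctx context.Context, v interface{}, err error) error {
		if luigi.IsEOS(err) {
			onDone() // end of stream reached
			return nil
		}
		if err != nil {
			return err
		}
		return onMsg(v)
	})
}
```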

@decentral1se
Member

@KyleMaas what's your email? You can fire one to helo[at]autonomic[dot]zone if you don't want to share it here. Thanks for the links / thinking along!

@decentral1se
Member

decentral1se commented Jan 2, 2023

@cryptix if you end up having a look, I think the questions on #293 are also important and related (not to add more work but yanno 😅)

@KyleMaas
Contributor Author

KyleMaas commented Jan 3, 2023

I think I may have found a way to pull this off in an elegant manner. Will let you know if what I think will work does.

@KyleMaas
Contributor Author

Well, my idea didn't work.

But it led me to another really simple way to do it, with which TestNames has now run successfully over 500 times with zero failures. It involves some minor changes to go-luigi. Pull requests imminent.

@KyleMaas
Contributor Author

Hm...just thought of a more betterer way to do the same thing. I think I'm going to work on it more tomorrow and do pull requests then.
