Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Bulk append validation #17

Closed
wants to merge 7 commits into from
Closed

Conversation

Happy0
Copy link

@Happy0 Happy0 commented Aug 7, 2019

This is a pull request to support ssbc/ssb-db#264 . See that pull request for more context.

@dominictarr
Copy link
Contributor

I don't think it's necessary to duplicate all this code. when you call validate.append(state, hmac_key, content) it only validates it and queues it internally, so it should be fine to just call that in a loop, and then write the queue once.

Oh the one thing would be to check that the messages you are adding are all gonna be valid. Maybe the easy way to do that is to check the content will fit and then create the messages?

@Happy0
Copy link
Author

Happy0 commented Aug 8, 2019

I don't think it's necessary to duplicate all this code. when you call validate.append(state, hmac_key, content) it only validates it and queues it internally, so it should be fine to just call that in a loop, and then write the queue once.

But this would mean it would be possible for items from the batch being appended in separate writes by AsyncWrite in ssb-db, if they cross its write buffer boundary right? And at that point, publishAll is no longer an atomic operation.

i.e. some of the append elements are added before isFull is true ( https://github.com/ssbc/ssb-db/blob/dd961e4fbaf5321a314266adbcd32ac22ec3d413/minimal.js#L128 ), and some are added afterwards - or the unlikely but possible event that the whole batch exceeds the limit.

Maybe the easy way to do that is to check the content will fit and then create the messages?

append in ssb-validate wouldn't be able to know if the queue's full or not, right? And even if it did, what would it do in such a situation? We'd also have to tag things we pass it as being part of an atomic batch so it knows when to check, etc...

The other consideration is the flush queue which calls back the caller when the write is performed: https://github.com/ssbc/ssb-db/blob/dd961e4fbaf5321a314266adbcd32ac22ec3d413/minimal.js#L208. For a batch, this is the full batch. I guess we could just make all but the last flush pushed a no-op, with the last one being the batch.

@Happy0
Copy link
Author

Happy0 commented Aug 13, 2019

@dominictarr - just bumping my reply above. I hope this isn't rude - I just wanted to make sure it got through to you (as I'd imagine you get a lot of messages.)

@dominictarr
Copy link
Contributor

hey sorry life has been pretty hectic recently. currently half way home to NZ though. I definitely want to get this feature in. My concern is that ssb stays as simple as possible. message validation is the probably the most important thing in ssb. If we introduce bugs here we'll have to support them forever (there are already several embarrassing querks in message validation because of this)

There is quite a bit of duplicated code here. for example, createAll duplicates code inside of create. couldn't it just call create in a loop? this module has very rigorous test coverage but if you duplicate code then we can't trust the duplication. (I recommend nyc to get test coverage: nyc --reporter=lcov npm test; firefox coverage/lcov-report/index.html currently 89%)

Possibly the base functions we have are not quite what you need, maybe we can refactor something to make a bulk append easier?

Also, look at queue.

hmm, idea: maybe all you actually need here is a revert function that takes the current queue, and rewinds the states for those feeds the first previous message and sequence? then if the write fails you can remove it?

@christianbundy
Copy link
Contributor

@Happy0 Could I ask for a status update on this when you have time? I'd love to hear where you're at with this, but no huge rush.

@stale
Copy link

stale bot commented Feb 20, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

@stale stale bot added the stale label Feb 20, 2020
@christianbundy
Copy link
Contributor

Going to close this for now, if anyone has the time/energy/interest to pick this up please open a new PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants