-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat[marketplace]: add slot queue pausing #752
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! 👍
However, I am a bit confused about one part of your PR. In your proposal at codex-storage/codex-research#188, there were no mentions of the concept of "empty availabilities". I remember we discussed this before and I think reaching the definition of what "empty availability" is was bit tough.
I noticed that you have two ways to pause the queue. The first is based on the seen
flag as per your original proposal. The second is triggered when "all availabilities are empty". I am unsure why you added the second part. Wouldn't the seen
flag mechanism already cover this scenario? It seems to introduce unnecessary complexity and complicates understanding and reasoning about the codebase.
What do you think?
That's a great point -- the queue passing should work without the added complexity of pausing when availabilities are "empty". I'll try to remove and write a test to assert that the queue pauses once availabilities are "emptied". |
@AuHau the PR has been updated with your feedback. Thanks for the review! 🙌 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty great work! There are some cleanup items, but generally LGTM.
Before merging lets discuss on our call about the reprocessing of error states.
tests/codex/sales/testsales.nim
Outdated
test "queue is paused once availability is insufficient to service slots in queue": | ||
createAvailability() # enough to fill a single slot | ||
await market.requestStorage(request) | ||
# await sleepAsync(10.millis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dead code? Can be deleted?
tests/codex/sales/testsales.nim
Outdated
await market.requestStorage(request) | ||
await sleepAsync(10.millis) | ||
# await sleepAsync(10.millis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another dead code?
5a91aa0
to
ee42b61
Compare
This PR should be ready to go, however I've some (small) changes to the way the queue pauses, so maybe it might be prudent to review again @AuHau ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just few comments but generally LGTM! Great work! 👍
item.seen = false # does not maintain the heap invariant | ||
|
||
# force heap reshuffling to maintain the heap invariant | ||
doAssert self.queue.update(self.queue[0]), "slot queue failed to reshuffle" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this raises, where is it caught? Or will this crash the whole node? That should not really be necessary, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update()
only returns false when the argument is not in the queue. Since we're supplying self.queue[0]
as the arguments, this will never happen. And if it does happen, then it's a programmer error and the node should crash, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly. In addition, we know that self.queue[0] exists because there is an early return
when if self.queue.empty
is true. If the queue is empty when we don't expect
it to be, there is no real way to recover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @emizzle, thanks! I have a few small comments, but other than that good to go!
item.seen = false # does not maintain the heap invariant | ||
|
||
# force heap reshuffling to maintain the heap invariant | ||
doAssert self.queue.update(self.queue[0]), "slot queue failed to reshuffle" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update()
only returns false when the argument is not in the queue. Since we're supplying self.queue[0]
as the arguments, this will never happen. And if it does happen, then it's a programmer error and the node should crash, I think.
Where are we with this? Should it be merged? |
940dae4
to
6946cbf
Compare
Will merge once green. Had to address a few small comments and rebase on master. |
Re-add processed slots to queue if the sale was ignored or errored
- when processing slots in queue, pause queue if item was marked seen - if availability size is increased, trigger onAvailabilityAdded callback - in sales, on availability added, clear 'seen' flags, then unpause the queue - when items pushed to the queue, unpause the queue
The slot queue should also have nothing to do with availabilities
An empty availability is defined as size < DefaultBlockSize as this means even the smallest possible request could not be served. However, this is up for discussion.
onAvailabilityAdded and onAvailabilitiesEmptied are now only called from reservations.update (and eventually reservations.delete once implemented). - Add empty routine for Availability and Reservation - Add allEmpty routine for Availability and Reservation, which returns true when all all Availability or Reservation objects in the datastore are empty.
Includes tests for sales states cancelled, errored, ignored to ensure onCleanUp is called with correct parameters
- indent `self.unpause` - update comment for `clearSeenFlags`
Queue pausing when all availiabilies are "emptied" is not necessary, given that the node would not be able to service slots once all its availabilities' freeSize are too small for the slots in the queue, and would then be paused anyway. Add test that asserts the queue is paused once the freeSpace of availabilities drops too low to fill slots in the queue.
The asyncheapqueue update overload would need to check index bounds and ultimately a different solution was found using the mitems iterator.
Previously, when a seen item was processed, it was first popped off the queue, then the queue was paused waiting to process that item once the queue was unpaused. Now, when a seen item is processed, it is popped off the queue, the queue is paused, then the item is re-added to the queue and the queue will wait until unpaused before it will continue popping items off the queue. If the item was not re-added to the queue, it would have been processed immediately once unpaused, however there may have been other items with higher priority pushed to the queue in the meantime. The queue would not be unpaused if those added items were already seen. In particular, this may happen when ignored items due to lack of availability are re-added to a paused queue. Those ignored items will likely have a higher priority than the item that was just seen (due to it having been processed first), causing the queue to the be paused.
6946cbf
to
449cedb
Compare
Adds slot queue pausing based on the design in codex-storage/codex-research#188.
@AuHau -- I took the changes in commit 1425bf5 and added to this PR as I needed to update that routine.
Close #549