
implement kernel-side escalator scheduler #23

Open
warner opened this issue Mar 26, 2019 · 7 comments
Labels: needs-design, SwingSet package: SwingSet



warner commented Mar 26, 2019

The kernel currently holds a queue of messages to deliver, and executes them in strict order of their submission. In the longer run, we want messages to be able to pay for faster execution, using the "meter" and "keeper" mechanisms from KeyKOS.

The first step will omit the whole financial aspect and just implement the scheduling algorithm. Each pending delivery, once it is unblocked on any dependency it might have (none for now, but Flows will add some sooner or later), gets put on an escalator with some starting offer price and delta-price/delta-time slope. Nothing executes while its offer price is below zero (this enables timed delivery, more or less). As time passes, all messages have their price increased by their slope. The highest offer gets to execute.
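A minimal sketch of that scheduling rule in TypeScript (all names here are invented for illustration, not the kernel's actual data model): each pending delivery carries a starting offer price and a slope, bids below zero are not yet runnable, and the highest eligible bid executes next.

```typescript
// Sketch of the escalator rule described above (names are invented).
// A delivery's current bid is startPrice + slope * elapsed time.
interface PendingDelivery {
  message: string;
  startPrice: number; // may be negative, which delays execution
  slope: number;      // price increase per unit of simulated time
  enqueuedAt: number;
}

function currentBid(d: PendingDelivery, now: number): number {
  return d.startPrice + d.slope * (now - d.enqueuedAt);
}

// Pick the highest bidder whose offer has climbed to at least zero.
function nextToRun(
  queue: PendingDelivery[],
  now: number,
): PendingDelivery | undefined {
  let best: PendingDelivery | undefined;
  for (const d of queue) {
    const bid = currentBid(d, now);
    if (bid < 0) continue; // not yet eligible (timed delivery)
    if (!best || bid > currentBid(best, now)) best = d;
  }
  return best;
}
```

A message with a deeply negative start price and a positive slope stays dormant until enough simulated time has passed, which is the "timed delivery" behavior mentioned above.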

In our isolated environment, "time passing" does not have a lot of meaning. But we should be able to simulate it somewhat. It will become more relevant when we're limited by how much computation we can execute at once, e.g. if the Vats are living in a blockchain and the limit is how much computation can go into each block.

A later phase will be to implement Meters and have some sort of tokens being spent. For now, userspace should just be able to specify the initial price and slope to be anything they like, without concern for how it will pay for the prioritization it wants.

We need to think through the userspace interface for this. Most of the time, each outbound message should use the same Meter (and scheduling parameters?) as the triggering inbound message. But it should be possible to fork off a separate scheduler as an option when making outbound calls. Maybe a dynamic-scope tool (what would be a "context manager" in Python) that causes all message sends within that scope to use the alternate settings.

Dean's Flow work should cover this: any messages sent within the context of a Flow will use the same Flow (and will be dependent upon all previous messages in that Flow), but there's a .fork() operation that gives you a new Flow, maybe with different parameters. We'll need to make the kernel aware of these Flow objects.

@warner warner transferred this issue from Agoric/SwingSet Dec 1, 2019
@warner warner added the SwingSet package: SwingSet label Dec 1, 2019

warner commented Oct 8, 2020

At today's kernel meeting, we tried to figure out what the simplest possible metering/scheduling mechanism might be. Some of the goals are:

  • protect the kernel (and the swingset as a whole) against runaway vats, by terminating ones that use too much time
  • protect contract vats against attack by lots of noisy/useless inbound messages
  • avoid metering faults as much as possible: killing a vat is pretty traumatic, and abandoning a crank in the middle (non-vat-fatal underflow, to be resumed on a better-funded meter in some future crank) is really expensive because we have to rebuild the vat from a checkpoint+transcript
  • be efficient in the majority case
  • don't nickel-and-dime contract authors. We assume that validators/hosts are adequately compensated for their costs already, so we don't need to account for every single CPU cycle or byte of RAM
  • customers shouldn't have to predict exactly how much CPU their message might trigger
  • execution fees should probably be paid by the recipient, who includes the cost in their price list (I think someone described this as "published prices, variable costs")

The basic proposal we sketched out (which does metering but no significant scheduling) is:

  • we have just the one run-queue (initially, we'll add the escalators later)
  • metering is coarse: you can buy computrons/bytes in units of Small/Medium/Large
  • when a message is delivered, it gets one Small runtime for free
  • the receiving vat should be written to do one of the following within that Small runtime:
    1: process the message to completion
    2: switch to a different meter, to do Medium or Large work on it
  • the receiving vat thus has enough meter to make a decision about the message without risking an underflow
  • meters should be stacked, so "switch to a different meter" roughly means "push a larger-capacity meter underneath the initial one"
  • vats should have a way to sense if/when a meter is used. This is probably easier to use than asking when a meter has been exhausted (i.e. the vat doesn't need to ask about the initial meter, just the local one that it switches to). We don't think this should be synchronous, but three options are:
    • poll (but when? vats can't sense end-of-crank)
    • the first time a meter gets used in a given crank, it schedules an eventual-send to a subscriber, which will fire in some later turn of the current crank
    • the kernel enqueues a message to the vat (i.e. to some subscriber object within the vat) for delivery in some future crank, possibly a lot later if the escalators are full

Vats would maintain a fairly deep stack of meter capacity, to avoid an expensive (or fatal) underflow. But they'd check on the backstop meter, and if it ever gets used, they react somehow (deny service to the calling client?).

Scheduling

On top of this, we'd add the escalators. When sending a message, vats would need to indicate which meter the message would use, and the start+slope of the escalator. The most common path is to use the same meter that was used by the message which initiated the crank, but vats must be able to switch to a different one for this to be at all useful.

Escalator fees are for priority, not execution cost. Each message is bidding against the other messages for the opportunity to consume one of the limited execution slots.

The userspace API for this is not clear: @dtribble 's Flow scheme would provide a nice UI, but 1: we haven't implemented it here yet, and won't be able to before we need the escalators, and 2: its primary purpose is to inform ordering constraints, and we'd need to think through how to manage that when the ordering constraint says A must be delivered before B, but the escalator priorities cause B to be runnable first (does the entire Flow block until both constraints are met? Does the blocked message continue to climb the escalator and increase its bid? Probably each message should not even join the escalator until the other constraints allow it to run)

A high-level question raised by @dtribble was "should fees be additive or multiplicative?" I'll defer the exact definition to him, but my understanding was that it relates to chains of messages (the customer sends a message to vat A, which spawns a dozen messages between other vats before the economic operation is complete). If the customer pays for expedited service, so their first message is delivered quickly, but the other dozen messages (not created by them, but created in response to their request) are low-priority and get delayed, then they won't have gotten the expedited service they paid for.

@erights pointed out that the vats involved can and should be aware of the priority and take steps to honor the request properly: if I buy next-day parcel delivery, the service I hired is not going to sub-contract to a week-long shipper (if they want to keep my business). I think @dtribble 's "multiplicative" option means that assigning a similar high-priority to each additional "internal" message causes the overall costs to be multiplied by the number of those messages.

An "additive" option should mean (I think, there was some disagreement here) that splitting a task into multiple messages (perhaps across multiple vats) should not significantly increase the overall cost over doing it all in a single message. @erights pointed out that messages crossing between distinct swingset machines (e.g. multiple chains in a pegged-currency trade) should incur extra costs, as those messages are not as cheap as intra-swingset or intra-vat ones.

One approach that might achieve "additive" costs would be for meters to ride the escalators, not individual messages (so each meter could hold a number of messages). When the meter reaches the top, we run multiple messages (perhaps to different vats), adding their costs together instead of requiring each one to climb the escalator separately. Another would be to allow vats to place some number of outbound messages at the front of the run-queue (maybe they add the message to the current/inbound meter, and since that meter is still at the top of the escalator, the message runs right away). Or to give some amount of escalator credits to messages being created by the current vat, if they're small, or if the current vat hasn't run for too long.

@dtribble made an analogy with Unix-style kernel schedulers (in particular KeyKOS, I think), where you bid for CPU time and "win" a 4ms slice. If the process finishes early and returns to the kernel quickly (e.g. it processed a message and returns to waiting for IO), the process gets credit for being polite, in the form of an elevated priority. The next time IO is ready and the process can run again, it will be at the top of the list. In contrast, a process which consumes its entire 4ms time slice is probably CPU bound, and is likely to do the same again next time, so it gets a lower priority.

In our system, the "don't nickel-and-dime developers" goal means we don't really want to do fine-grained usage measurements. But if we had that, we could conclude "message X bought a Medium slice but only used Small", and then maybe we let new messages it creates travel on the coattails of the initial bid and execute quickly. Ideally, if all dozen related messages add together to less than Medium, they should all run immediately, rather than going back to the escalators.


warner commented Oct 29, 2020

Notes from today's kernel meeting (28-oct-2020):

  • all CPU usage is quantized into small/medium/large bundles
  • prioritization is different than resource consumption
  • metering is primarily about preventing DoS attacks and accidental runaways, not reimbursing validators for every CPU cycle or incentivizing efficient contracts
  • vat rollback is really expensive, so we want to avoid it as much as possible. Vats may be willing to spend their own funds to avoid incurring a rollback.
  • we need to create some new terms, for now I'm going to talk about "prioritization meters" and "execution meters"
  • prioritization meters ride the escalators, not individual messages
  • each prioritization meter contains a FIFO of messages
  • when a prioritization meter reaches the top of the escalator, it can run as many messages as it can pay for
  • each message costs 1 Small stamp, and entitles the vat to execute some Small amount of computation (CPU/memory/stack) before it expires.
  • when the message is delivered to a vat, a new "execution meter" is created, with a budget of 1 Small
  • each vat has a stack of execution meters. The 1 Small meter that was paid for by the delivery is on top
  • vat code can push a new meter onto the stack (or maybe push the meter under the current top-of-stack, not sure) at any time, but all turns see the same stack (we cannot sense end-of-turn, so we can't treat one turn differently than any others)
    • as a consequence, mutually suspicious code within a single vat has no defense here: A has no way to prevent the other party (B) from pushing work onto the promise queue, which will then get to run against a meter that A switched to
  • in general, the vat must switch to a different meter before the Small budget is consumed. Small should be enough to let the vat make a decision, and it should make a decision before scheduling any additional work (especially before giving control to less-trusted code)
  • Most vats will have a long-lived fallback meter pushed under the 1 Small delivery budget. They don't expect this meter to be used much, but it's not a big deal if it is.
    • @erights / @dtribble 's example was a hotel cleaning fee
    • if you trash the hotel room, there's a process to pay for cleanup
    • but that's not the service they're trying to provide, so it's exceptional and might be handled the next time someone tries to use the meter, or something
  • If we find the fallback meter is being used too often, then maybe we should raise prices, or examine the code more carefully to find ways that callers could provoke unexpectedly large amounts of activity
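The prioritization-meter drain step in the notes above can be sketched like this (invented names; the 1-Small fee is the assumed per-delivery charge). The fee is charged up front, so the kernel never risks an expensive mid-crank rollback for a prioritization underflow.

```typescript
// Sketch: a prioritization meter rides the escalator and holds a FIFO
// of messages; at the top, the kernel delivers one crank at a time
// until the queue is empty or the next 1-Small fee can't be paid.
const SMALL_FEE = 1; // assumed cost of one Small delivery

interface PrioritizationMeter {
  balance: number;
  fifo: string[]; // pending messages, in send order
}

function drain(meter: PrioritizationMeter): string[] {
  const delivered: string[] = [];
  while (meter.fifo.length > 0 && meter.balance >= SMALL_FEE) {
    meter.balance -= SMALL_FEE; // charged before delivery, not after
    delivered.push(meter.fifo.shift()!);
  }
  return delivered;
}
```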

Some vats are "instance contracts": closely-held (typically by two-ish parties), short-lived, single-purpose. These are likely to be given a decent-sized meter to pay for their operation. Every entry point to the contract will start by switching from the delivery meter to the operational one. The analogy is that you pay up front for all the postage necessary to do business, so you don't have to provide money with each individual message. This is less prone to abuse because the only parties who can deliver messages are the participants that were known when the contract was created.

Other vats are "service contracts": long-lived, widely-held. Here, we cannot rely upon the good behavior of the two-ish participants. Callers to these objects are expected to pay for their requests, by including a meter with any delivery that might not be completable within a 1 Small CPU budget.

It should be possible to run synchronous code on a separate meter. The API should look like a lexical block, maybe runUnderMeter(meter, () => { code.. }). It should also be possible to switch the entire crank to a different meter: pushMeter(meter), which remains in effect until the end-of-crank (and spans however many turns it takes to get there).

The meter used for sending messages, however, should be associated with the Promise that wraps the Presence for the remote object. We only have one Presence per remote object, but we can have arbitrary numbers of Promises for each one. We're thinking that each Promise has a prioritization/delivery meter associated with it, such that all messages sent through that Promise will use that meter.

Tentative API:

  • meters are special kernel objects, managed through c-lists
  • vats can create them with an ambient function: meter = makeMeter(escalatorParameters)
  • meters are funded with meter~.refill(payment), which is an async call that takes a Payment against a special kernel Issuer
  • p = makeMeteredPromise(presence/promise, meter) gets you a new Promise that knows about the meter
  • E(p).method(args) pushes a message onto the escalator described by the embedded meter
    • we preferred embedding the meter in the Promise over an augmented syntax like E.sendMetered(target, meter).method(args), because the latter would be inaccessible to tildot syntax (target~.method(args) has no place to add additional arguments)
    • and because it makes it easy to have meters be contagious: the result promise you get back is tied to the same meter
  • the priority fee is deducted from the meter when it reaches the top of the escalator
  • then the kernel pulls messages off the priority meter and delivers them, one crank at a time, until the meter's queue is empty or it can no longer pay the 1 Small fee for each one
    • the kernel can sense underflow of the prioritization meter before delivering the message, so it doesn't risk an expensive vat rollback. The delivery fee is charged up front
    • if the prioritization meter underflows: who gets notified? who refills it? we need a way to allow any remaining messages to be delivered eventually
    • this scheme might mean that messages emitted in response to the inbound message will go back onto the same top-of-the-escalator prioritization meter that is already running, allowing them all to execute before anything else on the other escalators
      • this is exactly what we want to support high-priority economic actions, like buying a house
      • but we need a way to think about the message ordering properties that result, make sure programmers can still rely upon something, things don't get delivered out-of-turn when they shouldn't be
        • within a single prioritization meter, there's a FIFO, so sending multiple messages to the same promise will be delivered in-order, and chaining messages will retain their delivery order. That might be enough
  • each message delivery pushes an execution meter with 1 Small onto the stack, then runs
  • if that meter is exhausted before the crank is complete, the vat switches to the next meter on the stack
  • if the entire stack is exhausted, the vat is either terminated or rolled back (expensive) and held pre-delivery until a keeper decides whether to let it die or refill the meter and re-deliver
  • the vat can execute an ambient switchToMeter(meter), or maybe pushMeter(meter), and/or maybe the function should take an additional budget (S/M/L) argument to say how much execution we expect to be performed on this meter
    • the meter would be deducted up front for the claimed budget
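A toy version of the makeMeteredPromise idea from the list above (heavily simplified: a plain handle object stands in for the real kernel Promise and c-list machinery, and all names are invented). The point it illustrates is the contagion property: the result of a metered send is tied to the same meter.

```typescript
// Sketch: a metered reference remembers its meter; sends through it
// queue onto that meter, and the result inherits the same meter.
interface EscMeter {
  label: string;
  queued: string[];
}

interface MeteredRef {
  target: string;
  meter: EscMeter;
}

function makeMeteredPromise(target: string, meter: EscMeter): MeteredRef {
  return { target, meter };
}

// Stand-in for E(p).method(args): enqueue on the embedded meter and
// hand back a result ref tied to the same meter (contagion).
function send(ref: MeteredRef, method: string): MeteredRef {
  ref.meter.queued.push(`${ref.target}.${method}`);
  return makeMeteredPromise(`result-of-${method}`, ref.meter);
}
```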

If contract authors don't do anything special (they never call switchToMeter or makeMeteredPromise):

  • any crank which takes more than 1 Small causes the vat to die
  • vat authors should be careful to not allow incoming messages to do more work than that

If the contract authors don't do switchToMeter or makeMeteredPromise, but they did push a fallback meter when the vat was first created, then:

  • any crank which takes more than 1 Small will drain something from the fallback meter
  • somebody should pay attention to the balance in the fallback meter, and if it ever gets too low, they should top it up (and maybe figure out what caused it to drain, and do something about it)

For "instance contracts" which are pre-paid for their (small number of) clients, either the fallback meter is used all the time, or all entry points should switchToMeter(myOwnMeter) right away.

For "service contracts" that need to do non-trivial work for new clients all the time, the exposed objects should accept a newMeter argument, and they should test the meter (make sure it has enough to cover the operation), and then switchToMeter onto that meter. They should finish all activity in a single crank, maybe.


warner commented Dec 16, 2020

Notes from today's meeting, @dtribble and @erights describing their recent conversation:

  • add a "meter" field to each kernel object table entry, and to each kernel promise table subscription entry (the subscribers set of vat-ids turns into a Map from vat-id to meter)
  • each meter has a set of escalator parameters associated with it (starting bid and rate)
  • each pending delivery (targeting a specific kernel object, rather than an as-yet-unresolved promise) is scheduled according to the escalator parameters associated with the object's meter
  • each pending notification (targeting a vatid+promiseid pair) is scheduled according to the meter associated with that subscription
  • the syscall.send() method acquires an additional "meter" argument
    • this meter is not used to control how its message gets sent
    • rather, it defines the meter to be associated with any newly exported objects that appear in the arguments
    • and for the sending vat's subscription to the result promise that appears in the syscall
  • syscall.subscribe() acquires an additional "meter" argument
    • used to define the subscription's meter
  • or, we add a syscall.switchToMeter(meter), instead of adding the argument to send/subscribe
    • however that would establish a weird temporal-but-cross-turn scope, which sounds perilous
  • liveslots is told the meter upon which a dispatch.deliver or dispatch.notify arrived, probably through an additional argument
  • liveslots should default to using this same meter for all outbound send or subscribe performed during that crank
    • but it should be able to override that somehow

We didn't come up with a syntax for how vat code should switch to a different meter. Ideally use of a given meter is expressed in a lexical scope. But the syscalls (or syscall arguments) needed to tell the kernel about the meter selection is dynamically scoped. We speculated about some sort of source transform:

function handler(args) {
  run_with_meter(meter1, () => {
    send_thing();
    send_other_thing();
  });
  run_with_meter(meter2, () => {
    do_third_thing();
  });
}

I could imagine a source-to-source transform which rewrites that to:

function handler(args) {
  set_meter(meter1);
  send_thing();
  send_other_thing();
  set_meter(meter2);
  do_third_thing();
}

but I can't imagine how it would deal with e.g. send_thing() invoking some other metered code which switched to meter3 before doing its own thing: we'd need some way to switch back to meter1 before doing anything else, which would mean the transform needs to inject a set_meter call before every single statement, or between expressions (do_thing().do_thing_with_return_value() -> set_meter(); const tmp = do_thing(); set_meter(); tmp.do_thing_with_return_value(); set_meter();), which sounds pretty horrible.
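For comparison, the dynamic-scope approach could be expressed without any source transform at all, as a save/restore helper (invented names; this sketch only covers synchronous nesting within a turn, not sends queued for later turns, which would need the meter captured at send time). Nested code that switches to meter3 cannot leak its choice back into the caller's scope, because the helper restores the previous meter on exit.

```typescript
// Sketch of a dynamic-scope meter switch via save/restore.
type MeterId = string;
let currentMeter: MeterId = "default";

function runWithMeter<T>(meter: MeterId, thunk: () => T): T {
  const saved = currentMeter;
  currentMeter = meter;
  try {
    return thunk();
  } finally {
    currentMeter = saved; // restore even if thunk throws
  }
}
```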

We also said that the execution costs of a message should be charged against the same meter which was used for the prioritization costs.


dckc commented Dec 30, 2020

@warner and @kriskowal and I looked at ...

in the context of this issue.

The one conclusion I recall clearly is that no, we don't need to deal with this for a while; we should stay focused on getting the XS vat worker with snapshots working. (#511, #2145, ...)


warner commented Feb 15, 2021

Notes from a conversation with @dtribble today:

  • messages sent to Presences are subject to the escalators; messages sent to Promises generally aren't (they aren't deliverable until the Promise is resolved, so there's no way to expedite their processing)
  • there is a many-to-one mapping from Presence to Meter (or perhaps to a stack of fallback Meters)
  • there is a many-to-one mapping from Meter to Escalator
  • we'll have a small number of Escalators, maybe 5; no need to provide infinite resolution
  • each Escalator has an initial price and a slope (increase in price per.. ideally per unit time, but realistically per block number, crank number, or cumulative computron count)
    • we might also allow a "starting bonus": each Escalator would have a "max starting bonus", each Meter would have a starting bonus, and the message is added to the escalator somewhere other than the bottom
  • processing of the escalators should be O(num_escalators): each one is basically a dequeue (with a non-zero "starting bonus" causing insertion into the middle rather than only on the bottom), and only the top item is eligible, so we just need to compare the bids of the tops of each escalator
    • although.. each Meter has a current balance, which imposes a ceiling on the bid that any message can make, which means the top-most message might not actually get to bid that much, which might drag some O(num_messages) behavior back in. @dtribble , did you have a solution for that in mind?
  • when the message reaches the top of the escalator, it gets delivered
    • the Meter is decremented by some simple algebraic combination of:
      • a price: either the message's bid price, the next-lower bid price, or some price determined by walking the entire pool down from the top-most price and adding up all the per-message claimed computron limits until we reach a per-block limit, then using the lowest-found "block clearing price"
      • a claimed max usage limit (the amount of computation they were bidding for, also the limit beyond which we're free to kill the crank for over-usage)
      • the actual usage, measured by watching how far the JS engine's counters were decremented while the crank ran
    • if the Meter is really a cascade of fallback Meters, they get consumed in priority order, but maybe their Keepers get notified or something
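The O(num_escalators) selection described above can be sketched like this (invented names). It assumes each escalator's queue is kept ordered so the front always holds its highest bid; then only the fronts need to be compared, regardless of how many messages are queued. (As noted, per-Meter balance ceilings would complicate this.)

```typescript
// Sketch: few escalators, each a queue; compare only the fronts.
interface QueuedMsg {
  msg: string;
  startBid: number;
  enqueuedAt: number;
}

interface Escalator {
  slope: number;      // bid increase per crank
  queue: QueuedMsg[]; // front (index 0) holds the highest bid
}

function bid(e: Escalator, m: QueuedMsg, now: number): number {
  return m.startBid + e.slope * (now - m.enqueuedAt);
}

// O(number of escalators), not O(number of messages).
function pickNext(
  escalators: Escalator[],
  now: number,
): QueuedMsg | undefined {
  let best: QueuedMsg | undefined;
  let bestBid = -Infinity;
  for (const e of escalators) {
    const front = e.queue[0];
    if (!front) continue;
    const b = bid(e, front, now);
    if (b > bestBid) {
      bestBid = b;
      best = front;
    }
  }
  return best;
}
```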


warner commented Feb 18, 2021

Notes from today's kernel meeting:

Execution Allowance

All cranks get a "small" unit of capacity, and by default are terminated if they exceed it

  • there is no mechanism for a caller to enable more, nor for certain Remotables to be configured to get more
  • however the method being called may invoke a special function to augment that limit if they see fit
    • vatPowers.extendCrankLimit(medium/large, meter)
    • this will cause the Vat Worker to change the low-level engine limit, or to configure the limit-exhausted callback to increase the limit if it ever gets triggered
    • note the difference between the crank limit being exhausted (which happens after either small/medium/large computrons) and the Meter balance going to zero
      • an infinite-value Meter does not enable infinite computation within a single crank
      • we must decide whether the Meter balance is deducted at the beginning or end of the crank; @FUDCo pointed out one option is to deduct the claimed amount at the beginning, then refund the leftover balance at a discount, which would make overestimation expensive but not too bad
    • the function should probably take a Meter object, which will be charged
      • maybe it gets charged for everything beyond that point in the crank (including other turns)
      • or maybe it gets charged for anything beyond the initial "small" unit, with the original Meter being charged for the initial unit
      • if the Meter has an insufficient balance to support the requested extension, the call should fail
        • vat code should not allow any other turns to queue up before making the switch, as the limit in effect is temporally scoped: confined to the crank but not to the turn
  • the expected pattern is that the "small" unit is sufficient for every method exposed to the outside world to either finish their business or decide about getting more time, no matter what arguments they receive
  • defensive consistency (or maybe "defensive progress"?) means that one customer should not be able to kill a vat by sending it messages that are unexpectedly expensive to execute
    • I think that means we can't afford to ever have a metering fault, because that could leave the vat in an inconsistent state. Death before confusion!
      • any actual metering fault should terminate the vat
      • but a fallback meter could be invoked safely
      • if you expose an object whose execution could possibly cause a metering fault, you've given the caller life-or-death control over the entire vat
    • it is safe to look at the arguments of a message and decide to refuse to process it, if that decision can take place within the "small" limit, or if we first switch to a meter which we're sure will be enough to make that decision
    • we can look at the arguments and then switch to a specific meter, to charge different users differently, but if there's any chance that the processing will exceed the capacity of a "large", we must not process the message, for fear of killing the vat and thus failing to serve other (well-behaving) customers
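@FUDCo's deduct-then-refund option from the list above can be illustrated with a little arithmetic (the refund rate here is a made-up parameter, not a decided value). The claimed amount is deducted up front; after the crank, the unused portion comes back at a discount, so overestimating is expensive but not catastrophic.

```typescript
// Sketch of deduct-up-front, refund-unused-at-a-discount.
const REFUND_RATE = 0.5; // assumption: half of unused computrons refunded

// Net charge for a crank that claimed `claimed` computrons and
// actually used `used` of them.
function netCharge(claimed: number, used: number): number {
  const unused = Math.max(0, claimed - used);
  return claimed - unused * REFUND_RATE;
}
```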

Price Selection

The economics of price selection are still an open question.

  • the winning bid might pay their bid price, the next-lower bid price (a second-price auction), or we might find the full set of cranks that will fit within a block and then charge all of them the price bid by the first one that didn't fit (the "block clearing price")
  • however, there is an advantage to being first in a block, because your changes happen before anyone else's do, and that advantage is worth paying for
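The "block clearing price" option can be sketched as follows (invented names): walk the bids from highest to lowest, admitting cranks until their claimed computron limits would exceed the block limit; every admitted crank then pays the bid of the first crank that did not fit.

```typescript
// Sketch of the block clearing price computation.
interface Bid {
  price: number;
  claimedComputrons: number;
}

// `bids` must be sorted by price, highest first.
function clearingPrice(bids: Bid[], blockLimit: number): number | undefined {
  let total = 0;
  for (const b of bids) {
    total += b.claimedComputrons;
    if (total > blockLimit) return b.price; // first bid that didn't fit
  }
  return undefined; // everything fit; no marginal bid to set the price
}
```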

Authority Layout

Each Presence is mapped to a Meter.

Each Meter is mapped to a 2-tuple of (Escalator, Initial Bid).

Each Escalator is defined by a 2-tuple of Slope (price increase per unit time/crank/usage/not-sure) and Maximum Initial Bid. By limiting the Meter's ability to choose an infinite Initial Bid, all messages will eventually reach the top (the "fairness" property). A reasonable starting point would be three escalators, whose slopes are 1, 100, and 10000.

Each message is sent to a specific Presence, which causes it to be added to an Escalator at a particular starting price. Its price then climbs over time.

The actual implementation will have a separate priority queue for each Escalator, rather than tracking every message individually.
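The layout above, written out as types (a sketch for orientation, not an API proposal; names invented). The cap on the initial bid is what guarantees the fairness property.

```typescript
// Presence -> Meter -> Escalator, with the fairness-preserving cap.
interface Escalator {
  slope: number;         // price increase per unit time/crank/usage
  maxInitialBid: number; // cap that guarantees fairness
}

interface Meter {
  escalator: Escalator;
  initialBid: number; // must not exceed escalator.maxInitialBid
}

interface Presence {
  meter: Meter; // many Presences may share one Meter
}

function makeMeter(escalator: Escalator, initialBid: number): Meter {
  if (initialBid > escalator.maxInitialBid) {
    throw new RangeError("initial bid exceeds the escalator's cap");
  }
  return { escalator, initialBid };
}
```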

Meter Balance limits the Bid

Each message will climb an Escalator until its price is the highest, then it gets delivered. However, the Meter associated with that message may not have the funds to pay for the claimed price.

One option is to freeze that bid at the Meter's balance. However this would probably break the O(numEscalators) processing algorithm, making it more expensive to track everything.

Another option is to invoke that Meter's "Keeper" and let it decide whether to supply additional funds, switch to a different meter, or abandon the message.

Facets for different Meters on the same Remotable

There are many advantages to baking the Meter into the Presence, however two drawbacks came up:

  • it becomes difficult to refer to the same Remotable with multiple priorities
  • intermediate requests happen at the priority of the provider of the intermediate Presence, even if the overall operation is running at a different priority

The example we came up with was a high-priority contract message that pays to get to the top of the queue, executes, and then discovers that it needs a Brand verification or balance lookup to continue. If the contract has an Issuer Presence which has the low- or medium- priority escalator baked into it, the intermediate message will take a long time, slowing down the overall operation despite the user's expressed willingness to pay for high priority.

If the contract could make a particular message use a particular Escalator, it could submit the Brand check at high-priority.

If it gets multiple Presences for the Issuer, each with a different escalator, it could submit the check via the one that matches the priority level. But, in that case, it would have three different Issuer Presences, which wouldn't compare as EQ, which is necessary for other reasons.

@erights suggested that we need to find an object-capability way to express the combination of priority and target. I don't know how to do that (and still maintain some useful form of EQ), but it sounds like the right goal to me, especially when you consider that being able to use a particular priority is an authority all by itself, and should be something that can be shared/attenuated through our usual patterns. Maybe something where there's a primary Issuer Presence (which you use for EQ), and a bunch of priority-bearing facets (which you use for messages), and the primary will tell you if an alleged priority facet is legitimate (so the primary and the facets have the same relationship as an Issuer and its Purses).

What to Charge For

When a message makes it to the top of the queue and gets delivered, it had a winning price of some sort. It will have some maximum usage limit (perhaps always just one "small" unit, but extensible if the vat requests it). There will be some actual amount used, which we measure to accumulate and size the block.

What should we deduct from the message's Meter?

  • We could charge them immediately for their bid: payment for priority, independent of payment for actual execution. Your bid buys you entrance to the office, but you have to pay something else for each minute you spend in it.
  • Or we might treat the bid as a price-per-usage, and multiply it by some measure of the CPU usage to figure out how much to charge. Ethereum does something similar to this: each txn has a gas-price and a max-gas, and the miner's reward is gas-price times used-gas.

We might charge them the full usage limit (because they had the ability to consume it all, before anyone else, and that's a scarce resource). We might charge them only for what they actually used, because the remaining time can still be used by others. We might charge them for the full limit but then buy back the remaining time at a discount. Or we might just charge a fixed amount and leave the economics to the priority payment.
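The policy options above can be compared side by side with a toy calculation. These names and formulas are illustrative only (nothing here is kernel API); `bid` is the winning escalator price, and `limit`/`used` are abstract computation units.

```javascript
// Toy comparison of the charging policies discussed above.
const chargePolicies = {
  // Pay the bid itself: priority is bought separately from execution.
  flatBid: ({ bid }) => bid,
  // Ethereum-style: price-per-usage times actual usage (gasPrice * gasUsed).
  perUsage: ({ bid, used }) => bid * used,
  // Pay for the whole reservation, since nobody else could use it.
  fullLimit: ({ bid, limit }) => bid * limit,
  // Pay for the full limit, but buy back unused time at a discount.
  buyback: ({ bid, limit, used, discount = 0.5 }) =>
    bid * used + bid * (limit - used) * discount,
};
```

For a winning bid of 2, a limit of 15 units, and 10 units actually used, these policies charge 2, 20, 30, and 25 respectively, which makes the incentive differences easy to see.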

The core question is: what does the bid buy you? We know it should get you at least one "small" unit of computation, and it makes sense to allow up to one "large" unit if the vat requests it right away. We know the vat shouldn't get to run forever (starving out everyone else), so there must be an upper limit to what it can get without submitting a new message and waiting for it to get back up to the top.

If the vat can buy a "large" unit at the start of the crank, it wouldn't be any less fair to allow it to instead submit several "small" units to other vats (up to large/small of them), which run immediately instead of starting at the bottom of an escalator. If we were managing a marketplace of fixed-size all-or-nothing timeslices, anyone who bought a block of time could trivially subdivide it and sell off the pieces separately, which feels like a similar situation.

End-to-End Story

I wrote up a set of questions for us to figure out:

  • How does a user prioritize their immediate economic goal?

Buying a house might be more important to you than buying a cryptokitty is to someone else. We can imagine the UI having a slider (like many BTC/ETH wallets) to increase priority and fees. If each Presence is bound to a Meter, the UI will implement this by going to the normal-priority endpoint first (taking however long it takes) to get an Invitation that is bound to a specific escalator. Everything that happens after that, via the Invitation Presence, runs at the requested priority.

  • How does that affect the message sent by the wallet ag-solo?

The message would be normal, but the target to which it is sent would be bound to a high-priority Meter.

  • How do secondary/subcontracting messages get the same priority? Promise resolutions need priority too.

The contract that created the high-priority Invitation would be responsible for acquiring high-priority Presences for anything else it needs to do the requested job at the right priority level: Issuers, Purses, other contracts, etc.

I don't yet know how promise resolutions should work. The easiest answer is that all resolutions sent during a crank use the same Meter that the crank itself used, which means they'll be enqueued at the bottom of that Escalator and wait their turn on that (faster) escalator.

  • How does a vat bind a Meter to a particular Escalator?

Maybe something like:

const meter2 = vatPowers.makeMeter({ escalator: 'fast', initialBid: 12 });
meter2~.deposit(meter1.getPayment(45));

  • How does a vat create a Presence bound to a particular Meter?

Maybe:

return Far(iface, behavior, { meter: meter2 });
  • Does the contract/recipient pay for both execution and priority?

Yes.

  • Does the sender pay for anything?

No, not directly; however, they might have to buy the high-priority Presence ahead of time, which will cost them whatever the Meter holder wants to charge.

  • How should the claimed CPU usage be expressed?

The sender does not influence this, nor is there anything associated with the message, Remotable, or Meter to suggest how long a message might take to execute. All messages get one "small" unit, but the receiver's code can ask for more, up to some limit, after the crank begins.
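The "one small unit by default, more on request, up to a cap" rule might look roughly like this. Everything here is a hypothetical sketch (the constants and `makeCrankBudget` are made up), assuming a large unit is worth 10 small units:

```javascript
// Hypothetical per-crank budget: starts at one "small" unit; the running
// vat may ask for more, but never beyond one "large" unit, so it cannot
// starve everyone else.
const SMALL = 1;
const LARGE = 10; // assumption: one large unit = 10 small units
const makeCrankBudget = () => {
  let limit = SMALL;
  return {
    requestMore: units => {
      limit = Math.min(LARGE, limit + units); // capped at one large unit
      return limit;
    },
    getLimit: () => limit,
  };
};
```

The cap is what keeps the receiver's ability to extend its own limit from turning into an unbounded claim on the block.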

DoS Resistance Analysis

We'll need to do a careful analysis of how our approach mitigates spam attacks. We'll rely upon something at the Cosmos level to prevent attackers from introducing an unbounded number of messages into the kernel from the outside (messages aimed at the comms vat, which will parse their contents and emit syscalls with various Meters and Escalators). That layer should be safe if each such message costs some execution tokens to get put into a block. The next-earlier resource to protect is the mempool, which we might handle with @michaelfig 's N-per-sender refundable tickets idea, which limits the mempool consumption to N * numAccounts. A still-earlier resource to protect is the node's ability to examine these tickets, for which the "sentry architecture" seems like the right approach.

@warner
warner commented May 28, 2021

Per-Presence "Static" Escalator Assignment

@dtribble and I brainstormed today about an approach in which:

  • each run-queue item is bound to a specific meter/escalator/scheduling-policy (abbreviated "scheduling" here), as usual
  • each kernel object is bound to a specific scheduling
    • promises are not
  • vats can make a new syscall.bindMeter(vref, metercap) to pre-declare a scheduling for an object that's about to be exported in some other syscall
    • we treat the bindMeter call as a form of export (so any auxdata would need to be supplied at that time)
    • (there's a GC hiccup here, after the vat introduces the vref with bindMeter, but before it's been used in an outgoing message, the refcount is zero, and the kernel might mark it for deletion. However, it's still "recognizable", so the identity will be retained, along with the kernel object table entry, which is where the meter kref will be stored, so that's not really a problem. And anyways the kernel won't actually tell the vat to drop anything until end-of-crank, at which point the vref should be pinned down by an actual message on the run-queue)
  • each syscall.send looks up the scheduling of the target kref, and uses that to schedule the message run-queue entry
  • during each crank, the kernel remembers the scheduling of the run-queue item that provoked it
    • any newly-exported kernel objects inherit this scheduling, unless overridden by syscall.bindMeter
    • any newly-exported promise objects do not, since promises aren't bound to a scheduling
    • any syscall.resolves that happen during the crank inherit this scheduling for their notify run-queue entry
  • liveslots would represent the scheduling to userspace somehow, maybe a Meter object, and a one-time meter.bind(remotable) would assign a vref and do the syscall.bindMeter
    • we'd make this work for Representatives too: the vref-to-meter binding would be stored on disk

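The inheritance rules above can be captured in a toy model. None of this is real SwingSet API (`bindMeter`, `startCrank`, and friends are stand-ins); it just demonstrates the bookkeeping: sends use the target's scheduling, newly exported objects inherit the current crank's scheduling unless pre-bound, and resolves always inherit the crank's scheduling.

```javascript
// Toy model of per-object scheduling assignment and inheritance.
const makeSchedulerModel = () => {
  const objectScheduling = new Map(); // kref -> scheduling
  let currentCrank = null; // scheduling of the run-queue item being processed
  return {
    // syscall.bindMeter analogue: pre-declare a scheduling for an object.
    bindMeter: (kref, scheduling) => objectScheduling.set(kref, scheduling),
    // The kernel remembers the scheduling of the item that provoked the crank.
    startCrank: scheduling => { currentCrank = scheduling; },
    // syscall.send: the run-queue entry uses the *target's* scheduling.
    send: targetKref => objectScheduling.get(targetKref),
    // Newly exported objects inherit the crank's scheduling unless pre-bound.
    exportObject: kref => {
      if (!objectScheduling.has(kref)) {
        objectScheduling.set(kref, currentCrank);
      }
      return objectScheduling.get(kref);
    },
    // syscall.resolve: notify entries inherit the crank's scheduling
    // (promises themselves are never bound to a scheduling).
    resolve: () => currentCrank,
  };
};
```

Running a "slow" crank that touches a pre-bound "fast" object shows the contagion property: work flows at the crank's priority except where bindMeter explicitly overrode it.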
I can imagine building that system into the kernel and liveslots. I suspect (but haven't fully thought through) that it would satisfy the "priority contagion" property that we need: most of the work done on behalf of a high-priority starting message will also be performed at high priority. One sharp edge to watch out for is any pre-existing Presence from some other vat, which will be on its own pre-selected scheduling.

We tried to walk through how this would work from the user's point of view, and I don't think we reached a satisfactory picture. The starting point is an "Enable Premium Service" switch, which sends a normal-priority message to the dapp vat, asking for a high-priority facet (and establishing payment somehow). The dapp vat would bind this facet to the higher-priority scheduling and return it to the user's wallet. Then, later, when the user initiates an action (e.g. an AMM swap), they'd have a "Boost" button that causes their request to use the high-priority facet rather than the standard one.

The question is: what happens next?

  • The first AMM swap message is a request (to the dapp vat) for an invitation. This uses the priority scheduling, and the syscall.resolve(resultPromise, invitation) it makes in that same crank will inherit that priority.
  • The client receives the invitation, and sends a message to Zoe containing both the invitation and an Offer. We would need a way for the client to already have a similar-priority Zoe facet with which to send this message, otherwise this phase of their request will be stuck in the normal-priority queue.
  • When Zoe receives that message, it will send a message to a contract vat. That needs to be running on a high-priority queue as well.

So, I like the simplification this brings, and I think I know how to build it on the kernel side, but I don't yet see how we could use it successfully at the next higher layer. Still more to think about.

Labels
needs-design SwingSet package: SwingSet