Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement SandstormApi.schedulePeriodic() #2922

Merged
merged 3 commits into from
Jan 19, 2020

Conversation

dwrensha
Copy link
Collaborator

No description provided.

@ocdtrekkie
Copy link
Collaborator

i,i @dwrensha is a hero among men.

@kentonv
Copy link
Member

kentonv commented Apr 29, 2017

Cool stuff!

It's interesting that the two-phase commit problem comes up here. I wonder if we really ought to generalize it to all persistent capabilities. However, last time I thought about this problem, I concluded that forcing developers to understand two-phase commit was a big burden that wouldn't bring much benefit 99% of the time. In most cases, if the app loses a token, it doesn't really matter. The capability can just stick around, or it can be cleaned up when the grain is deleted, or the user can clean it up manually.

It does seem like it matters in this case, though.

Maybe apps should be able to opt-in to two-phase commit. SandstormApi.save() could take an extra boolean parameter requireConfirmation. If true, then the token must be passed to SandstormApi.confirm() after it has been persisted, otherwise Sandstorm will automatically drop it after some timeout.

The idea is that a grain would:

  1. Call save().
  2. Store the token to its database, flagged as "not confirmed".
  3. Call confirm().
  4. Update its database to mark the token "confirmed".

In case the grain crashes between save() and confirm(), on every startup it checks its database for unconfirmed tokens. If it finds any, it must either retry confirm() on them or call drop(). This ensures consistency between Sandstorm and the grain.

Thoughts?

confirm @0 ();
# Activates the job. This method only has an effect the first time it is called.

cancel @1 ();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a difference between this and drop()ing the token?

@dwrensha
Copy link
Collaborator Author

We only need two-phase-commit for save() if we attach special semantics to save() beyond its usual semantics.

Let's suppose we go with the original proposal where schedulePeriodic() returns a bare Handle. This kind of API pattern is common in the non-persistent case, e.g. for the streaming http methods, so it's tempting to try to make it work here. I can imagine a few different approaches for how we might decide whether the job is still active.

  1. The job is active if and only if there exist any liverefs or sturdyrefs to it. This sounds quite natural, but how do we keep track of the liverefs? We could put some logic in a node-capnp close() method, but that would not be robust to frontend crashes.

  2. The job is active if and only if there exist any sturdyrefs to it, regardless of whether there are any liverefs. We could implement this by adding some hooks in the save() and drop() logic. The problem is now that save() now actually means "save and confirm", which I imagine could be confusing. Moreover, the analogy to the non-persistent API pattern that we were mimicking has become rather weak. What happens if we drop all sturdyrefs, but hold on to a liveref, and then re-save the liveref? Does this mean that we can cancel and then un-cancel the job?

My proposal is to split these "save" and "confirm" steps into separate methods, avoiding the need to implement special hooks and keeping a clear conceptual distinction between the effects that are happening.

@kentonv
Copy link
Member

kentonv commented Apr 30, 2017

I think option 1 is the ideal behavior. In the long run, we actually want all restore()s of the same ref to end up returning a capability to the very same object, so eventually we need some way for the frontends to achieve consensus on who is running what.

In the shorter term, a hack we've already used is: "If the handle's original liveref is dropped without ever having been saved, it is canceled immediately. Once it is saved, it is canceled when the last studyref is dropped, even if liverefs still exist." In practice this probably works in the vast majority of cases. Check out the isSaved() method on PersistentImpl.

For the case of periodic jobs, I think it's fine to implement them in such a way that they don't actually fire unless they've been saved. In theory they should fire even when it's just a liveref, but I doubt anyone will depend on that...

@kentonv
Copy link
Member

kentonv commented May 20, 2017

@dwrensha, do you agree with the suggestions? Are you planning to update this at some point? Or would you like me to take it over?

@dwrensha
Copy link
Collaborator Author

Option 1 may be ideal, but it seems like we're a long way off from things working that way, and I have trouble imagining how the implementation would actually work. You're suggesting that we implement something more like option 2 for now, with the intention of eventually making it work like option 1. My fear is that "eventually" might be a very long way off, and meanwhile we'll be stuck with weird semantics.

Feel free to take over on this pull request. After running into #2925, I have no pressing need for this feature.

@ocdtrekkie
Copy link
Collaborator

As a note, I have a strong desire for this feature, specifically because I'd like to see TinyTinyRSS able to regularly update its feeds. It is also the highest dollar-value bounty on the board at present ($400).

@zenhack
Copy link
Collaborator

zenhack commented Dec 3, 2019

I think I want to take this on.

Regarding the issues around two phase commit: I think a good way to solve them would be to tweak the API as follows:

  • Get rid of ScheduledJob entirely. scheduleAt and schedulePeriodic would just not return a value here.
  • Have Runnable.run return a value indicating whether or not the job should continue to be run periodically. If this value indicates it should not, Sandstorm will remove it from the set of scheduled tasks. Given that this makes it more specific to the notion of scheduled tasks, It should probably be renamed accordingly.
  • When a job comes up for running, If restore()ing it fails, remove it from the set of scheduled jobs. This way, if the app needs to record something locally to know what to do with it, and the app fails to actually save that information, then when the task is run this will be discovered and the task will be removed.

Thoughts?

I also still think exposing the API via the powerbox, rather than directly on SandstormApi, is a good idea.

I'll take a whack at the above approach if I don't hear other suggestions.

@zenhack
Copy link
Collaborator

zenhack commented Dec 9, 2019

Thinking about the notion of exposing this via the powerbox, I'm reconsidering because the normal powerbox flow involves prompting the user for which thing to supply, but this seems like something weird to ask the user for.

I still like my proposed alternative to requiring two-phase commit as outline above, but for it to be practically useful we'd have to also address #3027, as the design requires the app to implement a persistent capability.

@zenhack
Copy link
Collaborator

zenhack commented Dec 9, 2019

@ocdtrekkie, I'd also like to hear your thoughts on the design, esp. since you've expressed how important this functionality is to you.

@ocdtrekkie
Copy link
Collaborator

Well, the app I particularly think of beyond TTRSS would be something like IFTTT. Apps are going to have different reasons for scheduling, and may schedule different things on different timers. So it makes no sense for the user to be particularly intervening in that?

On the other hand, if we look at any notion of potential feature abuse, I think scheduling has potentially a lot of it. Especially given a grain booting up moves it to the top of the list. I can imagine a scenario where an app update adds a periodic self-bump that the user never expected or intended. I'm not sure if having to click "Allow this app to wake itself periodically" is a huge burden for the user, and I lean towards saying users should be able to deny apps that ability.

So I guess ideally I think there should be a yes/no permission for apps to be able to use Cron, but given that they should be able to schedule as they please?

@ocdtrekkie
Copy link
Collaborator

I am not particularly versed on the impact of the different implementation choices, so I can really just give you a "I'd imagine it could work like this" opinion.

@zenhack
Copy link
Collaborator

zenhack commented Dec 9, 2019

Maybe the right thing to do is what we currently do with wake locks: The user gets a bell notification telling them the app is backgrounded, and there's a button on it that lets them cancel that. We could trigger a similar notification for scheduled jobs.

Also interested in your thoughts on the proposed API, as someone who's going to be using it.

const minimumSchedulingSlack :Util.DurationInNs = 60000000000;
# 60 seconds: The minimum value allowable for the `slack` parameter passed to `scheduleAt()`.

enum SchedulingPeriod {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strikes me as not flexible enough for some use cases. Why not just provide a numeric duration?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need better precision than this, you should schedule a callback for slightly before the target time, then countdown the remaining time in-process.

I presume this design is mostly large server/Oasis focused, in that you want to avoid the server being forced to run everyone's jobs precisely at midnight or the like. The initial boot of a grain presumably is higher load than it just running, so this staggers when grains start up?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm talking about the restriction to hourly/daily/monthly/annually, not the minimumSchedulingSlack (which I think is fine). I don't think using a numeric duration for the period makes this any harder.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Didn't see the lines below the comment in GitHub. Oops!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while, but I think my goal here was to push apps to specify "rough" schedules rather than exact ones. It seems like most apps really only need rough schedules (like "invoke me once a day, but it doesn't really matter when"). My hope was to give the implementation some freedom to run these jobs at times when resource usage was low, or at least be able to spread out jobs so that it's not running tons of them at once.

If the developer can specify an exact period in seconds rather than use this enum, my worry is that developers will be more likely to assume that the scheduling will be precise. The enum at least hints that it's fuzzy.

@zenhack zenhack mentioned this pull request Dec 9, 2019
@ocdtrekkie ocdtrekkie added the app-platform App/Sandstorm integration features label Jan 15, 2020
@kentonv kentonv merged commit fd3b4d7 into sandstorm-io:master Jan 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app-platform App/Sandstorm integration features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants