Independent update schedules for different platforms #35

Open
Tracked by #36
berekuk opened this issue Mar 29, 2022 · 4 comments

Comments

@berekuk
Collaborator

berekuk commented Mar 29, 2022

What I have in mind here: let's keep a platform_status DB table with a timestamp of the last update for each platform; have the scheduler ask every platform to update every minute; and skip the update for a platform if its data is fresh enough (each platform's code can define "fresh enough" separately).

This would allow us to e.g. update polymarket once per day and metaculus every 30 minutes, or vice versa.
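A minimal sketch of the freshness check described above, assuming a hypothetical `PlatformStatus` row shape and a hypothetical `maxAgeSeconds` map (neither is existing project code). The per-platform values only mirror the "once per day" and "every 30 minutes" example:

```typescript
// Hypothetical shape of one row in the proposed platform_status table.
interface PlatformStatus {
  platform: string;
  lastUpdatedAt: Date | null; // null if the platform has never been fetched
}

// Hypothetical per-platform definition of "fresh enough", in seconds.
const maxAgeSeconds: Record<string, number> = {
  polymarket: 24 * 3600, // once per day
  metaculus: 30 * 60, // every 30 minutes
};

// Returns true when the stored data is recent enough to skip this update.
const isFreshEnough = (status: PlatformStatus, now: Date): boolean => {
  if (status.lastUpdatedAt === null) return false; // never fetched: always stale
  const maxAge = maxAgeSeconds[status.platform];
  if (maxAge === undefined) return false; // unknown platform: treat as stale
  const ageSeconds = (now.getTime() - status.lastUpdatedAt.getTime()) / 1000;
  return ageSeconds < maxAge;
};
```

With a check like this, the scheduler can run as often as it likes; the cost of a skipped platform is one timestamp comparison.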

It would also let us force an update from the web UI: set a "needs-to-be-updated" flag on a platform with a single button click, without worrying about Heroku.

It would also give us better observability: it'd be easy to build a (secret? login-only?) page listing all platforms and their current update status. We could also log the exception if a platform update fails, store it in the same table, and display it on that web page.
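The forced-update flag and the stored error could live in the same row. The sketch below is only an illustration: `PlatformStatusRow`, `requestUpdate`, `recordSuccess`, `recordFailure` and `statusLine` are hypothetical names, and the functions work on plain objects rather than a real DB table.

```typescript
// Hypothetical row shape: last success, a manual "update now" flag, last error.
interface PlatformStatusRow {
  platform: string;
  lastSuccessAt: Date | null;
  needsUpdate: boolean; // set by the button in the web UI
  lastError: string | null; // message of the most recent failed update
}

// What the web UI button would do.
const requestUpdate = (row: PlatformStatusRow): PlatformStatusRow => ({
  ...row,
  needsUpdate: true,
});

// A successful update clears both the flag and the stored error.
const recordSuccess = (row: PlatformStatusRow, now: Date): PlatformStatusRow => ({
  ...row,
  lastSuccessAt: now,
  needsUpdate: false,
  lastError: null,
});

// A failed update stores the error and leaves the flag set so it is retried.
const recordFailure = (row: PlatformStatusRow, error: unknown): PlatformStatusRow => ({
  ...row,
  lastError: error instanceof Error ? error.message : String(error),
});

// One line per platform for the status page.
const statusLine = (row: PlatformStatusRow): string => {
  const last = row.lastSuccessAt === null ? "never" : row.lastSuccessAt.toISOString();
  const notes = [
    row.needsUpdate ? "update requested" : null,
    row.lastError !== null ? `error: ${row.lastError}` : null,
  ].filter((part): part is string => part !== null);
  return [`${row.platform}: last success ${last}`, ...notes].join("; ");
};
```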

@NunoSempere
Collaborator

Why every minute?

@berekuk
Collaborator Author

berekuk commented Mar 29, 2022

That's the easiest way to get periodic jobs running every N hours, automatic retries, and custom business logic for "should we run this code now?", all managed by configuration in code and by DB state, instead of manually configuring Heroku jobs or managing crontab files.

Something like:

// polymarket.ts
export const polymarket: Platform = {
  name: 'polymarket',
  fetcher: ...,
  period: 8 * 3600, // seconds after success
  retryPolicy: {
    minDelay: 3600, // seconds after failure
    // after a few failures we won't spam polymarket with requests too often
    doubleDelayOnFailureUntil: 24 * 3600,
  },
};

// index.ts
export const fetchPlatformIfNeeded = (platform: Platform) => {
  const needsRefetch = ...; // based on db state, `period` and `retryPolicy` options
  if (needsRefetch) fetchPlatform(platform);
}

...And then we call platforms.map(fetchPlatformIfNeeded) every minute, with some DB locks per platform (or some other way to avoid fetching the same platform twice in parallel) if we ever get to multiple workers.

I might be overcomplicating the options just to show an idea, but it's nice to have some room for further flexibility.
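One possible reading of the `period` and `retryPolicy` options above, as a sketch rather than the intended implementation. `FetchState`, `retryDelaySeconds` and `shouldRefetch` are hypothetical names, and the doubling rule is an assumption: the delay starts at `minDelay` after the first failure, doubles with each further consecutive failure, and is capped at `doubleDelayOnFailureUntil`.

```typescript
interface RetryPolicy {
  minDelay: number; // seconds to wait after the first failure
  doubleDelayOnFailureUntil: number; // upper bound on the doubled delay, seconds
}

interface Schedule {
  period: number; // seconds to wait after a success
  retryPolicy: RetryPolicy;
}

// Hypothetical per-platform state, as it might be read from the DB.
interface FetchState {
  lastAttemptAt: Date | null;
  lastAttemptSucceeded: boolean;
  consecutiveFailures: number;
}

// Assumed rule: minDelay, then doubling per consecutive failure, up to the cap.
const retryDelaySeconds = (policy: RetryPolicy, consecutiveFailures: number): number => {
  const doublings = Math.max(0, consecutiveFailures - 1);
  return Math.min(policy.minDelay * Math.pow(2, doublings), policy.doubleDelayOnFailureUntil);
};

const shouldRefetch = (schedule: Schedule, state: FetchState, now: Date): boolean => {
  if (state.lastAttemptAt === null) return true; // never attempted
  const elapsedSeconds = (now.getTime() - state.lastAttemptAt.getTime()) / 1000;
  const waitSeconds = state.lastAttemptSucceeded
    ? schedule.period
    : retryDelaySeconds(schedule.retryPolicy, state.consecutiveFailures);
  return elapsedSeconds >= waitSeconds;
};
```

With the polymarket values above (`minDelay: 3600`, cap `24 * 3600`), this rule would retry after 1, 2, 4, 8 and 16 hours, then once every 24 hours.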

@NunoSempere
Collaborator

Seems a bit hardcore, but also pretty nice :)

@berekuk
Collaborator Author

berekuk commented Mar 29, 2022

Well, we could also do a long-running server which calculates what to run and manages the job queue as necessary. That's less hackish than "try everything every minute", but harder to implement properly and more fragile (unless there's a good Node library for that; for Python there's apscheduler, which is comprehensive and works well; I'm not sure if there's anything similar in the Node world, but I'll look around).

Both of these approaches would be costly on Heroku, but this is a task for later and we might leave Heroku by then. Also, we can start with "every hour" instead of "every minute".
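For comparison, the core of the long-running alternative is deciding which job is due next and how long to sleep until then. A rough sketch, with hypothetical names (`ScheduledJob`, `pickNextJob`, `msUntil`) and no claim about how a real scheduler library does it:

```typescript
// Hypothetical in-memory view of when each platform is next due.
interface ScheduledJob {
  platform: string;
  dueAt: Date;
}

// Returns the job with the earliest due time, or null if there are no jobs.
const pickNextJob = (jobs: ScheduledJob[]): ScheduledJob | null =>
  jobs.reduce<ScheduledJob | null>(
    (soonest, job) =>
      soonest === null || job.dueAt.getTime() < soonest.dueAt.getTime() ? job : soonest,
    null
  );

// Milliseconds to sleep before running the job; 0 if it is already overdue.
const msUntil = (job: ScheduledJob, now: Date): number =>
  Math.max(0, job.dueAt.getTime() - now.getTime());
```

A long-running process would pass `msUntil(...)` to `setTimeout`, run the job, recompute its `dueAt`, and repeat. The fragility mentioned above comes from everything this sketch leaves out: restarts, crashes mid-job, and keeping the in-memory queue in sync with the DB.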


2 participants