Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISO8601 Support #13

Merged
merged 35 commits into from
Aug 24, 2020
Merged

ISO8601 Support #13

merged 35 commits into from
Aug 24, 2020

Conversation

tooolbox
Copy link
Contributor

As per #2

Depends on #11

Not done yet obviously, but since we're on different timezones if you could start reviewing and asking any questions, I'd appreciate it.

I'm having to change a lot more than I anticipated, but the subject is tricky. For example:

  • I could do the ISO8601 math inside the database, but...seems crazy. So, I have to select UTC_TIMESTAMP(), do the math and put it back in.
  • You have this randomization feature (I don't particularly need it, but I respect that you do). Unfortunately it doesn't work well with an ISO8601 interval because the actual, hard duration (number of seconds) can only be computed with a start time, which is only known to the database. I solved it by moving the randomization calculation into the table package.

Anyway, here be dragons, but I hope to wrap this up tomorrow.

@cenkalti
Copy link
Owner

Getting current time from database, calculating in program code, then updating the record is going to add one more round trip to each job execution, hence reduce throughput but I am not sure how much the impact going to be. I didn't do any benchmarks on Dalga before. I think this is not important anymore because more Dalga instances may be added to mitigate this.

@tooolbox tooolbox marked this pull request as ready for review August 21, 2020 02:01
@tooolbox
Copy link
Contributor Author

Getting current time from database, calculating in program code, then updating the record is going to add one more round trip to each job execution, hence reduce throughput but I am not sure how much the impact going to be. I didn't do any benchmarks on Dalga before. I think this is not important anymore because more Dalga instances may be added to mitigate this.

Got it, yeah I agree it will reduce performance a little but should be fine with horizontal scaling.


Okay, I'm pretty much happy with this, here's a few specific points:

  • Has support for ISO8601 intervals, as gone over.
  • Has a battery of unit tests to ensure correct scheduling behavior.
  • Added a ScanFrequency setting which controls how often the Dalga instance checks for its next job. This was mainly to speed up the test suite, but could have some real-life use.
  • Exposed UseClock on a Dalga instance that lets you control its mocked time. This will be useful for integrations tests; I'm not sure if returning something from an internal package poses a problem, but I'll cross that bridge when I come to it.
  • Added a FixedIntervals setting which calculates next_run based on the previous value of next_run rather than the current point in time. This will prevent long job runs from causing schedule drift. It's also meant to prevent retries from causing schedule drift, but I realize to accomplish that, I will need to refactor how retries are handled—oh well, that's coming up next.

Daylight Savings

(This point is rather involved so it gets its own heading.)

As a side effect of using UTC time for everything, ISO8601 intervals may not behave exactly as intended. Specifically, there will be no recognition or compensation for daylight savings.

For example, let's say I schedule a job at 3pm with interval P1D. I expect it to fire at 3pm every day, even if daylight savings comes and goes. In contrast, an interval of PT24H would mean that, when daylight savings started, the job would fire at 4pm.

However, Dalga uses UTC for all of its time tracking! This means that it's unaware of what timezone it's in, and when daylight savings occurs—it doesn't care. So P1D is equivalent to PT24H.

Now, this isn't wrong, it's just...different than I first expected. I had to adjust my battery of tests to compensate for this behavior, but as I think about it, it seems fine, and possibly even superior. For example, here's what Google has to say about DST in the Cloud Scheduler docs:

Daylight Savings Time

For some time zones, day light savings time can cause jobs to run outside of cadences you may expect. This is because the Cloud Scheduler runs on wall clock time. In instances where a time can occur twice (such as the case when clocks spring backwards) and your job is scheduled at this time, your scheduled job may observe execution anomalies.

If your job requires a very specific cadence, you may want to consider choosing time zones that do not observe daylight savings time. Specifically, UTC is recommended for Cloud Scheduler to avoid the problem completely.

Now, this isn't exactly the same scenario since we're not running on wall clock time, but it illustrates the fact that timezones are generally problematic.

...As I continue to think about it, one thing we could do is give each job an optional timezone. When we SELECT UTC_TIMESTAMP() from the database, we could then convert to that timezone before applying the ISO8601 duration. It wouldn't even have the downsides of Cloud Scheduler, since we're not running on wall clock time (we're just using a "wall clock" to compute a distance in time and then saving that instant for later comparison).

Questions, comments? If you like everything so far, it can be merged, or you can wait for me to add the timezone stuff.

@tooolbox
Copy link
Contributor Author

(As a note, I will likely ask to change the schema at least once more in this series of PRs. Food for thought in terms of when you want to merge things or do version bumps, etc.)

@tooolbox
Copy link
Contributor Author

@cenkalti any thoughts you have would be appreciated. In the meantime, I'm going to add timezone support.

@tooolbox
Copy link
Contributor Author

You can now provide a *time.Location and Dalga will compute the ISO8601 intervals within that location. If no location is provided, it defaults to UTC. Run times are still stored in the database in UTC.

Note that this is different than providing a start time; the start time may have a time offset associated with it, but not a location.

@tooolbox tooolbox mentioned this pull request Aug 23, 2020
@@ -8,7 +8,8 @@ import (

var DefaultConfig = Config{
Jobs: jobsConfig{
RetryInterval: 60,
RetryInterval: "PT1M",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to change RetryInterval to ISO duration? If not, I would like to keep it as seconds for simplicity. A user can start to use Dalga without knowing about ISO durations?

While we are doing a major version bump, I would also like to change the config file format to TOML, it is more common nowadays. github.com/BurntSushi/toml library has also ability to unmarshal time.Duration values. Do you have any comments on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to change RetryInterval to ISO duration?

It had something to do with how the data was being passed and compared between functions and so on. Ah yes, I was changing Table.UpdateNextRun() to take an ISO8601 duration, but I found that the same code path was used for a successful and a failed execution, so RetryInterval would need to feed into the same argument.

It just seemed silly to have Table.UpdateNextRun() have both an ISO8601 and regular time.Duration and use one or the other. I suppose I could do a conversion from time.Duration into ISO8601 before passing it in?

A user can start to use Dalga without knowing about ISO durations?

Separate from the above, I'm not sure I understand. Won't he need to know ISO8601 durations now because that's how you schedule regular intervals? Unless you're thinking of one-off jobs specifically.

While we are doing a major version bump, I would also like to change the config file format to TOML, it is more common nowadays. github.com/BurntSushi/toml library has also ability to unmarshal time.Duration values. Do you have any comments on this?

Heh, personally I don't like TOML. Every time I go to read up on the documentation, I get to the part of how you can use [[something]] or something.something and they're basically the same except they're not, and I get confused and abandon the attempt.

I am a YAML fan, if you're into that. I find it's intuitive and easy to use that config format, as long as I'm not the one writing the parser for it 😉

I have used spf13/viper pretty extensively in the past and like it, although it has a lot of dependencies. I would probably only go that way if you considered switching to spf13/cobra for the CLI framework, since they integrate very well. But then, I have seen very minimal dependencies and mostly stdlib in Dalga so I doubt you want to go that route.

Something like knadh/koanf might do, or even just a straight YAML parser. But that depends on how you feel about YAML. I am also a fan of HJSON 😄

dalga.go Show resolved Hide resolved
@@ -35,7 +39,8 @@ func (t *Table) Create(ctx context.Context) error {
"CREATE TABLE `%s` (" +
" `path` VARCHAR(255) NOT NULL," +
" `body` VARCHAR(255) NOT NULL," +
" `interval` INT UNSIGNED NOT NULL," +
" `interval` VARCHAR(255) NOT NULL," +
" `location` VARCHAR(255) NOT NULL," +
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we allow NULL values and interpret them as UTC?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you're aware, but empty strings are currently interpreted as UTC. I can switch it to NULL if you prefer? Please confirm and I'll make the change.

Copy link
Owner

@cenkalti cenkalti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tooolbox Sorry, I was away from computer this weekand. Thank you for this PR. I have a couple questions. We can merge this afterwards.

@tooolbox tooolbox mentioned this pull request Aug 23, 2020
@tooolbox
Copy link
Contributor Author

No problem!

Looks like I have a little more work to do in general, we can finish hashing out exactly what's left to tweak and I will tackle it at some point over the next couple of days as my schedule allows.

Cheers!

@cenkalti
Copy link
Owner

@tooolbox I see nothing major in the PR. I am going to merge this to speed up progress and help you with #7 and #14. We can discuss minor details on a separate issue after your work is finished. Great work, thank you!

@cenkalti cenkalti merged commit 1486045 into cenkalti:master Aug 24, 2020
@tooolbox
Copy link
Contributor Author

Awesome! You're very welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants