Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add last keyword argument to run_repeating #1497

Closed

Conversation

plammens
Copy link
Contributor

See #1345 (reopened on master instead of V12):

Added a last keyword argument to the run_repeating method of job queues that takes in a time specifier (in the same formats supported by first). It specifies the last time the job should be run. Currently, checking if the job should be run or not involves a certain "delay buffer" (so that if last is an exact multiple of interval, the internal delays don't mess up the intended behaviour), but it's a temporary solution. Also, I've made some minor refactorings.

Tests run correctly from my side. I've added one test for the new feature, but maybe a few more would be in place. The docs haven't been updated yet.

Closes #1333.

Copy link
Member

@Bibo-Joshi Bibo-Joshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @plammens!
This surely is a nice feature to have and the code looks good to me. Two thoughts:

  1. I feel like _to_timestamp could be moved to telegram.utils.helpers. What do the other @python-telegram-bot/developers think?
  2. Maybe add tests for the now _to_timestamp, especially if it's moved

And docs should be added, of course

@plammens
Copy link
Contributor Author

plammens commented Sep 2, 2019

@Bibo-Joshi
I agree that the _to_timestamp helper should be moved to the telegram.utils.helpers module. I have been working on that, but I came across some things I'd like to discuss before going forward. There are also some minor things which I will mention now that I've stopped.

  1. There was already a to_timestamp helper in that module. It seems it is exclusively used to convert datetime.datetime objects into integer timestamps to be sent in HTTP requests to Telegram's API.

    • To clear any ambiguity on the term timestamp, I renamed some identifiers in such a way that timestamp means "integer timestamp" (as I think that's what's usually meant when using that word) and explicitly used float_timestamp to refer to timestamps with sub-second precision.

      Namely, I renamed the moved function to to_float_timestamp, and the former _timestamp helper (for the former telegram.utils.helpers.to_timestamp) to _datetime_to_float_timestamp (to explicitly clarify that it only converts datetime.datetime objects). It is now used in the to_float_timestamp helper.

    • The original to_timestamp function's name has been left the same, but its implementation has been redirected through the new to_float_timestamp for better maintainability. Now it also is able to convert any of the other relative time values (e.g. ints or floats as "offset from a particular reference time").

      I don't know if you agree on that: as I said above, this function is only used in the repository to convert from a datetime.datetime to a timestamp, but not any of the other ways to specify time, as in the first argument for telegram.ext.jobqueue.JobQueue.run_repeating. So, should restrict the usage of to_timestamp to datetime.datetime objects? Or is it ok to allow the other types too?


  2. There was inconsistent use of UTC vs local times. For instance, in the former _timestamp helper (now _datetime_to_float_timestamp), assumed that naïve datetime.datetime objects were in the local timezone, while the from_timestamp helper —which I would have thought was the corresponding inverse function— returned naïve objects in UTC.

    This meant that, for instance, telegram.Message objects' date field was constructed as a naïve datetime.datetime (from the timestamp sent by Telegram's server) in UTC, but when it was stored in JSON format through the to_json method, the naïve date would be assumed to be in local time, thus generating a different timestamp from the one it was built from.

    So I sought to unify all handling of naïve datetime.datetime by assuming they're always in UTC.

    • This, though, breaks the tests related to the parameter first of JobQueue.run_repeating, since they assume that naïve datetime objects are in local time. That makes me assume that users of PTB have been feeding local datetime objects to this parameter, so changing it to UTC would break it for them. But I definitely believe that something has to be done about this: either we unify all naïve datetimes to UTC or to local time. And since unifying to local would almost surely break backwards compatibility elsewhere, I think UTC is the better of the two options.

    • Maybe even better: ditch naïve datetimes altogether and only work with datetimes that have an explicit tzinfo?

    • Finally, I also made sure that datetimes that are timezone-aware are also treated validly.

    • With Python 2's lack of the datetime.timezone class, and in particular, of datetime.timezone.utc, I had to "hard-code" a UTC singleton to use as a tzinfo instance to indicate UTC. I wonder whether it would be better to move that to a new module, e.g. telegram.utils.datetime, since it's not a "helper" strictly speaking.


  3. Besides the aforementioned timezone stuff, the original _datetime_to_float_timestamp had an issue of its own. The replacement method used to calculate the timestamp in Python 2 (in absence of datetime.datetime.timestamp) truncated the timestamp to an integer since that's the precision of datetime.datetime.timetuple/time.mktime. Of course, as it was before it didn't matter, but with the changes outlined in 1. it does. So I replaced the alternative calculation with calculating a timedelta from the epoch and using .total_seconds() on it. At this point, I don't know if it's worth keeping the switch between Python 3.3+'s datetime.datetime.timestamp and the alternative calculation, since the former's implementation is basically the same as the latter.


  4. When the job queue ticks, there is an inevitable delay between the exact time at which it was scheduled to peek and when tick is entered and the current time gets "frozen" (now = time.time()). Meaning that, for instance, if you set it to run every 5 seconds and stop running after 20 seconds from now, in theory it should run exactly five times: at Δt = 0s, Δt = 5s, Δt = 10s, Δt = 15s and Δt = 20s. But, in practice, when it gets to the fifth time, due to delays, Δt will be something like 20 + 6.28e-5, meaning that if the code makes a strict comparison (Δt <= 20.0s), then the job won't be executed the fifth time.

    When I first made this PR, I thought that was undesired behaviour, so I added a "delay buffer" so that the check is more like Δt <= 20.0s + ε, where ε is something like 0.001s. But now that I think of it, I'd rather have the strict check, partly because of the fact that if the user needed to control the exact times the job was executed, they wouldn't use an end date/time, but rather something like counting up from within the callback; and partly because of the ugliness and arbitrary nature of setting a delay_buffer. In this case I'd rename the kwarg to end, since it indicates that after that time the job shouldn't run anymore — but the job won't necessarily run at the specified time for the last time.


  5. Is this project going to keep supporting Python 2 for the next version? All of this compatibility handling and restraining oneself to older feature sets is a bit of a pain 😅

I realised that rambling on about the changes I've made without showing them is a bit useless. Tomorrow I'll push what I've got (we'll roll back if necessary). In any case, what do you think about all this?

plammens added a commit to plammens/python-telegram-bot that referenced this pull request Sep 3, 2019
There was inconsistent use of UTC vs local times. For instance, in the
former `_timestamp` helper (now `_datetime_to_float_timestamp`), assumed
that naive `datetime.datetime` objects were in the local timezone, while
the `from_timestamp` helper —which I would have thought was the
corresponding inverse function— returned naïve objects in UTC.

This meant that, for instance, `telegram.Message` objects' `date` field
was constructed as a naïve `datetime.datetime` (from the timestamp sent
by Telegram's server) in *UTC*, but when it was stored in `JSON` format
through the `to_json` method, the naïve `date` would be assumed to be in
*local time*, thus generating a different timestamp from the one it was
built from.

See python-telegram-bot#1497 for extended discussion.
@Bibo-Joshi
Copy link
Member

Thank for your detailed report, @plammens !
For the discussion of the time related stuff, I'd like to move that to #1505 (or a corresponding PR). I guess, we'll have to think about that, before moving on with this one …

About 4. : The way you put it, I tend to agree with your proposal about a strict check.

plammens added a commit to plammens/python-telegram-bot that referenced this pull request Sep 4, 2019
(cherry-picked from c6dd3d7.
I've included the refactoring mentioned in python-telegram-bot#1497 to facilitate the
change.)

There was inconsistent use of UTC vs local times. For instance, in the
former `_timestamp` helper (now `_datetime_to_float_timestamp`), assumed
that naive `datetime.datetime` objects were in the local timezone, while
the `from_timestamp` helper —which I would have thought was the
corresponding inverse function— returned naïve objects in UTC.

This meant that, for instance, `telegram.Message` objects' `date` field
was constructed as a naïve `datetime.datetime` (from the timestamp sent
by Telegram's server) in *UTC*, but when it was stored in `JSON` format
through the `to_json` method, the naïve `date` would be assumed to be in
*local time*, thus generating a different timestamp from the one it was
built from.

See python-telegram-bot#1505 for extended discussion.
@Poolitzer Poolitzer added this to the 12.2 milestone Sep 15, 2019
@jh0ker
Copy link
Member

jh0ker commented Oct 11, 2019

About 5.: We're looking to drop Python 2 support in version 13 (see #1538). Perhaps it makes sense to wait for that?

@tsnoam
Copy link
Member

tsnoam commented Nov 15, 2019

@plammens
would you like to rebas/merge this pr with the latest master so we can look at it with a fresh eye?

@tsnoam tsnoam modified the milestones: 12.3, 12.4 Nov 15, 2019
Added a `last` kwarg to `run_repeating` that specifies when
the job should last be run. The checking is done before executing
the job, but with a small delay buffer, so that if `last`
is an exact multiple of `interval`, the last job is still
run, even though the actual time is slightly after the
theoretical finish time.
Add parametrisations, extract constants and helpers.
@plammens
Copy link
Contributor Author

plammens commented Nov 16, 2019

@tsnoam I've rebased this using the stuff introduced in #1506; I've also squashed a little.

I think now the only thing to discuss is 4. (comment above). Do you agree that it should be a strict check? And thus last should be renamed to end?

@Bibo-Joshi Bibo-Joshi modified the milestones: 12.4, 12.5 Feb 2, 2020
@Bibo-Joshi
Copy link
Member

TL;DR: Closing in favor of #1981

After internal discussion, we came to the conclusion that we don't want to continue implementing scheduling logic on our own and instead rely on a 3rd party library for that (see #1936, #1981).
One of the upsides is, that a last keyword is already included.
We appreciate the effort that was put in this PR and hope that you'll find the new JobQueue useful once it's finished :)

@Bibo-Joshi Bibo-Joshi closed this Jun 5, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

last kwarg for JobQueue.run_repeating method
5 participants