Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review types of optional arguments in methods and classes #3864

Closed
harshil21 opened this issue Aug 29, 2023 · 12 comments · Fixed by #4120
Closed

Review types of optional arguments in methods and classes #3864

harshil21 opened this issue Aug 29, 2023 · 12 comments · Fixed by #4120
Labels
documentation type hinting Not bug. Not Docs. But both.
Milestone

Comments

@harshil21
Copy link
Member

There are certain parameters in our bot methods which optional with the default as DEFAULT_NONE (as specified by the Defaults class). The type hinting of such parameters is not consistent. For e.g.

disable_notification: DVInput[bool] = DEFAULT_NONE,

and
protect_content: ODVInput[bool] = DEFAULT_NONE,

both parameters accept bool, but one of them is type hinted to also accept None (ODV -> Optional DefaultValue). Here the type hint for protect_content should be corrected to DVInput imo.

We should look for other cases like this, and correct it in both the type hints and documentation.

@harshil21 harshil21 added documentation type hinting Not bug. Not Docs. But both. labels Aug 29, 2023
@Bibo-Joshi
Copy link
Member

I did some investigation on this. First, regarding arguments of bot methods:

Let's take disable_notification as example from above. This is an optional argument. We expect that users either do not pass this argument or when they pass it, they pass a bool. This would lead to DVInput[bool]. However, one can argue that passing None should be equal to not passing the argument. This is useful e.g. when passing kwargs that are generated at runtime (send_message(**build_kwargs())). Moreover, all other optional arguments are typed to actually accept None (e.g. reply_to_message_id: Optional[int] = None).
I would hence vote that all arguments of bot methods with default values should be using ODVInput.

Moreover, I collected all arguments in our code base with one of these type hints:

Script
import re
from pathlib import Path
from pprint import pprint

REGEX = re.compile(r"\s([_\w]+: O?DV(Input|Type)\[\w+\] = .*)")
MATCHES = set()

for file_path in Path("telegram").rglob("*.py"):
    content = file_path.read_text(encoding="utf-8")
    for match in REGEX.finditer(content):
        MATCHES.add(match.group(1))


pprint(sorted(MATCHES))

Results:

['allow_sending_without_reply: DVInput[bool] = DEFAULT_NONE,',
 'allow_sending_without_reply: ODVInput[bool] = DEFAULT_NONE,',
 'block: DVType[bool] = DEFAULT_TRUE,',
 'connect_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,',
 'connect_timeout: ODVInput[float] = DEFAULT_NONE,',
 'disable_notification: DVInput[bool] = DEFAULT_NONE,',
 'disable_notification: ODVInput[bool] = DEFAULT_NONE,',
 'disable_web_page_preview: ODVInput[bool] = DEFAULT_NONE,',
 'explanation_parse_mode: ODVInput[str] = DEFAULT_NONE,',
 'parse_mode: ODVInput[str] = DEFAULT_NONE,',
 'pool_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,',
 'pool_timeout: ODVInput[float] = DEFAULT_NONE,',
 'protect_content: ODVInput[bool] = DEFAULT_NONE,',
 'read_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,',
 'read_timeout: ODVInput[float] = DEFAULT_NONE,',
 'write_timeout: ODVInput[float] = 20,',
 'write_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,',
 'write_timeout: ODVInput[float] = DEFAULT_NONE,']

In this, I see several notable points.

Firstly, we have inconsistencies in the typing of allow_sending_without_reply and disable_notification that should be unified
Secondly, most arguments are already using ODVInput (which fits my above reasoning 😎 )

Thirdly, there is a problem with write_timeout defaulting to an integer 20 (not DefaultValue(20)!). In HTTPXRequest.do_request

async def do_request(
self,
url: str,
method: str,
request_data: Optional[RequestData] = None,
read_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,
write_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,
connect_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,
pool_timeout: ODVInput[float] = BaseRequest.DEFAULT_NONE,
) -> Tuple[int, bytes]:
"""See :meth:`BaseRequest.do_request`."""
if self._client.is_closed:
raise RuntimeError("This HTTPXRequest is not initialized!")
# If user did not specify timeouts (for e.g. in a bot method), use the default ones when we
# created this instance.
if isinstance(read_timeout, DefaultValue):
read_timeout = self._client.timeout.read
if isinstance(write_timeout, DefaultValue):
write_timeout = self._client.timeout.write
if isinstance(connect_timeout, DefaultValue):
connect_timeout = self._client.timeout.connect
if isinstance(pool_timeout, DefaultValue):
pool_timeout = self._client.timeout.pool

this leads to write_timeout being 20 seconds - as desired. However, when specifying a custom write_timeout value via HTTPXRequest(write_timeout=42) or ApplicationBuilder().write_timeout(42), the value 42 will not be used by send_photo & friends. Moreover, even if we fix this problem, I notice that we can't specify different default values for media operations and non-media operations. Since already use different values for these in the standard configuration, I think this would make sense.

My first idea would be the following:

  1. In the bot methods make write_timeout default to DEFAULT_NONE as with the other timeout parameters

  2. Add a new argument media_write_timeout to HTTPXRequest that defaults to 20 seconds.

  3. Make HTTPXRequest.do_request select the write_timeout as follows:

     if write_timeout is not DEFAULT_NONE:
         effective_write_timeout = write_timeout
     elif files:
         effective_write_timeout = self.media_write_timeout
     else:
         effective_write_timeout = self.write_timeout

This would be a backward compatible change, further decouple the Bot from the networking backend and add an additional feature.

What do you think?

Sagarkumar7 added a commit to Sagarkumar7/python-telegram-bot that referenced this issue Sep 27, 2023
this commit fixes documentation
@UjjwalPardeshi
Copy link

can you assign this to me please .

@Bibo-Joshi
Copy link
Member

@UjjwalPardeshi this issue is still in discussion and not ready to be pred for

@Poolitzer
Copy link
Member

Poolitzer commented Oct 20, 2023

Add a new argument media_write_timeout to HTTPXRequest that defaults to 20 seconds.

Shouldn't that be higher?

Your logic looks sound to me. Would go ahead.

@harshil21
Copy link
Member Author

However, when specifying a custom write_timeout value via HTTPXRequest(write_timeout=42) or ApplicationBuilder().write_timeout(42), the value 42 will not be used by send_photo & friends.

kinda shocking that this went unnoticed by tests and everyone (maybe that's why ppl keep posting the timeout errors even after changing the timeout vals)

My first idea would be the following:

We would also need to add a new method to AppBuilder for the new arg right?

  1. In the bot methods make write_timeout default to DEFAULT_NONE as with the other timeout parameters
  2. Add a new argument media_write_timeout to HTTPXRequest that defaults to 20 seconds.
  3. Make HTTPXRequest.do_request select the write_timeout as follows:

Idea sounds good 👍🏽

Add a new argument media_write_timeout to HTTPXRequest that defaults to 20 seconds.

Shouldn't that be higher?

I agree, should be like 30 seconds min to accomodate for slower internet speeds (https://www.speedtest.net/global-index)

@harshil21
Copy link
Member Author

Actually I'm backtracking on adding media_write_timeout (point 2.). After reading #3893 I'm not sure if adding yet another argument would make it easier for the user (and us!). How about internally setting the timeout to 30 seconds for file-related methods only? The user can simply change that 30 second behaviour by using the same write_timeout argument in a bot method. So the logic would now be:

 if write_timeout is not DEFAULT_NONE:
     effective_write_timeout = write_timeout
 elif files:
     effective_write_timeout = 30
 else:
     effective_write_timeout = self.write_timeout

We would continue to document in the send_* methods that the write_timeout param defaults to 30 seconds.

@Bibo-Joshi
Copy link
Member

Add a new argument media_write_timeout to HTTPXRequest that defaults to 20 seconds.

Shouldn't that be higher?

I agree, should be like 30 seconds min to accomodate for slower internet speeds (https://www.speedtest.net/global-index)

20 is the value that has been used for eons (minimum since v6.0) and is documented everywhere. If there are no clear indications that this value is too short for most people, I would prefer to stick to the 20 …

We would also need to add a new method to AppBuilder for the new arg right?

yes, of course AppBuilder.media_write_timeout() would be required

I'm not sure if adding yet another argument would make it easier for the user (and us!). How about internally setting the timeout to 30 seconds for file-related methods only? The user can simply change that 30 second behaviour by using the same write_timeout argument in a bot method.

Just to clarify: This would steeps 1 & 3, only step 2 would be omitted.
Personally, I see no reason to make this particular timeout hard-coded while all other timeouts are configurable. Setting the value of every call to send_* is probably not easier than setting it via AppBuilder.media_write_timeout. In terms of maintenance, I would argue that an additional timeout setting is not much added overhead as long as it works just like the other ones - which it would :)
Still, one could get started with steps 1 & 3 and add 2 later on in a follow-up PR.

@tsnoam
Copy link
Member

tsnoam commented Oct 21, 2023

@Bibo-Joshi 20 seconds is a sane default. We worked it out the hard way.

It is a trade off between identifying problems asap and between being too aggressive and/or slow internet connections.

The default should not be supportive of those with horrible internet.

@harshil21
Copy link
Member Author

harshil21 commented Oct 21, 2023

20 is the value that has been used for eons (minimum since v6.0) and is documented everywhere. If there are no clear indications that this value is too short for most people, I would prefer to stick to the 20 …

It is a trade off between identifying problems asap and between being too aggressive and/or slow internet connections.

ok, but tbf I've only ever seen people increase the timeouts, not decrease them.

Setting the value of every call to send_* is probably not easier than setting it via AppBuilder.media_write_timeout.

Right, I was thinking AppBuilder.write_timeout should override the 20 (or 30) seconds timeout for media operations as well. But maybe that behaviour is obscure and not in line with having everything configurable via the builder..

Still, one could get started with steps 1 & 3 and add 2 later on in a follow-up PR.

👍🏽

@Bibo-Joshi
Copy link
Member

ALl right, then I think we have a consensus here and there are 3 PRs to be done:

  1. Make all arguments of bot methods with default values use ODVInput
  2. Steps 1 & 3 from Review types of optional arguments in methods and classes  #3864 (comment)
  3. As follow up to 2, step 2

@tsnoam
Copy link
Member

tsnoam commented Oct 21, 2023

ok, but tbf I've only ever seen people increase the timeouts, not decrease them.

@Bibo-Joshi you’ve out seen those who complained. Think of all those which it worked good for them. Increasing the timeout by default will change the behavior and raise a wave of complaints.
the locations of errors varied between getUpdates receive a message to establishing a connection.

However, that said and ptb is in a differed state today. It is using httpx which greatly superior to urllib3 and all the alternatives available back at the day.
Afair now you can have different timeouts to connection establishment, ssl negotiation, read & write. You can also easily set them per operation rather than on a connection and/or globally.

@tsnoam
Copy link
Member

tsnoam commented Oct 21, 2023

If you want something which controls the timeouts in a smarter way it might be a good idea to check what is the payload size to send and decide on the timeout accordingly

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation type hinting Not bug. Not Docs. But both.
Projects
None yet
5 participants