-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-3052 Add Typings to PyMongo Itself #829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work so far! I've reviewed the bson/ and gridfs/ files and think it's probably a good time to stop and address the various comments. Some of them are applicable to all the changes as a whole.
bson/codec_options.py
Outdated
@@ -92,6 +98,9 @@ class TypeCodec(TypeEncoder, TypeDecoder): | |||
pass | |||
|
|||
|
|||
Codec = Union[TypeEncoder, TypeDecoder, TypeCodec] | |||
Fallback = Callable[[Any], Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these names be private to avoid exporting them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to do that for all types, including the ones in pymongo.typing
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say yes since we don't want people to accidentally import them or depend on them yet. We try to make everything private by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, updated
bson/json_util.py
Outdated
@@ -781,8 +789,8 @@ def default(obj, json_options=DEFAULT_JSON_OPTIONS): | |||
if not obj.tzinfo: | |||
obj = obj.replace(tzinfo=utc) | |||
if obj >= EPOCH_AWARE: | |||
off = obj.tzinfo.utcoffset(obj) | |||
if (off.days, off.seconds, off.microseconds) == (0, 0, 0): | |||
off = obj.tzinfo.utcoffset(obj) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of type: ignore
, can we tag this as timedelta
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an assert obj.tzinfo is not None
above and was able to remove the ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Is this a bug/limitation in mypy? I would expect it to understand that tzinfo always exists here since we have these lines above:
if not obj.tzinfo:
obj = obj.replace(tzinfo=utc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the assert below that line for clarity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does seem like a limitation, but I wouldn't call it a bug, that would take some serious introspection
session=None): | ||
def download_to_stream_by_name(self, filename: str, destination: Any, | ||
revision: Optional[int] = -1, | ||
session: Optional[ClientSession] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw elsewhere you used NoReturn. Should we change all the -> None:
methods to use it too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are for functions that always raise an exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh TIL. Thanks.
gridfs/grid_file.py
Outdated
"""Read one line or up to `size` bytes from the file. | ||
|
||
:Parameters: | ||
- `size` (optional): the maximum number of bytes to read | ||
""" | ||
if size is None: | ||
size = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? The api does not support size=None and I'd like to avoid adding misc features in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another case of Optional[int]
. In this case the superclass also uses it, we'd have to add an # ignore: override
:
gridfs/grid_file.py:575: error: Argument 1 of "readline" is incompatible with supertype "IOBase"; supertype defines the argument type as "Optional[int]"
gridfs/grid_file.py:575: note: This violates the Liskov substitution principle
gridfs/grid_file.py:575: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue is the type should be size: int
instead of size: Optional[int]
. Size is not compatible with Optional because None is not supported (the default is -1, not None).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but because io.IOBase
uses Optional[int]
, we have to add an ignore override comment. I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Yeah override is what typeshed does: https://github.com/python/typeshed/blob/df0fe1645681638359a4ab706732e79c62db6a71/stdlib/io.pyi#L128
bson/codec_options.py
Outdated
@@ -92,6 +98,9 @@ class TypeCodec(TypeEncoder, TypeDecoder): | |||
pass | |||
|
|||
|
|||
Codec = Union[TypeEncoder, TypeDecoder, TypeCodec] | |||
Fallback = Callable[[Any], Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say yes since we don't want people to accidentally import them or depend on them yet. We try to make everything private by default.
bson/__init__.py
Outdated
BSONDEC = b"\x13" # Decimal128 | ||
BSONMIN = b"\xFF" # Min key | ||
BSONMAX = b"\x7F" # Max key | ||
EPOCH_AWARE: datetime.datetime = datetime.datetime.fromtimestamp(0, utc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on that? In my testing I found they are not required even with strict mode:
$ cat typ.py
NUM = 1
def meth(x: int) -> int:
return x * 2
print(meth(NUM))
$ mypy --strict typ.py
Success: no issues found in 1 source file
@@ -49,7 +50,7 @@ class Regex(object): | |||
_type_marker = 11 | |||
|
|||
@classmethod | |||
def from_native(cls, regex): | |||
def from_native(cls: Type["Regex"], regex: Pattern[Any]) -> "Regex": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great thanks.
session=None): | ||
def download_to_stream_by_name(self, filename: str, destination: Any, | ||
revision: Optional[int] = -1, | ||
session: Optional[ClientSession] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh TIL. Thanks.
gridfs/grid_file.py
Outdated
"""Read one line or up to `size` bytes from the file. | ||
|
||
:Parameters: | ||
- `size` (optional): the maximum number of bytes to read | ||
""" | ||
if size is None: | ||
size = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue is the type should be size: int
instead of size: Optional[int]
. Size is not compatible with Optional because None is not supported (the default is -1, not None).
bson/son.py
Outdated
return memo.get(val_id) | ||
value = memo.get(val_id) | ||
assert value is not None | ||
return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replace the redundant assert and temp variable with: return memo[val_id]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally if val_id in memo
would guard against the get()
being None
, but it does not, which is why I added the intermediate variable and guard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree but I think it's similar to the other issue since mypy can't determine if the val_id might be removed by another thread in between the if-statement and the get() call. Regardless, I think we can replace these 3 lines with return memo[val_id]
since we know val_id is always there. Then we don't need to add redundant code just to appease mypy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used type ignore instead
Note: I removed the changes to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pushing out the pymongo/+test/ changes!
gridfs/grid_file.py
Outdated
"""Read one line or up to `size` bytes from the file. | ||
|
||
:Parameters: | ||
- `size` (optional): the maximum number of bytes to read | ||
""" | ||
if size is None: | ||
size = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Yeah override is what typeshed does: https://github.com/python/typeshed/blob/df0fe1645681638359a4ab706732e79c62db6a71/stdlib/io.pyi#L128
test/performance/perf_test.py
Outdated
@@ -44,7 +45,7 @@ | |||
|
|||
OUTPUT_FILE = os.environ.get('OUTPUT_FILE') | |||
|
|||
result_data = [] | |||
result_data: List[Any] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this annotation redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/performance/perf_test.py:48: error: Need type annotation for "result_data" (hint: "result_data: List[<type>] = ...")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's surprising. I thought it would default to List. TIL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It defaulted to List[unknown]
, but adding : List
allowed it to reference as List[Any]
, so the [Any]
was redundant.
bson/son.py
Outdated
return memo.get(val_id) | ||
value = memo.get(val_id) | ||
assert value is not None | ||
return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree but I think it's similar to the other issue since mypy can't determine if the val_id might be removed by another thread in between the if-statement and the get() call. Regardless, I think we can replace these 3 lines with return memo[val_id]
since we know val_id is always there. Then we don't need to add redundant code just to appease mypy.
bson/son.py
Outdated
val_id = id(self) | ||
if val_id in memo: | ||
return memo.get(val_id) | ||
return memo.get(val_id) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make this type-safe and remove # type: ignore
by using return memo[val_id]
instead of .get().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, ha, duh, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
monitoring.py
,server_description
,server_type
,topology_description
,url_parser
(all failed when usingretype
)Union[MutableMapping[str, Any], RawBSONDocumentRef]
Collection
a generic object