Skip to content

PYTHON-3052 Add Typings to PyMongo Itself #829

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 81 commits into from
Jan 13, 2022

Conversation

blink1073
Copy link
Member

@blink1073 blink1073 commented Dec 23, 2021

  • Finish manual port of monitoring.py, server_description, server_type, topology_description, url_parser (all failed when using retype)
  • Double check all public interfaces in the auto-ported files:
     modified:   pymongo/__init__.py
	modified:   pymongo/collation.py
	modified:   pymongo/common.py
	modified:   pymongo/event_loggers.py
	modified:   pymongo/operations.py
	modified:   pymongo/read_concern.py
	modified:   pymongo/read_preferences.py
	modified:   pymongo/results.py
	modified:   pymongo/write_concern.py

Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work so far! I've reviewed the bson/ and gridfs/ files and think it's probably a good time to stop and address the various comments. Some of them are applicable to all the changes as a whole.

@@ -92,6 +98,9 @@ class TypeCodec(TypeEncoder, TypeDecoder):
pass


Codec = Union[TypeEncoder, TypeDecoder, TypeCodec]
Fallback = Callable[[Any], Any]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these names be private to avoid exporting them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to do that for all types, including the ones in pymongo.typing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say yes since we don't want people to accidentally import them or depend on them yet. We try to make everything private by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, updated

@@ -781,8 +789,8 @@ def default(obj, json_options=DEFAULT_JSON_OPTIONS):
if not obj.tzinfo:
obj = obj.replace(tzinfo=utc)
if obj >= EPOCH_AWARE:
off = obj.tzinfo.utcoffset(obj)
if (off.days, off.seconds, off.microseconds) == (0, 0, 0):
off = obj.tzinfo.utcoffset(obj) # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of type: ignore, can we tag this as timedelta?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an assert obj.tzinfo is not None above and was able to remove the ignore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Is this a bug/limitation in mypy? I would expect it to understand that tzinfo always exists here since we have these lines above:

 if not obj.tzinfo:
    obj = obj.replace(tzinfo=utc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the assert below that line for clarity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does seem like a limitation, but I wouldn't call it a bug, that would take some serious introspection

session=None):
def download_to_stream_by_name(self, filename: str, destination: Any,
revision: Optional[int] = -1,
session: Optional[ClientSession] = None) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw elsewhere you used NoReturn. Should we change all the -> None: methods to use it too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are for functions that always raise an exception

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh TIL. Thanks.

"""Read one line or up to `size` bytes from the file.

:Parameters:
- `size` (optional): the maximum number of bytes to read
"""
if size is None:
size = -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? The api does not support size=None and I'd like to avoid adding misc features in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another case of Optional[int]. In this case the superclass also uses it, we'd have to add an # ignore: override:

gridfs/grid_file.py:575: error: Argument 1 of "readline" is incompatible with supertype "IOBase"; supertype defines the argument type as "Optional[int]"
gridfs/grid_file.py:575: note: This violates the Liskov substitution principle
gridfs/grid_file.py:575: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue is the type should be size: int instead of size: Optional[int]. Size is not compatible with Optional because None is not supported (the default is -1, not None).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but because io.IOBase uses Optional[int], we have to add an ignore override comment. I'll do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -92,6 +98,9 @@ class TypeCodec(TypeEncoder, TypeDecoder):
pass


Codec = Union[TypeEncoder, TypeDecoder, TypeCodec]
Fallback = Callable[[Any], Any]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say yes since we don't want people to accidentally import them or depend on them yet. We try to make everything private by default.

bson/__init__.py Outdated
BSONDEC = b"\x13" # Decimal128
BSONMIN = b"\xFF" # Min key
BSONMAX = b"\x7F" # Max key
EPOCH_AWARE: datetime.datetime = datetime.datetime.fromtimestamp(0, utc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on that? In my testing I found they are not required even with strict mode:

$ cat typ.py          
NUM = 1
def meth(x: int) -> int:
    return x * 2
print(meth(NUM))
$ mypy --strict typ.py
Success: no issues found in 1 source file

@@ -49,7 +50,7 @@ class Regex(object):
_type_marker = 11

@classmethod
def from_native(cls, regex):
def from_native(cls: Type["Regex"], regex: Pattern[Any]) -> "Regex":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks.

session=None):
def download_to_stream_by_name(self, filename: str, destination: Any,
revision: Optional[int] = -1,
session: Optional[ClientSession] = None) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh TIL. Thanks.

"""Read one line or up to `size` bytes from the file.

:Parameters:
- `size` (optional): the maximum number of bytes to read
"""
if size is None:
size = -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue is the type should be size: int instead of size: Optional[int]. Size is not compatible with Optional because None is not supported (the default is -1, not None).

bson/son.py Outdated
return memo.get(val_id)
value = memo.get(val_id)
assert value is not None
return value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace the redundant assert and temp variable with: return memo[val_id]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally if val_id in memo would guard against the get() being None, but it does not, which is why I added the intermediate variable and guard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree but I think it's similar to the other issue since mypy can't determine if the val_id might be removed by another thread in between the if-statement and the get() call. Regardless, I think we can replace these 3 lines with return memo[val_id] since we know val_id is always there. Then we don't need to add redundant code just to appease mypy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used type ignore instead

@blink1073
Copy link
Member Author

Note: I removed the changes to pymongo and test in order to make this PR easier to review. I will follow up with those in other PRs.

Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing out the pymongo/+test/ changes!

"""Read one line or up to `size` bytes from the file.

:Parameters:
- `size` (optional): the maximum number of bytes to read
"""
if size is None:
size = -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -44,7 +45,7 @@

OUTPUT_FILE = os.environ.get('OUTPUT_FILE')

result_data = []
result_data: List[Any] = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this annotation redundant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/performance/perf_test.py:48: error: Need type annotation for "result_data" (hint: "result_data: List[<type>] = ...")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's surprising. I thought it would default to List. TIL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It defaulted to List[unknown], but adding : List allowed it to reference as List[Any], so the [Any] was redundant.

bson/son.py Outdated
return memo.get(val_id)
value = memo.get(val_id)
assert value is not None
return value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree but I think it's similar to the other issue since mypy can't determine if the val_id might be removed by another thread in between the if-statement and the get() call. Regardless, I think we can replace these 3 lines with return memo[val_id] since we know val_id is always there. Then we don't need to add redundant code just to appease mypy.

bson/son.py Outdated
val_id = id(self)
if val_id in memo:
return memo.get(val_id)
return memo.get(val_id) # type: ignore
Copy link
Member

@ShaneHarvey ShaneHarvey Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this type-safe and remove # type: ignore by using return memo[val_id] instead of .get().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ha, duh, thanks

Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@blink1073 blink1073 merged commit 52ed5a4 into mongodb:master Jan 13, 2022
@blink1073 blink1073 deleted the PYTHON-3052 branch January 13, 2022 22:09
juliusgeo pushed a commit to juliusgeo/mongo-python-driver that referenced this pull request Apr 5, 2022
juliusgeo pushed a commit to juliusgeo/mongo-python-driver that referenced this pull request Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants