-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PYTHON-4533 Convert essential tests to async for Async PyMongo beta release #1724
base: master
Are you sure you want to change the base?
Conversation
cf839b7
to
73d3b19
Compare
240e00e
to
665f730
Compare
56b779e
to
c00c15b
Compare
@@ -1517,6 +1521,9 @@ async def close(self) -> None: | |||
# TODO: PYTHON-1921 Encrypted MongoClients cannot be re-opened. | |||
await self._encrypter.close() | |||
|
|||
async def aclose(self) -> None: | |||
await self.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Public API should have a docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need aclose() when we already have close()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contextlib
expects that asynchronous context managers use aclose
. We could simplify by just setting aclose = close
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case should we remove "close()"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's better, yeah.
|
||
client_options = client_context.default_client_options.copy() | ||
client_options.update(kwargs) | ||
|
||
super().__init__(*args, **client_options) | ||
|
||
@classmethod | ||
def get_async_mock_client( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be get_sync_mock_client
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, using get_mock_client
seems better.
@@ -2332,7 +2398,7 @@ class TestClientPool(MockClientTest): | |||
@client_context.require_connection | |||
def test_rs_client_does_not_maintain_pool_to_arbiters(self): | |||
listener = CMAPListener() | |||
c = MockClient( | |||
c = MockClient.get_async_mock_client( | |||
standalones=[], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be get_sync_mock_client
?
f"{f.__name__} sent wrong $clusterTime with {event.command_name}", | ||
) | ||
|
||
# class TestCausalConsistency(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops yes good catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anyone else feel there's a bit too much going on in this one PR to review effectively? Can you split it up into smaller and more manageable pieces?
@@ -861,6 +861,10 @@ def __init__( | |||
# This will be used later if we fork. | |||
AsyncMongoClient._clients[self._topology._topology_id] = self | |||
|
|||
async def connect(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Connect" might not be the best name here since we're not actually connecting. I know we have the "connect=True" argument but I can't help but feel like "open()" would be more appropriate. If a user actually wanted to connect I would expect them to run something like "db.command('ping')". What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit odd to me for the synchronous API to have a connect=True
option but the equivalent async option is called open
instead even though they do the same thing. Consistency of naming makes more sense to me here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I think the docstring just needs to explain the concept more clearly.
Good point, let me try. |
@@ -860,6 +860,10 @@ def __init__( | |||
# This will be used later if we fork. | |||
MongoClient._clients[self._topology._topology_id] = self | |||
|
|||
def connect(self) -> None: | |||
"""Explicitly connect synchronously.""" | |||
self._get_topology() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid adding connect() and aclose() to the sync client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we rename them to be private for the synchronous client? Like aconnect
-> _aconnect
and aclose
-> _aclose
? There isn't a clean way of having the synchro process just not convert specific methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like this?:
_IS_SYNC = False
class AsyncMongoClient:
if not _IS_SYNC:
async def aclose(self):
pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better solution for aclose
is to just rename it to close
when it's synchronized. That way all async code will use aclose
and all sync code will use close
without any extra work. For aconnect
what about this:
async def aconnect(self):
if _IS_SYNC:
pass
else:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aclose -> close SGTM. For aconnect() I would prefer the approach I outlined above so that the method is never added to the sync client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually. renaming aconnect -> _aconnect is fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The synchro process will still convert the aconnect
method within the if not _IS_SYNC
block and add it to the sync client source. Is that acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with any approach as long as aconnect() or connect() isn't a public method on MongoClient.
with warnings.catch_warnings(): | ||
# Ignore warning that ping is always routed to primary even | ||
# if client's read preference isn't PRIMARY. | ||
warnings.simplefilter("ignore", UserWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen this warning. What is it referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This already exists in our test suite:
mongo-python-driver/test/utils.py
Line 866 in 554ce7d
# Ignore warning that ping is always routed to primary even |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some digging and that warning was removed in https://jira.mongodb.org/browse/PYTHON-814 in PyMongo 3.0. Could you remove the catch_warnings block?
sys.setswitchinterval(interval) | ||
|
||
|
||
def lazy_client_trial(reset, target, test, get_client): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this even make sense to test in async? I would expect an async client to only be useable from a single thread since an I/O loop only runs on a single thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't being tested at all in async, not sure why I moved it here. Putting back into test/utils.py
.
@@ -389,7 +392,7 @@ def test_transaction_direct_connection(self): | |||
with client.start_session() as s, s.start_transaction(): | |||
res = f(*args, session=s) # type:ignore[operator] | |||
if isinstance(res, (CommandCursor, Cursor)): | |||
list(res) | |||
res.to_list() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a docstring and changelog entry for to_list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list(cursor)
still works on synchronous cursors, but asynchronous cursors require the use of Cursor.to_list()
instead. This change is purely so our tests can be synchronized smoothly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the only issue is that to_list() is undocumented.
@@ -598,7 +601,10 @@ def _inherit_option(self, name: str, val: _T) -> _T: | |||
|
|||
async def with_transaction( | |||
self, | |||
callback: Callable[[AsyncClientSession], _T], | |||
callback: Union[ | |||
Callable[[AsyncClientSession], Coroutine[Any, Any, _T]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use Coroutine here or Awaitable? Which is a better fit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've used Coroutine
consistently so far, Awaitable
is a less-specific type hint that allows several other objects besides asynchronous functions.
callback: Callable[[AsyncClientSession], _T], | ||
callback: Union[ | ||
Callable[[AsyncClientSession], Coroutine[Any, Any, _T]], | ||
Callable[[AsyncClientSession], _T], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are there two callables here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An artifact of the reasoning below, removed.
@@ -693,7 +699,10 @@ async def callback(session, custom_arg, custom_kwarg=None): | |||
read_concern, write_concern, read_preference, max_commit_time_ms | |||
) | |||
try: | |||
ret = callback(self) | |||
if not _IS_SYNC and iscoroutinefunction(callback): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check IS_SYNC or iscoroutinefunction here? Shouldn't it just be:
ret = await callback(self)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to be more lenient and allow both async and sync callbacks, but it's better to require all asynchronous callbacks for the Async API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Halfway through! Left some comments for now.
callback: Union[ | ||
Callable[[ClientSession], _T], | ||
Callable[[ClientSession], _T], | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the type definition is the same twice here. Is that needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is an artifact of supporting both kinds of callbacks in the async API. Will be resolved in the separate PR for test_transaction
@@ -1512,6 +1516,9 @@ def close(self) -> None: | |||
# TODO: PYTHON-1921 Encrypted MongoClients cannot be re-opened. | |||
self._encrypter.close() | |||
|
|||
def aclose(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as the comment from @ShaneHarvey before. We can remove close
def require_sync(self, func): | ||
"""Run a test only if using the synchronous API.""" | ||
return self._require( | ||
lambda: _IS_SYNC, "This test only works with the synchronous API", func=func |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since _IS_SYNC is defined as true, would the lambda ever fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is generated from test/asynchronous/__init__.py
, it always passes for a synchronous test, as intended.
|
||
@classmethod | ||
def setUpClass(cls): | ||
if _IS_SYNC: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment. How does/may _IS_SYNC get redefined when tests are running?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_IS_SYNC
is always true for a synchronous file and always false for an asynchronous file.
|
||
def drop_collections(db: Database): | ||
# Drop all non-system collections in this database. | ||
for coll in db.list_collection_names(filter={"name": {"$regex": r"^(?!system\.)"}}): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to filter the non system collection names? I had assumed those couldn't be dropped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ancient code, probably it's to avoid attempting to drop a system collection and getting an OperationFailure because it's not allowed.
t.start() | ||
|
||
for t in threads: | ||
t.join(60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
60 seconds is quite a bit of time. Have we been doing 60 historically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, 60 is just a fail safe in case something goes wrong.
for _i in range(NTRIALS): | ||
reset(collection) | ||
lazy_client = get_client() | ||
lazy_collection = lazy_client.pymongo_test.test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: do we need to redefine the lazy_client each time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the point of repeating the test is to run with a new client each time.
Needs #1718 to be merged first.