Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make dependency of pymongo optional in running #1027

Merged
merged 1 commit into from
Nov 15, 2023

Conversation

phoebusm
Copy link
Collaborator

@phoebusm phoebusm commented Nov 1, 2023

Reference Issues/PRs

#910

What does this implement or fix?

Make dependency of pymongo optional in running.
pymongo offers uri_parser which offers fantastic error message for incorrect uri syntax. However it is not necessary. So this PR makes the dependency optional. If user's environment does not come with pymongo, a simple regex will be used for extracting the endpoint. And mongo C++ driver will detect any incorrect syntax still, but with relatively simple error message.

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@phoebusm phoebusm added the enhancement New feature or request label Nov 1, 2023
@phoebusm phoebusm self-assigned this Nov 1, 2023
@phoebusm phoebusm linked an issue Nov 1, 2023 that may be closed by this pull request
@phoebusm phoebusm changed the title Extract uri_parser from pymongo 4.1.1 to remove dependency WIP: Extract uri_parser from pymongo 4.1.1 to remove dependency Nov 2, 2023
@phoebusm phoebusm changed the title WIP: Extract uri_parser from pymongo 4.1.1 to remove dependency Make dependency of pymongo optional in running Nov 2, 2023
@phoebusm phoebusm changed the title Make dependency of pymongo optional in running WIP:Make dependency of pymongo optional in running Nov 2, 2023
# Python dependencies (for tests only)
- azure-storage-blob
- azure-identity
- pyarrow
- asv
- pymongo
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we drop pymongo here? Bit dodgy but then at least we'll have PyPi CI run with pymongo and Conda run without, so we can cover both code paths

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow that's hacky genius thought. Good idea.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have come up with a better idea on the testing, with parametrised pytest + monkeypatch

@phoebusm phoebusm changed the title WIP:Make dependency of pymongo optional in running Make dependency of pymongo optional in running Nov 3, 2023
@pytest.mark.skipif(not RUN_MONGO_TEST, reason="Mongo test on windows is fiddly")
def test_mongo_repr(mongo_test_uri):
def test_mongo_repr(request, pymongo_patcher, mongo_test_uri):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bit cleaner to just have two separate tests, rather than the parameterized fixture, or trying to shoe-horn monkeypatch in to a fixture. And we should assert that when pymongo is installed, we get the good exception. This test doesn't belong in test_mongo_repr.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

ac = Arctic(uri)
assert repr(ac) == f"Arctic(config=mongodb(endpoint={mongo_test_uri[len('mongodb://'):]}))"

# With pymongo, exception thrown in the uri_parser;
with pytest.raises(UserInputException):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standalone test for this please, AFAIK there's no reason to fold this in with the repr test? Unless constructing the mongo_test_uri fixture is expensive?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is for testing whether the extraction of endpoint is valid.
With pymongo, uri_parser is used for the extraction.
Without pymongo, re is used. It is not really necessary but it will be easier to debug if something gets messed up.

monkeypatch.setattr(arcticdb.adapters.mongo_library_adapter, "_HAVE_PYMONGO", False)

uri = f"{mongo_test_uri}/?maxPoolSize=10&minPoolSize=100&serverSelectionTimeoutMS=1000"
ac = Arctic(uri)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to check we can construct with an OK URI, but we don't need the repr assertion here too

Copy link
Collaborator Author

@phoebusm phoebusm Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason for the above comment. The repr assertion is for making sure the endpoint extraction is correct or not. I will delete repo in the naming of those tests so they look less like shoe-horning uri syntax exception checking, but more like testing the construction of mongo adapter.

python/tests/integration/arcticdb/test_arctic.py Outdated Show resolved Hide resolved
@phoebusm phoebusm force-pushed the feature/remove_pymongo branch 2 times, most recently from 2f15dd3 to c88f0de Compare November 13, 2023 18:04
Copy link
Collaborator

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @phoebusm.

Comment on lines +34 to +41
if _HAVE_PYMONGO:
parameters = parse_uri(
uri
) # also checks pymongo uri syntax, throw exception as early as possible if syntax is incorrect
self._endpoint = f"{parameters['nodelist'][0][0]}:{parameters['nodelist'][0][1]}"
else:
match = re.search(r"\/\/(?P<endpoint>[^\/]*)", uri)
self._endpoint = match["endpoint"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you document any benefits pymongo.uri_parser.parse_uri has over using a regex, here?

@phoebusm phoebusm merged commit 044fc02 into master Nov 15, 2023
131 checks passed
@phoebusm phoebusm deleted the feature/remove_pymongo branch November 15, 2023 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove pymongo dependency
3 participants