Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added patching logic to __main__.py #9

Merged
merged 8 commits into from
Jul 20, 2020

Conversation

juliusgeo
Copy link
Contributor

No description provided.


def succeeded(self, event):
pass
def update_one(self: Collection, filter, update, upsert=False,
Copy link
Collaborator

@ShaneHarvey ShaneHarvey Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about doing this instead?

def update_one(self, *args, **kwargs):
    ExplainCollection(self).update_one(*args, **kwargs)
    return old_update_one(self, *args, **kwargs)

Then we might also be able to auto generate all the methods too!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Just a few comments on style and the output format.

from pymongo.collection import Collection
from pymongo.collection import Collection, ReturnDocument
from pymongo.cursor import CursorType
from pymongo.common import BaseObject
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the imports that are no longer used.

pass
def make_func(old_func, old_func_name):
def new_func(self: Collection, *args, **kwargs):
print(getattr(ExplainCollection(self), old_func_name)(*args,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about using the logging module instead of print? See https://docs.python.org/3/howto/logging.html#logging-basic-tutorial

Copy link
Collaborator

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show use some sample output of this script?


for old_func, old_func_name in zip(old_functions, old_function_names):
setattr(Collection, old_func_name, make_func(old_func,
old_func_name))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be one line:

setattr(Collection, old_func_name, make_func(old_func, old_func_name))



print(sys.argv)
exec(f.read())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newline at end of file.

@juliusgeo
Copy link
Contributor Author

Can you show use some sample output of this script?

from pymongo import MongoClient
client = MongoClient(serverSelectionTimeoutMS=1000)
collection = client.db.products
collection.update_one({"quantity": 1057, "category": "apparel"},{"$set": {"reorder": True}})

This outputs the following:

2020-07-16 18:02:14,209 INFO __main__ {'queryPlanner': {'plannerVersion': 1, 'namespace': 'db.products', 'indexFilterSet': False, 'parsedQuery': {'$and': [{'category': {'$eq': 'apparel'}}, {'quantity': {'$eq': 1057}}]}, 'queryHash': 'CD8F6D8F', 'planCacheKey': 'CD8F6D8F', 'winningPlan': {'stage': 'UPDATE', 'inputStage': {'stage': 'COLLSCAN', 'filter': {'$and': [{'category': {'$eq': 'apparel'}}, {'quantity': {'$eq': 1057}}]}, 'direction': 'forward'}}, 'rejectedPlans': []}, 'serverInfo': {'host': 'Juliuss-MBP.verizon.net', 'port': 27017, 'version': '4.4.0-rc10', 'gitVersion': '5218ee3e883b0230e121ae13a7640e0bc4a313ae'}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1594936925, 1), 'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'keyId': 0}}, 'operationTime': Timestamp(1594936925, 1)}

pass
def make_func(old_func, old_func_name):
def new_func(self: Collection, *args, **kwargs):
logging.info(getattr(ExplainCollection(self), old_func_name)(*args,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some context to the long line? Maybe something as simple as "update_one explained: {...}"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context on what it does? Or just an example of the output?

Copy link
Collaborator

@ShaneHarvey ShaneHarvey Jul 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context in the log line itself. Right now we just log the response:

2020-07-16 18:02:14,209 INFO __main__ {'queryPlanner': ...}

It would be nicer to include the method name being explained so that the user know what the output is for. Like this:

2020-07-16 18:02:14,209 INFO __main__ update_one explain response: {'queryPlanner': ...}
2020-07-16 18:02:14,209 INFO __main__ find_one explain response: {'queryPlanner': ...}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see what you mean. Will do

def make_func(old_func, old_func_name):
def new_func(self: Collection, *args, **kwargs):
logging.info("%s explain response: %s"% (old_func_name, getattr(
ExplainCollection(self),old_func_name)(*args, **kwargs)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest refactoring this slightly to make the code a bit more readable:

res = getattr(ExplainCollection(self), old_func_name)(*args, **kwargs)
logging.info("%s explain response: %s", old_func_name, res) 

Note that with the logging methods you should pass the format string arguments without using % on the format string directly. See https://docs.python.org/3/howto/logging.html#logging-variable-data

Copy link
Collaborator

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I think there's some follow up work we need to do in order to:

  1. Test __main__.py on travis.
  2. Enable support for script arguments:
$ python -m pymongoexplain script.py # This works
$ python -m pymongoexplain script.py --arg=scriptArgument ...  # This does not

Could you open an issue to track those two improvements?

Copy link

@prashantmital prashantmital left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start - we can merge this for now but it definitely needs some more testing.

def __init__(self):
self.payloads = []
FORMAT = '%(asctime)s %(levelname)s %(module)s %(message)s'
logging.basicConfig(format=FORMAT, level=logging.INFO)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to investigate how us configuring logging in __main__ might interact with the script being run configuring logging as well. I think if it is technically feasible, we should try to ensure:

  • that the logging configuration in main does not make the logger less verbose than as configured by the script, and
  • that the script cannot configure logging in a manner that would inadvertently suppress explain output.

I don't necessarily think that these objectives need to be achieved in code - good documentation should also suffice. Please open an issue to track this work.

import sys
from bson.son import SON
import logging

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a brief module-level docstring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


for old_func, old_func_name in zip(old_functions, old_function_names):
setattr(Collection, old_func_name, make_func(old_func, old_func_name))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you confirm that this patches Collection no matter what namespace it is imported from? In other words, do we know that the following two cases would both see the same behavior:

import pymongo

pymongo.Collection(...).find(...)

and

from pymongo import Collection

Collection(...).find(...)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also want the patching to work when PyMongo itself returns a Collection (e.g. via Database.get_collection). Checking once manually on your local machine should suffice for now, though maybe you can add a note about testing these cases to the ticket Shane has asked you to open above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested those two cases and it still works.

@juliusgeo juliusgeo merged commit b7d547f into mongodb-labs:master Jul 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants