Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MonkeyType is not working in IPython #68

Closed
tonyfast opened this issue Jan 31, 2018 · 9 comments
Closed

MonkeyType is not working in IPython #68

tonyfast opened this issue Jan 31, 2018 · 9 comments
Labels

Comments

@tonyfast
Copy link
Contributor

The newer version of MonkeyType does not work in IPython. I think e8bb8a6 could be the reason.

An interactive session in IPython will assert __name__ == '__main__' it would still be meaningful to capture the CallTrace so folks could interactively type their notebook/python source.

I can totally understand if this falls outside the scope of the project.

@carljm
Copy link
Contributor

carljm commented Jan 31, 2018

Hi, thanks for the report! Can you give repro steps for how MonkeyType previously worked usefully in an IPython notebook, that fails today? The reason we disabled tracing from __main__ is that when generating or applying those stubs MonkeyType will try to import __main__ which doesn't generally work, so I'm not clear how it ever worked in an IPython context. If you can outline a specific usage flow that used to work, we can investigate making it work again. Thanks!

@tonyfast
Copy link
Contributor Author

The use cases are scattered at the moment, but I am hopeful that MonkeyType can be used to provide interactive type annotations why someone is developing interactively.

The current prototype creates cell magics that use the MonkeyType profiler. The magics will either write the stubs to a file, show the stub in the next input, or show the retyped stub. I was validating this concept off of these tests.

This code was working with version 17.21.3.

@carljm carljm added bug and removed need-info labels Feb 1, 2018
@carljm
Copy link
Contributor

carljm commented Feb 1, 2018

Cool, thanks! Yeah, I can see why this used to work and no longer does. I think it's reasonable for MonkeyType to not record traces from __main__ by default (because they won't work with the normal stub/apply flow), but that should be configurable, so that someone working with deeper internals (like you are) can still get the traces you need. IOW we should put that in the default code filter instead of hardcoded into CallTracer, then you can easily override the default code filter with your own. Sound OK to you?

@carljm
Copy link
Contributor

carljm commented Feb 1, 2018

Hmm, it won't be quite that simple. The code filter API only gets the code object, not the frame (which may have been a design mistake); I don't think it's possible to tell from the code object alone whether the module name is __main__. So, options that I see:

  1. We could just delete that check and go ahead and record traces with module __main__ by default. I don't think this has any really bad consequences (@vodik do you know of any?); worst is probably a confusing additional entry from the new list_modules command.

  2. We could instead have the default CallTraceLogger discard such traces, so it's still default but overridable. Kinda ugly though to have CallTraceLogger doing what is really code-filter work.

  3. We could decide that it was a design mistake to not pass the whole frame to the code filter (which I think it was; there's really no downside, and it gives the code filter more info to work with). Problem is for back compat we'd have to introduce a new config method for "frame filters" and return support for code filters that take the code object. Or I guess we could make the signature be (code_obj, frame) and use signature inspection to decide whether to supply the second arg; kinda weird to take both args, since frame has a direct reference to code object, but would be the most painless way to do it.

I guess I lean toward option 1? Otherwise I'd go for 3, but that seems like a lot of work unless we discover more use cases for giving code filters access to the frame. @mpage any preferences?

@vodik
Copy link
Contributor

vodik commented Feb 4, 2018

I have no problem with lifting that filter. I put it in place mostly because I didn't think storing __main__ can show anything meaningful (admittedly wrongly, as I think this is a valid use case in this issue).

In recap, the problem I had (when I put in -m support and added that filter) is that my project has 3 different entry points - one backend server and two frontend cli utilities, and each entry point script, when collected, gets dropped into a single __main__, making it difficult to work with. Sometimes they'd even clobber each other.

Maybe another option would be collect it but maybe gate it from being used by the tooling (for example, maybe hide it by default in list-modules)? Still gives people room to play with it progromatically. I don't see, for example, how __main__ could work well with #26 otherwise.

We could decide that it was a design mistake to not pass the whole frame to the code filter (which I think it was; there's really no downside, and it gives the code filter more info to work with). Problem is for back compat we'd have to introduce a new config method for "frame filters" and return support for code filters that take the code object.

I actually think this is a very good idea and might be worth the breaking change. I can't speak for others, but it wouldn't break anything for me since the default filter is good enough for my use cases.

@tonyfast
Copy link
Contributor Author

tonyfast commented Feb 4, 2018

Thanks for digging into this issue y'all

Users of projects like python-fire, doit and luigi could get value from MonkeyType in the __main__ context. These applications and interactive computing contexts have a lower code and packaging dependency. In larger software projects, the consequence of__import__('__main__'_) is unpredictable across many projects.

It would rad if __main__ could work with the list-modules machinery. __main__ is likely a special case because it behaves unpredictably compared to normal packages. Maybe an argument like list-main which appends __main__ to list-modules would work.

In IPython, this configuration could be done as an IPython extension similar to the way the %%monkey magic was created.

@carljm
Copy link
Contributor

carljm commented Feb 5, 2018

I realized there's a significant downside to passing the entire frame to code filters; it breaks the LRU cache we have on the default code filter, which makes a big perf difference. We see the same code object many times over, but every frame is different.

So I guess I reluctantly lean towards option 2 (filter out __main__ by default in CallTraceLogger. __main__ is weird enough (especially considering the point that within a single project there could be multiple different modules run as __main__ clobbering each others' traces) that I'd really rather have it untraced by default and leave it to projects like the IPython integration to manually re-enable it with a custom trace logger when it's needed.

@tonyfast
Copy link
Contributor Author

tonyfast commented Feb 7, 2018

I've been staring at CallTraceStoreLogger and CallTraceLogger for a while. I can see why you didn't want option 2. It seems like formalizing some of the CallTraceLogger machinery is necessary to solve this issue. Would it make sense to move CallTraceStoreLogger.traces to CallTraceLogger.traces. This suggests something like:

class CallTraceLogger(metaclass=ABCMeta):
    def __init__(self):
        self.traces: List[CallTrace] = []

    def log(self, trace: CallTrace) -> None:
        self.traces.append(trace)

    def flush(self) -> None:
        self.traces = []

or as a dataclass

@dataclass
class CallTraceLogger(metaclass=ABCMeta):
    traces: List[CallTrace] = field(default_factory=list)
        
    def log(self, trace: CallTrace) -> None:
        self.traces.append(trace)

    def flush(self) -> None:
        self.traces = []

dataclass requires a dependency for Python 3.6, but is standardlib in 3.7.

@carljm
Copy link
Contributor

carljm commented Feb 11, 2018

I think it'd be preferable to implement this filter in CallTraceStoreLogger and leave CallTraceLogger as-is. CallTraceLogger is just an interface; CallTraceStoreLogger is the default implementation of it. Also, a big part of the reasoning for wanting to exclude __main__ traces is that there could be multiple __main__ stomping over each others' traces in the trace store, but this reasoning applies only to a logger that stores traces in a store; thus CallTraceStoreLogger.

I don't see any rationale here for introducing a dataclass dependency.

tonyfast added a commit to tonyfast/MonkeyType that referenced this issue Feb 12, 2018
This commit moves the filtering logic for main modules to the CallTraceLoggerStore.  All of the tests pass and this works in IPython.
@carljm carljm closed this as completed in 68a6e63 Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants