MonkeyType is not working in IPython #68

tonyfast · 2018-01-31T22:39:14Z

The newer version of MonkeyType does not work in IPython. I think e8bb8a6 could be the reason.

An interactive session in IPython will assert __name__ == '__main__' it would still be meaningful to capture the CallTrace so folks could interactively type their notebook/python source.

I can totally understand if this falls outside the scope of the project.

The text was updated successfully, but these errors were encountered:

carljm · 2018-01-31T22:46:05Z

Hi, thanks for the report! Can you give repro steps for how MonkeyType previously worked usefully in an IPython notebook, that fails today? The reason we disabled tracing from __main__ is that when generating or applying those stubs MonkeyType will try to import __main__ which doesn't generally work, so I'm not clear how it ever worked in an IPython context. If you can outline a specific usage flow that used to work, we can investigate making it work again. Thanks!

tonyfast · 2018-01-31T22:56:24Z

The use cases are scattered at the moment, but I am hopeful that MonkeyType can be used to provide interactive type annotations why someone is developing interactively.

The current prototype creates cell magics that use the MonkeyType profiler. The magics will either write the stubs to a file, show the stub in the next input, or show the retyped stub. I was validating this concept off of these tests.

This code was working with version 17.21.3.

carljm · 2018-02-01T22:45:01Z

Cool, thanks! Yeah, I can see why this used to work and no longer does. I think it's reasonable for MonkeyType to not record traces from __main__ by default (because they won't work with the normal stub/apply flow), but that should be configurable, so that someone working with deeper internals (like you are) can still get the traces you need. IOW we should put that in the default code filter instead of hardcoded into CallTracer, then you can easily override the default code filter with your own. Sound OK to you?

carljm · 2018-02-01T23:00:06Z

Hmm, it won't be quite that simple. The code filter API only gets the code object, not the frame (which may have been a design mistake); I don't think it's possible to tell from the code object alone whether the module name is __main__. So, options that I see:

We could just delete that check and go ahead and record traces with module __main__ by default. I don't think this has any really bad consequences (@vodik do you know of any?); worst is probably a confusing additional entry from the new list_modules command.
We could instead have the default CallTraceLogger discard such traces, so it's still default but overridable. Kinda ugly though to have CallTraceLogger doing what is really code-filter work.
We could decide that it was a design mistake to not pass the whole frame to the code filter (which I think it was; there's really no downside, and it gives the code filter more info to work with). Problem is for back compat we'd have to introduce a new config method for "frame filters" and return support for code filters that take the code object. Or I guess we could make the signature be (code_obj, frame) and use signature inspection to decide whether to supply the second arg; kinda weird to take both args, since frame has a direct reference to code object, but would be the most painless way to do it.

I guess I lean toward option 1? Otherwise I'd go for 3, but that seems like a lot of work unless we discover more use cases for giving code filters access to the frame. @mpage any preferences?

vodik · 2018-02-04T18:21:20Z

I have no problem with lifting that filter. I put it in place mostly because I didn't think storing __main__ can show anything meaningful (admittedly wrongly, as I think this is a valid use case in this issue).

In recap, the problem I had (when I put in -m support and added that filter) is that my project has 3 different entry points - one backend server and two frontend cli utilities, and each entry point script, when collected, gets dropped into a single __main__, making it difficult to work with. Sometimes they'd even clobber each other.

Maybe another option would be collect it but maybe gate it from being used by the tooling (for example, maybe hide it by default in list-modules)? Still gives people room to play with it progromatically. I don't see, for example, how __main__ could work well with #26 otherwise.

We could decide that it was a design mistake to not pass the whole frame to the code filter (which I think it was; there's really no downside, and it gives the code filter more info to work with). Problem is for back compat we'd have to introduce a new config method for "frame filters" and return support for code filters that take the code object.

I actually think this is a very good idea and might be worth the breaking change. I can't speak for others, but it wouldn't break anything for me since the default filter is good enough for my use cases.

tonyfast · 2018-02-04T19:48:59Z

Thanks for digging into this issue y'all

Users of projects like python-fire, doit and luigi could get value from MonkeyType in the __main__ context. These applications and interactive computing contexts have a lower code and packaging dependency. In larger software projects, the consequence of__import__('__main__'_) is unpredictable across many projects.

It would rad if __main__ could work with the list-modules machinery. __main__ is likely a special case because it behaves unpredictably compared to normal packages. Maybe an argument like list-main which appends __main__ to list-modules would work.

In IPython, this configuration could be done as an IPython extension similar to the way the %%monkey magic was created.

carljm · 2018-02-05T22:11:52Z

I realized there's a significant downside to passing the entire frame to code filters; it breaks the LRU cache we have on the default code filter, which makes a big perf difference. We see the same code object many times over, but every frame is different.

So I guess I reluctantly lean towards option 2 (filter out __main__ by default in CallTraceLogger. __main__ is weird enough (especially considering the point that within a single project there could be multiple different modules run as __main__ clobbering each others' traces) that I'd really rather have it untraced by default and leave it to projects like the IPython integration to manually re-enable it with a custom trace logger when it's needed.

tonyfast · 2018-02-07T18:02:57Z

I've been staring at CallTraceStoreLogger and CallTraceLogger for a while. I can see why you didn't want option 2. It seems like formalizing some of the CallTraceLogger machinery is necessary to solve this issue. Would it make sense to move CallTraceStoreLogger.traces to CallTraceLogger.traces. This suggests something like:

class CallTraceLogger(metaclass=ABCMeta):
    def __init__(self):
        self.traces: List[CallTrace] = []

    def log(self, trace: CallTrace) -> None:
        self.traces.append(trace)

    def flush(self) -> None:
        self.traces = []

or as a dataclass

@dataclass
class CallTraceLogger(metaclass=ABCMeta):
    traces: List[CallTrace] = field(default_factory=list)
        
    def log(self, trace: CallTrace) -> None:
        self.traces.append(trace)

    def flush(self) -> None:
        self.traces = []

dataclass requires a dependency for Python 3.6, but is standardlib in 3.7.

carljm · 2018-02-11T18:26:40Z

I think it'd be preferable to implement this filter in CallTraceStoreLogger and leave CallTraceLogger as-is. CallTraceLogger is just an interface; CallTraceStoreLogger is the default implementation of it. Also, a big part of the reasoning for wanting to exclude __main__ traces is that there could be multiple __main__ stomping over each others' traces in the trace store, but this reasoning applies only to a logger that stores traces in a store; thus CallTraceStoreLogger.

I don't see any rationale here for introducing a dataclass dependency.

This commit moves the filtering logic for main modules to the CallTraceLoggerStore. All of the tests pass and this works in IPython.

carljm added the need-info label Jan 31, 2018

carljm added bug and removed need-info labels Feb 1, 2018

tonyfast added a commit to tonyfast/MonkeyType that referenced this issue Feb 12, 2018

The minimum changes to address Instagram#68

f8f9dd5

This commit moves the filtering logic for main modules to the CallTraceLoggerStore. All of the tests pass and this works in IPython.

carljm closed this as completed in 68a6e63 Feb 13, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MonkeyType is not working in IPython #68

MonkeyType is not working in IPython #68

tonyfast commented Jan 31, 2018

carljm commented Jan 31, 2018

tonyfast commented Jan 31, 2018

carljm commented Feb 1, 2018

carljm commented Feb 1, 2018

vodik commented Feb 4, 2018 •

edited

tonyfast commented Feb 4, 2018 •

edited

carljm commented Feb 5, 2018

tonyfast commented Feb 7, 2018

carljm commented Feb 11, 2018

MonkeyType is not working in IPython #68

MonkeyType is not working in IPython #68

Comments

tonyfast commented Jan 31, 2018

carljm commented Jan 31, 2018

tonyfast commented Jan 31, 2018

carljm commented Feb 1, 2018

carljm commented Feb 1, 2018

vodik commented Feb 4, 2018 • edited

tonyfast commented Feb 4, 2018 • edited

carljm commented Feb 5, 2018

tonyfast commented Feb 7, 2018

carljm commented Feb 11, 2018

vodik commented Feb 4, 2018 •

edited

tonyfast commented Feb 4, 2018 •

edited