Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor IPCompleter Matcher API #13745

Merged
merged 14 commits into from Oct 5, 2022
Merged

Conversation

krassowski
Copy link
Member

@krassowski krassowski commented Sep 5, 2022

Closes #12820 and closes #13735. Supersedes #13734.

Partially fixes #13155 by exposing full context to matchers (but not the hooks yet - that would be a breaking change; I can follow up with it).

This is a gentle, backward and forward-compatible rewrite of Matcher API. I attempted to design the API for performance and future extensibility, while retaining compatibility with existing code.

There is quite some legacy logic in this codebase, which I tried to carefully preserve and document by making any such behaviour explicit, but in a few places I encountered behaviour which I don't know how to tackle (no unit tests, nor comments - not sure if existing behaviour was intended or incidental due to accumulated code complexity). I would highly appreciate help on such cases, suggestions on how to handle deprecations (if any are needed) and general feedback on this PR.

User-facing changes

Completion type is returned in the provisional API for most matchers

Before After
Screenshot from 2022-09-07 04-04-28 Screenshot from 2022-09-07 04-04-32

Dictionary completions are restricted to dictionary keys

By default completions in dictionary context will only provide dictionary key matches. This is user-configurable.

The heuristic for this can be improved in the future iterations.

Before After
Screenshot from 2022-09-07 04-19-41 Screenshot from 2022-09-07 04-18-00

Configuration

New configuration options were added:

  • IPCompleter.suppress_competing_matchers allows to instruct which matchers should (or should not) short-circuit to return candidate completions ignoring other matchers
  • IPCompleter.disable_matchers enables selective disabling of matchers

Existing configuration options are now aliases:

  • IPCompleter.merge_completions = False is an alias for IPCompleter.suppress_competing_matchers = True

Matcher API

Briefly, the new Matcher is defined as:

MatcherAPIv1 = Callable[[str], list[str]]
MatcherAPIv2 = Callable[[CompletionContext], MatcherResult]

Matcher = Union[MatcherAPIv1, MatcherAPIv2]

where MatcherAPIv1 is the same public API as exposed via (undocumented) IPCompleter.custom_matchers attribute since v8.0, #12130), MatcherAPIv2 is the new API, and subsequent versions can be added in the future.

Matcher API v2

The MatcherResult of API v2 is a dictionary, as and resolves an old TODO comment:

# FIXME: we should extend our api to return a dict with completions for
# different types of objects. The rlcomplete() method could then
# simply collapse the dict into a list for readline, but we'd have
# richer completion semantics in other environments.

class MatcherResult(TypedDict):
    #: list of candidate completions
    completions: Union[Sequence[SimpleCompletion], Sequence[_JediCompletionLike]]   # simplified

    #: suffix of the provided ``CompletionContext.token``, if not given defaults to full token.
    matched_fragment: NotRequired[str]

    #: whether to suppress results from all other matchers (True), some
    #: matchers (set of identifiers) or none (False); default is False.
    suppress: NotRequired[Union[bool, Set[str]]]

    #: identifiers of matchers which should NOT be suppressed
    do_not_suppress: NotRequired[Set[str]]

    #: are completions already ordered and should be left as-is? default is False.
    ordered: NotRequired[bool]

This will enable adding additional attributes to the dictionary without breaking compatibility in the future. It also makes the API simple for users, as they only need to provide {'completions': list[SimpleCompletion]}.

SimpleCompletion is a container for completion metadata allowing to add type and in future additional attributes. After profiling initialisation, different forms of attribute/item access and memory use I chose to use a custom class with slots over NamedTuple or TypedDict due to inherent performance advantage. This is consistent with Completion class which also uses __slots__.

class SimpleCompletion:
    __slots__ = ["text", "type"]

    text: str
    type: str

CompletionContext is intended to provide matchers with information they need to generate completion candidates:

class CompletionContext(NamedTuple):
    token: str
    full_text: str
    cursor_position: int
    cursor_line: int

    @cached_property
    def text_until_cursor(self) -> str:
        ...

    @cached_property
    def line_with_cursor(self) -> str:
        ...

Switching between APIs and Matcher configuration

A decorator was added enabling demarking API versions and configuring Matcher-level options. The decorator is optional for v1 API, but required to use v2 API

def completion_matcher(
    *, priority: float = None, identifier: str = None, api_version=1
):
    """Adds attributes describing the matcher.
    Parameters
    ----------
    priority : Optional[float]
        The priority of the matcher, determines the order of execution of matchers.
        Higher priority means that the matcher will be executed first. Defaults to 50.
    identifier : Optional[str]
        identifier of the matcher allowing users to modify the behaviour via traitlets,
        and also used to for debugging (will be passed as ``origin`` with the completions).
        Defaults to matcher function ``__qualname__``.
    api_version: Optional[int]
        version of the Matcher API used by this matcher.
        Currently supported values are 1 and 2.
        Defaults to 1.
    """

    def wrapper(func: Matcher):
        func.matcher_priority = priority
        func.matcher_identifier = identifier or func.__qualname__
        func.matcher_api_version = api_version
        return func

    return wrapper

Maintaining compatibility

Existing methods such as dict_key_matches were not touched; instead wrapper methods converting from API v1 to API v2 and adding desired metadata (when applicable) were added; those methods have a new suffix _matcher. This approach allows users who monkey-patch matcher v1 methods in IPCompleter to continue using their existing code without any changes.

Follow-up tasks

To be discussed in separate issues and addressed in separate PRs:

  • benchmark type as a string vs as an enum to inform whether it is worth switching to enum
  • investigate how to integrate relevance score of sort (in sortText in LSP terms)

@krassowski
Copy link
Member Author

Is there an automated way to update doctest with ipdoctest, or do I copy line by line? I added new configuration with verbose explanation and it just happens that In [2]: %config IPCompleter is used as an example in IPython/core/magics/config.py.

@Carreau
Copy link
Member

Carreau commented Sep 6, 2022

Is there an automated way to update doctest with ipdoctest, or do I copy line by line? I added new configuration with verbose explanation and it just happens that In [2]: %config IPCompleter is used as an example in IPython/core/magics/config.py.

No there is no automatic ways. I copy and past also with light editting.

@krassowski krassowski marked this pull request as ready for review September 7, 2022 03:21
@joelostblom
Copy link

Awesome! The key-only completion for dictionaries will be super convenient!

@krassowski
Copy link
Member Author

Ok, I would say this is good for a review and hopefully merge.

I am happy to commit to maintaining this and fixing any problems that might emerge as a result of merging this (which I don't expect frankly since everything should be 100% backward compatible - but we know how it is with big projects).

Documentation-wise I hit a minor hiccup with documenting the TypedDict but I submitted a bugfix to sphinx autodoc which solves the issue sphinx-doc/sphinx#10806 and all members will show up once that is accepted and released.

self.type = type

def __repr__(self):
return f"<SimpleCompletion text={self.text!r} type={self.type!r}>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WE could almost use a dataclass.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided not to use a dataclass due to a performance overhead.



@sphinx_options(show_inherited_members=True, exclude_inherited_from=["dict"])
class SimpleMatcherResult(_MatcherResultBase, TypedDict):
Copy link
Member

@Carreau Carreau Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DO you need to inherit both if _MatcherResultBase is already TypedDict.
and do we want to TypedDict, or simply make a datalass with all the possible existing fields, even if it's slightly backward-incompatible ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DO you need to inherit both if _MatcherResultBase is already TypedDict.

No, not at runtime, but yes if we want to have nice documentation (sphinx-doc/sphinx#10806); a quirk of how TypedDict it was implemented. I added a comment in the body on this.

and do we want to TypedDict, or simply make a datalass with all the possible existing fields, even if it's slightly backward-incompatible ?

I hesitated here and considered making it a dataclass to, but I preferred TypedDict for two reasons:

  • in future releases we can add a new NotRequired field without breaking compatibility (and give notice if we want to make it required in the further future)
  • it does not require users to import any specific class from IPython to wrap the results (or create a custom class complying with the API), so it would keep it simple (while argument name and type safety is provided by typecheckers)

IPython/utils/docs.py Outdated Show resolved Hide resolved
@Carreau
Copy link
Member

Carreau commented Sep 16, 2022

Thanks, I tried to do a quick pass, and will try to get a deeper look when on my way to CZI meeting next week.

In general I think you can be a little bit less careful on backward compatibility, and if you find really old things try to remove them if unnecessary, we only have a few dependees that use old APIs.

krassowski and others added 2 commits September 24, 2022 21:15
Co-authored-by: Matthias Bussonnier <bussonniermatthias@gmail.com>
Data class has narrower API than named tuple (does not expose item getters)
which means it is less likely that the downstream code would be broken in
the future if we introduce new attributes. Also, compared to other places
where memory-efficient named tuple is used, `CompletionContext` is only
created once per completion and thus does not require as low overhead.
For the same reason we can forgo slightly more memory-efficient
`@property @cache` stack and just use `@cached_property`.
@Carreau
Copy link
Member

Carreau commented Oct 5, 2022

Ok, let's just try this and refine if needs there is.

@Carreau Carreau merged commit dc08a33 into ipython:main Oct 5, 2022
@Carreau Carreau added this to the 8.6 milestone Oct 19, 2022
@jdtsmith
Copy link
Contributor

Playing with the new completer in an Emacs mode I'm building. So far so good; great work. Seems faster as well.

I had formerly overridden the matchers property to remove file and magic matches in certain contexts. disable_matchers is obviously the new way to do that. But is the matchers property no longer consulted for the list of enabled matchers? If so, its value is probably misleading.

@krassowski
Copy link
Member Author

Thank you for the feedback @jdtsmith! Would you mind opening an issue with a minimal reproducible example? I don't see what could be the issue as:

  • IPCompleter.matchers was and remained a dynamically generated list defined as a property (the behaviour here did not change)
  • Completer.custom_matchers are still prepended to the matchers list (above) and used extensively in tests, so I don't believe there is a regression here
    Thanks!

And yes, this PR lays a foundation for improved performance (see #13752 for an example).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants