Support protocols in fine grained mode #4790
Conversation
…more debugging needed: protocol subtype not checked in second run, although CallExpr visited in checkexpr
…ting cache is still too soon, it does many other things
Here's my first round of review. While writing this, I started thinking about an alternative approach which wouldn't require a global cache file. I'll need to think about that more before I can say more -- I'm not even sure if it's feasible yet.
Another general comment: I don't like the proliferation of mutable cache data in TypeInfo objects. An alternative approach would be to put all global caches into a new shared cache object, and each TypeInfo would have a reference to this object. I think that it would be cleaner, and it would also give us centralized read access to the cache data, which could be helpful.
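A minimal sketch of that alternative, assuming the shared object (called SharedTypeCaches here, a hypothetical name) owns the mutable maps and each TypeInfo holds a reference to one shared instance:

```python
from typing import Dict, Set


class SharedTypeCaches:
    """Hypothetical central store for the mutable protocol caches."""

    def __init__(self) -> None:
        # Class full name -> protocol members it has been checked against.
        self.checked_against_members = {}  # type: Dict[str, Set[str]]
        # Protocol full name -> full names of classes checked as implementations.
        self.attempted_implementations = {}  # type: Dict[str, Set[str]]

    def record_check(self, left_fullname: str, protocol_fullname: str,
                     protocol_members: Set[str]) -> None:
        """Single entry point for writes, giving centralized access to the data."""
        self.checked_against_members.setdefault(left_fullname, set()).update(protocol_members)
        self.attempted_implementations.setdefault(protocol_fullname, set()).add(left_fullname)


# Each TypeInfo would be constructed with a reference to the single shared
# instance (e.g. TypeInfo(..., caches=shared)) instead of growing new mutable
# attributes of its own.
```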
mypy/server/deps.py
Outdated
# (unless invalidated by other deps). We mark this kind of deps
# by a star, since they are higher priority, i.e., they need
# to be processed before any other deps.
deps[trigger].add(info.fullname() + '*')
I'm not excited about adding another kind of dependency just for protocols, particularly since it has somewhat magical properties -- this sounds error-prone. I wonder if there is a way to avoid this?
OK, I will try to put this in a separate structure and generalize the recursive triggering logic so that it can handle both normal and protocol deps.
mypy/server/deps.py
Outdated
processed = set()
deps = defaultdict(set)  # type: DefaultDict[str, Set[str]]
for node in names.values():
    if isinstance(node.node, TypeInfo) and node.node not in processed:
We should probably guard against processing imported classes that are defined in another module -- these would be processed as part of that other module?
Yes, this will improve speed a bit, will do.
Note however that this means we should never skip processing typing, because e.g. the subtype cache of typing.Iterable also needs to be invalidated if something tested against it changed.
mypy/subtypes.py
Outdated
return True
if right.type.is_protocol:
    if not left.type.is_protocol:
        left.type.checked_against_members.update(right.type.protocol_members)
I'd rather not directly modify checked_against_members here since it violates encapsulation. It would be better to add a TypeInfo method that records the information -- say, something like left.type.record_protocol_subtype_check(right.type).
checked_against_members is a good candidate for a _ prefix since it serves such a specialized purpose and is mutated (even though we don't use _ attribute prefixes much).
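A rough sketch of the suggested method, using a simplified stand-in class (the real TypeInfo has many more fields; the body is an assumption about how the underscored cache would be updated):

```python
from typing import Set


class TypeInfoSketch:
    """Simplified stand-in for TypeInfo, just enough to show the idea."""

    def __init__(self, is_protocol: bool, protocol_members: Set[str]) -> None:
        self.is_protocol = is_protocol
        self.protocol_members = protocol_members
        # Underscore prefix: this mutable cache is an implementation detail.
        self._checked_against_members = set()  # type: Set[str]

    def record_protocol_subtype_check(self, protocol: 'TypeInfoSketch') -> None:
        """Record which protocol members this class was checked against."""
        assert protocol.is_protocol
        self._checked_against_members.update(protocol.protocol_members)


# The call site in mypy/subtypes.py would then read:
#     left.type.record_protocol_subtype_check(right.type)
```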
Agree.
mypy/subtypes.py
Outdated
@@ -395,6 +398,7 @@ def f(self) -> A: ...
as well.
"""
assert right.type.is_protocol
right.type.attempted_implementations.add(left.type.fullname())
Similar to above, it's better not to directly poke the state of objects. Use a method instead.
mypy/subtypes.py
Outdated
return True
if right.type.is_protocol:
    if not left.type.is_protocol:
        left.type.checked_against_members.update(right.type.protocol_members)
Similar to above.
mypy/build.py
Outdated
manager.log("Error writing protocol deps JSON file {}".format(proto_cache)) | ||
|
||
|
||
def get_protocol_deps_cache_name(manager: BuildManager) -> Tuple[str, str]: |
This duplicates parts of get_cache_names. It would be better to share the code (such as the cache directory naming logic).
Also add a docstring -- it's not immediately clear what the purpose of having a separate protocol deps cache file is.
OK, I will try to refactor and will certainly add the docstring.
mypy/build.py
Outdated
for id in graph:
    meta_snapshot[id] = graph[id].source_hash
if not atomic_write(proto_meta, json.dumps(meta_snapshot), '\n'):
    manager.log("Error writing protocol meta JSON file {}".format(proto_cache))
Error handling is a bit sketchy -- the cache is unusable without a protocol meta JSON file. Other cache files can be regenerated by reprocessing individual modules, but the global cache can't be built incrementally. So if we write some cache files but not the global cache file, we should probably fail the build. If we don't write any cache files at all, things are probably okay, since we'll need to do a full rebuild, which will also populate the protocol data structures correctly.
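A sketch of the kind of handling being discussed (the later revision of this PR reports a blocking error along these lines; atomic_write, _get_prefix and the messages follow the surrounding code, the control flow here is illustrative):

```python
error = False
if not atomic_write(proto_meta, json.dumps(meta_snapshot), '\n'):
    manager.log("Error writing protocol meta JSON file {}".format(proto_meta))
    error = True
if not atomic_write(proto_cache, json.dumps(listed_proto_deps), '\n'):
    manager.log("Error writing protocol deps JSON file {}".format(proto_cache))
    error = True
if error:
    # Per-module cache files may already be on disk, but the global protocol
    # cache cannot be rebuilt incrementally, so fail the build loudly.
    manager.errors.set_file(_get_prefix(manager), None)
    manager.errors.report(0, 0, "Error writing protocol dependencies cache",
                          blocker=True)
```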
Yes, this is the part where I am not sure what to do. Also, should we invalidate the protocol cache separately if the cache was written with --strict-optional but is used without it?
mypy/build.py
Outdated
manager.log("Error writing protocol meta JSON file {}".format(proto_cache)) | ||
listed_proto_deps = {k: list(v) for (k, v) in proto_deps.items()} | ||
if not atomic_write(proto_cache, json.dumps(listed_proto_deps), '\n'): | ||
manager.log("Error writing protocol deps JSON file {}".format(proto_cache)) |
Similar to above -- error checking needs to be more involved.
mypy/build.py
Outdated
else:
    process_graph(graph, manager)
    if manager.options.cache_fine_grained or manager.options.fine_grained_incremental:
        proto_deps = collect_protocol_deps(graph)
This needs a comment, since the special protocol dependencies are non-trivial.
mypy/build.py
Outdated
@@ -2052,8 +2131,13 @@ def dispatch(sources: List[BuildSource], manager: BuildManager) -> Graph:
# just want to load in all of the cache information.
if manager.use_fine_grained_cache():
    process_fine_grained_cache_graph(graph, manager)
    manager.proto_deps = read_protocol_cache(manager, graph)
read_protocol_cache can return None if it can't load the file. Can we handle it here? Without protocol dependencies we won't be able to produce correct results, so it looks like handling it somehow is the right thing to do. Maybe we should do a full build instead of using the cache?
I was thinking about showing a blocking error and stopping the build. There is a TODO about this in update.py.
… logic used by mypy
…members by using fixup.stale_info
…OK, now I updated the PR to use the newly added
@JukkaL This is ready for your review.
Partial review, will continue tomorrow.
mypy/dmypy_server.py
Outdated
@@ -196,6 +196,7 @@ def serve(self) -> None:


def free_global_state(self) -> None:
    TypeState.reset_all_subtype_caches()
    TypeState.reset_protocol_deps()
These two calls could be wrapped in a function somewhere, since the two lines are repeated in mypy.build -- something like reset_global_state. It could be a module-level function in mypy.typestate, and if we ever have multiple modules with global state, we can just move the function to another module.
OK, I will make it a global function in typestate.
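Roughly what the agreed module-level helper in mypy/typestate.py could look like (a sketch, not necessarily the final code):

```python
def reset_global_state() -> None:
    """Reset all global state used across incremental runs.

    Currently all such state lives in TypeState (defined in this module),
    so no imports are needed; if other modules grow global state, the
    corresponding resets can be added here.
    """
    TypeState.reset_all_subtype_caches()
    TypeState.reset_protocol_deps()
```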
mypy/build.py
Outdated
"""Read and validate protocol dependencies cache.""" | ||
proto_meta, proto_cache = get_protocol_deps_cache_name(manager) | ||
try: | ||
with open(proto_meta, 'r') as f: |
Use file system cache here (and elsewhere in this function)?
Actually, after doing this I am not sure it will give any speed-up. Protocol deps are currently read only once, when we load the cache. So this is more a consistency question (using the same API everywhere) than a performance one.
mypy/subtypes.py
Outdated
@@ -392,6 +392,7 @@ def f(self) -> A: ...
as well.
"""
assert right.type.is_protocol
TypeState.record_protocol_subtype_check(left.type, right.type)
Please add a short comment explaining why we do this ("We need to record this check to generate fine-grained dependencies." or something).
mypy/typestate.py
Outdated
# We also snapshot protocol members of the above protocols.
_checked_against_members = {}  # type: Dict[TypeInfo, Set[str]]
# TypeInfos that has been reprocessed since latest dependency snapshot update.
_reprocessed_types = set()  # type: Set[TypeInfo]
The term 'reprocessed' is a bit unclear here. Can you spell out what it means exactly? Does it only apply to fine-grained incremental mode?
Does it only apply to fine-grained incremental mode?
Essentially yes. This is an optimization to avoid gathering dependencies from all registered types, gathering only from those re-checked during the last update. For a full build, this set simply contains all registered types; we can't optimize that case.
…l proto deps update; attempt at fixing bytes/str on Python 3.4
@JukkaL Thanks for the review! I addressed all your comments and also made a few minor updates (for better separation between normal deps and protocol deps -- now they are mixed only immediately before use, or when dumping in a test).
This is the next batch of my code review, will continue tomorrow.
mypy/build.py
Outdated
    error = True
if error:
    manager.errors.set_file(_get_prefix(manager), None)
    manager.errors.report(0, 0, "Error writing protocol dependencies cache.",
Style nit: error messages don't generally have a trailing period.
Serialize protocol dependencies map for fine grained mode. Also take the snapshot
of current sources to later check consistency between protocol cache and individual
cache files.
Document which dependencies are included in proto_deps (there are three different kinds which might plausibly be included).
mypy/build.py
Outdated


def read_protocol_cache(manager: BuildManager,
                        graph: Graph) -> Optional[Dict[str, Set[str]]]:
    """Read and validate protocol dependencies cache."""
Document which dependencies are read (out of the three kinds), or just add a reference to write_protocol_deps_cache if that contains the description.
mypy/build.py
Outdated
proto_meta, proto_cache = get_protocol_deps_cache_name(manager)
try:
    data = manager.fscache.read(proto_meta).decode()
    manager.trace('Proto meta {}'.format(data.rstrip()))
Style nit: Move this and the following line out of the try statement. Generally it's better to have only the potentially failing statement within the try block for clarity (and to avoid catching unanticipated exceptions).
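Applied to the snippet above, the suggestion would look roughly like this (a sketch; the log message on failure is illustrative):

```python
try:
    data = manager.fscache.read(proto_meta).decode()
except IOError:
    manager.log("Could not load protocol metadata file")
    return None
manager.trace('Proto meta {}'.format(data.rstrip()))
meta_snapshot = json.loads(data)
```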
mypy/build.py
Outdated
    data = manager.fscache.read(proto_meta).decode()
    manager.trace('Proto meta {}'.format(data.rstrip()))
    meta_snapshot = json.loads(data)
except IOError:
Maybe handle unicode decode errors here as well?


Since these dependencies represent a global state of the program, they
are serialized per program, not per module, and the corresponding files
live at the root of the cache folder for a given Python version.
Maybe document that this returns a tuple (meta file path, data file path) and describe what's included in the meta file.
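For instance, the docstring could spell this out along these lines (a sketch based on what the function writes elsewhere in this PR):

```python
def get_protocol_deps_cache_name(manager: BuildManager) -> Tuple[str, str]:
    """Return the file paths used for the protocol dependencies cache.

    Returns a tuple (meta file path, data file path). The meta file stores
    the source hashes of all modules at the time the cache was written, so
    that the protocol cache can be checked for consistency against the
    per-module cache files; the data file stores the dependencies themselves.
    """
```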
mypy/build.py
Outdated
# after we loaded cache for whole graph. The `read_protocol_cache` will also validate
# the protocol cache against the loaded individual cache files.
TypeState.proto_deps = read_protocol_cache(manager, graph)
if TypeState.proto_deps is None and manager.stats.get('fresh_trees', 0):
Maybe the latter condition could be written as manager.stats.get(..., 0) > 0 to be more explicit? Also mention this in the comment, since it looked a bit confusing at first. Also, it could be helpful to document the stats attribute (in BuildManager, not here) so that it's clear that we rely on the stats for correctness.
mypy/build.py
Outdated
TypeState.proto_deps = read_protocol_cache(manager, graph)
if TypeState.proto_deps is None and manager.stats.get('fresh_trees', 0):
    manager.errors.set_file(_get_prefix(manager), None)
    manager.errors.report(0, 0, "Error reading protocol dependencies cache. Aborting.",
Again, leave out the final period (maybe like this: "Error reading protocol dependencies cache -- aborting").
mypy/build.py
Outdated
# processed the whole graph.
TypeState.update_protocol_deps()
if TypeState.proto_deps is not None:
    write_protocol_deps_cache(TypeState.proto_deps, manager, graph)
For consistency, we probably shouldn't write any cache files in fine-grained incremental mode.
mypy/server/update.py
Outdated


The first item in the return tuple is a list of deferred nodes that
needs to be reprocessed. If the target represents a TypeInfo corresponding
to a protocol. Return it as a second item in the return tuple, otherwise None.
Grammar: Should be "... to a protocol, return it as ..."
Here's another batch of comments -- mostly requests for more comments/docstrings.
mypy/typestate.py
Outdated


def reset_global_state() -> None:
    """Reset all existing global states. Currently they are all in this module."""
This doesn't yet reset all global state -- at least the strict optional status is not reset, and we use functools.lru_cache. Maybe reword as "Reset most ..." and discuss known deviations in the docstring.
Grammar nit: use state instead of states.
#
# TODO: Some __* names cause mistriggers. Fix the underlying issue instead of
# special casing them here.
diff.add(id + WILDCARD_TAG)
break
if item.count('.') > package_nesting_level + 1:
Add a comment about this -- these are for changes within classes, used by protocols?
mypy/server/update.py
Outdated
@@ -665,12 +669,14 @@ def calculate_active_triggers(manager: BuildManager,
# Activate catch-all wildcard trigger for top-level module changes (used for
# "from m import *"). This also gets triggered by changes to module-private
# entries, but as these unneeded dependencies only result in extra processing,
# it's a minor problem.
# it's a minor problem. Also used by protocols.
The "Also used by protocols." doesn't refer to top-level module changes, so this comment seems a little confusing, since this particular things doesn't seem to be used by protocols?
mypy/server/deps.py
Outdated
@@ -235,6 +233,13 @@ def process_type_info(self, info: TypeInfo) -> None:
    self.add_type_dependencies(info.typeddict_type, target=make_trigger(target))
if info.declared_metaclass:
    self.add_type_dependencies(info.declared_metaclass, target=make_trigger(target))
if info.is_protocol:
    for base_info in info.mro[:-1]:
        self.add_dependency(make_wildcard_trigger(base_info.fullname()),
Worth adding a clarifying comment here. Protocols a class is compatible with are often not in the MRO, so it's worth discussing this as well -- actually, is this even necessary?
6 tests fail if I only add this for the class itself but not its MRO. The idea is that we can have subprotocols, so the actual interface is determined by the whole MRO; if something is changed in an explicit superprotocol, we need to invalidate the protocol. A simple example is [case testInvalidateProtocolViaSuperClass]. There are other ways to treat this (like including protocol members in a class snapshot), but this one seems conceptually simplest to me and the most similar to how non-protocol classes are treated.
@@ -2211,6 +2208,18 @@ def get_containing_type_info(self, name: str) -> 'Optional[TypeInfo]':
        return cls
    return None

    @property
    def protocol_members(self) -> List[str]:
        # Protocol members are names of all attributes/methods defined in a protocol
Should this return an empty list if the class is not a protocol, similar to how this was defined earlier?
I would say it is not necessary, but I see no particular danger in this.
mypy/typestate.py
Outdated
# a dependency snapshot at the very end, so this set will contain all type-checked
# TypeInfos. After a fine grained update however, we can gather new dependencies only
# from few TypeInfos that were type-checked during this update, because these are
# the only that can generate new protocol dependencies.
This comment was a bit unclear. Which dependencies does the comment refer to (of the three kinds)? It's not clear why only type-checked TypeInfos can generate new dependencies, since elsewhere we've established that dependencies are a global property of a program. Please elaborate a bit. What does type-checked mean exactly here? It could plausibly mean, for example, some of 1) triggered ClsName, 2) triggered <ClsName>, 3) the file containing the class was modified, 4) the target containing the class definition was processed -- but I suspect it means none of these.
Style/grammar nits: "from few TypeInfos" -> "from the (typically) few TypeInfos"; "are the only" -> "are the only ones".
It's not clear why only type-checked TypeInfos can generate new dependencies, since elsewhere we've established that dependencies are a global property of a program
I don't think there is a contradiction, because it is not the TypeInfo that generates the new deps, but the re-checked assignment or function call (i.e. something where is_subtype was called while re-checking a target). This assignment may be in a module that defines neither the protocol nor the implementation. The TypeInfo is just a way to temporarily store the new dependency, so yes, this is none of the 1-4 you mention. I will try to clarify this.
mypy/typestate.py
Outdated


@classmethod
def reset_protocol_deps(cls) -> None:
    cls.proto_deps = {}
Mention when this is expected to be run in a docstring.
mypy/typestate.py
Outdated


The first kind is generated immediately per-module in deps.py. While two other kinds are
generated here after all modules are type checked anf we have recorded all the subtype
checks.
Can you give examples where each of these dependency kinds are needed to produce correct type checking results?
mypy/typestate.py
Outdated
""" | ||
if cls.proto_deps is None: | ||
# Unsuccesful cache loading, nothing to do. | ||
return |
Is silently failing here the right thing to do? Should we instead assert that cls.proto_deps is not None?
I think you are right, we shouldn't call this after an unsuccessful cache load; I will change this to an assert here and below.
mypy/typestate.py
Outdated
in its __init__.
"""
cls.update_protocol_deps()  # just in case
if TypeState.proto_deps is None:
Similar to above, silently failing seems kind of suspicious. I'd rather perform the check in the call site perhaps, and only when it's actually safe to do so.
@JukkaL I addressed the previous round of comments. I abandoned the idea of using the
@JukkaL Thanks! I fixed all the remaining comments. You can continue reviewing.
Currently we don't use the fscache for any of the cache files. We only use it for source files.
This is my final batch of comments. A lot of thought went into this PR -- sorry for taking so many iterations to review this! This seems almost good to merge (some comments can be addressed later in separate PRs) but the caching issue we discussed offline needs to be addressed first.


def _cache_dir_prefix(manager: BuildManager, id: Optional[str] = None) -> str:
    """Get current cache directory (or file if id is given)."""
Hmm, the new name doesn't work with a non-None id argument, since it won't be a directory prefix then? I'd recommend having another function for the file prefix that calls _cache_dir_prefix. (It's okay to do this in a separate PR.)
mypy/typestate.py
Outdated
# a value of type a.A to a function expecting something compatible with Iterable, we'd have
# 'a.A' -> {'__iter__', ...} in the map. This map is also flushed after every incremental
# update. This map is needed to only generate dependencies like <a.A.__iter__> -> <a.A>
# instead of a wildcard to avoid unnecessary invalidating classes.
Grammar nit: unnecessary -> unnecessarily
mypy/typestate.py
Outdated
# update. This map is needed to only generate dependencies like <a.A.__iter__> -> <a.A>
# instead of a wildcard to avoid unnecessary invalidating classes.
_checked_against_members = {}  # type: Dict[str, Set[str]]
# TypeInfos that appeared as a left type (subtype) in a subytpe check since latest
Typo: subytpe
mypy/typestate.py
Outdated
# run we only take a dependency snapshot at the very end, so this set will contain all
# subtype-checked TypeInfos. After a fine grained update however, we can gather only new
# dependencies generated from (typically) few TypeInfos that were subtype-checked
# (i.e. appeared as r.h.snin an assignment or an argument in a function call in
Typo: r.h.sin
    return deps


@classmethod
def update_protocol_deps(cls, second_map: Optional[Dict[str, Set[str]]] = None) -> None:
Maybe the interface or the implementation could be simplified -- this was a bit unclear. One idea would be to split this into a few different methods:
- flush_protocol_deps() would do _snapshot_protocol_deps + clear the attributes.
- add_protocol_deps(deps, to_add) would add to_add deps to an existing dependency map (corresponds to each of the loops in this method).
This method would then mostly contain calls to the two methods above, making the structure easier to see.
It's okay to do this refactoring in a separate PR.
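A sketch of the proposed split, written as methods that would live on TypeState (the method names come from the suggestion above; the bodies are illustrative):

```python
# Inside class TypeState:

@classmethod
def add_protocol_deps(cls, deps: Dict[str, Set[str]], to_add: Dict[str, Set[str]]) -> None:
    """Merge to_add into an existing dependency map (one call per loop in the old method)."""
    for trigger, targets in to_add.items():
        deps.setdefault(trigger, set()).update(targets)

@classmethod
def flush_protocol_deps(cls) -> None:
    """Snapshot deps recorded since the last flush and clear the recording attributes."""
    assert cls.proto_deps is not None, "Protocol cache was not loaded"
    cls.add_protocol_deps(cls.proto_deps, cls._snapshot_protocol_deps())
    cls._checked_against_members.clear()
```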
import b
class C:
    x: int
x: b.P = C()
C() depends on the (abstract) attribute, so a potentially slightly more narrowly targeted test would use a variable with type C instead of calling C() -- maybe not in this case, but somewhere where attributes are added to a protocol, for example.
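For example, a variant of the test above along those lines (illustrative only, not a test that is actually in the PR):

```python
import b
class C:
    x: int
c: C
x: b.P = c  # checks only the protocol members, without touching __init__
```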
test-data/unit/fine-grained.test
Outdated
def f() -> None:
    def g(x: b.P) -> None:
        pass
    g(C())
The note above seems to apply here.
x: str
[out]
==
a.py:8: error: Argument 1 to "g" has incompatible type "C"; expected "P"
Not sure if all these test case variants are worth it. This seems pretty similar to what we had above?
I remember most of these were added for a reason, plus a few variations in target, but I don't remember now exactly which were just variations. I think there shouldn't be many; if you are sure about certain test cases, I can delete them.
test-data/unit/fine-grained.test
Outdated
a.py:8: note: Following member(s) of "C" have conflicts:
a.py:8: note: x: expected "int", got "str"

[case testProtocolConcreteUpdateTypeMetodGeneric]
Typo in test case name: Metod
test-data/unit/fine-grained.test
Outdated
a.py:2: note: Following member(s) of "C" have conflicts:
a.py:2: note: z: expected "int", got "str"

[case testWeAreCarefullWithBuiltinProtocols]
Typo: Carefull.
I think all issues are addressed now and the follow-up issues are opened, so I am going to merge this now. Thanks @JukkaL for the thorough review!
Support for protocols in fine-grained mode is non-trivial because protocols are not very "modular" (they can have action at a distance). To support them we introduce several new kinds of dependencies and global state that represents the protocol dependency map (serialized in a separate cache file).
There are three kinds of protocol dependencies. For example, after a subtype check:
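For instance (a sketch, with P a protocol class and C an ordinary class):

```python
x: P = C()
```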
the following dependencies will be generated:
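Roughly, the three kinds look like this (the exact trigger spellings here are a sketch; the docstring of the snapshot helper in mypy/typestate.py has the authoritative version, with SuperP standing for an explicit super-protocol of P):

1. <SuperP[wildcard]> -> <P>, <P[wildcard]> -> <P> -- wildcard triggers on the protocol and its super-protocols re-trigger the protocol itself.
2. <C.attr> -> <C> for every attr among P's protocol members -- changing a member that C was checked against re-triggers C.
3. <C> -> P -- a plain (non-trigger) target used to invalidate the subtype cache of the protocol when C changes.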
The first kind is generated immediately per-module, while the two other kinds are generated after all modules are type checked and we have recorded all the subtype checks.
Another change is that we reset subtype caches in protocols scheduled for recheck before processing any other targets. This avoids some false negatives in corner cases that are sensitive to module processing order.
I didn't test the scenario where the protocol cache files are tampered with; it is also hard to write any tests for this.
UPDATE: edited the description to reflect the current state of the PR.