Support protocols in fine grained mode #4790

Merged: 54 commits, May 3, 2018

Commits
ca018df
Track protocol dependencies; update deps tests
Mar 22, 2018
f218a5a
Add few protocol deps tests
Mar 22, 2018
8b7d40b
Start adding tests
Mar 22, 2018
a7031e8
Add dedicated protocol deps; cache writing still too soon, plus some …
Mar 22, 2018
1230d91
Reset protocol subtype cache; postpone protocol deps calculation; wri…
Mar 23, 2018
63ad42d
Some ideas on better reseting protocol caches
Mar 24, 2018
11453c5
Write protocol cache
Mar 25, 2018
331985e
Add few tests
Mar 25, 2018
bccfbc5
More subtype cache invalidation
Mar 25, 2018
a0e32ad
Add more basic tests
Mar 26, 2018
4ca0afe
Add more tests
Mar 26, 2018
bce5f5d
Add more tests
Mar 26, 2018
dd541b0
Add one more test
Mar 26, 2018
60433e0
Remove debugging statements
Mar 26, 2018
9c3a46e
Update one test
Mar 26, 2018
281659b
Invalidate protocol deps cache if tampered with
Mar 26, 2018
29ea7a8
Be carefull with deleted/added files
Mar 26, 2018
156a477
Add some comments
Mar 26, 2018
081552a
Minor fixes with tests (this doesn't address CR)
Mar 26, 2018
df72a03
Address some review comments
Apr 3, 2018
c20878a
Merge remote-tracking branch 'upstream/master' into fine-protocols
Apr 3, 2018
f565c8f
Test built-in protocols with fine-grained.test which matches the full…
Apr 3, 2018
2cd62af
Address more review comments
Apr 3, 2018
3934a8e
Add protocol diff tests
Apr 3, 2018
276564a
Start splitting protocol dependencies separately
Apr 3, 2018
2604067
Split the protocol deps
Apr 3, 2018
531b75e
Some code cleanup; it looks like we can avoid recalculating protocol …
Apr 3, 2018
2d57038
Fix indentation
Apr 4, 2018
5da0aeb
Make protocol_members a property
Apr 4, 2018
608e6e1
Abandon the idea of two priorities, just reset all caches in protocol…
Apr 4, 2018
257f7ae
Minor fixes
Apr 4, 2018
1e29ecc
Merge remote-tracking branch 'upstream/master' into fine-protocols
Apr 4, 2018
17d7db2
Undo some unnecessary formatting changes and other minor clean-ups
Apr 4, 2018
15bd91f
Re-order functions in build in more logical order; store state on sub…
Apr 4, 2018
ecaaeb1
Move one property; address last review comment.
Apr 4, 2018
544d135
Merge remote-tracking branch 'upstream/master' into fine-protocols
Apr 24, 2018
3045ad3
Fix merge
Apr 24, 2018
39ffd04
Move protocol deps to TypeState
Apr 25, 2018
ed1b8f6
Minor clean-ups
Apr 25, 2018
95f13fa
Merge remote-tracking branch 'upstream/master' into fine-protocols
Apr 25, 2018
10a9454
Fix merge
Apr 25, 2018
d136a27
Address review comments
Apr 25, 2018
74b3e7f
Better separation between normal deps and proto deps; more incrementa…
Apr 25, 2018
2fbb5f1
Make snapshot_protocol_deps private; improve docstring
Apr 25, 2018
234ad3c
Merge remote-tracking branch 'upstream/master' into fine-protocols
May 1, 2018
88799dd
Update tests
May 1, 2018
7209bf1
Address review comments
May 2, 2018
cd58a4f
Abandon fscache for meta files, it doesn't work correctly for meta files
May 2, 2018
8ece521
Add comments and docstrings; flush intermediate state
May 2, 2018
c146bcb
Hash on full names instead of TypeInfo instances
May 2, 2018
a4e120c
Add remaining requested comments and docstrings
May 2, 2018
c565769
Address review comments (except for those in follow-up issues)
May 3, 2018
fc680d9
Remove protocol restrictions from the docs
May 3, 2018
10f6a88
Remove irrelevant TODO
May 3, 2018
2 changes: 0 additions & 2 deletions docs/source/mypy_daemon.rst
@@ -88,8 +88,6 @@ command-specific options.
Limitations
***********

* Changes related to protocol classes are not reliably propagated.

* You have to use either the ``--follow-imports=skip`` or
the ``--follow-imports=error`` option because of an implementation
limitation. This can be defined
144 changes: 131 additions & 13 deletions mypy/build.py
@@ -58,7 +58,7 @@
from mypy.defaults import PYTHON3_VERSION_MIN
from mypy.server.deps import get_dependencies
from mypy.fscache import FileSystemCache
from mypy.typestate import TypeState
from mypy.typestate import TypeState, reset_global_state


# Switch to True to produce debug output related to fine-grained incremental
@@ -280,7 +280,7 @@ def _build(sources: List[BuildSource],
flush_errors=flush_errors,
fscache=fscache)

TypeState.reset_all_subtype_caches()
reset_global_state()
try:
graph = dispatch(sources, manager)
if not options.fine_grained_incremental:
@@ -609,7 +609,9 @@ class BuildManager:
but is disabled if fine-grained cache loading fails
and after an initial fine-grained load. This doesn't
determine whether we write cache files or not.
stats: Dict with various instrumentation numbers
stats: Dict with various instrumentation numbers; it is used
not only for debugging, but is also required for correctness,
in particular to check consistency of the protocol dependency cache.
fscache: A file system cacher
"""

@@ -962,6 +964,96 @@ def verify_module(fscache: FileSystemCache, id: str, path: str) -> bool:
return True


def write_protocol_deps_cache(proto_deps: Dict[str, Set[str]],
manager: BuildManager, graph: Graph) -> None:
"""Write cache files for protocol dependencies.

Serialize protocol dependencies map for fine grained mode. Also take the snapshot
of current sources to later check consistency between protocol cache and individual
cache files.
[Review comment from a collaborator: Document which dependencies are included in proto_deps (there are three different kinds which might plausibly be included).]

Out of three kinds of protocol dependencies described in TypeState._snapshot_protocol_deps,
only the last two kinds are stored in global protocol caches; dependencies of the first kind
(i.e. <SuperProto[wildcard]>, <Proto[wildcard]> -> <Proto>) are written to the normal
per-file fine grained dependency caches.
"""
proto_meta, proto_cache = get_protocol_deps_cache_name(manager)
meta_snapshot = {} # type: Dict[str, str]
error = False
for id in graph:
meta_snapshot[id] = graph[id].source_hash
if not atomic_write(proto_meta, json.dumps(meta_snapshot), '\n'):
manager.log("Error writing protocol meta JSON file {}".format(proto_cache))
error = True
listed_proto_deps = {k: list(v) for (k, v) in proto_deps.items()}
if not atomic_write(proto_cache, json.dumps(listed_proto_deps), '\n'):
manager.log("Error writing protocol deps JSON file {}".format(proto_cache))
error = True
if error:
manager.errors.set_file(_cache_dir_prefix(manager), None)
manager.errors.report(0, 0, "Error writing protocol dependencies cache",
blocker=True)
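
For illustration, a minimal sketch of what the two cache files contain (the module names, hashes, and trigger strings below are hypothetical, following the <fullname> trigger syntax used elsewhere in this PR):

# proto_deps.meta.json: a map from module id to the hash of its source
# at the time the protocol dependencies were written.
meta_snapshot = {'mod': '4ae3c8e7ab...', 'builtins': '92f41ba2cc...'}

# proto_deps.data.json: the proto_deps map with each set of targets
# converted to a list so that json.dumps() can serialize it.
listed_proto_deps = {
    '<mod.Proto.x>': ['<mod.Cls.x>', '<mod.Cls>'],
    '<mod.Proto>': ['<mod.Cls>'],
}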


def read_protocol_cache(manager: BuildManager,
graph: Graph) -> Optional[Dict[str, Set[str]]]:
"""Read and validate protocol dependencies cache.

See docstring for write_protocol_cache for details about which kinds of
dependencies are read.
"""
proto_meta, proto_cache = get_protocol_deps_cache_name(manager)
meta_snapshot = _load_json_file(proto_meta, manager,
log_success='Proto meta ',
log_error='Could not load protocol metadata: ')
if meta_snapshot is None:
return None
current_meta_snapshot = {} # type: Dict[str, str]
for id in graph:
meta = graph[id].meta
assert meta is not None, 'Protocol cache should be read after all other cache files'
current_meta_snapshot[id] = meta.hash
common = set(meta_snapshot.keys()) & set(current_meta_snapshot.keys())
if any(meta_snapshot[id] != current_meta_snapshot[id] for id in common):
# TODO: invalidate also if options changed (like --strict-optional)?
manager.log('Protocol cache inconsistent, ignoring')
return None
deps = _load_json_file(proto_cache, manager,
log_success='Proto deps ',
log_error='Could not load protocol cache: ')
if deps is None:
return None
if not isinstance(deps, dict):
manager.log('Could not load protocol cache: cache is not a dict: {}'
.format(type(deps)))
return None
return {k: set(v) for (k, v) in deps.items()}


def _load_json_file(file: str, manager: BuildManager,
log_success: str, log_error: str) -> Optional[Dict[str, Any]]:
"""A simple helper to read a JSON file with logging."""
try:
with open(file, 'r') as f:
data = f.read()
except IOError:
manager.log(log_error + file)
return None
manager.trace(log_success + data.rstrip())
result = json.loads(data) # TODO: Errors
return result


def _cache_dir_prefix(manager: BuildManager, id: Optional[str] = None) -> str:
"""Get current cache directory (or file if id is given)."""
[Review comment from a collaborator: Hmm, the new name doesn't work with a non-None id argument, since it won't be a directory prefix then? I'd recommend having another function for the file prefix that calls _cache_dir_prefix. (It's okay to do this in a separate PR.)]

cache_dir = manager.options.cache_dir
pyversion = manager.options.python_version
base = os.path.join(cache_dir, '%d.%d' % pyversion)
if id is None:
return base
return os.path.join(base, *id.split('.'))
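
For example, with the default cache_dir of '.mypy_cache' and python_version (3, 6) (values illustrative only):

# _cache_dir_prefix(manager)            -> '.mypy_cache/3.6'
# _cache_dir_prefix(manager, id='a.b')  -> '.mypy_cache/3.6/a/b'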


def get_cache_names(id: str, path: str, manager: BuildManager) -> Tuple[str, str, Optional[str]]:
"""Return the file names for the cache files.

@@ -975,9 +1067,7 @@ def get_cache_names(id: str, path: str, manager: BuildManager) -> Tuple[str, str
A tuple with the file names to be used for the meta JSON, the
data JSON, and the fine-grained deps JSON, respectively.
"""
cache_dir = manager.options.cache_dir
pyversion = manager.options.python_version
prefix = os.path.join(cache_dir, '%d.%d' % pyversion, *id.split('.'))
prefix = _cache_dir_prefix(manager, id)
is_package = os.path.basename(path).startswith('__init__.py')
if is_package:
prefix = os.path.join(prefix, '__init__')
@@ -988,6 +1078,20 @@ def get_cache_names(id: str, path: str, manager: BuildManager) -> Tuple[str, str
return (prefix + '.meta.json', prefix + '.data.json', deps_json)


def get_protocol_deps_cache_name(manager: BuildManager) -> Tuple[str, str]:
"""Return file names for fine grained protocol dependencies cache.

Since these dependencies represent a global state of the program, they
are serialized per program, not per module, and the corresponding files
live at the root of the cache folder for a given Python version.
[Review comment from a collaborator: Maybe document that this returns a tuple (meta file path, data file path) and describe what's included in the meta file.]

Return a tuple ('meta file path', 'data file path'), where the meta file
contains hashes of all source files at the time the protocol dependencies
were written, and the data file contains the protocol dependencies.
"""
name = os.path.join(_cache_dir_prefix(manager), 'proto_deps')
return name + '.meta.json', name + '.data.json'
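
Continuing the illustrative defaults above, this would return:

# ('.mypy_cache/3.6/proto_deps.meta.json', '.mypy_cache/3.6/proto_deps.data.json')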


def find_cache_meta(id: str, path: str, manager: BuildManager) -> Optional[CacheMeta]:
"""Find cache data for a module.

@@ -1003,13 +1107,10 @@ def find_cache_meta(id: str, path: str, manager: BuildManager) -> Optional[Cache
# TODO: May need to take more build options into account
meta_json, data_json, deps_json = get_cache_names(id, path, manager)
manager.trace('Looking for {} at {}'.format(id, meta_json))
try:
with open(meta_json, 'r') as f:
meta_str = f.read()
manager.trace('Meta {} {}'.format(id, meta_str.rstrip()))
meta = json.loads(meta_str) # TODO: Errors
except IOError:
manager.log('Could not load cache for {}: could not find {}'.format(id, meta_json))
meta = _load_json_file(meta_json, manager,
log_success='Meta {} '.format(id),
log_error='Could not load cache for {}: '.format(id))
if meta is None:
return None
if not isinstance(meta, dict):
manager.log('Could not load cache for {}: meta cache is not a dict: {}'
@@ -2237,8 +2338,25 @@ def dispatch(sources: List[BuildSource], manager: BuildManager) -> Graph:
# just want to load in all of the cache information.
if manager.use_fine_grained_cache():
process_fine_grained_cache_graph(graph, manager)
# Fine grained protocol dependencies are serialized separately, so we read them
# after we have loaded the cache for the whole graph. `read_protocol_cache` will
# also validate the protocol cache against the loaded individual cache files.
TypeState.proto_deps = read_protocol_cache(manager, graph)
if TypeState.proto_deps is None and manager.stats.get('fresh_trees', 0) > 0:
# There were some cache files read, but no protocol dependencies loaded.
manager.errors.set_file(_cache_dir_prefix(manager), None)
manager.errors.report(0, 0, "Error reading protocol dependencies cache -- aborting",
blocker=True)
else:
process_graph(graph, manager)
if manager.options.cache_fine_grained or manager.options.fine_grained_incremental:
# If we are running a daemon or are going to write cache for further fine grained use,
# then we need to collect fine grained protocol dependencies.
# Since these are a global property of the program, they are calculated after we
# processed the whole graph.
TypeState.update_protocol_deps()
if TypeState.proto_deps is not None and not manager.options.fine_grained_incremental:
write_protocol_deps_cache(TypeState.proto_deps, manager, graph)

if manager.options.dump_deps:
# This speeds up startup a little when not using the daemon mode.
9 changes: 3 additions & 6 deletions mypy/dmypy_server.py
@@ -26,7 +26,7 @@
from mypy.fscache import FileSystemCache
from mypy.fswatcher import FileSystemWatcher, FileData
from mypy.options import Options
from mypy.typestate import TypeState
from mypy.typestate import reset_global_state


MEM_PROFILE = False # If True, dump memory profile after initialization
@@ -151,7 +151,7 @@ def serve(self) -> None:
conn, addr = sock.accept()
except socket.timeout:
print("Exiting due to inactivity.")
self.free_global_state()
reset_global_state()
sys.exit(0)
try:
data = receive(conn)
@@ -182,7 +182,7 @@ def serve(self) -> None:
conn.close()
if command == 'stop':
sock.close()
self.free_global_state()
reset_global_state()
sys.exit(0)
finally:
os.unlink(STATUS_FILE)
@@ -192,9 +192,6 @@ def serve(self) -> None:
if exc_info[0] and exc_info[0] is not SystemExit:
traceback.print_exception(*exc_info) # type: ignore

def free_global_state(self) -> None:
TypeState.reset_all_subtype_caches()

def create_listening_socket(self) -> socket.socket:
"""Create the socket and set it up for listening."""
self.sockname = os.path.abspath(SOCKET_NAME)
17 changes: 12 additions & 5 deletions mypy/nodes.py
@@ -2065,9 +2065,6 @@ class is generic then it will be a type constructor of higher kind.
is_protocol = False # Is this a protocol class?
runtime_protocol = False # Does this protocol support isinstance checks?
abstract_attributes = None # type: List[str]
# Protocol members are names of all attributes/methods defined in a protocol
# and in all its supertypes (except for 'object').
protocol_members = None # type: List[str]

# The attributes 'assuming' and 'assuming_proper' represent structural subtype matrices.
#
@@ -2211,6 +2208,18 @@ def get_containing_type_info(self, name: str) -> 'Optional[TypeInfo]':
return cls
return None

@property
def protocol_members(self) -> List[str]:
# Protocol members are names of all attributes/methods defined in a protocol
[Review comment from a collaborator: Should this return an empty list if the class is not a protocol, similar to how this was defined earlier?]
[Reply from the author: I would say it is not necessary, but I see no particular danger in this.]

# and in all its supertypes (except for 'object').
members = set() # type: Set[str]
assert self.mro, "This property can only be accessed after MRO is (re-)calculated"
for base in self.mro[:-1]: # we skip "object" since everyone implements it
if base.is_protocol:
for name in base.names:
members.add(name)
return sorted(list(members))
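
As a quick sketch of what this property computes (class names are hypothetical; Protocol was provided by typing_extensions at the time of this PR):

from typing_extensions import Protocol

class Super(Protocol):
    x: int
    def meth(self) -> int: ...

class Sub(Super, Protocol):
    y: int

# Sub's TypeInfo has mro == [Sub, Super, object]; both Sub and Super are
# protocols, so protocol_members == ['meth', 'x', 'y'] (sorted, with
# members of 'object' excluded).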

def __getitem__(self, name: str) -> 'SymbolTableNode':
n = self.get(name)
if n:
@@ -2331,7 +2340,6 @@ def serialize(self) -> JsonDict:
'names': self.names.serialize(self.fullname()),
'defn': self.defn.serialize(),
'abstract_attributes': self.abstract_attributes,
'protocol_members': self.protocol_members,
'type_vars': self.type_vars,
'bases': [b.serialize() for b in self.bases],
'mro': [c.fullname() for c in self.mro],
@@ -2357,7 +2365,6 @@ def deserialize(cls, data: JsonDict) -> 'TypeInfo':
ti._fullname = data['fullname']
# TODO: Is there a reason to reconstruct ti.subtypes?
ti.abstract_attributes = data['abstract_attributes']
ti.protocol_members = data['protocol_members']
ti.type_vars = data['type_vars']
ti.bases = [mypy.types.Instance.deserialize(b) for b in data['bases']]
ti._promote = (None if data['_promote'] is None
12 changes: 0 additions & 12 deletions mypy/semanal_pass3.py
@@ -139,8 +139,6 @@ def visit_class_def(self, tdef: ClassDef) -> None:
if tdef.info.mro:
tdef.info.mro = [] # Force recomputation
mypy.semanal.calculate_class_mro(tdef, self.fail_blocker)
if tdef.info.is_protocol:
add_protocol_members(tdef.info)
if tdef.analyzed is not None:
# Also check synthetic types associated with this ClassDef.
# Currently these are TypedDict, and NamedTuple.
@@ -487,16 +485,6 @@ def builtin_type(self, name: str, args: Optional[List[Type]] = None) -> Instance
return Instance(node, [any_type] * len(node.defn.type_vars))


def add_protocol_members(typ: TypeInfo) -> None:
members = set() # type: Set[str]
if typ.mro:
for base in typ.mro[:-1]: # we skip "object" since everyone implements it
if base.is_protocol:
for name in base.names:
members.add(name)
typ.protocol_members = sorted(list(members))


def is_identity_signature(sig: Type) -> bool:
"""Is type a callable of form T -> T (where T is a type variable)?"""
if isinstance(sig, CallableType) and sig.arg_kinds == [ARG_POS]:
1 change: 1 addition & 0 deletions mypy/server/astdiff.py
@@ -190,6 +190,7 @@ def snapshot_definition(node: Optional[SymbolNode],
elif isinstance(node, TypeInfo):
attrs = (node.is_abstract,
node.is_enum,
node.is_protocol,
node.fallback_to_any,
node.is_named_tuple,
node.is_newtype,
22 changes: 19 additions & 3 deletions mypy/server/deps.py
@@ -104,6 +104,7 @@ class 'mod.Cls'. This can also refer to an attribute inherited from a
from mypy.server.trigger import make_trigger, make_wildcard_trigger
from mypy.util import correct_relative_import
from mypy.scope import Scope
from mypy.typestate import TypeState


def get_dependencies(target: MypyFile,
@@ -165,9 +166,6 @@ def __init__(self,
self.is_class = False
self.is_package_init_file = False

# TODO (incomplete):
# protocols

def visit_mypy_file(self, o: MypyFile) -> None:
self.scope.enter_file(o.fullname())
self.is_package_init_file = o.is_package_init_file()
@@ -235,6 +233,23 @@ def process_type_info(self, info: TypeInfo) -> None:
self.add_type_dependencies(info.typeddict_type, target=make_trigger(target))
if info.declared_metaclass:
self.add_type_dependencies(info.declared_metaclass, target=make_trigger(target))
if info.is_protocol:
for base_info in info.mro[:-1]:
# We add dependencies from whole MRO to cover explicit subprotocols.
# For example:
#
# class Super(Protocol):
# x: int
# class Sub(Super, Protocol):
# y: int
#
# In this example we add <Super[wildcard]> -> <Sub>, to invalidate Sub if
# a new member is added to Super.
self.add_dependency(make_wildcard_trigger(base_info.fullname()),
target=make_trigger(target))
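# (Illustration: for the Sub example above, info.mro[:-1] == [Sub, Super],
# so this loop adds both <Sub[wildcard]> -> <Sub> and <Super[wildcard]> -> <Sub>.)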
# More protocol dependencies are collected in TypeState._snapshot_protocol_deps
# after a full run or update is finished.

self.add_type_alias_deps(self.scope.current_target())
for name, node in info.names.items():
if isinstance(node.node, Var):
@@ -810,6 +825,7 @@ def dump_all_dependencies(modules: Dict[str, MypyFile],
deps = get_dependencies(node, type_map, python_version)
for trigger, targets in deps.items():
all_deps.setdefault(trigger, set()).update(targets)
TypeState.add_all_protocol_deps(all_deps)

for trigger, targets in sorted(all_deps.items(), key=lambda x: x[0]):
print(trigger)