Skip to content

Commit

Permalink
[engine] Convert all isinstance product checks to using Exactly type …
Browse files Browse the repository at this point in the history
…constraints

The existing type checking required coercion rules in order to ensure that product types that are produced by rules that declare they produce their super class would work.

This patch removes the sub/superclassing in favor of using the Exactly type constraint for type checking.

To do so, it adds a constraint method on SymbolTables that returns a type constraint defining what types can come out of using the symbol table. That constraint is then used in rule declarations that either produce or consume parsed products.

Other elements
- TaskNode now contains the rule it represents. That way it can use the rule components without recreating them and its repr now contains the rule's repr.
- `collect_item_of_type` is inlined back into `SelectNode#select_literal`. Removing the coercion rule class made extracting this unnecessary
- settled on `input_selectors` as the name for the collection of selectors used by a rule
- removed super / sub type checking in rule validation as it is no longer necessary
- renamed and moved intrinsic node factories to the rule module

Testing Done:
I've been testing with the engine tests locally. Additionally I've been profiling running `./pants list ::` with the engine enabled. It's slightly slower right now, but the removal of the coercion rules reduced the number of nodes in the graph by 5%. With further optimization of scheduling this enables, I think we can get to a better place.

```

(master)$ for i in 1 2 3; do ./pants --enable-v2-engine list :: -ldebug 2>&1 | grep 'scheduling iterations'; done
DEBUG] ran 152 scheduling iterations, 21135 runnables, and 62902 steps in 6.560897 seconds. there are 39158 total nodes.
DEBUG] ran 2 scheduling iterations, 2 runnables, and 23 steps in 0.001659 seconds. there are 39173 total nodes.
DEBUG] ran 0 scheduling iterations, 0 runnables, and 0 steps in 0.000002 seconds. there are 39173 total nodes.
DEBUG] ran 152 scheduling iterations, 21135 runnables, and 62902 steps in 6.554824 seconds. there are 39158 total nodes.
DEBUG] ran 2 scheduling iterations, 2 runnables, and 23 steps in 0.001486 seconds. there are 39173 total nodes.
DEBUG] ran 0 scheduling iterations, 0 runnables, and 0 steps in 0.000001 seconds. there are 39173 total nodes.
DEBUG] ran 152 scheduling iterations, 21135 runnables, and 62902 steps in 6.748163 seconds. there are 39158 total nodes.
DEBUG] ran 2 scheduling iterations, 2 runnables, and 23 steps in 0.001648 seconds. there are 39173 total nodes.
DEBUG] ran 0 scheduling iterations, 0 runnables, and 0 steps in 0.000002 seconds. there are 39173 total nodes.
$ git co nhoward/typechecks_for_everyone
Switched to branch 'nhoward/typechecks_for_everyone'
Your branch is up-to-date with 'origin/nhoward/typechecks_for_everyone'.
(nhoward/typechecks_for_everyone)$ for i in 1 2 3; do ./pants --enable-v2-engine list :: -ldebug 2>&1 | grep 'scheduling iterations'; done
DEBUG] ran 151 scheduling iterations, 19791 runnables, and 60214 steps in 7.202758 seconds. there are 37814 total nodes.
DEBUG] ran 2 scheduling iterations, 2 runnables, and 23 steps in 0.001512 seconds. there are 37829 total nodes.
DEBUG] ran 0 scheduling iterations, 0 runnables, and 0 steps in 0.000002 seconds. there are 37829 total nodes.
DEBUG] ran 151 scheduling iterations, 19791 runnables, and 60214 steps in 7.058010 seconds. there are 37814 total nodes.
DEBUG] ran 2 scheduling iterations, 2 runnables, and 23 steps in 0.001441 seconds. there are 37829 total nodes.
DEBUG] ran 0 scheduling iterations, 0 runnables, and 0 steps in 0.000002 seconds. there are 37829 total nodes.
DEBUG] ran 151 scheduling iterations, 19791 runnables, and 60214 steps in 7.324950 seconds. there are 37814 total nodes.
DEBUG] ran 2 scheduling iterations, 2 runnables, and 23 steps in 0.001683 seconds. there are 37829 total nodes.
DEBUG] ran 0 scheduling iterations, 0 runnables, and 0 steps in 0.000004 seconds. there are 37829 total nodes.
```

Bugs closed: 3868

Reviewed at https://rbcommons.com/s/twitter/r/4236/

closes #3868
  • Loading branch information
baroquebobcat committed Sep 15, 2016
1 parent d30cca1 commit c96d44b
Show file tree
Hide file tree
Showing 11 changed files with 280 additions and 185 deletions.
2 changes: 1 addition & 1 deletion src/python/pants/bin/engine_initializer.py
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,7 @@ def setup_legacy_graph(pants_ignore_patterns,
# Create a Scheduler containing graph and filesystem tasks, with no installed goals. The
# LegacyBuildGraph will explicitly request the products it needs.
tasks = (
create_legacy_graph_tasks() +
create_legacy_graph_tasks(symbol_table_cls) +
create_fs_tasks() +
create_graph_tasks(address_mapper, symbol_table_cls)
)
Expand Down
11 changes: 3 additions & 8 deletions src/python/pants/engine/graph.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@
from pants.engine.fs import DirectoryListing, Files, FilesContent, Path, PathGlobs
from pants.engine.mapper import AddressFamily, AddressMap, AddressMapper, ResolveError
from pants.engine.objects import Locatable, SerializableFactory, Validatable
from pants.engine.rules import CoercionRule
from pants.engine.selectors import Select, SelectDependencies, SelectLiteral, SelectProjection
from pants.engine.struct import Struct
from pants.util.objects import datatype
Expand Down Expand Up @@ -254,11 +253,12 @@ def create_graph_tasks(address_mapper, symbol_table_cls):
:param address_mapper_key: The subject key for an AddressMapper instance.
:param symbol_table_cls: A SymbolTable class to provide symbols for Address lookups.
"""
symbol_table_constraint = symbol_table_cls.constraint()
return [
# Support for resolving Structs from Addresses
(Struct,
(symbol_table_constraint,
[Select(UnhydratedStruct),
SelectDependencies(Struct, UnhydratedStruct, field_types=(Address,))],
SelectDependencies(symbol_table_constraint, UnhydratedStruct, field_types=(Address,))],
hydrate_struct),
(UnhydratedStruct,
[SelectProjection(AddressFamily, Dir, ('spec_path',), Address),
Expand All @@ -275,11 +275,6 @@ def create_graph_tasks(address_mapper, symbol_table_cls):
[SelectLiteral(address_mapper, AddressMapper),
Select(DirectoryListing)],
filter_buildfile_paths),
] + [
# Addresses for user-defined products might possibly be resolvable from BLD files. These tasks
# define that lookup for coercing a struct into each literal product.
CoercionRule(product, Struct)
for product in set(symbol_table_cls.table().values()) if product is not Struct
] + [
# Simple spec handling.
(Addresses,
Expand Down
10 changes: 2 additions & 8 deletions src/python/pants/engine/isolated_process.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
from abc import abstractproperty
from hashlib import sha1

from pants.engine.fs import Files, PathGlobs
from pants.engine.fs import Files
from pants.engine.nodes import Node, Noop, Return, Runnable, State, Throw, Waiting
from pants.engine.selectors import Select, SelectDependencies
from pants.util.contextutil import open_tar, temporary_dir, temporary_file_path
Expand Down Expand Up @@ -239,13 +239,7 @@ class SnapshotNode(datatype('SnapshotNode', ['subject', 'variants']), Node):
product = Snapshot

@classmethod
def as_intrinsics(cls):
return {(Files, Snapshot): cls.create,
(PathGlobs, Snapshot): cls.create}

@classmethod
def create(cls, subject, product_type, variants):
assert product_type == Snapshot
def create(cls, subject, variants):
return SnapshotNode(subject, variants)

def step(self, step_context):
Expand Down
15 changes: 11 additions & 4 deletions src/python/pants/engine/legacy/graph.py
Original file line number Diff line number Diff line change
Expand Up @@ -296,15 +296,22 @@ def hydrate_bundles(bundles_field, files_digest_list, excluded_files_list):
return HydratedField('bundles', bundles)


def create_legacy_graph_tasks():
def create_legacy_graph_tasks(symbol_table_cls):
"""Create tasks to recursively parse the legacy graph."""
symbol_table_constraint = symbol_table_cls.constraint()
return [
# Recursively requests the dependencies and adapted fields of TargetAdaptors, which
# will result in an eager, transitive graph walk.
(LegacyTarget,
[Select(TargetAdaptor),
SelectDependencies(LegacyTarget, TargetAdaptor, 'dependencies', field_types=(Address,)),
SelectDependencies(HydratedField, TargetAdaptor, 'field_adaptors', field_types=(SourcesField, BundlesField, ))],
[Select(symbol_table_constraint),
SelectDependencies(LegacyTarget,
symbol_table_constraint,
'dependencies',
field_types=(Address,)),
SelectDependencies(HydratedField,
symbol_table_constraint,
'field_adaptors',
field_types=(SourcesField, BundlesField,))],
reify_legacy_graph),
(HydratedField,
[Select(SourcesField),
Expand Down
84 changes: 43 additions & 41 deletions src/python/pants/engine/nodes.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,29 +27,9 @@
logger = logging.getLogger(__name__)


def collect_item_of_type(product, candidate, variant_value):
"""Looks for has-a or is-a relationships between the given value and the requested product.
Returns the resulting product value, or None if no match was made.
"""
def items():
# Check whether the subject is-a instance of the product.
yield candidate
# Else, check whether it has-a instance of the product.
if isinstance(candidate, HasProducts):
for subject in candidate.products:
yield subject

# TODO: returning only the first literal configuration of a given type/variant. Need to
# define mergeability for products.
for item in items():

if not isinstance(item, product):
continue
if variant_value and not getattr(item, 'name', None) == variant_value:
continue
return item
return None
def _satisfied_by(t, o):
"""Pickleable type check function."""
return t.satisfied_by(o)


class ConflictingProducersError(Exception):
Expand Down Expand Up @@ -245,7 +225,23 @@ def _select_literal(self, candidate, variant_value):
Returns the resulting product value, or None if no match was made.
"""
return collect_item_of_type(self.product, candidate, variant_value)
def items():
# Check whether the subject is-a instance of the product.
yield candidate
# Else, check whether it has-a instance of the product.
if isinstance(candidate, HasProducts):
for subject in candidate.products:
yield subject

# TODO: returning only the first literal configuration of a given type/variant. Need to
# define mergeability for products.
for item in items():
if not self.selector.type_constraint.satisfied_by(item):
continue
if variant_value and not getattr(item, 'name', None) == variant_value:
continue
return item
return None

def step(self, step_context):
# Request default Variants for the subject, so that if there are any we can propagate
Expand Down Expand Up @@ -446,31 +442,39 @@ def step(self, step_context):
raise State.raise_unrecognized(output_state)


def _run_func_and_check_type(product_type, func, *args):
def _run_func_and_check_type(product_type, type_check, func, *args):
result = func(*args)
if result is None or isinstance(result, product_type):
if type_check(result):
return result
else:
raise ValueError('result of {} was not a {}, instead was {}'
.format(func.__name__, product_type, type(result).__name__))


class TaskNode(datatype('TaskNode', ['subject', 'variants', 'product', 'func', 'clause']), Node):
"""A Node representing execution of a non-blocking python function.
class TaskNode(datatype('TaskNode', ['subject', 'variants', 'rule']), Node):
"""A Node representing execution of a non-blocking python function contained by a TaskRule.
All dependencies of the function are declared ahead of time in the dependency `clause` of the
function, so the TaskNode will determine whether the dependencies are available before
executing the function, and provides a satisfied argument per clause entry to the function.
All dependencies of the function are declared ahead of time by the `input_selectors` of the
rule. The TaskNode will determine whether the dependencies are available before executing the
function, and provide a satisfied argument per clause entry to the function.
"""

is_cacheable = True
is_inlineable = False

@property
def product(self):
return self.rule.output_product_type

@property
def func(self):
return self.rule.task_func

def step(self, step_context):
# Compute dependencies for the Node, or determine whether it is a Noop.
dependencies = []
dep_values = []
for selector in self.clause:
for selector in self.rule.input_selectors:
dep_node = step_context.select_node(selector, self.subject, self.variants)
dep_state = step_context.get(dep_node)
if type(dep_state) is Waiting:
Expand All @@ -491,11 +495,15 @@ def step(self, step_context):
if dependencies:
return Waiting(dependencies)
# Ready to run!
return Runnable(functools.partial(_run_func_and_check_type, self.product, self.func), tuple(dep_values))
return Runnable(functools.partial(_run_func_and_check_type,
self.rule.output_product_type,
functools.partial(_satisfied_by, self.rule.constraint),
self.rule.task_func),
tuple(dep_values))

def __repr__(self):
return 'TaskNode(subject={}, product={}, variants={}, func={}, clause={}' \
.format(self.subject, self.product, self.variants, self.func.__name__, self.clause)
return 'TaskNode(subject={}, variants={}, rule={}' \
.format(self.subject, self.variants, self.rule)

def __str__(self):
return repr(self)
Expand All @@ -514,12 +522,6 @@ class FilesystemNode(datatype('FilesystemNode', ['subject', 'product', 'variants
is_cacheable = False
is_inlineable = False

@classmethod
def as_intrinsics(cls):
"""Returns a dict of tuple(sbj type, product type) -> functions returning a fs node for that subject product type tuple."""
return {(subject_type, product_type): FilesystemNode.create
for product_type, subject_type in cls._FS_PAIRS}

@classmethod
def create(cls, subject, product_type, variants):
assert (product_type, type(subject)) in cls._FS_PAIRS
Expand Down
8 changes: 8 additions & 0 deletions src/python/pants/engine/parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@

from abc import abstractmethod

from pants.engine.addressable import Exactly
from pants.util.meta import AbstractClass


Expand All @@ -26,6 +27,13 @@ class SymbolTable(AbstractClass):
def table(cls):
"""Returns a dict of name to implementation class."""

@classmethod
def constraint(cls):
"""Returns the typeconstraint for the symbol table"""
# NB Sort types so that multiple calls get the same tuples.
symbol_table_types = sorted(set(cls.table().values()))
return Exactly(*symbol_table_types, description='symbol table types')


class Parser(AbstractClass):
@classmethod
Expand Down

0 comments on commit c96d44b

Please sign in to comment.