New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the performance of v2 changed. #5571

Merged
merged 7 commits into from Mar 7, 2018

Conversation

Projects
None yet
4 participants
@kwlzn
Copy link
Member

kwlzn commented Mar 7, 2018

this is good for a ~40x speedup in my test case:

before:

[omerta pants (kwlzn/faster_v2_changed)]$ time ./pants --changed-parent=HEAD~60 list > a

real    4m26.282s
user    2m59.511s
sys     1m25.936s

after:

[omerta pants (kwlzn/faster_v2_changed)]$ time ./pants --changed-parent=HEAD~60 list > b

real    0m7.668s
user    0m6.251s
sys     0m1.425s

Fixes #5570

kwlzn added some commits Mar 7, 2018

@kwlzn kwlzn requested review from stuhood and illicitonion Mar 7, 2018

@illicitonion
Copy link
Contributor

illicitonion left a comment

I have no context, but your change looks like a good equivalent and the tests pass. Excellent speedup! :)

except ValueError:
# N.B. `fast_relpath` raises a `ValueError` if prefixes aren't aligned.
return False
else:

This comment has been minimized.

@illicitonion

illicitonion Mar 7, 2018

Contributor

Drop the else?

This comment has been minimized.

@kwlzn

kwlzn Mar 7, 2018

Member

done

if hydrated_target not in hydrated_target_to_address:
hydrated_target_to_address[hydrated_target] = hydrated_target.adaptor.address

for hydrated_target, legacy_address in six.iteritems(hydrated_target_to_address):

This comment has been minimized.

@illicitonion

illicitonion Mar 7, 2018

Contributor

I'm assuming order didn't matter here?

This comment has been minimized.

@kwlzn

kwlzn Mar 7, 2018

Member

shouldn't, no.

try:
path_relative_to_rel_root = fast_relpath(path_from_buildroot, self.rel_root)
except ValueError:
# N.B. `fast_relpath` raises a `ValueError` if prefixes aren't aligned.

This comment has been minimized.

@stuhood

stuhood Mar 7, 2018

Member

What would this case mean here? That a target owns a source... outside the buildroot? It's not clear that we would want to hide that.

It shouldn't be possible at all given the restrictions we place on symlinks in the engine, so would be good to know why/if you saw it.

This comment has been minimized.

@kwlzn

kwlzn Mar 7, 2018

Member

no, fwict just that EagerFilesetWithSpec.matches() is a miss because of unaligned prefixes.

without this, I was seeing tracebacks like this in --changed list:

Exception caught: (<type 'exceptions.ValueError'>)
  File "/Users/kwilson/dev/pants/src/python/pants/bin/pants_loader.py", line 73, in <module>
    main()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/pants_loader.py", line 69, in main
    PantsLoader.run()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/pants_loader.py", line 65, in run
    cls.load_and_execute(entrypoint)
  File "/Users/kwilson/dev/pants/src/python/pants/bin/pants_loader.py", line 58, in load_and_execute
    entrypoint_main()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/pants_exe.py", line 39, in main
    PantsRunner(exiter, start_time=start_time).run()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/pants_runner.py", line 53, in run
    return runner.run()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/local_pants_runner.py", line 46, in run
    self._run()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/local_pants_runner.py", line 91, in _run
    self._exiter).setup()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/goal_runner.py", line 188, in setup
    goals, context = self._setup_context()
  File "/Users/kwilson/dev/pants/src/python/pants/bin/goal_runner.py", line 159, in _setup_context
    self._global_options.subproject_roots,
  File "/Users/kwilson/dev/pants/src/python/pants/bin/goal_runner.py", line 109, in _init_graph
    change_calculator=graph_helper.change_calculator)
  File "/Users/kwilson/dev/pants/src/python/pants/init/target_roots.py", line 67, in create
    changed_addresses = change_calculator.changed_target_addresses(changed_request)
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/change_calculator.py", line 133, in changed_target_addresses
    return list(self.iter_changed_target_addresses(changed_request))
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/change_calculator.py", line 111, in iter_changed_target_addresses
    in self._mapper.iter_target_addresses_for_sources(changed_files))
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/change_calculator.py", line 109, in <genexpr>
    changed_addresses = set(address
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/source_mapper.py", line 90, in iter_target_addresses_for_sources
    any(self._owns_source(source, hydrated_target) for source in sources_set)):
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/source_mapper.py", line 90, in <genexpr>
    any(self._owns_source(source, hydrated_target) for source in sources_set)):
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/source_mapper.py", line 69, in _owns_source
    if target_sources and self._match_source(source, target_sources):
  File "/Users/kwilson/dev/pants/src/python/pants/engine/legacy/source_mapper.py", line 54, in _match_source
    return fileset.matches(source) or matches_filespec(source, fileset.filespec)
  File "/Users/kwilson/dev/pants/src/python/pants/source/wrapped_globs.py", line 108, in matches
    path_relative_to_rel_root = fast_relpath(path_from_buildroot, self.rel_root)
  File "/Users/kwilson/dev/pants/src/python/pants/util/dirutil.py", line 39, in fast_relpath
    raise ValueError('{} is not a directory containing {}'.format(start, path))

Exception message: src/python/pants/backend/graph_info is not a directory containing src/rust/engine/src/lib.rs

where:

>>> os.path.relpath('src/rust/engine/src/lib.rs', 'src/python/pants/backend/graph_info')
'../../../../rust/engine/src/lib.rs'

vs

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/private/var/folders/bb/msvj_4_s741g93nzgrj19twc0000gn/T/tmpNb4Pj9/.deps/pantsbuild.pants-1.3.0-py2-none-any.whl/pants/util/dirutil.py", line 39, in fast_relpath
    raise ValueError('{} is not a directory containing {}'.format(start, path))
ValueError: src/python/pants/backend/graph_info is not a directory containing src/rust/engine/src/lib.rs
>>>

this was why I was alluding to an exact, faster os.path.relpath replica in #5570. but I think this is sufficient, unless it's possible that EagerFilesetWithSpec._files can contain ../../ style paths (which I'm assuming is not the case)?

This comment has been minimized.

@kwlzn

kwlzn Mar 7, 2018

Member

seems like falling back to os.path.relpath in this case (vs return False) is still a net perf win - happy to do that if you think it's more correct.

This comment has been minimized.

@kwlzn

kwlzn Mar 7, 2018

Member

err, actually not - scratch that last comment.

This comment has been minimized.

@stuhood

stuhood Mar 7, 2018

Member

Ooh, I see what is going on here. Yea, so not matching is expected.

Looks fine with the catch. The only potential improvement would be to have a return_none=True mode for fast_relpath, so that we don't hide other types of ValueError here. But also fine with just landing it.

This comment has been minimized.

@stuhood

stuhood Mar 7, 2018

Member

...oh. I already did that:

def fast_relpath_optional(path, start):

This comment has been minimized.

@kwlzn

kwlzn Mar 7, 2018

Member

ah, great idea - fixed.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 7, 2018

... and of course: super awesome. Thanks! Just want to nail that down.

@stuhood

stuhood approved these changes Mar 7, 2018

@stuhood stuhood added this to the 1.4.x milestone Mar 7, 2018

@stuhood

stuhood approved these changes Mar 7, 2018

@kwlzn kwlzn merged commit eddc698 into pantsbuild:master Mar 7, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@stuhood stuhood deleted the twitter:kwlzn/faster_v2_changed branch Mar 8, 2018

stuhood added a commit that referenced this pull request Mar 8, 2018

Improve the performance of v2 changed. (#5571)
this is good for a ~40x speedup in my test case:

before:

[omerta pants (kwlzn/faster_v2_changed)]$ time ./pants --changed-parent=HEAD~60 list > a

real    4m26.282s
user    2m59.511s
sys     1m25.936s
after:

[omerta pants (kwlzn/faster_v2_changed)]$ time ./pants --changed-parent=HEAD~60 list > b

real    0m7.668s
user    0m6.251s
sys     0m1.425s

Fixes #5570
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment