New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --output-dir flag to ScalaFmt task #6134

Merged
merged 8 commits into from Jul 30, 2018

Conversation

Projects
None yet
5 participants
@rkstap
Copy link
Contributor

rkstap commented Jul 16, 2018

Problem

The Scalafmt task currently modifies files in-place. In some circumstances it more useful to put updated files in a separate output directory.

Solution

This patch adds an --output-dir flag to the ScalaFmt task which causes scalafmt to write updates to stdout instead of modifying in-place, then captures this and writes it to a mirrored directory structure in the user-specified output directory.

@illicitonion illicitonion requested review from stuhood and benjyw Jul 16, 2018

@illicitonion
Copy link
Contributor

illicitonion left a comment

Looks great, thanks!

Will give others a change to take a look before I merge :)

@benjyw
Copy link
Contributor

benjyw left a comment

This a neat change, but I don't like the idea of exposing WorkUnit like that. I have an alternative idea for implementation:

How about copying the source files to some tmpdir, running scalafmt there, and then copying the results to the output dir? Possibly after diffing them with the originals, so you only copy the changed files, if that's the requirement.

This would also allow you to continue to run a single process on all files, instead of one-per-file, with the performance implications of that.

This would obviate many of my in-line comments, but I've left them there for reference.

@@ -27,6 +30,10 @@ def register_options(cls, register):
super(ScalaFmt, cls).register_options(register)
register('--configuration', advanced=True, type=file_option, fingerprint=True,
help='Path to scalafmt config file, if not specified default scalafmt config used')
register('--output-dir', advanced=True, type=dir_option, fingerprint=True,
help='Path to scalafmt output directory. Any updated files will be written here. '
'If not specified, files will be modified in-place')

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

Nit: End sentence with a period.

@@ -41,7 +41,8 @@ def _get_runner(classpath, main, jvm_options, args, executor,
def execute_java(classpath, main, jvm_options=None, args=None, executor=None,

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

Having the return type change based on the value of an optional argument is asking for trouble. How about creating a new method, that returns both, and then calling that from this method, discarding the workunit argument?

@@ -93,13 +93,16 @@ def create_java_executor(self, dist=None):
return SubprocessExecutor(dist)

def runjava(self, classpath, main, jvm_options=None, args=None, workunit_name=None,
workunit_labels=None, workunit_log_config=None, dist=None):
workunit_labels=None, workunit_log_config=None, dist=None, return_workunit=False):

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

Same comment here as below:

Having the return type change based on the value of an optional argument is asking for trouble. How about creating a new method, that returns both, and then calling that from this method, discarding the workunit argument?

created_dirs = set()

for _, source in target_sources:
res, workunit = self.runjava(classpath=self.tool_classpath('scalafmt'),

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

So previously we executed once for all files, and now we do so once per file? That seems like a major negative performance impact in all cases, just to support this one use-case.

Is it not possible to separate out the stdout output of running on multiple files? Is there no delimiter or something?

with open(source, 'r') as f:
unformatted_file = f.read()

if formatted_file != unformatted_file:

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

It seems like odd semantics to only write the changed files. Why is this the requirement?

from pants.task.fmt_task_mixin import FmtTaskMixin
from pants.task.lint_task_mixin import LintTaskMixin
from pants.util.dirutil import fast_relpath

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

You have to add the appropriate dep in the BUILD file.

from abc import abstractproperty

from pants.backend.jvm.tasks.rewrite_base import RewriteBase
from pants.base.build_environment import get_buildroot

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

You have to add the appropriate dep in the BUILD file.

@@ -5,14 +5,17 @@
from __future__ import (absolute_import, division, generators, nested_scopes, print_function,
unicode_literals, with_statement)

import os.path

This comment has been minimized.

@benjyw

benjyw Jul 16, 2018

Contributor

It's not super-important, but for uniformity note that we usually just import os.

But I don't think you need this import at all given my other comment below...

@benjyw

This comment has been minimized.

Copy link
Contributor

benjyw commented Jul 16, 2018

@illicitonion please don't merge yet...

@stuhood
Copy link
Member

stuhood left a comment

+1 to Benjy's comment about writing to a temp directory instead. That would also have the benefit of working for all of pants' linters, and natively supporting batch execution.

@illicitonion illicitonion force-pushed the rkstap:scalafmt-args branch 2 times, most recently from 1eee5b3 to 6843f68 Jul 27, 2018

@illicitonion illicitonion force-pushed the rkstap:scalafmt-args branch from 6843f68 to eba5d18 Jul 27, 2018

@illicitonion

This comment has been minimized.

Copy link
Contributor

illicitonion commented Jul 27, 2018

@benjyw All comments addressed, PTAL :)

@@ -26,6 +29,10 @@ def register_options(cls, register):
super(ScalaFmt, cls).register_options(register)
register('--configuration', advanced=True, type=file_option, fingerprint=True,
help='Path to scalafmt config file, if not specified default scalafmt config used')
register('--output-dir', advanced=True, type=dir_option, fingerprint=True,

This comment has been minimized.

@stuhood

stuhood Jul 27, 2018

Member

While this will work for either subclass, this option only makes sense for the ScalaFmtFormat and ScalaFixFix subclasses, which actually rewrite things (and are the tasks installed in fmt).

So a few semi-orthogonal things:

  1. I think that you could make registering the option conditional on sideeffecting = True, and then check it indirectly.
  2. You could move the option to RewriteBase, and execute the file copying before invoke_tool... that would net you support google-java-format as well, which also subclasses RewriteBase.

This comment has been minimized.

@illicitonion

illicitonion Jul 27, 2018

Contributor

Did 2 :)

@stuhood
Copy link
Member

stuhood left a comment

The option still doesn't make sense for sideeffecting=False tasks, but oh well! Thank you!

@benjyw
Copy link
Contributor

benjyw left a comment

I like this approach a lot better! Less code, less confusing, the same or better functionality.

Just a handful of minor comments about naming and documentation and such.

sideeffecting = False
if sideeffecting:
register('--output-dir', advanced=True, type=dir_option, fingerprint=True,
help='Path to scalafmt output directory. Any updated files will be written here. '

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

s/scalafmt//

This comment has been minimized.

@illicitonion

illicitonion Jul 30, 2018

Contributor

Done

def _invoke_tool_in_place(self, target_sources):
# Invoke in place.
result = self.invoke_tool(get_buildroot(), target_sources)
def _invoke_tool_in_or_out_of_place(self, target_sources):

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

This names seems unduly convoluted. How about just _call_invoke_tool?

@@ -46,13 +47,13 @@ def source_extension(cls):
def implementation_version(cls):
return super(ScalaFmt, cls).implementation_version() + [('ScalaFmt', 5)]

def invoke_tool(self, _, target_sources):
def invoke_tool(self, _root, target_sources):

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

Why? If you're going to name the param, it may as well have the same name as in the base class (absolute_root). Making it anonymous was presumably so that IDEs wouldn't complain about an unused param.

This comment has been minimized.

@illicitonion

illicitonion Jul 30, 2018

Contributor

./pants lint (either in pre-commit or on travis, I can't remember which) was complaining to me that the _ in the list comprehension below was shadowing the _ param name, and this showed up as an error. I was surprised.

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

In that case let's use the original name? Introducing a third option is just going to add confusion...

@@ -83,6 +83,15 @@ def safe_mkdir_for(path, clean=False):
safe_mkdir(os.path.dirname(path), clean=clean)


def safe_mkdir_for_all(paths):

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

It would be a good idea to document why you might want to use this (E.g., when you'll be creating dirs for many files that are overwhelmingly likely to be in a small handful of dirs). I.e., why should I call this and not just call safe_mkdir_for on each path?

This comment has been minimized.

@illicitonion

illicitonion Jul 30, 2018

Contributor

Done

This avoids attempting to re-make the same directories, which may be noticeably expensive if many
paths mostly fall in the same set of directories.
:type paths list<string>

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

I don't think I've seen this param annotation style before?

This comment has been minimized.

@illicitonion

illicitonion Jul 30, 2018

Contributor
pants$ grep -rI ':type ' src/python | wc -l
     108

Happy to take alternative suggestions though; I've never done real pydoc before, and am just grasping at what I find it the codebase :)

This comment has been minimized.

@benjyw

benjyw Jul 30, 2018

Contributor

Well this is more compact:
:param list paths: The paths to create dirs for.

More verbose:

:param paths: The paths to create dirs for.
:type paths: list of str

And if you don't want to document the meaning of the param, you can omit the first line I guess.

But importantly, you need that second :, and the way to say "list of str" is list of str...

Hopefully once we're all Python3 all the time, we can switch to using https://www.python.org/dev/peps/pep-0484/ instead.

This comment has been minimized.

@illicitonion

illicitonion Jul 30, 2018

Contributor

Done, maybe? :)

illicitonion added some commits Jul 30, 2018

@benjyw

benjyw approved these changes Jul 30, 2018

@illicitonion illicitonion merged commit 16e75ea into pantsbuild:master Jul 30, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

cosmicexplorer added a commit that referenced this pull request Aug 9, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this pull request Aug 27, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment