New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign JavaScript Style Checker to use ESLint directly #5265

Merged
merged 30 commits into from Feb 21, 2018

Conversation

Projects
None yet
4 participants
@nsaechao
Copy link
Collaborator

nsaechao commented Jan 3, 2018

Problem

Currently, the JS style checker is limited in capabilities due to being dependent on a 3rd party node package that is not well defined in code. Moreover, many features were limited because of the external undefined node package.

One such limitations is found during pants execution. A developer's working directory may include working files that is ignored by git. One such instance is when Webpack builds and bundles all javascript code into a single javascript file. A target's BUILD may include something like rglobs(*.js) which will result in pants trying to lint code that it shouldn't.

Solution

Major change: Move away from use the standard-engine framework and use eslint directly.
This provides a lot of flexibility in terms of self-bootstrapping and being able to utilize all of eslint's command-line parameters such as adding target-level ignore patterns.

Result

1.) ESLint distribution is well-defined
2.) Able to configure ESLint options through pants.ini
3.) Able to ignore patterns on a target-level
4.) Able to self-bootstrap with default rules specified by ESLint distribution.
5.) Cache is versioned by package.json / yarn.lock

@nsaechao nsaechao changed the title WIP: Redesign JavaScript Style Checker to use ESLint directly Redesign JavaScript Style Checker to use ESLint directly Jan 4, 2018

@@ -0,0 +1,133 @@
# coding=utf-8
# Copyright 2015 Pants project contributors (see CONTRIBUTORS.md).

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

2018

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

No longer relevant

options = cls.get_options()
global_options = cls.global_instance().get_options()
pants_bootstrapdir = global_options.pants_bootstrapdir
return ESLintDistribution(options.setupdir, options.supportdir, options.eslint_config,

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

I will suggest to use NodeDistribution instead of creating a new Distribution. Add a TODO will be suffice.

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Done

return self._checksum_md5(file_a) == self._checksum_md5(file_b)
return False

def _checksum_md5(self, filename):

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

from pants.base.hash_utils import hash_file should do the trick. It uses SHA1 instead of MD5, but it should be suffice for our purposes. You can also pass in a hashlib.md5() as digest if you really want to use md5.

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Done

bootstrap_dir = os.path.realpath(os.path.expanduser(self._pants_bootstrapdir))
bootstrapped_support_path = os.path.join(bootstrap_dir, self._supportdir)
is_preconfigured = True
if not os.path.exists(bootstrapped_support_path):

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

Maybe we only port this preconfigured functionality to bin_utils?

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Done

result, yarn_add_command = self.execute_yarnpkg(
args=['install'],
args=['add', 'eslint'],

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

if we are doing a yarn add eslint here, maybe we should ping down the eslint version as well?

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Done

return bootstrap_dir

def _get_target_ignore_patterns(self, target):
ignore_path = next((source for source in target.sources_relative_to_buildroot()

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

since we are only always look for a single ignore path, maybe it is more readable to construct the path and check for it in the sources? The usage of next() seems to be unnecessary.

ignore_patterns = []
if ignore_path:
root_dir = os.path.join('**', os.path.dirname(ignore_path))
with open(ignore_path) as f:

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor
with open(ignore_path) as f:
  return [os.path.join(root_dir, p.strip() for p in f] 

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Oh nice, that's handy.

return True

def _compare_file_checksums(self, file_a, file_b):
if os.path.isfile(file_a) and os.path.isfile(file_b):

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor
return (
  os.path.is_file(file_a) and 
  os.path.is_file(file_b) and
  self._checksum_md5(file_a) == self._checksum_md5(file_b)
)

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Done

bootstrapped_support_path = os.path.join(bootstrap_dir, self._supportdir)
is_preconfigured = True
if not os.path.exists(bootstrapped_support_path):
is_preconfigured = self._bootstrap_eslinter(bootstrapped_support_path)

This comment has been minimized.

@UnrememberMe

UnrememberMe Jan 16, 2018

Contributor

should we consider to remove preconfigured bootstrap dir during pants clean-all? otherwise, if we downloaded a bad file/configure, users can't easily delete them and retry.
I think a TODO for now is OK, but we should try to fix it soon.

This comment has been minimized.

@nsaechao

nsaechao Jan 16, 2018

Collaborator

Agreed, added a TODO here as the clean-up should clean up other things as well.

@nsaechao

This comment has been minimized.

Copy link
Collaborator

nsaechao commented Jan 23, 2018

@stuhood stuhood self-requested a review Jan 23, 2018

@stuhood
Copy link
Member

stuhood left a comment

Thanks Ny: this is a good improvement. Sorry for the long delayed review.

@@ -36,15 +36,15 @@ def register_options(cls, register):
register('--skip', type=bool, fingerprint=True, help='Skip javascriptstyle.')
register('--fail-slow', type=bool,
help='Check all targets and present the full list of errors.')
register('--javascriptstyle-dir', advanced=True, fingerprint=True,

This comment has been minimized.

@stuhood

stuhood Jan 23, 2018

Member

I'm ok with removing the functionality behind this option because it was added during the 1.4.0 cycle. But I think you should still leave the option as deprecated, so that anyone who might have used it will get some hint of where to look for the new solution.

See

register('--rpc-style', type=str, advanced=True, default='sync',
removal_version='1.6.0.dev0',
removal_hint='Use --compiler-args instead of --rpc-style.',
help='The default rpc-style to generate for java_thrift_library targets.')
for an example of how to deprecate an option. You should target 1.6.0.dev0 for the removal.

@@ -45,17 +48,29 @@ def register_options(cls, register):
NodeDistribution.VALID_PACKAGE_MANAGER_LIST.keys()))
register('--yarnpkg-version', advanced=True, default='v0.19.1', fingerprint=True,
help='Yarnpkg version. Used for binary utils')
register('--eslint-setupdir', advanced=True, fingerprint=True,

This comment has been minimized.

@stuhood

stuhood Jan 23, 2018

Member

It seems like all of these options should potentially be on an "eslint" subsystem instead, which could depend on the NodeDistribution subsystem?

Also, I don't think that having an explicit setupdir option is a good idea, as you could always just do it in a temp directory, right?

This comment has been minimized.

@nsaechao

nsaechao Jan 27, 2018

Collaborator

The setup dir here should not be in a temp directory, it should be a directory that gets checked in and contains a package.json that configures "eslint".

register('--javascriptstyle-dir', advanced=True, fingerprint=True,
help='Package directory for lint tool.')
register('--color', type=bool, default=True, help='Enable or disable color.')
register('--default-eslint-version', default='4.15.0', help='Default ESLint version if not configured.')

This comment has been minimized.

@stuhood

stuhood Jan 23, 2018

Member

It's odd that this option is located in a different location than the other eslint options... it would probably be good to move this to that same eslint subsystem.

Also, should drop the "default" from the option name and update the help: if someone overrides the value of the option, it would no longer be a default. So just "ESLint version to use" probably.

This comment has been minimized.

@nsaechao

nsaechao Jan 27, 2018

Collaborator

That's a good point.

:rtype: (string, bool)
"""
bootstrap_dir = os.path.realpath(os.path.expanduser(self._pants_bootstrapdir))
# TODO(nsaechao): ./pants clean-all should handle removing this directory.

This comment has been minimized.

@stuhood

stuhood Jan 23, 2018

Member

I don't think that the supportdir should be exposed outside of .pants.d... probably you should have the eslint subsystem hardcode this directory as os.path.join(self.get_options().pants_workdir, 'eslint', 'support' or something (ie, under .pants.d), and remove the eslint_supportdir option.

That would solve the clean-all problem, and would prevent people from depending on the location where eslint is installed.

This comment has been minimized.

@nsaechao

nsaechao Jan 27, 2018

Collaborator

IIRC, anything in the pants_workdir gets wiped out after every pants command? Having to re-install eslint each time would be costly if this is true...

This comment has been minimized.

@stuhood

stuhood Jan 27, 2018

Member

No, only after ./pants clean-all.

installed = self._binary_util.is_bin_valid(bootstrapped_support_path, binary_file_specs)
if not installed:
self._configure_eslinter(bootstrapped_support_path)
return (bootstrapped_support_path, configured)

This comment has been minimized.

@baroquebobcat

baroquebobcat Jan 23, 2018

Contributor

I found the conditionals here a bit hard to follow. From what I can tell, these are the cases.

  • If setupdir isn't set, or if it doesn't contain package.json, yarn.lock, then create/clean the bootstrap path, returning bootstrap, False
  • if setupdir contains the files but bootstrap doesn't, or bootstrap's are different, then run configure_eslinter and return bootstrap, True
  • if setupdir contains the files and they are the same as the ones in bootstrap, then return bootstrap, True

Does that sound right?

This comment has been minimized.

@nsaechao

nsaechao Jan 27, 2018

Collaborator

Yes. Updated with comments for future clarifications.

@stuhood
Copy link
Member

stuhood left a comment

Option locations look much better, but I don't think the fingerprinting you're doing is hooked up to anything.

help='The path to the global eslint configuration file specifying all the rules')
register('--eslint-ignore', advanced=True, fingerprint=True,
help='The path to the global eslint ignore path')
register('--eslint-version', default='4.15.0', help='Use this ESLint version if not configured.')

This comment has been minimized.

@stuhood

stuhood Feb 1, 2018

Member

The "if not configured" bit still doesn't make sense: should drop.

@@ -45,6 +48,13 @@ def register_options(cls, register):
NodeDistribution.VALID_PACKAGE_MANAGER_LIST.keys()))
register('--yarnpkg-version', advanced=True, default='v0.19.1', fingerprint=True,
help='Yarnpkg version. Used for binary utils')
register('--eslint-setupdir', advanced=True, fingerprint=True,
help='Find the package.json under this dir for installing eslint and plugins.')
register('--eslint-config', advanced=True, fingerprint=True,

This comment has been minimized.

@stuhood

stuhood Feb 1, 2018

Member

Should mark this option astype=file_option (which will cause it to be deeply fingerprinted).

from pants.option.custom_types import file_option
@@ -45,6 +48,13 @@ def register_options(cls, register):
NodeDistribution.VALID_PACKAGE_MANAGER_LIST.keys()))
register('--yarnpkg-version', advanced=True, default='v0.19.1', fingerprint=True,
help='Yarnpkg version. Used for binary utils')
register('--eslint-setupdir', advanced=True, fingerprint=True,
help='Find the package.json under this dir for installing eslint and plugins.')

This comment has been minimized.

@stuhood

stuhood Feb 1, 2018

Member

Should this say something about committing the yarn.lock file in this directory, or something?

binary_file_specs = [
BinaryUtil.BinaryFileSpec(f, hash_file(os.path.join(self.eslint_setupdir, f)))
for f in ['yarn.lock', 'package.json']]
installed = self._binary_util.is_bin_valid(bootstrapped_support_path, binary_file_specs)

This comment has been minimized.

@stuhood

stuhood Feb 1, 2018

Member

Because you're not persisting these fingerprints anywhere in association with the current options, they're always going to be re-calculated? And its not obvious where a good place to persist them would be.

I think that rather than bootstrapping these tools in a global location under pantsd, you should have the subsystem bootstrap them under the task's workdir... ie, give this method a signature:
def eslint_supportdir(self, task_workdir). Because Tasks have fingerprints based on their options, you could avoid doing any of your own fingerprinting, and instead just check for existence of the files.

When the Task fingerprint changes, it gets a brand new directory, and would re-bootstrap as you would expect.

This comment has been minimized.

@nsaechao

nsaechao Feb 13, 2018

Collaborator

@stuhood: I am not sure I fully understand the concept of fingerprinting. Can you clarify what you mean by:

  • "Because Tasks have fingerprints based on their options, you could avoid doing any of your own fingerprinting, and instead just check for existence of the files."

If the configuration changes, wouldn't the task options remain the same?

Ny Saechao added some commits Feb 13, 2018

@stuhood
Copy link
Member

stuhood left a comment

Thanks. A few tiny tweaks before we land this.

help='The path to the global eslint configuration file specifying all the rules')
register('--eslint-ignore', advanced=True, type=file_option, fingerprint=True,
help='The path to the global eslint ignore path')
register('--eslint-version', default='4.15.0', help='Use this ESLint version.')

This comment has been minimized.

@stuhood

stuhood Feb 13, 2018

Member

This should likely be fingerprinted as well.

@@ -209,3 +216,27 @@ def _fetch_binary(self, name, binary_path):
logger.debug('Selected {binary} binary bootstrapped to: {path}'
.format(binary=name, path=bootstrapped_binary_path))
return bootstrapped_binary_path

def _compare_file_checksums(self, filepath, checksum=None, digest=None):

This comment has been minimized.

@stuhood

stuhood Feb 13, 2018

Member

These edits should no longer be necessary because this will be located under the Task directory.

Ny Saechao added some commits Feb 15, 2018

Ny Saechao
@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Feb 19, 2018

@nsaechao : I restarted the first shard with a clean cache... let's see if that reproduces. In the meantime, this had two conflicting PRs land in master. Sorry for the trouble!

@nsaechao

This comment has been minimized.

Copy link
Collaborator

nsaechao commented Feb 20, 2018

@stuhood: Daniel Wagner was able to pinpoint my issue, I had accidentally deleted the .cargo/config file. Since restoring it and fixing the merge conflict, the travis CI has passed.

@stuhood
Copy link
Member

stuhood left a comment

Thanks Ny... two things that I would be fine with fixing in followups.

"""
hasher = sha1()
for dirpath in dirpaths:
for dirpath, dirnames, filenames in os.walk(dirpath, topdown=topdown, onerror=onerror,

This comment has been minimized.

@stuhood

stuhood Feb 20, 2018

Member

Because os.walk is likely to return things in different orderings on different machines, you'll need to sort paths here. Most likely what you'll want to do is collect all filenames under each of the top-level directories (where ordering does matter), and then sort them before calling self._fingerprint_files(..).

But since you've iterated so much on this, I'm fine with doing it as a quick followup?

This comment has been minimized.

@nsaechao

nsaechao Feb 20, 2018

Collaborator

nice catch, we should also sort the top-level dirpaths that is passed in as well.

This comment has been minimized.

@stuhood

stuhood Feb 20, 2018

Member

No, I don't think so. See the comment in self._fingerprint_files on that subject. The order of the entries in the list option might matter.

@@ -61,53 +63,93 @@ def test_config_invalidates_targets(self, cache_args):

This comment has been minimized.

@stuhood

stuhood Feb 20, 2018

Member

This file should be completely restored I think? I removed the _temporary_buildroot method in master intentionally. If you need edits to get this passing, you'll want to make them in the base class.

This comment has been minimized.

@nsaechao

nsaechao Feb 20, 2018

Collaborator

Agreed, fully-restored.

@stuhood stuhood merged commit aa85487 into pantsbuild:master Feb 21, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment