Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check arbitrary (supported) files #127

Closed
bibz opened this issue Dec 5, 2019 · 8 comments
Closed

Check arbitrary (supported) files #127

bibz opened this issue Dec 5, 2019 · 8 comments
Labels
enhancement New feature or request wontfix This will not be worked on

Comments

@bibz
Copy link

bibz commented Dec 5, 2019

Expected Behavior

Similar to the "[nitpick.JSONFile]" section, arbitrary TOML, YAML, etc. files can have content enforcement.

Current Behavior

With the notable exception of JSON files, there is a definitive list of supported files that one has to abide by.
Even though the formats are parsed and handled by nitpick, the files are not on the whitelist.

Possible Solution

Rely on the filename extension to apply the format checker instead of using a whitelist.

Context

We would like to enforce new configurations with nitpick:

Prettier can also parse a JSON configuration, but we would ideally not rewrite the configuration and open the door for more of our YAML files.

@bibz bibz added the enhancement New feature or request label Dec 5, 2019
@andreoliwa
Copy link
Owner

Thanks for your suggestion!

Even though the formats are parsed and handled by nitpick

You're talking about TomlFormat and YamlFormat classes, right?
They are used to parse pyproject.toml and .pre-commit-config.yaml.

They are still kind of raw, changes will be needed to turn them into generic parsers for any .toml or .yaml file.

Rely on the filename extension to apply the format checker instead of using a whitelist.

The whitelist is used to validate the style strictly, and explicitly inform which files you want to be parsed.
Take the the package.json example.
I use this section:

[nitpick.JSONFile]
file_names = ["package.json"]

... to validate the ["package.json"] section.
If not mentioned under [nitpick.JSONFile] file_names, I consider the ["package.json"] section as "invalid", extra configuration.

I don't remember the code by heart.
I will check if it's possible, in the current logic, to autodetect any section with the .json suffix as a JSONFile.
Then the same logic can be applied to generic .toml or .yaml files.


Another issue for generic TOML and YAML parsers: what to check?
For JSON files, I currently check:

  • if keys are present (contains_keys in the package.json example above);
  • if a certain root key contains a JSON block: contains_json in the same example above, commitlint is the root key, and the JSON block is represented as a string.

Still a very limited check.

For TOML and YAML files, a similar approach could be used.
So far, I didn't think about a generic way of handling those formats, because they have a complex syntax and multiple hierarchy levels.

Some possibilities I see for TOML files:

  • check if a section contains some TOML content
  • check if a section contains some keys
  • check if a section matches exactly some TOML content, no missing keys nor extra keys
  • ...

Some possibilities I see for YAML files:

  • check if a root key is present
  • check if a root key has a certain value
  • check if some key is present at any or at a certain level of indentation
  • check if a key of type list/array/sequence has exactly all expected items
  • check if a key of type list/array/sequence contains some of the expected items in any order
  • check if a key of type list/array/sequence contains some of the expected items in the expected order
  • check if a key of type hash table/dictionary/mapping has exactly expected key/value pairs
  • check if a key of type hash table/dictionary/mapping contains some of the expected key/value pairs
  • ...

Currently, Nitpick handles a YAML file (.pre-commit-config.yaml), and the output is not yet to my liking.
E.g.: a hook can have an additional_dependencies key, it is a list/array.
If some item is missing in the list, the current output of Nitpick is confusing and not completely helpful.
The diff module I use outputs the diff in a certain way, and I still didn't think of a generic way to handle it and show a proper YAML suggestion on the warning (e.g "the item X is missing from the list").

There are many Nitpick validation possibilities for generic file parsers.
To avoid writing useless and complex generic code, I usually start from a real need for a specific file and then apply a generic approach that could be reused.

If you could share examples of excerpts from the files you would like to check (Prettier and Azure Pipelines config), we could come up with a generic style file syntax for them.

@bibz
Copy link
Author

bibz commented Dec 10, 2019

Thank you for your thoughtful reply.

If not mentioned under [nitpick.JSONFile] file_names, I consider the ["package.json"] section as "invalid", extra configuration.

As I see it, from a user perspective and not from a code perspective, that is a feature. It could potentially evolve in "any new section describes a new file".


It is true that having a generic parser raises many points.
Actually, I drafted another feature request to be able to merge nested arrays (additional_dependencies in .pre-config-commit.yaml). But as I was writing and extending the description, it dawned on me that this was too complex to not be a specific check for .pre-commit-config.yaml.

I agree with you, the feature request is vague and covers too much at once.
I think you are taking the better approach in starting with specifics first.

I can create new feature requests for the new file we would like to have support for. And close this one for reference, as a "Won't fix"?

@andreoliwa
Copy link
Owner

As I see it, from a user perspective and not from a code perspective, that is a feature. It could potentially evolve in "any new section describes a new file".

I agree, it is a feature that makes things easier.
I hope it is easily feasible with the current code.

If you want, you can create this new feature request.
If you don't create it in the next few days, I will draft it and create an issue myself.

I can create new feature requests for the new file we would like to have support for. And close this one for reference, as a "Won't fix"?

Perfect.
Then we can discuss the specifics of your new files (Prettier config, Azure Pipelines, tox.ini), and I might even start using them in my projects.

@andreoliwa
Copy link
Owner

Thank you for your thoughtful reply.

If not mentioned under [nitpick.JSONFile] file_names, I consider the ["package.json"] section as "invalid", extra configuration.

As I see it, from a user perspective and not from a code perspective, that is a feature. It could potentially evolve in "any new section describes a new file".

With 6f54480 the code is working like this for JSON and text files: a new section describes a new file.
It uses https://github.com/chriskuehl/identify to identify file type by extension.

It will be available on the next release.

@hmvp
Copy link

hmvp commented Nov 13, 2020

Thank you for your thoughtful reply.

If not mentioned under [nitpick.JSONFile] file_names, I consider the ["package.json"] section as "invalid", extra configuration.

As I see it, from a user perspective and not from a code perspective, that is a feature. It could potentially evolve in "any new section describes a new file".

With 6f54480 the code is working like this for JSON and text files: a new section describes a new file.
It uses https://github.com/chriskuehl/identify to identify file type by extension.

It will be available on the next release.

I feel that is the right behaviour. The text plugin does the same thing i suppose.

I did ran into a catch with that plugin though. The text plugin and presumably others also check if the file type matches.
Since there is no generic yaml support I tried to use the text plugin to at least check the presence of a line, but it does not allow other file types, I am not sure if I like that restriction.. inspect might not always do the right thing or I might want to use the text plugin as an escape of sorts for unsupported formats..

Overall the above suggestions all seem reasonable. I do want to add gitlab yaml to the list of desired files to support.

@andreoliwa
Copy link
Owner

Hi @hmvp, thanks for the enhancement and bug reports so far.

I tried to use the text plugin to at least check the presence of a line, but it does not allow other file types, I am not sure if I like that restriction..

A YAML file is also a text file:

$ identify-cli .pre-commit-config.yaml
["file", "non-executable", "text", "yaml"]

But I'm using plain-text to check text files, not text:

if "plain-text" in tags:
return TextPlugin(file_name)

Initially I thought "all files can also be text files, so I can use 'contains' with any file".
But there was a reason for using plain-text instead.
Something went wrong when I used text, but I didn't document what happened. 🤦🏻

Could you open another bug with the style you used and failed?
So I can investigate further and either fix it or write down the problem this time.


Overall the above suggestions all seem reasonable. I do want to add gitlab yaml to the list of desired files to support.

Which kind of validations you would like to do in .gitlab-ci.yml?
Any one of these possibilities below (mentioned in a comment above), or do you think of a different one?

Some possibilities I see for YAML files:

  • check if a root key is present
  • check if a root key has a certain value
  • check if some key is present at any or at a certain level of indentation
  • check if a key of type list/array/sequence has exactly all expected items
  • check if a key of type list/array/sequence contains some of the expected items in any order
  • check if a key of type list/array/sequence contains some of the expected items in the expected order
  • check if a key of type hash table/dictionary/mapping has exactly expected key/value pairs
  • check if a key of type hash table/dictionary/mapping contains some of the expected key/value pairs
  • ...

I ask because I'd like to understand real use cases, and then think of a generic and extensible API.

  1. For the JSON plugin there are 2 validations so far, the ones I needed ("contains keys" and "contains JSON").
  2. For the text plugin, there is a "contains" validation so far (more will be added on Add multiline option to text plugin #233, will comment there soon).

I like the "contains" approach so far, although I did it mindlessly on the JSON plugin, and in more extensible way on the text plugin.
JSON is not extensible because you can only have a single contains_json per checked file (there might be breaking changes in the future to fix this behaviour).

@hmvp
Copy link

hmvp commented Nov 16, 2020

So the thing I tried to do is this check:

[[".gitlab-ci.yml".contains]]
line = "    - pytest --cov=ims --cov-branch --junitxml=report-pytest-unit.xml --cov-fail-under="

Of course that was using the text plugin so with yaml support it would be nice if I can assert that (partial) line is present under a specific section. Maybe the regex option is also relevant here

@andreoliwa
Copy link
Owner

Hi @bibz @hmvp @jaysonsantos. 👋🏻

This issue might actually have a future.
I had some ideas on how to check generic YAML files here: #353

I'm not sure about the API, I would appreciate some feedback from you. 🙂
Could you review the pull request?
Multiple minds are better than one. 😅

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants