Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/shfmt: EditorConfig support in stdin doesn't support filenames #577

Closed
lassik opened this issue Jun 4, 2020 · 17 comments
Closed

cmd/shfmt: EditorConfig support in stdin doesn't support filenames #577

lassik opened this issue Jun 4, 2020 · 17 comments
Milestone

Comments

@lassik
Copy link

lassik commented Jun 4, 2020

Ref:

Could a command-line flag be added to shfmt to give a fake filename to use when formatting from stdin? I.e. assume that the contents of stdin come from a file with that name.

This would be important when looking up settings for stdin. If .editorconfig contains:

[*.sh]
indent_style = space
indent_size = 2

then shfmt < foo.sh will ignore that section of the config since it isn't told the filename of foo.sh.

This would be fixed by shfmt -assume-filename foo.sh < foo.sh. Stdin->stdout formatting continues to be important for text editor plugins; it would be more difficult to rely on a simple shfmt foo.sh for this use case.

Some other code formatters already have an option like this:

clang-format:

-assume-filename=<string>
    When reading from stdin, clang-format assumes this
    filename to look for a style config file (with
    -style=file) and to determine the language.

prettier:

--stdin-filepath <path>
    Path to the file to pretend that stdin comes from.

rufo:

--filename=value
    Filename to use to lookup .rufo (useful for STDIN formatting)
@lassik lassik changed the title Add stdin-filename flag Add stdin-filename flag to shfmt Jun 4, 2020
@lassik
Copy link
Author

lassik commented Jun 4, 2020

Apart from looking up settings, the fake filename can also be used when writing error/warning messages to stderr. This is also useful for editor plugins.

@lassik
Copy link
Author

lassik commented Jun 4, 2020

Naturally, the fake filename should also be used to find the .editorconfig config file itself. Only if the fake filename is relative should the working directory of `shfmt´ factor into finding that file.

@mvdan
Copy link
Owner

mvdan commented Jun 4, 2020

Hm. I can see how this could be useful, but I also want to avoid adding more flags unless strictly necessary.

It also feels a bit weird that we want to support shell input files with filenames from stdin, but we don't want to do the same for EditorConfig. Why should EditorConfig require being on disk, if we don't require the same of shell files?

then shfmt < foo.sh will ignore that section of the config since it isn't told the filename of foo.sh.

Why not just do shfmt foo.sh, then? If you have a better realistic example, I'm happy to hear about it, so that I can get a better picture.

@lassik
Copy link
Author

lassik commented Jun 4, 2020

The format-all package for Emacs supports perhaps 40-50 languages depending on how you count. We use stdin->stdout with all formatter programs - no exceptions.

For editor plugins like it, the user is typically editing a file but wants to format it before saving it. Piping the editor buffer's contents into the formatter's stdin and then replacing the buffer's contents with the formatter's stdout is much simpler than first ensuring the file is saved and then formatting based on filename. With the latter case, several problems can occur that are difficult to solve reliably.

Format-all also supports formatting temp buffers that are not saved to any file, so they have nothing reasonable to use as a filename. In this case, format-all does not pass an -assume-filename flag to the formatter.

@lassik
Copy link
Author

lassik commented Jun 4, 2020

Why should EditorConfig require being on disk, if we don't require the same of shell files?

It has never occurred to me to think about this. I can't think of a practical problem where it makes a difference where the .editorconfig comes from, but I'm open to suggestions.

@mvdan
Copy link
Owner

mvdan commented Jun 4, 2020

Thanks for your reply. I'll need to think about this. The intersection of stdin formatting with EditorConfig is a very awkward one for sure.

@lassik
Copy link
Author

lassik commented Jun 4, 2020

If the user is editing both foo.sh and the .editorconfig in a text editor at the same time, and has unsaved changes in both of them, it is assumed that formatting foo.sh in the editor uses the saved on-disk copy of the .editorconfig instead of the modifier, unsaved version in the editor buffer. This model seems intuitive, format-all uses it currently, and no-one has complained it in the couple years that the project has been in use.

@lassik
Copy link
Author

lassik commented Jun 4, 2020

Np, thanks for the prompt reply and consideration!

@mvdan mvdan changed the title Add stdin-filename flag to shfmt shfmt: EditorConfig support in stdin doesn't support filenames Jun 4, 2020
@lassik
Copy link
Author

lassik commented Jun 4, 2020

Also, this is an appeal to authority but clang-format and prettier are two of the most popular formatters and they have both had a stdin filename flag for a good while :) It works very well in practice.

@mvdan
Copy link
Owner

mvdan commented Jun 4, 2020

While that is true, many formatters out there also have a low filter when it comes to adding more and more flags. Just try to do man indent :)

@lassik
Copy link
Author

lassik commented Jun 4, 2020

While that's true, I don't see a way to solve this problem without adding a flag. Unix file descriptors do not carry filenames (and even if they did, it might not be the same filename that is used in editor's user interface due to symlinks and the like). One could use an environment variable, but that also adds something and is more brittle than a flag.

rufo (for Ruby) is one example of a formatter that has very few flags, but one of them is stdin-filename:

$ rufo --help
Usage: rufo files or dirs [options]
    -c, --check                      Only check formating changes
        --filename=value             Filename to use to lookup .rufo (useful for STDIN formatting)
    -x, --simple-exit                Return 1 in the case of failure, else 0
        --loglevel[=LEVEL]           Change the level of logging for the CLI. Options are: error, warn, log (default), debug, silent
    -h, --help                       Show this help

@mvdan
Copy link
Owner

mvdan commented Jun 4, 2020

First, a question: if this is a smart editor (or rather, a smart editor plugin), should it not be in charge of locating the right EditorConfig file and passing in flags? Granted that the shfmt tool implements that, but that's mainly meant for people running shfmt directly as a standalone tool, not from an editor.

In retrospect, I think fixing #450 might have been a mistake, because it should be the editor doing this work. Otherwise, if the editor runs ten formatting tools, they will all do their own separate work to locate and use EditorConfig files, which at best is wasted effort, and at worst is going to be inconsistent and buggy.

--

Now, assuming we want to fix this, my current thinking is as follows:

  • add a -filename flag that takes a string, defaulting to the empty string
  • when formatting stdin (either no args or just -), -filename=foo.bar means using foo.bar as the filename for reporting errors and applying EditorConfigs. The path can be relative to the working directory, or absolute.
  • when formatting stdin without -filename, we default to <standard input>, including for EditorConfig. I'm open to changing this to something like <standard input>.sh if it would provide a better default.
  • when not formatting stdin, using -filename is an error.

@lassik
Copy link
Author

lassik commented Jun 4, 2020

if this is a smart editor (or rather, a smart editor plugin), should it not be in charge of locating the right EditorConfig file and passing in flags?

It's not wise to rely on that because the set of available settings and their permitted values can (and probably will) change from one version of the formatter to the next. It's also easy to create subtle inconsistencies between how the editor applies the settings vs how the formatter (with the help of reading its own settings file) does. A -filename flag in the formatter is a huge deal simpler and less error-prone.

It's true that it is also useful to be able to map editor settings (that do not necessarily come from the .editorconfig file) to formatter settings and pass them as command line flags. But that's a different use case, and much more work to support in an editor (and much more error-prone). Format-all doesn't do any of this currently; it would be a huge effort to add the mappings, and it would be easy to create an inconsistent result due to human error.

if the editor runs ten formatting tools, they will all do their own separate work to locate and use EditorConfig files, which at best is wasted effort, and at worst is going to be inconsistent and buggy.

Most formatters do not use .editorconfig as their settings file. For example, clang-format uses .clang-format and prettier uses .prettierrc. Even if formatters universally stored their settings in .editorconfig, they will still have to contain code to locate and parse that file (perhaps using a library) since they need to support formatting files named on the command line.

As far as I can tell, any way you slice it the -filename flag looks like the simplest and least error-prone solution.

Now, assuming we want to fix this, my current thinking is as follows:

Looks good to me.

The only thing I would change is that using a zero-length filename "" is the default filename for stdin formatting, and causes <standard input> to be used for display (but loading .editorconfig would be skipped if the filename is blank).

@mvdan
Copy link
Owner

mvdan commented Jun 4, 2020

I agree that -filename is probably the simplest solution for you :) Though that doesn't mean it's the best course of action, long-term. Please bear with me for a minute.

the set of available settings and their permitted values can (and probably will) change from one version of the formatter to the next.

I'm not sure I agree - this already applies to flags, so the editor/plugin integration with shfmt already has to know about shfmt's options anyway. Of course, for any given stable major version like v3, the flags are stable and forwards compatible - and the same applies to how those options map to EditorConfig. I can't change any of those until a potential future v4.

subtle inconsistencies between how the editor applies the settings vs how the formatter does

True, but this cuts both ways. The other side of the coin is that, if two tools both have their own logic to use EditorConfig files, they might be inconsistent with each other when used as part of an editor. And, as we said before, duplicate effort that can't be reused between tool invocations either.

Most formatters do not use .editorconfig as their settings file.

It honestly bothers me that so many tools consider themselves special enough to deserve their own "dot-file". This is precisely why I rejected issues like #234 and #358, until someone suggested a format that was standard and could work well enough.

I get that, in practice, I can't even name a single other tool or formatter that uses EditorConfig and could fit in an interactive editor. Still, it still feels to me like it's a better design for the editor to be in charge of the EditorConfig from a central and high-level place.

Very interesting chat, in any case. I've been thinking about how tools integrate with editors a lot recently, but I hadn't thought about the edge case that EditorConfig introduces with in-memory buffers. I'm still on the fence when it comes to what we should do here. Something we can likely agree on is that, currently, EditorConfig support is almost completely pointless with stdin.

@lassik
Copy link
Author

lassik commented Jun 5, 2020

I appreciate your stamina with this discussion :)

It honestly bothers me that so many tools consider themselves special enough to deserve their own "dot-file".

I agree that it would be ideal to keep all formatters' settings in .editorconfig. I would go further and use a .attributes file (I have a design for one). Coding style settings are not fundamentally different from other per-file attributes.

However, we support 40 different formatters and new ones trickle in at a steady pace. There are plugins that support even more. There's no way we can get everyone to agree on where to store their settings.

the editor/plugin integration with shfmt already has to know about shfmt's options anyway. Of course, for any given stable major version like v3, the flags are stable and forwards compatible - and the same applies to how those options map to EditorConfig. I can't change any of those until a potential future v4.

I believe this is not a wise division of labor:

  • There's a difference between fundamental flags like -filename that affect the program's calling convention, and flags that merely change the style. Without having to care about style flags, I currently don't need to have any code to check the version of a formatter, and don't need to do anything special to worry about format-all's flags being in sync with formatters' (I don't need to know about style flags at all). With the current simple interfaces that most formatters have, I can easily maintain support for 40+ formatters by myself in my spare time, and the code for the plugin is simple. This would not be the case if I needed to track formatter version changes.

  • You need to support .editorconfig parsing anyway since you need shfmt foo.sh to do the right thing. The alternative would be to decouple shfmt from configuration files completely, and always require a user-supplied config file parser to translate the user's config to shfmt's flags. I.e. something like find-and-parse-editorconfig foo.sh shfmt foo.sh where find-and-parse-editorconfig would exec shfmt with the parsed values as environment variables. It gets quite esoteric and the benefit is unclear.

The other side of the coin is that, if two tools both have their own logic to use EditorConfig files, they might be inconsistent with each other when used as part of an editor.

This is unavoidable. Even if formatters don't parse .editorconfig, every editor needs to parse it separately. One person uses Emacs, another uses Vim; inconsistency can arise. There is no standard editor (except ed, of course).

The best solution is to have a standard test suite for editorconfig. Ideally also a standard library for each popular programming language.

There could be one and only one command line program to parse .editorconfig but it would have to translate it into an output format which would still have to be parsed so the gains are not that big. It would mainly handle the globbing in a central place. Then again, glob() could be moved from a library function to a standard program... In practice, a large mass of people won't conform to this design aesthetic.

And, as we said before, duplicate effort that can't be reused between tool invocations either.

We're talking about milliseconds here. Speed is not an issue; the real issue is arriving at a design simple enough to avoid race conditions, confusing corner cases in pathname lookup, etc. This is one reason why stdin->stdout is such a win.

Still, it still feels to me like it's a better design for the editor to be in charge of the EditorConfig from a central and high-level place.

An editor is not a central place:

  • There are dozens of different editors :)

  • It's useful to be able to run formatters from commit hooks, makefiles, CI, tests and the like.

As soon as your project has a make format target, its idea of the settings can already diverge from the editor's idea. When your teammate installs a different text editor than you, that's another potential divergence. A GitHub Actions job to run a formatter: another potential divergence. And so on.

A formatter needs to be callable from any number of different places and produce behavior as identical as possible from all of them. The more of the style-parsing that's centralized in the formatter itself, the more likely this outcome is.

I hadn't thought about the edge case that EditorConfig introduces with in-memory buffers. I'm still on the fence when it comes to what we should do here. Something we can likely agree on is that, currently, EditorConfig support is almost completely pointless with stdin.

Emacs keeps track of a a per-buffer working directory, so even in-memory buffers that are not associated with any file have their own idea of their working directory. However, this is not intuitive. In practice, users' mental model of the editor does not associate temp buffers with any particular directory. Since .editorconfig files are associated with particular directories, it's not a good fit.

This is different from the -filename case, because then the editor buffer is associated with a file and the fact that we're doing stdin->stdout is an invisible implementation detail.

In any case, formatting buffers whose current contents do not correspond to a disk file is the norm for editors so stdin->stdout (with or without -filename) is pretty much the only reasonable thing to do there. Files are typically formatted right before saving them, and users can also format them manually at any time while editing. If the editor formatted a disk file using shfmt foo.sh it would always have to save first (a limitation that complicates both the user interface and the implementation for no particular reason). Even then, it would have to load back the new copy of that file, introducing a race condition, potentially forgetting useful parts of its buffer state, etc.

@lassik
Copy link
Author

lassik commented Jun 5, 2020

Also, in case the editor's and the formatter's settings diverge (which inevitably happens every now and then), I think it's better to favor the formatter's idea as the ground truth. Automated tools such as a make format target would use the formatter's idea over any given editor's, and this favors keeping the project's style guide in effect even at a potential inconvenience to a particular user of a particular editor.

In practice, format-on-save using project settings is so convenient that I rarely even bother to configure Emacs anymore. I just type sloppy code using bad indentation and save frequently to sort out the mess.

@mvdan
Copy link
Owner

mvdan commented Jun 12, 2020

Thanks for the detailed reply. I'm leaning towards giving up the idealism and just adding the flag, for the following reasons:

  • stdin + editorconfig already kind of worked in v3 releases, so we can't remove the feature without breaking semver, even if the feature was broken
  • the mental model where the editor is the central configuration place is still possible with shfmt; the editor just needs to do its work with config files such as EditorConfig, and run shfmt -whatever -options, which won't duplicate that work
  • it's clear that the current trend is to let formatter tools do their own thing when run from an editor, so going against that purely on principle is a bit pointless

@mvdan mvdan changed the title shfmt: EditorConfig support in stdin doesn't support filenames cmd/shfmt: EditorConfig support in stdin doesn't support filenames Jun 26, 2020
@mvdan mvdan added this to the 3.2.0 milestone Jun 26, 2020
@mvdan mvdan closed this as completed in fa7035e Jun 26, 2020
gibfahn added a commit to gibfahn/coc-diagnostic that referenced this issue May 5, 2021
This tells shfmt what the filename was, which allows its parsing of the
.editorconfig file to work properly.

Refs: mvdan/sh#577
iamcco pushed a commit to iamcco/coc-diagnostic that referenced this issue May 7, 2021
This tells shfmt what the filename was, which allows its parsing of the
.editorconfig file to work properly.

Refs: mvdan/sh#577
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants