Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change TAB to use the mode-specific indentation config #49

Merged
merged 2 commits into from
Jul 19, 2022

Conversation

mcobzarenco
Copy link
Collaborator

#29 added mode-specific configuration for indentation, though it remained unused. See the indentation field for an example below. This PR changes TAB to respect the indentation configuration.

        Mode(
            name: "C",
            scope: "source.c",
            injection_regex: "c",
            patterns: [Suffix(".c"), Suffix(".h")],
            comment: Some(Comment(token: "// ")),
            indentation: Indentation(
                width: 4,
                unit: Space,
            ),
            grammar: Some(
                Grammar(
                    id: "c",
                    source: Git(
                        git: "https://github.com/tree-sitter/tree-sitter-c",
                        rev: "517bf92b2c5e8baa4241cbb8a49085ed7c3c48d4",
                    ),
                )
            ),
        ),

@mcobzarenco mcobzarenco self-assigned this Jul 18, 2022
@mcobzarenco
Copy link
Collaborator Author

@kevinmatthes Would you be so kind to review this PR?

Copy link
Contributor

@kevinmatthes kevinmatthes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good but they should also be mirrored in graphemes.rs where the tab width still is assumed to be statically four.

Especially with languages like Haskell which require a width of eight by convention, the tree-sitters might be confused.

Is there a possibility to read out the currently recognised language in graphemes.rs to use the width also there?

@mcobzarenco
Copy link
Collaborator Author

@kevinmatthes You make a very good point -- I did look at how TAB_WIDTH is used and concluded it played two slightly different roles

  • In graphemes.rs, it is essentially used to decide how to wide a tab is visually
  • In buffer.rs, it was used to insert a number of spaces (essentially to match the width of a tab)

Obviously these are very close concerns from a end user pov, but are slightly different. Moving forward, some options about what we can do

  1. Add a new configuration option, say visual_tab_width, which controls how wide the tab character is rendered

             indentation: Indentation(
                width: 1,
                visual_tab_width: 4
                unit: TAB,
            ),
    

    Pressing TAB would insert on \t character which is rendered as taking 4 spaces, e.g.

    image

  2. Same as 1, but instead, make this option global in the config file, rather than per mode.

  3. Inserting multiple \t on pressing TAB seems unlikely to be used. So width > 1 with unit: TAB is probably going to be rare.. we could reuse the width field instead to mean "visual_tab_width" when the unit is tab. I.e. always insert a single \t on pressing TAB and use width to decide how to render it.

3 is somewhat less uniform, but may be the behaviour users expect. What are your thoughts?

@kevinmatthes
Copy link
Contributor

I agree with you, possibility 3 would be the best solution.

In summary, this would mirror the count of spaces to the visual appearance when changing the unit to tab characters, right?

@iainh
Copy link
Contributor

iainh commented Jul 18, 2022

For what it is worth, I had a similar change to this in my now defunct 'working' branch in which I implemented option 3 since I interpreted the meaning of "width" to be "display width with a monospace font" so how wide a tab should appear as well as how many spaces to output to look the same as one tab of width "width". The grapheme changes were minor but I had done them by adding a method parameter which wasn't necessarily the cleanest solution (thus no PR).

I also think there is value in making it mode specific since some languages, make and cmake for example, have vastly different defaults in many editors like Emacs. Make uses 8 character wide tabs where as cmake uses 2 spaces.

@mcobzarenco
Copy link
Collaborator Author

I've pushed a commit that implements option 3 -- I must admit, using the width value in this two different roles (number of chars vs. visual width of the \t grapheme) did require more gymnastics than I anticipated. I still think from an end user pov is probably the right thing to do, but implementation-y wise, it's a bit confusing, check out the code.

Copy link
Contributor

@kevinmatthes kevinmatthes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, from my point of view, there are no problems with these changes.

But a more general question: how about override functionalities for these settings? Command line options and config variables in order to easily alter the indentation unit if required?

@mcobzarenco
Copy link
Collaborator Author

But a more general question: how about override functionalities for these settings? Command line options and config variables in order to easily alter the indentation unit if required?

I don't think being able to override these settings form the command line is worth it. Some arguments against providing many override settings with command line arguments

  • As the editor grows & we have more and more things in the configuration file, would we need to maintain an argument to override any of those settings?
  • The cost of adjusting a config file is not high. In many years of using emacs, I didn't find myself wishing to override things when I start it.
  • Specifying complex, nested pieces of data via command line arguments is not convenient
  • Emacs doesn't provide these overrides.

In my mind command line arguments for the editor should be pretty limited and instead should be reserved for truly "global" behaviour that needs to be specified at startup -- e.g. where the config dir is, whether to build the parsers, whether you want a terminal or gui version etc.

@mcobzarenco
Copy link
Collaborator Author

As one more datapoint on cmd line argument override, helix also doesn't support any

@mcobzarenco
Copy link
Collaborator Author

I've added a changelog too in this PR -- it was high time

@mcobzarenco mcobzarenco merged commit 0b40784 into master Jul 19, 2022
@mcobzarenco mcobzarenco deleted the indentation-config branch July 19, 2022 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants