Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formatting with .clang-format #23

Closed
ErikPrantare opened this issue Mar 23, 2021 · 9 comments · Fixed by #34
Closed

Formatting with .clang-format #23

ErikPrantare opened this issue Mar 23, 2021 · 9 comments · Fixed by #34

Comments

@ErikPrantare
Copy link
Contributor

It would be nice to use clang-format to format the code base.

One issue brought up would be the raised barrier of entry to contribute.
I found an action that automatically checks(but doesn't format): https://github.com/marketplace/actions/clang-format-lint.
Could be useful for people without clang-format mayhaps?

Also, I vouch for trying 80 columns.
It's the most common width to use, and it's always possible to relax this limit in the future as it's all automatically formatted.
Imo it makes it easier to read from top to bottom :)

@btzy
Copy link
Owner

btzy commented Mar 27, 2021

I like the clang-format GitHub Action, which (I think) should provide sufficient information/hints for people who do not use it for development, since the build log is public. As long as it doesn't automatically commit to the PR branch, I'm for it. (Automatic commits prevent us from using the possibility of doing a "rebase and merge", which I prefer when the commit history is clean enough.)

@btzy
Copy link
Owner

btzy commented Mar 27, 2021

I briefly looked through the code in search for lines >80 chars long. There will be quite a few lines that will be wrapped, including a number of for-loops and function calls to system APIs.

For example:

    for (nfdfiltersize_t filterIndex = 0; filterIndex != filterCount ; ++filterIndex)
    {
        // this is the spec to parse (we don't use the friendly name on OS X)
        const nfdnchar_t *filterSpec = filterList[filterIndex].spec;
        
        const nfdnchar_t *p_currentFilterBegin = filterSpec;
        for (const nfdnchar_t *p_filterSpec = filterSpec; *p_filterSpec; ++p_filterSpec)
        {
            // ...
        }
        // add the extension to the array
        NSString *filterStr = [NSString stringWithUTF8String:p_currentFilterBegin];
        [buildFilterList addObject:filterStr];
    }

The two for-loops and the call to stringWithUTF8String will have to be broken into multiple lines. It does feel like would decrease readability by spilling the preface of the for-loop header into separate lines.

Even lines like this one will need to be split:

        nfdnchar_t *tmp_outStr = NFDi_Malloc<nfdnchar_t>(sizeof(nfdnchar_t) * charsNeeded);

Rustfmt does seem to have a default max width of 100, but with smaller limits for things like function calls and array literals; Rust is probably the programming language closest to C++ and Rustfmt is the default formatter for Rust so perhaps its choices should bear some weight. Would something like this be possible in ClangFormat?

@ErikPrantare
Copy link
Contributor Author

ErikPrantare commented Mar 28, 2021

AFAIK clang-format does not have anything similar.

Yeah, breaking up the for loops on multiple lines is kind of awkward. This specific example could be written like:

    for (nfdfiltersize_t index = 0; index != filterCount; ++index) {
        // this is the spec to parse (we don't use the friendly name on OS X)
        const nfdnchar_t* filterSpec = filterList[index].spec;

        const nfdnchar_t* p_currentFilterBegin = filterSpec;
        for (const nfdnchar_t* spec = filterSpec; *spec; ++spec) {
            // ...
        }
        // add the extension to the array
        NSString* filterStr =
                [NSString stringWithUTF8String:p_currentFilterBegin];
        [buildFilterList addObject:filterStr];
    }

A little bit terser variable names, but in this situation I think it works. Assignments are often pretty simple to split, I have them set to put the whole expression on a new line, like:

    nfdnchar_t* tmp_outStr =
            NFDi_Malloc<nfdnchar_t>(sizeof(nfdnchar_t) * charsNeeded);

There may be other places where it is harder while keeping it readable.
I think the only way to find out is to try and see if the cases where it becomes a little more awkward
outweighs the cases where it becomes more readable.
Generally I have found it to work out pretty well given some thought, but there are always edge cases.

@btzy
Copy link
Owner

btzy commented Mar 30, 2021

Admittedly, my variable naming is somewhat verbose at points. If you do want to help, perhaps you could run clangformat and make a PR so that I can take a look?

GitHub Actions is triggered for PRs so you could also modify the workflow (cmake.yml) to include a check (or perhaps it might be better to add a separate job for the check instead of appending to the existing jobs, so that we can easily see from GitHub if a PR is only failing because of clangformat). If you're not sure how to set up the workflow, then just ignore this and I'll add it in a separate PR in the future.

@ErikPrantare
Copy link
Contributor Author

ErikPrantare commented Mar 31, 2021

Yes, I'll get to it when I have a bit more time.
I think I'll do one commit for the autoformat, and one where I try to fix the stuff that looks wonky.
I'll just use the .clang-format I usually use in my projects for the initial PR.

Haven't learned how to use github actions, mostly because I don't want to become too dependent on github for my personal workflow (propietary and owned by Micro$oft ;^) )

@ErikPrantare
Copy link
Contributor Author

Ok, I have used clang-format in https://github.com/ErikPrantare/nativefiledialog-extended/tree/clang-format
I only manually intervened in the gtk version as I'm on a Linux machine and didn't want to accidentally change any behavior I couldn't test.
I automatically formatted all files though.

I think some of the places where loops wrap around could benefit from some general refactoring,
such as trying to use standard functions/containers or breaking out parts into separate functions
(e.g. splitting/merging filters could use a separate function).
Also, the use of Hungarian notation for raw pointer iterators could be removed
in favor of trying to use variable names with more semantic meaning.

Is C++98 compatibility a requirement?

@btzy
Copy link
Owner

btzy commented Apr 4, 2021

From a quick glance, there are some things that surprised me:

  1. In preprocessor definitions, the '#' can be separated from the rest of the line by spaces (not really a bad thing, just something I didn't know would work):

https://github.com/ErikPrantare/nativefiledialog-extended/blob/06d67bc286da94a18e0c7bb3798e9e18da4b9a09/src/include/nfd.h#L222-L228

  1. It seems a bit pointless to put the return type on a separate line:

https://github.com/ErikPrantare/nativefiledialog-extended/blob/06d67bc286da94a18e0c7bb3798e9e18da4b9a09/src/include/nfd.h#L168-L169

  1. Fairly surprising alignment (I understand why, but don't find it helpful since the declaration should be "different" from the following two lines):

https://github.com/ErikPrantare/nativefiledialog-extended/blob/06d67bc286da94a18e0c7bb3798e9e18da4b9a09/src/nfd_gtk.cpp#L144-L149

Also one more thing: Is there anything special about your .clang-format settings? Would it be better to base it upon some well-known style guide instead?

Looking at the formatted code, I feel that 80 lines is definitely too short, and the formatter is being somewhat too strict about making things align. The formatted version feels somewhat more difficult to read, for me at least, because it is harder to tell where one statement ends and the next one begins.

While I like the parts about putting the correct spaces in front of lines and in between tokens, as well as the proper C++-ish left aligned pointers (i.e. int* x instead of int *x), I'm afraid that the current state of formatting isn't really how I'd like it to look like. I understand that we're now disagreeing about the actual style, even as we agree with the use of clang-format in principle. You're free to disagree with my style choices; but as I'm going to be continually maintaining this code, I think I'm not going to accept this style. I'm sorry that I've made you waste your time trying this out and manually intervening on some parts.

You've however convinced me of the usefulness of having an automatic formatter though. In the future, I might play around with clang-format on my own, and try to find a style that I like best.

@ErikPrantare
Copy link
Contributor Author

ErikPrantare commented Apr 4, 2021

No problem, I picked my own .clang-format mostly as a proof-of-concept,
and not necessarily as what the actual formatting should look like.
Obviously you should use something that aligns with your sense of aesthetics ;^)

Just some motivations for the things you pointed out, to take into consideration when you create your .clang-format:

The reason for having the return type on its own line is to always align the function names.
This can aid in readability in certain situations, e.g.:

int myFunc1();

[[nodiscard]] std::pair<int, std::vector<std::variant<bool, int>>> myFunc2();

//becomes

int
myFunc1();

[[nodiscard]] std::pair<int, std::vector<std::variant<bool, int>>>
myFunc2();

The 80 column restriction can help by being an indicator that whatever you're writing is too complex.
Often the cause is overly verbose variable names, too many nestled statements, too many nestled expressions, et.c..
So when it becomes an issue to keep it readable under 80 lines, that is usually a smell signifying that the code itself needs refactoring.
I've found this to lead, in the long run, to cleaner code. It's basically a way to make difficult to read code uglier.
That's also why I e.g. use G_THIS_CASING for global variables, makes me more inclined to refactor them away ;^)

So yeah, things to consider.
When I picked my .clang-format, I made it as close to what I prefer and then adapted to its quirks.
The benefit of not having to think a ton about formatting outweighed the ability to micro-manage formatting in the end.
Didn't pick a style-guide because it feels unnecessary when it's all automatically formatted anyways.
I think the tool should be the cross-project standard, not the format.

@btzy
Copy link
Owner

btzy commented Apr 4, 2021

Thanks for the tips.

Indeed it might be useful to align the function names, though maybe it's less important for when the return type is short.

@btzy btzy mentioned this issue Apr 8, 2021
@btzy btzy closed this as completed in #34 Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants