Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zig fmt policy #1003

Closed
andrewrk opened this issue May 10, 2018 · 12 comments
Closed

zig fmt policy #1003

andrewrk opened this issue May 10, 2018 · 12 comments
Labels
bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Milestone

Comments

@andrewrk
Copy link
Member

andrewrk commented May 10, 2018

I'm using zig fmt to convert deref code from *ptr to ptr.* syntax in the pointer-reform branch, using zig fmt from the zig-fmt-pointer-reform branch.

Here are some things I noticed that zig fmt does that we should address. Some of them have clear ways to fix them, others are less clear. Feedback welcome! Everybody gets to have an opinion about the bike shed color :)

complex if-else expressions [fixed]

-        const mode = if (release_safe and !release_fast and !release_small)
-            builtin.Mode.ReleaseSafe
-        else if (release_fast and !release_safe and !release_small)
-            builtin.Mode.ReleaseFast
-        else if (release_small and !release_fast and !release_safe)
-            builtin.Mode.ReleaseSmall
-        else if (!release_fast and !release_safe and !release_small)
-            builtin.Mode.Debug
-        else x: {
+        const mode = if (release_safe and !release_fast and !release_small) builtin.Mode.ReleaseS
afe else if (release_fast and !release_safe and !release_small) builtin.Mode.ReleaseFast else if (
release_small and !release_fast and !release_safe) builtin.Mode.ReleaseSmall else if (!release_fas
t and !release_safe and !release_small) builtin.Mode.Debug else x: {
             warn("Multiple release modes (of -Drelease-safe, -Drelease-fast and -Drelease-small)"
);
             self.markInvalidUserInput();
             break :x builtin.Mode.Debug;
-            return os.getEnvVarOwned(self.allocator, "CC") catch |err| 
-                if (err == error.EnvironmentVariableNotFound)
-                    ([]const u8)("cc")
-                else
-                    debug.panic("Unable to get environment variable: {}", err)
-            ;
+            return os.getEnvVarOwned(self.allocator, "CC") catch |err| if (err == error.EnvironmentVariableNotFound) ([]const u8)("cc") else debug.panic("Unable to get environment variable: {}", err);

long function definitions [fixed]

-    fn spawnChildEnvMap(self: &Builder, cwd: ?[]const u8, env_map: &const BufMap,
-        argv: []const []const u8) !void
-    {
+    fn spawnChildEnvMap(self: &Builder, cwd: ?[]const u8, env_map: &const BufMap, argv: []const []const u8) !void {

long function calls [fixed]

-            builder.pushInstalledFile(os.path.join(builder.allocator, builder.lib_dir,
-                artifact.major_only_filename) catch unreachable);
-            builder.pushInstalledFile(os.path.join(builder.allocator, builder.lib_dir,
-                artifact.name_only_filename) catch unreachable);
+            builder.pushInstalledFile(os.path.join(builder.allocator, builder.lib_dir, artifact.major_only_filename) catch unreachable);
+            builder.pushInstalledFile(os.path.join(builder.allocator, builder.lib_dir, artifact.name_only_filename) catch unreachable);

failure to parse extra comma [fixed]

../std/json.zig:745:27: error: Expected primary expression, found EqualAngleBracketRight
                '-', '+', => {
                          ~~

spaces around ..

-        return input[i + self.offset - self.count .. i + self.offset];
+        return input[i + self.offset - self.count..i + self.offset];

switch cases: newlines or same line? [fixed]

-                '"', '\\', '/', 'b', 'f', 'n', 'r', 't' => {
+                '"',
+                '\\',
+                '/',
+                'b',
+                'f',
+                'n',
+                'r',
+                't' => {
-            0, 1 => return  y,          // atan(+-0, +...)
-            2    => return  pi,         // atan(+0, -...)
-            3    => return -pi,         // atan(-0, -...)
+            0,
+            1 => return y, // atan(+-0, +...)
+            2 => return pi, // atan(+0, -...)
+            3 => return -pi, // atan(-0, -...)

nested struct initialization [fixed]

-        return Address {
-            .os_addr = posix.sockaddr {
-                .in = posix.sockaddr_in {
-                    .family = posix.AF_INET,
-                    .port = std.mem.endianSwapIfLe(u16, port),
-                    .addr = ip4,
-                    .zero = []u8{0} ** 8,
-                },
-            },
-        };
+        return Address{ .os_addr = posix.sockaddr{ .in = posix.sockaddr_in{
+            .family = posix.AF_INET,
+            .port = std.mem.endianSwapIfLe(u16, port),
+            .addr = ip4,
+            .zero = []u8{0} ** 8,
+        } } };

space after fn and space before return type in function types

-pub fn sort(comptime T: type, items: []T, lessThan: fn(lhs: &const T, rhs: &const T)bool) void {
+pub fn sort(comptime T: type, items: []T, lessThan: fn(lhs: &const T, rhs: &const T) bool) void {

compact, aligned array literals [fixed]

-        []const u8 {  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15 },
-        []const u8 { 14, 10,  4,  8,  9, 15, 13,  6,  1, 12,  0,  2, 11,  7,  5,  3 },
-        []const u8 { 11,  8, 12,  0,  5,  2, 15, 13, 10, 14,  3,  6,  7,  1,  9,  4 },
-        []const u8 {  7,  9,  3,  1, 13, 12, 11, 14,  2,  6,  5, 10,  4,  0, 15,  8 },
-        []const u8 {  9,  0,  5,  7,  2,  4, 10, 15, 14,  1, 11, 12,  6,  8,  3, 13 },
-        []const u8 {  2, 12,  6, 10,  0, 11,  8,  3,  4, 13,  7,  5, 15, 14,  1,  9 },
-        []const u8 { 12,  5,  1, 15, 14, 13,  4, 10,  0,  7,  6,  3,  9,  2,  8, 11 },
-        []const u8 { 13, 11,  7, 14, 12,  1,  3,  9,  5,  0, 15,  4,  8,  6,  2, 10 },
-        []const u8 {  6, 15, 14,  9, 11,  3,  0,  8, 12,  2, 13,  7,  1,  4, 10,  5 },
-        []const u8 { 10,  2,  8,  4,  7,  6,  1,  5, 15, 11,  9, 14,  3, 12, 13 , 0 },
-        []const u8 {  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15 },
-        []const u8 { 14, 10,  4,  8,  9, 15, 13,  6,  1, 12,  0,  2, 11,  7,  5,  3 },
+            []const u8{
+                0,
+                1,
+                2,
+                3,
+                4,
+                5,
+                6,
+                7,
+                8,
+                9,
+                10,
+                11,
+                12,
+                13,
+                14,
+                15,
+            },
+            []const u8{
+                14,
+                10,
+                4,
+                8,
+                9,
+                15,
+                13,
+                6,
+                1,
+                12,
+                0,
+                2,
+                11,
+                7,
+                5,
+                3,
+            },
+            []const u8{
+                11,
+                8,
+                12,
+                0,
+                5,
+                2,
+                15,
+                13,
+                10,
+                14,
+                3,
+                6,
+                7,
+                1,
+                9,
+                4,
...

long expressions [fixed]

-                self.crc =
-                    lookup_tables[0][p[7]] ^
-                    lookup_tables[1][p[6]] ^
-                    lookup_tables[2][p[5]] ^
-                    lookup_tables[3][p[4]] ^
-                    lookup_tables[4][@truncate(u8, self.crc >> 24)] ^
-                    lookup_tables[5][@truncate(u8, self.crc >> 16)] ^
-                    lookup_tables[6][@truncate(u8, self.crc >>  8)] ^
-                    lookup_tables[7][@truncate(u8, self.crc >>  0)];
+                self.crc = lookup_tables[0][p[7]] ^ lookup_tables[1][p[6]] ^ lookup_tables[2][p[5]] ^ lookup_tables[3][p[4]] ^ lookup_tables[4][@truncate(u8, self.crc >> 24)] ^ lookup_tables[5][@truncate(u8, self.crc >> 16)] ^ lookup_tables[6][@truncate(u8, self.crc >> 8)] ^ lookup_tables[7][@truncate(u8, self.crc >> 0)];

comment between then block and else [fixed]

    if (i < 0x3F800000 + (1 << 23)) {
        return math.log1p(x - 1 + math.sqrt((x - 1) * (x - 1) + 2 * (x - 1)));
    }
    // |x| < 0x1p12
    else if (i < 0x3F800000 + (12 << 23)) {
        return math.ln(2 * x - 1 / (x + math.sqrt(x * x - 1)));
    }
    // |x| >= 0x1p12
    else {
        return math.ln(x) + 0.693147180559945309417232121458176568;
    }
../std/math/acosh.zig:30:5: error: Expected primary expression, found Keyword_else

hex float literals [fixed]

../std/math/exp2.zig:20:10: error: expected ',' or RBrace, found Identifier
    0x1.6a09e667f3bcdp-1,
         ~~~~~~~~~~~~~

same line comment after open brace [fixed]

-        if (ix <= 0x32800000) { // |x| < 2^(-26)
+        if (ix <= 0x32800000) {

if-else one liners [fixed]

-            .uid = if (is_windows) {} else null,
-            .gid = if (is_windows) {} else null,
+            .uid = if (is_windows) {} else
+                null,
+            .gid = if (is_windows) {} else
+                null,

same line doc comments on variable declarations [fixed]

-pub const PROT_NONE   = 0x00; /// [MC2] no permissions
+/// [MC2] no permissions
+pub const PROT_NONE = 0x00;
@andrewrk andrewrk added bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase. labels May 10, 2018
@andrewrk andrewrk added this to the 0.3.0 milestone May 10, 2018
@tiehuis
Copy link
Member

tiehuis commented May 10, 2018

Quick summary of my thoughts. In general, I just like to format things so they are < 100 lines and don't clutter each line too much.

complex if-else expressions

Multi-line is infinitely more readable. I'd only do a single line personally on a single if/else, otherwise I'd always go multi-line. Further, I'd go multi-line if longer than a rough line limit.

long function definitions

Single line is usually fine. If your function is exceedingly long that'd indicate to me a possible function api change instead on how arguments are passed.

long function calls

Sort of similar reasoning as above. If you have exceedingly long lines that'd indicate that maybe you should use a constant before or otherwise. I think the following pattern is sometimes useful:

test "y_array_arraysWithSpaces" {
    ok(
        \\[[]   ]
    );
}

spaces around ..

I like this pattern for complex expressions (more than one expression term on either lower or upper) otherwise condensed is best.

switch cases: newlines or same line?

Definitely single-line. There may be a slight argument for ranges and complex expressions to use separate lines, however.

nested struct initialization

First example is much better. Much easier to see the separate nesting and what items belong to which sub-struct.

space after fn and space before return type in function types

I like the space. Mirrors the usual function definition style simply without a name so feels more consistent.

compact, aligned array literals

Compact 100%. These aligned details are usually done for a reason and it makes it much easier to visually line up certain details (as in the example, which we can see that the indices describe some permutations).

long expressions

Could argue that we should split this into sub-expressions. Not too fussed here.

comment between then block and else

I sort of like this pattern, the alternative would be to insert comments within the block. This feels a little clunky on the first if and for me distracts from the importance. I don't think this commenting style should be used fairly often as the expressions themselves are usually obvious enough anyway.

same line comment after open brace (obvious fix)

Can disregard this I think and prefer the above style (or this one)

if-else one liners

I'd go one-line for any if/else with no else if and see how that works.

@BraedonWooding
Copy link
Contributor

I agree almost 100% with @tiehuis, except for the longer function definitions and calls; when you have a long one it does probably suggest you should change how you call it but sometimes it just comes down to long sequences of variable names like obj->parent->sibling->get(X) + obj->child->otherSibling->get(Y) in which case dropping them over multiple lines often is a better solution.

I personally also don't like the spaces around .. as it makes it very clear that your slicing an array; for example when I first saw the code it took me longer to find the .. as I was looking at the one with spaces and it wasn't so inherently clear.

Maybe some of this should be optional? Like have 'settings' kinda like how go-fmt works.

@andrewrk
Copy link
Member Author

Maybe some of this should be optional? Like have 'settings' kinda like how go-fmt works.

No settings. The point is conformity. Also go fmt does not have any settings.

@BraedonWooding
Copy link
Contributor

BraedonWooding commented May 12, 2018

go fmt does not have any settings.

For some reason I felt it did, like the ability to define { as compact or extended, and to have 4/8 spaces. Though it seems I'm wrong.

I guess my opinion is based on the fact that everyone has different style guides (even if slightly different) and that'll it'll be extremely hard to get agreement in an open source community for style.

@andrewrk
Copy link
Member Author

Now that recursion is allowed (see #1006), I refactored std.zig.render to be recursive instead of using an explicit stack. The implementation should now be more straightforward and friendly to work with, if anybody wants to make pull requests to change how zig fmt works.

@andrewrk
Copy link
Member Author

I think I know how to resolve most of these issues. Here's my idea for how to resolve lists of things where it is unclear whether to do same line or one-item-per-line: if there are > 1 item and a trailing comma in the input, then put each item on its own line. Otherwise try to put everything on 1 line (perhaps wrapping and indenting) with no trailing comma.

I don't know how to resolve the array literal problem. Maybe we need something like // zig fmt: preserve? Because there could be a situation where we want to leave array items as one per line, and another where it should be a 8x20 grid or something. @tiehuis can you think of any rules that zig fmt could use to have useful output?

andrewrk added a commit that referenced this issue May 26, 2018
@andrewrk
Copy link
Member Author

After discussing in person with @thejoshwolfe I believe we have a solution for array literals:

  • If there is exactly 1 item, the array literal is inline, with no trailing comma
  • Otherwise, each item in an array literal gets its own line and a trailing comma, unless...
pub const c_digits_lut = []u8 {
    '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', //
    '0', '7', '0', '8', '0', '9', '1', '0', '1', '1', '1', '2', '1', '3',
    '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9', '2', '0',
    '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7',
    '2', '8', '2', '9', '3', '0', '3', '1', '3', '2', '3', '3', '3', '4',
    '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7',
    '9', '8', '9', '9',
};

The empty // at the end of the first row will tell zig fmt the row size. All following rows will have the same size, and each one gets // appended so that they are uniform, like this:

pub const c_digits_lut = []u8 {
    '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', //
    '0', '7', '0', '8', '0', '9', '1', '0', '1', '1', '1', '2', '1', '3', //
    '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9', '2', '0', //
    '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', //
    '2', '8', '2', '9', '3', '0', '3', '1', '3', '2', '3', '3', '3', '4', //
    '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', //
    '9', '8', '9', '9',
};

If the first line of the array literal does not have a // empty comment at the end, then each item goes on its own line with a trailing comma. If you want to transform an array into a rectangle, you can modify only the first line and run zig fmt:

pub const c_digits_lut = []u8{
    '0', '0', '0', '1', //
    '0',
    '2',
    '0',
    '3',
    '0',
    '4',
    '0',
    '5',
    '0',
    '6',
    '0',
    '7',
    '0',
    '8',
    '0',
    '9',
};

transforms into

pub const c_digits_lut = []u8{
    '0', '0', '0', '1', //
    '0', '2', '0', '3', //
    '0', '4', '0', '5', //
    '0', '6', '0', '7', //
    '0', '8', '0', '9', //
};

No alignment is done on elements as this would introduce a dependency on unicode and be inconsistent with other not-aligned things such as variable declarations and struct declarations.

andrewrk added a commit that referenced this issue May 27, 2018
andrewrk added a commit that referenced this issue May 27, 2018
andrewrk added a commit that referenced this issue May 28, 2018
now the first row of an array literal is the hint to zig fmt
for how long each row should be.

See #1003
@andrewrk
Copy link
Member Author

I realized that the // was unnecessary. Why are we using comments to communicate line breaks, when we have line breaks to communicate line breaks?

So now the first row of an array literal sets the row size for the rest of the array. I believe this is all solved now.

Here are some conclusions I arrived at:

  • zig fmt is unaware of any column length. zig fmt does not consider the width of anything at any time.
    • However zig fmt may consider the count of items. Currently it distinguishes between 0, 1, or N.
  • An extra trailing comma after a list of something communicates to zig fmt that each item should be on its own line.
  • When there is no trailing comma, zig fmt puts everything on the same line.
  • For array literals, regardless of trailing comma or not, if there are a subset of the total number of items in the first line of the literal, that is taken as the row size, and the array literal is formatted with all rows of that size, with a trailing comma.

@gh-4
Copy link

gh-4 commented May 28, 2018

I offer the suggestion that fn foo() void { is a little harder to read than
fn foo () void { because the former resembles a function call; in the same way that if(...) is not as nice as if (...).

@tiehuis
Copy link
Member

tiehuis commented May 29, 2018

@gh-4 I'm not 100% convinced on the space after the function name. This is seems counter to how most other languages do it. If there is a way to format that is likely to be preferred or expected by the majority of users I think we should favour that. Even though we expect the formatting to be opinionated and automatic, we should strive to have a reasonable default that is at least acceptable to most.

Small aside is that an anonymous function (or declaration) fn () feels more consistent with a single space fn a() vs. the two spaces of fn a (). This point is minor though, you could argue the other way is better parsing the declaration separated by spaces anyway.

andrewrk added a commit that referenced this issue May 29, 2018
@andrewrk
Copy link
Member Author

andrewrk commented May 29, 2018

Here's how I resolved most of these issues:

if-else expressions

  • if you put the entire expression on 1 line, zig fmt leaves it that way.
  • if your expression spans more than one line, zig fmt renders it with line breaks before and after the body.

long function definitions

  • if you put a line break or a trailing comma in the definition, zig fmt renders it one parameter per line
  • if you put the entire function definition on the same line with no trailing comma, zig fmt leaves it that way

long function calls

  • if you put a trailing comma after the last argument, zig fmt renders it one parameter per line
  • if you do not have a trailing comma after the last argument, zig fmt renders the call all on one line

spaces around ..

  • if the expression on either side is an infix operation then there is a space to pad, otherwise no space

switch cases: newlines or same line?

  • if you put a trailing comma after the last item, zig fmt renders it one per line
  • if you do not have a trailing comma after the last item, zig fmt renders all the items in one line

struct initialization

  • if the struct initialization spans more than 1 line or has a trailing comma after the last item, zig fmt renders with 1 line per field and a trailing comma
  • otherwise zig fmt renders the struct initialization all on one line with no trailing comma

space after fn and space before return type in function types

long expressions

  • zig fmt respects line breaks after infix operators

comments

  • All comments are preserved and you can use a same-line comment to force a line break
  • Doc comments are moved from same-line to before

misc

  • The other stuff was bugs in the parser or obvious deficiencies. They're all fixed.

I ran zig fmt on the zig codebase and worked through the issues that came up. Now that we can run zig fmt on the codebase, I will use it to do most of the syntax changes for #770.

@andrewrk
Copy link
Member Author

Everything is solved and I ran zig fmt on the entire codebase (except a couple files, see #1030).

zig fmt is now fairly robust and any newly discovered problems should be filed as independent github issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Projects
None yet
Development

No branches or pull requests

4 participants