Wrong indentation in C comments #3757

Closed
lifepillar opened this issue Jan 3, 2019 · 4 comments
Comments

@lifepillar
Contributor

I have encountered a mildly annoying issue with the wrapping and reindentation of comments in C buffers. It can be reproduced in Vim 8.1.0650 with the comment given at the end of this message, using vim --clean and set tw=80. After typing 70G=G, the last two paragraphs are reindented as follows:

/*
 * ...
 * In decimalInfinite, the exponent is offset by 2 instead of 1, so that the
* unary prefix plus the separator is always at least two bits long. The first
* step (pre-encoding) of E is as follows: let N be the number of bits of the
* binary encoding of E+2. The pre-encoding of E is a sequence of N-1 ones
* followed by a zero followed by the trailing N-1 bits of E+2.
*
* For instance, if E=9 then the binary representation of E+2 is 1011, hence
* N=4. The pre-encoding of E is then written as 111 0 011. As another example,
	* if E=0 then N=2 (because E+2 is 10 in binary). Hence, the pre-encoding of
	* 0 is 100.
	*
	* In the second step, if T has been encoded as 0 then the pre-encoding is
	* bitwise complemented; otherwise it is left unchanged. Finally, since the
	* first bit of the encoded absolute exponent is always equal to the encoding of
	* T, the first bit of the encoded absolute exponent can be dropped (or,
			*/

The same wrong indentation happens if I keep typing at the end of the last line before the comment ends, so that the text wraps. For instance, typing 85GA equivalently (that is, appending the word "equivalently" to line 85) results in:

/*
 * ...
 * T, the first bit of the encoded absolute exponent can be dropped (or,
		 * equivalently
 */

Here is the minimal (so to speak) reproducible example:

/**
 * Encodes a decNumber into (a slight variant of) decimalInfinite format.
 *
 * The decimalInfinite format is an order-preserving binary encoding that
 * supports arbitrarily large or small decimals (i.e., any number that can be
 * represented with a finite number of digits). "Order-preserving" means that,
 * for any two decimals m and n, m < n if and only enc(m) lexicographically
 * precedes enc(n), where enc() is the decimalInfinite encoding function. This
 * property makes decimalInfinite a suitable encoding for database applications,
 * because it permits direct comparisons on encoded decimals, hence it does not
 * require anything special for indexing (i.e, B-trees on encoded decimals may
 * be used).
 *
 * The format is fully described in:
 *
 * >   Fourny, Ghislain
 * >   decimalInfinite: All Decimals In Bits. No Loss. Same Order. Simple.
 * >   arXiv:1506.01598v2
 * >   17 Jun 2015
 *
 * Zero and special numbers (+Inf, -Inf, -NaN, +NaN) are encoded in one byte;
 * all other numbers occupy two or more bytes.
 *
 * - -Inf is encoded as       00 000000.
 * - Minus zero is encoded as 01 000000.
 * - Plus zero is encoded as  10 000000.
 * - +Inf is encoded as       110 00000.
 * - -NaN is encoded as       111 00000.
 * - +NaN is encoded as       1111 0000.
 *
 * (Note that the ordering does not apply to NaNs.)
 *
 * The encoding of all finite, negative, non-zero decimals starts with 00.
 * The encoding of all finite, positive, non-zero decimals starts with 10.
 * Since all finite numbers are encoded in two or more bytes, all negative
 * numbers are strictly between -Inf and -0, and all positive numbers are
 * strictly between +0 and +Inf.
 *
 * It is well known that every real number can be written in the form:
 *
 *   S x M x 10^(T x E)
 *
 * where S is the sign (-1 or +1); M is the significand, which is a real in the
 * interval [0,10); T is the sign of the exponent (-1 or +1); and E is a natural
 * number (the absolute value of the exponent).
 *
 * decimalInfinite represents each number by encoding each of S, T, E, and
 * M separately, then concatenating the results in that order (of course, this
 * assumes that M has a finite number of digits). We have already said that S is
 * encoded as either 00 (-1) or 10 (+1). The sign of the exponent T is encoded
 * in the third bit, according to the following scheme:
 *
 * 000: negative sign, non-negative exponent;
 * 001: negative sign, negative exponent;
 * 100: positive sign, negative exponent;
 * 101: positive sign, non-negative exponent.
 *
 * The absolute value of the exponent E is encoded in two steps using a modified
 * Gamma code. Gamma codes are variable-length self-delimiting encodings (i.e.,
 * prefix codes) of the natural numbers. The basic idea is to encode the length
 * of a binary number in unary followed by the binary representation of the
 * number. For instance, 9 (1001 in binary) might be encoded (in a sub-optimal
 * way) as 1111 0 1001 (the zero in the middle acting as a separator, signaling
 * where the unary encoding of the number's length stops). Gamma codes are
 * cleverer, though: if the number is offset by 1, it can be assumed that the
 * most significant bit of its binary representation is always 1, hence it is
 * not necessary to encode such bit. So, the Gamma code of 9 is in fact just
 * 111 0 010 (i.e., the naïve encoding of 9+1 with the leading 1 dropped). It
 * is easy to verify that this Gamma code is order-preserving.
 *
 * In decimalInfinite, the exponent is offset by 2 instead of 1, so that the
 * unary prefix plus the separator is always at least two bits long. The first
 * step (pre-encoding) of E is as follows: let N be the number of bits of the
 * binary encoding of E+2. The pre-encoding of E is a sequence of N-1 ones
 * followed by a zero followed by the trailing N-1 bits of E+2.
 *
 * For instance, if E=9 then the binary representation of E+2 is 1011, hence
 * N=4. The pre-encoding of E is then written as 111 0 011. As another example,
 * if E=0 then N=2 (because E+2 is 10 in binary). Hence, the pre-encoding of
 * 0 is 100.
 *
 * In the second step, if T has been encoded as 0 then the pre-encoding is
 * bitwise complemented; otherwise it is left unchanged. Finally, since the
 * first bit of the encoded absolute exponent is always equal to the encoding of
 * T, the first bit of the encoded absolute exponent can be dropped (or,
 */

Sorry for the long example, but if I delete text somewhere else in the comment the issue may not be reproducible, and I haven't been able to find a pattern so far.

@tonymec

tonymec commented Jan 3, 2019

In the first case, the first wrongly indented line has the word "if" near the beginning. In the second case, the line before the newly inserted line has "or" near the end. Maybe the indent script doesn't realize that we are within a comment?

Try using :syn sync fromstart in that file before reformatting. Do you get the same wrong result?

Best regards,
Tony.

@lifepillar
Contributor Author

The :syn command does not change anything: I still get the wrong indentation.

@brammool
Contributor

brammool commented Jan 3, 2019 via email

@lifepillar
Contributor Author

That fixed it. Thanks, I didn't know about cino- options!
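
The reply that suggested the fix is not reproduced in this thread, so what follows is only a sketch under an assumption: that the relevant setting is the *N entry of 'cinoptions', which limits how many lines back Vim searches for the start of an unclosed comment (70 lines by default). The comment above is 86 lines long and the misindentation starts a little past line 70, which would be consistent with that limit. The value 200 is an arbitrary illustration, not a recommendation from the thread.

```vim
" Assumption: the cino- option in question is *N (comment search distance).
" Raise it above the length of the longest comment in the buffer.
setlocal cinoptions+=*200

" Then reindent the tail of the comment again, as in the report:
"   70G=G
```

If this is indeed the option involved, putting the setlocal line in a C filetype plugin would apply it to every C buffer; see :help cinoptions-values for the documented entries.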
