-
-
Notifications
You must be signed in to change notification settings - Fork 6.1k
[RDY] misc2.c -Wconversion changes #907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -489,23 +489,21 @@ time_t get8ctime(FILE *fd) | |||
* Read a string of length "cnt" from "fd" into allocated memory. | |||
* Returns NULL when unable to read that many bytes. | |||
*/ | |||
char_u *read_string(FILE *fd, int cnt) | |||
uint8_t *read_string(FILE *fd, int cnt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
char_u
is the same as uint8_t
. If this type is going to change it should be changed to char
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you could create a READ_STRING
macro that casts the return value of read_string
and use it in place of read_string
until all code can migrate from char_u
to char
.
Yeah, I changed it because of the discussion in #459. I also specifically kept it unsigned because under the hood we are reading an unsigned char from a file and gets() casts it to int. This could cause an overflow, unless I am missing something obvious. I'm a bit rusty though. |
So changing it to char wouldn't change the representation at all which is what matters. And read_string is supposed to, ideally, return a string. In that case, since it should only be characters (given it is reading a string and not bytes), changing it to char is appropriate. |
/* Read the string. Quit when running into the EOF. */ | ||
for (i = 0; i < cnt; ++i) { | ||
for (i = 0; i < cnt; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move the int i
declaration to here (we use C99 in Neovim).
This should be better. It seemed to me that cnt should be size_t, so I also changed that. Thoughts? Also, I force pushed this to my branch to keep the history clean. Is that acceptable if the pull request is not [WIP]? |
/* Read the string. Quit when running into the EOF. */ | ||
for (i = 0; i < cnt; ++i) { | ||
c = getc(fd); | ||
char *str = xmallocz(cnt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use uint8_t
. Cast to char *
on return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philix I'm just curious why you think this is better? There's no math happening with the string, just straight comparisons, and using char *
avoids a cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that using char
here works the same as uint8_t
, however it would
be nicer to use an unsigned type as that's what getc
returns inside the
int
.
Character values are returned as an unsigned char converted to an int. If
the stream is at end-of-file or a read error occurs, the routines return
EOF.
On Wed, Jul 2, 2014 at 8:17 AM, John Szakmeister notifications@github.com
wrote:
In src/nvim/misc2.c:
{
- int i;
- int c;
- char_u *str = xmallocz(cnt);
- /* Read the string. Quit when running into the EOF. */
- for (i = 0; i < cnt; ++i) {
- c = getc(fd);
- char *str = xmallocz(cnt);
@philix https://github.com/philix I'm just curious why you think this
is better? There's no math happening with the string, just straight
comparisons, and using char * avoids a cast.—
Reply to this email directly or view it on GitHub
https://github.com/neovim/neovim/pull/907/files#r14451544.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that using
char
here works the same asuint8_t
, however it would be nicer to use an unsigned type as that's whatgetc
returns inside theint
.
I'm still trying to deduce what you mean by "nicer"--sorry! Is your underlying argument here is that it could prevent a subtle bug because a platform could (theoretically) use a different mechanism for representing numbers, which could cause the character read to be wrong? Or is the argument really just that it is more consistent with what's happening under-the-hood?
I only ask because avoiding casts is nice, since they can hide bugs. I would've swung the other way here said it should be char *
. So I'm trying to understand your philosophy about this so that I can either amend my vision or yours. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it is more consistent with what's happening under-the-hood. getc
returns an int
, so there's a cast to char
or uint8_t
anyway.
I suggested casting on return when he changed everything to uint8_t
. Then
he changed everything to char *
, force pushed, and my suggestion became
less justifiable. :)
On Wed, Jul 2, 2014 at 11:34 AM, John Szakmeister notifications@github.com
wrote:
In src/nvim/misc2.c:
{
- int i;
- int c;
- char_u *str = xmallocz(cnt);
- /* Read the string. Quit when running into the EOF. */
- for (i = 0; i < cnt; ++i) {
- c = getc(fd);
- char *str = xmallocz(cnt);
I know that using char here works the same as uint8_t, however it would
be nicer to use an unsigned type as that's what getc returns inside the
int.I'm still trying to deduce what you mean by "nicer"--sorry! Is your
underlying argument here is that it could prevent a subtle bug because a
platform could (theoretically) use a different mechanism for representing
numbers, which could cause the character read to be wrong? Or is the
argument really just that it is more consistent with what's happening
under-the-hood?I only ask because avoiding casts is nice, since they can hide bugs. I
would've swung the other way here said it should be char *. So I'm trying
to understand your philosophy about this so that I can either amend my
vision or yours. :-)—
Reply to this email directly or view it on GitHub
https://github.com/neovim/neovim/pull/907/files#r14459871.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original reasoning for using uint8_t was that we are returning bytes from a file, and in most systems bytes are probably going to be 8 unsigned bits.
But the method is actually an interface to get a string, so it is more accurate to return characters from it. I agree that is correct. :) The caller is going to assume they are going to get a string back. Since the data type used to read the characters is unsigned char, as per the man page for getc, I realized that it is possible, though very unlikely, that the value returned by getc may not fit in uint8_t at some point in the future. That is why I changed everything to char. It is guaranteed to fit the data, and the representation will be the same when the data is used by the caller.
Alright, this should be good unless we decide to stick with a char variant internally as opposed to uint8_t to be consistent with getc(). Either one seems reasonable. |
LGTM. Sorry for the nitpicking and confusion. |
No problem! I appreciate it. It has been a while since I have used C, so I need the criticism :) |
*/ | ||
char_u *read_string(FILE *fd, int cnt) | ||
/// Read a string of length "cnt" from "fd" into allocated memory. | ||
/// Returns NULL when unable to read that many bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small corrections:
/// Reads a string of length "cnt" from "fd" into allocated memory.
/// @return pointer to the string or NULL when unable to read that many bytes.
ref. http://neovim.org/develop/style-guide.xml#Function_Comments
That should tidy it up then. |
One last nitpick. The commit that enables conversion should not be separate, because in general the idea is that any given commit should be a "good build". |
We discussed it here #876 (comment). @aktau proposed another method but I am still in favor of doing these things in one commit. Sorry for the yet another nitpick. I've should describe my changes to comment clearer. One of these was "Read" -> "Reads" because of "These comments should be descriptive ("Opens the file") rather than imperative ("Open the file"); the comment describes the function, it does not tell the function what to do." - from previously referenced paragraph in style guide. |
*/ | ||
char_u *read_string(FILE *fd, int cnt) | ||
/// Reads a string of length "cnt" from "fd" into allocated memory. | ||
/// @return pointer to the string or NULL when unable to read that many bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pointer to string" is redundant, just say "string" next time.
Merged. Thanks @siler ! |
Related neovim#907 Also junegunn/fzf#1750 function! RipgrepFzf(query, fullscreen) let command_fmt = 'rg --column --line-number --no-heading --color=always --smart-case %s || true' let initial_command = printf(command_fmt, shellescape(a:query)) let reload_command = printf(command_fmt, '{q}') let options = {'options': ['--phony', '--query', a:query, '--bind', 'change:reload:'.reload_command]} call fzf#vim#grep(initial_command, 1, options, a:fullscreen) endfunction command! -nargs=* -bang RF call RipgrepFzf(<q-args>, <bang>0)
First go at helping out. Hopefully misc2.c goes away, but in the mean time I fixed up the one method in it that had -Wconversion errors, per #567.
xmallocz zeroes the string at the end, no need to double up on nulling it.
After looking, it seemed like a cnt of int was adequate, no callers used larger sizes. Still, should it be size_t?