-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
some updates for PR 157 #171
Conversation
afni-rickr
commented
Apr 1, 2023
- put variable definitions at function blocks
- print int64_t using PRId64
- znz_off_t varies, so try not to print
- extensions are int size, limit size and check in read_file_text()
- call strlen() only after NULL check
- modify_field(), check length before copy
- act_disp_ci() limited to int number of elements
- formatting
@@ -5021,7 +5022,7 @@ nifti_image* nifti_convert_n2hdr2nim(nifti_2_header nhdr, const char * fname) | |||
|
|||
nim->nifti_type = (is_onefile) ? NIFTI_FTYPE_NIFTI2_1 : NIFTI_FTYPE_NIFTI2_2; | |||
|
|||
int byteOrder = nifti_short_order() ; | |||
byteOrder = nifti_short_order() ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why split the declaration and initial assignment? The codebase clearly doesn't support C89, as things like int64_t
are used unconditionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While not required by the compiler, it is certainly the convention in the code, making location of types predictable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, if that's the project's style, I can follow it in the future, and update my other pending PRs accordingly.
But I don't think that style is advantageous. It's not for nothing that C99 and C++ got rid of the old C requirement for having declarations at the top. It puts everything in a larger scope than necessary, see for example:
https://wiki.sei.cmu.edu/confluence/display/c/DCL19-C.+Minimize+the+scope+of+variables+and+functions
Tools like cppcheck and clang-tidy can warn about such things. I was going to work on such a scope reduction patch after my other cppcheck PR lands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please do not alter the style of programming in this library. Save that for a new library if you wish to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will keep the C89 style in future changes.
nifti2/nifti_tool.c
Outdated
long long new_ecode = strtoll(argv[ac], NULL, 10); | ||
if ((new_ecode == 0 && errno != 0) || new_ecode <= INT_MIN || new_ecode >= INT_MAX) { | ||
fprintf(stderr,"** invalid -add_ext parameter\n"); | ||
new_ecode = atoi(argv[ac]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why revert to atoi? It is obsolete. It results in undefined behaviour if the string can't fit into an int.
See for example:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, it looks like it is being pushed towards obsolescence, but of course, will never be. But I believe it returns 0 on error, not an undefined value. I wanted to remove the unnecessary use of long long (which is a very annoying type, since it's actual type varies). The ecode is question only has a small range of acceptable values, which will be checked. Using atoi() is not important, but going to large types is ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I believe it returns 0 on error, not an undefined value
C99 7.20.1 "If the value of the result cannot be represented, the behavior is undefined."
... use of long long (which is a very annoying type, since it's actual type varies)
That's true of all the types though, even the int
that atoi
uses.
but going to large types is ugly
Ugly is preferable to undefined behaviour though.
fprintf(stderr,"** RFT: file '%s' appears empty\n", filename); | ||
return NULL; | ||
} else if( len64 > INT_MAX ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INT32_MAX would be more correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I almost never use those terms, but would expect INT_MAX to be reasonably appropriate when comparing with ints. Either is fine. We require int to mean int32_t.
@@ -3829,8 +3834,7 @@ int modify_field(void * basep, field_s * field, const char * data) | |||
if( g_debug > 1 ) | |||
fprintf(stderr,"+d modifying field '%s' with '%s'\n", field->name, data); | |||
|
|||
size_t dataLength = strlen(data); | |||
if( !data || dataLength == 0 ) | |||
if( !data || strlen(data) == 0 ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
{ | ||
fprintf(stderr,"** FAILURE for dataset '%s'\n", nim->fname); | ||
if( data ) { free(data); data = NULL; } | ||
err++; | ||
} else if ( len64/nim->nbyper > INT_MAX ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INT32_MAX would be more correct.
} else if ( len64/nim->nbyper > INT_MAX ) { | ||
fprintf(stderr,"** %" PRId64 " is too many values to display\n", | ||
len64/nim->nbyper); | ||
if( data ) { free(data); data = NULL; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if(data)
check is unnecessary. free(NULL)
is well-defined.
@@ -6710,7 +6721,8 @@ int act_disp_ci( nt_opts * opts ) | |||
fprintf(stdout,")\n"); | |||
} | |||
|
|||
disp_raw_data(data, nim->datatype, len / nim->nbyper, space, 1); | |||
/* should we change disp_raw_data to allow for 64-bit nvalues? */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should, and I did in #165
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, though the counter was only of type int.
Still, it would be more clear to put functionality enhancement PRs separate from code cleanup ones. This looked more like "well, we've altered the type, so now it should propagate". It would be fine to go 64-bit here, but if you want that, maybe just make it a separate change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, it would be more clear to put functionality enhancement PRs separate from code cleanup ones.
I did split my -Wshorten-64-to-32
fixes into 2 PRs:
- 64 to 32 truncation warnings #157 which changed no APIs, that this PR some updates for PR 157 #171 is partly reverting
- 64 to 32 truncation warnings api changes #165 which does change APIs like
disp_raw_data
above. It's awaiting review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you would like to make the 64-bit update for this, please do not change library APIs in that PR. Modifying types in nifti_tool is okay, modifying them in the library is a bigger deal. Let's keep them separate. Any library updates should be done in the context of heading to a 4.x.x release. We should make useful updates in the current 3.x.x version, first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they are separate. That's what I meant when I said "I did split my -Wshorten-64-to-32 fixes into 2 PRs". The former is merged already, the latter I don't expect to be merged in the short term.
@@ -9258,7 +9261,8 @@ static int rci_read_data(nifti_image * nim, int64_t * pivots, int64_t * prods, | |||
|
|||
/* make sure things look good here */ | |||
if( *pivots != 0 ){ | |||
fprintf(stderr,"** NIFTI rciRD: final pivot == %lld!\n", *pivots); | |||
fprintf(stderr,"** NIFTI rciRD: final pivot == %" PRId64 "\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to remove the exclamation point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it seemed like a mistake to be there. I am not a fan of programs yelling at the users. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, way to pay attention to the little details... :)
niftilib/nifti1_io.c
Outdated
@@ -4405,15 +4405,15 @@ static int nifti_read_extensions( nifti_image *nim, znzFile fp, int remain ) | |||
return -1; | |||
} | |||
|
|||
znz_off_t posn = znztell(fp); | |||
posn = (int)znztell(fp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
znztell
returns znz_off_t
. The cast you've added could cause information loss.
You mentioned "znz_off_t varies, so try not to print" which I guess is why you made this change...
But what's wrong with using %lld
for znz_off_t
? If znz_off_t
is smaller than long long
it'll still get printed correctly.
With this change, if znztell
returns a huge number, your error message won't even print the actual thing znztell
returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
posn is required to be 358, anything else is a fatal error. And %lld will produce compiler warnings on many systems (since znz_off_t varies). Using int64_t would be okay, but seems like overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, it is not a fatal error, just a warning (since we don't care soooo much about extensions, I guess)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And %lld will produce compiler warnings on many systems (since znz_off_t varies)
Ah true, -Wformat
will warn about that. Then the most correct thing is to:
- not do the truncating cast you added
- cast the
%lld
param tolong long
to match%lld
znz_off_t posn = znztell(fp);
fprintf(stderr,"-d nre: posn = %lld, offset = %d, type = %d, remain = %d\n",
(long long)posn, nim->iname_offset, nim->nifti_type, remain);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, will do. I sill worry about getting warnings, we will see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Wformat
won't warn then, I tried.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These things can vary across systems, as the sizes vary. "long" types do not seem to be so reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Their size varies, but "%lld" means "long long" regardless of what the size of long long is.
niftilib/nifti1_tool.c
Outdated
|
||
if( !dp ) return 0; /* nothing to clear */ | ||
|
||
size_t len = strlen(dp); | ||
len = (int)strlen(dp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use size_t
here?
Sure, it's super-unlikely that any string passed to this function will be longer than INT_MAX, but still... are you trying to save 4 bytes of stack space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason size_t was added was because of warnings, and casting is generally the more simple and less impactful way to do that. The function is clearing some number of zeros from a maximum 32 character string. Unnecessarily promoting variables to higher sizes confuses the purposes of the functions. int isn't better than size_t here, but it would have been the preferred change in the first place. This is just more in line with how overall changes should be made to this mature (but not perfect) library. Unless something is actually broken, have as little impact as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason size_t was added was because of warnings
I turned on -Wshorten-64-to-32
to find bugs, not just to shut the compiler up. This function clear_float_zeros
is buggy. It does not do what it's supposed to do when the string is huge. The warning found this bug for us.
Unnecessarily promoting variables to higher sizes...
There is no promotion to higher sizes. strlen
itself returns size_t
. strlen
will correctly compute the length of a huge string. My changing the len
variable from int
to size_t
only makes it match what strlen
is already giving us. Your changing it back to int
here is discarding 50% of strlen
's result.
We can even compare the assembly of the int
vs size_t
variants. The truncation to int takes extra work, on top of just being incorrect.
Unless something is actually broken, have as little impact as possible.
I understand not wanting to add regressions to a mature codebase. We indeed have to be careful of that. But something is actually broken here, albeit not something likely to be encountered in normal use.
The fix (changing int
to size_t
) is extremely safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for bring this back to size_t
. I appreciate you being open to argument!
niftilib/nifti1_tool.c
Outdated
@@ -3885,7 +3885,9 @@ int disp_raw_data( void * data, int type, int nvals, char space, int newline ) | |||
{ | |||
float temp; | |||
memcpy(&temp, dp, sizeof(temp)); | |||
snprintf(fbuf,sizeof(fbuf),"%f", temp); | |||
snprintf(fbuf, NT_LOC_MAX_FLOAT_BUF,"%f", temp); | |||
/* terminate, since some versions might not */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never heard of any snprintf
with such a disastrous bug...
Are you perhaps thinking of the old Microsoft _snprintf
(note the leading underscore)? See:
https://stackoverflow.com/questions/7706936/is-snprintf-always-null-terminating
As we are not calling _snprintf
this change should not be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Bug" is not for us to decide regarding this, the question is what does the doc say vs what does the function do. Reading the man page many times, it is not 100% clear that it always does the termination, the help seems a bit ambiguous. But yes, I guess it was the _ version that raised the flag.
So I don't really care about this either way. Being safe is good, not mentioning M!cr*s@ft is very good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Bug" is not for us to decide regarding this
The C standard requires snprintf to always null terminate. I've updated sprintf to snprintf in many open source libraries recently, and have never seen the MS version having any problem. It was only the old non-standard underscore version that was busted.
I would remove this change, it an unnecessary complication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is the last question remaining for this PR. You can choose: keep the termination or remove it, what do you prefer? I am happy keeping it. If I have to do google searches to verify the functionality, it is preferable not to make "the next programmer" have to as well, so clarity is nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I will remove this, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I did look at the C standard, and indeed it is quite clear about the null termination, while the man page is not quite so clear. That should by my go-to reference, rather than the man pages. Thanks.
@afni-rickr can you squash the 3 commits? Otherwise history will be weird with changes done and undone. |
I was going to just do this as a new PR and a single commit, unless you know of a sneakier way to go. |
You can transform this PR into a single commit by "squashing the commits". Is that something you are unfamiliar with? If so, it's not that hard, I can walk you through it... |
Sure, I have generally avoided that. My concern is that committing a |
I would do |
Okay, will do. But to be sure, this would cause your local branch of update_157 to diverge, is that right? |
@@ -219,7 +218,7 @@ static int g_debug = 1; | |||
/* local prototypes */ | |||
static int free_opts_mem(nt_opts * nopt); | |||
static int num_volumes(nifti_image * nim); | |||
static char * read_file_text(const char * filename, int64_t * length); | |||
static char * read_file_text(const char * filename, int * length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why revert this one (and some others) back to int?
Is it because you still consider this part of the API?, even though it is:
static
and thus private to the library- has no prototype in the header file
I only have a local branch to do code review on my own computer, for big changes I find the github UI not great. |
Ah, there were a few other changes I was going to suggest, but as you've already merged, I'll make a new PR... |