Skip to content

Fix a C portability problem#553

Merged
tuxji merged 1 commit intoapache:masterfrom
tuxji:ji/daffodil-2507
May 5, 2021
Merged

Fix a C portability problem#553
tuxji merged 1 commit intoapache:masterfrom
tuxji:ji/daffodil-2507

Conversation

@tuxji
Copy link
Contributor

@tuxji tuxji commented May 4, 2021

Some older C compilers (e.g., gcc 4.8.2 on CentOS 7) don't allow a
local variable to be declared inside a for loop condition. It's much
easier to move the declaration outside the for loop condition than it
is to override the compiler with a different driver name or the
correct driver options.

DAFFODIL-2507

@tuxji
Copy link
Contributor Author

tuxji commented May 4, 2021

All checks have passed, ready for review now.

Copy link
Member

@stevedlawrence stevedlawrence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

size_t i;
for (i = 0; i < count && !error; i++)
{
const size_t offset = offsets[i];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a setting we can enable so that we can error when we add non-portable code? Even if we only set it for GitHub actions so we can catch it before it's merged if there are concerns about enabling it all time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I realized there is a setting we can use: -pedantic -std=gnu99. I've enabled it, found some more problems, and fixed them in my latest commit. The -pedantic option turns on some warnings about code that doesn't conform strictly to the C standard. The -std=gnu99 option checks the C99 standard with GNU extensions, allowing us to keep using <unistd.h> for getopt and so on for all the functions that aren't available with -std=c99. Note that there's nothing really GNU-ish about getopt, it's more of a Unix / POSIX extension than a GNU extension.

BUILD.md Outdated
You can use the `yum` package manager to install most of the tools
needed to build Daffodil:

sudo dnf install gcc git java-11-openjdk-devel llvm make pkgconfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be yum instead of dnf? Does CentOS 7 have dnf? Should this also be gcc instead of clang like we suggest for fedora, or is clang not even available?

Copy link
Contributor Author

@tuxji tuxji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To follow up on the dnf comment (outdated now so I had to comment separately here), yes, that was a mistake and I've corrected it in the BUILD.md. The latest and greatest commit should work portably on both old and modern C compilers now and we should see warnings now if we use anything that C99 doesn't support.

size_t i;
for (i = 0; i < count && !error; i++)
{
const size_t offset = offsets[i];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I realized there is a setting we can use: -pedantic -std=gnu99. I've enabled it, found some more problems, and fixed them in my latest commit. The -pedantic option turns on some warnings about code that doesn't conform strictly to the C standard. The -std=gnu99 option checks the C99 standard with GNU extensions, allowing us to keep using <unistd.h> for getopt and so on for all the functions that aren't available with -std=c99. Note that there's nothing really GNU-ish about getopt, it's more of a Unix / POSIX extension than a GNU extension.

@tuxji
Copy link
Contributor Author

tuxji commented May 5, 2021

Still need another +1.

Copy link
Contributor

@mbeckerle mbeckerle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
This is all fine. I just thought we were dealing with modern C compilers here. The inability to treat a struct as a typedef bugs the crap out of me. If that isn't really needed I'd prefer the typedefs.

typedef const Error *(*VisitStartComplex)(const VisitEventHandler *handler, const InfosetBase *base);
typedef const Error *(*VisitEndComplex)(const VisitEventHandler *handler, const InfosetBase *base);
typedef const Error *(*VisitNumberElem)(const VisitEventHandler *handler, const ERD *erd, const void *number);
struct ERD;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprising to me that this typedef isn't portable. Seriously you can't suppress the need to use the struct keyword everywhere you use these types? Ugh.

Can we upgrade our usage to a newer still-standard revision of the C language that has this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind upgrading to C11 as a minimum requirement. C11 would make practically all of those changes unnecessary because C11 allows us to redefine typedefs, declare a local variable in a for loop's condition, and define anonymous unions (error.c instead of error.arg.c). However, we would have to make CodeGenerator.scala pass -std=gnu11 to the C compiler when daffodil runs TDML tests. That option is supported and understood by the gcc and clang versions I tested on CentOS 7 and they compile the code without complaining.

The problem is that I don't believe that -std=gnu11 is a standard option which every C compiler understands or that every C compiler in use now can compile C11-level code with getopt and other Unix functions. Once you start building for an embedded platform, you might be using an esoteric C compiler not derived from gcc or clang anymore. On a HPC machine, you might be using Intel's C compiler. I just googled to find documentation for Intel's C compiler. It accepts the -std option (here's the page), but it goes up only as high as gnu99 (and c11, but I don't think -std=c11 would let us call getopt and company just like it doesn't in clang and gcc).

I think it's quite possible a user might have a C compiler which won't accept -std=gnu11, and it might compile only C99-level code if we pass only -I and -o options (like what happens with both gcc and clang on CentOS 7). My fear is that we'd still need to make these changes later anyway, so we might as well merge these changes to save us time. The safest thing for a portable daffodil installation would be to pass only -I and -o options to unknown C compilers when running TDML tests on users' platforms, although we could make CodeGenerator.scala read a CFLAGS environment variable and add its value to the compiler options too. Thoughts?

Copy link
Contributor Author

@tuxji tuxji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts?

typedef const Error *(*VisitStartComplex)(const VisitEventHandler *handler, const InfosetBase *base);
typedef const Error *(*VisitEndComplex)(const VisitEventHandler *handler, const InfosetBase *base);
typedef const Error *(*VisitNumberElem)(const VisitEventHandler *handler, const ERD *erd, const void *number);
struct ERD;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind upgrading to C11 as a minimum requirement. C11 would make practically all of those changes unnecessary because C11 allows us to redefine typedefs, declare a local variable in a for loop's condition, and define anonymous unions (error.c instead of error.arg.c). However, we would have to make CodeGenerator.scala pass -std=gnu11 to the C compiler when daffodil runs TDML tests. That option is supported and understood by the gcc and clang versions I tested on CentOS 7 and they compile the code without complaining.

The problem is that I don't believe that -std=gnu11 is a standard option which every C compiler understands or that every C compiler in use now can compile C11-level code with getopt and other Unix functions. Once you start building for an embedded platform, you might be using an esoteric C compiler not derived from gcc or clang anymore. On a HPC machine, you might be using Intel's C compiler. I just googled to find documentation for Intel's C compiler. It accepts the -std option (here's the page), but it goes up only as high as gnu99 (and c11, but I don't think -std=c11 would let us call getopt and company just like it doesn't in clang and gcc).

I think it's quite possible a user might have a C compiler which won't accept -std=gnu11, and it might compile only C99-level code if we pass only -I and -o options (like what happens with both gcc and clang on CentOS 7). My fear is that we'd still need to make these changes later anyway, so we might as well merge these changes to save us time. The safest thing for a portable daffodil installation would be to pass only -I and -o options to unknown C compilers when running TDML tests on users' platforms, although we could make CodeGenerator.scala read a CFLAGS environment variable and add its value to the compiler options too. Thoughts?

Some older C compilers (e.g., gcc 4.8.5 on CentOS 7) won't allow a
local variable to be declared inside a for loop's condition since
they're still using the C89 standard, not C99.  We'll move the
declaration outside the for loop condition rather than add a compiler
driver option like -std=gnu99 to CodeGenerator.scala since some
compilers may not accept -std=gnu99.

Some older C compilers (e.g., clang 3.4.2 on CentOS 7) warn that
redefining a typedef requires C11 support.  Make infoset.h use struct
<name> where necessary instead of redefining typedef struct <name>.

Add CentOS 7 instructions to BUILD.md and update all instructions to
install both clang and gcc (sometimes gcc is needed for glibc headers
or to build mxml).

Check that our C code conforms to the C99 standard by adding -pedantic
and -std=gnu99 to compilation flags in build.sbt and Makefile.
Consider adding support for a CFLAGS environment variable to allow
developers to override "-Wall -Wextra -pedantic -std=gnu99" without
changing build.sbt or Makefile (not done in this commit).

Fix pedantic warnings about unnecessary semi-colons in parsers.c &
unparsers.c and C11-level-only anonymous union in errors.h.  Change
all references to error.<union member> to use error.arg.<union member>
instead.

DAFFODIL-2507
@tuxji tuxji force-pushed the ji/daffodil-2507 branch from b55c714 to 8994d3b Compare May 5, 2021 15:30
@tuxji tuxji merged commit 8c95b39 into apache:master May 5, 2021
@tuxji tuxji deleted the ji/daffodil-2507 branch May 5, 2021 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants