Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use unnamed structs/union to make it compatible with GCC 4 #604

Closed
wants to merge 1 commit into from

Conversation

ret2libc
Copy link
Contributor

I know GCC 4 is a very old version, but it would be great for a project I work on to be able to build there as well. We recently included tree-sitter compiled by default and I noticed there are some constructs that do not work well with such a old gcc (used for example in RHEL6/Centos 6).

I understand those unnamed structs/union are probably handy as they avoid too long names, but I think it would be great to make tree-sitter compile on a wider range of systems.

I hope you can consider such inclusion. For the names of those unnamed unions/structs I am very open to suggestion :) Thanks!

@dcreager
Copy link
Contributor

I love the idea of being able to support earlier compilers, especially since the change is pretty non-invasive. My only concern here is that it's a backwards-incompatible change to the API, and would require a careful rollout of a new version. (We might be able to avoid a soname bump since we haven't technically cut a version since we added the Makefile (#602), but existing released versions of each language grammar would still try to use the old unnamed-field types.)

@dcreager
Copy link
Contributor

existing released versions of each language grammar would still try to use the old unnamed-field types

Actually this is not quite right; each released language grammar has its own copy of tree-sitter/parser.h, and the type layouts are the same, so we're probably safe with existing grammars "working" with a new version of the tree-sitter runtime library that includes these changes. But we'd also have to re-generate and re-release each language grammar that you'd want to use with this update. @maxbrunsfeld can you check my work on this?

@ret2libc
Copy link
Contributor Author

@dcreager you want newer tree-sitter libs to support grammars developed with older versions, right?

@maxbrunsfeld
Copy link
Contributor

maxbrunsfeld commented Apr 24, 2020

Yes, I think this is ok from a backwards-compatibility standpoint, because as you said, the type layouts are the same.

@ret2libc Just so that I understand - I thought that even GCC 4 could compile C11 features, it would just need to explicitly opt into that version of the language via -std=gnu11. Is that not the case? Are anonymous fields not supported by GCC 4 even in gnu11 mode?

Looking in the GCC 4.0.4 docs, it looks like unnamed fields were implemented in that version: https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Unnamed-Fields.html#Unnamed-Fields.

@ret2libc
Copy link
Contributor Author

On Centos 6 there is gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), but -std=gnu11 fails with cc1: error: unrecognized command line option "-std=gnu11"

@maxbrunsfeld
Copy link
Contributor

Ok, thanks for testing that out.

From my testing on GCC 4.4.7, anonymous struct/union fields do actually work. For example, this code compiles with -Wall and runs correctly:

#include <stdio.h>

typedef struct S {
  int a;
  union {
    struct {
      int b;
      int c;
    };
    long d;
  };
  char e;
} S;

int main() {
  S s = {.a = 1, {{.b = 2, .c = 3}}, .e = '4'};
  printf("a: %d, b: %d, c: %d, e: %c\n", s.a, s.b, s.c, s.e);
  s.d = 5;
  printf("a: %d, d: %ld, e: %c\n", s.a, s.d, s.e);
  return 0;
}

I think the only problem for Tree-sitter is that when creating an instance of these structs using composite literal syntax, the syntax is slightly more strict: you have to wrap the nested fields in curly braces, instead of being able to treat them as fields of the parent struct.

So, in this example ☝️ , you have to do this:

S s = {.a = 1, {{.b = 2, .c = 3}}, .e = '4'};

instead of this:

S s = {.a = 1, .b = 2, .c = 3, .e = '4'};

And you need to put the nested struct in the right order.


I think that in most cases, my preference would be to leave the type definitions unchanged, and to only change the places where composite literal syntax is used, rather than changing the type definitions and all of the usage sites of the fields. Does that seem reasonable to you?

It's also possible that I'm misunderstanding the level of incompatibility, and we actually do need to change the types. Let me know if that's the case.

@ret2libc
Copy link
Contributor Author

@maxbrunsfeld thanks for testing that, I will try it out and let you know!

@ret2libc
Copy link
Contributor Author

@maxbrunsfeld i'm having an hard time with:

#include <stdbool.h>
#include <stdlib.h>
#include <stdint.h>

typedef struct {
  union {
    struct {
      int state;
      bool extra;
      bool repetition;
    };
    struct {
      int symbol;
      int16_t dynamic_precedence;
      uint8_t child_count;
      uint8_t production_id;
    };
  } params;
  int type;
} TSParseAction;

typedef union {
  TSParseAction action;
  struct {
    uint8_t count;
    bool reusable;
  } ;
} TSParseActionEntry;

int main(int argc, char **argv) {
        TSParseActionEntry t = { {.count = 0, .reusable = true} };
}

probably the compiler tries to match count and reusable in TSParseAction. Any idea on how to workaround this? This is required for the parser.c file.

@ret2libc
Copy link
Contributor Author

I think before version GCC 4.6 there was some problems with regard to unnamed struct inside a union.

@maxbrunsfeld
Copy link
Contributor

Ok. Let's give that struct field a name then. Maybe just entry:

typedef union {
  TSParseAction action;
  struct {
    uint8_t count;
    bool reusable;
  } entry;
} TSParseActionEntry;

// ...

TSParseActionEntry t = {.entry = {.count = 0, .reusable = true}};

@maxbrunsfeld
Copy link
Contributor

Closing this one out since #607 is superseding it. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants