Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

publish bison version number in generated output #55

Closed
bazsi opened this issue Oct 22, 2020 · 14 comments
Closed

publish bison version number in generated output #55

bazsi opened this issue Oct 22, 2020 · 14 comments

Comments

@bazsi
Copy link

bazsi commented Oct 22, 2020

I was trying to migrate from %name-prefix to api.prefix while trying to support the as-of-today versions of bison in various distributions.

I ran into one issue, which I have created a workaround for, but since this was pain to fix, let me suggest some additions to the generated grammar file. The suggestion is to publish

YYBISON_VERSION_MAJOR 3
YYBISON_VERSION_MINOR 5

macros, containing the major/minor version numbers. These would be in addition to the already existing YYBISON_VERSION macro.

Rationale:

YYBISON_VERSION as of today is a string representation of the bison version number, thus it does not allow me to create conditional code blocks in my grammar, which I needed in order to resolve a compatibility issue.

The issue I had was:

  • I am using YYEMPTY identifier in my rules, which in bison 3.5.1 was named YYEMPTY even if I used api.prefix.
  • starting with 3.6.1, YYEMPTY got renamed to _EMPTY when api.prefix is used
  • to make my grammar work with all bison versions, I #defined _EMPTY to YYEMPTY in case I was using 3.5.1

In order to have the version used to generate the .c file, I needed some sed magic in my Makefile rules to insert this information.

.y.c:
        $(AM_V_YACC)$(am__skipyacc) $(SHELL) $(YLWRAP) $< y.tab.c $@ y.tab.h $*.h y.output $*.output -- $(YACCCOMPILE)
        $(AM_V_GEN) sed -i -e '1i #define SYSLOG_NG_BISON_MAJOR @bison_version_major@\n#define SYSLOG_NG_BISON_MINOR @bison_version_minor@'  $@

Here's the code in the upstream codebase I needed: https://github.com/syslog-ng/syslog-ng/blob/059c1154ab87833bfb2598600c20ede4f09a8aaf/lib/rewrite/rewrite-expr-grammar.ym#L48-L55

Just in case you are interested, the migration from the old style to the new one took us almost 2 years, as documented in this pull request: syslog-ng/syslog-ng#2526

@akimd
Copy link
Owner

akimd commented Nov 1, 2020

Hi @bazsi
Sorry for the delays, I didn't get any notification :(

I have a few questions:

  • why do you actually use YYEMPTY? In most cases, it's quite an implementation detail. Using github's search engine, I could find where you actually depended on this symbol.

  • starting with 3.6.1, YYEMPTY got renamed to _EMPTY when api.prefix is used: I guess you didn't mean _EMPTY, right?

  • why don't you %require "3.6" and avoid all this portability hassle? Your releases can just ship the generated files, and you're done.

the migration from the old style to the new one took us almost 2 years

Wow... Don't hesitate getting in touch with us sooner!

Cheers!

@akimd
Copy link
Owner

akimd commented Nov 7, 2020

@bazsi Ping!

@bazsi
Copy link
Author

bazsi commented Nov 7, 2020

Sorry, this scrolled away in my inbox and forgot to answer.

I have a few questions:

  • why do you actually use YYEMPTY? In most cases, it's quite an implementation detail. Using github's search engine, I could find where you actually depended on this symbol.

I am not sure you remember but we are using multiple grammars in syslog-ng with the same lexer instance. The reason is that the parsing of the syslog-ng configuration file sections are delegated to plugins. This means that the main grammar would invoke the grammar of the plugin to perform the parsing of the respective plugin.

What can happen is that when this plugin grammar ends (for instance via YYACCEPT or simply when the grammar processes its first rule) it might have looked up a token that is stored in yychar. This is just basic look ahead, where the new symbol is used to decide if the grammar is finished or not.

If we return to the parent grammar at that point this token is lost.

This we inject this token back with a code fragment like this:

if (yychar != AFSOCKET_EMPTY)
  cfg_lexer_unput_token(lexer, &yylval);
YYACCEPT;
  • starting with 3.6.1, YYEMPTY got renamed to _EMPTY when api.prefix is used: I guess you didn't mean _EMPTY, right?

No I had PREFIX there but probably github chopped it off.

  • why don't you %require "3.6" and avoid all this portability hassle? Your releases can just ship the generated files, and you're done.

We do ship generated grammar files.

However it is common practice that distros potentially patch our grammar files and they need bison to regenerate that. We already decided to increase our requirement but only up to 3.5.1 because ubuntu focal is still at that version.

the migration from the old style to the new one took us almost 2 years

Wow... Don't hesitate getting in touch with us sooner!

It wasn't much you could do, we were mostly waiting for new versions to enter into various distros. And then working around incompatibilities within the broad set of biosn versions we wanted to support.

@akimd
Copy link
Owner

akimd commented Nov 7, 2020

Hi @bazsi

starting with 3.6.1, YYEMPTY got renamed to PREFIX_EMPTY when api.prefix is used

Actually, that's 3.6. Changes in the third number are only bug fixes and are not expected to have such a large effect.

I am not sure you remember but we are using multiple grammars in syslog-ng with the same lexer instance.

Yep, I remember.

What can happen is that when this plugin grammar ends (for instance via YYACCEPT or simply when the grammar processes its first rule) it might have looked up a token that is stored in yychar. This is just basic look ahead, where the new symbol is used to decide if the grammar is finished or not.

If we return to the parent grammar at that point this token is lost.

Gosh… That's a tricky one. I should really look at your set up at some point, and see what solution we could provide.

This we inject this token back with a code fragment like this:

if (yychar != AFSOCKET_EMPTY)
  cfg_lexer_unput_token(lexer, &yylval);
YYACCEPT;

That makes perfect sense!

We do ship generated grammar files.

However it is common practice that distros potentially patch our grammar files and they need bison to regenerate that. We already decided to increase our requirement but only up to 3.5.1 because ubuntu focal is still at that version.

I understand.

the migration from the old style to the new one took us almost 2 years

Wow... Don't hesitate getting in touch with us sooner!

It wasn't much you could do, we were mostly waiting for new versions to enter into various distros. And then working around incompatibilities within the broad set of bison versions we wanted to support.

I understand your concern.

So let's address your real question.

I've considered adding such version macros long long ago, but never really encountered a case where this was truly needed. We maintain backward compatibility. But accidents do happen, unfortunately. And in the present case, with 64aec0a and the following ones, I had not anticipated it could be a problem for anybody (I didn't have much choice actually, since anyway that token type is now exposed in the header, so it must be subject to api.prefix).

Anyway. This should not happen, so it should not be needed. Besides, some languages such as Java do not have equivalent to macros. And each time a backward issue was discovered, we found ways to address it in a version-independent way. In your case, I believe you could have used

#ifndef YYEMPTY
# define YYEMPTY PREFIX_EMPTY
#endif

and use YYEMPTY only (until you %require "3.6").

So I never wanted to cross this line, and I still want to avoid it.

That being said, if you insist this is needed, I'll do it for yacc.c.

Cheers!

@bazsi
Copy link
Author

bazsi commented Nov 7, 2020 via email

@akimd
Copy link
Owner

akimd commented Nov 8, 2020

YYEMPTY is an enum at least in newer versions and not a macro. I've tried to find a reliable way of coming up with a solution that works across a set of bison versions, but the only one that seemed to work was to detect it at build time and export as defines. YYEMPTY might be a macro prior to 3.6, so ifndef might work to detect if we are on a new version.

Yes, it was a macro.

Cheers!

@akimd
Copy link
Owner

akimd commented Nov 8, 2020

On the other hand, these version macros could be useful for the next such case. And we already have a string value containing this value, but that's not useful for conditional compilation.

I fully agree, these macros never made sense to me. I see them as utterly useless, and would be very happy to get rid of them.

So, are you asking for the numeric version?

@bazsi
Copy link
Author

bazsi commented Nov 8, 2020 via email

akimd added a commit that referenced this issue Nov 11, 2020
Suggested by Balazs Scheidler.
#55

* src/muscle-tab.c (muscle_init): Move/rename `b4_version` to/as...
* src/output.c (prepare): `b4_version_string`.
Also define `b4_version`.
* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/d.m4,
* data/skeletons/java.m4: Adjust.
* doc/bison.texi: Document it.
@akimd
Copy link
Owner

akimd commented Nov 11, 2020

Balazs, I intend to release 3.7.4 this weekend, but I would like to make sure it addresses your concerns. Could you please give a shot at:

https://www.lrde.epita.fr/~akim/private/bison/bison-3.7.3.15-f4e85.tar.gz
https://www.lrde.epita.fr/~akim/private/bison/bison-3.7.3.15-f4e85.tar.lz
https://www.lrde.epita.fr/~akim/private/bison/bison-3.7.3.15-f4e85.tar.xz

The YYBISON macro is now a version number.

Thanks a lot!

@bazsi
Copy link
Author

bazsi commented Nov 11, 2020 via email

@akimd
Copy link
Owner

akimd commented Nov 11, 2020

I would use a hexadecimal number so the major/minor/patch values can be extracted.

I fail to see how decimal and hexadecimal would differ here.

And I think this should be 0x030703 instead.

Care to argument?

#define YYBISON_VERSION_MAJOR ((YYBISON >> 16) & 0xFF)
#define YYBISON_VERSION_MINOR ((YYBISON >> 8) & 0xFF)
#define YYBISON_VERSION_PATCH ((YYBISON) & 0xFF)

I don't see any value in these macros. What kind of condition do you think you need to write that would make sense with them? In my experience, simple numbers suffice and are way more easy to manipulate and understand.

Also, I think I would use a bit more specific name, like YYBISON_VERSION_VALUE

Well, __GNUC__ is the version of GCC, I don't see a problem with that. I would have happily used YYBISON_VERSION, but it's already taken for something else :-(

This is how this looks like in syslog-ng with the current name/decimal encoding:

#if YYBISON / 10000 == 3 && (YYBISON % 10000) / 100 <= 5

Why do you need to specify == 3? Most of the time conditions such as

#if 30704 <= YYBISON && YYBISON < 30800

suffice.

Cheers!

@bazsi
Copy link
Author

bazsi commented Nov 12, 2020

You are right. I just always used hex numbers for this purpose, but the ifdef in your example makes sense. So I am okay with using this as is.

@akimd
Copy link
Owner

akimd commented Nov 14, 2020

Bison 3.7.4 was released with that patch. Cheers!

@akimd akimd closed this as completed Nov 14, 2020
@bazsi
Copy link
Author

bazsi commented Nov 14, 2020

Thanks a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants