Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wip/jehan/special case set4glib #6267

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Jehan
Copy link
Contributor

@Jehan Jehan commented Nov 29, 2019

Cf. #6259.

With these commits, we bring back the set() fix yet accepting exceptionally the special-case when the value starts with a newline (as is the case in GLib) while adding a big fat warning saying it's bad usage for the function.

This way, at least GLib won't break even with recent meson, still we warn them to fix their build eventually.

@xclaesse
Copy link
Member

With this PR, glib builds fine.

Also this can be dangerous allowing unexpected code injection
Not sure how that can be dangerous, we are not talking about SQL injections.

I don't think using a template is an acceptable solution for glib.

  1. I could be wrong, but I think it means rewriting all those #mesondefine TOKEN, there are many of them, all conditional, etc.
  2. glib used to have a template, it required work to remove it a while ago, because it's much cleaner and flexible. Going back is clearly a step backward.
  3. if you have multiple blobs to inject, each depending on different conditions, I don't think that can fit the template (IIRC that's the case with glib).
  4. For all those reasons, it requires work in glib, so let's be honest, it won't be done and you'll never get ride of that deprecation in meson.

IMHO, we should provide a better recommended way in the deprecation warning. Probably by adding .write() or .verbatim().

@Jehan
Copy link
Contributor Author

Jehan commented Nov 29, 2019

With this PR, glib builds fine.

That was the point. ;-)

Not sure how that can be dangerous, we are not talking about SQL injections.

Indeed not SQL injections, but C/C++/NASM injection just before build (so even harder to detect later on!).

I could be wrong, but I think it means rewriting all those #mesondefine TOKEN, there are many of them, all conditional, etc.

If they absolutely want to keep all the normal defines and these random code in a single config.h, while not having to rewrite all the defines (which I can understand), they can still find solutions. One could be to generate the defines without configuration file in say defines.h, then having a config.h.in containing:

#include "defines.h"
@randomcode1@
@randomcode2@

(well… in this case, it's 2 files, but they just include config.h so no code change)

Or they could generate 2 files and simply concat them in a single command.

And probably other solutions.

Anyway I can have a look at this .write() thing but I still think you are overthinking a problem (which didn't exist back with other build systems and they didn't have the .write() equivalent either AFAIK).

@Jehan
Copy link
Contributor Author

Jehan commented Nov 29, 2019

Ok so I added a commit implementing a append() method on configuration objects.

Why append() instead of write(), because I figured the random code might want to make use of commonly set() (or variants) macros, for instance through #ifdef or whatnot. So all such data will be appended after all key-value have been generated in the configuration file.

I have successfully tested with the glib code of course. Basically they could just change:

  glib_conf.set ('gl_unused', gl_unused)
  glib_conf.set ('gl_extern_inline', gl_extern_inline)

to:

  glib_conf.append (gl_unused)
  glib_conf.append (gl_extern_inline)

I would provide a patch but I assume they'll want to stick to their current code until their minimum meson requirement becomes the next release.

@jpakkane
Copy link
Member

Regardless of anything else, plain append is not a very good name. It should be something like append_footer to be more explicit about what it does.

mesonbuild/interpreter.py Outdated Show resolved Hide resolved
@textshell
Copy link
Contributor

Regardless of anything else, plain append is not a very good name. It should be something like append_footer to be more explicit about what it does.

If we must go with an api on the configuration_data() it should be very clear to be abstraction breaking. I've seen "verbatim" in a suggestion. Or maybe "raw".

But i still feel like this does not belong into configuration_data() because that's a abstract dictionary like thing. There was the suggestion by tim to use a new kwarg (e.g. header or footer) on configation_file() which i think is a better place. Of course that means glib needs to have an additional variable that gets chunks appended to it. But that seems doable.

@xclaesse
Copy link
Member

I think it belongs to configuration_data, it's much more natural. Also configure_file could have a command instead of a configuration_data, in which case how do you handle the footer? That seems weird.

@Jehan Jehan force-pushed the wip/Jehan/special-case-set4glib branch from 69ee469 to 1aa26c3 Compare November 30, 2019 23:18
@eli-schwartz
Copy link
Member

glib used to have a template, it required work to remove it a while ago, because it's much cleaner and flexible. Going back is clearly a step backward.

Injecting complex custom code into config.h is also unclean and non-flexible, and feels like some ill legacy design. Clean and flexible would be a beautiful standalone dict of key-value pairs and nothing else.

Adding special API to meson to do something unclean and non-flexible, simply because it is a lesser "evil" than manually maintaining a template, is not automatically the right choice. Although I guess it could be added anyway (I do however strongly caution people writing new projects from scratch to not fall into the trap of actually using it).

For all those reasons, it requires work in glib, so let's be honest, it won't be done and you'll never get ride of that deprecation in meson.

Nice, now who is being being uncooperative. :) If the append() method was not added, and glib still used the bug in 3 years from now, it would be infinitely reasonable to remove it from meson, and when glib complains, say in response "we begged you to fix it 3 years ago, we added angry deprecation messages to every single invocation of meson on the glib project, and you acknowledged it but refused on principle to do anything. Our hands are clean."

I don't understand how "neener neener, we will never cooperate so suck it up" is supposed to be a helpful discussion point here.

@eli-schwartz
Copy link
Member

if you have multiple blobs to inject, each depending on different conditions, I don't think that can fit the template (IIRC that's the case with glib).

While an append() method is obviously more convenient to use due to being able to preserve the autogeneration nature of a configure_file without input files, I cannot see any technical reason for it to be impossible to define multiple blobs and have it be activated by a define which the configuration_data uses as a switch to activate the blob. It should be no different from having a single blob activated by the same switch.

@Jehan
Copy link
Contributor Author

Jehan commented Dec 1, 2019

For all those reasons, it requires work in glib, so let's be honest, it won't be done and you'll never get ride of that deprecation in meson.

Nice, now who is being being uncooperative. :) If the append() method was not added, and glib still used the bug in 3 years from now, it would be infinitely reasonable to remove it from meson, and when glib complains, say in response "we begged you to fix it 3 years ago, we added angry deprecation messages to every single invocation of meson on the glib project, and you acknowledged it but refused on principle to do anything. Our hands are clean."

I don't understand how "neener neener, we will never cooperate so suck it up" is supposed to be a helpful discussion point here.

To be fair, I haven't seen GLib complain and saying they would not fix it or even that the bug was in meson, according to them. At least not in the GLib bug report whose link was given in an earlier comment. Was there an actual discussion about this problem with GLib devs on other channels (IRC for instance)?

Or is it all just assumptions that they would not like it and we are just all acting on this whole assumption?

Anyway, now I implemented this .append() method, and while I globally agree that the whole issue was completely overthought and that a template would have been totally acceptable IMO, it is also true that the .append() method is simpler in the end. And yeah, it removes the abstraction, but I don't think it's so bad.
Now that is implemented, I'd think we could just merge and go on with our lives. I really feel like we wasted a lot of our time on quite a simple issue, which should really have not gone overboard like it did. Is it just me?

@eli-schwartz
Copy link
Member

IDK, I thought @xclaesse was one of those. :p

Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, this is missing documentation that the API exists.

Copy link
Member

@xclaesse xclaesse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this feature, I'm happy you did the work, it's always nicer to deprecate something with a better replacement. As for the naming, IMHO .append() is nice and short.

Extra bonus point if we can have a unit test too :)

mesonbuild/mesonlib.py Outdated Show resolved Hide resolved
@jpakkane
Copy link
Member

jpakkane commented Dec 1, 2019

A possible alternative is to add a footer kwarg to configure_file like this:

configure_file(input: ...
   output: ...
   configuration: ...
   footer: '''This gets written at the end of the generated file,
line feeds and all.''')

@jpakkane
Copy link
Member

jpakkane commented Mar 1, 2020

Sorry for the delay. Going trough this now again it seems reasonable (assuming GLib people are happy, which seems to be the case). Just some minor fixes:

  • the method should be called append_footer rather than just append
  • there should be a test for the footer appending test
  • instead of raw_data the variable name should be footer_data

With those changes, this should be ready to merge.

@Jehan
Copy link
Contributor Author

Jehan commented Mar 1, 2020

Sorry all for lacks of answers (if one was needed, I have not re-read the thread). These last few months were a bit hectic and I just moved (like 2 days ago) to a new place hundreds of kms away. I hope things will settle soon, then if changes are needed on my patches, I'll do and apply them.

@eli-schwartz
Copy link
Member

Any update on this?

@eli-schwartz
Copy link
Member

Ping?

mesonbuild/interpreter.py Outdated Show resolved Hide resolved
mesonbuild/mesonlib.py Outdated Show resolved Hide resolved
mesonbuild/build.py Outdated Show resolved Hide resolved
@eli-schwartz
Copy link
Member

Hmm, what ever ended up happening to this?

@Jehan
Copy link
Contributor Author

Jehan commented Aug 13, 2021

Sorry I haven't re-looked at this for ever. I'll try to make some time.

@eli-schwartz
Copy link
Member

Looking through old issues, I stumbled across #1492 which asked for the ability to do prefix/suffix in configured files, and also suggested a potential alternative mechanism:

Add a new token type #meson_dump_configuration (or similar) that can be put in config.h.meson that will dump the configuration_data() object in that place

So at the very least the third commit in this patch series should be marked as fixing that issue... and I also wonder if maybe the #meson_dump_configuration route should be taken instead?

@codecov
Copy link

codecov bot commented Jun 7, 2022

Codecov Report

Merging #6267 (5303511) into master (1e1dee7) will decrease coverage by 27.68%.
The diff coverage is 86.66%.

@@             Coverage Diff             @@
##           master    #6267       +/-   ##
===========================================
- Coverage   68.82%   41.14%   -27.69%     
===========================================
  Files         406      203      -203     
  Lines       88062    43617    -44445     
  Branches    19560     8962    -10598     
===========================================
- Hits        60612    17945    -42667     
- Misses      22871    23932     +1061     
+ Partials     4579     1740     -2839     
Impacted Files Coverage Δ
mesonbuild/mesonlib/universal.py 61.71% <75.00%> (-17.01%) ⬇️
mesonbuild/build.py 64.00% <100.00%> (-14.37%) ⬇️
mesonbuild/interpreter/interpreterobjects.py 84.89% <100.00%> (-6.09%) ⬇️
mesonbuild/modules/qt5.py 0.00% <0.00%> (-100.00%) ⬇️
mesonbuild/modules/qt6.py 0.00% <0.00%> (-100.00%) ⬇️
mesonbuild/mesonlib/win32.py 0.00% <0.00%> (-100.00%) ⬇️
mesonbuild/scripts/msgfmthelper.py 0.00% <0.00%> (-100.00%) ⬇️
mesonbuild/scripts/run_tool.py 0.00% <0.00%> (-95.46%) ⬇️
mesonbuild/scripts/clangtidy.py 0.00% <0.00%> (-93.34%) ⬇️
mesonbuild/scripts/depscan.py 0.00% <0.00%> (-83.45%) ⬇️
... and 351 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1e1dee7...5303511. Read the comment docs.

@Jehan Jehan force-pushed the wip/Jehan/special-case-set4glib branch from 9b7dc47 to 0e3ae34 Compare June 7, 2022 18:24
@Jehan
Copy link
Contributor Author

Jehan commented Jun 7, 2022

Hi all!
Sorry for the long wait.

I just reviewed my code, rebased it over changes in meson codebase these last years (files were moved around) and pushed to the branch.

I also applied the various proposed changes in @dcbaker's review.

As for @jpakkane's review, I also applied the changes (renaming and adding unit tests), though I'd like to add a quick note/question:

the method should be called append_footer rather than just append

Personally I find the proposed alternative name append_footer() redundant (therefore unnecessary overlong). By definition, "appending" is in the footer (you don't append in the start or in the middle, it's the end). Or maybe I don't understand what you meant. Maybe is it to leave future possibility to append just after the prelude but before generated #defines in the no-input/header generation case (I would have just called this prepend() unless you also wanted to be able to prepend before the prelude! So a third case!)?

Though I still applied the change, I wanted to raise this little point.
Yet if you are sure you want to go with append_footer, so be it. 🤷

Finally I made 2 more changes:

  • I made so that append() also works on configured input file, as per @xclaesse review, not only dumped header config files.
  • I further propose to add an output_format='custom' feature, which basically allows to drop the prelude writing (for 'c' or 'nasm'). This would allow to easily generate multi-line files without necessarily using a template, which is useful e.g. if the created format is simple enough. I had such a case today where I needed to create a 2-line file, which I was originally doing with custom_target() target and an echo command. It worked well on Linux but turned out to replace the newline with a space on native Windows build. Rather than having to fight with platform shell commands incompatibility or committing a very ridiculous template file, I would have appreciated that meson would just allow me to create files this way. Now it would with this new output_format='custom' option.

@jpakkane
Copy link
Member

Though I still applied the change, I wanted to raise this little point. Yet if you are sure you want to go with append_footer, so be it. shrug

The reason I wanted that was to make sure how things get ordered. For example if you do something like:

cdata.set('foo', 'bar')
cdata.append(...)
cdata.set('baz', 'bob')

Then it is unclear from just looking at that how those things are ordered in the final file. The two do very different things and thus it is typically worth it to make their APIs look different too. We might also need to add a corresponding append_header though that could also be done with a to kwarg to a plain append command.

Reading what I just wrote out actually uncovered a naming issue (the ones that are actually hard). Maybe "footer" is not a good name because the corresponding term for stuff at the top of the file is "header" which is a heavily loaded term. How do people feel about calling this method append_raw_text and then later if we need we can add a kwarg for specifying where the text should go?

(Sorry for these delays, but naming is one of those things where you only have one chance to get it right so it pays to think about it a bit more deeply.)

@Jehan
Copy link
Contributor Author

Jehan commented Jun 10, 2022

Ah I understand why you suggested the name change now.

How do people feel about calling this method append_raw_text and then later if we need we can add a kwarg for specifying where the text should go?

Sure. It's long but much more meaningful than append_footer() IMO and indeed adding an optional kwarg later on won't break the API. So it leaves the space for future improvements.

(Sorry for these delays, but naming is one of those things where you only have one chance to get it right so it pays to think about it a bit more deeply.)

No prob. I understand how these things are important too. ;-)

@Jehan
Copy link
Contributor Author

Jehan commented Jun 10, 2022

@jpakkane Should I just rename to append_raw_text() and amend or wait for more people to give an opinion?

mesonbuild/interpreter/interpreterobjects.py Outdated Show resolved Hide resolved
mesonbuild/interpreter/interpreter.py Outdated Show resolved Hide resolved
Jehan added 5 commits July 30, 2022 23:48
… with newlines."

This reverts commit b33830f.
We will simply add a special case when the multi-line value starts with
a newline, together with a warning telling that such usage of set() is
deprecated.
... code injection.
GLib was using the bug of multi-line macro values (cf. mesonbuild#6179) as a
feature for injecting code. This was definitely semantically wrong as
the function is obviously meant for key-value setting. Also this can be
dangerous allowing unexpected code injection.

Anyway for backward compatibility, we will leave a special case when the
value also **starts** directly with a newline (as in GLib case), yet
accompanied with a message warning this is deprecated bug usage and
should be fixed. In later versions of meson, this flexibility will
likely be revoked.
This can be used to just append random data as-is to the generated
configuration file, for instance if the basic #define creation is not
enough for one's needs.

The data is appended (after all key-value defines, if any), so that it
is possible to make use of any of the said key-values. I.e. you could
add #ifdef code using macros previously set on this configuration
object.
The following tests were modified with data appended with append_raw_text:
'Configless.' and 'Substituted'. This way, we test appending data with
the case where an input was given (variable substition) or not
(typically header file creation).
@Jehan Jehan force-pushed the wip/Jehan/special-case-set4glib branch from 0e3ae34 to 5303511 Compare July 30, 2022 22:40
@Jehan
Copy link
Contributor Author

Jehan commented Jul 30, 2022

Ok once again I resolved all worries, got back to taking care of the original issue only, and also renamed once more the new method. It's now called append_raw_text() as latest suggested by @jpakkane to avoid the header/footer possible naming confusion.

Hopefully this should now be good for a merge. :-)

P.S.: also rebased and force pushed to my branch.

@jpakkane
Copy link
Member

Test failure seems relevant:

unittests/allplatformstests.py:137: in conf_file
    do_conf_file(fin, fout, confdata, 'meson')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

src = '/tmp/tmp_0jdkccj', dst = '/tmp/tmp8zmtfopn'
confdata = {'VAR': ('foo', 'bar')}, variable_format = 'meson'
encoding = 'utf-8'

    def do_conf_file(src: str, dst: str, confdata: 'ConfigurationData',
                     variable_format: Literal['meson', 'cmake', 'cmake@'],
                     encoding: str = 'utf-8') -> T.Tuple[T.Set[str], bool]:
        try:
            with open(src, encoding=encoding, newline='') as f:
                data = f.readlines()
        except Exception as e:
            raise MesonException(f'Could not read input file {src}: {e!s}')
    
        (result, missing_variables, confdata_useless) = do_conf_str(src, data, confdata, variable_format, encoding)
        dst_tmp = dst + '~'
        try:
            with open(dst_tmp, 'w', encoding=encoding, newline='') as f:
                f.writelines(result)
                # Raw data are outputted as-is at the end.
                for v in confdata.footer_data:
                    f.write(f'{v}\n')
        except Exception as e:
>           raise MesonException(f'Could not write output file {dst}: {e!s}')
E           mesonbuild.mesonlib.universal.MesonException: Could not write output file /tmp/tmp8zmtfopn: 'dict' object has no attribute 'footer_data'

@Jehan
Copy link
Contributor Author

Jehan commented Jul 31, 2022

Argh. I'll have to look when I can make the time again.

If anyone has any relevant info/hint, feel welcome to tell. Because from my testing, I had no problems.

@eli-schwartz
Copy link
Member

eli-schwartz commented Jul 31, 2022

The error is effectively the same one as this:

$ mypy mesonbuild/modules/cmake.py

[...]
mesonbuild/modules/cmake.py:328:60: error: Argument 3 to "do_conf_file" has incompatible type "Dict[str, Tuple[object, str]]"; expected "ConfigurationData"
[...]

@Jehan
Copy link
Contributor Author

Jehan commented Aug 1, 2022

Hmmm… this error also exists on latest master HEAD:

$ mypy mesonbuild/modules/cmake.py | grep do_conf_file
mesonbuild/modules/cmake.py:328:60: error: Argument 3 to "do_conf_file" has incompatible type "Dict[str, Tuple[object, str]]"; expected "ConfigurationData"

As for allplatformstests.py, if this is really a bug I introduced, could someone explain to me how I run this test file to reproduce? I tried to run it directly, and to grep to know if there was any docs, but couldn't find any.

In any case, looking at the pointed code inside allplatformstests.py, it does indeed look very similar. The code passes a test dict to do_conf_file:

        def conf_file(in_data, confdata):
            with temp_filename() as fin:
                with open(fin, 'wb') as fobj:
                    fobj.write(in_data.encode('utf-8'))
                with temp_filename() as fout:
                    do_conf_file(fin, fout, confdata, 'meson')
                    with open(fout, 'rb') as fobj:
                        return fobj.read().decode('utf-8')
        
        confdata = {'VAR': ('foo', 'bar')}
        self.assertEqual(conf_file('@VAR@\n@VAR@\n', confdata), 'foo\nfoo\n')
        self.assertEqual(conf_file('@VAR@\r\n@VAR@\r\n', confdata), 'foo\r\nfoo\r\n')

Yet it's indeed supposed to be a ConfigurationData object:

def do_conf_file(src: str, dst: str, confdata: 'ConfigurationData',
                 variable_format: Literal['meson', 'cmake', 'cmake@'],
                 encoding: str = 'utf-8') -> T.Tuple[T.Set[str], bool]:

It's not something I changed. But maybe I am missing something, but it's much harder to understand if I can't try and run this test script myself, so I'd welcome hints on doing this. :-)

@eli-schwartz
Copy link
Member

eli-schwartz commented Aug 1, 2022

The issue is that both these sites in git master have type violations, but before this PR nothing happened to use attributes which aren't present in a plain dict. It's a bug anyway because of the type violation, your PR just happened to expose it.

The solution would be to add one commit to the beginning of this patch series that fixes the type violations by constructing a ConfigurationData for use instead of a dict.

You can run the tests via python3 run_unittests.py (there are options to only run a subset of the tests, check --help for details) -- you cannot run allplatformstests.py itself as it's an importable module only. ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants