[Feature Request] Lazy variable/dependency evaluation #8178

Closed
Volker-Weissmann opened this issue Jan 10, 2021 · 31 comments
Labels
design discussion Discussions about meson design and features enhancement

Comments

@Volker-Weissmann
Contributor

Hello,

Let's say your project includes building a lot of libraries, and the dependency graph of these libraries is given by this acyclic graph. You might solve this by writing

libEvar = library('mylibE', 'fileE.cpp', link_with : [])
libDvar = library('mylibD', 'fileD.cpp', link_with : [libEvar])
libCvar = library('mylibC', 'fileC.cpp', link_with : [libDvar, libEvar])
libBvar = library('mylibB', 'fileB.cpp', link_with : [libDvar])
libAvar = library('mylibA', 'fileA.cpp', link_with : [libBvar, libCvar, libEvar])

This works, but there is one major problem: It breaks once you change the order of those lines. If all of these lines are in the same meson.build file, then putting them into the correct order is easy. But if these lines are scattered across multiple meson.build files in nested subdirectories, then it might no longer be possible (Imagine if libEvar and libCvar are in a subdirectory of libDvar's directory).

One solution would be to write this:

libs = []
# The order of the following 5 lines is irrelevant
libs += [['mylibA', 'fileA.cpp', ['mylibB', 'mylibC', 'mylibE']]]
libs += [['mylibB', 'fileB.cpp', ['mylibD']]]
libs += [['mylibC', 'fileC.cpp', ['mylibD', 'mylibE']]]
libs += [['mylibD', 'fileD.cpp', ['mylibE']]]
libs += [['mylibE', 'fileE.cpp', []]]

lookup = {}
foreach step : [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0]
    foreach lib : libs
        flag = true
        deps = []
        foreach dep : lib[2]
            if dep not in lookup
                flag = false
            else
                deps += lookup[dep]
            endif
        endforeach
        assert(step == 1 or flag, 'error building the dependency graph')
        if flag and lib[0] not in lookup
            lookup += {lib[0] : library(lib[0], lib[1], link_with : deps)}
        endif
    endforeach
endforeach

This basically does what I want/need, but it feels like very bad style. Meson should have some functionality to build these acyclic graphs itself.
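To make the control flow of the loop above easier to follow, here is a rough Python model of the same repeated-pass idea. It is only a sketch: `resolve_in_passes` and `max_passes` are hypothetical names, and it returns the creation order instead of calling `library()`.

```python
# Same (name, source, dependency names) triples as the libs array above.
libs = [
    ("mylibA", "fileA.cpp", ["mylibB", "mylibC", "mylibE"]),
    ("mylibB", "fileB.cpp", ["mylibD"]),
    ("mylibC", "fileC.cpp", ["mylibD", "mylibE"]),
    ("mylibD", "fileD.cpp", ["mylibE"]),
    ("mylibE", "fileE.cpp", []),
]


def resolve_in_passes(libs, max_passes=15):
    """Sweep over libs repeatedly, creating each entry once its deps exist."""
    created = []
    for _ in range(max_passes):
        progressed = False
        for name, _source, deps in libs:
            if name not in created and all(dep in created for dep in deps):
                created.append(name)
                progressed = True
        if len(created) == len(libs):
            return created
        if not progressed:
            break  # a full pass created nothing: missing dependency or cycle
    missing = [name for name, _source, _deps in libs if name not in created]
    raise ValueError(f"could not resolve: {missing}")


print(resolve_in_passes(libs))
# ['mylibE', 'mylibD', 'mylibB', 'mylibC', 'mylibA']
```

Each pass can only create entries whose dependencies were created earlier, so a chain of depth n needs up to n passes. That is why the Meson version hard-codes a long list of steps.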

This lazy function would also solve similar problems that arise if you try to use an executable to generate sources, before building that executable.

I thought of using subprojects instead of subdirs, but that would result in sandbox violations if some libraries share source files with other libraries.

@mensinda mensinda added design discussion Discussions about meson design and features enhancement labels Jan 10, 2021
@mensinda
Member

First of all, I do understand your issue (and I also ran into this problem myself when I accidentally changed the order of my subdirs) and I get that this is an annoyance, however, implementing lazy evaluation would have massive implications for the meson project:

  • it makes everything more complex, which means more logic and code and thus a higher maintenance burden (which is not inherently bad if the feature is worth it)
  • it would be a fundamental shift from our current design where we evaluate everything line by line. Changing this makes the core more difficult to reason about (both for humans and other software).
  • it also effectively makes libraries, etc. mutable while one of the core features/design decisions of meson is that all libraries/etc. are immutable after creation.
  • it is currently impossible to generate cyclic dependencies with build objects in meson. So new logic would need algorithms for detecting cycles basically everywhere to ensure that we end up with a DAG since we get this feature currently for free with our more restrictive syntax.

And finally, implementing this correctly would be a huge effort, since it would require some major changes to the interpreter to allow correct general lazy evaluation or some hacky and error-prone mess after the interpreter ran to fix all the references...

@Volker-Weissmann
Contributor Author

Volker-Weissmann commented Jan 10, 2021

Since I never read the meson source code, I cannot say how easy or hard the implementation will be.

it also effectively makes libraries, etc. mutable while one of the core features/design decisions of meson is that all libraries/etc. are immutable after creation.

Libraries can still be immutable if they are created late enough.

My proposal: Add a function called e.g. "order_and_eval":

my_array = []
my_array += [['libDvar', 'library(\'mylibD\', \'fileD.cpp\', link_with : [libEvar])']]
my_array += [['libEvar', 'library(\'mylibE\', \'fileE.cpp\')']]
order_and_eval(my_array)

Notice how 'libDvar' and 'library(\'mylibD\', \'fileD.cpp\', link_with : [libEvar])' are just definitions of strings.
order_and_eval first parses the 'library...' strings into Meson's AST. It then makes a list of all undefined variable references in the AST, in this case [libEvar] for the first line and [] for the second line. order_and_eval now knows which line depends on which other line. It can then check whether the graph is connected and acyclic, and exit with a meaningful error message otherwise. Finally, it can put the lines in the correct order and evaluate them.

Graphlib can do topological sorts, which is exactly what order_and_eval needs to do.
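As a rough illustration of the topological sort mentioned here, the sketch below feeds the example graph from the first comment to `graphlib.TopologicalSorter`. It assumes Python 3.9 or newer, where `graphlib` is part of the standard library. The dictionary layout (each library mapped to the set of libraries it links against) is chosen only for this example.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps to the set of libraries it depends on.
depends_on = {
    "mylibA": {"mylibB", "mylibC", "mylibE"},
    "mylibB": {"mylibD"},
    "mylibC": {"mylibD", "mylibE"},
    "mylibD": {"mylibE"},
    "mylibE": set(),
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)

# A cyclic graph raises CycleError; the offending cycle is in err.args[1].
cyclic = {"mylibX": {"mylibY"}, "mylibY": {"mylibZ"}, "mylibZ": {"mylibX"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    cycle_detected = False
except CycleError as err:
    cycle_detected = True
    print("cycle:", err.args[1])
```

The relative order of mylibB and mylibC is not fixed, since neither depends on the other.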

What I wrote is basically what order_and_eval should do, but python is better suited for such cases than the meson dsl.

Maybe there could also be some syntax sugar for

my_array += [['libDvar', 'library(\'mylibD\', \'fileD.cpp\', link_with : [libEvar])']]

Maybe something like

my_array.lazy( libDvar = library('mylibD', 'fileD.cpp', link_with : [libEvar])  )

If the AST parsing to find all undefined variable references is too complicated, one could also write

my_array += [['libDvar', ['libEvar'], 'library(\'mylibD\', \'fileD.cpp\', link_with : [libEvar])']]

The following should abort:

my_array = []
my_array += [['libDvar', '...']]
my_array += [['libDvar', '...']]
order_and_eval(my_array)

@mensinda
Member

Ok, this proposal could work, however, there are some problems.

my_array.lazy( libDvar = library('mylibD', 'fileD.cpp', link_with : [libEvar])  )

If the AST parsing to find all undefined variable references is too complicated, one could also write

Yes, this is way too complicated, since we would have to do a special case in the parser (not even the interpreter) for one object + method call combination...

Secondly, I don't know the official 100% correct stance regarding eval(), however, I am 99.9% certain that any form of eval() will never be accepted.

What could work is a new object type for defining dependency graphs for user-defined objects (maybe even as a new experimental module). Essentially something like this could work:

dag = import('graph')


# Syntax:
# dag.insert('<provides>', depends: ['<optional>', '<list>', '<of>', '<deps>'], data: <opaque object>)
dag.insert('mylibA', depends: ['mylibB', 'mylibC', 'mylibE'], data: {'src': ['fileA.cpp']})
dag.insert('mylibB', depends: ['mylibD'],                     data: {'src': ['fileB.cpp']})
dag.insert('mylibC', depends: ['mylibD', 'mylibE'],           data: {'src': ['fileC.cpp']})
dag.insert('mylibD', depends: ['mylibE'],                     data: {'src': ['fileD.cpp']})
dag.insert('mylibE', data: {'src': ['fileE.cpp']})

dag.insert('mylibX', depends: ['mylibY'], data: {'src': ['fileX.cpp']})
dag.insert('mylibY', depends: ['mylibZ'], data: {'src': ['fileY.cpp']})
dag.insert('mylibZ', depends: ['mylibX'], data: {'src': ['fileZ.cpp']}) # Generates a cycle error

foreach i : dag.as_list()
  # Your code that just adds the libs
  # Guaranteed to have all dependencies resolved
endforeach

Also, thanks for telling me about graphlib, however, we only recently bumped our min. python version requirement to 3.6, so requiring 3.9 features is a few years away :)

@Volker-Weissmann
Contributor Author

I think the following syntax would make more sense:

dag.insert('mylibB', depends: ['mylibD'],  cmd: 'library(\'mylibB\', \'fileB.cpp\', link_with : [libDvar])')

@mensinda
Member

The problem with cmd: 'library(\'mylibB\', \'fileB.cpp\', link_with : [libDvar])' is that we are not adding any string eval() functionality. This also includes any runtime AST analysis of random strings. This would be a huge pain to get right in the first place and also a huge maintenance burden going forward.

My idea was a general dependency graph data structure where you can define some provides and depends strings and attach to each entry a user-defined object that meson does not care about and will completely ignore for all internal operations. It is then up to the user to iterate over this object (which is guaranteed to be topologically sorted) in a foreach loop and interpret the user data.

This approach has the advantage that you can maintain any kind of dependency tree and the impact for the meson codebase is minimal since no changes to the interpreter are required.
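A minimal Python sketch of such a data structure might look like the following. It is not Meson code and no such module exists: the class name `DependencyGraph` is hypothetical, and `insert` / `as_list` simply mirror the names used in the proposal. It uses Kahn's algorithm rather than graphlib, so it does not need Python 3.9.

```python
from collections import deque


class DependencyGraph:
    """Hypothetical stand-in for the proposed object; not part of Meson."""

    def __init__(self):
        self._depends = {}  # name -> names it depends on (deduplicated)
        self._data = {}     # name -> opaque user data, never inspected

    def insert(self, provides, depends=(), data=None):
        if provides in self._depends:
            raise ValueError("duplicate entry: {}".format(provides))
        self._depends[provides] = list(dict.fromkeys(depends))
        self._data[provides] = data

    def as_list(self):
        """Return (name, data) pairs with every dependency before its user."""
        users = {name: [] for name in self._depends}
        remaining = {}
        for name, deps in self._depends.items():
            for dep in deps:
                if dep not in self._depends:
                    raise ValueError("{} depends on unknown {}".format(name, dep))
                users[dep].append(name)
            remaining[name] = len(deps)
        # Kahn's algorithm: repeatedly emit entries with no unmet dependencies.
        ready = deque(name for name, count in remaining.items() if count == 0)
        ordered = []
        while ready:
            name = ready.popleft()
            ordered.append((name, self._data[name]))
            for user in users[name]:
                remaining[user] -= 1
                if remaining[user] == 0:
                    ready.append(user)
        if len(ordered) != len(self._depends):
            stuck = sorted(name for name, count in remaining.items() if count > 0)
            raise ValueError("dependency cycle; unresolved: " + ", ".join(stuck))
        return ordered


dag = DependencyGraph()
dag.insert("mylibA", depends=["mylibB", "mylibC", "mylibE"], data={"src": ["fileA.cpp"]})
dag.insert("mylibB", depends=["mylibD"], data={"src": ["fileB.cpp"]})
dag.insert("mylibC", depends=["mylibD", "mylibE"], data={"src": ["fileC.cpp"]})
dag.insert("mylibD", depends=["mylibE"], data={"src": ["fileD.cpp"]})
dag.insert("mylibE", data={"src": ["fileE.cpp"]})

for name, data in dag.as_list():
    print(name, data["src"])
```

The graph never looks inside `data`, which matches the idea that the attached object is opaque and only interpreted by the caller's loop.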

@Volker-Weissmann
Contributor Author

Ok. I'm a bit disappointed, but if everything else would be too much of a writing and maintenance burden, then that's ok.

@Volker-Weissmann
Contributor Author

Do you want to implement this Feature Request or should I do it?

@mensinda
Member

Patches are always welcome and I currently have limited time to work on meson. However, here are some (hopefully helpful) pointers for getting started:

First of all, this feature should be in a new module, since

  • it is arguably not a core feature because you can do the sort manually (this module just makes things easier)
  • this makes the algorithm more logically contained.
  • this increases the likelihood of getting the PR merged.

These are only recommendations and by no means hard requirements:

  • Try to avoid changes in the core interpreter and work with new objects when required.
  • A good starting point would probably be to look at the external_project module and the cmake module for a setup with custom InterpreterObjects.
  • Modules themselves are stateless, so you will likely need a new InterpreterObject

In meson code this would mean:

dag_mod = import('graph')    # dag_mod is stateless
dag     = dag_mod.gen_dag()  # generate the object that does all the actual work

dag.insert(...)   # do the actual work

Some important points from our contributing docs:

  • new features need a release snippet
  • the module needs documentation
  • some form of tests are required (in this case including tests that are expected to fail (dependency cycles, etc.))

We are also in the (admittedly slow) process of type annotating the meson source code, so new code should also be fully type annotated and the new module should be added to our run_mypy.py script.

Feel free to ask if you have any questions.

@jpakkane
Member

Notice how 'libDvar' and 'library('mylibD', 'fileD.cpp', link_with : [libEvar])' are just definitions of strings.

One of the core design points of Meson was that things are actually objects and not strings to things that might get defined later. Make, CMake, Bazel and possibly others do that and it always leads to incredibly convoluted and unmaintainable messes. For example you might look at the build definitions of Google's Abseil C++ libraries and try to work out how the dependencies go.

Meson does not do that by design. It requires you to be very clear, direct and upfront about your build structure. It proceeds through source directories in specific and unsurprising ways. Enter one dir. Process it. Never return to it. Proceed to the next one. Never return to it. And so on. This means that people writing their build definitions can rely on this behaviour. It reduces the mental burden of working out what is happening when bugs occur.

We are very, very, very unlikely to add functionality to add lazy evaluation like this. That is not to say we could not add other primitives to improve things, but a general lazy evaluation thing is exceedingly unlikely to be accepted.

@mensinda
Member

@jpakkane just to be clear, do you then have any objections to a new general graph module, as described in #8178 (comment)?

@jpakkane
Member

Yes, I have. For mostly the same reasons. I am strongly of the opinion that if your build definition is so bizarrely complicated that it requires a DAG dependency solver, then the better solution is to make your build setup simpler rather than adding this functionality to the build system.

@Volker-Weissmann
Contributor Author

Notice how 'libDvar' and 'library('mylibD', 'fileD.cpp', link_with : [libEvar])' are just definitions of strings.

One of the core design points of Meson was that things are actually objects and not strings to things that might get defined later. Make, CMake, Bazel and possibly others do that and it always leads to incredibly convoluted and unmaintainable messes. For example you might look at the build definitions of Google's Abseil C++ libraries and try to work out how the dependencies go.

Meson does not do that by design. It requires you to be very clear, direct and upfront about your build structure. It proceeds through source directories in specific and unsurprising ways. Enter one dir. Process it. Never return to it. Proceed to the next one. Never return to it. And so on. This means that people writing their build definitions can rely on this behaviour. It reduces the mental burden of working out what is happening when bugs occur.

Good plan, but to quote a Prussian general: "no plan survives contact with the enemy", or in our case "no architecture design survives contact with legacy code".

I'm currently working on a project with roughly 10000 *.cpp files and 10000 *.hpp files that are built into 130 shared libraries and 260 binaries. It currently uses GNU make + 2500 lines of custom bash scripts.
I want to write meson.build files for it (more exactly: I want to write a python script that generates these meson.build files), but I don't want to break the old build system.
Reordering the directory structure would be a nice solution, but

  1. It would be a looooot of work.
  2. It would be a biiiiiiiig breaking change, leaving a lot of people very pissed that the old build system no longer works. Such a PR would certainly not be accepted by upstream.

Every solution to this problem is a bad one, but the question is what the least bad solution is.
Possible solutions:

  1. Continue using the old build system and not writing meson.build files -> bad, because the old build system is bad
  2. Reordering the directory structure -> lots of work and breaks the old build system
  3. Using the foreach step : [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0] ... endforeach code: Very hacky and it has very unreadable error messages if you have cyclic dependencies. And I cannot easily improve these error messages, because the meson dsl is not as nice and Turing complete as Python.
  4. Writing the 'graph' module and submitting a PR to meson.
  5. Don't use the subdir command and instead put all of the static_library(...) commands into root/meson.build. The disadvantages are that this is more messy than having it nicely sorted into subdirs, and it also requires me to write a DAG solver (this DAG solver would be in the python script that generates the meson.build files instead of in meson.)

I am strongly of the opinion that if your build definition is so bizarrely complicated

Do you seriously think that a build is bizarrely complicated, just because of the way files are organized into directories?

that it requires a DAG dependency solver,

I don't want to bash on meson, but I just want to note that ninja, make and nearly every other build system has a DAG dependency resolver.

then the better solution is to make your build setup simpler rather than adding this functionality to the build system.

In my case it takes an impractical amount of time to do it and it would break the old build system. Do you seriously think that in a project with 900 k lines of code, "break the old build system and start using another one" is that easy? If both build systems work on the same files, then you can have gradual adoption which is a lot easier.

Of the 5 solutions I suggested above, I think you can only seriously suggest 3., 4. or 5.
I honestly want to hear your opinion on what you would suggest I do.

@mensinda
Member

I agree that it would be better to restructure the build system, but not everyone will be willing to do this (especially if you are porting a project to meson). It is already possible to build a DAG solver in meson (see the first comment), hence I would argue that meson already has the base functionality to do this.

The new module basically moves the hacky foreach step : [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0] loop part from the first comment into a module. This way we don't end up with a bunch of bad copy-pasted meson code that solves this problem in a bad way.

Also, using the DAG solver is not trivial, since you have to come up with a system to pass all the kwargs and sources to the correct build type function. This would be overkill for most projects, where manually resolving the dependencies is easier.

So my point is (basically the same argument why we have modules in the first place and do not allow custom functions in meson): We can't prevent people from doing (from our perspective) stupid things if they can already do some hacky loop magic in meson that is duplicated many times (like CMake functions) and will break horribly if you touch it. We can however do it once correctly in meson itself and add a big warning label that you are 99% of the time better off just doing this manually, but if you really can't be bothered and would rather copy-paste some hacky meson code, then we have an official module for it.

So could you live with this functionality as a module + some warning labels in the docs?

@Volker-Weissmann
Contributor Author

Volker-Weissmann commented Jan 15, 2021

Thank you mensinda for your reasonable comment. I very much agree with you.

The new module basically moves the hacky foreach step : [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0] loop part from the first comment into a module. This way we don't end up with a bunch of bad copy-pasted meson code that solves this problem in a bad way.

Also note that if it's a module, it is written in Python, but if it's in meson.build, it is written in the meson dsl. Python is much better suited for writing a DAG solver with meaningful error messages than the meson dsl.

Also, using the DAG solver is not trivial, since you have to come up with a system to pass all the kwargs and sources to the correct build type function.

Yes, this might be tricky, especially, if your problem does not look like this:

libEvar = library('mylibE', 'fileE.cpp', link_with : [])
libDvar = library('mylibD', 'fileD.cpp', link_with : [libEvar])
libCvar = library('mylibC', 'fileC.cpp', link_with : [libDvar, libEvar])
libBvar = library('mylibB', 'fileB.cpp', link_with : [libDvar])
libAvar = library('mylibA', 'fileA.cpp', link_with : [libBvar, libCvar, libEvar])

but more like this:

binEvar = executable('mybinE', 'fileE.cpp')
binDvar = executable('mybinD', ['fileD.cpp', generator(binEvar, ...).process('fileD2.txt')])
...

Having an eval(cmd_string) function would solve this, but I'm too scared of jpakkane to propose this.

@jpakkane
Member

(more exactly: I want to write a python script that generates these meson.build files)

If you are already doing this, then putting the DAG solver in that script is a good option. That way you can use all the cool new stuff in Python 3.9 (which we can't for many years to come). You can customize it to your exact needs and can keep it up to date as the code changes (if needed) until such time that you do the final changeover.

Note that if Meson had this sort of a dependency solver, you'd still need to write a converter script like that. But its output would need to be Meson's potentially quite different format for specifying the DAG (it would need to be general, rather than tailored for your use case). It would also make debugging harder, because if problems occur you can't really tell whether they are in your converter script or Meson's DAG module (which is entirely possible, new code like this tends to have bugs and edge cases).

@Volker-Weissmann
Contributor Author

I thought about this: The big advantage of the foreach step : [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0] method is that the libs += [['mylibD', 'fileD.cpp', ['mylibE']]] lines are nicely sorted into the meson.build files in the subfolders. This is especially important for my project, because it has a lot of nested subfolders. With this solution, you could move one subfolder with its meson.build file to another meson subfolder and everything would still work. With your solution, I have something like
files('dirA/dirB/dirC/file.cpp')
in a meson.build file. If I want to move the dirC folder to dirA/dirD, I would have to change this line manually.

Doing what you suggest leaves me with two options.

  1. Put everything into root/meson.build. This means that I have one 10000-line meson.build with very long lines, because the paths need to spell out every subdirectory.
  2. Let my python script put the library(...) calls not in the root/meson.build file, but in the lowest subdirectory, that still makes it possible to build it with meson. The disadvantage would be that changing a dependency and rerunning my python script might result in a lot of change in a lot of meson.build files.

I'm going to do what you suggest (therefore closing this issue now) (mainly because I know that you can be stubborn), but I still think that writing this 'graph' library would be a good idea because:

  1. If you generate meson.build files, you have a program that generates meson.build files, which in turn are used to generate build.ninja files. It would be a meta-meta-build system. If you tell me "go write a script with a DAG solver that generates meson.build files" you are basically saying "go write your own build system and use meson as a backend". The problem is that nothing I could write would be as good a build system as meson. I would have two choices: either ship the generating_script.py, which is bad because this script is far less readable than meson.build files, or ship meson.build files, which means that someone who wants to change a dependency needs to reorder hundreds of lines.
  2. Mixing hand-written meson.build code and generated meson.build code is a mess. This 'graph' module would essentially be a script that generates meson.build code, but is nicely integrated into meson.
  3. All your concerns about how a dependency cycle etc. is hard to debug are invalid, because your proposal just shifts the problems away from meson and towards this script.
  4. You seem to forget that using a module in meson is OPTIONAL. A PR that adds a module should be accepted based on "are there common cases where this is useful" and not based on "is this always useful".

I recently watched your talks and you said something like
"Everything should be done the normal way, except for all the nonstandard things that I do, they need to be supported."
This feels a lot like it.

Nonetheless, thank you for your great build system and your comments and suggestions in this thread. When I'm finished with porting my project to meson, I will write a blog post about it and then we will see whether your decision has led to good or bad code.

@eli-schwartz
Member

  1. A PR that adds a module should be accepted based on "are there common cases where this is useful" and not based on "is this always useful".

I would actually expect a PR to be accepted based on whether it seems like the proper design paradigm to do it. (@jpakkane's argument condenses to "this is bad design". Your argument condenses to "but my project would use it".) But let's go with your logic for the moment.

Can you provide some supporting facts to "there are common cases where this is useful"?

Because to be perfectly honest, I don't really understand the nuance of your use case. But I would have expected the only actual problem here to be "the root meson.build needs to subdir() into other directories in a particular order, so as to provide defined objects needed in later directories". That's the only thing this proposed dag module needs here. It seems like stupendous overkill, since the alternative is simply to order your subdir()s correctly once, and commit that to git -- and failure to do so produces obvious "variable not defined" errors.

I'm not entirely sure what churn you expect, either, unless you regularly git mv the majority of your source code files around to different locations? Your general project hierarchy should not keep changing -- but if it did, the only thing you'd need to do to solve it, is change the order in which you process directories.

@Volker-Weissmann
Contributor Author

I would actually expect a PR to be accepted based on whether it seems like the proper design paradigm to do it.

Yes! Come to the dark side!

@jpakkane's argument condenses to "this is bad design". Your argument condenses to "but my project would use it".

My argument condenses to "some projects would use it". It is a general purpose DAG, not a special purpose DAG only suitable for my project. Because meson does not support user-defined modules, my only choices are submitting this PR or forking meson.
Because meson does not support user-defined modules, PRs that add modules that some might use, but others won't use, should be accepted. For example, the RPM module was accepted, even though not every meson user uses RPM, and some users might even argue that RPM is a bad design.

since the alternative is simply to order your subdir()s correctly once,

NO. THIS IS NOT TRUE.
If reordering subdir()s would solve it, I would gladly do it. But consider this directory structure:

root
├── meson.build
├── dirX/
│   └── meson.build
└── dirY/
    └── meson.build

What if dirX contains the libraries A and C and dirY contains the library B?
You can easily achieve these orders:
BAC, BCA, ACB, CAB,
But you cannot easily achieve these orders:
ABC, CBA
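The claim can be checked with a small enumeration. The sketch below assumes that each directory is entered exactly once and that all of its libraries are defined while it is being processed, so the libraries of one directory always end up next to each other in the overall order. `achievable_orders` is a name made up for this example.

```python
from itertools import permutations

dir_contents = {"dirX": ["A", "C"], "dirY": ["B"]}


def achievable_orders(dir_contents):
    """All definition orders reachable when every directory is visited once."""
    orders = set()
    for dir_order in permutations(dir_contents):
        partial = [""]
        for directory in dir_order:
            # Inside one meson.build the libraries can be written in any order.
            partial = [
                prefix + "".join(inner)
                for prefix in partial
                for inner in permutations(dir_contents[directory])
            ]
        orders.update(partial)
    return orders


all_orders = {"".join(p) for p in permutations("ABC")}
reachable = achievable_orders(dir_contents)
print(sorted(reachable))               # ['ACB', 'BAC', 'BCA', 'CAB']
print(sorted(all_orders - reachable))  # ['ABC', 'CBA']
```

The two unreachable orders are exactly the ones that place B between A and C, which would require leaving dirX and coming back to it.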

unless you regularly git mv the majority of your source code files around to different locations

We don't do that. But it would be a nice bonus if you could git mv directories around more easily.
(Technically, the project supports copying one subdirectory of the project somewhere else, modifying it and building it (and linking it to the original libraries). But since I don't yet know the best way to do something similar with meson, this is a question for another day.)

Because to be perfectly honest, I don't really understand the nuance of your use case.

In case anyone wants to understand my use case better:
It is a large (10000 .C files, 10000 .H files) project that builds > 100 shared libraries and > 100 binaries. The old build system has subdirectories and sub-...-sub-subdirectories that contain some C++ source files and a directory called "Make". Each "Make" folder has two files: "files" and "options". These files include a list of C++ source files, the name of the resulting library/binary and a list of libraries that are dependencies. I wrote this hacky python script that searches for those "Make" folders and outputs these files: meson_build.txt. Here is the discussion about it.

@eli-schwartz
Member

What if dirX contains the libraries A and C and dirY contains the library B?
You can easily achieve these orders:
BAC, BCA, ACB, CAB,
But you cannot easily achieve these orders:
ABC, CBA

Uhhhhh.

So the core problem here is the current organizational layout of the project in question has a library in dirX which depends on a library in dirY, which depends in turn on another library back in dirX? Multiplied at scale?

This sounds... unfortunate. I'd really recommend as a long-term goal, no matter which build system is in use, to implement a more ordered approach to things. I'm not sure what the short-term solution should be...

It would still be possible to define such libraries in the root meson.build, but only after subdir()ing into dirX and dirY in order to define the list of input files via files() (the purpose of which is "define a list of files that properly remember which directory they come from"), which should be a lot less disorderly than defining everything in the root meson.build and/or typing out many long repetitive filenames.

@Volker-Weissmann
Contributor Author

Uhhhhh.

It is code written by physicists. Trust me, if you have never read science code, you don't know what bad code is. If you want to see a messy build, look at sagemath. Featuring:

  1. A hand-written Makefile that includes something like:
    build/make/Makefile: configure ...
    ./configure ...
  2. If you build it, and run make again, it runs ./configure (autotools)
  3. It downloads code during compilation ... without checking checksums ... or signatures ... or using TLS (link).
  4. Installing it will put a hard-coded key into your ~/.ssh/known_hosts file (link).

So the core problem here is the current organizational layout of the project in question has a library in dirX which depends on a library in dirY, which depends in turn on another library back in dirX? Multiplied at scale?

This sounds... unfortunate. I'd really recommend as a long-term goal, no matter which build system is in use, to implement a more ordered approach to things. I'm not sure what the short-term solution should be...

Honestly, I have not checked if there are some parts of my project where this structure occurs. But our build system, and many other build systems, support it, and I thought that it's not that unordered and that it might be useful, because it's easier to write stuff in any order than to write it in the correct order, which makes porting stuff from other build systems easier. I will report back in one or two days if we actually need it. (Maybe I should have done that before opening this issue and having a big mouth.)

Actually, you made me think of another argument.
If a library in dirX depends on a library in dirY and the other way around, then you could argue that both directories are not that separated and the correct place for the library(...) calls would be in dirX/../meson.build. This would mean that the DAG is unnecessary. I will do some testing and report back in one or two days.

It would still be possible to define such libraries in the root meson.build, but only after subdir()ing into dirX and dirY in order to define the list of input files via files() (the purpose of which is "define a list of files that properly remember which directory they come from"), which should be a lot less disorderly than defining everything in the root meson.build and/or typing out many long repetitive filenames.

The question is if this is more or less ugly than the DAG solution.

@ferdnyc
Contributor

ferdnyc commented Nov 12, 2022

(I am honestly not trying to stir up old, long-dormant debates or make any sort of point here at all. It's just that, at the end of this whole saga we — meaning, anyone reading this for posterity — were left with a bit of a cliffhanger, and I'm genuinely curious to hear how it ends.)

So, @Volker-Weissmann
What was the outcome of your followup testing (if any)? Did you ever work out whether your builds truly required a DAG to properly define? Did you adopt one of the other approaches discussed here?

@Volker-Weissmann
Contributor Author

(I am honestly not trying to stir up old, long-dormant debates or make any sort of point here at all. It's just that, at the end of this whole saga we — meaning, anyone reading this for posterity — were left with a bit of a cliffhanger, and I'm genuinely curious to hear how it ends.)

So, @Volker-Weissmann What was the outcome of your followup testing (if any)? Did you ever work out whether your builds truly required a DAG to properly define? Did you adopt one of the other approaches discussed here?

Sorry for the delay, sickness + overtime delayed me.
In about a month or so, I will link a nicely written article about my solution. Hint: it involves 700 lines of graph-theory Rust.

@Volker-Weissmann
Contributor Author

Volker-Weissmann commented Nov 27, 2022

I finally found time to write it:
Link

I hope it's understandable. Writing is not my strength. Feel free to suggest changes.

@jpakkane
Member

Does it work if you use link_whole instead (which is what most non-Linux linkers do by default)?

@Volker-Weissmann
Contributor Author

Volker-Weissmann commented Nov 29, 2022

Does it work if you use link_whole instead (which is what most non-Linux linkers do by default)?

I can't quite follow you. How does link_whole change the order of the library and executable calls?

@eli-schwartz
Member

eli-schwartz commented Nov 29, 2022

It doesn't; link_whole is not the default for non-Unix linkers.

There's an unrelated feature that meson uses regardless of link_with vs. link_whole.

That feature is grouping all libraries together with -Wl,--start-group .... -Wl,--end-group. It causes the linker to resolve symbols for one library by checking both the libraries preceding it and the libraries following it, instead of unloading each library from memory as soon as it is done with primary processing. Non-Unix linkers do this by default, but with Unix linkers you typically have to list each -lfoo multiple times in order to re-resolve new symbols when multiple libraries use symbols from a third library. So e.g. -llibA -lfoo -llibB -lfoo, to make sure that all three libraries effectively pool the sum total of their symbols for availability in the final binary product.

This doesn't really have anything to do with referencing meson.build variables before they are defined, or alternatively adding to a target's dependencies: kwarg after the target is already defined.
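
To make the --start-group behaviour described above concrete, here is a rough Python sketch of a deliberately simplified model of a single-pass archive linker. It is an illustration under stated assumptions, not how any real linker is implemented: archives are lists of (defines, needs) members, a member is pulled in only if it defines a currently undefined symbol, and the names `link`, `foo_x`, `foo_y`, `a` and `b` are all hypothetical.

```python
def link(initial_undefined, archives, order, group=False):
    """Return the symbols still undefined after scanning the archives in `order`.

    Simplified model (assumption): each archive is a list of (defines, needs)
    members, and a member is loaded only when it defines a wanted symbol.
    """
    undefined = set(initial_undefined)
    defined = set()
    loaded = set()  # (archive name, member index) pairs already pulled in

    def scan(name):
        """Rescan one archive until no further member gets pulled in."""
        progress = False
        changed = True
        while changed:
            changed = False
            for index, (defines, needs) in enumerate(archives[name]):
                if (name, index) in loaded or not (defines & undefined):
                    continue
                loaded.add((name, index))
                defined.update(defines)
                undefined.update(needs)
                undefined.difference_update(defined)
                changed = progress = True
        return progress

    if group:
        # Group semantics: keep rescanning every archive until nothing changes.
        while any([scan(name) for name in order]):
            pass
    else:
        # Single pass: each archive is visited once per mention in `order`.
        for name in order:
            scan(name)
    return undefined


archives = {
    "A": [({"a"}, {"foo_x"})],
    "foo": [({"foo_x"}, {"b"}), ({"foo_y"}, set())],
    "B": [({"b"}, {"foo_y"})],
}

print(link({"a"}, archives, ["A", "foo", "B"]))              # {'foo_y'}
print(link({"a"}, archives, ["A", "foo", "B", "foo"]))       # set()
print(link({"a"}, archives, ["A", "foo", "B"], group=True))  # set()
```

In this model, listing foo a second time and wrapping the three archives in a group both resolve the symbol that libB needs from foo, which is the effect the paragraph above describes.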

@Volker-Weissmann
Contributor Author

This doesn't really have anything to do with referencing meson.build variables before they are defined, or alternatively adding to a target's dependencies: kwarg after the target is already defined.

OK. Then why is Jussi writing this in this thread? Am I expected to answer/test anything?

@jpakkane
Member

What if dirX contains libraries A and C, and dirY contains library B?
You can easily achieve these orders:
BAC, BCA, ACB, CAB
But you cannot easily achieve these orders:
ABC, CBA

Thinking about this a bit more there is a "Meson native" way of doing this already that is mostly a question of renaming the directories, but it is slightly involved and requires you to use subprojects.

First you move all "target directories" under subprojects giving them unique names.

If you have "global" headers that everybody needs, keep them in the top level and do (approximately):

main_dep = declare_dependency(include_directories: 'include')
meson.override_dependency('maindep', main_dep)

Then in each subproject you do this:

main_dep = dependency('maindep')
out_lib = library(..., dependencies: main_dep)

And now you are mostly done. Whenever you need a library with the name foo, you'd do this:

bar_lib = library(..., link_with: subproject('foo').get_variable('out_lib'))

If you need to expose headers too, then you'd use declare_dependency instead.

@Volker-Weissmann
Contributor Author

Thank you for trying to solve my problems. Unfortunately I am trying to create meson.build files that work without moving/changing any of the existing .C files. In hindsight, setting this as a requirement was probably a bad idea, and I should (maybe) never have started working on it.
The graph-theory solution is slightly overly complex, but it produces "Meson native" looking meson.build files. (We had to move just 3 targets to a different folder.)

@ferdnyc
Contributor

ferdnyc commented Dec 2, 2022

Thinking about this a bit more there is a "Meson native" way of doing this already that is mostly a question of renaming the directories, but it is slightly involved and requires you to use subprojects.

Thank you for trying to solve my problems. Unfortunately I am trying to create meson.build files that work without moving/changing any of the existing .C files.

Indeed, if rearranging the tree was an option, you could just move library C to dirY to make the ABC ordering possible. Or move library A to dirY to make CBA possible.

(Edit: Or move them all into the same directory, if all else fails and the code is total spaghetti.)
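
The ordering constraint under discussion can be checked with a small Python sketch. It assumes that dirX and dirY are sibling directories, each entered by a single subdir() call, so that the targets of one directory are always defined back to back. `achievable_orders` and `is_contiguous` are hypothetical helper names.

```python
from itertools import permutations


def is_contiguous(order, targets):
    """True if `targets` occupy consecutive positions in `order`."""
    positions = sorted(order.index(t) for t in targets)
    return positions[-1] - positions[0] == len(positions) - 1


def achievable_orders(layout):
    """List the definition orders in which every directory's targets stay together.

    `layout` maps a directory name to the (non-empty) list of targets it defines.
    """
    all_targets = [t for targets in layout.values() for t in targets]
    return sorted(
        "".join(order)
        for order in permutations(all_targets)
        if all(is_contiguous(order, targets) for targets in layout.values())
    )


original = {"dirX": ["A", "C"], "dirY": ["B"]}
moved_c = {"dirX": ["A"], "dirY": ["B", "C"]}  # library C moved to dirY
moved_a = {"dirX": ["C"], "dirY": ["A", "B"]}  # library A moved to dirY

print(achievable_orders(original))  # ['ACB', 'BAC', 'BCA', 'CAB']
print("ABC" in achievable_orders(moved_c))  # True
print("CBA" in achievable_orders(moved_a))  # True
```

Under that assumption, the original layout yields exactly the four orders listed earlier in the thread, and either move makes the corresponding missing order available.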

The subprojects trick is "neat", but it's overkill as a solution given that the exact same requirements it imposes also make less-invasive solutions possible.

@ferdnyc
Contributor

ferdnyc commented Dec 2, 2022

Although... will Meson follow symlinks, when reading a source tree to construct a build system? Will it follow directory symlinks?

If you could create a subprojects/ directory filled with nothing but relative symlinks to directories in other parts of your source tree, and use the subprojects trick that way, then it's starting to look like a viable way of at least working around this issue, though I'd hardly go so far as to call it a "solution".
