Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

determine_windows_extra_paths() returns list of dependency paths in random order, breaks building glib #12330

Open
lazka opened this issue Oct 4, 2023 · 13 comments · May be fixed by #12861
Open

Comments

@lazka
Copy link
Contributor

lazka commented Oct 4, 2023

glib is currently not buildable in case glib is already installed on the system, this is due to some glib build commands trying to link to system glib during the build which is missing some new symbols. See downstream issue: https://gitlab.gnome.org/GNOME/glib/-/issues/3114

A bit of debugging shows that the PATH env var passed to commands (serialized via "extra_paths") is in a different order on every configure run. Looking at the function filling it shows that it uses unordered sets for everything and mixes build and system dependencies in multiple places and puts them into the same set():

def determine_windows_extra_paths(

So sometimes the system DLL is in PATH before the meson built one.

Site note: whatever the fix is here, avoiding set() with deterministic output would help hunt down future issues in that code

@lazka lazka changed the title determine_windows_extra_paths() returns list of dependency paths in random order determine_windows_extra_paths() returns list of dependency paths in random order, breaks building glib Oct 4, 2023
@lazka
Copy link
Contributor Author

lazka commented Oct 4, 2023

@bruchar1 since you touched this last. Do you have any thoughts on how to best fix this?

@dcbaker
Copy link
Member

dcbaker commented Oct 4, 2023

@lazka I think you can just change result: T.Set[str] = set() to result: mesonlib.OrderedSet[str] = mesonlib.OrderedSet() and the same with perspectives and be good to go

@xclaesse
Copy link
Member

xclaesse commented Oct 4, 2023

@lazka OOC, what's the expected order? builddir or system first?

In related news, in #12059 I rework a lot that function, but I didn't pay attention to ordering, so that code needs revisiting too.

avoiding set() with deterministic output would help hunt down future issues in that code

That's a fair point, I personally thought that set() was ordered the same way dict now are. That's a huge trap, we probably should do a project wide search-and-replace to never use it!

@eli-schwartz
Copy link
Member

There are many, many, many cases where sets are entirely correct to use because order doesn't matter. Why would we ever want to replace those?

There are a bunch of cases where we know order matters and we use mesonlib.OrderedSet, again, as we should. There is apparently at least one case where order matters and we did NOT use mesonlib.OrderedSet, and we can fix that on its own.

@xclaesse
Copy link
Member

xclaesse commented Oct 4, 2023

Why would we ever want to replace those?

OOC Is there a reason no to? Unstable order bit us many times for reproducibility, it's often far from obvious that order actually matters.

@eli-schwartz
Copy link
Member

  • suggesting that the entire python programming language is advised to give up the use of one of its most basic builtin datatypes certainly sounds novel to me :)
  • import order circularity
  • many times it is very obvious that order does not matter
  • OrderedSet is slow. It's a replacement for one of the most basic language builtins, and that replacement is written in pure python instead of C. It cannot help but be comparatively slow :) and I would generally prefer to make meson faster, not slower. At the very least, I would want to vet every single proposed replacement to ensure either one of two things is true:
    • order matters in that specific case, in which case discussion over, we did in fact need it
    • it's not in a hot path and/or it's never used a large number of times, so even if we become slower, we don't become noticeably slower in the common case

@xclaesse
Copy link
Member

xclaesse commented Oct 5, 2023

It's tangential, I'm not really proposing to do it here. But wouldn't that be just a dict with None value? That would have pretty similar perf, no? To be honest I wrongly assumed that set() was actually a dict and was ordered the same way.

@xclaesse
Copy link
Member

xclaesse commented Oct 5, 2023

The reason I'm saying this is because it's not only about args order, it's also anything that gets pickled into coredata or any other president state that would cause spurious reconfigure. We felt into that trap many times already.

@dcbaker
Copy link
Member

dcbaker commented Oct 5, 2023

I did a few tests with timeit, OrderedSet is roughly 2-4x slower than set for common operations (initializtion, add, membership tests), and even slower than list in some cases (adding, but faster for membership tests). So I think it would be a bad idea to replace all set uses with OrderedSet

@xclaesse
Copy link
Member

xclaesse commented Oct 5, 2023

Wondering how you get such huge difference. There is some variability, set() and dict() always have similar perf, but wrapping dict into a class does have a 10-15% cost.

https://bpa.st/CXEQ

$ python set.py
set()		 1.0711724859429523
OrderedSet()	 1.2292896250728518
MySet()		 1.1717969930032268
dict()		 1.0361316660419106

Anyway, that's offtopic, sorry.

@dcbaker
Copy link
Member

dcbaker commented Oct 5, 2023

I should have been clearer, initialization with values. so something like set(range(10)) vs OrderedSet(range(10)).

EDIT:

Here's my test script:

import timeit
from mesonbuild.mesonlib import OrderedSet

print('base initialization:')
print('list:', timeit.timeit(list))
print('set:', timeit.timeit(set))
print('OrderedSet:', timeit.timeit(OrderedSet))

x = list(range(10))

print('\ninitialization with values:')
print('list:', timeit.timeit(lambda: list(x)))
print('set:', timeit.timeit(lambda: set(x)))
print('OrderedSet:', timeit.timeit(lambda: OrderedSet(x)))

print('\nelement addition (no repeat):')
print('list:', timeit.timeit(lambda: list().append(7)))
print('set:', timeit.timeit(lambda: set().add(7)))
print('OrderedSet:', timeit.timeit(lambda: OrderedSet().add(7)))

print('\nelement addition (repeat):')
y = set()
print('set:', timeit.timeit(lambda: y.add(7)))
y = OrderedSet()
print('OrderedSet:', timeit.timeit(lambda: y.add(7)))

print('\nmembership (present):')
print('list:', timeit.timeit(lambda: 7 in x))
y = set(x)
print('set:', timeit.timeit(lambda: 7 in y))
y = OrderedSet(x)
print('OrderedSet:', timeit.timeit(lambda: 7 in y))

print('\nmembership (missing):')
print('list:', timeit.timeit(lambda: 17 in x))
y = set(x)
print('set:', timeit.timeit(lambda: 17 in y))
y = OrderedSet(x)
print('OrderedSet:', timeit.timeit(lambda: 17 in y))

and here's the results:

base initialization:
list: 0.031823622004594654
set: 0.035224589984863997
OrderedSet: 0.31845129901194014

initialization with values:
list: 0.1108416179777123
set: 0.2433469349925872
OrderedSet: 1.3216239210159983

element addition (no repeat):
list: 0.1217501929786522
set: 0.12117283698171377
OrderedSet: 0.5398813229985535

element addition (repeat):
set: 0.07545559998834506
OrderedSet: 0.15576225699624047

membership (present):
list: 0.11119814799167216
set: 0.056258563010487705
OrderedSet: 0.12434810600825585

membership (missing):
list: 0.13001264000195079
set: 0.059524321026401594
OrderedSet: 0.13463887601392344

So what I see is even in optimal cases OrderedSet is significantly slower than set, and even slower than list in some cases (that surprises me).

@xclaesse
Copy link
Member

xclaesse commented Oct 5, 2023

Right, that's indeed the worst case scenario. Note that our implementation can be improved, dict({i: None for i in range(10)}) is faster. But still about 2x slower than set() initialization :/

@lazka
Copy link
Contributor Author

lazka commented Oct 6, 2023

@lazka OOC, what's the expected order? builddir or system first?

Since there is no guarantee with system dependencies that they wont inject DLLs that conflict with other dependencies I'd say sort all system dependencies (anything not in the meson build dir) at the end.

gnomesysadmins pushed a commit to GNOME/glib that referenced this issue Feb 14, 2024
This works around a Meson bug where a system-wide installed copy of
GLib can incorrectly get used in preference to the just-built copy
(see mesonbuild/meson#12330).

In principle some of these are unnecessary (for example libgio-2.0-0.dll
only needs to be in gio/tests/ and girepository/tests/) but it's simpler
to copy each DLL to each tests directory.

Fixes: c428d6e "ci: Build and tar the platform specific documentation"
Resolves: https://gitlab.gnome.org/GNOME/glib/-/issues/3262
Signed-off-by: Simon McVittie <smcv@collabora.com>
gnomesysadmins pushed a commit to GNOME/glib that referenced this issue Feb 14, 2024
This works around a Meson bug where a system-wide installed copy of
GLib can incorrectly get used in preference to the just-built copy
(see mesonbuild/meson#12330).

In principle some of these are unnecessary (for example libgio-2.0-0.dll
only needs to be in gio/tests/ and girepository/tests/) but it's simpler
to copy each DLL to each tests directory.

Fixes: c428d6e "ci: Build and tar the platform specific documentation"
Resolves: https://gitlab.gnome.org/GNOME/glib/-/issues/3262
Signed-off-by: Simon McVittie <smcv@collabora.com>
lb90 added a commit to lb90/meson that referenced this issue Feb 15, 2024
gnomesysadmins pushed a commit to GNOME/glib that referenced this issue Feb 15, 2024
Rather than pinning it to the lowest version we support, as is the
standard policy.

This means we’ll end up using version 1.3.2-2, which has just been
packaged to contain the fix for
mesonbuild/meson#12330, which has been
impacting GLib significantly since we started installing
gobject-introspection in CI in commit
c428d6e.

Thanks to Christoph Reiter, Luca Bacci and Simon McVittie for diagnosing
and fixing this issue.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Fixes: #3262
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants