Imported binaries not always properly removed from published repository #4373
|
@lnussel @DimStar77 who have been involved when these issues were encountered. |
This hits us now with llvm3 32bit packages and blocks stagings atm. Could you please remove the stale binaries? |
IIRC it still came in from another package and we dropped that one, right? So I am closing this report for now. |
The goal of this issue was to fix the underlying problem, not the specific case. From what you wrote, the issue is not fixed? The issue was to fix the actual problem so it stops happening every few months rather than having to dig through logs and API calls when
Just a note: I am seeing this again currently (yes, I can work around it, but no, I don't think I should have to). Situation: the openssl-1_1_0 package has been renamed to openssl-1_1. This was done with a submission of openssl-1_1 in combination with a delete request for openssl-1_1_0. Delete requests are accepted 'in phases' into Factory, so as not to disrupt openSUSE:Factory/snapshot (removal of the source container has an immediate effect there, hence we don't do this anymore). So openssl-1_1_0 has been build disabled and the binaries for /standard wiped. osc can confirm this:
Binaries for /totest and /snapshot are still in place, which is intentional. Nevertheless, the -32bit packages are still 'offered' to the scheduler, as can be seen on the package lmms for the time being:
This choice only exists because the wipebinaries command did not properly dispose of the -32bit packages of the old openssl-1_1_0 package. (I will aid OBS over this with a Prefer statement; as said, I CAN work around it, but I should not have to) |
Another instance in Leap 15.0, which took a lot of time to figure out. repo-checker sees the following
The important binaries being complained about are 1.10.0 instead of 1.10.1 and are half a year older than the rest.
Would appreciate someone purging them and perhaps fixing this. |
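A check for this kind of stale binary could look roughly like the sketch below. It only groups rpm file names by package name and architecture and reports groups that carry more than one version-release. It assumes the usual `name-version-release.arch.rpm` naming; the file names and release numbers in the usage lines are made up for illustration, and this is not how repo-checker works internally.

```python
from collections import defaultdict


def split_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into (name, version, release, arch)."""
    stem = filename[:-len(".rpm")] if filename.endswith(".rpm") else filename
    stem, arch = stem.rsplit(".", 1)
    # Package names may contain dashes, so split from the right.
    name, version, release = stem.rsplit("-", 2)
    return name, version, release, arch


def find_duplicate_versions(filenames):
    """Return {(name, arch): [version-release, ...]} for names published more than once."""
    seen = defaultdict(set)
    for filename in filenames:
        name, version, release, arch = split_rpm_filename(filename)
        seen[(name, arch)].add(f"{version}-{release}")
    return {key: sorted(vers) for key, vers in seen.items() if len(vers) > 1}


# Hypothetical listing of a published repository directory.
published = [
    "hdf5-1.10.0-1.1.x86_64.rpm",
    "hdf5-1.10.1-2.3.x86_64.rpm",
    "lmms-1.2.0-1.1.x86_64.rpm",
]
duplicates = find_duplicate_versions(published)
```

Any key in `duplicates` is a candidate for a leftover binary that would need a closer look.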
In the future perhaps OBS team should read through repo-checker output and find these ones since release team is currently paying for this. |
I cannot see the openssl example problem anymore, but I think it is too late. The hdf5 example is again a misuse of the OBS disable feature. The old binaries are still there in the hdf5:serial package. Sorry, I have not seen a single instance of a problem that was not caused by setup problems. Either caused
I am not sure that I will even look into the issue the next time if you still have this setup :/ |
wiping 'disabled packages' is invalid setup? |
On Wednesday, 25 April 2018, 08:54:25 CEST, Dominique Leuenberger wrote:
> wiping 'disabled packages' is invalid setup?
yes, because disable flags do not do it on their own; you must not forget multibuild
instances, as in the hdf5 case, and you cause problems for maintenance later on anyway
(since every branch will have the builds enabled again by default)
…--
Adrian Schroeter
email: adrian@suse.de
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5
90409 Nürnberg
Germany
|
|
On Wednesday, 25 April 2018, 09:26:20 CEST, Dominique Leuenberger wrote:
> * there is no maintenance in Tumbleweed;
The hdf5 example was in the openSUSE:Leap:15.0 project.
You are right that openSUSE:Factory does not have the maintenance issue, but IMHO we should
not run different models there than in stable distros...
> * build disabled / wiped packages are always transient states in TW (delete requests, making sure not to impact /snapshot directly)
okay, different thing
> * at least in case of openssl, there was no multibuild involved
> * The issue is that ::import:* is not properly removed by wipebinaries (it is always -32bit stuff hanging back)
should not be the case, but I can test that thing easily ...
--
Adrian Schroeter
|
I can confirm that: if an _aggregate created ::import:*, and you then deactivate the package with the _aggregate and wipe the binaries, those binaries are still in :full, although the package shows that there are no binaries. The binaries also remain in :repo, so they will still be published. |
Just checked: the wipe is indeed keeping the imports on purpose. We have additional code to ensure this. (Again, that problem would not exist if you turned the package state into "excluded" instead of build disabled + manual events. The state would also be reproducible then.) |
Maybe an extended API, which can trigger the removal of those files?
There is a difference between what is disabled on Leap and what is disabled on TW, so maybe we have to split this; and to my knowledge, there is no 'external' way to change the build state to excluded, short of changing the .spec file. And even then: a package changing from succeeded to excluded leaves its binaries behind, no? So wipebinaries would still be necessary (and sharing sources between TW/SLE/Leap would become a pain if packagers have to set an excludearch: 586 on all Leap packages, except if there is a baselibs.conf)
All in all it is confusing that |
|
On Wednesday, 25 April 2018, 11:16:05 CEST, Michael Schroeder wrote:
> `wipebinaries` should wipe all binaries iff you wipe all architectures.
this is not the case here, and therefore the ::import::* filtering has an effect
--
Adrian Schroeter
|
In my test case, I switched off a package with the _aggregate and wiped all binaries for all archs. Afterwards the package shows as completely empty for this repo. But as said, the binaries are still in :full and in :repo. It is problematic that the binaries remain in :full: I cannot remove e.g. broken binaries that way; I have to remove them manually in :full and trigger a repo rescan. @mlschroe by "wipe all architectures", do you mean all packages or only this specific package? |
Adrian Schröter wrote:
> On Wednesday, 25 April 2018, 08:54:25 CEST, Dominique Leuenberger wrote:
> > wiping 'disabled packages' is invalid setup?
> yes, because disable flags are not doing it on its own, you must not forget multibuild
> instances like in hdf5 case and you cause anyway problems for
> maintenance later on.
Feel free to derive a feature request from this bug report that leads to obs implementing the right means to handle our use cases properly.
|
Martin: (I sure hope that you use the "new full handling", i.e. that you don't force the old handling via a new_full_handling = 0 entry in BSConfig.pm) |
Ludwig: I don't even know your use case. Why don't you wipe all architectures? But I think you do that, contrary to what Adrian said. Can you please confirm this? |
(And why do you think that disabling the build has something to do with wipe?) |
Ah, found it. Wipe is special as it ignores the build enable/disable flag, but the created export job doesn't do that. So the export is always dropped if the x86_64 arch is disabled. Fixing... |
Hmm, or maybe not... Digging deeper... |
Bug, feature, whatever we are calling it, the problem exists. The recent addition is a new method for setting packages as excluded to avoid the disabled blank. |
Bah. Wild guessing doesn't help at all. Please stop that. |
We are on different pages apparently. Having debugged this problem numerous times and found the stale data there is no doubt the problem exists. |
The thing is that we don't know what's going on and if there's a bug or not. And I don't see any stale entries for your hdf5 examples. |
❤️ please - we are trying to work together and find solutions. Reiterating the same points over and over leads nowhere. Fact is: the release team (and probably many other OBS users) have the 'disable' switch at hand to 'no longer build a package', but using this feature, which is exposed in the webui, results in what the OBS team defined as an 'invalid setup'. Anybody who disables the build of a package (where the sources should be kept but no longer built) that happens to have a baselibs.conf will run into the issue that the binaries are not completely removable using
Yes please. I'd love to find out what's going on. |
(And I never said it's an invalid setup. It just has some drawbacks that can be avoided with the new "onlybuild" feature.) Note that |
Oops, did not mean to reopen ;) |
Adrian did ok, so I set up
so, as the ticket is about: the -32bit packages were left behind and have not been wiped |
My reaction was to:
Which was in response to me finding another instance of the bad data looking to be cleaned up. This response again implies using a feature exposed by OBS is wrong which is just baffling, but anyway. The "invalid setup" is related to that same opinion expressed multiple times above and wasn't directed at you. You just stepped in and started saying what you didn't say which no one is disputing. lnussel went ahead and added the entries for the "workaround"/"feature" to the Leap 15.0 prjconf so we'll see how it works out. |
DimStar: thanks for setting up that demo project. What I see now is that the 32bit packages get removed from the package container but stay in the _repository tree. So I always looked at the wrong parts of the code: I thought it would be a bug in the 32bit export/import code, but now it seems to be in the _repository handling code. That completely surprises me; that part has been rock solid in the past. So this is absolutely a bug: at no point in time may the _repository tree get out of sync with the build tree. Pretty amazing and somewhat scary. |
Ok, found it. As suspected, this has nothing to do with the disabled flag. It's a bug in the wipe code: it leaves the imported rpms in place when calculating the _repository tree but later deletes them from the build tree. When the import event that is supposed to delete all 32bit packages is then received, it thinks that there's no work to do because the files are already gone. But they are still in the _repository tree. So the good news is that this only happens with wipe; normal obs operation is not affected. |
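The sequence described above can be illustrated with a small toy model. This is only a sketch of the ordering problem, not OBS code: the `Binary` type, the function names, and the package and binary names are all hypothetical, and the real trees live on disk rather than in Python sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    name: str
    imported: bool = False  # True for e.g. -32bit rpms imported from another arch


def repository_view(build_tree, skip_package=None, keep_imports=False):
    """Names the _repository tree would list, optionally ignoring one package."""
    names = set()
    for package, binaries in build_tree.items():
        for binary in binaries:
            if package == skip_package and not (keep_imports and binary.imported):
                continue
            names.add(binary.name)
    return names


def wipe(build_tree, package, buggy):
    """Wipe one package and return the recalculated _repository view."""
    # Step 1: recalculate the _repository tree. The buggy variant leaves
    # the imported rpms of the wiped package in place here.
    repo = repository_view(build_tree, skip_package=package, keep_imports=buggy)
    # Step 2: delete everything, imports included, from the build tree.
    build_tree[package] = set()
    return repo


def handle_import_event(build_tree, repo, package):
    """Drop imported binaries that the build tree still holds for the package."""
    still_present = {b.name for b in build_tree[package] if b.imported}
    return repo - still_present  # no work if the files are already gone


def make_tree():
    return {
        "openssl-1_1_0": {
            Binary("libopenssl1_1_0"),
            Binary("libopenssl1_1_0-32bit", imported=True),
        },
        "lmms": {Binary("lmms")},
    }


tree = make_tree()
repo = handle_import_event(tree, wipe(tree, "openssl-1_1_0", buggy=True), "openssl-1_1_0")
in_sync = repo == repository_view(tree)
```

With `buggy=True` the -32bit entry stays in `repo` while the build tree is empty for that package, so `in_sync` is false. With `buggy=False` both views agree, which is the invariant the _repository tree is supposed to keep.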
Fixed with commit e0427a2 |
(Btw obs can rebuild the _repository packages from the built tree. This will get rid of all stale entries. Just tell us the projects where we should do this.) ( |
Just for correctness: while there was an issue with wiping in the openssl case (but not in the hdf5 case), the build-disabled flag is still a bad/invalid/... setup for distributions. You still cause a lot of pain for others; it is maybe not the problem of the release team, but the problem of maintenance afterwards. Also, rebuilding of trees can only be done by OBS admins, so it is definitely a bad idea for the release team to rely on this (and using a completely different mechanism in Factory makes no sense either). PLEASE switch to the excluded state using only buildflags before release to avoid this. (And regarding being "baffled" about this: we have been saying this since the beginning, and we had a meeting about this setup a few months ago with release managers, so it should definitely not be surprising) |
A practical alternative has only existed for a few days. Not terribly useful to point out an undesirable approach without a workable alternative. All of that being a side-discussion to a valid bug which was fixed 2 days ago. I am baffled it was so hard to admit as much. Sure, the alternative approach has benefits, but the old one certainly wasn't "invalid". Just word games that dragged out a resolution. |
The hdf5 binaries are no longer present and the package sources were not updated (ie they were wiped out likely with the migration to excluded workflow). I appreciate the fix and alternative workflow so I won't have to debug this once a month. |
The problem is that single event emission never really guarantees a reproducible state. You will always need to debug what happens depending on the order of the events. Also, an inconsistency can happen again at any time due to external events. And you will never be sure that a staging (project linked) project will behave the same. (And it is also very time consuming on my side to debug such setups.) "Invalid" is a correct term when taking any external branches into account, as happens with maintenance. Every single maintenance incident needs additional manual care or you release unwanted binaries. Therefore this was and still is an invalid setup when you want to do proper maintenance. Yes, not your problem. |
Can you two please do me a favor and stop bickering about the setup for the 32bit packages in this issue? It has absolutely nothing to do with this bug and the stale binaries in the _repository tree. In fact, if you just disable the i586 arch and then do a 'osc wipebinaries --build-disabled' the 32bit packages will be erased like they should. You only get stale entries if you wipe the x86_64 architecture, and it does not matter at all if something is build disabled or not. You'll trigger the bug by wiping any package that has a baselibs setup, it's just that in most cases the 32bit packages will get rebuilt and so the stale entries will get replaced. Anyway, the bug is fixed. For the baselibs setup in leap/opensuse there's now the onlybuild feature that makes the setup a bit easier and less error prone. Everyone should be happy and the world's a better place. Please move on. |
On Monday, 30 April 2018, 09:32:58 CEST, Michael Schroeder wrote:
> Can you two please do me a favor and stop bickering about the setup for the 32bit packages in this issue? It has *absolutely nothing* to do with this bug and the stale binaries in the _repository tree.
> In fact, if you just disable the i586 arch and then do a 'osc wipebinaries --build-disabled' the 32bit packages will be erased like they should. You only get stale entries if you wipe the x86_64 architecture, and it does not matter at all if something is build disabled or not. You'll trigger the bug by wiping any package that has a baselibs setup, it's just that in most cases the 32bit packages will get rebuilt and so the stale entries will get replaced.
> Anyway, the bug is fixed. For the baselibs setup in leap/opensuse there's now the onlybuild feature that makes the setup a bit easier and less error prone. Everyone should be happy and the world's a better place. Please move on.
Sorry, I can't let you have the last word here, since it is important that we agree that
disabling via build flags plus manually created events from external scripts is not a correct setup,
for the reasons named. We must not repeat this.
It really caused lots of problems, and your fix only handles the openssl/import case.
The hdf5 one (and many other reports where I had to read log files for a long time) only got
fixed by moving away from build flag mechanics.
So please don't think that after this fix build disablement in stable distros is a valid
setup.
--
Adrian Schroeter
|
Literally my point. Not sure if my English is bad or what. @adrianschroeter If what you say is true, it would seem that removing the disable feature from OBS would be appropriate. I can only imagine how frustrating this must be for some poor soul who encounters this in their home project and does not have the understanding or resources to debug it. Otherwise, if fixed by @mlschroe, then again this discussion is about a different topic (i.e. maintenance workflow). |
…BS bugs. Related to openSUSE/open-build-service#4373 as disabling s390x leaves old binaries in repo-md while publishing new ones :(((((((.
Issue/Feature description
The first of the two should not exist. Only the rpm with `5_3-5` in the name should exist.
Has occurred at least three times including my original report.
Expected result
Published binaries properly reflect current build.
How to Reproduce
Further information
https://lists.opensuse.org/opensuse-buildservice/2017-08/msg00035.html