Release quality 17.09 #31045

Closed
mkocha2 opened this issue Oct 31, 2017 · 14 comments

Comments

@mkocha2

mkocha2 commented Oct 31, 2017

At NixCon 2017 it appeared as if the release managers thought their release went well. Via this message I would like to provide some data to the contrary.

I tried to upgrade two different machines to 17.09 and neither worked without reconfiguration or without removing features. As such I am still stuck at 17.03 on both and haven't bothered upgrading other machines.

For example:

  • Skype doesn't work anymore (broken download)
  • Flash failed (has been fixed, but by the time 10 people have discovered it, you already failed)
  • NFS with autofs or systemd doesn't work due to a missing symbol (open for months)

It is great that doing a rollback was possible (although it does restart the network connection, which would result in some downtime on production infrastructure), but as far as I am concerned no working 17.09 release with feature parity from 17.03 has been released.

I am also not happy with the discontinued support for 32-bit code when other distributions still support newer 32-bit code. I am not sure whether this is specific to 17.09.

There is only a small number of critical packages that tens of millions of people use, of which Skype and Flash are examples. NFS is used by a lot of businesses and as such should also be considered important. I don't understand how one can make a release without systematically creating tests based on e.g. Debian's popcon download measures to see whether a package still works. If a release has any QA done on it, I have missed it.

What is the point of tagging some git version as a release when the QA on it is non-existent?

@vcunat
Member

vcunat commented Oct 31, 2017

Flash and Skype in Debian's popcon? :-) I must say our QA for unfree packages is inherently worse, just because of the policy not to allow building such packages on the build farm.

@vcunat
Member

vcunat commented Oct 31, 2017

> If a release has any QA done on it, I have missed it.

To remain positive, QA done specifically for the release: #28643. A critical test-set is checked before every channel bump, too, though it would certainly be nice to have more tests. You may have noticed a couple of proposals around tests at NixCon 2017.

@grahamc
Member

grahamc commented Nov 1, 2017

Hi there!

Yes, indeed, we are quite proud of our release. We merged thousands of
pull requests, addressed many many issues, added lots of services and
packages, and included many security updates. Our community has also
grown quite a lot, and we are proud and excited by the growth and
progress of NixOS.

You've had a less good experience, and that sucks. It seems you fall
into somewhat less tested areas of NixOS, and that is certain to
expose you to sharper corners and more broken things. I'm sorry you
did! Please try upgrading your remaining machines, as 17.03 is no
longer supported. We'd rather dedicate everyone's efforts to making
17.09 work sufficiently well.

It seems that two of your issues are with unfree software, which NixOS
doesn't officially test in any capacity. This is by policy, so any
sort of testing on these packages will have to be by volunteer
contributors on their own time and hardware.

If you'd like to help with this, I'd be happy to work with you to help
set something up.

Your third issue about NFS:

> NFS with autofs or systemd doesn't work due to a missing symbol
> (open for months)

Luckily, it seems one of our volunteer contributors has a patch! Maybe
you could try it out on your system, and reply to the PR:
#31038

> It is great that doing a rollback was possible (although it does
> restart the network connection, which would result in some downtime
> on production infrastructure)

Interesting! I don't see an issue about this. Can you open one? When I
call nixos-rebuild switch, I don't have this issue ... Hmm...

> I am also not happy with the discontinued support of 32 bits code

Unfortunately NixOS is a small distribution without substantial
corporate backing. We still support and build some software for i686,
but as you say, we no longer support entire i686 systems. It was a
very difficult choice, as we didn't want to leave users without
updates, but we believe it was worth the decision. x86_64 has been
available for 17 years now and covers almost all of the modern
hardware. Dropping i686 support has significantly improved our ability
to test and release NixOS. What were you using i686 for?

Regarding NFS: We have automatic testing of NFS 3 and NFS 4, which
pass:

These tests automatically create servers and clients and run thorough
tests to ensure our NFS support works. I think that is pretty good,
and pretty cool! However, it doesn't cover the autofs case. Perhaps
we should add that to the test? Would you like to send a PR adding it?
If so, I'd be happy to do an IRC chat or video call to help you.

Maybe you're not familiar with the extensive, innovative automatic VM
testing we already do? I think it is pretty cool, and many distros
don't have as robust a test framework as we do.

Please remember that NixOS is operated by a very wonderful group of
volunteers, and your negativity isn't welcome.

If you would like to learn about our release processes, we'd be happy
to show and teach you.

If you would like to learn about our QA process, we'd be happy to show
and teach you.

If you would like to become a contributor and help scratch your own
itches, make NixOS as good as it can be for your use cases, we'd be
happy to show and teach you.

If you would like to contribute enough money to hire a team of full
time people to work on and support NixOS, we'd be happy to work with
you.

Thank you,
Graham Christensen

@globin
Member

globin commented Nov 1, 2017

I'm pretty sure I cannot add much to @grahamc's awesome response, except to say we're sorry that you have had issues. I want to emphasize that we try to deliver the best experience possible, but sadly we don't have infinite resources, and far fewer than Debian etc.

c0bw3b pushed a commit to c0bw3b/nixpkgs that referenced this issue Nov 1, 2017
@fpletz
Member

fpletz commented Nov 1, 2017

Updated the nixpkgs manual and the wiki that we cannot test or build unfree packages.

fpletz added a commit that referenced this issue Nov 1, 2017
Resolves confusion mentioned in #31045.

(cherry picked from commit e32352f)
@mkocha2
Author

mkocha2 commented Nov 9, 2017

@grahamc

I upgraded one server to 17.09. Another system upgrade broke idempotency.

That is, nixos-rebuild switch;nixos-rebuild switch != nixos-rebuild switch. It is my understanding that the NixOS organization claims to achieve the opposite.
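The property being claimed can be written down as a small check. This is only a sketch, not something from the thread: `switch_ok` and `switch_flaky` are hypothetical stand-ins for `nixos-rebuild switch`, and the system state is modelled as a plain set of running service names.

```python
from typing import Callable, FrozenSet

State = FrozenSet[str]


def is_idempotent_on(apply: Callable[[State], State], start: State) -> bool:
    """Return True if applying `apply` twice yields the same state as applying it once."""
    once = apply(start)
    twice = apply(once)
    return once == twice


def switch_ok(state: State) -> State:
    # Hypothetical well-behaved switch: every declared service ends up running.
    return state | {"sshd", "nfs-client"}


def switch_flaky(state: State) -> State:
    # Hypothetical faulty switch: one service only comes up on a later run,
    # once an earlier run has left a marker behind.
    if "first-run-done" in state:
        return state | {"sshd", "nfs-client", "first-run-done"}
    return state | {"sshd", "first-run-done"}


print(is_idempotent_on(switch_ok, frozenset()))
print(is_idempotent_on(switch_flaky, frozenset()))
```

In this toy model the second function behaves like the upgrade described here: the first run leaves a service down, and only the second run reaches the intended state.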

On the same system where idempotency was broken, shell fonts are broken too. So, I now get to enjoy some terrible slanted font. So, now I get to spend time on figuring out how to undo the incompetence of someone responsible for the release. Things don't break by themselves. It seems it is #31294.

Flash video/audio in Chrome also doesn't work anymore.

I switched back to 17.03 again to have a working system.

The i686 device is an end-user laptop. It only needs to run a browser (I don't care which one, as long as it is graphical) and an ssh client.

Regarding your tests, I took a look at one of your links and found a build with a green checkmark ( https://hydra.nixos.org/build/63781841#tabs-summary ) whose log contained:

server# [   27.174097] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
server# [   27.179101]  ffffa155d76039a8 ffffffff90af7b12 0000000000000000 0000000000000000
server# [   27.180555]  ffffa155d76039e8 ffffffff9086f3ab 000001ec90c90135 ffffa155d48c5800
server# [   27.182066]  ffffa155d3d5d800 ffffa155d3d5e888 ffffa155d3c1da00 0000000000000000
server# [   27.183307] Call Trace:
client1# [   12.469497] dhcpcd[645]: Failed to try-reload-or-restart ntpd.service: Unit ntpd.service not found.
server# [   27.183857]  <IRQ> [   27.184156]  [<ffffffff90af7b12>] dump_stack+0x63/0x81
server# [   27.185127]  [<ffffffff9086f3ab>] __warn+0xcb/0xf0
server# [   27.185811]  [<ffffffff9086f49d>] warn_slowpath_null+0x1d/0x20
server# [   27.186968]  [<ffffffff908f8b13>] cgroup_get+0x53/0x60
server# [   27.187702]  [<ffffffff908ff080>] cgroup_sk_alloc+0x50/0xe0
server# [   27.188316]  [<ffffffff90c935fe>] sk_clone_lock+0x2fe/0x3c0
server# [   27.189258]  [<ffffffff90cf74f6>] inet_csk_clone_lock+0x16/0xf0
server# [   27.190270]  [<ffffffff90d12dc3>] tcp_create_openreq_child+0x23/0x4c0
client1# [   12.475306] nscd[713]: 713 monitoring file `/etc/passwd` (1)
server# [   27.191522]  [<ffffffff90d11667>] tcp_v4_syn_recv_sock+0x47/0x350
server# [   27.192466]  [<ffffffff90d13acd>] tcp_check_req+0x35d/0x4f0
server# [   27.193388]  [<ffffffff90d60fb4>] ? tcp_v4_inbound_md5_hash+0x63/0x18a
server# [   27.195173]  [<ffffffff90d12863>] tcp_v4_rcv+0x513/0xa00
server# [   27.196059]  [<ffffffff90ce9f64>] ? ip_route_input_noref+0xbb4/0xe70
server# [   27.197068]  [<ffffffff90cebb36>] ip_local_deliver_finish+0x96/0x1c0
server# [   27.198072]  [<ffffffff90cebe10>] ip_local_deliver+0x60/0xd0
server# [   27.198887]  [<ffffffff90ceb7c8>] ip_rcv_finish+0x118/0x3f0
server# [   27.199836]  [<ffffffff90cec0e3>] ip_rcv+0x263/0x370
server# [   27.201042]  [<ffffffff90d213eb>] ? arp_rcv+0x10b/0x1a0
server# [   27.202384]  [<ffffffff90caadc5>] __netif_receive_skb_core+0x505/0xa30
server# [   27.203493]  [<ffffffff90d17fbc>] ? tcp4_gro_receive+0x11c/0x1c0
client1# [   12.488737] nscd[713]: 713 monitoring directory `/etc` (2)
server# [   27.204791]  [<ffffffff90d27d72>] ? inet_gro_receive+0x202/0x2a0
server# [   27.205962]  [<ffffffff90cab308>] __netif_receive_skb+0x18/0x60
server# [   27.206862]  [<ffffffff90cab373>] netif_receive_skb_internal+0x23/0x80
server# [   27.207986]  [<ffffffff90cac132>] napi_gro_receive+0xc2/0xe0
server# [   27.208896]  [<ffffffffc04310fd>] virtnet_receive+0x23d/0x870 [virtio_net]
server# [   27.210171]  [<ffffffffc043174d>] virtnet_poll+0x1d/0x80 [virtio_net]
server# [   27.211394]  [<ffffffff90cabb66>] net_rx_action+0x216/0x350
server# [   27.212362]  [<ffffffffc03c5504>] ? vring_interrupt+0x34/0x80 [virtio_ring]
server# [   27.213595]  [<ffffffff90d6ca54>] __do_softirq+0x104/0x28c
server# [   27.214306]  [<ffffffff90875466>] irq_exit+0xb6/0xc0
server# [   27.214981]  [<ffffffff90d6c7a4>] do_IRQ+0x54/0xd0
server# [   27.215702]  [<ffffffff90d6a882>] common_interrupt+0x82/0x82
client1# [   12.500208] dhcpcd[645]: Failed to try-reload-or-restart openntpd.service: Unit openntpd.service not found.
server# [   27.216492]  <EOI> [   27.217018]  [<ffffffff90d69746>] ? native_safe_halt+0x6/0x10
server# [   27.217974]  [<ffffffff90d69480>] default_idle+0x20/0xd0
server# [   27.218942]  [<ffffffff9082f99f>] arch_cpu_idle+0xf/0x20
server# [   27.219919]  [<ffffffff90d69893>] default_idle_call+0x23/0x30
server# [   27.220955]  [<ffffffff908adbf1>] cpu_startup_entry+0x1c1/0x230
server# [   27.222225]  [<ffffffff90d627f7>] rest_init+0x77/0x80
server# [   27.223298]  [<ffffffff91111f71>] start_kernel+0x42c/0x439
server# [   27.224237]  [<ffffffff91111120>] ? early_idt_handler_array+0x120/0x120
server# [   27.225696]  [<ffffffff911112ca>] x86_64_start_reservations+0x24/0x26

To me that says that the QA on the tests themselves is lacking. One of the prerequisites for testing an application is a working system. That is already not working here, so at that point I stopped reading output.

While everyone hearts your reaction, this self-congratulatory attitude seems overly positive. Perhaps I have different metrics regarding success, though. For example, you count thousands of PRs being merged as an achievement. I count the work of e.g. Ericson2314 as one feature of interest (if it works) despite it being spread out over many PRs. As a leadership principle, undesired behavior should not be encouraged. By celebrating a meaningless metric like the number of merged updates, you only achieve tired contributors (because pressing a merge button could not possibly be more boring), who at some point will be bored and stop.

When people dig a tunnel with a teaspoon you are the one cheering to the team, while I am there thinking: why don't they call 3M?

Your message about my "negativity not being welcome" seems to be straight from SJW hell. While Nix and NixOS do some things pretty good, there is also a lot of stuff which is terrible. Implying you only want "happy thoughts" essentially is not leadership, it's a culture suited for the gulags.

When someone throws shit in your face, are you also going to describe it as a warm welcome gift?

I am not sure whether this is an American thing, but I can assure you that it is not normal to take criticism this badly.

In short, your message is too passive-aggressive to deserve a heart. In the future, stick to the facts please. The facts are that 17.09 was not ready for use, even today.

@fpletz You are allowed to test and build unfree packages in a lot of cases. Redistributing the resulting builds just is not always allowed. But distributing the recipe for building something and claiming you have followed that recipe and ended up with something that works is perfectly fine. There are a few database vendors which say that you cannot publish benchmarks, but simply saying that you cannot do it, is just lying. Perhaps you don't want to do it or you have no interest in it. Perhaps there is nobody in the entire community interested in it.

That still doesn't mean that it cannot be done.

So, building a piece of software and then not distributing it to entities who are not a member of the NixOS organisation is perfectly legal.

This particular contribution to the manual is of negative value, because it doesn't reference any policy and provides no legal reason. As such, you have just inflated the size of the manual prematurely and someone else (me in this case) needs to point out the fact that you need to redo it again.

@fpletz
Member

fpletz commented Nov 9, 2017

> Upgrade broke idempotency

Could you please elaborate what broke for you? Did you open an issue for that? Claiming something broke without mentioning what or any further explanation is not polite.

Edit: @mkocha2 edited his comment to include some more information.

> The facts are that 17.09 was not ready for use, even today.

Why? What are the issues that should be fixed in your opinion?

You can't expect everything to be bug-free, even with extensive testing in place. Just look at other community-driven distributions for a reference. Errors happen, may they be in the code or in its tests.

nfs4 NixOS Test

This could be a bug in the test that nobody noticed before. We should fix that but that doesn't mean that all NixOS tests and their QA are inherently inadequate. Even though the kernel printed a stack trace the VM seemed to work fine. I can't pinpoint an actual cause for the stack trace after a quick look.

Unfree packages

Our CI system is not able to build packages without pushing it to the binary cache at cache.nixos.org. This would require some code changes. You're free to implement this. As nobody has done this yet or opened an issue that I know of there doesn't seem to be much interest in it.

I don't agree that the manual change is useless and that I am lying. It just clarifies that we can't support unfree software unless we look at each license/EULA on a case-by-case basis ("most"). This is the legal reason. If you have a suggestion to improve the wording, please do send it to me or open a PR.

You're also welcome to help to add license abstractions for unfree software that we can build, distribute or run. We're just putting everything in the unfree category so we don't have to deal with that.

Personally, I'm not interested in improving the unfree packages situation in my spare time and any help is appreciated.

@vcunat
Member

vcunat commented Nov 9, 2017

Unfree packages: I don't see it as being that much about legal reasons but about current policy for the build farm – it disallows building "unfree" packages and now it's explicitly written down in docs. Note that many distributions don't even allow them into the main repository. I'm personally not too motivated by them, but if enough people are interested, surely they can put together a Hydra instance doing the additional unfree tests (and write them first, actually :).

Negativity: I'm personally not at all against criticizing etc. but it seems only useful if it's done in a "productive way", i.e. focused on fixing those problems. For example, you claimed that idempotency got broken but you provided almost no information about it (so far).

@mkocha2
Author

mkocha2 commented Nov 12, 2017

@vcunat It's not that I don't want to provide information. The available logging information was limited. I got a "Warning" that a service didn't restart correctly, but running nixos-rebuild switch directly afterwards did succeed. I looked into journalctl and journalctl -xe and didn't see the details that were needed for a quick resolution.

I wouldn't mind running a Hydra instance for a limited number of packages, but not as long as I can't just copy paste something complete in my configuration.nix.

@fpletz The font issue mentioned is the most user-visible problem and that's 100% open-source code. If there is an intention to make things better, one could set up a canary machine that compares pixels for various terminal emulators as well as e-mail programs across different versions when rendering 'the quick brown fox...'.

The reason this would be a reasonable approach is that desirable font changes for a given machine for the same configuration are extremely rare, while searching for "broken fonts" on a search engine returns many results. It is a recurring problem resulting from a lack of QA processes.

Implementing something like this could be done on various levels, but starting a terminal emulator, waiting for the window to exist with e.g. wmctrl, taking an automated screenshot (with scrot) and comparing the image to the reference image for equality (that could be a call to diff) seems to be in the realm of the possible. Once that is in place, the same tests could be run on virtual graphics devices and even shipped to end users. End users could opt in to the test program, where differences would result in an automatic rollback. That way they don't even need to report that their NVIDIA GTX1070 with sprinkles has a problem. Instead, they would just upgrade the next week, possibly automatically, and they would never have seen that there was a problem in the first place.
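A rough sketch of those steps in Python, under several assumptions: `wmctrl` and `scrot` are installed and an X session is running, the function names are made up for illustration, and a byte-for-byte file comparison (the equivalent of a call to diff) is strict enough for the purpose. Nothing here is part of the NixOS test framework.

```python
import filecmp
import subprocess
import time
from typing import Callable, List, Sequence


def list_window_titles() -> List[str]:
    """Return window titles as reported by `wmctrl -l` (columns: id, desktop, host, title)."""
    out = subprocess.run(
        ["wmctrl", "-l"], capture_output=True, text=True, check=True
    ).stdout
    titles = []
    for line in out.splitlines():
        parts = line.split(None, 3)
        if len(parts) == 4:
            titles.append(parts[3])
    return titles


def wait_for_window(
    title_part: str,
    lister: Callable[[], List[str]] = list_window_titles,
    timeout: float = 10.0,
    poll: float = 0.2,
) -> bool:
    """Poll until a window whose title contains `title_part` exists, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if any(title_part in title for title in lister()):
            return True
        time.sleep(poll)
    return False


def take_screenshot(path: str) -> None:
    # Assumes `path` does not exist yet; scrot writes the screenshot there.
    subprocess.run(["scrot", path], check=True)


def matches_reference(shot: str, reference: str) -> bool:
    # shallow=False forces a comparison of file contents, not just metadata.
    return filecmp.cmp(shot, reference, shallow=False)


def check_terminal(
    command: Sequence[str], title_part: str, shot: str, reference: str
) -> bool:
    """Start a terminal emulator, screenshot it, and compare against the reference."""
    proc = subprocess.Popen(command)
    try:
        if not wait_for_window(title_part):
            return False
        take_screenshot(shot)
        return matches_reference(shot, reference)
    finally:
        proc.terminate()
```

A call might look like `check_terminal(["xterm"], "xterm", "/tmp/shot.png", "reference.png")`, where both paths and the choice of xterm are placeholders. The `lister` parameter exists so the waiting logic can be exercised without a display.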

I agree that it would be better if Microsoft would maintain a Nix expression for Skype, but since they don't do that... If there was an open-source alternative for Skype, I wouldn't mind to share those with my contacts, but the last time I tried to use one, it didn't end well.

Regarding the legal issue: You should first establish that there is even a single EULA on this planet which says that you are not allowed to talk about the mere fact that a particular Nix expression accomplishes something when installed on your infrastructure.

You can write a whole book about doing highly illegal things.

In fact, in the case of Nix it's a mathematical theorem:

Given a complex state machine (a computer), and then executing a particular program will result in a particular bit being set to 1. There is no way that's illegal.

A less contrived argument is that it falls under interoperability laws and you are perfectly allowed to reverse-engineer even Oracle software for such work.

I am fairly sure that Oracle would be happy to see its software work on NixOS out of the box with an integrated payment system so they can more easily take your money. It's just that NixOS is so incredibly insignificant right now that merely looking at this website won't have a positive ROI for them.

Doing much more than this, like running actual benchmarks and publishing those is against the EULAs of some companies, but that's a completely separate topic.

So, my point stands. Legally, your arguments are not sound. If you don't want to do work for free, that's an understandable position, but don't make up legal arguments based on nothing.

Just because something is community driven, doesn't mean quality has to suffer. Comparing yourself to other distributions doesn't lead to a significantly better system. The reason NixOS exists, is because all the other distributions were designed badly. The other systems cannot and will never work.

@vcunat
Member

vcunat commented Nov 12, 2017

> one could setup a canary machine that compares pixels

I don't think that will work reliably without repeated maintenance. There isn't one way to render fonts, and freetype+fontconfig do change the (default) rendering a bit, from time to time. Still, if someone was manually checking it whenever it changed and somehow was building up a database of acceptable renderings... it might work well and it would be better QA, yes. (It's not high on my own priority list, but why not have them.)
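The "database of acceptable renderings" idea could look roughly like the following. It is a minimal sketch with hypothetical names (`RenderingBaseline`, `accept`, `check`), assuming screenshots are identified by a SHA-256 digest of their bytes and that a person approves each new rendering by hand.

```python
import hashlib
from typing import Dict, Set


def digest(image_bytes: bytes) -> str:
    """Identify a rendering by the SHA-256 digest of its raw bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


class RenderingBaseline:
    """Keeps, per test name, the set of renderings a human has approved."""

    def __init__(self) -> None:
        self._accepted: Dict[str, Set[str]] = {}

    def accept(self, test_name: str, image_bytes: bytes) -> None:
        # Called after someone has looked at the screenshot and judged it fine.
        self._accepted.setdefault(test_name, set()).add(digest(image_bytes))

    def check(self, test_name: str, image_bytes: bytes) -> bool:
        # True only if this exact rendering was approved before.
        return digest(image_bytes) in self._accepted.get(test_name, set())


baseline = RenderingBaseline()
baseline.accept("xterm-pangram", b"rendering from freetype version A")
baseline.accept("xterm-pangram", b"rendering from freetype version B")
print(baseline.check("xterm-pangram", b"rendering from freetype version B"))
print(baseline.check("xterm-pangram", b"slanted font rendering"))
```

Allowing several approved digests per test is what absorbs harmless default-rendering changes; a failing check then means "someone needs to look at this", not necessarily "this is broken".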

BTW, we do have some VM tests with OCR, screenshotting, etc. The VM tests aren't very resource-efficient, and some are planned to be migrated to containers, but everyone is welcome to write more tests and submit in PRs. I'm certain we can accommodate reasonable ones on Hydra (for "free" SW at least).

@fpletz
Member

fpletz commented Nov 14, 2017

This discussion is not leading to anything productive. Therefore I'm closing this issue.

The other comment was edited to include the "incompetence" of someone responsible for the release. This language is not acceptable. You're just discrediting the hard work of our community without having contributed anything yourself. Please help instead of insulting people who want to help you. 💩

I do not understand your arguments about the legal issue and will not waste any time discussing this with you. Please open a PR with alternative wording or shut up about this.

@fpletz fpletz closed this as completed Nov 14, 2017
@NixOS NixOS locked and limited conversation to collaborators Nov 14, 2017
@NixOS NixOS unlocked this conversation Nov 14, 2017
@mkocha2
Author

mkocha2 commented Nov 18, 2017

@fpletz You are lying again, since it has actually led to something productive in the referenced issues. As such you are essentially discrediting me, while in fact I did make a useful contribution, albeit not in code, but with a comment to the upstream author of the package causing the issue. Clearly, that has been useful.

If you lack the mental capabilities to understand a legal argument, you should delegate decisions to others who do.

Criticism is discrediting "the hard work" and can optionally include mentioning things that went well, which is exactly what I did. The fact that you haven't learned the concept of criticism as a grown man must make your life difficult.

You are the one wasting the time of the community with your lies and the fact that the released software broke basic features. If you can't be bothered to run a couple of terminal emulators in all the popular environments (there are about 4 or so), why call yourself "responsible" for the release? If you want to receive credit for a quality release, make a quality release and you will be credited in history with this fact. The flip side is that when you don't, you get the blame and effectively negative credits.

If the person responsible for this isn't incompetent, then the person responsible is not willing to make it work, which would be even worse. If you are going for the "we don't have the resources angle", then don't make a release at all.

Closing an issue for accurately describing a release, because you disagree and you can is nothing short of what Bashar al-Assad would do. With your policy, you should just add an EULA to NixOS stating that one cannot write a review or publish a bug report about NixOS. Perhaps ask Oracle's legal team for advice on such matters.

I sincerely hope that you will learn to take criticism in the future. In a corporation, this issue would be reopened, until all the dependent issues have been solved. Since no such reference has been made, I believe this is not the case. As such the only rational course of action is to reopen this issue. I will close it when the referenced issues have all been resolved.

If I can upgrade to 18.03 in some months without issues I will press the thumbs up button to credit the merit of whoever is responsible for the 18.03 release if someone else makes a review of the 18.03 release.

In short, please stop pissing me off, especially since I am very limited in the amount of time I can spend here. A community doesn't grow when you piss people off.

@grahamc
Member

grahamc commented Nov 19, 2017

As noted, this isn't an appropriate way to engage with the NixOS community. I invite you to not participate in communities you think are full of liars and incompetent people. Maybe a different distribution has more competent members that will leave you satisfied.

@NixOS NixOS locked and limited conversation to collaborators Nov 19, 2017
@domenkozar
Member

I wanted to write a response about how we treat people regardless of how much they contribute to the NixOS community, but I feel we've gone through this so many times that I just blocked @mkocha2.

You're still welcome to use our free software, but contributing harsh words is not welcome.


6 participants