golang 1.14: mlock signal stack failed #21672

Closed
howardjohn opened this issue Feb 29, 2020 · 11 comments

@howardjohn
Member

This is a tracker bug for golang/go#37436. I see this when running istiod built with Go 1.14 on kind. Note that we do not yet build with Go 1.14 in CI.

@howardjohn howardjohn self-assigned this Feb 29, 2020
@howardjohn
Member Author

Apparently kernel 5.3.15+, 5.4.2+, or 5.5+ fixes this, but every cluster I have is on a much older version. Only Ubuntu 19.10+ has these.

@howardjohn
Member Author

Ah, actually it's fine on kernel 4.x. There is a small window, from 5.2 up to the versions I mentioned above, where it is broken.
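For anyone who wants to check a node against that window, here is a rough sketch in Go. It only encodes the ranges mentioned in this thread (5.2 up to 5.3.14, plus 5.4.0 and 5.4.1). The function names `affectedKernel` and `parseKernel` are made up for illustration, and distro kernels may carry backported fixes that a version check cannot see.

```go
package main

import "fmt"

// affectedKernel reports whether a Linux kernel version falls inside the
// window described in this thread: 5.2.x, 5.3.0 through 5.3.14, and
// 5.4.0 through 5.4.1. Kernels 4.x, 5.3.15+, 5.4.2+, and 5.5+ are
// treated as fine. Hypothetical helper, not part of Go or Istio.
func affectedKernel(major, minor, patch int) bool {
	if major != 5 {
		return false
	}
	switch minor {
	case 2:
		return true
	case 3:
		return patch < 15
	case 4:
		return patch < 2
	default:
		return false
	}
}

// parseKernel extracts major.minor.patch from a release string such as
// "5.3.0-40-generic" (the format printed by `uname -r`). Anything after
// the patch number is ignored.
func parseKernel(release string) (major, minor, patch int, err error) {
	_, err = fmt.Sscanf(release, "%d.%d.%d", &major, &minor, &patch)
	return major, minor, patch, err
}

func main() {
	// Sample release strings, chosen only for illustration.
	for _, rel := range []string{"4.19.0", "5.3.0-40-generic", "5.4.2", "5.5.0"} {
		major, minor, patch, err := parseKernel(rel)
		if err != nil {
			fmt.Printf("%s: cannot parse: %v\n", rel, err)
			continue
		}
		fmt.Printf("%s: affected=%v\n", rel, affectedKernel(major, minor, patch))
	}
}
```

A plain version comparison like this is only a first filter. It cannot tell whether a vendor kernel in the affected range has already been patched.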

@neelance

neelance commented Mar 1, 2020

@howardjohn Just for clarification: What do you mean by "it" in "it is broken"? Go?

@howardjohn
Member Author

It's actually a kernel bug (more details on the linked issue). Before, in Go 1.13, it silently did weird things, and in 1.14 it panics.

@neelance

neelance commented Mar 1, 2020

I am already involved with the issue. I am still unclear on how much the issue is seen as a regression, that's why I wondered if you perceive Go 1.14 as "broken".

@howardjohn
Member Author

When I upgrade to Go 1.14, my application crashes every single minute in some instances, which is basically a non-starter for updating. It may have been silently broken on 1.13 as well, but now it barely even runs.

@howardjohn
Member Author

So I don't necessarily perceive Go as "broken", but I don't see how we can upgrade to it right now, given that we cannot control the users' ulimits, cannot control the users' kernel version, and cannot ship an application that crashes constantly.

@neelance

neelance commented Mar 1, 2020

Yes, this seems worse. However, on an unpatched kernel this is seen as "works as intended". Question is: Is your kernel patched or not?

@howardjohn
Member Author

My kernel on my personal machine is not patched. But it doesn't really matter what my machine has: we ship Istio to 1000s of users. I cannot control what machine they run it on, and I would love to not have to restrict that to some subset of kernel versions.

@nberlee
Contributor

nberlee commented Mar 19, 2020

This can be temporarily fixed by adding the following to the discovery container deployment YAML:

        securityContext:
          capabilities:
            add:
            - IPC_LOCK
            drop:
            - ALL
          runAsUser: 0

Because istio-pilot discovery runs as non-root, it cannot raise its ulimit. Running as root while dropping all capabilities except IPC_LOCK works as a quick fix.
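To see which locked-memory limit a process actually runs with, a small Go sketch like the one below may help. It is not taken from Istio. The helper name `memlockLimit` is made up, and the resource number 8 is an assumption: it is the RLIMIT_MEMLOCK value on common Linux architectures such as amd64 and arm64, hard-coded here because, as far as I know, the standard `syscall` package does not export that constant on Linux.

```go
package main

import (
	"fmt"
	"syscall"
)

// rlimitMemlock is assumed to be the Linux resource number for
// RLIMIT_MEMLOCK (8 on amd64 and arm64). It is hard-coded as an
// assumption; other platforms may use a different number.
const rlimitMemlock = 8

// memlockLimit returns the soft and hard locked-memory limits, in bytes,
// for the current process. Hypothetical helper for illustration.
func memlockLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err = syscall.Getrlimit(rlimitMemlock, &lim); err != nil {
		return 0, 0, err
	}
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := memlockLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("memlock limit: soft=%d bytes, hard=%d bytes\n", soft, hard)
}
```

Running this inside the container before and after applying the securityContext change gives a quick view of the limit the process sees. Note that a process holding IPC_LOCK is not bound by this limit, so the numbers may look the same even when the workaround is effective.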

@howardjohn
Member Author

Also "fixed" in 1.14.1, it will no longer crash the deployment. I think this is safe to close now


4 participants