
Support Posix Shared Memory across containers in a pod #28272

Open · CsatariGergely opened this issue Jun 30, 2016 · 64 comments
@CsatariGergely CsatariGergely commented Jun 30, 2016

Docker implemented a modifiable shm size (see 1) in version 1.9. It should be possible to define the shm size of a pod in the API, and Kubernetes should pass this information on to Docker.

@Random-Liu Random-Liu commented Jun 30, 2016

Also ref #24588 (comment), in which we also discussed whether we should expose shmsize in pod configuration.

@janosi janosi commented Jun 30, 2016

I am not sure I can see a discussion in that issue about exposing ShmSize on the Kubernetes API :( As I understand it, that discussion is about how to use the Docker API after it introduced the ShmSize attribute.

@Random-Liu Random-Liu commented Jun 30, 2016

> I would like kube to set an explicit default ShmSize using the option 1 proposed by @Random-Liu and I wonder if we should look to expose ShmSize as a per container option in the future.

I should say "in which we also mentioned whether we should expose shmsize in container configuration."

@janosi janosi commented Jun 30, 2016

@Random-Liu All right, thank you! I missed that point.

@dims dims commented Jul 8, 2016

@janosi @CsatariGergely - is the 64m default not enough? What would be the best way to make it configurable for your use? (Pass a parameter on the kubelet command line?)

@janosi janosi commented Jul 11, 2016

@dims Or maybe it is too much to waste? ;)
But yes, sometimes 64m is not enough.
We would prefer a new optional attribute for the pod in PodSpec in the API, e.g. "shmSize".
As shm is shared among containers in the pod, PodSpec would be the appropriate place, I think.

@pwittrock pwittrock removed the team/ux label Jul 18, 2016
@janosi janosi commented Sep 2, 2016

We have a chance to work on this issue now. I would like to align on the design before implementing it in code. Your comments are welcome!

The change to the versioned API would be a new field in type PodSpec:

// Optional: Docker "--shm-size" support. Defines the size of /dev/shm in a Docker-managed container.
// If not defined here, Docker uses a default value.
// Cannot be updated.
ShmSize *resource.Quantity `json:"shmSize,omitempty"`
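
For illustration, a hypothetical pod manifest using this proposed field might look like the sketch below (the field name and the 1Gi value are examples only; this is a proposal, not part of the released API):

apiVersion: v1
kind: Pod
metadata:
  name: shm-demo
spec:
  shmSize: 1Gi                 # proposed pod-level field from this issue; hypothetical
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]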
@ddysher ddysher commented Sep 21, 2016

@janosi Did you have a patch for this? We currently hit this issue running a database on k8s and would like to have the shm size configurable.

@janosi janosi commented Sep 22, 2016

@ddysher We are working on it. We will send the PRs in the next few weeks.

@wstrange wstrange commented Oct 5, 2016

Just want to chime in that we are hitting this problem as well.

@gjcarneiro gjcarneiro commented Nov 9, 2016

Hi, is there any known workaround for this problem? I need to increase the shmem size to at least 2GB, and I have no idea how.

@janosi janosi commented Nov 11, 2016

@ddysher @wstrange @gjcarneiro Please share your use cases with @vishh and @derekwaynecarr on pull request #34928. They have concerns about extending the API with this shmSize option, and they have different solution proposals. They would like to understand whether users really require this in the API, or whether the shm size could be adjusted by k8s automatically to some calculated value.

@gjcarneiro gjcarneiro commented Nov 11, 2016

My use case is a big shared-memory database, typically on the order of 1 GiB, but we usually reserve 3 GiB of shared memory space in case it grows. This data is constantly being updated by a writer (one process) and must be made available to readers (other processes). We previously tried a Redis server for this, but the performance of that solution was not great, so shared memory it is.

My current workaround is to (1) mount a tmpfs volume at /dev/shm, as in this OpenShift article, and (2) make the writer and reader processes all run in the same container.
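
For reference, a minimal sketch of workaround (1) as a pod spec (names and image are placeholders; the writer and reader both run in the single container, per workaround (2)):

apiVersion: v1
kind: Pod
metadata:
  name: shm-workaround
spec:
  containers:
    - name: app                  # writer and reader processes both run in this container
      image: my-shm-app          # placeholder image
      volumeMounts:
        - name: dshm
          mountPath: /dev/shm
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory           # tmpfs-backed emptyDir replaces the default 64Mi /dev/shm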

@wstrange wstrange commented Nov 11, 2016

My use case is an Apache policy agent plugin that allocates a very large (2GB) cache. I worked around it by setting a very low shm value. This is OK for development, but I need a solution for production.

Adjusting shm size dynamically seems tricky. From my perspective, declaring it as a container resource would be fine.

@ddysher ddysher commented Nov 11, 2016

My use case is running a database application on top of Kubernetes that needs at least 2 GB of shared memory. Right now we just set a large default; it would be nice to have a configurable option.

@vishh vishh commented Nov 11, 2016

@ddysher @wstrange @gjcarneiro Do your applications dynamically adjust their behavior based on the shm size? Will they be able to function if the default size is >= the pod's memory limit?

@wstrange wstrange commented Nov 11, 2016

The shm size is configurable only when the application starts (i.e., you can say "only use this much shm").

It cannot be adjusted dynamically.

@vishh vishh commented Nov 11, 2016

@wstrange Thanks for clarifying.

@ddysher ddysher commented Nov 13, 2016

@vishh We have the same case as @wstrange. The shm size doesn't need to be adjusted dynamically.

@gjcarneiro gjcarneiro commented Nov 13, 2016

Same for me; the shm size is a constant in a configuration file.

@vishh vishh commented Nov 14, 2016

Great. In that case, the kubelet can set the default size of /dev/shm to the pod's memory limit. Apps will have to be configured to use a value for shm that is less than the pod's memory limit.
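
To illustrate, assuming the kubelet behaved as described, a pod like the sketch below would get a /dev/shm sized to its memory limit, and the app would be configured to use less than that (the values and the env var name are just examples):

apiVersion: v1
kind: Pod
metadata:
  name: shm-sized-by-limit
spec:
  containers:
    - name: app
      image: my-shm-app            # placeholder image
      resources:
        limits:
          memory: 4Gi              # under this proposal, /dev/shm would default to 4Gi
      env:
        - name: APP_SHM_BYTES      # hypothetical app setting; must stay below the 4Gi limit
          value: "3221225472"      # 3 GiB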

@vishh vishh self-assigned this Nov 14, 2016
@vishh vishh added this to the v1.6 milestone Nov 14, 2016
@elyscape elyscape commented Nov 16, 2016

@vishh What if there is no memory limit imposed on the application? For reference, it looks like Linux defaults to half of the total RAM.

@janosi janosi commented Nov 21, 2016

@vishh You can close the PR if you think so.

@fejta-bot fejta-bot commented Oct 4, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@arunmk arunmk commented Oct 4, 2018

/remove-lifecycle stale

@fejta-bot fejta-bot commented Jan 2, 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@ijumps ijumps commented Jan 3, 2019

/remove-lifecycle stale

@bhack bhack commented Mar 29, 2019

Any news on this?

@tahayk tahayk commented May 15, 2019

Any news on this?

@jerryyifei jerryyifei commented Jul 10, 2019

Any update on this? :) I'm using a Helm chart to install containers and want to figure out where to add the shm size value.

@axelborja axelborja commented Sep 18, 2019

The following may help to implement this feature request easily for those who use containerd as the CRI runtime: containerd/containerd#3655 ;)

@YesterdayxD YesterdayxD commented Oct 11, 2019

FYI, this is the configuration we're using for k8s:

volumes: [{
    name: 'dshm',
    emptyDir: {
        medium: 'Memory',
        sizeLimit: '256Mi'
    }
}]

But maybe the 256Mi doesn't actually limit the shm size.
The actual size is half of the memory of the node the pod is running on.
Right?

@gaocegege gaocegege commented Oct 29, 2019

> FYI, this is the configuration we're using for k8s:
>
> volumes: [{
>     name: 'dshm',
>     emptyDir: {
>         medium: 'Memory',
>         sizeLimit: '256Mi'
>     }
> }]
>
> But maybe the 256Mi doesn't actually limit the shm size.
> The actual size is half of the memory of the node the pod is running on.
> Right?

I think so, and that is why we need #63641.

@bashimao bashimao commented Nov 4, 2019

While this solves the initial problem, the current implementation, at least, is not acceptable. If any application running in the container tries to go beyond this shm limit, it is allowed to do so, but within a few seconds the pod is evicted by k8s (see below). Instead, what you want is a hard limit, such that the operating system in the container throws a "no space left on device" error but the pod otherwise continues to run.

  Normal   Started  12m    kubelet, x_machine_x  Started container x_pod_x
  Warning  Evicted  9m38s  kubelet, x_machine_x  Usage of EmptyDir volume "dshm" exceeds the limit "1Gi".
  Normal   Killing  9m38s  kubelet, x_machine_x  Stopping container x_pod_x
@LzyRapx LzyRapx commented Nov 20, 2019

volumeMounts:
  - mountPath: /dev/shm
    name: dshm
volumes:
  - name: dshm
    emptyDir:
      medium: Memory

This works for me.

@fejta-bot fejta-bot commented Feb 18, 2020

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@jwm jwm commented Feb 18, 2020

/remove-lifecycle stale
