Proposal for disk accounting #16889
Labelling this PR as size/XL
GCE e2e build/test failed for commit 1957329ee110dcc46c52ef445c7f41b5228a2ac1.
@kubernetes/rh-cluster-infra @kubernetes/rh-storage
1957329 to 07d246a
GCE e2e test build/test passed for commit 07d246add8b62e7b991fbb216afbaeb55d54d5aa.
3. Container’s logs - when written to stdout/stderr and default logging backend in docker is used.
4. Local volumes - hostPath and emptyDir (TODO: Do git volumes consume local disk?)
The same questions exist for secrets, downward API, etc. I'd argue that hostPath is different from the others because it's necessarily 'outside' kubernetes. Also, what if you do a host mount of another pod's volume dir? The recycler does this, for example -- how do you count hostPath correctly in that scenario?
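To make the double-counting concern concrete, here is a minimal Python sketch. It is not part of the proposal, and the function name `disk_usage` is hypothetical. It assumes a Linux host where `st_blocks` is reported in 512-byte units. Each inode is counted once, keyed by `(st_dev, st_ino)`, so a directory reached twice (for example, through its own pod and again through a hostPath mount of that pod's volume dir) is not charged twice.

```python
import os


def disk_usage(paths):
    """Return bytes allocated under `paths`, counting each inode only once.

    Hypothetical helper for illustration; not an API from the proposal.
    """
    seen = set()  # (device, inode) pairs that were already counted
    total = 0

    def count(path):
        nonlocal total
        st = os.lstat(path)  # lstat: inspect the link itself, do not follow it
        key = (st.st_dev, st.st_ino)
        if key not in seen:
            seen.add(key)
            total += st.st_blocks * 512  # st_blocks is in 512-byte units

    for root in paths:
        count(root)
        # os.walk does not descend into symlinked directories by default.
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                count(os.path.join(dirpath, name))
    return total
```

Passing both a pod's volume directory and a path nested inside it gives the same total as passing the volume directory alone. This only shows that usage can be deduplicated. It does not answer which pod the shared bytes should be charged to.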
Also, theoretically you could have an image which has a docker volume. In openshift, we generate a volume for each docker volume, but unless I'm mistaken, the possibility of containers using docker-created volumes exists in kubernetes.
@pmorie: What is the kubernetes equivalent representation for auto-generated volumes in openshift? I'm only concerned about volumes that consume local disk in this proposal. If there are such volumes that I have not included here, let me know and I will update the proposal.
Get rid of TODO here since GitRepo does consume local disk, right?
Done
On Mon, Nov 16, 2015 at 1:28 PM, Dawn Chen notifications@github.com wrote:
In docs/proposals/disk-accounting.md #16889 (comment):
+2. Support for filesystems other than ext4.
+
+### Introduction
+
+Disk accounting in Kubernetes cluster running with docker is complex because of the plethora of ways in which disk gets utilized by a container.
+
+Disk can be consumed for,
+
+1. Container images
+
+2. Container’s writable layer
+
+3. Container’s logs - when written to stdout/stderr and default logging backend in docker is used.
+
+4. Local volumes - hostPath and emptyDir (TODO: Do git volumes consume local disk?)
Get rid of TODO here since GitRepo does consume local disk, right?
@vishh Will there be any API changes for quotas or status?
No. New metrics API is being added to kubelet. Accounting will use those
On Fri, Nov 6, 2015 at 11:22 PM, Paul Morie notifications@github.com
1. Compatibility with all storage backends. The matrix is pretty large already and the priority is to get disk accounting onto the most widely deployed platforms.
2. Support for filesystems other than ext4.
Hrm - we're starting to move customers to XFS in other domains - can you describe why you think this is a non-goal?
XFS is the default in RHEL 7 https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Migration_Planning_Guide/sect-Red_Hat_Enterprise_Linux-Migration_Planning_Guide-File_System_Formats.html (it scales better than ext4).
I think supporting xfs is important. We are just now discovering that ext4 has issues with thin devices, especially on AWS: preallocating metadata on large thin devices can take a long time, and systemd times out. The docker folks are investigating the possibility of switching to xfs by default as the container rootfs.
Thanks for the info @smarterclayton @mrunalp and @rhvgoyal. Can someone help me scope out the changes required for xfs? The first step might be to enhance this proposal to include changes required for xfs.
A docker PR has now been merged that makes xfs the default container filesystem for the devicemapper graph driver.
So far I can't see anything special required for xfs. Whatever we were doing to make sure ext4 works, we need to repeat for xfs.
Thanks for the update. This change will affect the writable layer and the image layers. Since we depend on devicemapper semantics here, this change shouldn't affect this proposal.
I assume that the primary filesystem where volumes and logs will be stored will continue to be ext4? This is important because the current plan is to use quota for those use cases. Quota configuration is slightly different between ext4 and xfs.
When you say primary filesystem, are you referring to the filesystem on the host? I think we need to support both ext4 and xfs there as well. RHEL has xfs as the default host filesystem, and I am assuming that users would like all this to work on RHEL too.
Ah! ok. That's what I was looking for. So yes, we need to support xfs by default. I plan to get this rolling with ext4 support to start with. I'd appreciate some help with xfs integration once the internal implementation is finalized as part of implementing ext4 support.
@pwittrock: Can you review this proposal?
@vishh yes, taking a look now
1. Account for disk usage on the nodes.
2. Compatibility with the most important docker storage backends - devicemapper, aufs and overlayfs
nit: consider important -> common
Done
@vishh Looks good. Just a couple of minor comments.
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
@pwittrock: Thanks for the review. Fixed your comments. I might need another LGTM since I updated the patch.
PR changed after LGTM, removing LGTM.
GCE e2e test build/test passed for commit 0b43037.
@k8s-bot test this please
GCE e2e test build/test passed for commit 0b43037.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e test build/test passed for commit 0b43037.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e test build/test passed for commit 0b43037.
@k8s-bot unit test this
@k8s-bot test this Tests are more than 48 hours old. Re-running tests.
GCE e2e test build/test passed for commit 0b43037.
* `chown -R :9000 /var/lib/docker/**container**/b8cc9fae3851f9bcefe922952b7bca0eb33aa31e68e9203ce0639fc9d3f3c61b/*`
##### Testing titbits
s/titbits/tidbits
Proposal for disk accounting
@kubernetes/goog-node @bgrant0607 @thockin @smarterclayton