This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Document Usage of Ephemeral Disks #543

Closed
bearrito opened this issue Apr 27, 2017 · 18 comments

Comments

@bearrito
Contributor

bearrito commented Apr 27, 2017

The answer to this question might be entirely obvious to a seasoned K8s-on-ACS user. It isn't obvious to me, coming from typical usage of Azure VMs.

What is the capacity to use ephemeral disks? None, or some? Is it documented or undocumented?

I see them mentioned in #406, so ephemeral disks seem to be a concept.

@JackQuincy
Contributor

Yes, there are ephemeral disks on every VM, documented here: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/optimization

Azure OS Disk
Once you create a Linux VM in Azure, it has two disks associated with it. /dev/sda is your OS disk, /dev/sdb is your temporary disk. Do not use the main OS disk (/dev/sda) for anything except the operating system as it is optimized for fast VM boot time and does not provide good performance for your workloads. You want to attach one or more disks to your VM to get persistent and optimized storage for your data.

Depending on the VM size you get a different disk with different disk speeds, which is documented here:
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general
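For anyone checking this on a node: df shows which block device backs a given path. A quick sketch (the /dev/sdX names are the typical layout described above, not guaranteed on every image):

```shell
# Print the backing device and mount point for a few paths of interest.
# On a typical Azure Linux VM, / (and thus /var/lib/docker) sits on the
# OS disk (/dev/sda1) while /mnt sits on the temporary disk (/dev/sdb1).
for path in / /var/lib/docker /mnt; do
  if [ -e "$path" ]; then
    df --output=source,target "$path" | tail -n 1
  fi
done
```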

@alahiff

alahiff commented Apr 28, 2017

Given the text quoted above, why are /var/lib/docker and /var/lib/kubelet on the OS disk? Is there some way to deploy a Kubernetes cluster in ACS where these are on the temporary disk (i.e. /dev/sdb)?

@JackQuincy
Contributor

JackQuincy commented Apr 28, 2017

The suggestion there is to actually have a separate data disk and use that instead, not the ephemeral disk.

@colemickens Any answer to the above question about why we are using the OS disk, given that this is stated in the Azure documentation?

Do not use the main OS disk (/dev/sda) for anything except the operating system as it is optimized for fast VM boot time and does not provide good performance for your workloads. You want to attach one or more disks to your VM to get persistent and optimized storage for your data.

Edit: formatting, keep forgetting to do two newlines to make a new paragraph

@itaysk

itaysk commented Apr 29, 2017

I never tried Kubernetes, but with DC/OS the ephemeral disk is used by default. Why the different choice?

@bearrito
Contributor Author

bearrito commented May 1, 2017

Thanks for the response. I'm aware of the difference between the OS disk and the ephemeral disk.

Just for my edification: when I create an emptyDir volume, will it map to /dev/sdb?

The portion where someone could be confused comes from:
"By default, emptyDir volumes are stored on whatever medium is backing the machine"

https://kubernetes.io/docs/concepts/storage/volumes/#emptydir

@marc-sensenich

@alahiff

alahiff commented May 2, 2017

@bearrito I found that emptyDir volumes use the OS disk (i.e. not /dev/sdb).
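That matches where kubelet keeps them: emptyDir volumes live under the kubelet root directory (default /var/lib/kubelet), which sits on the root filesystem, i.e. the OS disk. A sketch for verifying this on a node (the pod UID and volume name in the comment are placeholders):

```shell
# Show which filesystem holds the kubelet root dir (fall back to / on a
# machine without kubelet; /var/lib/kubelet sits on the root fs anyway).
dir=/var/lib/kubelet
if [ ! -d "$dir" ]; then dir=/; fi
df "$dir" | tail -n 1

# Individual emptyDir volumes are created under paths like:
#   /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<name>
```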

@JackQuincy
Contributor

For DC/OS we started using the ephemeral disk to speed up install/start times. I am not familiar with the reasons for choosing not to use it for k8s. @colemickens should know.

@bearrito
Contributor Author

bearrito commented May 2, 2017

Thank you @alahiff

This seems fairly impactful. I need a significant amount of space for checkpointing without incurring network I/O costs to write to a Storage-Account-based disk or Managed Disk. Is mapping emptyDir to /dev/sda by design, or should this ticket be reclassified from question to bug?

@alahiff

alahiff commented May 8, 2017

I would consider it a bug, as it reduces the usefulness of emptyDirs. I've had to resort to using hostPath instead in order to use the ephemeral disk.

@bearrito
Contributor Author

bearrito commented May 8, 2017

@alahiff I concur.

We are also considering the use of hostPath. In particular, we are attempting to override the creation of the agent VMs so we can run an Azure template script, so that we may set the permissions on sdb, which would allow us to run in unprivileged mode.

Did you do something similar or just run the containers in privileged?

@alahiff

alahiff commented May 8, 2017

@bearrito For the moment I'm just using the simplest option. I'm not using privileged containers, however - they initially run as root (& hence can write in sdb) then drop to an unprivileged user before doing actual work. Each container creates a directory in sdb named after its hostname in order to avoid collisions.
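A minimal sketch of that entrypoint pattern, assuming a scratch base such as /mnt and an unprivileged user named worker baked into the image (both names are hypothetical):

```shell
# claim_scratch BASE USER: create BASE/<hostname> (the hostname keeps
# directories collision-free across containers), hand it to USER when
# running as root, and print the resulting path.
claim_scratch() {
  local base="$1" user="$2" dir
  dir="$base/$(hostname)"
  mkdir -p "$dir"
  if [ "$(id -u)" -eq 0 ]; then
    chown "$user" "$dir"
  fi
  printf '%s\n' "$dir"
}

# In the container entrypoint (started as root), something like:
#   dir=$(claim_scratch /mnt worker)
#   exec su -s /bin/sh worker -c "exec /app/run '$dir'"   # drop privileges
```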

emptyDirs would be much nicer!

@JackQuincy
Contributor

So right now we are pretty heads-down working on our internal goals, which is why @colemickens hasn't been able to respond. But the idea of having emptyDir be on the ephemeral disk sounds good. If one of you wants to take a stab at a PR mounting the ephemeral disk, or a partition of the ephemeral disk, where emptyDirs are created, I'd be very willing to review it.

@anhowe
Contributor

anhowe commented May 16, 2017

@squillace can you take a look at this?

@andyzhangx
Contributor

emptyDir uses a directory under /var/lib/kubelet; it cannot use the ephemeral disk. hostPath can do that.

@andyzhangx
Contributor

I got the fix; on every node:

1. Move the Docker data directory onto the ephemeral disk:

sudo service docker stop
sudo mv /var/lib/docker /mnt
sudo ln -s /mnt/docker /var/lib/docker
sudo service docker start

2. Edit /etc/systemd/system/kubelet.service and append the following (this is the key point here):

--volume=/mnt/docker:/mnt/docker:rw \

3. Reload and restart the kubelet:

sudo systemctl daemon-reload
sudo systemctl restart kubelet
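Step 2's manual edit can be scripted; a sketch, assuming the kubelet unit already contains a --volume=/var/lib/docker:... line to anchor on (the unit layout is an assumption — inspect your file first):

```shell
# add_mnt_docker_volume UNITFILE: insert a bind-mount for /mnt/docker
# right after the existing /var/lib/docker --volume line, using GNU
# sed's one-line append. The matched line is an assumed layout.
add_mnt_docker_volume() {
  sed -i '\#--volume=/var/lib/docker#a --volume=/mnt/docker:/mnt/docker:rw \\' "$1"
}

# On each node:
#   add_mnt_docker_volume /etc/systemd/system/kubelet.service
#   sudo systemctl daemon-reload && sudo systemctl restart kubelet
```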

@chrislangston

I'm not sure if this will help anyone else or not, but I wanted to share in case it does. I needed my Docker containers to be able to use the ephemeral/temp drive (/mnt) that is created by default when the nodes are built out.

Without this, I believe my Docker instances were writing to the OS disk.

I ended up using something like this in my .yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mp-mnt-deployment
  labels:
    app: message-processor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: message-processor
  template:
    metadata:
      labels:
        app: message-processor
    spec:
      containers:
      - name: message-processor
        image: {my-acr-image-location}/message-processor:mnt01
        volumeMounts:
          - name: internalprocessinputvol
            mountPath: /mnt/app/input
          - name: internalprocessoutputvol
            mountPath: /mnt/app/output
        resources: 
          limits:
            memory: "2.5G"
          requests:
            cpu: ".7"
            memory: "2.5G"
      restartPolicy: Always
      volumes:
        - name: internalprocessinputvol
          hostPath:
            path: /mnt
        - name: internalprocessoutputvol
          hostPath:
            path: /mnt

@lunarfs

lunarfs commented Aug 13, 2018

So the issue with the above solution is that /mnt might not point to the ephemeral disk when the container starts. The ephemeral disk is handled by waagent, and, for instance, when you are moved to a new physical node during a reboot or maintenance of the underlying hardware, you will be assigned new ephemeral space; waagent does the partition creation and formatting of this new space, processes that all take time. Personally I have seen issues where it takes several minutes before /mnt is ready to be used. I am currently looking for a bulletproof solution for how to monitor that /mnt is the ephemeral disk, and not just a folder on the OS disk.
Usually I add a systemd drop-in with the following shell code:
until mount | grep "/mnt"; do sleep 5; echo "waagent service has not yet mounted /mnt/resource."; done
but this will obviously not work in this case.
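A sketch of a stricter check, which verifies the directory is an actual mount point (backed by a different device than its parent) rather than grepping the mount table; no device names are hard-coded, since some images use /mnt/resource instead of /mnt:

```shell
# wait_for_mountpoint DIR: block until DIR exists and is a real mount
# point, i.e. its device number differs from its parent directory's.
# A plain directory on the OS disk keeps this loop waiting.
wait_for_mountpoint() {
  local dir="$1"
  until [ -d "$dir" ] &&
        [ "$(stat -c '%d' "$dir")" != "$(stat -c '%d' "$dir/..")" ]; do
    echo "waiting for $dir to be mounted..."
    sleep 5
  done
}

# e.g. as an ExecStartPre in a systemd drop-in:
#   wait_for_mountpoint /mnt
```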

@stale

stale bot commented Mar 9, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution. Note that acs-engine is deprecated--see https://github.com/Azure/aks-engine instead.

@stale stale bot added the stale label Mar 9, 2019
@stale stale bot closed this as completed Mar 16, 2019