This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

[Errno 28] No Space left on device #3

drdivine opened this issue Jun 30, 2020 · 2 comments


drdivine commented Jun 30, 2020

Dear all,

First of all, thank you very much for creating this repository. It has helped my team and me tremendously.

I have implemented the solution as described in the README using the "standard deployment".

I have made modifications to the state machine. I'm no longer using the JOB_INPUTS, JOB_OUTPUTS, JOB_OUTPUT_PREFIX... arguments. Instead, I pass an S3 path to a config file, which the container downloads and then uses to run a script. The issue arises when the batch job downloads FASTA files from the S3 bucket: (1) the download seems to be slow, and (2) I get "[Errno 28] No space left on device".

  1. I know that this no longer uses entrypoint.aws.sh for handling inputs and outputs and for avoiding file path clobbering, and I have no problem going back to it. I just wanted to get our setup working in the new development stack.

  2. I had to add the following two arguments to build.sh:
    --build-arg AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION \
    --build-arg AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI \
    Our base Docker image also declares the following arguments:
    ARG AWS_DEFAULT_REGION
    ARG AWS_CONTAINER_CREDENTIALS_RELATIVE_URI

  3. I made sure that the shell scripts we run in the Docker batch job do their work in /scratch.

  4. I tried changing the size of the attached EBS volume, although I was under the impression that the amazon-ebs-autoscale repository should automagically change its size.

Also, any changes I make to the VolumeSize of either /dev/xvdcz or /dev/sdc are not applied when the EC2 instance starts.
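
Since the jobs write to /scratch, it may help to confirm which filesystem actually backs that path and how much space it has left. The following is a minimal sketch, not part of the repository: `mount_of` and `avail_kb` are hypothetical helper names, and the two paths are simply the ones mentioned in this thread.

```shell
# Print the mount point of the filesystem that holds a given path.
mount_of() {
    df -P "$1" | awk 'NR==2 { print $NF }'
}

# Print the available space (in 1K blocks) on that filesystem.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 { print $4 }'
}

# Paths taken from this thread; adjust to your own setup.
for p in /scratch /var/lib/docker; do
    if [ -e "$p" ]; then
        echo "$p is on $(mount_of "$p") with $(avail_kb "$p") KB available"
    else
        echo "$p does not exist on this host"
    fi
done
```

If /scratch reports the root mount point rather than a dedicated volume, the job is probably writing to the instance's root disk, which would be consistent with the error. Run the check both on the host and inside the container, since the two can see different mounts.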

I have tried to describe the changes I made and the things I tried in order to solve this issue. Any help you could send my way would be greatly appreciated.

Thanks in advance

Matt


drdivine commented Jul 3, 2020

Just a follow-up. There seem to be some errors in the UserData script in the launch template. Could someone verify that it does the right thing?

        UserData:
          Fn::Base64: |
            MIME-Version: 1.0
            Content-Type: multipart/mixed; boundary="==BOUNDARY=="

            --==BOUNDARY==
            Content-Type: text/cloud-config; charset="us-ascii"

            packages:
            - jq
            - btrfs-progs
            - python27-pip
            - sed
            - wget
            - git
            - bzip2
            - amazon-ssm-agent

            runcmd:
            - pip install -U awscli boto3
            - start amazon-ssm-agent

            - stop ecs
            - service docker stop

            # install amazon-ebs-autoscale
            - cp -au /var/lib/docker /var/lib/docker.bk
            - rm -rf /var/lib/docker/*
            - EBS_AUTOSCALE_VERSION=$(curl --silent "https://api.github.com/repos/awslabs/amazon-ebs-autoscale/releases/latest" | jq -r .tag_name)
            - cd /opt && git clone https://github.com/awslabs/amazon-ebs-autoscale.git
            - cd /opt/amazon-ebs-autoscale && git checkout $EBS_AUTOSCALE_VERSION
            - sh /opt/amazon-ebs-autoscale/install.sh /var/lib/docker /dev/sdc 2>&1 > /var/log/ebs-autoscale-install.log
            - sed -i 's+OPTIONS=.*+OPTIONS="--storage-driver btrfs"+g' /etc/sysconfig/docker-storage
            - cp -au /var/lib/docker.bk/* /var/lib/docker
            
            # install miniconda/awscli
            - wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
            - bash Miniconda3-latest-Linux-x86_64.sh -b -f -p /opt/miniconda
            - /opt/miniconda/bin/conda install -c conda-forge -y awscli
            - chown -R ec2-user:ec2-user /opt/miniconda
            - rm Miniconda3-latest-Linux-x86_64.sh

            - service docker start
            - start ecs

            --==BOUNDARY==--

This is also different from what is used in the genomics-workflow repository:

      additions: |-
        - stop ecs
        - service docker stop
        - cp -au /var/lib/docker /var/lib/docker.bk
        - rm -rf /var/lib/docker/*
        - cd /opt && wget $artifactRootUrl/get-amazon-ebs-autoscale.sh
        - sh /opt/get-amazon-ebs-autoscale.sh
        - sh /opt/amazon-ebs-autoscale/install.sh $scratchPath /dev/sdc > /var/log/ebs-autoscale-install.log 2>&1
        - sed -i 's+OPTIONS=.*+OPTIONS="--storage-driver btrfs"+g' /etc/sysconfig/docker-storage
        - cp -au /var/lib/docker.bk/* /var/lib/docker
        - cd /opt && wget $artifactRootUrl/aws-ecs-additions.tgz && tar -xzf aws-ecs-additions.tgz
        - sh /opt/ecs-additions/ecs-additions-step-functions.sh
        - service docker start
        - start ecs

and

UserData:
          Fn::Base64: !Sub
            - |
              MIME-Version: 1.0
              Content-Type: multipart/mixed; boundary="==BOUNDARY=="
              --==BOUNDARY==
              Content-Type: text/cloud-config; charset="us-ascii"
              packages:
              - jq
              - btrfs-progs
              - sed
              - wget
              - git
              - amazon-ssm-agent
              - unzip
              runcmd:
              - curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "/tmp/awscliv2.zip"
              - unzip -q /tmp/awscliv2.zip -d /tmp
              - /tmp/aws/install
              - export scratchPath="${ScratchMountPoint}"
              - export artifactRootUrl="${ArtifactRootUrl}"
              - start amazon-ssm-agent
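
One concrete difference between the snippets above is the order of the redirections on the install.sh line: the launch template uses `2>&1 > file`, while the genomics-workflow version uses `> file 2>&1`. In POSIX shells, redirections are applied left to right, so the first form sends only stdout to the log file. The sketch below illustrates this with a hypothetical stand-in function (`emit`) instead of the real install.sh; the file names are made up for the demonstration.

```shell
# Hypothetical stand-in for install.sh: one line on stdout, one on stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# demo_redirect_order DIR: write one log file into DIR for each ordering.
demo_redirect_order() {
    # Launch-template ordering: stderr is pointed at the *current* stdout
    # first, and only afterwards is stdout sent to the file.
    emit 2>&1 > "$1/stderr-first.log"
    # genomics-workflow ordering: stdout goes to the file first, then
    # stderr is pointed at that same file.
    emit > "$1/stdout-first.log" 2>&1
}

dir=$(mktemp -d)
demo_redirect_order "$dir"
wc -l "$dir"/*.log
```

With the launch-template ordering, error messages from install.sh would not land in /var/log/ebs-autoscale-install.log, which may make failures harder to spot after the fact.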

I'm getting the following errors when I install by hand:

+ MOUNTPOINT=/scratch
+ SIZE=100
+ DEVICE=
+ FILE_SYSTEM=btrfs
++ dirname /opt/amazon-ebs-autoscale/install.sh
+ BASEDIR=/opt/amazon-ebs-autoscale
+ . /opt/amazon-ebs-autoscale/shared/utils.sh
+ initialize
++ curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone/
+ export AWS_AZ=eu-west-1a
+ AWS_AZ=eu-west-1a
++ echo eu-west-1a
++ sed -e 's/[a-z]$//'
+ export AWS_REGION=eu-west-1
+ AWS_REGION=eu-west-1
++ curl -s http://169.254.169.254/latest/meta-data/instance-id
+ export INSTANCE_ID=i-029e03fc7d0eb2f14
+ INSTANCE_ID=i-029e03fc7d0eb2f14
+ export EBS_AUTOSCALE_CONFIG_FILE=/etc/ebs-autoscale.json
+ EBS_AUTOSCALE_CONFIG_FILE=/etc/ebs-autoscale.json
+ PARAMS=
+ ((  2  ))
+ case "$1" in
+ PARAMS=' /var/lib/docker'
+ shift
+ ((  1  ))
+ case "$1" in
+ PARAMS=' /var/lib/docker /dev/sdc'
+ shift
+ ((  0  ))
+ eval set -- ' /var/lib/docker /dev/sdc'
++ set -- /var/lib/docker /dev/sdc
+ '[' '!' -z ' /var/lib/docker /dev/sdc' ']'
+ MOUNTPOINT=/var/lib/docker
+ '[' '!' -z /dev/sdc ']'
+ DEVICE=/dev/sdc
+ mkdir -p /usr/local/amazon-ebs-autoscale/bin /usr/local/amazon-ebs-autoscale/shared
+ cp /opt/amazon-ebs-autoscale/bin/create-ebs-volume /opt/amazon-ebs-autoscale/bin/ebs-autoscale /usr/local/amazon-ebs-autoscale/bin
+ chmod +x /usr/local/amazon-ebs-autoscale/bin/create-ebs-volume /usr/local/amazon-ebs-autoscale/bin/ebs-autoscale
+ ln -sf /usr/local/amazon-ebs-autoscale/bin/create-ebs-volume /usr/local/amazon-ebs-autoscale/bin/ebs-autoscale /usr/local/bin/
+ ln -sf /usr/local/amazon-ebs-autoscale/bin/create-ebs-volume /usr/local/amazon-ebs-autoscale/bin/ebs-autoscale /usr/bin/
+ cp /opt/amazon-ebs-autoscale/shared/utils.sh /usr/local/amazon-ebs-autoscale/shared
+ cp /opt/amazon-ebs-autoscale/config/ebs-autoscale.logrotate /etc/logrotate.d/ebs-autoscale
+ cat /opt/amazon-ebs-autoscale/config/ebs-autoscale.json
+ sed -e s#%%MOUNTPOINT%%#/var/lib/docker#
+ sed -e s#%%FILESYSTEM%%#btrfs#
+ '[' -e /var/lib/docker ']'
+ '[' -d /var/lib/docker ']'
+ '[' -e /var/lib/docker ']'
+ '[' -z /dev/sdc ']'
+ '[' '!' -b /dev/sdc ']'
++ create-ebs-volume --size 100
/usr/bin/create-ebs-volume: line 169: aws: command not found
/usr/bin/create-ebs-volume: line 175: aws: command not found
/usr/bin/create-ebs-volume: line 183: aws: command not found
/usr/bin/create-ebs-volume: line 185: [: : integer expression expected
/usr/bin/create-ebs-volume: line 190: [: : integer expression expected
/usr/bin/create-ebs-volume: line 195: [: : integer expression expected
Error: could not create volume
+ DEVICE='/usr/bin/create-ebs-volume: line 218: aws: command not found'

Any help would be greatly appreciated.


wleepang commented Jul 8, 2020

Hi @drdivine ,

If you still have the instance available, can you check /var/log/cloud-init-output.log for any errors that may have occurred while running the LaunchTemplate steps?
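
A rough way to scan that log is sketched below. The search patterns are an assumption about what failures might look like, not an exhaustive list, and `scan_log` is a hypothetical helper name.

```shell
# scan_log FILE: print numbered lines that look like failures.
scan_log() {
    grep -inE 'error|fail|not found|no space left' "$1"
}

log=/var/log/cloud-init-output.log
if [ -r "$log" ]; then
    scan_log "$log" || echo "no matching lines in $log"
else
    echo "$log is not readable here (try sudo, or check the path)"
fi
```

Matching is case-insensitive, and each hit is prefixed with its line number so it can be located in the full log.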

This solution and genomics-workflows are related projects and will eventually have aligned codebases. The core steps of the UserData sections are roughly the same:

  1. Install requisite system packages (i.e. jq, btrfs-progs)
  2. Install the AWS CLI
  3. Provision the instance - e.g. get / install amazon-ebs-autoscale, run any orchestrator specific provisioning steps.

Can you provide some additional context for how you ran the manual install of amazon-ebs-autoscale? What AMI did you use? The errors at the bottom indicate that the AWS CLI was not properly installed. Amazon EBS Autoscale uses the AWS CLI to create and attach volumes.
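
Before re-running install.sh by hand, a preflight check along these lines can confirm that the required tools resolve on the PATH of the shell that will run it. This is only a sketch: `require_cmd` is a hypothetical helper, and the tool list is taken from commands that appear in this thread rather than from the amazon-ebs-autoscale documentation.

```shell
# require_cmd NAME: succeed only if NAME resolves on the current PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found $1 at $(command -v "$1")"
    else
        echo "missing: $1 is not on PATH ($PATH)" >&2
        return 1
    fi
}

# Tools that appear in this thread; extend the list as needed.
for tool in aws jq curl; do
    require_cmd "$tool" || echo "install $tool or add it to PATH before running install.sh"
done
```

Note that PATH under sudo or cloud-init can differ from an interactive login shell, so it is worth running the check in the same context as the install.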
