# Transferring files in S3

### Transferring between EC2 and S3

We have explored S3 quite a bit. We can also use S3 together with EC2, which saves some space on EBS. And if we decide to terminate our EC2 instance, having the data stored in S3 is very convenient.

Before anything else, we have to assign an IAM role to our EC2 instance so that it can access S3. I am giving it full access. First, go to the IAM dashboard.

Click Roles on the navigation pane.

Click Create Role.

Choose the service that will use this role. We are going to use EC2 to access S3, so EC2 needs the role. Click EC2, then click Next: Permissions.

There are hundreds of policies in AWS. Type S3 in the search box and choose the relevant policy. I am going to choose AmazonS3FullAccess. Click on the policy name to learn more about it. If you are happy with it, tick the checkbox beside it. Then click Next: Tags.

I am just going to skip Tags; they are useful for tracing back services, especially if you deploy many services at a time. Click Next: Review, then fill in the remaining required information. I'm going to name the role *ec2_to_s3*. Click Create role.

After creating the role, let's attach it to our instance. Go to our EC2 Instances dashboard. Right click on our instance, hover over Instance Settings and click on Attach/Replace IAM Role.

Choose the newly created IAM role: ec2_to_s3.

Click Apply.
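
The console steps above can also be done from the AWS CLI. Here is a rough sketch of how I would do it, not the official way. It assumes you run it from a machine whose credentials are allowed to manage IAM (not from the instance itself, which has no role yet). The function name `create_ec2_s3_role` and the instance ID in the example are made up for illustration, and I reuse the role name as the instance profile name.

```shell
# Sketch: create a role EC2 can assume, give it S3 full access,
# wrap it in an instance profile, and attach that profile to an instance.
create_ec2_s3_role() {
  role_name="$1"     # e.g. ec2_to_s3
  instance_id="$2"   # placeholder, e.g. i-0123456789abcdef0

  # Trust policy that lets the EC2 service assume the role.
  trust_policy='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  aws iam create-role --role-name "$role_name" \
    --assume-role-policy-document "$trust_policy" || return 1

  # Same managed policy that was ticked in the console.
  aws iam attach-role-policy --role-name "$role_name" \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess || return 1

  # EC2 attaches roles through an instance profile, so create one
  # (assumption: it gets the same name as the role).
  aws iam create-instance-profile \
    --instance-profile-name "$role_name" || return 1
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$role_name" --role-name "$role_name" || return 1

  aws ec2 associate-iam-instance-profile --instance-id "$instance_id" \
    --iam-instance-profile "Name=$role_name"
}

# Example (not run here):
#   create_ec2_s3_role ec2_to_s3 i-0123456789abcdef0
```

If you only ever need to read from the bucket, a narrower policy than AmazonS3FullAccess would probably be a safer choice.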

At last, let's try getting *hiseqXten_url.txt* from S3 using the AWS CLI. First, let's see what buckets we have.

In [10]:
aws s3 ls

2019-05-08 16:56:00 mumgf-eagle01


Correct! What is inside that bucket?

In [11]:
aws s3 ls mumgf-eagle01

2019-05-08 17:41:29        158 hiseqXten_url.txt
2019-05-08 17:01:06       1376 sequel_url.txt


Nice. Now, let's copy *hiseqXten_url.txt*.

In [18]:
aws s3 cp s3://mumgf-eagle01/hiseqXten_url.txt .

download: s3://mumgf-eagle01/hiseqXten_url.txt to ./hiseqXten_url.txt


Okay. Let's download the files from the URLs in the file.

In [19]:
for url in $(cat hiseqXten_url.txt); do
  wget "$url"
done

--2019-05-09 01:36:12--  ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR322/002/ERR3229772/ERR3229772_1.fastq.gz
           => ‘ERR3229772_1.fastq.gz’
Resolving ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)... 193.62.192.7
Connecting to ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)|193.62.192.7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /vol1/fastq/ERR322/002/ERR3229772 ... done.
==> SIZE ERR3229772_1.fastq.gz ... 4975966320
==> PASV ... done.    ==> RETR ERR3229772_1.fastq.gz ... done.
Length: 4975966320 (4.6G) (unauthoritative)


2019-05-09 01:50:44 (5.49 MB/s) - ‘ERR3229772_1.fastq.gz’ saved [4975966320]

--2019-05-09 01:50:44--  ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR322/002/ERR3229772/ERR3229772_2.fastq.gz
           => ‘ERR3229772_2.fastq.gz’
Resolving ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)... 193.62.192.7
Connecting to ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)|193.62.192.7|:21... connected.
Logging in as anonymous ... Logg

That took a while, around 30 minutes. Next, we are going to copy the downloaded files back to S3. We could reuse the existing bucket, but let's create a new one just for the HiSeq data. Specify your region if you need to: I tried once and, instead of Singapore, the bucket was created in US East (N. Virginia).

In [9]:
aws s3 mb s3://eagle-hiseqxten

make_bucket: eagle-hiseqxten
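
To avoid the wrong-region surprise mentioned above, the region can be passed explicitly. This is a small sketch. The function name `make_bucket_in_region` is mine, and the default of `ap-southeast-1` (Asia Pacific, Singapore) is just what I would want here.

```shell
# Sketch: create a bucket in an explicit region instead of the CLI default.
make_bucket_in_region() {
  bucket="$1"                     # bucket names must be globally unique
  region="${2:-ap-southeast-1}"   # ap-southeast-1 = Asia Pacific (Singapore)
  aws s3 mb "s3://$bucket" --region "$region"
}

# Example (not run here):
#   make_bucket_in_region eagle-hiseqxten ap-southeast-1
```

Running `aws configure get region` should show which default region the CLI would otherwise use, if one is configured.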


Now, just like the Linux mv command, move the fastq files into the bucket.

In [14]:
aws s3 mv ERR3229772_1.fastq.gz s3://eagle-hiseqxten/ERR3229772_1.fastq.gz
aws s3 mv ERR3229772_2.fastq.gz s3://eagle-hiseqxten/ERR3229772_2.fastq.gz

move: ./ERR3229772_1.fastq.gz to s3://eagle-hiseqxten/ERR3229772_1.fastq.gz
move: ./ERR3229772_2.fastq.gz to s3://eagle-hiseqxten/ERR3229772_2.fastq.gz
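
With only two files, typing each mv by hand is fine. For a whole directory of fastq files, something like the sketch below might be handier. The function name `move_fastq_to_bucket` is made up; the `--recursive`, `--exclude`, `--include` and `--dryrun` options are regular `aws s3` options.

```shell
# Sketch: move every *.fastq.gz under a directory into a bucket in one call.
move_fastq_to_bucket() {
  src_dir="$1"
  bucket="$2"
  shift 2   # any remaining arguments (e.g. --dryrun) are passed through

  # Filters apply in order: exclude everything, then re-include *.fastq.gz.
  aws s3 mv "$src_dir" "s3://$bucket/" --recursive \
    --exclude "*" --include "*.fastq.gz" "$@"
}

# Example (not run here), previewing first and then moving for real:
#   move_fastq_to_bucket . eagle-hiseqxten --dryrun
#   move_fastq_to_bucket . eagle-hiseqxten
```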


In [16]:
aws s3 ls s3://eagle-hiseqxten

2019-05-09 03:06:15 4975966320 ERR3229772_1.fastq.gz
2019-05-09 03:07:46 5646625948 ERR3229772_2.fastq.gz


Nice.
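
The download-then-move steps above need enough disk space on the instance to hold both files. I have not tried it in this walkthrough, but the download could in principle be streamed straight into the bucket, since `aws s3 cp` accepts `-` to read from standard input. The function name `stream_url_to_s3` is hypothetical.

```shell
# Sketch: pipe a download directly into S3 without writing it to local disk.
stream_url_to_s3() {
  url="$1"
  bucket="$2"
  key="${url##*/}"   # object key = last path component of the URL

  # wget -qO- writes the file to stdout; "aws s3 cp -" uploads from stdin.
  wget -qO- "$url" | aws s3 cp - "s3://$bucket/$key"
}

# Example (not run here):
#   for url in $(cat hiseqXten_url.txt); do
#     stream_url_to_s3 "$url" eagle-hiseqxten
#   done
```

The trade-off is that a failed transfer has to start over, because there is no partial file left on disk to resume from. For streams larger than 50GB, `aws s3 cp` also expects an `--expected-size` hint; the files here are around 5GB each, so that should not matter.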

Actually, it is also possible to transfer files from your local computer to your S3 bucket. Let me try submitting a sample text file.

Well, that is a bit embarrassing. By doing this, you can probably save money by not running an EC2 instance. However, if you are limited by storage on your local computer, using an instance is one way around it.
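
For reference, a local upload would look roughly like the sketch below. It assumes the AWS CLI is installed on your own computer and that credentials were set up beforehand with `aws configure`, since the IAM role attached to the instance does not apply outside EC2. The function name `upload_from_local` and the file *sample.txt* are placeholders.

```shell
# Sketch: copy one local file into a bucket, keeping its file name as the key.
upload_from_local() {
  file="$1"
  bucket="$2"
  aws s3 cp "$file" "s3://$bucket/$(basename "$file")"
}

# Example (not run here):
#   upload_from_local ./sample.txt mumgf-eagle01
```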