## **Part 1: Checking data availability**

1. Install the AWS CLI (the commands below use the macOS installer package):
```bash
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /
```
2. Collect the SRR IDs by following [these instructions](https://github.com/asadprodhan/A-Guide-to-Automatically-Downloading-NCBI-SRA-Reads?tab=readme-ov-file#step-2-collect-sra-accession-numbers). The list is also stored in `data/srr_acc_list.txt`.
3. Install the Python requirements: `pip3 install -r requirements.txt`
4. Check availability of SRR data: `python3 scripts/check_aws_availability.py`
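
The same kind of check can be sketched directly in the shell. This is not the contents of `scripts/check_aws_availability.py`; it is a minimal sketch that assumes the runs are hosted in the public `sra-pub-run-odp` bucket under `sra/<accession>/` and that anonymous access (`--no-sign-request`) is allowed. The function names are hypothetical.

```bash
# Build the S3 prefix where a run is assumed to live in the public SRA bucket.
# The bucket name and layout are assumptions; adjust them if your data lives elsewhere.
srr_s3_uri() {
    printf 's3://sra-pub-run-odp/sra/%s/\n' "$1"
}

# Succeed (exit status 0) if the prefix lists at least one object.
srr_available() {
    aws s3 ls "$(srr_s3_uri "$1")" --no-sign-request > /dev/null 2>&1
}

# Report availability for every accession in a list file (one ID per line).
check_list() {
    while IFS= read -r acc; do
        # Skip blank lines in the accession list.
        [ -z "$acc" ] && continue
        if srr_available "$acc"; then
            echo "$acc available"
        else
            echo "$acc missing"
        fi
    done < "$1"
}
```

For example, `check_list data/srr_acc_list.txt` prints one line per accession.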

## **Part 2: Setting up Amazon EC2 instance**

## **Launch EC2 Instance from Console**

### **Step 1: Go to EC2**
- Sign in to AWS Console
- Search for "EC2" in the top search bar
- Click "EC2"

### **Step 2: Launch Instance**
- Click the orange "Launch instance" button

### **Step 3: Configure Instance**

**Name:**
- Enter: `mgs-download`

**Application and OS Images (AMI):**
- Select "Ubuntu Server 22.04 LTS"
- Make sure it says "Free tier eligible"

**Instance type:**
- Select `t3.large` from the dropdown

**Key pair:**
- Click "Create new key pair"
- Name: `mgs-key`
- Key pair type: RSA
- Private key file format: `.pem`
- Click "Create key pair"
- **Save the downloaded file somewhere safe**

**Network settings:**
- Leave defaults (should auto-create security group with SSH access)

**Configure storage:**
- Change from `8` GB to `500` GB
- Leave type as "gp3"

### **Step 4: Launch**
- Click "Launch instance" (orange button at bottom right)
- Wait 2-3 minutes for the instance to start
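
The console choices in Steps 3 and 4 can also be expressed as a single AWS CLI call. This is a sketch, not a replacement for the steps above: the AMI ID and security group ID are region- and account-specific placeholders you must supply, the root device name `/dev/sda1` is an assumption that holds for typical Ubuntu AMIs, and the function name is hypothetical. It also assumes the `mgs-key` key pair already exists.

```bash
# Launch the instance described above from the command line.
# $1: AMI ID for Ubuntu Server 22.04 LTS in your region (placeholder, look it up)
# $2: ID of a security group that allows inbound SSH (placeholder)
launch_mgs_instance() {
    aws ec2 run-instances \
        --image-id "$1" \
        --instance-type t3.large \
        --key-name mgs-key \
        --security-group-ids "$2" \
        --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=500,VolumeType=gp3}' \
        --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=mgs-download}]'
}
```

Unlike the console wizard, this call does not create a security group for you, which is why one has to be passed in.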

### **Step 5: Set a Reminder to Terminate (IMPORTANT)**
**Before continuing, set a reminder to terminate the EC2 instance. A running instance keeps accruing charges, and you may otherwise be surprised by a high bill.**
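
When the downloads are finished, the instance can be terminated from the console or from any machine with the AWS CLI configured. A minimal sketch, assuming the instance carries the `Name` tag `mgs-download` set in Step 3; the function names are hypothetical.

```bash
# Print the IDs of running instances tagged Name=mgs-download.
find_mgs_instances() {
    aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=mgs-download" \
                  "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text
}

# Terminate one instance by ID. Termination cannot be undone.
terminate_mgs_instance() {
    aws ec2 terminate-instances --instance-ids "$1"
}
```

Run `find_mgs_instances` first and check the printed ID before passing it to `terminate_mgs_instance`.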

### **Step 6: Get Connection Info**
- Click "View all instances"
- Find your `mgs-download` instance
- Wait until "Instance state" shows "Running"
- Click on the instance ID
- Copy the "Public IPv4 address" (looks like `3.123.45.67`)

### **Step 7: Connect from Your Terminal**

```bash
# Navigate to where you saved mgs-key.pem
cd ~/Downloads  # or wherever you saved it

# Make key secure
chmod 400 mgs-key.pem

# SSH in (replace with your actual IP)
ssh -i mgs-key.pem ubuntu@3.123.45.67
```

## **Part 3: Install SRA Toolkit on EC2**

### **Step 1: Update Package Manager**

```bash
sudo apt update
```

### **Step 2: Install SRA Toolkit**

```bash
sudo apt install sra-toolkit -y
```

### **Step 3: Verify Installation**

```bash
fastq-dump --version
```

This should print a version number (e.g., `fastq-dump : 2.11.0`).

### **Step 4: Configure SRA Toolkit**

```bash
vdb-config --interactive
```

This opens a text-based GUI. Use arrow keys and Enter to navigate:

1. Press `TAB` to move between sections
2. Navigate to the "CACHE" tab
3. Enable "local file-caching" (if not already enabled)
4. Set the cache location to `/home/ubuntu/ncbi/public` (or leave the default)
5. Navigate to the "AWS" tab
6. **Disable** "report cloud instance identity" (uncheck it)
7. Press `S` to Save
8. Press `X` to Exit
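
The AWS-tab setting from the steps above can likely also be applied without the interactive screen. The sketch below assumes your `vdb-config` build supports the `--report-cloud-identity` option; confirm this with `vdb-config --help` before relying on it. The function name and the `VDB_CONFIG` override variable are hypothetical.

```bash
# Disable "report cloud instance identity" non-interactively.
# Assumes vdb-config accepts --report-cloud-identity yes|no (check `vdb-config --help`).
# VDB_CONFIG is a hypothetical override for the binary path; it defaults to vdb-config.
disable_cloud_identity_report() {
    "${VDB_CONFIG:-vdb-config}" --report-cloud-identity no
}
```

This covers only step 6; the cache settings in steps 2 to 4 still need the interactive screen or can be left at their defaults.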
