# RUN: SARS-CoV-2 Zoonotic Reservoir III

```
Lead     : ababaian
Issue    : #55
Version  : 
start    : 2020 05 08
complete : YYYY MM DD
files    : ~/serratus/notebook/200505_ab/
s3_files : s3://serratus-public/notebook/200505_ab/
output   : s3://serratus-public/out/200505_zoonotic/
```

Continuation from `200505_Run_Zoonotic_Reservoir.ipynb`

In [1]:
date

Fri May  8 14:45:24 PDT 2020


### Initialize local workspace

In [67]:
# Serratus commit version
SERRATUS="/home/artem/serratus"
cd $SERRATUS
git rev-parse HEAD # commit version

# Create local run directory
WORK="$SERRATUS/notebook/200505_ab"
mkdir -p $WORK; cd $WORK

# SRA RunInfo Table for run -- use first 500 from Zoonotic pilot
RUNINFO="$SERRATUS/notebook/200505_ab/zoonotic_SraRunInfo.csv"

cp zoonotic_SraRunInfo.csv zoonotic_SraRunInfo_Batch6.csv
#head $RUNINFO

43e7803ec9c4e51ad8055d3c54b4722c66c956c1


In [13]:
# Create a list of all completed runs to date
cd $WORK
aws s3 ls s3://serratus-public/out/200505_zoonotic/summary/ > batchA.complete
# Keep only the file-name column of `aws s3 ls` (skips PRE/blank lines), then drop the extension
awk 'NF >= 4 {print $4}' batchA.complete | cut -f1 -d'.' > batchA.sra.complete

wc -l zoonotic_SraRunInfo.csv
wc -l batchA.sra.complete

# Generate an updated RunInfo File with all matches NOT completed
# This is dropping ~50 entries I'll need to figure out later
grep -vif batchA.sra.complete zoonotic_SraRunInfo.csv > zoonotic_SraRunInfo_Batch6.csv

wc -l zoonotic_SraRunInfo_Batch6.csv

RUNINFO="$SERRATUS/notebook/200505_ab/zoonotic_SraRunInfo_Batch6.csv"

70966 zoonotic_SraRunInfo.csv
15165 batchA.sra.complete
55736 zoonotic_SraRunInfo_Batch6.csv
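
The counts above hint at where the dropped entries come from. If every completed accession removed exactly one row, 70966 - 15165 = 55801 lines would remain, yet only 55736 do, a shortfall of 65. One possible cause (an assumption, not verified against the data) is that `grep -vif` treats each accession as an unanchored, case-insensitive regular expression, so a completed `SRR123` would also remove `SRR1234`. A minimal sketch of a stricter filter using fixed-string, whole-word matching follows; the function name `filter_pending` is hypothetical.

```shell
# filter_pending DONE_LIST RUNINFO
# Print the RUNINFO rows that contain none of the accessions in DONE_LIST.
#   -F  treat each pattern as a fixed string, not a regular expression
#   -w  match whole words only, so SRR123 does not match SRR1234
#   -v  invert the match; -f reads the patterns from DONE_LIST
filter_pending() {
  grep -vwFf "$1" "$2"
}
```

Usage would look like `filter_pending batchA.sra.complete zoonotic_SraRunInfo.csv > zoonotic_SraRunInfo_Batch6.csv`. The match is still made against the whole row rather than the accession column alone, so this narrows the problem rather than ruling it out.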


In [15]:
cd $WORK
RUNINFO="$SERRATUS/notebook/200505_ab/zoonotic_SraRunInfo_Batch6.csv"

# Run the last 5000 files in the zoonotic dataset (bats)
head -n1 $RUNINFO > batch6_zoonotic.csv
tac $RUNINFO | head -n 5000 - >> batch6_zoonotic.csv


tac: write error: Broken pipe
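
The `tac: write error: Broken pipe` message is most likely benign: `head` exits after reading 5000 lines while `tac` is still writing, so the pipe closes under it. The 5000 rows are written either way. A sketch that yields the same rows in the same reversed order without the early pipe close, and that never picks up the header line as data, is shown below; `last_n_rows` is a hypothetical helper name.

```shell
# last_n_rows FILE N
# Print the header, then the last N data rows in reverse order
# (the same row order that `tac FILE | head -n N` produces).
last_n_rows() {
  head -n 1 "$1"                          # header line
  tail -n +2 "$1" | tail -n "$2" | tac    # data rows only, last N, reversed
}
```

Usage would be `last_n_rows "$RUNINFO" 5000 > batch6_zoonotic.csv`.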


In [48]:
cd $WORK
RUNINFO="$SERRATUS/notebook/200505_ab/zoonotic_SraRunInfo_Batch6.csv"

# Run the first 4999 runs of the remaining set (header + 4999 rows)
head -n 5000 $RUNINFO > batch7_zoonotic.csv



In [75]:
cd $WORK
RUNINFO="$SERRATUS/notebook/200505_ab/zoonotic_SraRunInfo_Batch6.csv"

# ERROR: batch8 was actually generated from the original zoonotic_SraRunInfo.csv,
#        so this re-runs lines 5001-15000 of that original file.
# Keep the header plus lines 5001-15000 (10000 runs)
head -n 15000 $RUNINFO > batch8_zoonotic.csv
sed -i '2,5000d' batch8_zoonotic.csv

#head $RUNINFO
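
The `head`/`sed -i` pair in this cell keeps the header plus file lines 5001 to 15000, which is 10000 runs. A sketch of the same slice as a reusable helper with explicit line bounds follows; `slice_rows` is a hypothetical name.

```shell
# slice_rows FILE FIRST LAST
# Print the header (line 1) followed by file lines FIRST..LAST inclusive.
# FIRST must be >= 2, otherwise the header is printed twice.
slice_rows() {
  head -n 1 "$1"
  sed -n "${2},${3}p" "$1"
}
```

Usage would be `slice_rows "$RUNINFO" 5001 15000 > batch8_zoonotic.csv`. Stating both bounds in one place makes it easier to see which rows a batch covers.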



### Terraform Initialization



In [16]:
# Terraform customization
# This version is updated to master
git diff $SERRATUS/terraform/main/main.tf



In [23]:
# Initialize terraform
TF=$SERRATUS/terraform/main
cd $TF
terraform init

[0m[1mInitializing modules...[0m

[0m[1mInitializing the backend...[0m

[0m[1mInitializing provider plugins...[0m

[0m[1m[32mTerraform has been successfully initialized![0m[32m[0m
[0m[32m
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.[0m


In [24]:
cd $TF
# Launch Terraform Cluster
# Initialize the serratus cluster with minimal nodes
terraform apply -auto-approve

[0m[1mmodule.merge.data.aws_availability_zones.all: Refreshing state...[0m
[0m[1mmodule.merge.data.aws_ami.amazon_linux_2: Refreshing state...[0m
[0m[1mmodule.merge.data.aws_region.current: Refreshing state...[0m
[0m[1mmodule.align.data.aws_ami.amazon_linux_2: Refreshing state...[0m
[0m[1mmodule.scheduler.data.aws_ami.amazon_linux_2: Refreshing state...[0m
[0m[1mmodule.align.data.aws_region.current: Refreshing state...[0m
[0m[1mmodule.scheduler.data.aws_region.current: Refreshing state...[0m
[0m[1mmodule.download.data.aws_region.current: Refreshing state...[0m
[0m[1mmodule.align.data.aws_availability_zones.all: Refreshing state...[0m
[0m[1mmodule.work_bucket.aws_s3_bucket.work: Refreshing state... [id=tf-serratus-work-20200507200447766300000002][0m
[0m[1mmodule.download.data.aws_ami.amazon_linux_2: Refreshing state...[0m
[0m[1mmodule.download.data.aws_availability_zones.all: Refreshing state...[0m
[0m[1mmodule.monitoring.data.aws_ami.ec

## Running Serratus 
Upload the run data, scale out the cluster, and monitor performance.


### Run Monitors & Upload table
Open SSH tunnels to the monitor node, then open the monitors in a browser.


In [26]:
cd $TF

# Open SSH tunnels to the monitor
./create_tunnels.sh

# Download Scheduler config file
#curl localhost:8000/config > serratus-config.json

channel 2: open failed: connect failed: Connection refused
channel 2: open failed: connect failed: Connection refused
channel 2: open failed: connect failed: Connection refused
Tunnels created:
    localhost:3000 -- grafana
    localhost:9090 -- prometheus
    localhost:8000 -- scheduler


In [102]:
cd $TF
# Make local changes to config file
cat serratus-config.json
echo '--------'
# Re-upload config file
curl -T serratus-config.json localhost:8000/config

{
"ALIGN_ARGS":"--very-sensitive-local",
"ALIGN_SCALING_CONSTANT":0.1,
"ALIGN_SCALING_ENABLE":true,
"ALIGN_SCALING_MAX":10,
"CLEAR_INTERVAL":600,
"DL_ARGS":"",
"DL_SCALING_CONSTANT":0.1,
"DL_SCALING_ENABLE":true,
"DL_SCALING_MAX":0,
"GENOME":"cov2r",
"MERGE_ARGS":"",
"MERGE_SCALING_CONSTANT":0.1,
"MERGE_SCALING_ENABLE":true,
"MERGE_SCALING_MAX":0,
"SCALING_INTERVAL":30
}
--------
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
{"ALIGN_ARGS":"--very-sensitive-local","ALIGN_SCALING_CONSTANT":0.1,"ALIGN_SCALING_ENABLE":true,"ALIGN_SCALING_MAX":10,"CLEAR_INTERVAL":600,"DL_ARGS":"","DL_SCALING_CONSTANT":0.1,"DL_SCALING_ENABLE":true,"DL_SCALING_MAX":0,"GENOME":"cov2r","MERGE_ARGS":"","MERGE_SCALING_CONSTANT":0.1,"MERGE_SCALING_ENABLE":true,"MERGE_SCALING_MAX":0,"SCALING_INTERVAL":30}
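
Local changes to `serratus-config.json` can be made in any editor. For a scripted change, a sketch using `sed` follows. It assumes the one-key-per-line layout printed by `cat` above and a numeric value; `set_config_number` is a hypothetical helper and not part of Serratus.

```shell
# set_config_number FILE KEY VALUE
# Print FILE with the numeric value of "KEY" replaced by VALUE.
# The surrounding quotes keep KEY from matching inside a longer key name.
set_config_number() {
  sed -E "s/\"$2\":[0-9.]+/\"$2\":$3/" "$1"
}
```

For example, `set_config_number serratus-config.json ALIGN_SCALING_MAX 20 > new-config.json` would write an edited copy, which could then be uploaded with the same `curl -T` call used in this cell.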


In [28]:
# Load SRA Run Info into scheduler (READY) BATCH 6
curl -s -X POST -T $WORK/batch6_zoonotic.csv localhost:8000/jobs/add_sra_run_info/

{"inserted_rows":5000,"total_rows":5000}


In [49]:
# Load SRA Run Info into scheduler (READY) BATCH 7
curl -s -X POST -T $WORK/batch7_zoonotic.csv localhost:8000/jobs/add_sra_run_info/

{"inserted_rows":4999,"total_rows":9999}


In [76]:
# Load SRA Run Info into scheduler (READY) BATCH 8
curl -s -X POST -T $WORK/batch8_zoonotic.csv localhost:8000/jobs/add_sra_run_info/

{"inserted_rows":10000,"total_rows":19999}
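
The three uploads report 5000, 4999 and 10000 inserted rows, 19999 in total. Because the batches were sliced from different ends and offsets of the RunInfo files, it may be worth confirming that no accession was queued twice. A sketch that lists the accessions shared by two batch files follows; it assumes the accession is the first comma-separated column, and `shared_accessions` is a hypothetical helper.

```shell
# shared_accessions FILE_A FILE_B
# Print accessions (column 1, header skipped) that appear in both CSV files.
shared_accessions() {
  a=$(mktemp); b=$(mktemp)
  tail -n +2 "$1" | cut -d',' -f1 | sort -u > "$a"
  tail -n +2 "$2" | cut -d',' -f1 | sort -u > "$b"
  comm -12 "$a" "$b"    # -12 keeps only the lines common to both sorted lists
  rm -f "$a" "$b"
}
```

An empty result from `shared_accessions batch6_zoonotic.csv batch8_zoonotic.csv` would indicate that the two batches do not overlap.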


### Scale up the cluster

Cluster scale-in and scale-out are automated, so this step should be "set it and forget it".


In [16]:
# Error fixes (manually help along)

# Reset Split_err
# sqlite3 instance/scheduler.sqlite 'UPDATE blocks SET state = "new" WHERE state = "fail";'

# Clear DONE Accessions
# sqlite3 instance/scheduler.sqlite 'DELETE FROM acc WHERE state = "merge_done";'

# Reset a single split job through the scheduler API
#curl -X POST "localhost:8000/jobs/split/36?state=new&N_paired=0&N_unpaired=0"

# Reset splitting accessions to new
# sqlite3 instance/scheduler.sqlite 'UPDATE acc SET state = "new" WHERE state = "splitting";'

# Reset ALIGNING blocks to NEW
# sqlite3 instance/scheduler.sqlite 'UPDATE blocks SET state = "new" WHERE state = "aligning";'

#X=36; Y=36; STATE='new';
#for BLOCK_ID in $(seq $X $Y);
#do
#  curl -X POST "localhost:8000/jobs/split/$BLOCK_ID?state=new&N_paired=0&N_unpaired=0"
#done

#X=4218; Y=4218; STATE='new';
#for BLOCK_ID in $(seq $X $Y);
#do
#  curl -X POST -s "localhost:8000/jobs/align/$BLOCK_ID?state=$STATE"
#done

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>400 Bad Request</title>
<h1>Bad Request</h1>
<p>The browser (or proxy) sent a request that this server could not understand.</p>
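
The commented-out loops in this cell reset a range of block IDs through the scheduler's HTTP API. A sketch that wraps the align-reset loop in a function and prints the HTTP status of each request, so that responses such as the 400 above show up per block, is given below. The endpoint is the one used in this notebook; `reset_align_blocks` and the `DRY_RUN` switch are hypothetical.

```shell
# reset_align_blocks FIRST LAST [STATE]
# POST a state change for every block ID in FIRST..LAST (STATE defaults to "new").
# With DRY_RUN=1 the URLs are printed and no request is sent.
reset_align_blocks() {
  state="${3:-new}"
  for block_id in $(seq "$1" "$2"); do
    url="localhost:8000/jobs/align/${block_id}?state=${state}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$url"
    else
      # -o /dev/null discards the body; -w prints only the HTTP status code
      code=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$url")
      echo "${block_id} ${code}"
    fi
  done
}
```

Running `DRY_RUN=1 reset_align_blocks 4218 4218` first shows exactly which URLs would be hit before any scheduler state is changed.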



## Shutting down procedures

Closing up shop.

In [103]:
# Dump the Scheduler SQLITE table to a local file
date
curl localhost:8000/db > \
  $WORK/zoonotic_batch8_checkpoint.sqlite

Sat May  9 00:05:59 PDT 2020
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0 52.4M    0 48762    0     0   140k      0  0:06:22 --:--:--  0:06:22  140k
  2 52.4M    2 1599k    0     0  1255k      0  0:00:42  0:00:01  0:00:41 1254k
  6 52.4M    6 3487k    0     0  1535k      0  0:00:34  0:00:02  0:00:32 1534k
 10 52.4M   10 5503k    0     0  1682k      0  0:00:31  0:00:03  0:00:28 1681k
 14 52.4M   14 7583k    0     0  1766k      0  0:00:30  0:00:04  0:00:26 1766k
 17 52.4M   17 9599k    0     0  1811k      0  0:00:29  0:00:05  0:00:24 1926k
 21 52.4M   21 11.4M    0     0  1871k      0  0:00:28  0:00:06  0:00:22 2028k
 26 52.4M   26 13.6M    0     0  1926k      0  0:00:27  0:00:07  0:00:20 2103k
 30 52.4M   30 15.9M    0     0  1972k      0  0:00:27  0:00:08  0:00:19 2162k
 34 52.4M   34 18.0

## Destroy Cluster

Close out all resources with terraform (will take a few minutes).


In [None]:
# WARNING: this will also delete the standard output bucket and its data.
# Save data prior to destroy.
terraform destroy -auto-approve
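
Since the destroy step removes the output bucket, a small guard can make the "save first" rule harder to skip. The sketch below only checks that a checkpoint file exists and is non-empty; it does not verify that the file is a complete SQLite dump. `checkpoint_saved` is a hypothetical helper.

```shell
# checkpoint_saved FILE
# Succeed only if FILE exists and is non-empty.
checkpoint_saved() {
  [ -s "$1" ]
}

# Usage (left commented out so that nothing is destroyed by accident):
# checkpoint_saved "$WORK/zoonotic_batch8_checkpoint.sqlite" \
#   && terraform destroy -auto-approve
```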

### Run Notes

#### SRA run error

This happened again, same as last night. The failures fall within a few minutes of one another, so there may be an NCBI-side event occurring at ~04:45 UTC that causes each instance to time out.

```
2020-05-09T04:47:18 fastq-dump.2.10.4 err: connection failed while opening file within cryptographic module - error with https open 'https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbGtpZDEiLCJ0eXAiOiJKV1QifQ.eyJleHAiOjE1ODkwMDI5MzgsImlhdCI6MTU4ODk5OTMzOCwibGluayI6Imh0dHBzOi8vc3JhLXB1Yi1ydW4tNC5zMy5hbWF6b25hd3MuY29tL1NSUjEwOTA5NzA2L1NSUjEwOTA5NzA2LjE_bmNiaV9waGlkPTkzOUI4RkVERTQyRjc4MzUwMDAwNUQwN0Y2RDIxMzlFLjEuMSZ4LWFtei1yZXF1ZXN0LXBheWVyPXJlcXVlc3RlciIsInJlZ2lvbiI6InVzLWVhc3QtMSIsInNlcnZpY2UiOiJzMyIsInNpZ25pbmdBY2NvdW50Ijoic3JhX3MzIiwidGltZW91dCI6NjAwMH0.G8CC9PLqH_N9mMaJk_aWIHIQvkSD1V--IeHfkMpWD9CmCMR5dHXXlMmABWqlrCb_c0b17--Gh2lqqhhIsj1I7186mzfKDSUE-btVlQNnV2L2J3SkdcpXll1zgqSs2dp5FgE4ANsJxNFI9AYxcBIWxW2CyPrkm3QahvXUdWHvFkhMhZdmHu3HSBgL4RFLf5eTbUFf_GPM0hcFWv9Jj6Q18YxfVutyx3Y3KV_UkfKTbgo6yo_WMAkzzOAUt7JgbgJyTzJjGPy1suNI_fB9gJvq3KXaQ57eMMUCaKEdbNL5YQD6qVRiU13mvC1zZ1jKLJ8hyvim4G632zHCW-GbVBpTZg'
```
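
To test the timing hypothesis, the failure timestamps could be binned by minute across instances. The sketch below assumes the download logs have been collected into a single file and that each error line starts with an ISO 8601 timestamp, as in the excerpt above; `errors_per_minute` is a hypothetical helper.

```shell
# errors_per_minute LOGFILE
# Count "connection failed" lines per minute (YYYY-MM-DDTHH:MM).
errors_per_minute() {
  grep -F 'connection failed' "$1" \
    | cut -c1-16 \
    | sort \
    | uniq -c
}
```

The first 16 characters of each timestamp are the date, hour and minute, so a sharp peak in one or two adjacent minutes on both nights would support a shared upstream event rather than independent instance failures.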

