# Start a Cromwell server

`Server mode` provides the ability to submit multiple workflows to Cromwell, asynchronously, for execution.
The Cromwell server will take care of orchestration, but does not execute workflow `tasks` directly.
Workflow `tasks` are executed on separate VMs which are scheduled and monitored by the Google Life Sciences API.

The Cromwell server opens a local port (`8000` by default) to receive job submission requests over a simple REST API.
You can submit requests to the Cromwell server for workflow execution, job monitoring, and job cancellation using
command-line tools (such as `curl`), Python's [urllib.request](https://docs.python.org/3/library/urllib.request.html#module-urllib.request), or a purpose-built
tool such as [Cromshell](https://github.com/broadinstitute/cromshell).
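
As a rough illustration of the `urllib.request` route, the sketch below builds status and abort requests for a single workflow. The endpoint paths (`/api/workflows/v1/<id>/status` and `/api/workflows/v1/<id>/abort`) reflect our understanding of the Cromwell REST API and should be checked against the documentation for your Cromwell version. The helper names and the `CROMWELL_URL` constant are hypothetical and are not used elsewhere in this notebook.

```python
import json
import urllib.request

# Assumption: Cromwell is listening on its default local port.
CROMWELL_URL = "http://127.0.0.1:8000"


def status_request(workflow_id, base_url=CROMWELL_URL):
    """Build a GET request for the status of one workflow."""
    url = f"{base_url}/api/workflows/v1/{workflow_id}/status"
    return urllib.request.Request(url, method="GET")


def abort_request(workflow_id, base_url=CROMWELL_URL):
    """Build a POST request asking Cromwell to abort one workflow."""
    url = f"{base_url}/api/workflows/v1/{workflow_id}/abort"
    return urllib.request.Request(url, data=b"", method="POST")


def send(request, timeout=10):
    """Send a request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Once the server is running, `send(status_request(workflow_id))` would return the decoded JSON response for a workflow ID that Cromwell gave you at submission time.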

### Notebook setup

#### Set up utility functions

In [None]:
def get_bucket_url_from_reference(bucket_reference):
    '''Resolves bucket URL from bucket reference in workspace.'''
    # The bucket URL is taken from the first line of the `terra resolve` output.
    BUCKET_CMD_OUTPUT = !terra resolve --name={bucket_reference}
    BUCKET = BUCKET_CMD_OUTPUT[0]
    return BUCKET
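
The `!` syntax above only works inside IPython. If you want the same lookup in a plain Python script, a minimal sketch using `subprocess` might look like the following. The function name `resolve_bucket_url` and its injectable `run` parameter are hypothetical; the `terra resolve --name=...` command is the one used in the cell above.

```python
import subprocess


def resolve_bucket_url(bucket_reference, run=subprocess.run):
    """Return the first output line of `terra resolve --name=<reference>`.

    `run` defaults to subprocess.run; passing a stand-in makes the
    function easy to exercise without the terra CLI installed.
    """
    result = run(
        ["terra", "resolve", f"--name={bucket_reference}"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return result.stdout.splitlines()[0].strip()
```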

#### Workspace setup

<div class="alert alert-block alert-info">
<b>Note:</b> This notebook assumes that <code>workspace_setup.ipynb</code> in the parent directory has been run.
</div>
    
`workspace_setup.ipynb` creates two Cloud Storage buckets for your workspace files with workspace reference names: 

 - ws_files   
 - ws_files_autodelete_after_two_weeks      
    
The code in this notebook will write output files to the "autodelete" bucket by default.  
    Any file in this bucket will be automatically deleted <b>two weeks</b> after it is written.  
    This alleviates the need for you to remember to clean up temporary and example files manually.  
    If you want to write outputs to a durable location, simply change the assignment of the `BUCKET_REFERENCE` variable in the cell below and re-run the notebook. 

In [None]:
# Change this to "ws_files" to use the durable workspace bucket instead of the autodelete bucket.
BUCKET_REFERENCE = "ws_files_autodelete_after_two_weeks"

#### Cloud environment setup

The notebooks in this workspace create a few files on your cloud environment. For clarity and to ease cleanup after
running the tutorials, the notebooks write, by default, to a well-defined location determined by
`CROMWELL_EXAMPLES_DIR`. You are free to change this location to suit your own use cases.

In [None]:
import os

CROMWELL_EXAMPLES_DIR=os.path.expanduser('~/terra-tutorials/cromwell')
CROMWELL_CONF=f'{CROMWELL_EXAMPLES_DIR}/cromwell.conf'
CROMWELL_SERVER_LOG=f'{CROMWELL_EXAMPLES_DIR}/cromwell.server.log'

!mkdir -p {CROMWELL_EXAMPLES_DIR}

print(f'Tutorial files will be written locally to {CROMWELL_EXAMPLES_DIR}')
print()
print(f'Cromwell configuration file will be written to {CROMWELL_CONF}')
print(f'Cromwell server log file will be written to {CROMWELL_SERVER_LOG}')

## Configure your server

Run the following cell to generate the server configuration file, `cromwell.conf`, using the [terra CLI](https://github.com/DataBiosphere/terra-cli).<br>The generated file also configures the Cromwell server to submit jobs through the [Life Sciences API](https://cloud.google.com/life-sciences/docs/concepts/introduction).

In [None]:
!rm -f {CROMWELL_CONF}
!terra cromwell generate-config --dir={CROMWELL_EXAMPLES_DIR} --workspace-bucket-name={BUCKET_REFERENCE}

## Start a MySQL DB

In order to store job state, Cromwell needs a database attached.

In [None]:
!docker run -p 3306:3306 \
    --name MySQLContainer \
    -e MYSQL_ROOT_PASSWORD=cromwell \
    -e MYSQL_DATABASE=cromwell_db \
    -e MYSQL_USER=cromwell \
    -e MYSQL_PASSWORD=cromwell \
    -d mysql/mysql-server:5.5 \
    --max-allowed-packet=16M

We also need to modify the Cromwell config file to use this database.

In [None]:
db_config_content = """
}
database {
  profile = "slick.jdbc.MySQLProfile$"
  db {
    driver = "com.mysql.cj.jdbc.Driver"
    url = "jdbc:mysql://localhost/cromwell_db?rewriteBatchedStatements=true&useSSL=false"
    user = "cromwell"
    password = "cromwell"
    connectionTimeout = 5000
  }
}
"""

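# Splice the database stanza in at the last closing brace of the generated config.
# The stanza starts with its own "}" to restore the brace consumed by rsplit().
with open(CROMWELL_CONF, 'r') as conf_file:
    conf_file_contents = conf_file.read()
li = conf_file_contents.rsplit('}', 1)
new_conf_file_contents = db_config_content.join(li)
with open(CROMWELL_CONF, 'w') as conf_file:
    conf_file.write(new_conf_file_contents)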
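
To make the `rsplit`/`join` step easier to follow, here is the same string manipulation as a small standalone function applied to a toy config. The function name `append_after_last_brace` and the sample strings are hypothetical and serve only as illustration.

```python
def append_after_last_brace(conf_text, block):
    """Insert `block` in place of the last '}' in `conf_text`.

    `block` is expected to start with its own '}', so the brace removed
    by the split is restored and the new stanza follows it.
    Raises ValueError if `conf_text` contains no '}' at all.
    """
    head, tail = conf_text.rsplit("}", 1)
    return head + block + tail


toy_conf = "backend {\n  default = x\n}\n"
toy_block = "\n}\ndatabase {\n  db {}\n}\n"
print(append_after_last_brace(toy_conf, toy_block))
```

Note that running the splice twice on the same file would add the stanza twice; the configuration cell earlier in this notebook removes and regenerates `cromwell.conf`, so re-running the notebook from that cell starts from a clean file.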


## Starting your server

To start Cromwell in server mode as a background task, execute the cell below, which will launch Cromwell and send all of the server messages to the file `cromwell.server.log`.
It will take a few seconds for Cromwell to complete its startup sequence and be ready to receive requests. 

In [None]:
%%bash -s {CROMWELL_CONF} {CROMWELL_SERVER_LOG}

# To run a shell command in the background from an IPython notebook, we need to use the
# %%bash magic instead of the "!" notation.

CROMWELL_CONF="$1"
CROMWELL_SERVER_LOG="$2"

java -Xms10g -Xmx10g -Dconfig.file="${CROMWELL_CONF}" -jar "${CROMWELL_JAR}" server &> "${CROMWELL_SERVER_LOG}" &

When Cromwell is ready to receive requests, it will emit a message to the log. 
You can check for this message by running the cell below. You should then see something like:

`Cromwell 81 service started on 0:0:0:0:0:0:0:0:8000...`

### Waiting for the server to start

In [None]:
!while ! grep "Cromwell.*service started" {CROMWELL_SERVER_LOG}; do \
   echo "Did not detect Cromwell service start line; retrying in 3 seconds"; \
   sleep 3s; \
 done
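
The shell loop above retries indefinitely, which can hang the cell if the server fails to start. As an alternative, the sketch below does the same polling in Python but gives up after a timeout. The function name and its default timeout and interval are assumptions chosen for illustration.

```python
import re
import time


def wait_for_log_line(log_path, pattern, timeout_s=120.0, interval_s=3.0):
    """Poll `log_path` until a line matches `pattern` or the timeout expires.

    Returns the first matching line, or None if the timeout is reached.
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with open(log_path, "r", errors="replace") as log_file:
                for line in log_file:
                    if regex.search(line):
                        return line.rstrip("\n")
        except FileNotFoundError:
            pass  # The server may not have created the log file yet.
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)
```

In this notebook it could be called as `wait_for_log_line(CROMWELL_SERVER_LOG, r"Cromwell.*service started")`; a `None` result suggests inspecting the log for startup errors.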

### Verify server port is open

You can also poll the Cromwell server port by running the cell below. You should then see something like:

`{"cromwell":"81"}`

In [None]:
!curl http://127.0.0.1:8000/engine/v1/version
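
The same check can be made from Python, which is convenient if later cells need the version as a value. This is a minimal sketch that assumes the response body has the shape shown above (`{"cromwell":"81"}`); the function names are hypothetical.

```python
import json
import urllib.request


def parse_version(body):
    """Extract the version string from a body like '{"cromwell":"81"}'."""
    return json.loads(body)["cromwell"]


def get_cromwell_version(base_url="http://127.0.0.1:8000", timeout=5):
    """Query the version endpoint used by the curl command above.

    Raises urllib.error.URLError if the server is not reachable.
    """
    url = f"{base_url}/engine/v1/version"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_version(response.read().decode("utf-8"))
```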

## Stopping your server

To stop the Cromwell server, we need to kill the running process (pausing your cloud environment will also kill the process).

### Look up the process

To find and kill the process, you have several command-line tools available. Run the cell below to use `pgrep` to list all Java processes and then narrow down the list based on our command-line arguments. The output should resemble:

`<PID> java -Dconfig.file=<PATH> -jar cromwell/cromwell-81.jar server`

In [None]:
!pgrep "java" --list-full | grep "java .* -jar .*cromwell.*\.jar server"
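
If you prefer to do the filtering in Python, for example on output captured with `lines = !pgrep "java" --list-full`, a sketch along these lines extracts the matching PIDs. The regular expression mirrors the `grep` pattern above, with an added anchor that captures the leading PID; the function name is hypothetical.

```python
import re

# Same pattern as the grep above, anchored so the leading PID can be captured.
CROMWELL_PROCESS = re.compile(r"^(\d+) java .* -jar .*cromwell.*\.jar server")


def find_cromwell_pids(pgrep_lines):
    """Return the PIDs of lines that look like a Cromwell server process."""
    pids = []
    for line in pgrep_lines:
        match = CROMWELL_PROCESS.match(line)
        if match:
            pids.append(int(match.group(1)))
    return pids
```

An empty list means no Cromwell server process was found, and more than one entry means several servers are running and should be reviewed before killing anything.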

### Kill the process

Assuming the above has correctly identified your Cromwell server process, execute the cell below to kill the server.

In [None]:
%%bash

SERVER_PID="$(pgrep "java" --list-full | grep "java .* -jar .*cromwell.*\.jar server" | cut -d" " -f1)"

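# Stop early if no matching process was found, rather than calling kill with an empty argument.
if [[ -z "${SERVER_PID}" ]]; then
  echo "No running Cromwell server process found"
  exit 0
fi

echo "Killing process ${SERVER_PID}"
if kill -TERM "${SERVER_PID}"; then
  echo "Termination signal sent to ${SERVER_PID}"
fi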

You can run the following cell to check for new messages in `cromwell.server.log` that show the server has exited.

Your output should appear similar to:

```
2022-10-26 22:48:04,389 INFO - Shutting down connection pool: curAllocated=0 idleQueues.size=0 waitQueue.size=0 maxWaitQueueLimit=256 closed=false

2022-10-26 22:48:04,389 INFO - Shutting down connection pool: curAllocated=0 idleQueues.size=0 waitQueue.size=0 maxWaitQueueLimit=256 closed=false

2022-10-26 22:48:04,393 INFO - Database closed

2022-10-26 22:48:04,393 INFO - Stream materializer shut down

2022-10-26 22:48:04,408 INFO - WDL HTTP import resolver closed
```

In [None]:
!tail -n 5 {CROMWELL_SERVER_LOG}