# Start with AWS

### Useful Links
- [Connect to the LPC CAF](https://uscms.org/uscms_at_work/physics/computing/getstarted/uaf.shtml)
- [Amazon EC2 F1 Instances](https://aws.amazon.com/ec2/instance-types/f1/)
- [Jessica Lan's tutorial](https://github.com/jylan98/aws/blob/master/AWS%20Tutorial.ipynb)
- [Xilinx ML Suite](https://github.com/Xilinx/ml-suite)
- [Docker documentation](https://docs.docker.com/)

### Setup
First, get an FNAL account. Then contact Burt (email: burt@fnal.gov, or via the cms-aws-ml channel in Slack) to set up an F1 instance with one of the following AMIs (AMI is short for Amazon Machine Image):
1. Xilinx ML AMI
2. Ubuntu 16.04 AMI

I recommend the second option, since I learned from the Xilinx team that the latest release of the Xilinx ML Suite is based on a Docker container. Follow the next part to install the Xilinx ML Suite Docker image.

For the first option, the next steps can be found in Jessica's tutorial.

#### Install Xilinx ML Suite via docker
##### Steps 
    
1. Launch an instance using the Ubuntu 16.04 AMI from the AWS Marketplace
2. Disable auto-upgrades. If they stay enabled, a kernel update will break the installed XRT, which then has to be reinstalled.
   Edit `/etc/apt/apt.conf.d/20auto-upgrades` as below:
   
    ```
    APT::Periodic::Update-Package-Lists "0";
    APT::Periodic::Unattended-Upgrade "0";
    ```
3. Run `sudo apt-get update` to update the package list
4. Install Docker on Ubuntu (I followed the steps in [https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-16-04](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-16-04)):
  
  1. Add the GPG key for the official Docker repository to the instance: 
  
     ``` bash
     curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
     ```
  
  2. Add the Docker repository to APT sources:
  
     ``` bash
     sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
     ```
  
  3. Update the package database with the Docker packages from the newly added repo:
  
     ```bash
     sudo apt-get update
     ```
  
  4. Make sure you are about to install from the Docker repo instead of the default Ubuntu 16.04 repo:
  
     ``` bash
     apt-cache policy docker-ce
     ```
  
  5. Install Docker:
  
     ```bash
     sudo apt-get install -y docker-ce
     ```
  
  6. Check if Docker is running:
  
     ```bash
     sudo systemctl status docker
     ```
  
  7. To run Docker without `sudo`, add your user to the `docker` group (log out and back in for the change to take effect):
  
     ```bash
     sudo usermod -aG docker ${USER}
     ```

5. Steps to install XRT on Ubuntu on AWS (run these commands in a bash terminal):
   1. `git clone https://github.com/aws/aws-fpga.git`
   2. `cd aws-fpga`
   3. `git clone -b 2018.2_XDF.RC5 https://github.com/Xilinx/XRT.git`
   4. `sudo apt-get install gcc`
   5. `sudo apt-get install make`
   6. `source sdaccel_setup.sh` (Do not worry if this fails. The point is to generate `aws-fpga/sdk/userspace/lib/libfpga_mgmt.a`; the `lib` directory under `userspace` does not exist when `aws-fpga` is freshly cloned.)
   7. `XRT/src/runtime_src/tools/scripts/xrtdeps.sh` (installs the XRT dependencies)
   8. `cd XRT/build`
   9. `./build.sh`
   10. `sudo dpkg -i xrt_201802.2.1.0_16.04-xrt.deb`, then `sudo dpkg -i xrt_201802.2.1.0_16.04-aws.deb`

6. Download the Xilinx ML Suite Docker image and/or `scp` it to the instance

   Sign up for an account and download the image: https://www.xilinx.com/member/forms/download/eula-xef.html?filename=xilinx-ml-suite-ubuntu-16.04-xrt-2018.2-caffe-mls-1.4.tar.gz

   Note: the image cannot be fetched with `wget` directly from this URL. After accepting the Xilinx End User License Agreement, I copied the download link from the browser and used that link with `wget`.

7. Load the image

   ```bash
   docker load -i xilinx-ml-suite-ubuntu-16.04-xrt-2018.2-caffe-mls-1.4.tar.gz
   ```

8. Start the container with a script

   ```bash
   ./docker_run.sh
   ```
   The script `docker_run.sh` is listed at the end of this document.
   
   
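
The steps above can be spot-checked with a small script. This is a minimal sketch, assuming the auto-upgrades file from step 2 and the XRT install prefix `/opt/xilinx/xrt` that `docker_run.sh` uses; the function names are hypothetical and not part of XRT or the ML Suite.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of XRT or ML Suite.

# Return 0 if both periodic APT settings are "0" in the given file.
auto_upgrades_disabled () {
  local conf="${1:-/etc/apt/apt.conf.d/20auto-upgrades}"
  grep -Eq '^[[:space:]]*APT::Periodic::Update-Package-Lists[[:space:]]+"0";' "$conf" 2>/dev/null &&
  grep -Eq '^[[:space:]]*APT::Periodic::Unattended-Upgrade[[:space:]]+"0";' "$conf" 2>/dev/null
}

# Print one status line per check: auto-upgrades, Docker, XRT.
check_setup () {
  if auto_upgrades_disabled; then
    echo "auto-upgrades: disabled"
  else
    echo "auto-upgrades: still enabled (or file missing)"
  fi
  if command -v docker >/dev/null 2>&1; then
    echo "docker: found"
  else
    echo "docker: not found"
  fi
  if [ -d /opt/xilinx/xrt ]; then
    echo "XRT: /opt/xilinx/xrt exists"
  else
    echo "XRT: /opt/xilinx/xrt missing"
  fi
}
```

After sourcing the file, `check_setup` prints three status lines. It only checks that things are present, not that the FPGA is usable.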
#####  Tips:
1. You may encounter the error `no space left on device`. If this happens, ask Burt to attach another disk, mount it wherever you like, and then tell Docker to store its images on that mount point.   
   To do so, edit `/etc/docker/daemon.json`; more details can be found at https://docs.docker.com/config/daemon/systemd/

2. You need to learn some basic Docker operations. Here are a few pointers.   
   To keep your modifications after exiting a container and come back to them later, remove the `--rm` option in the script `docker_run.sh` (see below). Then run `docker container ls -a` to get the container ID, followed by `docker container restart <id>` and `docker attach <id>`. However, this does not load the drivers, so write a new script that loads the drivers and then restarts the container. 
   More details can be found in the Docker documentation. 

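For the first tip, a `daemon.json` that moves Docker's storage could be generated as sketched below. This assumes a Docker release that supports the `data-root` key; the mount point `/mnt/docker-data` and the function name are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: /mnt/docker-data is a hypothetical mount point for the extra disk.

# Print a minimal daemon.json that moves Docker's storage to the given directory.
make_daemon_json () {
  local data_root="$1"
  printf '{\n  "data-root": "%s"\n}\n' "$data_root"
}

# Usage (run by hand; this overwrites any existing daemon.json):
#   make_daemon_json /mnt/docker-data | sudo tee /etc/docker/daemon.json
#   sudo systemctl restart docker
#   docker info | grep "Docker Root Dir"
```

Images loaded before the change stay in the old location, so the `docker load` step would need to be repeated afterwards.
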
# FPGA accelerated inference service
  
  
Start here: https://arxiv.org/abs/1904.08986. The goal is to repeat these studies using AWS.  
Slides from my talk: https://docs.google.com/presentation/d/1mUO8WuVR9GddkO8Zb7iNP_5QWjdt8TA4kAmZXEwQ4mw/edit?usp=sharing   


To begin with, familiarize yourself with the tutorials and examples on the Xilinx page:
https://github.com/Xilinx/ml-suite/tree/master/examples/caffe
https://github.com/Xilinx/ml-suite/tree/master/examples/deployment_modes


I tried to measure the inference latency on AWS in two modes: as an edge service and as a cloud service. 
For the edge service, I got some strange results from the XDNN pipeline report. The expert explained that the differences may be related to the setup (including the hardware).
For the cloud service, I followed the tutorial to start the REST API on the local instance, so I could send images to localhost for inference and measure the time. This is not a real cloud service, since the client and the server run on the same instance.  

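The REST latency measurement described above could be scripted roughly as follows. This is a sketch under assumptions: the URL and the form field name `image` are hypothetical and must be taken from the REST tutorial you are running; only the `curl` options and the `awk` summary are standard.

```bash
#!/usr/bin/env bash
# Sketch only: the URL and the "image" form field are assumptions;
# take the real ones from the REST tutorial you are running.

# Print the total time in seconds that curl needs for one POST of an image.
time_request () {
  local url="$1" image="$2"
  curl -s -o /dev/null -w '%{time_total}\n' -F "image=@${image}" "$url"
}

# Read one latency per line on stdin and print count, mean, min, and max.
summarize_latency () {
  awk 'NF { v = $1 + 0; n++; s += v
            if (n == 1 || v < min) min = v
            if (n == 1 || v > max) max = v }
       END { if (n) printf "n=%d mean=%.4f min=%.4f max=%.4f\n", n, s / n, min, max }'
}

# Usage (hypothetical endpoint and file name):
#   for i in $(seq 100); do
#     time_request http://localhost:5000/predict cat.jpg
#   done | summarize_latency
```

Run on the instance itself, this reproduces the localhost measurement; running the same client from another machine would include the network path.
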


A possible plan: start by learning these tools and verifying the results in my talk, then move on to top jet classification, and finally compare the performance with the Microsoft Azure results in the paper. 

docker_run.sh
``` bash
#!/usr/bin/env bash

############################################################################
###############################  Hack For AWS ##############################
############################################################################

sudo /opt/xilinx/xrt/bin/awssak query # Need to run this before changing permissions

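# Give group and others the same permissions as the owner on one path
setperm () {
  sudo chmod g=u "$1"
  sudo chmod a=u "$1"
}
# Apply setperm to every entry in a PCI device directory
setfpgaperm () {
  for f in "$1"/*; do setperm "$f"; done
}
# Open up every PCI device whose class matches 0x058000
for d in /sys/bus/pci/devices/*; do grep -q "0x058000" "$d/class" && setfpgaperm "$d"; done
setperm /sys/bus/pci/rescan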

####################################################################################
####################################################################################
####################################################################################

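# Directory containing this script
HERE="$(dirname "$(readlink -f "$0")")"

# Shared directory that is mounted into the container below
mkdir -p "$HERE/share"
chmod -R a+rwx "$HERE/share"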

xclmgmt_driver="$(find /dev -name xclmgmt\*)"
docker_devices=""
echo "Found xclmgmt driver(s) at ${xclmgmt_driver}"
for i in ${xclmgmt_driver} ;
do
  docker_devices+="--device=$i "
done

render_driver="$(find /dev/dri -name renderD\*)"
echo "Found render driver(s) at ${render_driver}"
for i in ${render_driver} ;
do
  docker_devices+="--device=$i "
done

#sudo \ 
docker run \
  --rm \
  --net=host \
  --privileged=true \
  --log-driver none \
  -it \
  $docker_devices \
  -v "$HERE/share":/opt/ml-suite/share \
  -v /opt/xilinx:/opt/xilinx \
  -w /opt/ml-suite \
  xilinx-ml-suite-ubuntu-16.04-xrt-2018.2-caffe-mls-1.4:latest \
  bash
```
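
For returning to a container that was started without `--rm`, a restart script could look like the sketch below. It assumes that "loading the drivers" means repeating the "Hack For AWS" part of `docker_run.sh` on the host; the function names and the `DRY_RUN` switch are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: function names and the DRY_RUN switch are hypothetical.

# Run a command, or just print it when DRY_RUN is set (useful for checking).
run () {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Redo the host-side FPGA setup from docker_run.sh (query, then permissions).
# "chmod a=u" copies the owner's permissions to group and others.
reload_fpga_access () {
  run sudo /opt/xilinx/xrt/bin/awssak query
  local d f
  for d in /sys/bus/pci/devices/*; do
    grep -q "0x058000" "$d/class" 2>/dev/null || continue
    for f in "$d"/*; do run sudo chmod a=u "$f"; done
  done
  run sudo chmod a=u /sys/bus/pci/rescan
}

# Restart a stopped container and attach to it.
reattach_container () {
  local id="$1"
  run docker container restart "$id"
  run docker attach "$id"
}

# Usage (get <id> from "docker container ls -a"):
#   reload_fpga_access && reattach_container <id>
```

Setting `DRY_RUN=1` prints the commands instead of running them, which makes it easy to review what the script would do before touching the instance.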

