Commit

Merge branch 'master' of github.com:CoEDL/elpis
aviraljain99 committed Oct 18, 2022
2 parents 9d55b59 + 83ca0c4 commit 118fbce
Showing 15 changed files with 178 additions and 199 deletions.
17 changes: 10 additions & 7 deletions docs/index.rst
@@ -3,21 +3,24 @@ Welcome to the Elpis ASR documentation!

Elpis is a tool which language workers with minimal computational experience can use to build their own speech recognition system for automatically transcribing audio.

.. toctree::
:maxdepth: 2
:caption: Elpis workshop

wiki/elpis-workshop
wiki/preparing-files

.. toctree::
:maxdepth: 1
:caption: Installing Elpis

wiki/install-elpis-docker
wiki/install-elpis-on-gcp
wiki/install-elpis-on-gcp-kaldi
wiki/install-elpis-on-gcp-gpu


.. toctree::
:maxdepth: 2
:caption: Elpis workshop

wiki/elpis-workshop
wiki/preparing-files


.. toctree::
:maxdepth: 1
:caption: For developers
4 changes: 2 additions & 2 deletions docs/requirements.txt
@@ -1,4 +1,4 @@
sphinx>2.1
sphinx-autodoc-typehints<1.12.0
sphinx
sphinx-autodoc-typehints
sphinx-rtd-theme
recommonmark
10 changes: 0 additions & 10 deletions docs/usage/installation.rst

This file was deleted.

2 changes: 0 additions & 2 deletions docs/usage/quick_start_guide.rst

This file was deleted.

2 changes: 0 additions & 2 deletions docs/usage/user_manual.rst

This file was deleted.

11 changes: 7 additions & 4 deletions docs/wiki/elpis-dev-recipe.md
@@ -35,10 +35,13 @@ yarn install && yarn watch
Run the Elpis Docker image. Mount your local repositories into the container, leaving out the mounts you aren't actively developing. This way you use the venv inside the Docker container rather than setting up your own, which avoids version issues.

```shell
docker run --rm -it -p 5001:5001/tcp \
-v ~/sandbox/state:/state \
-v ~/sandbox/elpis:/elpis \
--entrypoint zsh coedl/elpis:latest
docker run --rm -it \
--name elpis \
-p 5001:5001/tcp \
-p 6006:6006/tcp \
-v ~/sandbox/state:/state \
-v ~/sandbox/elpis:/elpis \
--entrypoint zsh coedl/elpis:latest
```

Run this command to start the Elpis interface.
13 changes: 11 additions & 2 deletions docs/wiki/elpis-workshop.md
@@ -49,7 +49,7 @@ After the zip file has downloaded, unzip it to create a folder somewhere handy (

- We will provide a list of servers on the workshop day.
- Get an address from the list.
- If you are using Elpis in Docker on your own computer, the address will be `0.0.0.0:5001`
- If you are using Elpis in Docker on your own computer, the address will be [http://0.0.0.0:5001](http://0.0.0.0:5001) (or, if that doesn't work, try [http://localhost:5001](http://localhost:5001)).
- Open a new web browser (Chrome or Firefox).
- Paste the address into the location bar.
- Press Enter/Return to start Elpis.
@@ -237,7 +237,16 @@ Now our training files have been prepared, we can start a new training session.

## Settings

Here you can adjust settings which affect the tool's performance. A unigram (1) value will train the model on each word. A trigram (3) value will train the model by words with their neighbours.
Here you can adjust settings which affect the tool's performance.

If you are using the Kaldi model, you can set the "n-gram" value. A unigram (1) value will train the model on each word. A trigram (3) value will train the model by words with their neighbours.

For HFT models, if you are trying Elpis out on your own computer, change the default settings to the following. These lower settings will reduce the amount of memory required for training.
* Number of epochs: 1
* Min duration: 1
* Max duration: 10
* Batch size: 1
* For testing purposes, select "Debug using a subset of the data" if you have a lot of data. This will use a small sample of your training data to try it out.

![](assets/latest/80-model-settings.png)

62 changes: 26 additions & 36 deletions docs/wiki/handy-gcp-commands.md
Original file line number Diff line number Diff line change
@@ -1,23 +1,30 @@
# Handy GCP commands

Follow [these instructions](https://cloud.google.com/sdk/docs/install) to install the `gcloud` tool.

## Use screen to run in background

SSH to GCP instance.
## Connect to a Virtual Machine

Start screen.
Use `gcloud` to connect from a local terminal to a Google Cloud Platform Virtual Machine. `gcloud init` will authorise gcloud to use your credentials to access your account. Then we will list the available machines, and make an SSH connection to one. Change `instance-1` in the code below to match the name of the machine you want to connect to.
```
screen
gcloud init
gcloud compute instances list
gcloud compute ssh instance-1
```

Run Docker container.

## Using screen

Using screen prevents long-running training processes from terminating when the network connection between your machine and the VM fails.

Start screen.
```
docker run --gpus all -it -p 80:5000/tcp --entrypoint /bin/zsh coedl/elpis:ben-hft-gpu
screen
```

Do things...
Do things, e.g. run a Docker container...

Then,
Then detach or reattach after a network failure.
* `Ctrl-a` + `Ctrl-d` to detach from the screen
* `screen -ls` to list screens
* `screen -r` to reattach
@@ -29,57 +36,40 @@ SSH to GCP instance.

Get into the docker container.
```
docker exec -it $(docker ps -q) zsh
cd /state/models
docker exec -it elpis zsh
cd /state/of_origin/models
ls
tar -cvf model.tar HASH_DIR_NAME
```

Keep that Docker container running, and in another SSH terminal, copy from the container to the host.
```
docker cp $(docker ps -q):/state/models/model.tar .
docker cp elpis:/state/of_origin/models/model.tar .
```

Copy from the host to the local machine (do this in a local terminal window).
```
gcloud compute scp instance-3:~/model.tar ~/Downloads/model.tar
gcloud compute scp instance-1:~/model.tar ~/Downloads/model.tar
```

Alternatively, share the state directory from the host into the Docker container to save a few steps.
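
As a rough sketch of that idea, the container could be started with the host's state folder mounted at `/state`, in the same way the developer recipe mounts `~/sandbox/state`. The host folder `$HOME/state` is an assumption, not something this page prescribes, and the other flags are copied from the GPU startup script. The command is printed for review rather than executed.

```shell
# Assumed location on the VM for Elpis state; any writable folder should work.
HOST_STATE_DIR="$HOME/state"

# Same flags as the GPU startup script, plus a bind mount of the state folder.
RUN_CMD="docker run -d --rm --name elpis --gpus all -p 80:5001/tcp -p 6006:6006/tcp -v ${HOST_STATE_DIR}:/state coedl/elpis:latest"

# Print the command so it can be checked, then paste it into the VM terminal to run it.
echo "$RUN_CMD"
```

With a mount like this, files the container writes under `/state` should also appear under the chosen host folder, so the `docker cp` step may not be needed.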


## Fixing the SSH Key

## Setting SSH key
When making an SSH connection using gcloud, you may receive a `Remote Host changed` error. This can be fixed by regenerating some files on your computer.

To fix `Remote Host changed` error, delete these files from your local machine.
Use the Google Cloud Console in your browser to check that the VM is running.

Delete these files from your computer.
```shell
~/.ssh/google_compute_engine
~/.ssh/google_compute_engine.pub
~/.ssh/google_compute_known_hosts
```

Then recreate them. Start the VM in the browser interface. Login and generate new SSH keys.
Run these commands on your computer to log in again and generate new SSH keys. Replace the zone, instance and project names to suit your situation.
```shell
gcloud auth login
gcloud compute ssh --zone "us-central1-c" "instance-3" --tunnel-through-iap --project "elpis-workshop"
gcloud compute ssh --zone "us-central1-c" "instance-1" --tunnel-through-iap --project "elpis-workshop"
```



## Viewing Tensorboard on GCP

Connect to the tensorboard that shows train loss (currently in the HFT branch).


Do this once for your GCP account:
* Add a Firewall rule in `GCP > VPC networks > Firewall` for TCP port `6006` using a tagname `firewall`.


Then, for your VM instances:
* Include the `firewall` tagname in the list of VM Network tags.
* Start the VM and connect to it. `gcloud compute ssh instance-3`
* Start Docker and expose the 6006 port. `docker run --gpus all -it -p 80:5001/tcp -p 6006:6006/tcp --entrypoint /bin/zsh coedl/elpis:hft`
* Run Tensorboard with host arg. `tensorboard --logdir=/state/models/MODEL-HASH/runs --port 6006 --host=0.0.0.0`
* Browse to the machine's IP address, on http. `http://34.132.91.225:6006`

If the page isn't connecting, check that you aren't on a VPN which could be blocking the port.
8 changes: 4 additions & 4 deletions docs/wiki/install-elpis-docker.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Installing Elpis with Docker
# Installing Elpis on your computer

Elpis can be installed with Docker, a virtual computer running on **your** computer. To use this version of Elpis, you first need to install Docker.

@@ -22,14 +22,14 @@ For Windows, open the search field in your taskbar, type `command` or `cmd` int
Download and run the Elpis Docker image by pasting this command in a terminal and pressing `Return` (or `Enter`).

```
docker run --rm -p 5001:5001/tcp coedl/elpis:latest
docker run --rm --name elpis -p 5001:5001/tcp -p 6006:6006/tcp coedl/elpis:latest
```

![Docker run command](assets/elpis-workshop-with-docker/command-1-latest.png)

If this is the first time you have run the command, you should see a message "Unable to find image 'coedl/elpis:latest' locally". All this means is that Docker has looked to see if there's a local copy of the Docker image, and couldn't find one. It will then start to download the image in a series of "layers". Each layer will go through a process of Waiting and Pulling (pulling involves Downloading and Extracting). When all layers are complete, Docker will create a container from the image and start Elpis in the container.

When you see a message about the server running, open `http://0.0.0.0:5001` in a browser.
When you see a message about the server running, open [http://0.0.0.0:5001](http://0.0.0.0:5001) in a browser. If you are on a Windows machine, try [http://localhost:5001](http://localhost:5001) instead.

![Docker running](assets/elpis-workshop-with-docker/command-2-latest.png)

@@ -39,4 +39,4 @@ You should see the Elpis interface. It might look a little different to this, de
![Docker welcome screen](assets/elpis-workshop-with-docker/10-welcome-latest.png)


With Elpis going, follow the steps in the [Elpis online workshop](elpis-workshop.html).
With Elpis going, follow the steps in the [Elpis workshop guide](elpis-workshop.md).
115 changes: 60 additions & 55 deletions docs/wiki/install-elpis-on-gcp-gpu.md
Original file line number Diff line number Diff line change
@@ -1,26 +1,63 @@
# Install Elpis on Google Cloud with GPU

If needed, do the "Setup you account" steps on the [Install Elpis on Google Cloud](install-elpis-on-gcp.md) wiki page.
If this is your first time using Elpis on Google Cloud, follow the steps on the [Setup Google Cloud account](setup-google-cloud-account.md) page.

This document will go through a process of enabling the network access required to view training progress with Tensorboard, and detail the steps to start a machine running Elpis.

## Create a Virtual Machine
When you have finished using Elpis on a GCP virtual machine, make sure you stop it to prevent ongoing costs.
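
If the `gcloud` tool is installed on your computer, stopping the machine from a local terminal might look like the sketch below. The instance name `instance-1` is an assumption borrowed from the Handy GCP commands page, and `gcloud` may also ask for a zone. The command is printed for review rather than executed.

```shell
# Assumed VM name; replace it with the name shown in the VM instances list.
INSTANCE="instance-1"

# Stopping the VM stops its compute charges; a stopped VM's disk may still incur storage costs.
STOP_CMD="gcloud compute instances stop ${INSTANCE}"

# Print the command so it can be checked, then paste it into a terminal to run it.
echo "$STOP_CMD"
```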

The type of machine you can create depends on the quotas you have access to.

[GPU quotas](https://console.cloud.google.com/iam-admin/quotas?authuser=2&project=elpis-workshop&folder&organizationId&metric=GPUs%20(all%20regions)&location=GLOBAL)
## Enable network access

[all quotas](https://console.cloud.google.com/iam-admin/quotas?authuser=2&project=elpis-workshop)
Elpis uses Tensorboard to display training progress and plots. To enable us to view the Tensorboard page, we need to add a "firewall rule" in the Cloud console.

For a basic machine, use these settings:
* GPU
* N1 series
* n1-standard-16 (16 vCPUs, 60 GB memory)
* 1 x NVIDIA Tesla T4 (approx $600/month)
Sign in to the console. If you have multiple projects, choose the one you want to work with.

* Standard persistent disk Ubuntu 20.04 approx 300GB
* Allow http traffic
* Add `tensorboard` to the `Networking, Disks, Security, Management, Sole-tenancy` > `Networking` > `Network tags` section
* Add the script below to the `Management` > `Startup scripts` section
In the left hand navigation menu, go to "VPC network > Firewall". Click "Create Firewall Rule" (blue button at the top of the page).

Use the following settings, then click Create. This will create a rule which our machine can use to enable browser traffic to reach the Tensorboard.

* Name: tensorboard
* Direction of traffic: Ingress
* Target tags (make sure this is lowercase, and all one word): tensorboard
* Source IPv4 ranges: 0.0.0.0/0
* Protocols and ports: Specified protocols and ports
* TCP: 6006
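
For those who prefer the command line, the same rule could probably be created with `gcloud` instead of the console. This is a sketch that mirrors the settings listed above. It assumes `gcloud` is already initialised for the right project and that the rule applies to the default network. The command is printed for review rather than executed.

```shell
# The rule name and target tag both use the lowercase word from the settings above.
RULE_NAME="tensorboard"

# Ingress rule allowing TCP port 6006 from any IPv4 address to VMs tagged "tensorboard".
FIREWALL_CMD="gcloud compute firewall-rules create ${RULE_NAME} --direction=INGRESS --allow=tcp:6006 --source-ranges=0.0.0.0/0 --target-tags=${RULE_NAME}"

# Print the command so it can be checked, then paste it into a terminal to run it.
echo "$FIREWALL_CMD"
```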


## Create a Virtual Machine and run Elpis

Go to the `Compute Engine > VM instances` page.

To run Elpis, create an instance with the following settings. These resources will be adequate for a small amount of data, but may need to be increased depending on the quantity of your data. This configuration would cost approximately $600 to run all day, every day, for a month.

* Name: Give your instance a meaningful name, perhaps the name of the language you are training with.
* Region and zone: These can be left as is, or change to a location near you if required. Note that different regions may have different GPU options.
* Machine family: GPU
* GPU-type: NVIDIA T4
* Number of GPUs: 1
* Machine-type: n1-standard-16 (16 vCPUs, 60 GB memory)


Scroll down to the Boot disk section. Change the boot disk to use the following settings.

* Operating system: Ubuntu
* Version: Ubuntu 20.04 LTS z86/64
* Boot disk type: Standard persistent disk
* Size (GB): 300

Scroll down to the "Firewall" settings. Tick `Allow http traffic`.

Click "Advanced options" to open that section.

Click "Networking" to open that section.

Type `tensorboard` in the `Network tags` field. This will allow the virtual machine to use the Tensorboard firewall rule we created earlier.

Scroll down and click on `Management`, and paste the following code into the `Automation Startup script` section. This code will install all the required software, download Elpis to the VM, and start Elpis.

Note that we install Elpis this way rather than using image deploy, because image deploy limits the OS to "container optimised", which prevents use of the `--gpus all` flag for `docker run`. To use the `--gpus all` flag, we need to install a specific version of the NVIDIA drivers on an OS that is not container optimised.


```shell
# GPU startup script v0.6.3
@@ -81,55 +118,23 @@ echo "done"
docker run -d --rm --name elpis --gpus all -p 80:5001/tcp -p 6006:6006/tcp coedl/elpis:latest
```


This startup script will only run the first time the VM starts, to reduce the instance load time on subsequent restarts.

Then, scroll to the bottom of the page and click "Create". The page will redirect to the virtual machine list, and show the status of the machine starting up.

Don't use image deploy because this limits OS to container optimised, which prevents use of `--gpus all` docker run flag. To use `--gpus all` flag, we need to install specific version of nvidia drivers, not container optimised.

After the machine starts, it can take up to 15 minutes for everything in the startup script to be installed. Wait 15 minutes or so, and then copy the External IP address.
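
Rather than only waiting, it may help to watch the setup progress. The sketch below assumes the `gcloud` tool is installed locally and the VM is named `instance-1`; both are assumptions to adjust. It builds two commands and prints them for review: one to read the VM's serial console output, where startup script messages normally appear, and one to run on the VM itself to see whether the `elpis` container has started.

```shell
# Assumed VM name; replace it with the name you gave your instance.
INSTANCE="instance-1"

# Run locally: show the serial console output, which includes startup script messages.
LOG_CMD="gcloud compute instances get-serial-port-output ${INSTANCE}"

# Run on the VM (after connecting over SSH): list the container named "elpis" if it is running.
# Depending on how Docker was installed, this may need sudo.
CHECK_CMD="docker ps --filter name=elpis"

# Print both commands so they can be checked before running them.
echo "$LOG_CMD"
echo "$CHECK_CMD"
```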

Open a browser. In the browser's location field, type `http://` and paste the IP address. It should end up looking like `http://34.125.96.234`. Then press `enter/return` to go to your Elpis machine.

With Elpis going, follow the steps in the [Elpis workshop guide](elpis-workshop.md).

## Connect to the machine

View the VM logs to monitor the setup process. After the startup script has completed, SSH to the machine to create a Docker container and start Elpis.
Replace instance-1 with the name of your VM.
## Adding projects (optional)

```
gcloud init
gcloud compute instances list
gcloud compute ssh instance-1
```


## Start Elpis

Run this command in the SSH connection to create a Docker container from the latest image, and start Elpis.

```
docker run --gpus all --name elpis --rm -it -p 80:5001/tcp coedl/elpis:latest
```

---
Later, you may wish to add a new project to separate the usage of services across different experiments or activities.

## Other handy scripts
Click the project list in the top blue menu. In the popup, click "New Project".

Refer to the [Handy GCP commands](handy-gcp-commands.md) page for some handy scripts.
On the New Project screen, add a project name and press "Create".


## Optionally, download and share data into the container

This may be helpful if you write a python file to run Elpis in the container and avoid the GUI.

Use the tool on [this page](https://angelov.ai/post/2020/wget-files-from-gdrive/) to create a wget command.

```
cd /
sudo mkdir na-elpis && cd na-elpis
sudo wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1tywUAtOUnAeITxC-YL61I5iTADIipeYS' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1tywUAtOUnAeITxC-YL61I5iTADIipeYS" -O data.zip && rm -rf /tmp/cookies.txt
sudo unzip data.zip
docker run --gpus all --name elpis -v /na-elpis:/na-elpis --rm -it -p 80:5001/tcp coedl/elpis:latest
```
When the project has been created, you will be prompted to select it. Having done that, the page will show the project's Dashboard.
