# Run Label Studio on Colab

This notebook sets up the Label Studio tool on Google Colab. Many of the materials are inspired by the Argilla tutorial.

## Caveats

Code in Google Colaboratory (Colab) is executed in a virtual machine allocated to your Google account on a temporary basis, with a limited lifetime (e.g. up to 12 hours for the free version). Whenever the maximum lifetime is reached or you close your notebook, any data generated by your code is deleted and the virtual machine is recycled and re-allocated.

For Label Studio, this means that its data folder (where its database, your data, project settings and annotations are stored) is deleted when your time is up or you close this notebook. The next time you launch Label Studio, you start from a fresh instance with an empty database and default settings, and therefore:
* You will need to create a new user account for logging into Label Studio.
* Your previous annotations and any unannotated data will need to be imported back into Label Studio, and your projects and labels re-created and re-configured.

### One-time use

For one-time use, make sure to save your annotations using Label Studio's export function, ideally in both CSV and JSON formats (note: relations, meta text (notes), review comments and other metadata are only exported in the JSON format).
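
The export can also be scripted instead of clicked through the UI. The following is a minimal sketch, assuming the installed Label Studio version exposes the `/api/projects/{id}/export` endpoint and accepts a legacy access token in an `Authorization: Token ...` header; both should be checked against the API documentation of your version. The function names, the base URL, the project ID and the token below are hypothetical placeholders.

```python
import urllib.parse
import urllib.request


def build_export_request(base_url, project_id, token, export_type="JSON"):
    """Build (but do not send) a request for a project export.

    Assumption: the export endpoint and the "Token" auth scheme match
    the Label Studio version you are running.
    """
    query = urllib.parse.urlencode({"exportType": export_type})
    url = f"{base_url.rstrip('/')}/api/projects/{project_id}/export?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Token {token}"})


def save_export(request, destination):
    """Send the request and write the response body to a file."""
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        out.write(response.read())


# Hypothetical usage, with Label Studio running locally on port 8081:
# request = build_export_request("http://localhost:8081", 1, "<your access token>")
# save_export(request, "/content/drive/MyDrive/label-studio-export.json")
```

Saving the file to a mounted Google Drive folder, as in the commented usage, keeps the export after the Colab virtual machine is recycled.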


### Continuous use

For continuous use, there are a couple of workarounds to make Label Studio's database and settings persist across Colab sessions and runtimes:

**Workaround 1**

When launching Label Studio, specify a folder in your Google Drive to use as the data folder. This allows the data folder to persist across Colab sessions and runtimes.
1. Create a folder named 'label-studio' within your Google Drive.
1. Mount your Google Drive (see below).
1. On the left side of this notebook, open the Files pane, expand the file tree and navigate to the destination folder in your Google Drive. Right-click the folder and select `Copy path`.
1. When launching Label Studio, use the `--data-dir` option e.g.
   ```
   label-studio --data-dir <paste your path here>
   ```
Note: If your path contains any spaces, wrap it in quotes e.g. `"/content/drive/MyDrive/folk songs/annotation/label-studio"`.
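
If you prefer not to quote the path by hand, the command can be assembled in a notebook cell and then pasted into the terminal. This is only a sketch: `label_studio_command` is a hypothetical helper, and it relies on Python's standard `shlex.quote`, which adds shell quoting only when the path needs it.

```python
import shlex


def label_studio_command(data_dir):
    """Return the launch command with the data folder safely quoted for the shell."""
    return f"label-studio --data-dir {shlex.quote(data_dir)}"


# A path with spaces is wrapped in single quotes; a plain path is left as-is.
print(label_studio_command("/content/drive/MyDrive/folk songs/annotation/label-studio"))
print(label_studio_command("/content/drive/MyDrive/label-studio"))
```

Single quotes work the same way as the double quotes shown in the note: the shell treats the whole path as one argument.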

**Workaround 2**

Before you close this notebook, take a copy of Label Studio's data folder, located at `/root/.local/share/label-studio`, and save it to your Google Drive or download it. Then, before you spin up Label Studio again, copy the data folder back to `/root/.local/share/label-studio`.
1. Create a folder named 'label-studio' within your Google Drive.
1. Mount your Google Drive (see below).
1. On the left side of this notebook, open the Files pane, expand the file tree and navigate to the destination folder in your Google Drive. Right-click the folder and select `Copy path`.
1. In a new cell, create a new terminal:
   ```
   %load_ext colabxterm
   %xterm
   ```
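1. To copy the contents of Label Studio's data folder to your Google Drive (the trailing `/.` copies the folder's contents instead of nesting a second `label-studio` folder inside the destination):
   ```
   cp -r /root/.local/share/label-studio/. <paste your path here>
   ```
1. To copy the contents back to Colab, creating the data folder first if it does not exist yet:
   ```
   mkdir -p /root/.local/share/label-studio
   cp -r <paste your path here>/. /root/.local/share/label-studio
   ```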
Note: If your path contains any spaces, wrap it in quotes e.g. `"/content/drive/MyDrive/folk songs/annotation/label-studio"`.
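
The same backup and restore can be done from a notebook cell rather than the terminal. The sketch below uses `shutil.copytree` with `dirs_exist_ok=True` (Python 3.8 or later), which merges the source folder's contents into the destination and creates the destination if needed. `sync_folder` and `drive_backup` are hypothetical names, not part of Label Studio.

```python
import shutil
from pathlib import Path

# Default data folder, as stated in the workaround above.
LOCAL_DATA_DIR = Path("/root/.local/share/label-studio")


def sync_folder(source, destination):
    """Copy the contents of `source` into `destination`, creating it if needed.

    Existing files with the same name are overwritten; files that only
    exist in `destination` are left untouched.
    """
    source, destination = Path(source), Path(destination)
    if not source.is_dir():
        raise FileNotFoundError(f"Nothing to copy: {source} does not exist")
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return destination


# Hypothetical usage with a folder in a mounted Google Drive:
# drive_backup = "/content/drive/MyDrive/label-studio"
# sync_folder(LOCAL_DATA_DIR, drive_backup)   # before closing the notebook
# sync_folder(drive_backup, LOCAL_DATA_DIR)   # before launching Label Studio again
```

Because Python handles the paths directly, no shell quoting is needed here even if the path contains spaces. It is probably safest to stop Label Studio before taking the backup so the database is not copied mid-write.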

## Mount Google Drive

This optional step sets up persistent storage for Label Studio's database and settings, so that they persist across sessions.
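
Once the mount cell in this section has been run, a quick check can confirm that the Drive is actually available before Label Studio is pointed at it. The following is a minimal sketch: `resolve_data_dir` is a hypothetical helper, and the default `/content/drive/MyDrive` root is an assumption based on the example paths used in this notebook.

```python
from pathlib import Path


def resolve_data_dir(drive_root="/content/drive/MyDrive", folder_name="label-studio"):
    """Return the data folder inside the mounted Drive, creating it if missing.

    Returns None when the Drive root does not exist, i.e. the Drive is
    probably not mounted.
    """
    root = Path(drive_root)
    if not root.is_dir():
        return None
    data_dir = root / folder_name
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


data_dir = resolve_data_dir()
if data_dir is None:
    print("Google Drive does not appear to be mounted; Label Studio will use its default data folder.")
else:
    print(f"Data folder ready: {data_dir}")
```

The printed path can then be passed to the `--data-dir` option when launching Label Studio.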

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## Install libraries

In [None]:
!pip install pyngrok~=5.2.1 colab-xterm~=0.1.2 
!python -m pip install label-studio

In [None]:
# create a terminal to run label-studio, in case you don't have Colab Pro.
# type "label-studio" into the terminal that appears below this code cell.
#
# Note: to specify a different location for Label Studio to store its database 
# and settings, append "--data-dir <path>" to the command e.g.
# label-studio --data-dir "/content/drive/MyDrive/folk songs/annotation/label-studio"
%load_ext colabxterm
%xterm

In [None]:
import getpass
from pyngrok import ngrok, conf

print("Enter your authtoken, which can be copied from https://dashboard.ngrok.com/auth")
print("You need to create a free ngrok account to get an authtoken. The token looks something like this: ASDO1283YZaDu95vysXYIUXZXYRR_54YfASDIb8cpNfVoz349587")
conf.get_default().auth_token = getpass.getpass()
# if the above does not work, you can try:
# ngrok.set_auth_token("<INSERT_YOUR_NGROK_AUTHTOKEN>")

In [None]:
# disconnect all existing tunnels to avoid issues when rerunning cells
for tunnel in ngrok.get_tunnels():
    ngrok.disconnect(tunnel.public_url)

# create the public link
# ! check in the terminal above which localhost port Label Studio is actually running on
ngrok_tunnel = ngrok.connect(8081)  # insert the port number Label Studio is running on, e.g. 8081 if the terminal displays a URL such as "http://0.0.0.0:8081"
print("You can now access Label Studio with the public link below. (It should look something like 'http://X03b-34-XXX-237-25.ngrok.io')\n")
print(f"Your ngrok public link: {ngrok_tunnel.public_url}\n")
print("After clicking on the link, there will be a warning, which you can ignore")
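
If the public link shows an error page, the tunnel may be pointing at the wrong port. The sketch below is one way to check from a notebook cell whether anything is listening on a given local port; `port_is_open` is a hypothetical helper built on the standard `socket` module, and it cannot tell which program owns the port, so the terminal output remains the authoritative source.

```python
import socket


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


# Check the port used for the tunnel in the cell above.
print("Port 8081 is accepting connections:", port_is_open(8081))
```

A `False` result usually means Label Studio has not finished starting or is running on a different port than the one passed to `ngrok.connect`.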