This Jupyter notebook needs to run on an x86_64 CPU. We recommend running it on a Linux machine. It works both with and without an NVIDIA GPU.

In [1]:
# Install the Lightly Worker and do a quick sanity check
# If these commands fail, follow our Docker installation guide at https://docs.lightly.ai/docs/install-lightly#docker
!docker pull lightly/worker:latest
!docker run --shm-size="1024m" --rm -it lightly/worker:latest sanity_check=True

# Install the Lightly Python Client
!pip3 install lightly

latest: Pulling from lightly/worker
Status: Image is up to date for lightly/worker:latest
[2024-03-26 08:33:56] Lightly Worker Solution v2.11.1
[2024-03-26 08:33:56] Congratulations! It looks like the Lightly container is running!


In [2]:
# Set the path to the dataset.
# Here we download and use a dataset containing 354 images showing clothes.
# If you want to use your own dataset, just set the dataset_path to it.
dataset_path = "./dataset_clothing_images"
!git clone https://github.com/lightly-ai/dataset_clothing_images.git {dataset_path}
!tree --filelimit=12 {str(dataset_path)}

Cloning into './dataset_clothing_images'...


Resolving deltas: 100% (7/7), done.
./dataset_clothing_images
├── LICENSE
├── README.md
├── dress [32 entries exceeds filelimit, not opening dir]
├── hat [14 entries exceeds filelimit, not opening dir]
├── longsleeve [49 entries exceeds filelimit, not opening dir]
├── outwear [24 entries exceeds filelimit, not opening dir]
├── pants [49 entries exceeds filelimit, not opening dir]
├── shirt [29 entries exceeds filelimit, not opening dir]
├── shoes [26 entries exceeds filelimit, not opening dir]
├── shorts [25 entries exceeds filelimit, not opening dir]
├── skirt
│   ├── 2459a9ab-b654-4aef-be7c-89340da78cec.jpg
│   ├── ...
│   └── ecdc68a9-12c7-449b-ab3a-4b2a8e71b197.jpg
└── t-shirt [81 entries exceeds filelimit, not opening dir]

10 directories, 14 files
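
If `tree` is not installed, the same overview can be produced in Python. The following is a minimal sketch, not part of the Lightly tooling: the helper name `count_images_per_class` and the set of image extensions are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path

# Assumption: which file extensions count as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Map each top-level subdirectory of `root` to the number of images in it."""
    counts = Counter()
    root_path = Path(root)
    if not root_path.is_dir():
        return counts  # nothing to count if the dataset folder is missing
    for class_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        if class_dir.name.startswith("."):
            continue  # skip hidden folders such as .git
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


counts = count_images_per_class("./dataset_clothing_images")
for name, n in counts.items():
    print(f"{name}: {n}")
print("total:", sum(counts.values()))
```

To use your own dataset, pass its path instead of `./dataset_clothing_images`.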


In [3]:
# Schedule a run
from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client import DatasetType, DatasourcePurpose

lightly_token = "CHANGE_ME" # Copy the API_TOKEN from https://app.lightly.ai/preferences

# Create the Lightly client to connect to the API.
client = ApiWorkflowClient(token=lightly_token)

# Register the Lightly Worker.
worker_id = client.register_compute_worker(name="clothing-worker")

# Create the dataset on the Lightly Platform.
client.create_dataset(
    dataset_name="clothing-small",
    dataset_type=DatasetType.IMAGES,
)

# Configure the datasource.
client.set_local_config(
    purpose=DatasourcePurpose.INPUT,
)
client.set_local_config(
    purpose=DatasourcePurpose.LIGHTLY,
)

# Schedule a run on the dataset to select 50 samples.
scheduled_run_id = client.schedule_compute_worker_run(
    worker_config={"shutdown_when_job_finished": True},
    selection_config={
        "n_samples": 50,
        "strategies": [
            {"input": {"type": "EMBEDDINGS"}, "strategy": {"type": "DIVERSITY"}}
        ],
    },
)
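
The selection config above asks for 50 samples chosen for diversity in embedding space. To give an intuition for what that means, here is a minimal sketch of greedy farthest-point selection, one common way to pick mutually distant points. It is an illustration only: this notebook does not show how the Lightly Worker implements its DIVERSITY strategy, and the function name `select_diverse` and the toy data are hypothetical.

```python
import math


def select_diverse(embeddings, n_samples):
    """Greedy farthest-point selection: return indices of mutually distant points."""
    if not embeddings or n_samples <= 0:
        return []
    n_samples = min(n_samples, len(embeddings))
    selected = [0]  # assumption: start from the first sample
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(e, embeddings[0]) for e in embeddings]
    min_dist[0] = -1.0  # mark as selected so it is never picked again
    while len(selected) < n_samples:
        # Pick the point that is farthest from everything selected so far.
        next_idx = max(range(len(embeddings)), key=lambda i: min_dist[i])
        selected.append(next_idx)
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], math.dist(e, embeddings[next_idx]))
        min_dist[next_idx] = -1.0
    return selected


# Toy 2-D "embeddings": two tight clusters plus one outlier.
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(select_diverse(toy, 3))
```

On the toy data, the three selected indices come from three different regions rather than from one cluster, which is the behavior a diversity-based selection aims for.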

In [4]:
# Run the Lightly Worker to process the scheduled run.
# If you have an NVIDIA GPU, add the flag to use it: --gpus all
!docker run --shm-size="1024m" --rm -it \
    -v {dataset_path}:/input_mount:ro \
    -v "./lightly_runs":/lightly_mount \
    -e LIGHTLY_TOKEN={lightly_token} \
    -e LIGHTLY_WORKER_ID={worker_id} \
    lightly/worker:latest

[2024-03-26 08:34:15] Lightly Worker Solution v2.11.1
[2024-03-26 08:34:16] You are using docker build: Tue Mar 12 07:56:29 UTC 2024.
[2024-03-26 08:34:16] Starting worker with id '65806b455ca68c93b29ad6b3'...
[2024-03-26 08:34:16] Worker 2.11.1 can only process jobs scheduled with Lightly Python client 1.5 or higher.
[2024-03-26 08:34:16] Worker with labels '[]' started. Waiting for jobs...
[2024-03-26 08:34:16] Found 1 open jobs.
[2024-03-26 08:34:16] Started job with job_id '6602887ff5e7d0833c925dee'.
...
...
[2024-03-26 08:35:45] Done!
[2024-03-26 08:35:48] Finished compute worker run successfully.
[2024-03-26 08:35:48] Shutting down...


Congratulations! You successfully ran the Lightly solution.
Now you can view and explore the dataset interactively on the [Lightly Platform](https://app.lightly.ai).
To see not only the metadata and distribution but also the images themselves, you need to serve them from your local disk to your local browser by using the `lightly-serve` CLI command:

In [5]:
!lightly-serve input_mount={dataset_path} lightly_mount="./lightly_runs"

Starting server, listening at 'localhost:3456'
Serving files in './dataset_clothing_images' and './lightly_runs'


If your browser runs on a different machine than your notebook, you also need to forward a port; see our [docs](https://docs.lightly.ai/docs/local-storage#view-local-data-in-remote-machine-in-lightly-platform).