update readme for llm-transformers example
sven-h committed Oct 4, 2023
1 parent 8ece4ee commit f203a48
Showing 1 changed file with 55 additions and 0 deletions: examples/llm-transformers/README.md
@@ -61,3 +61,58 @@ To get the path, activate the environment and execute `which python` (linux) or
The path to the transformers cache (where all the models are stored) can be changed with the `transformerscache` option.
Leave it out completely to use the default (usually in the home folder).
The size of a `70B` variant (a single model) is usually around 130 GB.


## OAEI

### Build the docker image
To create the OAEI matcher, uncomment the corresponding build module in the `pom.xml` file.
Then run `mvn install`.
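
A minimal sketch of the build sequence, assuming the commands are run from the `examples/llm-transformers` directory (the directory of the README changed in this commit) and that the build produces the `llm-transformers-1.0-web` image used in the run commands below:

```
# after uncommenting the build module in pom.xml:
mvn install

# verify that the image was created
docker images | grep llm-transformers-1.0-web
```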

If you want to use `podman` instead of `docker` (only during the build!) you need to run
`podman system service --time=0 unix:/run/user/$(id -u)/podman/podman.sock`
in a separate shell and then, before executing Maven, run `export DOCKER_HOST="unix:/run/user/$(id -u)/podman/podman.sock"`.
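
Putting the two podman steps together (commands exactly as above):

```
# shell 1: keep the podman API socket running
podman system service --time=0 unix:/run/user/$(id -u)/podman/podman.sock

# shell 2: point the docker client used by Maven at the podman socket, then build
export DOCKER_HOST="unix:/run/user/$(id -u)/podman/podman.sock"
mvn install
```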

### Execute the docker image directly (without MELT)
Running this matcher in a reasonable time requires GPUs (in our experiments we used 2 x A100 with 40 GB of memory each).

To execute the image, run:

```
docker run --rm --publish 8080:8080 llm-transformers-1.0-web
```
(or replace `docker` with `podman`).

This will download the large language model (~250 GB) into the container, which means that if you run the container again, the model will be downloaded again.
To map the cache folder to a folder on your hard drive (replace `{localPath}` with that folder), you need to execute:

```
docker run --rm --publish 8080:8080 -v {localPath}:/root/.cache/ llm-transformers-1.0-web
```

With this setup, the large language model is downloaded only once, even if the container is started multiple times.

If a restriction to specific GPUs is required, set the environment variable `CUDA_VISIBLE_DEVICES`, like so:

```
docker run --rm --publish 8080:8080 -e CUDA_VISIBLE_DEVICES=0 llm-transformers-1.0-web
```

Both options can also be combined.
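
For example, mounting the cache folder and restricting the container to GPU 0 at the same time:

```
docker run --rm --publish 8080:8080 \
  -v {localPath}:/root/.cache/ \
  -e CUDA_VISIBLE_DEVICES=0 \
  llm-transformers-1.0-web
```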

Once the container is started, it listens on port 8080 and provides a
REST API defined by the [MELT Web format](https://dwslab.github.io/melt/matcher-packaging/web).
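
As a quick smoke test, you can call the endpoint directly. This is only a sketch: it assumes the multipart `source`/`target` parameters of the MELT Web format and uses hypothetical ontology URLs, so check the linked documentation for the exact fields:

```
# hypothetical ontology URLs; the matcher fetches them itself
curl -X POST http://localhost:8080/match \
  -F "source=http://example.org/source-ontology.rdf" \
  -F "target=http://example.org/target-ontology.rdf"
```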

### Execute the docker image with MELT

[Download the MELT evaluation client](https://dwslab.github.io/melt/matcher-evaluation/client) and run:

```
java -jar matching-eval-client-latest.jar --systems <path to the tar.gz file> --track <location-URI> <collection-name> <version>
```

If the docker container is already running, you need to provide the localhost URL:

```
java -jar matching-eval-client-latest.jar --systems http://localhost:8080/match --track <location-URI> <collection-name> <version>
```
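
For example, to evaluate the already-running container on the OAEI Anatomy track (this track triple is the one commonly used with MELT's `TrackRepository`; verify the values against the current MELT documentation):

```
java -jar matching-eval-client-latest.jar \
  --systems http://localhost:8080/match \
  --track http://repositories.seals-project.eu/tdrs/ anatomy_track anatomy_track-default
```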
