diff --git a/docs/source/en/quick_tour.md b/docs/source/en/quick_tour.md
index 995031d1..c0fe008c 100644
--- a/docs/source/en/quick_tour.md
+++ b/docs/source/en/quick_tour.md
@@ -39,12 +39,12 @@ docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingf

-Here we pass a `revision=refs/pr/5`, because the `safetensors` variant of this model is currently in a pull request.
+Here we pass a `revision=refs/pr/5` because the `safetensors` variant of this model is currently in a pull request.
 We also recommend sharing a volume with the Docker container (`volume=$PWD/data`) to avoid downloading weights every run.

-Once you have deployed a model you can use the `embed` endpoint by sending requests:
+Once you have deployed a model, you can use the `embed` endpoint by sending requests:

 ```bash
 curl 127.0.0.1:8080/embed \
@@ -72,7 +72,7 @@ volume=$PWD/data
 docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id $model --revision $revision
 ```

-Once you have deployed a model you can use the `rerank` endpoint to rank the similarity between a query and a list
+Once you have deployed a model, you can use the `rerank` endpoint to rank the similarity between a query and a list
 of texts:

 ```bash
@@ -101,3 +101,23 @@ curl 127.0.0.1:8080/predict \
     -d '{"inputs":"I like you."}' \
     -H 'Content-Type: application/json'
 ```
+
+## Batching
+
+You can send multiple inputs in a batch. For example, for embeddings:
+
+```bash
+curl 127.0.0.1:8080/embed \
+    -X POST \
+    -d '{"inputs":["Today is a nice day", "I like you"]}' \
+    -H 'Content-Type: application/json'
+```
+
+And for Sequence Classification:
+
+```bash
+curl 127.0.0.1:8080/predict \
+    -X POST \
+    -d '{"inputs":[["I like you."], ["I hate pineapples"]]}' \
+    -H 'Content-Type: application/json'
+```