
feat: support roberta #62

Merged · 2 commits merged into huggingface:main on Nov 6, 2023

Conversation

@kozistr (Contributor) commented Nov 1, 2023

What does this PR do?

  • (11/5/23) I tested this on a CPU environment with my RoBERTa model.

Tests

On CPU

  • OS : Windows 10
  • CPU : i7-7700K
  • rustc : v1.73.0 stable
  • build : cargo install --path router -F candle -F mkl

hf repo files

roberta-large model & tokenizer.


run server

  • pooling : cls
  • type : float32 (default)

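The exact launch command isn't reproduced here; as a minimal sketch only (assuming the router binary built above is on PATH, and using roberta-large as a stand-in for the actual checkpoint), an equivalent launch from Python could look like this:

# Sketch: start the router with CLS pooling and the default float32 dtype.
# "roberta-large" is a stand-in repo id, not necessarily the one used in the test.
import subprocess

subprocess.run(
    [
        'text-embeddings-router',
        '--model-id', 'roberta-large',  # stand-in for the actual RoBERTa checkpoint
        '--pooling', 'cls',             # CLS pooling, as used in this test
        '--port', '8080',
        # no dtype flag -> float32 default
    ],
    check=True,
)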

output differences

code

import json

import numpy as np
import requests
import torch

# load the pytorch model and tokenizer on CPU in eval mode.
model = ...
tokenizer = ...

with torch.inference_mode():
    token = tokenizer(
        ['asdf'],
        truncation=True,
        return_tensors='pt',
    )

    # the model is assumed to return the pooled (CLS) embedding directly
    pytorch_embedding = model(**token)[0].numpy()  # shape: (1024,)

tei_embedding = np.asarray(
    requests.post(
        'http://localhost:8080/embed', 
        data=json.dumps({'inputs': 'asdf'}), 
        headers={'Content-type': 'application/json'},
    ).json(),
    dtype=np.float32,
)[0, :]  # shape: (1024,)

np.testing.assert_allclose(
    pytorch_embedding,
    tei_embedding,
    rtol=1e-4,
    atol=1e-4,
)

Maybe it's due to Intel MKL or f32/f64 type-casting differences? (A quick sanity check is sketched after the assertion output below.)

AssertionError: 
Not equal to tolerance rtol=0.0001, atol=0.0001

Mismatched elements: 69 / 1024 (6.74%)
Max absolute difference: 0.00026077
Max relative difference: 1.4665195
 x: array([-0.008894, -0.028655, -0.025436, ...,  0.019459,  0.003749,
        0.022704], dtype=float32)
 y: array([-0.008802, -0.028713, -0.025372, ...,  0.019452,  0.003691,
        0.02279 ], dtype=float32)
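For reference (not part of the original test), a quick check of how large that gap really is, reusing pytorch_embedding and tei_embedding from the snippet above:

# Sanity check (assumes pytorch_embedding / tei_embedding are defined as above):
# a cosine similarity of ~1.0 and a max abs diff in the 1e-4 range would be
# consistent with float32 accumulation differences (e.g. different MKL kernels).
import numpy as np

a = np.ravel(pytorch_embedding).astype(np.float64)
b = np.ravel(tei_embedding).astype(np.float64)

cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print('cosine similarity:', cosine)
print('max abs diff     :', float(np.abs(a - b).max()))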

On GPU

I don't have any GPUs :(

Errors

noah-go/sentence-search-roberta fails with the error below; some configuration files or weights may be missing from the repository (one way to check is sketched after the log).

PS C:\Users\zero\Desktop\text-embeddings-inference> text-embeddings-router --model-id noah-go/sentence-search-roberta --port 8080
2023-11-05T02:26:21.654604Z  INFO text_embeddings_router: router\src/main.rs:152: Args { model_id: "noa*-**/********-******-****rta", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, hf_api_token: None, hostname: "0.0.0.0", port: 8080, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, json_output: false, otlp_endpoint: None, cors_allow_origin: None }
2023-11-05T02:26:21.657287Z  INFO download_artifacts: text_embeddings_core::download: core\src\download.rs:9: Starting download
2023-11-05T02:26:21.927572Z  WARN download_artifacts: text_embeddings_core::download: core\src\download.rs:15: `model.safetensors` not found. Using `pytorch_model.bin` instead. Model loading will be significantly slower.
2023-11-05T02:26:21.927994Z  INFO download_artifacts: text_embeddings_core::download: core\src\download.rs:25: Model artifacts downloaded in 270.7077ms
2023-11-05T02:26:21.938743Z  INFO text_embeddings_core::tokenization: core\src\tokenization.rs:22: Starting 4 tokenization workers
2023-11-05T02:26:21.958672Z  INFO text_embeddings_router: router\src/main.rs:253: Starting model backend
2023-11-05T02:26:21.961545Z  INFO text_embeddings_backend_candle: backends\candle\src\lib.rs:79: Starting Bert model on CPU
Error: Could not create backend

Caused by:
    Could not start backend: specified file not found in archive
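To confirm what the failing repo actually ships, a small sketch (not run here) using the huggingface_hub API:

# Lists the files in the failing repo so missing artifacts (config.json,
# tokenizer files, model weights) can be spotted directly.
from huggingface_hub import HfApi

files = HfApi().list_repo_files('noah-go/sentence-search-roberta')
print('\n'.join(sorted(files)))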

Who can review?

@OlivierDehaene OR @Narsil

@kozistr kozistr marked this pull request as draft November 5, 2023 01:48
@kozistr kozistr marked this pull request as ready for review November 5, 2023 02:29
@kozistr (Contributor, Author) commented Nov 5, 2023

@OlivierDehaene could you please review this PR when you're available? Thank you!

Anyway, thanks for maintaining an awesome project :)

@OlivierDehaene (Member) left a comment


Thanks

@OlivierDehaene OlivierDehaene merged commit 4048136 into huggingface:main Nov 6, 2023
@kozistr kozistr deleted the feat/roberta branch November 6, 2023 19:21