
feat: support roberta #62

Merged · 2 commits merged into huggingface:main on Nov 6, 2023

Conversation

@kozistr (Contributor) commented Nov 1, 2023

What does this PR do?

  • (11/5/23) I tested this on a CPU environment with my RoBERTa model.

Tests

On CPU

  • OS : Windows 10
  • CPU : i7-7700K
  • rustc : v1.73.0 stable
  • build : cargo install --path router -F candle -F mkl

hf repo files

roberta-large model & tokenizer.


run server

  • pooling : cls
  • type : float32 (default)

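The exact launch command isn't reproduced here; as a minimal sketch only (assuming the router binary built above is on PATH, and using roberta-large as a stand-in for the actual checkpoint), an equivalent launch from Python could look like this:

# Sketch: start the router with CLS pooling and the default float32 dtype.
# "roberta-large" is a stand-in repo id, not necessarily the one used in the test.
import subprocess

subprocess.run(
    [
        'text-embeddings-router',
        '--model-id', 'roberta-large',  # stand-in for the actual RoBERTa checkpoint
        '--pooling', 'cls',             # CLS pooling, as used in this test
        '--port', '8080',
        # no dtype flag -> float32 default
    ],
    check=True,
)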

output differences

code

import json

import numpy as np
import requests
import torch

# load the pytorch model and tokenizer on CPU in eval mode.
model = ...
tokenizer = ...

with torch.inference_mode():
    token = tokenizer(
        ['asdf'],
        truncation=True,
        return_tensors='pt',
    )

    # the model is assumed to return the pooled (CLS) embedding directly
    pytorch_embedding = model(**token)[0].numpy()  # shape: (1024,)

tei_embedding = np.asarray(
    requests.post(
        'http://localhost:8080/embed', 
        data=json.dumps({'inputs': 'asdf'}), 
        headers={'Content-type': 'application/json'},
    ).json(),
    dtype=np.float32,
)[0, :]  # shape: (1024,)

np.testing.assert_allclose(
    pytorch_embedding,
    tei_embedding,
    rtol=1e-4,
    atol=1e-4,
)

Maybe it's due to Intel MKL or f32/f64 type-casting differences? (A quick sanity check is sketched after the assertion output below.)

AssertionError: 
Not equal to tolerance rtol=0.0001, atol=0.0001

Mismatched elements: 69 / 1024 (6.74%)
Max absolute difference: 0.00026077
Max relative difference: 1.4665195
 x: array([-0.008894, -0.028655, -0.025436, ...,  0.019459,  0.003749,
        0.022704], dtype=float32)
 y: array([-0.008802, -0.028713, -0.025372, ...,  0.019452,  0.003691,
        0.02279 ], dtype=float32)
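For reference (not part of the original test), a quick check of how large that gap really is, reusing pytorch_embedding and tei_embedding from the snippet above:

# Sanity check (assumes pytorch_embedding / tei_embedding are defined as above):
# a cosine similarity of ~1.0 and a max abs diff in the 1e-4 range would be
# consistent with float32 accumulation differences (e.g. different MKL kernels).
import numpy as np

a = np.ravel(pytorch_embedding).astype(np.float64)
b = np.ravel(tei_embedding).astype(np.float64)

cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print('cosine similarity:', cosine)
print('max abs diff     :', float(np.abs(a - b).max()))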

On GPU

I don't have any GPUs :(

Errors

noah-go/sentence-search-roberta fails with the error below; some configuration files or weights may be missing from the repository (one way to check is sketched after the log).

PS C:\Users\zero\Desktop\text-embeddings-inference> text-embeddings-router --model-id noah-go/sentence-search-roberta --port 8080
2023-11-05T02:26:21.654604Z  INFO text_embeddings_router: router\src/main.rs:152: Args { model_id: "noa*-**/********-******-****rta", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, hf_api_token: None, hostname: "0.0.0.0", port: 8080, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, json_output: false, otlp_endpoint: None, cors_allow_origin: None }
2023-11-05T02:26:21.657287Z  INFO download_artifacts: text_embeddings_core::download: core\src\download.rs:9: Starting download
2023-11-05T02:26:21.927572Z  WARN download_artifacts: text_embeddings_core::download: core\src\download.rs:15: `model.safetensors` not found. Using `pytorch_model.bin` instead. Model loading will be significantly slower.
2023-11-05T02:26:21.927994Z  INFO download_artifacts: text_embeddings_core::download: core\src\download.rs:25: Model artifacts downloaded in 270.7077ms
2023-11-05T02:26:21.938743Z  INFO text_embeddings_core::tokenization: core\src\tokenization.rs:22: Starting 4 tokenization workers
2023-11-05T02:26:21.958672Z  INFO text_embeddings_router: router\src/main.rs:253: Starting model backend
2023-11-05T02:26:21.961545Z  INFO text_embeddings_backend_candle: backends\candle\src\lib.rs:79: Starting Bert model on CPU
Error: Could not create backend

Caused by:
    Could not start backend: specified file not found in archive
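To confirm what the failing repo actually ships, a small sketch (not run here) using the huggingface_hub API:

# Lists the files in the failing repo so missing artifacts (config.json,
# tokenizer files, model weights) can be spotted directly.
from huggingface_hub import HfApi

files = HfApi().list_repo_files('noah-go/sentence-search-roberta')
print('\n'.join(sorted(files)))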

Who can review?

@OlivierDehaene OR @Narsil

@kozistr kozistr marked this pull request as draft November 5, 2023 01:48
@kozistr kozistr marked this pull request as ready for review November 5, 2023 02:29
@kozistr (Contributor, Author) commented Nov 5, 2023

@OlivierDehaene could you please review this PR when you're available? Thank you!

Anyway, thanks for maintaining an awesome project :)

@OlivierDehaene (Member) left a comment


Thanks

@OlivierDehaene OlivierDehaene merged commit 4048136 into huggingface:main Nov 6, 2023
@kozistr kozistr deleted the feat/roberta branch November 6, 2023 19:21