Cross Encoder #497

@achrafash

Question

I'm trying to run a pre-trained Cross Encoder model (MS Marco TinyBERT) that is not available in Transformers.js.

I've managed to convert it using the handy script, and I'm successfully running it with the "feature-extraction" task:

// Package name assumed here (Transformers.js v2).
import { pipeline } from "@xenova/transformers";

const pairs = [
  ["How many people live in Berlin?", "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."],
  ["How many people live in Berlin?", "Berlin is well known for its museums."],
];

// modelName is the path or ID of the converted model.
const model = await pipeline("feature-extraction", modelName);
const out = await model(pairs[0]);

console.log(Array.from(out.data)); // [-8.387903213500977, -9.811422348022461]

But I want to run it as a Cross Encoder, as it is intended to be used, like in the Python example code:

from sentence_transformers import CrossEncoder

model_name = 'cross-encoder/ms-marco-TinyBERT-L-2-v2'
model = CrossEncoder(model_name, max_length=512)

# Each tuple is a (query, passage) pair scored jointly by the model.
scores = model.predict([
    ('How many people live in Berlin?', 'Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.'),
    ('How many people live in Berlin?', 'Berlin is well known for its museums.'),
])

print(scores)  # [ 7.1523685 -6.2870455]

How can I infer a similarity score from two sentences?

PS: if there are existing models or techniques for sentence similarity, I'll take them!
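
One possible direction, sketched under assumptions rather than confirmed for this model: the "feature-extraction" call above most likely treats the two strings in `pairs[0]` as two separate inputs, which would explain why the two numbers do not match the Python scores. A cross-encoder needs the query and passage encoded together as one sequence. The sketch below assumes the `@xenova/transformers` package is installed, that the converted model loads through `AutoModelForSequenceClassification` with a single output logit, and that the tokenizer accepts a `text_pair` option. The helper names `scorePairs`, `sigmoid`, and `rankByScore` are hypothetical and exist only for illustration.

```javascript
// Sketch only: score (query, passage) pairs with a sequence-classification
// head instead of the "feature-extraction" pipeline.
// Assumptions: `@xenova/transformers` is installed, and `modelName` points at
// the converted cross-encoder, which emits one logit per input pair.

// Tokenize each query together with its passage and return one raw logit per pair.
async function scorePairs(modelName, pairs) {
  // Loaded lazily so the pure helpers below work without the package.
  const { AutoTokenizer, AutoModelForSequenceClassification } =
    await import("@xenova/transformers");

  const tokenizer = await AutoTokenizer.from_pretrained(modelName);
  const model = await AutoModelForSequenceClassification.from_pretrained(modelName);

  const queries = pairs.map(([query]) => query);
  const passages = pairs.map(([, passage]) => passage);

  // `text_pair` encodes query and passage as ONE input sequence,
  // which is what a cross-encoder expects.
  const inputs = tokenizer(queries, {
    text_pair: passages,
    padding: true,
    truncation: true,
  });

  const { logits } = await model(inputs);
  return Array.from(logits.data);
}

// Optional: squash a raw logit into the (0, 1) range.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Sort scores in descending order, keeping each passage's original index.
function rankByScore(scores) {
  return scores
    .map((score, index) => ({ index, score }))
    .sort((a, b) => b.score - a.score);
}
```

Usage would look something like `const scores = await scorePairs(modelName, pairs);` followed by `rankByScore(scores)` to order the passages. The raw logits can be compared directly for ranking; `sigmoid` is only needed if a bounded score is wanted. Small numeric differences from the Python output are to be expected if the converted model is quantized.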

Labels: question (further information is requested)
