
Progress bar for sBert #1436

Closed
Robdboer opened this issue May 11, 2023 · 11 comments · Fixed by #1439
@Robdboer
Contributor

During a conversation with @jteijema we concluded that it would be a good idea to include a progress bar for the sBert feature extractor.

@PeterLombaers
Member

This is already implemented in sentence_transformers, so it's just a matter of adjusting the SBert feature extractor in asreview to use that implementation:

https://www.sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode
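The `encode` method linked above accepts a `show_progress_bar` flag, so the change may be as small as forwarding that flag. A minimal sketch with a stub standing in for the real `SentenceTransformer` (the 384 embedding dimension is arbitrary):

```python
import numpy as np

class StubSentenceTransformer:
    """Stand-in for sentence_transformers.SentenceTransformer in this sketch."""
    def encode(self, texts, show_progress_bar=False):
        # The real encode() renders a tqdm progress bar when the flag is True.
        return np.zeros((len(texts), 384))

def transform(model, texts):
    # The only change needed in the extractor: forward the flag to encode().
    return np.array(model.encode(texts, show_progress_bar=True))

X = transform(StubSentenceTransformer(), ["a", "b", "c"])
print(X.shape)  # (3, 384)
```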

@jteijema
Member

I think sbert can be cleaner in more ways than one. I'd be willing to pick this up if @Robdboer does not want to.

@J535D165
Member

I'm interested in a PR, as I'm planning to merge PR #1427. It would help to have an indicator for long-running feature extractors; otherwise it looks like nothing is happening.

@Robdboer
Contributor Author

I will start looking into adding a progress bar today!

@rohitgarud
Contributor

rohitgarud commented May 16, 2023

I have one more suggestion for updating SBERT: we need to add a way to use models other than pre-trained SBERT models. I would like to work on it, either in a different branch or in the same one as the progress bar.

@Robdboer
Contributor Author

@rohitgarud I was also thinking of creating a general sentence-transformer feature extractor, where one could choose a pretrained model out of a list at the start of a simulation. What are your thoughts on this?

@PeterLombaers
Member

I would suggest that any of these changes would go in a different pull request than the one currently open for the progress bar.

@rohitgarud What models do you have in mind that you want to load via sentence_transformers but are not pretrained sentence transformer models? There are of course a lot of models on huggingface, for example, but sentence_transformers is already able to use models from there that are compatible with the package.

@Robdboer Do you mean in the ASReview LAB frontend or via the CLI? Via the CLI you can already choose any pretrained model, though it is a bit of a hassle to get it to work. The SBert feature extractor accepts a transformer_model argument to load a pretrained model.
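A toy sketch of the plumbing described above; the class name and default model are hypothetical, only the `transformer_model` argument name comes from this thread:

```python
class SBertFeatureExtractor:
    """Hypothetical stand-in for asreview's SBert feature extractor."""
    def __init__(self, transformer_model="all-mpnet-base-v2"):
        # Any pretrained sentence-transformers model name can be passed here,
        # e.g. a multilingual model instead of the English default.
        self.transformer_model = transformer_model

extractor = SBertFeatureExtractor(
    transformer_model="paraphrase-multilingual-MiniLM-L12-v2"
)
print(extractor.transformer_model)  # paraphrase-multilingual-MiniLM-L12-v2
```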

@Robdboer
Contributor Author

@PeterLombaers Wow, I did not realize that. I see it in the code now. Since I will be testing different multilingual feature extractors for my thesis, this could be a very useful feature. If I find any improvements I will create a new pull request. Thanks!

@rohitgarud
Contributor

@PeterLombaers There are a few models, such as SciBERT, BioBERT, or Longformer, that do not have the pooling layer at the end to generate sentence embeddings, since they are not SBERT models.
I will create a PR for this; please review and let me know if it is a good feature to add to SBERT.
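For context, the pooling layer these models lack is typically just a masked mean over token embeddings (sentence_transformers can compose it onto a plain transformer via `models.Transformer` plus `models.Pooling`). A minimal numpy sketch of mean pooling:

```python
import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    # token_embeddings: (num_tokens, dim); attention_mask: (num_tokens,) of 0/1.
    # Average only over real tokens, ignoring padding positions.
    mask = attention_mask[:, None].astype(float)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

emb = np.array([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]])  # last row is padding
mask = np.array([1, 1, 0])
print(mean_pooling(emb, mask))  # [2. 3.]
```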

@jteijema
Member

@rohitgarud there's a very basic reusable sbert model I've been working on, on my GitHub. It's not currently usable for applications other than mine, though.

Moreover, sbert currently casts the embeddings to an array. I don't think this is needed anymore; we could just return model.encode(texts).

def transform(self, texts):
    _check_st()

    model = SentenceTransformer(self.transformer_model)
    X = np.array(model.encode(texts))
    return X

@rohitgarud
Contributor

@jteijema The reusable sbert model is indeed very helpful, and I think something like this could be added to Makita itself: if the feature extractor parameters are unchanged, reuse the previously computed features for the next simulation on the same dataset.
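One way such feature caching could look (a sketch, not Makita's implementation; the key derivation and file layout are assumptions):

```python
import hashlib
import json
import os
import pickle
import tempfile

CACHE_DIR = tempfile.mkdtemp()

def cache_key(extractor_name, params, dataset_id):
    # Identical extractor + parameters + dataset -> identical key.
    payload = json.dumps([extractor_name, params, dataset_id], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def get_or_compute(extractor_name, params, dataset_id, compute):
    path = os.path.join(CACHE_DIR, cache_key(extractor_name, params, dataset_id) + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)  # cache hit: skip the expensive extraction
    features = compute()
    with open(path, "wb") as f:
        pickle.dump(features, f)
    return features

calls = []
compute = lambda: calls.append(1) or [[0.1, 0.2]]
a = get_or_compute("sbert", {"transformer_model": "x"}, "data.csv", compute)
b = get_or_compute("sbert", {"transformer_model": "x"}, "data.csv", compute)
print(len(calls))  # 1  (second call hit the cache)
```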

@J535D165 J535D165 linked a pull request May 17, 2023 that will close this issue
5 participants