Bump sentence-transformers from 2.4.0 to 2.5.1 in /backend/5-studycompass #385


dependabot[bot] commented on behalf of github on Mar 4, 2024

Bumps sentence-transformers from 2.4.0 to 2.5.1.

Release notes

Sourced from sentence-transformers's releases.

v2.5.0 - 2D Matryoshka & Adaptive Layer models, CrossEncoder (re)ranking

This release brings two new loss functions, a new way to (re)rank with CrossEncoder models, and several fixes.

Install this version with

pip install sentence-transformers==2.5.0

2D Matryoshka & Adaptive Layer models (#2506)

Embedding models are often encoder models with numerous layers, such as 12 (e.g. all-mpnet-base-v2) or 6 (e.g. all-MiniLM-L6-v2). To get embeddings, every single one of these layers must be traversed. 2D Matryoshka Sentence Embeddings (2DMSE) revisits this concept by proposing an approach to train embedding models that will perform well when only using a selection of all layers. This results in faster inference speeds at relatively low performance costs.

For example, using Sentence Transformers, you can train an Adaptive Layer model that can be sped up by 2x at a 15% reduction in performance, or 5x on GPU & 10x on CPU for a 20% reduction in performance. The 2DMSE paper highlights scenarios where this is superior to using a smaller model.

Training

Training with Adaptive Layer support is straightforward: rather than applying a loss function only to the last layer, we also apply that same loss function to the pooled embeddings from the earlier layers. Additionally, we employ a KL-divergence loss that aims to make the embeddings of the non-last layers match those of the last layer. This can be seen as a form of knowledge distillation, with the last layer as the teacher model and the earlier layers as the student models.

For example, with the 12-layer microsoft/mpnet-base, it will now be trained such that the model produces meaningful embeddings after each of the 12 layers.

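from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CoSENTLoss, AdaptiveLayerLoss

# Load the 12-layer base model to fine-tune
model = SentenceTransformer("microsoft/mpnet-base")
# Base loss that would normally be applied to the last layer only
base_loss = CoSENTLoss(model=model)
# Wrap it so the same loss is also applied to the earlier layers
loss = AdaptiveLayerLoss(model=model, loss=base_loss)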

  • Reference: AdaptiveLayerLoss
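
To make the training idea above concrete, here is a minimal pure-Python sketch of such a combined objective. It is an illustration only and not the library's AdaptiveLayerLoss implementation: the function names, the `kl_weight` parameter, and the use of a softmax over embedding values are assumptions made for this toy example.

```python
import math


def softmax(values):
    """Turn a vector into a probability distribution (numerically stable)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def adaptive_layer_objective(layer_embeddings, base_loss, kl_weight=1.0):
    """Toy combined objective (hypothetical helper, not the library API).

    layer_embeddings: one pooled embedding per layer; the last entry is
        the final layer, which acts as the teacher.
    base_loss: any callable mapping an embedding to a scalar loss.
    """
    teacher = softmax(layer_embeddings[-1])
    # The final layer only receives the base loss.
    total = base_loss(layer_embeddings[-1])
    for embedding in layer_embeddings[:-1]:
        student = softmax(embedding)
        # Earlier layers receive the same base loss plus a KL term that
        # pulls their distribution towards the final layer's.
        total += base_loss(embedding) + kl_weight * kl_divergence(teacher, student)
    return total


if __name__ == "__main__":
    layers = [[0.2, 0.9, -0.4], [0.1, 1.1, -0.6], [0.0, 1.2, -0.7]]
    # A stand-in base loss: squared distance to a fixed target vector.
    target = [0.0, 1.0, -1.0]
    loss_fn = lambda emb: sum((e - t) ** 2 for e, t in zip(emb, target))
    print(adaptive_layer_objective(layers, loss_fn))
```

When every layer already produces the same embedding as the last one, the KL terms vanish and only the per-layer base losses remain.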

Additionally, this can be combined with the MatryoshkaLoss so that the resulting model can be reduced both in the number of layers and in the size of the output dimensions. See also Matryoshka Embeddings for more information on reducing output dimensions. In Sentence Transformers, the combination of these two losses is called Matryoshka2dLoss, and a shorthand is provided for simpler training.

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CoSENTLoss, Matryoshka2dLoss

model = SentenceTransformer("microsoft/mpnet-base")
base_loss = CoSENTLoss(model=model)
# Combine Adaptive Layer and Matryoshka training; matryoshka_dims lists the output sizes to train for
loss = Matryoshka2dLoss(model=model, loss=base_loss, matryoshka_dims=[768, 512, 256, 128, 64])

  • Reference: Matryoshka2dLoss
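
As a rough illustration of what "reducing the output dimensions" means in practice, the sketch below truncates an embedding to its first few dimensions and re-normalizes it before comparing vectors. It uses plain Python lists with made-up values; the helper names are hypothetical and this is not how the library itself performs truncation.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up 8-dimensional embeddings standing in for model output.
    full_a = [0.5, 0.1, -0.3, 0.8, 0.05, -0.02, 0.4, 0.1]
    full_b = [0.4, 0.2, -0.25, 0.7, -0.3, 0.3, -0.1, 0.5]
    for dim in (8, 4, 2):
        a = truncate_and_normalize(full_a, dim)
        b = truncate_and_normalize(full_b, dim)
        print(dim, round(cosine_similarity(a, b), 3))
```

A model trained with a Matryoshka-style loss is meant to keep such truncated similarities useful, whereas an ordinary model gives no such guarantee.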

Results

Let's look at the performance that we may be able to expect from an Adaptive Layer embedding model versus a regular embedding model. For this experiment, I have trained two models:

... (truncated)

Commits
  • aaec753 Merge branch 'master' into v2.5-release
  • 66e0ee3 Fix CrossEncoder.rank default value for top_k (#2518)
  • aad0642 Don't always normalize the embeddings in clustering example (#2520)
  • 504de8b Update to ruff 0.3.0; update ruff.toml (#2517)
  • cb57d04 Add get_config_dict to new Matryoshka2dLoss & AdaptiveLayerLoss (#2516)
  • f8df32c Update model repo_id in 2dMatryoshka example (#2515)
  • 3857b9b Increment to dev version after v2.5.0 release
  • c4b32c2 Release v2.5.0
  • d884971 [loss] Add AdaptiveLayerLoss; 2d Matryoshka loss modifiers (#2506)
  • 937be8c Add rank() to the CrossEncoder (#2514)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [sentence-transformers](https://github.com/UKPLab/sentence-transformers) from 2.4.0 to 2.5.1.
- [Release notes](https://github.com/UKPLab/sentence-transformers/releases)
- [Commits](https://github.com/UKPLab/sentence-transformers/compare/v2.4.0...v2.5.1)

---
updated-dependencies:
- dependency-name: sentence-transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot added the backend/5-studycompass, dependencies, and python labels on Mar 4, 2024
@ralf-berger merged commit 1567fc8 into main on Mar 15, 2024
7 checks passed
@ralf-berger deleted the dependabot/pip/backend/5-studycompass/sentence-transformers-2.5.1 branch on March 15, 2024 16:13