Bump the minor-patch group with 3 updates #144

Merged 1 commit on Mar 25, 2024

@dependabot (bot) commented on behalf of github on Mar 22, 2024

Bumps the minor-patch group with 3 updates: boto3, sentence-transformers and boto3-stubs.

Updates boto3 from 1.34.68 to 1.34.69

Changelog

Sourced from boto3's changelog.

1.34.69

  • api-change:firehose: [botocore] Updates Amazon Firehose documentation for message regarding Enforcing Tags IAM Policy.
  • api-change:kendra: [botocore] Documentation update, March 2024. Corrects some docs for Amazon Kendra.
  • api-change:pricing: [botocore] Add ResourceNotFoundException to ListPriceLists and GetPriceListFileUrl APIs
  • api-change:rolesanywhere: [botocore] This release relaxes constraints on the durationSeconds request parameter for the *Profile APIs that support it. This parameter can now take on values that go up to 43200.
  • api-change:securityhub: [botocore] Added new resource detail object to ASFF, including resource for LastKnownExploitAt
Commits
  • 4f1c6c0 Merge branch 'release-1.34.69'
  • c3f1e7d Bumping version to 1.34.69
  • e43fc9d Add changelog entries from botocore
  • 6607f5f Merge branch 'release-1.34.68' into develop
  • See full diff in compare view

Updates sentence-transformers from 2.5.1 to 2.6.0

Release notes

Sourced from sentence-transformers's releases.

v2.6.0 - Embedding Quantization, GISTEmbedLoss

This release brings embedding quantization: a way to heavily speed up retrieval & other tasks, and a new powerful loss function: GISTEmbedLoss.

Install this version with

pip install sentence-transformers==2.6.0

Embedding Quantization

Embeddings may be challenging to scale up, which leads to expensive solutions and high latencies. However, there is a new approach to counter this problem; it entails reducing the size of each of the individual values in the embedding: Quantization. Experiments on quantization have shown that we can maintain a large amount of performance while significantly speeding up computation and saving on memory, storage, and costs.

To be specific, binary quantization may retain 96% of the retrieval performance while speeding up retrieval by 25x and reducing memory & disk usage by 32x. Do not underestimate this approach! Read more about Embedding Quantization in our extensive blogpost.
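
The 32x figure follows from storing one bit per dimension instead of a 32-bit float. The pure-Python sketch below illustrates that arithmetic; it is not the library's implementation. The helper names `binarize` and `hamming_distance`, the "value > 0" threshold, and the 1024-dimension width are all assumptions made for illustration.

```python
def binarize(embedding):
    """Pack one sign bit per dimension (1 if value > 0) into bytes, 8 dims per byte."""
    packed = bytearray()
    for start in range(0, len(embedding), 8):
        chunk = embedding[start:start + 8]
        byte = 0
        for value in chunk:
            byte = (byte << 1) | (1 if value > 0 else 0)
        # Left-align a short final chunk so bit positions stay consistent.
        byte <<= 8 - len(chunk)
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a, b):
    """Count differing bits between two packed embeddings of equal length."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


dims = 1024                    # assumed embedding width, for illustration only
float32_bytes = dims * 4       # 4 bytes per float32 value
binary_bytes = len(binarize([0.5] * dims))
print(float32_bytes, binary_bytes, float32_bytes // binary_bytes)  # 4096 128 32
```

Comparing packed embeddings with a bit-level Hamming distance is one common reason binary retrieval is fast, since it replaces floating-point dot products with XOR and bit counts.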

Binary and Scalar Quantization

Two forms of quantization exist at this time: binary and scalar (int8). These quantize embedding values from float32 into binary and int8, respectively. For Binary quantization, you can use the following snippet:

from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

# 1. Load an embedding model
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# 2a. Encode some text using "binary" quantization
binary_embeddings = model.encode(
    ["I am driving to the lake.", "It is a beautiful day."],
    precision="binary",
)

# 2b. or, encode some text without quantization & apply quantization afterwards
embeddings = model.encode(["I am driving to the lake.", "It is a beautiful day."])
binary_embeddings = quantize_embeddings(embeddings, precision="binary")
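
The release notes show only the binary case. As a rough illustration of the scalar (int8) idea, the sketch below maps each dimension's observed value range onto 256 integer buckets. It is a pure-Python approximation under stated assumptions, not the library's code: the helper names `calibrate_ranges` and `quantize_int8`, the min/max calibration, and the clamping of out-of-range values are all hypothetical choices.

```python
def calibrate_ranges(calibration_embeddings):
    """Return per-dimension (min, max) pairs observed in the calibration set."""
    columns = list(zip(*calibration_embeddings))
    return [(min(col), max(col)) for col in columns]


def quantize_int8(embedding, ranges):
    """Map each float to one of 256 buckets within its dimension's range."""
    quantized = []
    for value, (low, high) in zip(embedding, ranges):
        # Fall back to a step of 1.0 when a dimension is constant.
        step = (high - low) / 255 or 1.0
        bucket = round((value - low) / step) - 128
        # Clamp values that fall outside the calibrated range.
        quantized.append(max(-128, min(127, bucket)))
    return quantized


# Hypothetical calibration data: two 2-dimensional embeddings.
ranges = calibrate_ranges([[0.0, -1.0], [1.0, 1.0]])
int8_embedding = quantize_int8([0.0, 1.0], ranges)
```

Each value then occupies 1 byte instead of 4, so this form trades a smaller size reduction (4x) for a finer-grained representation than one bit per dimension.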

GISTEmbedLoss

GISTEmbedLoss, as introduced in Solatorio (2024), is a guided variant of the more standard in-batch negatives (MultipleNegativesRankingLoss) loss. Both loss functions are provided with a list of (anchor, positive) pairs, but while MultipleNegativesRankingLoss uses anchor_i and positive_i as positive pair and all positive_j with i != j as negative pairs, GISTEmbedLoss uses a second model to guide the in-batch negative sample selection.

This can be very useful, because it is plausible that anchor_i and positive_j are actually quite semantically similar. In this case, GISTEmbedLoss would not consider them a negative pair, while MultipleNegativesRankingLoss would. When finetuning MPNet-base on the AllNLI dataset, these are the Spearman correlations based on cosine similarity on the STS Benchmark dev set (higher is better):

[Figure: Spearman correlation on the STS Benchmark dev set for MultipleNegativesRankingLoss and GISTEmbedLoss]

The blue line is MultipleNegativesRankingLoss, whereas the grey line is GISTEmbedLoss with the small all-MiniLM-L6-v2 as the guide model. Note that all-MiniLM-L6-v2 by itself does not reach 88 Spearman correlation on this dataset, so this is really the effect of two models (mpnet-base and all-MiniLM-L6-v2) reaching a performance that they could not reach separately.
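
The guided selection described above can be sketched in a few lines. This is a simplified pure-Python illustration, not the library's GISTEmbedLoss: it assumes precomputed anchor-to-positive similarity matrices from the trained model and from the guide, drops any in-batch negative that the guide scores above the true positive, and applies a softmax cross-entropy over what remains. The function name `gist_in_batch_loss` and the temperature value are hypothetical.

```python
import math


def gist_in_batch_loss(model_sim, guide_sim, temperature=0.05):
    """Mean cross-entropy over in-batch candidates, skipping guide-flagged negatives.

    model_sim[i][j] and guide_sim[i][j] hold the similarity between anchor_i
    and positive_j according to the trained model and the guide model.
    """
    n = len(model_sim)
    total = 0.0
    for i in range(n):
        threshold = guide_sim[i][i]
        # Keep the true positive; drop candidates the guide rates above it.
        kept = [j for j in range(n) if j == i or guide_sim[i][j] <= threshold]
        logits = [model_sim[i][j] / temperature for j in kept]
        peak = max(logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_norm - model_sim[i][i] / temperature
    return total / n


# Hypothetical similarities for a batch of two (anchor, positive) pairs.
model_sim = [[0.9, 0.8], [0.1, 0.9]]
guide_sim = [[0.7, 0.95], [0.2, 0.8]]     # guide thinks positive_1 also fits anchor_0
no_guidance = [[1.0, 0.0], [0.0, 1.0]]    # masks nothing: plain in-batch negatives

guided = gist_in_batch_loss(model_sim, guide_sim)
unguided = gist_in_batch_loss(model_sim, no_guidance)
```

With these toy numbers the guided loss is lower, because anchor_0 is no longer penalized for being close to positive_1, which the guide considers a likely false negative.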

All changes

... (truncated)

Commits
  • a5f7749 Release v2.6.0
  • 13a9f3f [feat] Add binary & scalar embedding quantization support to Sentence Trans...
  • e6af66f Also update return docstring of encode_multi_process (#2548)
  • caaa28d Fix SentenceTransformer encode documentation return type default (numpy vecto...
  • 87f4180 [deprecation] Deprecate save_to_hub in favor of push_to_hub; add safe_s...
  • fc2a2d8 Enable saving modules as pytorch_model.bin (#2542)
  • b9255d9 Add 'get_config_dict' method to GISTEmbedLoss for better model cards (#2543)
  • 465d4f0 Add GISTEmbedLoss (#2535)
  • See full diff in compare view

Updates boto3-stubs from 1.34.68 to 1.34.69

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
  • @dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
  • @dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
  • @dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
  • @dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions

Bumps the minor-patch group with 3 updates: [boto3](https://github.com/boto/boto3), [sentence-transformers](https://github.com/UKPLab/sentence-transformers) and [boto3-stubs](https://github.com/youtype/mypy_boto3_builder).


Updates `boto3` from 1.34.68 to 1.34.69
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](boto/boto3@1.34.68...1.34.69)

Updates `sentence-transformers` from 2.5.1 to 2.6.0
- [Release notes](https://github.com/UKPLab/sentence-transformers/releases)
- [Commits](UKPLab/sentence-transformers@v2.5.1...v2.6.0)

Updates `boto3-stubs` from 1.34.68 to 1.34.69
- [Release notes](https://github.com/youtype/mypy_boto3_builder/releases)
- [Commits](https://github.com/youtype/mypy_boto3_builder/commits)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: minor-patch
- dependency-name: sentence-transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-patch
- dependency-name: boto3-stubs
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

@dependabot (bot) added the `dependencies` label (Pull requests that update a dependency file) on Mar 22, 2024
@gecBurton merged commit 0a6875a into main on Mar 25, 2024 (1 check passed)
@gecBurton deleted the dependabot/pip/minor-patch-6441405269 branch on March 25, 2024 09:23