add dtype-based loading #461
Merged
michaelfeil merged 1 commit into main on Nov 13, 2024
Conversation
Contributor
PR Summary
This PR implements dtype-based loading strategies and device placement across transformer models, replacing manual dtype/device management with a more consistent approach.
- Added loading strategy support in /libs/infinity_emb/infinity_emb/transformer/embedder/sentence_transformer.py with a loading_dtype parameter for model initialization
- Integrated the quantization interface via quant_interface in /libs/infinity_emb/infinity_emb/transformer/classifier/torch.py and /libs/infinity_emb/infinity_emb/transformer/crossencoder/torch.py
- Added torch.compile support in the classifier and crossencoder implementations
- Standardized float32 numpy output in CrossEncoder's encode_post method
- Removed manual half-precision conversion in favor of loading_dtype across transformer classes (see the sketch after this list)
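Below is a minimal sketch of the dtype-based loading idea, assuming a Hugging Face transformers backend; the load_classifier helper and its signature are illustrative assumptions, not the actual infinity_emb API.

```python
import torch
from transformers import AutoModelForSequenceClassification

# Hypothetical helper: select the dtype once at load time and pass it to
# from_pretrained, instead of loading in float32 and calling .half() later.
def load_classifier(
    model_id: str,
    device: str = "cuda",
    loading_dtype: torch.dtype = torch.float16,
) -> torch.nn.Module:
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, torch_dtype=loading_dtype
    )
    # Device placement happens alongside dtype selection.
    return model.to(device).eval()
```

Loading directly in the target dtype can avoid materializing the full model in float32 when a half-precision dtype is requested, which reduces peak memory during startup.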
3 file(s) reviewed, 4 comment(s)
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #461      +/-   ##
==========================================
+ Coverage   78.97%   79.08%   +0.10%
==========================================
  Files          42       42
  Lines        3392     3414      +22
==========================================
+ Hits         2679     2700      +21
- Misses        713      714       +1

☔ View full report in Codecov by Sentry.
This pull request includes several changes to improve the handling of loading strategies, device placement, and quantization in the infinity_emb library. The most important changes involve updates to the SentenceClassifier, CrossEncoder, and SentenceTransformer classes to incorporate new loading strategies and device placement, as well as handling different data types and quantization.

Improvements to handling loading strategies and device placement:

- libs/infinity_emb/infinity_emb/transformer/classifier/torch.py: Added support for loading strategies, device placement, and quantization in the SentenceClassifier class. [1] [2]
- libs/infinity_emb/infinity_emb/transformer/crossencoder/torch.py: Updated the CrossEncoder class to handle loading strategies, device placement, and quantization. [1] [2] [3] [4]
- libs/infinity_emb/infinity_emb/transformer/embedder/sentence_transformer.py: Enhanced the SentenceTransformer class to support loading strategies and device placement.

A hedged sketch of the optional torch.compile wrapping mentioned above follows.
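The following is a minimal sketch of how torch.compile support can be gated, not the PR's exact implementation; the maybe_compile helper is a hypothetical name.

```python
import torch

# Hypothetical wrapper: compile the module when torch.compile is
# available, and fall back to eager execution if compilation fails.
def maybe_compile(model: torch.nn.Module) -> torch.nn.Module:
    if not hasattr(torch, "compile"):
        return model  # torch < 2.0 has no torch.compile
    try:
        return torch.compile(model, dynamic=True)
    except Exception:
        # Some backends/models are unsupported; keep the eager module.
        return model
```

Wrapping the model this way keeps the classifier and crossencoder code paths identical whether or not compilation succeeds.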
License & CLA
By submitting this PR, I confirm that my contribution is made under the terms of the MIT license.
Related Issue
Checklist
Additional Notes
Add any other context about the PR here.