Replace ONNXMiniLM_L6_V2._init_model_and_tokenizer with tokenizer and model cached properties #1194
Conversation
```python
        self.DOWNLOAD_PATH, self.EXTRACTED_FOLDER_NAME, "tokenizer.json"
    )

@cached_property
def tokenizer(self) -> "Tokenizer":
```
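The cached-property approach can be sketched in isolation (a minimal, hypothetical stand-in: the class and the "initialization" body here are illustrative, not Chroma's actual loader code):

```python
from functools import cached_property

class Embedder:
    """Illustrative stand-in: lazy, idempotent initialization via cached_property."""

    @cached_property
    def tokenizer(self):
        # Runs at most once per instance under normal access; the result
        # is stored on the instance and returned on subsequent accesses.
        return {"name": "tokenizer"}  # placeholder for the real load

e = Embedder()
t1 = e.tokenizer  # triggers initialization
t2 = e.tokenizer  # returns the cached value
assert t1 is t2
```

Even if two threads race into the property body, each attribute is initialized independently, so no thread can observe a half-initialized pair.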
Aren't race conditions still possible even if this is cached?
Ostensibly they are still possible, but they are harmless: at worst the model or the tokenizer is initialized more than once, and initialization is idempotent.

The reason the current race condition is problematic is the `and` in `if self.model is None and self.tokenizer is None`:

- Initially the condition is True, as both model and tokenizer are None.
- Thread A enters the block and initializes the tokenizer.
- The context switches to thread B before A gets to initialize the model.
- Thread B evaluates the condition as False (model is None, but tokenizer is not None) and skips initialization.
- Thread B attempts to access `self.model.run` → `AttributeError` 💥
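The interleaving above can be reproduced step by step without real threads (a hypothetical reconstruction of the buggy guard; the class and attribute names are illustrative):

```python
class Embedder:
    """Stand-in for the pre-fix state: both attributes start as None."""
    def __init__(self):
        self.model = None
        self.tokenizer = None

e = Embedder()

# Thread A evaluates the guard and enters the init block...
assert e.model is None and e.tokenizer is None
e.tokenizer = "tokenizer"  # ...A sets the tokenizer, then is preempted here

# Thread B now re-evaluates the same guard:
needs_init = e.model is None and e.tokenizer is None
assert needs_init is False  # B skips initialization entirely
assert e.model is None      # yet model was never set, so self.model.run raises
```

Replacing the single `and`-guarded block with one cached property per attribute removes this coupling: neither attribute's state can mask the other's missing initialization.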
Fair enough :) Thanks for the analysis!
Tests are failing. I think it's because of `ImportError: cannot import name 'cached_property' from 'functools'`: `functools.cached_property` is not available on 3.7.
Oh, is 3.7 still supported? It reached its end-of-life in June. If you don't want to bump the minimum version to 3.8, you can add `cached_property` as a conditional requirement for 3.7.
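One way to do that (a sketch, assuming the third-party `cached-property` backport package from PyPI is declared as a 3.7-only dependency, e.g. `cached-property; python_version < "3.8"` in the requirements):

```python
try:
    # Python 3.8+: cached_property is in the standard library
    from functools import cached_property
except ImportError:
    # Python 3.7 fallback via the PyPI backport (assumed conditional dependency)
    from cached_property import cached_property
```

On 3.8+ the `except` branch never runs, so the backport is only needed where the stdlib name is missing.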
@gsakkis we're not supporting it anymore if you want to push this over the finish line!
@beggers rebased!
🚀 🚀 🚀
## Description of changes

*Summarize the changes made by this PR.*
- Improvements & Bug fixes
  - Fixes chroma-core#1193: race condition in `ONNXMiniLM_L6_V2._init_model_and_tokenizer`

## Test plan

*How are these changes tested?*
- [x] Tests pass locally with `pytest` for python, `yarn test` for js