-
Notifications
You must be signed in to change notification settings - Fork 5
68 make contexts more flexible and efficient #69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
68 make contexts more flexible and efficient #69
Conversation
Summary of ChangesHello @noamteyssier, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant architectural improvement by refactoring the context management system. The previous single Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant and valuable refactoring of the Context mechanism, making it more flexible and efficient through a trait-based design. The new structure is well-organized and the examples are updated appropriately. I've identified a couple of high-severity issues in the new QualityContext implementation concerning buffer management that could lead to incorrect behavior or memory overallocation. I've also found a minor documentation error. Addressing these points will make this an excellent contribution.
| let squal = self.squal_mut(); | ||
| if squal.len() != slen { | ||
| squal.reserve(slen); | ||
| unsafe { | ||
| squal.set_len(slen); | ||
| } | ||
| } | ||
| squal[..slen].copy_from_slice(record.squal()); | ||
|
|
||
| if record.is_paired() { | ||
| let xlen = record.xlen() as usize; | ||
| let xqual = self.xqual_mut(); | ||
| if xqual.len() != xlen { | ||
| xqual.reserve(xlen); | ||
| unsafe { | ||
| xqual.set_len(xlen); | ||
| } | ||
| } | ||
| xqual[..xlen].copy_from_slice(record.xqual()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of unsafe with set_len here is problematic and contains a bug.
squal.reserve(slen)is incorrect.Vec::reserveallocates additional capacity. Ifsqualis not empty, this will overallocate memory. This bug applies to bothsqualandxqualhandling.- The
unsafeblock violates the safety contract ofset_len, which requires the new elements to be initialized. Whilecopy_from_sliceinitializes them immediately after, this pattern is brittle.
Using safe, idiomatic Rust is clearer, safer, and likely just as performant. I suggest replacing this with a safe alternative that correctly handles buffer resizing.
let squal = self.squal_mut();
squal.clear();
squal.extend_from_slice(record.squal());
if record.is_paired() {
let xlen = record.xlen() as usize;
let xqual = self.xqual_mut();
xqual.clear();
xqual.extend_from_slice(record.xqual());
}| let slen = record.slen() as usize; | ||
| let xlen = record.xlen() as usize; | ||
|
|
||
| // only resize if its not the right size | ||
| if self.squal().len() < slen { | ||
| self.squal_mut().clear(); | ||
| self.squal_mut().resize(slen, DEFAULT_QUALITY); | ||
| } | ||
|
|
||
| // Only resize if there's an extended sequence and it's not already the right size | ||
| if xlen > 0 && self.xqual().len() < xlen { | ||
| self.xqual_mut().clear(); | ||
| self.xqual_mut().resize(xlen, DEFAULT_QUALITY); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic in ensure_quality_capacity is buggy and can lead to stale data in the buffers.
- The condition
self.squal().len() < slenis incorrect. If a buffer is longer than needed (from a previously processed longer record), it won't be truncated, leaving stale data. - Similarly, if a paired-end record is followed by a single-end record (
xlen == 0), thexqualbuffer is not cleared, also leaving stale data.
The correct approach is to resize the buffers to the exact required lengths (slen and xlen) if they don't already match.
let slen = record.slen() as usize;
let xlen = record.xlen() as usize;
let squal = self.squal_mut();
if squal.len() != slen {
squal.resize(slen, DEFAULT_QUALITY);
}
let xqual = self.xqual_mut();
if xqual.len() != xlen {
xqual.resize(xlen, DEFAULT_QUALITY);
}| buffer.extend_from_slice(self.sheader); | ||
| } | ||
| } | ||
| /// Clear the buffer and fill it with the sequence header |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No description provided.