-
Notifications
You must be signed in to change notification settings - Fork 972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
decompress: refine data depdency #140
Conversation
snappy.cc
Outdated
ip += next_literal_tag + 1; | ||
} else { | ||
*tag = ip[tag_type]; | ||
ip += tag_type +1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, separate +1
with space like + 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Updated in new version.
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for this contribution!
Please LMK if you'd prefer that I make this small change while importing the PR into our repo.
snappy.cc
Outdated
*tag = is_literal ? ip[literal_tag_offset] : ip[tag_type]; | ||
ip += is_literal ? literal_tag_offset : tag_type; | ||
ip++; | ||
if (0 == tag_type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add the is_literal
constant back?
const bool is_literal = (tag_type == 0);
if (is_literal) {
This would make it easier for me to read the code. I hope it doesn't throw off Clang's codegen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to reflect the comment. Thanks.
The final ip advance value doesn't have to wait for the result of offset to load *tag. It can be computed along with the offset, so the codegen will use one csinc in parallel with ldrb. This will improve the throughput. With this change it is observed ~4.2% uplift in UFlat/10 and ~3.7% in UFlatMedley Signed-off-by: Jun He <jun.he@arm.com> Change-Id: I20ab211235bbf578c6c978f2bbd9160a49e920da
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the revisions!
I'll get this PR through our internal repository.
The final ip advance value doesn't have to wait for
the result of offset to load *tag. It can be computed
along with the offset, so the codegen will use one
csinc in parallel with ldrb. This will improve the
throughput.
With this change it is observed ~4.2% uplift in UFlat/10
and ~3.7% in UFlatMedley
Signed-off-by: Jun He jun.he@arm.com
Change-Id: I20ab211235bbf578c6c978f2bbd9160a49e920da