
Use an optimized crc32 library which is faster #527

Merged: 1 commit into master from crc32-optimization on Aug 31, 2015

Conversation

@eapache (Contributor) commented Aug 31, 2015

Needs testing to ensure that this is actually faster (and is still correct), but may solve #255.
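For context, the whole change is essentially an import swap. A minimal sketch, assuming the optimized package is a drop-in replacement for `hash/crc32` such as `github.com/klauspost/crc32` (the PR diff isn't shown here, so the package name is an assumption):

```go
package main

import (
	"fmt"

	// crc32 "hash/crc32"              // portable stdlib implementation
	crc32 "github.com/klauspost/crc32" // assumed drop-in: same API, hardware-accelerated
)

func main() {
	// Kafka messages carry an IEEE CRC32 of their contents; the
	// optimized package exposes the same ChecksumIEEE entry point.
	data := []byte("message payload")
	fmt.Printf("%08x\n", crc32.ChecksumIEEE(data))
}
```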

@wvanbergen (Contributor)

We may also want to check which architectures this will run on, because it uses CPU instructions that may not be available everywhere, and I don't know whether it provides fallbacks.

@eapache (Contributor, Author) commented Aug 31, 2015

At a quick glance it uses +build flags and falls back to the stdlib implementation, so we should be OK.
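That pattern looks roughly like the following sketch of the old-style `+build` constraint mechanism (file name and function are illustrative, not the library's actual source). A matching file guarded by `// +build amd64,!appengine` would supply the SSE4.2-backed version of the same function:

```go
// crc32_fallback.go
//
// +build !amd64 appengine

package crc32

import stdcrc32 "hash/crc32"

// updateIEEE is the portable fallback, compiled only where the
// assembly-accelerated version is not; callers see one function
// either way, selected at build time.
func updateIEEE(crc uint32, p []byte) uint32 {
	return stdcrc32.Update(crc, stdcrc32.IEEETable, p)
}
```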

@wvanbergen (Contributor)

OK, let's run some benchmarks!

@eapache (Contributor, Author) commented Aug 31, 2015

My benchmark shows no difference; possibly the Vagrant VM does not support the necessary instructions. Since Go profiling is broken on native OS X, we may have to run something on one of our servers.
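For reference, a minimal benchmark of the kind that would surface the difference (the exact benchmark used isn't shown in the thread):

```go
package crc32bench

import (
	"hash/crc32"
	"testing"
)

// BenchmarkChecksumIEEE measures CRC32 throughput over a 4 KiB block.
// Run with `go test -bench=.` and compare the reported MB/s with the
// import pointed at the stdlib versus the optimized package.
func BenchmarkChecksumIEEE(b *testing.B) {
	data := make([]byte, 4096)
	b.SetBytes(int64(len(data)))
	for i := 0; i < b.N; i++ {
		crc32.ChecksumIEEE(data)
	}
}
```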

@eapache (Contributor, Author) commented Aug 31, 2015

Even on a bare-metal server whose /proc/cpuinfo says it supports SSE4.2, I see the same amount of time spent in the fallback method... not sure what else I need to do to actually make the optimizations kick in.
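One way to sanity-check the hardware support from Go, a throwaway sketch (Linux only; note that a VM or an old toolchain can still prevent the fast path even when the flag is present):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"
)

// Prints whether /proc/cpuinfo advertises the sse4_2 flag that the
// hardware CRC32 instruction requires.
func main() {
	info, err := ioutil.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("sse4_2 supported:", strings.Contains(string(info), "sse4_2"))
}
```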

@eapache (Contributor, Author) commented Aug 31, 2015

Welp, found a bug in our code while benchmarking. Another PR incoming.

@eapache (Contributor, Author) commented Aug 31, 2015

Short version: this basically removes CRC32 from the profile entirely; it drops from 15-20% of the CPU time in my test to a statistically insignificant number of samples. Sold.
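(For anyone reproducing this: the numbers come from a CPU profile, gathered with something like the commands below; the test binary name depends on the package.)

```
go test -run=NONE -bench=. -cpuprofile=cpu.prof
go tool pprof sarama.test cpu.prof
```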

Also, the upstream author has submitted this to Go, so there's a very good chance Go 1.6 will have this optimization in the stdlib.

eapache added a commit that referenced this pull request Aug 31, 2015
Use an optimized crc32 library which is faster
@eapache eapache merged commit 2d4c75a into master Aug 31, 2015
@eapache eapache deleted the crc32-optimization branch August 31, 2015 16:37
@wvanbergen (Contributor)

👏

eapache added a commit that referenced this pull request Aug 31, 2015
I discovered a "send on closed channel" panic in the consumer while testing #527
which I was finally able to track down. If a partition takes a long time to
drain to the user, then the responseFeeder reclaims its ownership token from
the broker so that the broker doesn't block its other partitions. However, if
the user closes the PartitionConsumer (closing the dying channel) then the
brokerConsumer will unconditionally return the ownership token to the dispatcher
even if the responseFeeder is holding it. This results in two ownership tokens
for the same partition (one in the feeder, one in the dispatcher) which leads to
all sorts of subtle brokenness. It manifested in at least two different "send on
closed channel" backtraces depending on the exact timing, and possibly more.

To fix, move the check on `child.dying` to the top of the `subscriptionConsumer`
loop where we are guaranteed to have the ownership token. Combine that check
with the 'new subscriptions' check into an `updateSubscriptions` helper method.
The diff is huge because this lets us drop an indentation level in
`handleResponses`; I suggest reviewing with `w=1` to ignore whitespace.
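To make the invariant concrete, here is a minimal, self-contained sketch of the pattern the fix restores, using hypothetical names rather than Sarama's actual types: one ownership token per partition, with the dying check done only while the token is held.

```go
package main

import "time"

type token struct{}

type partition struct {
	owner chan token    // capacity 1: holds the token when no goroutine owns it
	dying chan struct{} // closed when the consumer shuts down
}

// consumeLoop checks dying only at the top of the loop, where it is
// guaranteed to hold the token, so the token is returned exactly once
// and can never exist in two places at the same time.
func consumeLoop(p *partition) {
	tok := <-p.owner // acquire exclusive ownership
	for {
		select {
		case <-p.dying:
			p.owner <- tok // return the token exactly once
			return
		default:
		}
		time.Sleep(10 * time.Millisecond) // stand-in for draining one batch
	}
}

func main() {
	p := &partition{owner: make(chan token, 1), dying: make(chan struct{})}
	p.owner <- token{} // seed the single token
	go consumeLoop(p)
	time.Sleep(50 * time.Millisecond)
	close(p.dying)
	<-p.owner // the token comes back exactly once
}
```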