Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: faster Nat.testBit #4188

Merged
merged 6 commits into from
May 23, 2024
Merged

Conversation

FR-vdash-bot
Copy link
Contributor

@FR-vdash-bot FR-vdash-bot commented May 16, 2024

1 &&& n is faster than n &&& 1 for big n.


Maybe we could have a much faster C implementation in the future.
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label May 16, 2024
@leanprover-community-mathlib4-bot
Copy link
Collaborator

leanprover-community-mathlib4-bot commented May 16, 2024

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase f74980ccee82ca2abdae65dcbc5571d4640ed076 --onto 3035d2f8f689b52963f49b2414414913ca296953. (2024-05-16 06:50:42)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase f74980ccee82ca2abdae65dcbc5571d4640ed076 --onto 3de60bb1f63efe9bb56380f911f86980b9f3332c. (2024-05-21 14:29:09)

@FR-vdash-bot
Copy link
Contributor Author

awaiting-review

@github-actions github-actions bot added the awaiting-review Waiting for someone to review the PR label May 16, 2024
src/Init/Data/Nat/Bitwise/Lemmas.lean Outdated Show resolved Hide resolved
Comment on lines 81 to 83
def testBit (m n : Nat) : Bool :=
-- `1 &&& n` is faster than `n &&& 1` for big `n`. This may change in the future.
1 &&& (m >>> n) != 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have numbers to confirm this claim?
Is it faster in kernel reduction or for compiled code? If it really is faster, doesn't the compiler just optimize it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def test (x : Nat) : Nat := 1 &&& x
def test' (x : Nat) : Nat := x &&& 1

def run_test (n : Nat) (x : Nat) : IO Unit := do
  let mut t := 0
  for _ in [0 : n] do
    t := test x

def run_test' (n : Nat) (x : Nat) : IO Unit := do
  let mut t := 0
  for _ in [0 : n] do
    t := test' x

def n := 10^5
def x := 11^1000000

#eval timeit "1 &&& x" (run_test n x) -- 41ms
#eval timeit "x &&& 1" (run_test' n x) -- 1.5s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a surprisingly big difference, almost suspiciously so. I only know enough about micro-benchmarking to know results can sometimes be misleading, so I tried playing with various variations (e.g., sprinkling no_inline) but everything points towards 1 &&& x indeed being significantly faster.

Admittedly, In my original comment I was only thinking about "small" (up-to 64 bit) numbers, is this x &&& 1 vs 1 &&& x a known behaviour of gmp (which I assume what is going on here, with the big numbers given in your testcase)?

Copy link
Contributor

@alexkeizer alexkeizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not knowledgable enough about micro-benchmarks to say whether the given example doesn't have any weird inlining or other behaviour that might interfere with the numbers, but the difference does seem so significant that it makes sense to switch over.

Maybe we could have a much faster C implementation in the future.

For 64 bit or lower, bitwise and and shifting already have special cased code generation. I guess for larger numbers you could optimize indexing much more by using knowledge of the gmp representation, but we'd need some good evidence that the extra maintenance burden is worth the benefit. In any case, that's not a discussion relevant to this PR, so maybe just take that comment out of the PR description.

Otherwise, I've got a minor comment about the proof style but the rest LGTM

src/Init/Data/Nat/Bitwise/Lemmas.lean Outdated Show resolved Hide resolved
src/Init/Data/Nat/Bitwise/Basic.lean Outdated Show resolved Hide resolved
@semorrison semorrison removed the awaiting-review Waiting for someone to review the PR label May 23, 2024
@semorrison semorrison added this pull request to the merge queue May 23, 2024
Merged via the queue into leanprover:master with commit 93758cc May 23, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants