[Stdlib] Speedup `Dict` (changing modulus to bitshifting) #3071

rd4com · 2024-06-17T21:35:25Z

Hello,

This could be a nice improvement: around +80% here (Ubuntu).

It is hard to tell without feedback, so here is the benchmark used:

from time import now
from __dict_simd import Dict2  # presumably the local, patched copy of Dict

alias iteration_size = 128

def main():
    var result: Int = 0
    var start = now()
    var stop = now()

    # Dict2: 100 rounds of inserting and reading back `iteration_size` keys.
    small = Dict2[Int, Int]()
    start = now()
    for _ in range(100):
        for i in range(iteration_size):
            small[i] = i
        for i in range(iteration_size):
            result += small[i]
    stop = now()
    print(stop - start, result)

    result = 0

    # Stdlib Dict: the same workload, for comparison.
    small2 = Dict[Int, Int]()
    start = now()
    for _ in range(100):
        for i in range(iteration_size):
            small2[i] = i
        for i in range(iteration_size):
            result += small2[i]
    stop = now()
    print(stop - start, result)

486225 812800
807652 812800

Because the dict always grows its reserved size with `<<= 1`, the size stays a power of two, so it is possible to use `&= (self.reserved - 1)` instead of the modulus.
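A minimal sketch of why the two forms agree, written in Python rather than Mojo so it can be run as-is. The helper names `slot_mod` and `slot_mask` are hypothetical and are not taken from the `Dict` implementation; the sketch only checks the arithmetic identity for sizes that are doubled with `<<= 1`.

```python
def slot_mod(hash_value: int, reserved: int) -> int:
    # Slot index computed with the modulus operator.
    return hash_value % reserved


def slot_mask(hash_value: int, reserved: int) -> int:
    # Slot index computed with a bitmask.
    # Only valid when `reserved` is a power of two.
    return hash_value & (reserved - 1)


reserved = 8
for _ in range(4):
    # Both forms give the same slot for every sampled hash value.
    for h in (0, 7, 8, 13, 12345, 2**40 + 3):
        assert slot_mod(h, reserved) == slot_mask(h, reserved)
    reserved <<= 1  # doubling keeps the size a power of two

print(slot_mask(13, 8))  # 5
```

The mask keeps only the low bits of the hash, which is exactly the remainder when dividing by a power of two.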

🥳 Hope you see the same improvement as here.

Signed-off-by: rd4com <144297616+rd4com@users.noreply.github.com>

bethebunny

Nice! The fact that this wasn't done automatically makes me wonder if the subtraction can also be optimized out. I'll follow up with some of the compiler folks to understand why this case isn't optimized, or whether we can somehow tell the compiler that `reserved` is guaranteed to be a power of 2. As far as I'm concerned, `%` is the correct way to spell this and conveys the intent much more clearly, but obviously this optimization is the core intention behind always keeping a power-of-2 reserved size ;)

JoeLoser · 2024-06-17T23:44:14Z

!sync

bethebunny · 2024-06-17T23:51:36Z

Following up, the failure to optimize here is because we used a signed int for reserved rather than an unsigned type. That's also a reasonable followup.
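One general reason signedness can matter, sketched in Python: with a C-style truncating remainder, the result for a negative left operand differs from the bitmask result, so a compiler cannot swap one for the other on signed values without extra fix-up code. The thread does not say this is the exact mechanism in the Mojo compiler; `c_style_rem` is a hypothetical helper that only emulates truncating semantics.

```python
import math


def c_style_rem(x: int, n: int) -> int:
    # Truncating remainder as in C: the result takes the sign of x.
    return int(math.fmod(x, n))


n = 8
# Non-negative operand: remainder and mask agree.
print(c_style_rem(13, n), 13 & (n - 1))  # 5 5
# Negative operand: they disagree, so the rewrite is not safe for signed values.
print(c_style_rem(-5, n), -5 & (n - 1))  # -5 3
```

For an unsigned operand the negative case cannot occur, which is what makes the rewrite straightforward there.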

bethebunny · 2024-06-17T23:59:09Z

Consensus seems to be "relying on specific codegen optimizations for performance critical things is an antipattern", so +1 on merging this and we don't need any followups!

JoeLoser · 2024-06-18T01:51:42Z

!sync

modularbot · 2024-06-18T02:42:02Z

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions, click here to learn more.

gabrieldemarmiesse · 2024-06-18T09:49:13Z

Hi, a warning here: an earlier PR (#2905) made the capacity of the dict be decided by the underlying List. As long as List always grows by a power of two, that doesn't cause any issue. But if we change the capacity growth logic in List in the future, the capacity of the Dict might no longer be a power of two.

Should we remove the optimization where the dict uses all the capacity of the list? That would make sure we always work with powers of two. It's the approach I recommend.
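To illustrate the warning, here is a small Python sketch of what the bitmask does when the capacity is not a power of two. The capacity of 12 is a made-up value chosen for the example, not something List produces today.

```python
capacity = 12  # hypothetical capacity that is not a power of two

# Slots reachable with the bitmask versus with the modulus.
mask_slots = sorted({h & (capacity - 1) for h in range(1000)})
mod_slots = sorted({h % capacity for h in range(1000)})

print(mask_slots)  # [0, 1, 2, 3, 8, 9, 10, 11]
print(mod_slots)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# The two forms also disagree on individual keys.
print(4 % capacity, 4 & (capacity - 1))  # 4 0
```

With the mask, slots 4 to 7 are never selected, so the table would silently use only part of its capacity and collide more often.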

EDIT: I wonder if it would make sense to have a private type `PowerOfTwo`, which would be a wrapper around `mlir.index` to represent integers that are known, at compile time, to be powers of two. By adding a `debug_assert` in the constructor, this can help us debug cases where numbers that are not powers of two are misused as powers of two. It also has some value as an extra piece of information when reading the type in the struct. We can then implement `__div__`, `__mul__`, and `__mod__` with bit operations, and we're guaranteed that the optimizations will be used everywhere one of those ops is called. What do you think? Usually, having bit operations throughout the codebase makes it hard to debug if something goes wrong. This approach could help by grouping some of those bit ops in a type.
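A rough Python sketch of the idea, to make the proposal concrete. Everything here is an assumption for illustration: the real type would be a Mojo struct around `mlir.index`, a plain `assert` stands in for `debug_assert`, and the reflected operators are just one way to let `x % capacity` dispatch to the wrapper.

```python
class PowerOfTwo:
    """Hypothetical wrapper for an integer known to be a power of two."""

    def __init__(self, value: int):
        # Stand-in for the debug_assert mentioned above.
        assert value > 0 and value & (value - 1) == 0, f"{value} is not a power of two"
        self.value = value
        self.shift = value.bit_length() - 1  # log2(value)

    def __rmod__(self, x: int) -> int:
        # x % self, computed with a mask.
        return x & (self.value - 1)

    def __rfloordiv__(self, x: int) -> int:
        # x // self, computed with a right shift.
        return x >> self.shift

    def __rmul__(self, x: int) -> int:
        # x * self, computed with a left shift.
        return x << self.shift


capacity = PowerOfTwo(8)
print(21 % capacity)   # 5
print(21 // capacity)  # 2
print(3 * capacity)    # 24
```

Call sites keep the readable `%`, `//`, and `*` spellings, while the bit tricks and the power-of-two check live in one place. Constructing `PowerOfTwo(12)` fails the assertion instead of silently producing wrong slots.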

modularbot · 2024-06-19T13:21:23Z

Landed in 504ed85! Thank you for your contribution 🎉

…(#41913)

[External] [Stdlib] Speedup `Dict` (changing modulus to bitshifting)

Hello,

it could be a nice improvement, around +80% here (Ubuntu);

Hard to tell without feedbacks, here is the benchmark used:

```mojo
from time import now
from random import *
from sys.param_env import is_defined
from __dict_simd import Dict2

alias iteration_size = 128
def main():
    var result: Int=0
    var start = now()
    var stop = now()

    small = Dict2[Int,Int]()
    start = now()
    for x in range(100):
        for i in range(iteration_size):
            small[i]=i
        for i in range(iteration_size):
            result += small[i]
    stop = now()
    print(stop-start, result)

    result = 0

    small2 = Dict[Int,Int]()
    start = now()
    for x in range(100):
        for i in range(iteration_size):
            small2[i]=i
        for i in range(iteration_size):
            result += small2[i]
    stop = now()
    print(stop-start, result)
```

486225 812800
807652 812800

---

Because the dict always augment the reserved size by `<<=1`, it is possible to use `&=(self.reserved-1)` instead of modulus

:partying_face: hope you got the same improvements as here

Co-authored-by: rd4com <144297616+rd4com@users.noreply.github.com>
Closes #3071
MODULAR_ORIG_COMMIT_REV_ID: 522c3ccd50ac350ea321951ac526bdf137deaac7

rd4com requested a review from a team as a code owner June 17, 2024 21:35

JoeLoser requested review from bethebunny and rparolin June 17, 2024 21:42

[Stdlib] Speedup Dict (changing modulus to bitshifting)

7cb210f

Signed-off-by: rd4com <144297616+rd4com@users.noreply.github.com>

rd4com force-pushed the dict_improve branch from 9d29c48 to 7cb210f Compare June 17, 2024 21:46

[Stdlib] Speedup Dict (changing modulus to bitshifting)

8f84574

Signed-off-by: rd4com <144297616+rd4com@users.noreply.github.com>

bethebunny approved these changes Jun 17, 2024

modular-automation bot assigned JoeLoser Jun 17, 2024

modularbot added the imported-internally Signals that a given pull request has been imported internally. label Jun 18, 2024

modularbot added the merged-internally Indicates that this pull request has been merged internally label Jun 18, 2024

modularbot added the merged-externally Merged externally in public mojo repo label Jun 19, 2024

modularbot closed this Jun 19, 2024
