bpo-46848: Optimize mmap.find() with memmem(3) #31554

Closed
wants to merge 4 commits into from

@rumpelsepp rumpelsepp commented Feb 24, 2022

This pull request optimizes mmap.find() by using libc's memmem(3) function (which is optimized) instead of a naive for loop. I have not yet found an equivalent solution for mmap.rfind().
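The kind of byte-by-byte scan being replaced can be modelled in pure Python. This is an illustrative sketch only: `naive_find` is a hypothetical name, it is not the C code from CPython's mmap module, and it assumes non-negative `start` and `end` values.

```python
from typing import Optional


def naive_find(data: bytes, sub: bytes, start: int = 0,
               end: Optional[int] = None) -> int:
    """Return the lowest index of sub within data[start:end], or -1.

    Hypothetical model of a naive search: try every candidate position
    and compare the needle one byte at a time.
    """
    if end is None:
        end = len(data)
    for pos in range(start, end - len(sub) + 1):
        for i in range(len(sub)):
            if data[pos + i] != sub[i]:
                break  # mismatch, move on to the next candidate position
        else:
            return pos  # every byte of the needle matched
    return -1
```

For a needle of length m and a haystack of length n this does up to n*m byte comparisons, which is the cost an optimized routine such as memmem(3) is meant to reduce.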

The following snippet runs about twice as fast with this patch applied (3.12 s versus 6.47 s wall-clock time). The file used is a 2.3 GB log file.

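import mmap
import sys

# Scan the whole file for newlines, one mmap.find() call per line.
with open(sys.argv[1], 'r+b') as file_:
    with mmap.mmap(file_.fileno(), 0, access=mmap.ACCESS_READ) as file:
        cur_offset = 0
        while True:
            cur_offset = file.find(b"\n", cur_offset)
            if cur_offset == -1:
                break
            cur_offset += 1  # resume just past the newline that was found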

With this patch applied:

rumpelsepp@kronos ~/P/v/c/debug (main)> time ./python test.py /tmp/log

________________________________________________________
Executed in    3,12 secs    fish           external
   usr time    2,94 secs    1,34 millis    2,94 secs
   sys time    0,15 secs    0,00 millis    0,15 secs

Without this patch:

rumpelsepp@kronos ~/P/v/c/debug (mmap)> time ./python test.py /tmp/log

________________________________________________________
Executed in    6,47 secs    fish           external
   usr time    6,53 secs  248,91 millis    6,29 secs
   sys time    0,18 secs   32,26 millis    0,15 secs
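
A scaled-down version of this measurement can be run without a multi-gigabyte log file. The sketch below is an assumption-laden stand-in for the benchmark above: `make_sample_log` and `count_newlines` are hypothetical helpers, the generated file is small, and the absolute timings will not match the numbers reported here.

```python
import mmap
import os
import tempfile
import time


def make_sample_log(lines: int = 100_000) -> str:
    """Write a temporary file with the given number of lines; return its path."""
    with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as tmp:
        for i in range(lines):
            tmp.write(b"log line %d\n" % i)
        return tmp.name


def count_newlines(path: str) -> int:
    """Count newline bytes with repeated mmap.find() calls."""
    count = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = mm.find(b"\n")
            while offset != -1:
                count += 1
                offset = mm.find(b"\n", offset + 1)
    return count


if __name__ == "__main__":
    sample = make_sample_log()
    begin = time.perf_counter()
    found = count_newlines(sample)
    elapsed = time.perf_counter() - begin
    print(f"{found} newlines in {elapsed:.3f} s")
    os.remove(sample)
```

Running the same script under an interpreter built with and without the patch gives an in-process comparison that excludes interpreter startup time.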

https://bugs.python.org/issue46848

@rumpelsepp
Contributor Author

Will fix the failing tests ASAP.

@rumpelsepp rumpelsepp force-pushed the mmap branch 4 times, most recently from 2fd87a0 to 7f81d2d Compare February 24, 2022 15:14
@sweeneyde
Member

Another thing you could try out and compare is using the fastsearch functions defined in Objects/stringlib, which should generally be faster than that naive loop and are also platform-independent.
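
One rough way to get a feel for this from Python: to my knowledge, CPython's `bytes.find` is built on the stringlib search code, so searching a bytes copy of the mapping can serve as a reference for both results and speed. The sketch below only checks that the results agree; `build_map` and `find_via_bytes` are hypothetical helpers, and copying the mapping is only sensible for small inputs.

```python
import mmap


def build_map(payload: bytes) -> mmap.mmap:
    """Create an anonymous mapping holding payload (the caller closes it)."""
    mm = mmap.mmap(-1, len(payload))
    mm.write(payload)
    mm.seek(0)
    return mm


def find_via_bytes(mm: mmap.mmap, sub: bytes, start: int = 0) -> int:
    """Search a bytes copy of the mapping using bytes.find()."""
    return mm[:].find(sub, start)


if __name__ == "__main__":
    mapping = build_map(b"alpha\nbeta\ngamma\n")
    # Both searches should report the same offset for the same needle.
    print(mapping.find(b"gamma"), find_via_bytes(mapping, b"gamma"))
    mapping.close()
```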

start_p = self->data + start;
end_p = self->data + end;

#ifdef UNIX
Member

The function is not specified in the POSIX standard. Please add a feature check to configure.ac and check for HAVE_MEMMEM.
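
The actual check belongs in configure.ac at build time, which is not reproduced here. As a loose runtime analogue, one can probe from Python whether the C library of the running process exports `memmem`. This is a sketch under assumptions: `libc_has_memmem` is a hypothetical helper, and on platforms where the C library cannot be loaded through ctypes it simply reports `False`.

```python
import ctypes
import ctypes.util


def libc_has_memmem() -> bool:
    """Report whether a memmem symbol is visible in the loaded C library."""
    try:
        # find_library may return None; on POSIX, CDLL(None) then exposes
        # the symbols already loaded into the current process.
        libc = ctypes.CDLL(ctypes.util.find_library("c"))
    except (OSError, TypeError):
        return False
    return hasattr(libc, "memmem")


if __name__ == "__main__":
    print("memmem available:", libc_has_memmem())
```

A build-time HAVE_MEMMEM macro serves the same purpose in C: the memmem(3) path is compiled only where the function exists, and the existing loop remains the fallback elsewhere.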

Contributor Author

Done.

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@rumpelsepp rumpelsepp force-pushed the mmap branch 2 times, most recently from e378cef to 6fb16fd Compare February 28, 2022 23:43
@rumpelsepp
Contributor Author

#31625 provides a platform-independent proposal.

@rumpelsepp rumpelsepp closed this Mar 1, 2022