Skip to content

Improve byte conversion speeds#12

Merged
marcelm merged 10 commits into
marcelm:mainfrom
rhpvorderman:fasterbytes
Sep 15, 2021
Merged

Improve byte conversion speeds#12
marcelm merged 10 commits into
marcelm:mainfrom
rhpvorderman:fasterbytes

Conversation

@rhpvorderman
Copy link
Copy Markdown
Collaborator

Hi. Using .encode('ascii') is a nice way to retrieve bytes, but a dedicated cython method that does the same is two times faster. This leads to noticable results on my fastq filter project:

before:

Benchmark #1: fastq-filter -o /dev/null "median_quality:20" ~/test/500000reads.fastq
  Time (mean ± σ):     608.5 ms ±   9.6 ms    [User: 585.9 ms, System: 22.4 ms]
  Range (min … max):   596.4 ms … 621.6 ms    10 runs

After:

Benchmark #1: fastq-filter -o /dev/null "median_quality:20" ~/test/500000reads.fastq
  Time (mean ± σ):     568.6 ms ±  11.7 ms    [User: 547.4 ms, System: 21.2 ms]
  Range (min … max):   550.0 ms … 583.2 ms    10 runs

That is quite a substantial upgrade in runtime.

Also. I found that writing back to a file was quite slow. So I updated the fastq_bytes method. I tested several methods benchmarked them and then choose the fastest. The process can be seen in the commits.

It turns out that encoding is not necessarily slow, but Python's methods of joining strings together are. using b"'.join([...]) already decreased the runtime by 30% compared to the + methods. But I decided it was still slow and opted to do raw C-API calls and memcpy. That made it more than 3 times faster.
The result is quite noticable when writing fastq files now. See below, where I benchmarked both read and read_and_write. The write speed is the difference between the two.

Before:

Benchmark #1: python dnaio_read.py ~/test/big2.fastq
  Time (mean ± σ):      2.158 s ±  0.021 s    [User: 2.030 s, System: 0.127 s]
  Range (min … max):    2.135 s …  2.190 s    10 runs

Benchmark #1: python dnaio_read_and_write.py ~/test/big2.fastq /dev/null
  Time (mean ± σ):      4.353 s ±  0.036 s    [User: 4.199 s, System: 0.153 s]
  Range (min … max):    4.306 s …  4.406 s    10 runs

All records are written in about 2.2 seconds.

After

Benchmark #1: python dnaio_read.py ~/test/big2.fastq
  Time (mean ± σ):      2.215 s ±  0.020 s    [User: 2.069 s, System: 0.145 s]
  Range (min … max):    2.175 s …  2.240 s    10 runs

Benchmark #1: python dnaio_read_and_write.py ~/test/big2.fastq /dev/null
  Time (mean ± σ):      3.535 s ±  0.035 s    [User: 3.369 s, System: 0.166 s]
  Range (min … max):    3.486 s …  3.578 s    10 runs

All records are written in about 1.3 seconds.

That is quite a good improvement.

Copy link
Copy Markdown
Owner

@marcelm marcelm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I used your benchmarking script to test another variant that uses f-strings because I recently heard they’re quite fast. So for reference, this version

def fastq_bytes(self):
   return f"@{self.name}\n{self.sequence}\n+\n{self.qualities}\n".encode("ascii")

comes quite close to your improvements (your version: 0.13, f-strings: 0.15). But in this case, giving up readability is fine.

I haven’t quite finished my review, but see some comments already now.

Comment thread src/dnaio/_core.pyx
Comment thread src/dnaio/_core.pyx Outdated
Comment thread src/dnaio/_core.pyx Outdated
Comment thread src/dnaio/_core.pyx Outdated
Comment thread src/dnaio/_core.pyx Outdated
@rhpvorderman
Copy link
Copy Markdown
Collaborator Author

comes quite close to your improvements (your version: 0.13, f-strings: 0.15). But in this case, giving up readability is fine.

Oh, that is interesting. I hadn't considered f-strings because these were actually slower than the original fastq_bytes implementation. But that was using pure python. I can imagine that Cython converts it automatically to some code that resembles the Python C-API calling code that I made. I think the hand-crafted code is faster because the '\n', '@' and '+' characters are not represented in a python string but are directly added in memory as a raw integer, thus saving some overhead.

@marcelm
Copy link
Copy Markdown
Owner

marcelm commented Sep 15, 2021

Cython converts it automatically to some code that resembles the Python C-API calling code

Exactly, it creates a temporary tuple, fills it one by one with the properly formatted parts of the f-string and then runs PyUnicode_Join to create the final result.

Comment thread src/dnaio/_core.pyx Outdated
Comment thread src/dnaio/_core.pyx Outdated
@rhpvorderman
Copy link
Copy Markdown
Collaborator Author

@marcelm. I adressed the second review comments.

@marcelm
Copy link
Copy Markdown
Owner

marcelm commented Sep 15, 2021

Perfect, thanks, this is a nice improvement!

@marcelm marcelm merged commit ef8f341 into marcelm:main Sep 15, 2021
@rhpvorderman rhpvorderman deleted the fasterbytes branch September 17, 2021 07:40
@rhpvorderman
Copy link
Copy Markdown
Collaborator Author

Thanks for merging! Always a pleasure to work with you!

@rhpvorderman
Copy link
Copy Markdown
Collaborator Author

Sorry for bothering you, but when is a new release planned? I would like fastq-filter to depend on the next version of dnaio so I can use the qualities_as_bytes method. Thanks!

@marcelm
Copy link
Copy Markdown
Owner

marcelm commented Sep 28, 2021

Doing it now

@marcelm
Copy link
Copy Markdown
Owner

marcelm commented Sep 28, 2021

Version 0.6.0 is now on PyPI

@rhpvorderman
Copy link
Copy Markdown
Collaborator Author

Thank you for your quick response!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants