Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SIMD Support for ARM #7

Closed
wants to merge 9 commits into from
Closed

Add SIMD Support for ARM #7

wants to merge 9 commits into from

Conversation

cauliyang
Copy link
Contributor

@cauliyang cauliyang commented Mar 2, 2023

  • Support ARM
  • Refactor code
  • Make simd a feature
  • Add benchmarking result
    - [ ] Add python binding
    - [ ] Add CI

We can merge this branch with the master. The user has the option to utilize coitrees = {version ="*", default-features = false, features = ["neon"]} or coitrees = {version ="*", default-features = false, features = ["avx"]} to use SIMD on different platforms. Otherwise, the user may use the default implementation. If you wish to try this version earlier, please use coitrees = { git = "https://github.com/cauliyang/coitrees", branch = "avx", default-features = false, features = ["neon"]}

Note

Currently, all tests and examples are passed. However, I will add more tests later.

To evaluate its performance, I used two genomic interval datasets A and B from ga4gh.
Datasets A and B contain 4816112 and 44426501 genomic intervals, respectively.
I compared the performance with and without the neon as well as existing tools including bedtools (v2.30.0), ailist (v0.1.1), and bedtk (v0.0-r25-dirty).
I query every genomic interval of dataset A on dataset B.

T1: Runtime of tools on the sorted dataset

Command Mean [s] Min [s] Max [s] Relative
coitree-neon 3.53 3.50 3.57 1.00
coitree-default 4.36 4.28 4.44 1.24
ailist 5.63 5.54 5.87 1.59
bedtk cov 4.70 4.67 4.75 1.33
bedtools coverage -counts -sorted 13.48 13.40 13.64 3.82

T2: Runtime of tools on the unsorted dataset

Command Mean [s] Min [s] Max [s] Relative
coitree-neon 5.46 5.43 5.50 1.00
coitree-default 6.41 6.35 6.44 1.17
ailist 6.49 6.48 6.49 1.19
bedtk cov 7.11 6.97 7.18 1.30
bedtools coverage -counts 256.08 244.48 276.65 46.88

All experiments were run on a computer with macOS 12.6.2 21G320 arm64 and 32 GB of memory. hyperfine is the benchmarking tool.

@dcjones

@cauliyang cauliyang mentioned this pull request Mar 2, 2023
@jianshu93
Copy link

This is amazing! Really fast and efficient. I will test it recently and get back.

Thanks,

Jianshu

@dcjones
Copy link
Owner

dcjones commented Mar 14, 2023

Wow, great work! Sorry I didn't notice this earlier.

I'll get access to an arm machine to test this myself, and go through the code more carefully, but otherwise is it ready to merge? Python bindings at least seem like something that could be a separate PR.

@cauliyang
Copy link
Contributor Author

cauliyang commented Mar 14, 2023

Hi @dcjones,

Thank you for your kind words! I'm glad to hear that. I will double-check the code as well and add more tests tonight since it's always a good idea to test it thoroughly before merging. As for the Python bindings, you're right that they could be a separate PR, and I will continue to work on that. Let me know if you have any further questions or concerns. Thanks for your time and review!

Yangyang

@cauliyang
Copy link
Contributor Author

cauliyang commented Jul 20, 2023

Hi @dcjones, I plan to move forward with the PR recently.

current design:

  1. Use simd as the default feature (e.g. coitree = "*".
  2. provide nosimd feature to use the default implementation as a backup. (e.g. coitree = { default-features = false, features = ["nosimd"]}

In addition,

I have added the GitHub actions as CI (but they still need to be tested and bugs fixed). I will add more tests to verify the neon implementation.

We may need to consider the Windows platform as well.

After finishing the PR, I will try to provide Python APIs in another PR.

Overall, What do you think?

@dcjones
Copy link
Owner

dcjones commented Jul 28, 2023

Hi @cauliyang,

Sorry, I didn't mean to ignore this. I think this is really cool and want to merge it, I just haven't found the time to fully review. Hopefully soon!

Python bindings and enabling CI are both good ideas, but pretty orthogonal to the reorganization and ARM support you've done. It may be better if python bindings are a separate repo. Nevertheless, I don't think either should be part of this PR, or delay merging it.

@cauliyang
Copy link
Contributor Author

cauliyang commented Jul 28, 2023

Hi @dcjones, That does not matter at all. I have procrastinated on the PR for a long time :).

I totally agree with you, and I just listed my ideas. We should separate the CI and API as independent PRs or repos.

After we make it stable, we can conduct the benchmark comprehensively. Then, port the project to Python. That will let more people reach and benefit from the project.

Another thing we need to pay attention to is that there is a little difference between the API of "default (no SIMD) and SIMD.
It will be better if we can make them consistent.

@cauliyang cauliyang closed this Aug 19, 2023
@cauliyang cauliyang mentioned this pull request Aug 19, 2023
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants