Performance improvements, torch.compile() and benchmark #37

Merged 31 commits on Aug 31, 2023

Phil26AT (Collaborator) commented Jul 27, 2023

  • Add a benchmark script that runs LightGlue on the example images.
  • Prefer PyTorch's built-in scaled dot-product attention (torch SDP) over the official flash_attn package.
  • Add heuristics that disable pruning on pairs with few keypoints, to avoid its overhead.
  • Many other performance improvements.
  • Add support for torch.compile (JIT) with static shapes; adaptive depth is still supported. In compile mode, inputs are automatically padded to static shapes. This yields very large speedups on pairs with few keypoints.
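
The padding idea from the last bullet can be sketched as follows. This is a minimal illustration, not the code from this PR: the bucket sizes in `STATIC_LENGTHS` and the helper names `pad_to_static` and `masked_self_attention` are hypothetical. It assumes PyTorch 2.0 or newer, where `torch.nn.functional.scaled_dot_product_attention` is available. Descriptors are padded to the next fixed length so that a compiled model sees only a few distinct shapes, and a boolean mask keeps the padded entries out of attention.

```python
import torch
import torch.nn.functional as F

# Hypothetical bucket sizes; the PR does not state which lengths it pads to.
STATIC_LENGTHS = (256, 512, 1024, 2048)


def pad_to_static(desc: torch.Tensor, lengths=STATIC_LENGTHS):
    """Pad (B, N, D) descriptors to the next bucket length M.

    Returns the padded (B, M, D) tensor and a (B, M) boolean mask that is
    True for real keypoints and False for padding.
    """
    n = desc.shape[1]
    # Inputs larger than every bucket are left at their original length.
    target = next((m for m in lengths if m >= n), n)
    # F.pad pads from the last dimension backwards: (D_left, D_right, N_left, N_right).
    padded = F.pad(desc, (0, 0, 0, target - n))
    mask = torch.arange(target, device=desc.device) < n
    return padded, mask.expand(desc.shape[0], -1)


def masked_self_attention(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Single-head self-attention over (B, M, D) that ignores padded keys."""
    q = k = v = x.unsqueeze(1)          # (B, 1, M, D): add a head dimension
    attn_mask = mask[:, None, None, :]  # (B, 1, 1, M): broadcast over queries
    out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
    return out.squeeze(1)
```

Rows of the output that correspond to padded queries carry no meaning and would be dropped using the same mask. With shapes restricted to a few buckets, wrapping the model in `torch.compile` should trigger recompilation only when a new bucket is first seen.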

RTX 3080:
[Figure: benchmark]

Intel i7 10700K:
[Figure: benchmark_cpu]
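
For readers who want to reproduce timings like these, a timing loop might look like the sketch below. It is not the benchmark script added in this PR; the `benchmark` helper and its `warmup`, `repeats`, and `sync` parameters are assumptions made for illustration. Warm-up runs are discarded so that one-off costs such as compilation do not skew the result.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=10, sync=None):
    """Time fn(*args) and return (median, standard deviation) in milliseconds.

    `sync` is an optional callable that blocks until pending asynchronous
    work has finished; it is called before and after each timed run.
    """
    for _ in range(warmup):
        fn(*args)  # warm-up runs are not timed
    times_ms = []
    for _ in range(repeats):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        times_ms.append((time.perf_counter() - start) * 1e3)
    return statistics.median(times_ms), statistics.stdev(times_ms)
```

On a GPU, passing `sync=torch.cuda.synchronize` makes the timer wait for queued kernels to finish; on CPU, `sync` can be left as `None`.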

Phil26AT requested a review from sarlinpe on Jul 27, 2023, 19:35
Phil26AT mentioned this pull request on Jul 31, 2023
Phil26AT changed the title from "Performance improvements and benchmark" to "Performance improvements, torch.compile() and benchmark" on Jul 31, 2023
Phil26AT merged commit 5a9e87d into main on Aug 31, 2023