@garrettwrong
Collaborator

Stashing WIP before vacation. Still needs a lot of cleanup and testing.

@garrettwrong garrettwrong added enhancement New feature or request cleanup Optimization Performance or Resource Optimization labels Dec 18, 2024
@garrettwrong garrettwrong self-assigned this Dec 18, 2024
@codecov
codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 97.72727% with 2 lines in your changes missing coverage. Please review.

Project coverage is 90.63%. Comparing base (a94bf0d) to head (dea12ca).
Report is 14 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/aspire/classification/averager2d.py | 97.56% | 2 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1216      +/-   ##
===========================================
- Coverage    90.66%   90.63%   -0.04%     
===========================================
  Files          132      132              
  Lines        13707    13702       -5     
===========================================
- Hits         12428    12419       -9     
- Misses        1279     1283       +4     


@garrettwrong garrettwrong requested a review from j-c-c January 3, 2025 19:01
@garrettwrong
Collaborator Author

Passing along for initial review. I believe #1214 should go in first, then this will need to be rebased, maybe resolving some conflicts along the way.

j-c-c
j-c-c previously approved these changes Jan 13, 2025
Collaborator

@j-c-c j-c-c left a comment

Looks good!

@garrettwrong garrettwrong marked this pull request as ready for review January 13, 2025 15:43
@garrettwrong garrettwrong requested a review from janden as a code owner January 13, 2025 15:43
Collaborator

@janden janden left a comment

Looks great! Just two things.

# for the argmax alignment test.
base_img = _coef[0].reshape(self.alignment_basis.complex_count, 1)

# (cnt, n_transl) * (cnt, 1) -> (cnt, n_transl)
Collaborator

(cnt, n_rot) * (cnt, 1) -> (cnt, n_rot)?

Collaborator Author

Sure :), changed.
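
For reference, the corrected shape comment matches ordinary NumPy broadcasting: a `(cnt, 1)` column is stretched across the second axis of a `(cnt, n_rot)` array. A minimal sketch, assuming a per-coefficient table of rotation phases; `rot_table` and the sizes are illustrative names and values, not taken from the PR:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
cnt, n_rot = 6, 4
rng = np.random.default_rng(0)

# Assumed stand-in for a rotation table: one unit-modulus phase
# per (coefficient, rotation) pair.
rot_table = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(cnt, n_rot)))

# Complex coefficients of a single base image, reshaped to a column.
base_img = (rng.standard_normal(cnt) + 1j * rng.standard_normal(cnt)).reshape(cnt, 1)

# (cnt, n_rot) * (cnt, 1) -> (cnt, n_rot)
rotated = rot_table * base_img
print(rotated.shape)
```

Each column of `rotated` is the base image coefficient vector multiplied elementwise by the phases for one rotation.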

@garrettwrong
Collaborator Author

I want to discuss/confirm the class hierarchy during our meeting, and if no changes come out of that, I'll merge this then.

@garrettwrong
Collaborator Author

Factored the base image rotation table outside the shift loop like we discussed in the meeting.

@janden
Collaborator

janden commented Jan 16, 2025

> Factored the base image rotation table outside the shift loop like we discussed in the meeting.

Cool. Any speedup?

@garrettwrong
Collaborator Author

> > Factored the base image rotation table outside the shift loop like we discussed in the meeting.
>
> Cool. Any speedup?

Yes. In GPU mode with 179px images and 50 nbrs, this saves about 0.3 s per class (44.5 s vs 44.8 s). I'll call it 1%. I think the vector-vector broadcast multiplication is very fast on the GPU relative to the matmuls. On the host the improvement is much larger, more like 10%.

Either way, across 3-100k classes I'll definitely take it :). Thanks.
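
The refactor described above is a loop-invariant hoist: the rotated base image does not depend on the shift, so it can be computed once before the shift loop. A rough sketch of the idea, not the PR's actual code; the function names, array layouts, and correlation score are assumptions made for illustration:

```python
import numpy as np


def align_naive(base, shifted_stack, rot_table):
    """Recompute the rotated base image inside the shift loop."""
    best = []
    for shifted in shifted_stack:  # shifted: (n_nbr, cnt)
        # (cnt, n_rot) * (cnt, 1) -> (cnt, n_rot), redone every iteration
        rotated_base = rot_table * base.reshape(-1, 1)
        # (n_nbr, cnt) @ (cnt, n_rot) -> (n_nbr, n_rot)
        scores = np.real(shifted.conj() @ rotated_base)
        best.append(scores.max(axis=1))
    return np.array(best)  # (n_shift, n_nbr)


def align_hoisted(base, shifted_stack, rot_table):
    """Compute the rotated base image once, outside the shift loop."""
    rotated_base = rot_table * base.reshape(-1, 1)  # (cnt, n_rot), shift-independent
    best = []
    for shifted in shifted_stack:
        scores = np.real(shifted.conj() @ rotated_base)
        best.append(scores.max(axis=1))
    return np.array(best)
```

Both versions return the same scores; the hoisted one just skips one broadcast multiply per shift. That would be consistent with the small GPU gain reported above, since the matmul inside the loop is unchanged and likely dominates there.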

@garrettwrong garrettwrong merged commit 1860ef2 into develop Jan 17, 2025
35 checks passed
@garrettwrong garrettwrong deleted the batch_class_avg branch January 17, 2025 13:24