Codecov Report
@@            Coverage Diff             @@
##             main       #7      +/-   ##
==========================================
- Coverage   98.62%   98.50%   -0.13%
==========================================
  Files           7        7
  Lines         364      800     +436
==========================================
+ Hits          359      788     +429
- Misses          5       12       +7
Following discussion in https://julialang.zulipchat.com/#narrow/stream/137791-general/topic/VectorizedStatistics.20.2F.20dealing.20with.20high-dimensional.20arrays, the default syntax is now
We should talk about size thresholding at some point. LoopVectorization already does this (and uses cost modeling to try to guess good cutoffs).
Oh cool! Yeah, let me know what I can do on that front |
This is an attempt at adding multithreaded vt options for most functions. I'm not sure if this is an option we want to add, but it's easy enough to add by swapping in @tturbos here and there. On my 4-core AVX2 laptop, I generally see net speedups (over the single-threaded vectorized version) for arrays with about 10k elements or more:
100 elements (not faster yet)
10,000 elements (slight speedup)
10^6 elements (significant speedup)
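
For reference, here is a rough sketch of what size thresholding could look like, given the roughly 10k-element crossover noted above. This is not the code in this PR: it uses plain `Base.Threads` instead of `@turbo`/`@tturbo` so it runs without LoopVectorization, and the names `sum_serial`, `sum_threaded`, `mean_auto`, and `THREAD_THRESHOLD` are all hypothetical. The cutoff value is an assumption taken from the laptop timings above and would likely need tuning per machine.

```julia
using Base.Threads

# Hypothetical cutoff (assumption): loosely based on the ~10k-element
# crossover reported above; not a value taken from any library.
const THREAD_THRESHOLD = 10_000

# Single-threaded sum with SIMD hints.
function sum_serial(x::AbstractVector{T}) where {T<:Number}
    s = zero(T)
    @inbounds @simd for i in eachindex(x)
        s += x[i]
    end
    return s
end

# Threaded sum: split the indices into one chunk per thread, sum each
# chunk in its own task, then combine the partial sums.
function sum_threaded(x::AbstractVector{T}) where {T<:Number}
    isempty(x) && return zero(T)
    chunklen = max(1, cld(length(x), nthreads()))
    chunks = Iterators.partition(eachindex(x), chunklen)
    tasks = [Threads.@spawn sum_serial(view(x, r)) for r in chunks]
    return sum(fetch, tasks)
end

# Pick the threaded path only when the array is large enough that the
# task overhead is likely to pay off.
function mean_auto(x::AbstractVector{<:Number}; threshold::Integer = THREAD_THRESHOLD)
    s = length(x) >= threshold ? sum_threaded(x) : sum_serial(x)
    return s / length(x)
end
```

With this shape, `mean_auto(rand(100))` stays on the single-threaded path while `mean_auto(rand(10^6))` goes through the threaded one. A fixed cutoff is the simplest option; a cost model along the lines of what LoopVectorization does would presumably give better cutoffs across machines.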