Slicing for std calculation #110

McHaillet · 2024-02-19T10:47:33Z

This PR improves the calculation of the standard deviation over a template matching search. It closes #97.

First, some background on why I updated this. Template matching has a huge search space (N_voxels * rotations) that consists mainly of false positives, with only a tiny fraction of true positives. If we model the background as a Gaussian (with expected mean 0 and some standard deviation), the false alarm rate for a given cut-off value can be calculated, and the rate we want to target depends on the size of the search space. For example, a false alarm rate of (N_voxels * rotations)^(-1) means we expect 1 false positive in the whole search. This can be calculated with the complementary error function:

$$N^{-1} = \frac{1}{2} \operatorname{erfc}\left( \frac{\theta}{\sigma \sqrt{2}} \right)$$

where $\theta$ is the cut-off, $\sigma$ the standard deviation of the Gaussian, and $N$ the size of the search space.
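As a rough illustration of this relation, the cut-off can be obtained by inverting the Gaussian tail probability. The sketch below uses only Python's standard library (`statistics.NormalDist` and `math.erfc`); the function name `cutoff_for_one_false_positive` and the example numbers are made up for illustration and are not taken from pytom_tm.

```python
import math
from statistics import NormalDist


def cutoff_for_one_false_positive(search_space: int, sigma: float) -> float:
    """Return theta such that erfc(theta / (sigma * sqrt(2))) / 2 == 1 / search_space.

    Assumes a zero-mean Gaussian background with standard deviation sigma.
    """
    # erfc(x / (sigma * sqrt(2))) / 2 is the upper tail P(X > x); by symmetry
    # it equals the lower tail at -x, which avoids evaluating 1 - 1/N.
    return -NormalDist(mu=0.0, sigma=sigma).inv_cdf(1.0 / search_space)


# Hypothetical numbers: 500^3 voxels, 10,000 rotations, background std of 0.02.
n_search = 500**3 * 10_000
sigma = 0.02
theta = cutoff_for_one_false_positive(n_search, sigma)

# Plug theta back into the formula to recover the false alarm rate.
false_alarm_rate = math.erfc(theta / (sigma * math.sqrt(2))) / 2
expected_false_positives = false_alarm_rate * n_search
```

With these inputs `expected_false_positives` comes out very close to 1, which is the "1 false positive in the whole search" case described above. An error in sigma shifts the cut-off proportionally, which is why the standard deviation estimate matters.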

The search space is easily calculated, and the standard deviation can be tracked for each orientation (which is what the custom square_sum_kernel is used for). However, there were two problems:

- With next_fast_len for FFTs, the volume was padded with zeros, and these zeros were also included in the square_sum_kernel.
- With subvolume splitting, I added overhang between subvolumes, which is needed for accurate scores, but these regions were counted twice in the standard deviation calculation.

I fixed this by passing a search_volume_roi to template matching, which contains the actual region of interest without FFT padding and without template overhang. The indexing was already calculated, so I just created a slicing for the cupy array.
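To make the double-counting concrete, here is a minimal 1D sketch in plain Python. It is a stand-in only: the PR slices a 3D cupy array, and the names `square_sum`, `left_roi`, and `right_roi`, as well as the numbers, are hypothetical.

```python
# Hypothetical 1D stand-in for a score volume; the PR works on 3D cupy arrays.
scores = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 1.0, -2.0]
overhang = 2  # assumed overlap on each side of the split
split = 4     # index where the volume is split into two subvolumes


def square_sum(values, roi=slice(None)):
    """Sum of squares over values, restricted to the given slice."""
    return sum(v * v for v in values[roi])


# Each subvolume carries extra overhang so scores near the split stay accurate.
left = scores[: split + overhang]
right = scores[split - overhang:]

# Region of interest per subvolume: the part it owns, without overhang.
left_roi = slice(0, split)
right_roi = slice(overhang, None)

full = square_sum(scores)
naive = square_sum(left) + square_sum(right)
sliced = square_sum(left, left_roi) + square_sum(right, right_roi)

n_naive = len(left) + len(right)
n_sliced = len(left[left_roi]) + len(right[right_roi])
```

Here `sliced` matches `full` and `n_sliced` matches `len(scores)`, while `naive` and `n_naive` are inflated by the overlapping region. Zero padding behaves similarly for the element count: it adds nothing to the square sum but enlarges the number of elements the standard deviation is normalised by.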

I first added tests to check whether the search_space and std are consistent under subvolume splitting (and also rotation splitting), and realised there was a bug in tmjob.py in calculating the start of the subvolume (lines 380 to 384): I did not put brackets around template_size // 2, which changes the result of the integer division when there is a minus sign in front 😩. This is now fixed.
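The general pitfall can be reproduced in a few lines. This is only an illustration of Python's operator precedence and floor division; the values are hypothetical and the actual expression in tmjob.py is not reproduced here.

```python
# Hypothetical odd template size.
template_size = 5

# Unary minus binds tighter than //, so this is (-5) // 2, and floor
# division rounds towards negative infinity.
without_brackets = -template_size // 2

# With brackets the half-size is computed first and negated afterwards.
with_brackets = -(template_size // 2)

print(without_brackets, with_brackets)  # -3 -2
```

For even sizes both forms give the same value, so an off-by-one of this kind only shows up with odd template sizes.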

I also added plotting of an extraction graph, which shows a histogram of the extracted scores together with the background Gaussian (as this is now properly estimated). I think this is nice for users to see.

Let me know if anything is unclear.

sroet

1 suggestion and 1 line was added twice, LGTM otherwise

src/pytom_tm/matching.py

tests/test_tmjob.py

sroet

(selected the wrong option)

Co-authored-by: Sander Roet <sanderroet@hotmail.com>

sroet

LGTM, feel free to merge

McHaillet added 16 commits February 14, 2024 16:07

move splitting with offset to new test as it does not needs any of the results; add failing test for statistics of split jobs that should work

2430be8

put slicing for std calculation in

cf90508

add optional search region of interest for calculating search statistics

7bdbecc

dont use wrong API

3caf758

add extraction statistics plot

f82827b

spelling error in color for plot

2ed2308

explicitly set bottom y limit

07309cd

tweak plot params

26a8c16

adjusted figure labels"

f712e0a

remove printing of search slices

10a392f

use search space and std parameters from job consistently

d87f30f

fixed subvolume indexing to obtain correct search_space

11fa74c

add stats test for split rotation search

e3afe0d

remove commented old code

baa5a13

add comment for Gaussian calculation

b8bb52b

revert to previous sub_step

4a88af6

McHaillet requested a review from sroet February 19, 2024 11:17

sroet approved these changes Feb 19, 2024

src/pytom_tm/matching.py

tests/test_tmjob.py

sroet requested changes Feb 19, 2024

Apply suggestions from code review

aa9d4e1

Co-authored-by: Sander Roet <sanderroet@hotmail.com>

McHaillet requested a review from sroet February 19, 2024 12:23

sroet approved these changes Feb 19, 2024

McHaillet merged commit 9fd8f28 into SBC-Utrecht:main Feb 19, 2024

McHaillet deleted the slicing-for-std-calculation branch February 19, 2024 12:59
