Pattern Matching: Compare experimental with simulated data #233

Merged
merged 59 commits into from Oct 22, 2020


onatlandsmyr
Collaborator

@onatlandsmyr onatlandsmyr commented Oct 16, 2020

Matching diffraction patterns

Compare experimental and simulated patterns and return the keep_n best matches, sorted. The match result consists of simulation indices and metric results, each with shape (nx*ny, keep_n).
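To make the shape of the match result concrete, here is a minimal NumPy-only sketch of ranking simulated patterns with a zero-normalized cross-correlation (ZNCC). It is an illustration under assumptions, not the code in this PR: `zncc_top_matches` is a hypothetical helper, and the patterns are flattened to 2D arrays before comparison.

```python
import numpy as np


def zncc_top_matches(experimental, simulated, keep_n=1):
    """Hypothetical helper, not part of kikuchipy.

    experimental: (n_patterns, sy, sx), simulated: (n_sim, sy, sx).
    Returns (indices, scores), both of shape (n_patterns, keep_n),
    sorted from best to worst match.
    """
    # Flatten each pattern to a row; astype makes a float copy we can modify
    expt = experimental.reshape(experimental.shape[0], -1).astype(np.float64)
    sim = simulated.reshape(simulated.shape[0], -1).astype(np.float64)

    # Subtract the mean of each pattern, then scale it to unit length
    expt -= expt.mean(axis=1, keepdims=True)
    sim -= sim.mean(axis=1, keepdims=True)
    expt /= np.sqrt((expt ** 2).sum(axis=1, keepdims=True))
    sim /= np.sqrt((sim ** 2).sum(axis=1, keepdims=True))

    scores = expt @ sim.T  # shape (n_patterns, n_sim)
    order = np.argsort(-scores, axis=1)[:, :keep_n]  # highest score first
    return order, np.take_along_axis(scores, order, axis=1)


rng = np.random.default_rng(0)
simulated = rng.random((50, 6, 6))
# Experimental patterns: three dictionary entries with a different gain and offset
experimental = 2.0 * simulated[[7, 21, 42]] + 1.0

indices, scores = zncc_top_matches(experimental, simulated, keep_n=5)
print(indices.shape, scores.shape)  # (3, 5) (3, 5)
print(indices[:, 0].tolist())  # [7, 21, 42]
```

Because ZNCC is invariant to gain and offset, each experimental pattern finds its source entry as the best match with a score of 1.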

The functions use dask functionality but accept both dask and numpy arrays as inputs. It is possible to divide the computation by slicing up simulations regardless of inputs being dask or numpy arrays. This is achieved by passing n_slices as keyword to pattern_match.
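The idea behind slicing can be sketched without dask: score one slice of the simulations at a time, merge it with the best candidates found so far, and keep only keep_n of them. This is a sketch of the general technique, not the implementation behind n_slices in this PR; `dot_scores` and `top_matches_sliced` are hypothetical names, and a plain dot product stands in for the real metric.

```python
import numpy as np


def dot_scores(experimental, simulated):
    """Stand-in metric (plain dot product); higher is better."""
    return experimental @ simulated.T


def top_matches_sliced(experimental, simulated, keep_n, n_slices):
    """Hypothetical helper: top keep_n matches, scoring the dictionary slice by slice."""
    n_expt = experimental.shape[0]
    best_idx = np.empty((n_expt, 0), dtype=np.int64)
    best_score = np.empty((n_expt, 0))
    start = 0
    for chunk in np.array_split(simulated, n_slices):
        scores = dot_scores(experimental, chunk)
        # Global dictionary indices of the patterns in this slice
        idx = np.broadcast_to(
            np.arange(start, start + chunk.shape[0]), scores.shape
        )
        start += chunk.shape[0]
        # Merge the running best with this slice, then keep keep_n candidates
        all_idx = np.concatenate([best_idx, idx], axis=1)
        all_score = np.concatenate([best_score, scores], axis=1)
        order = np.argsort(-all_score, axis=1, kind="stable")[:, :keep_n]
        best_idx = np.take_along_axis(all_idx, order, axis=1)
        best_score = np.take_along_axis(all_score, order, axis=1)
    return best_idx, best_score


rng = np.random.default_rng(1)
experimental = rng.random((4, 16))
simulated = rng.random((103, 16))

sliced_idx, sliced_score = top_matches_sliced(
    experimental, simulated, keep_n=10, n_slices=7
)
full_idx, full_score = top_matches_sliced(
    experimental, simulated, keep_n=10, n_slices=1
)
print(np.array_equal(sliced_idx, full_idx))
```

Only one slice of the score matrix exists at a time, so peak memory scales with the slice size rather than with the full dictionary, while the final ranking matches the unsliced computation.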

Mainly to be used by StaticDictionaryIndexing and DynamicDictionaryIndexing and not directly by the user.

This PR depends on #231.

Progress of the PR

Minimal example of the bug fix or new feature

>>> import kikuchipy as kp
>>> import numpy as np
>>> from kikuchipy.indexing import pattern_match
>>> s = kp.signals.EBSD(np.zeros((10, 10, 10, 10)))
>>> simulated = np.zeros((1000, 10, 10))
>>> simulation_indices, metric_results = pattern_match(s.data, simulated, keep_n=30, metric="zncc")
>>> # Best match per experimental pattern: (100, 10, 10) -> (10, 10, 10, 10)
>>> s_best_match = kp.signals.EBSD(simulated[simulation_indices[:, 0]].reshape(10, 10, 10, 10))

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    the unreleased section in doc/changelog.rst.

onatlandsmyr and others added 30 commits October 12, 2020 23:57
Framework for calculating similarities between 2D gray-tone images of equal size.
length of shape -> ndim, and removed unnecessary squeeze call.
Produce value of 1 with equal pattern and template
Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
Change data shape from (N,nm) to (nm,N) to correspond better with cdist in scipy and general logic.
Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
When compute=False and n_slices is not None.
@hakonanes
Member

I think we should rename this submodule to pattern matching, because many notable resources use "template matching" to describe finding a smaller image template in a larger image.

Further, I think renaming patterns -> experimental and templates -> simulated should be done.

What do you think, @onatlandsmyr?

@onatlandsmyr
Collaborator Author

onatlandsmyr commented Oct 19, 2020

That sounds better and more clear. I guess I should also change patterns and templates throughout SimilarityMetric/#231, for consistency. Pattern matching is probably better since we have an implicit understanding of it being diffraction patterns. Out of context, they both would be the same thing for me. "Simulation matching" may be an alternative, but I'll stick with pattern matching for now.

And more importantly: patterns->experimental, templates->simulated
@onatlandsmyr onatlandsmyr changed the title from "Template Matching: Compare patterns with templates and keep n sorted results" to "Pattern Matching: Compare experimental with simulated data and keep n sorted results" Oct 20, 2020
@onatlandsmyr onatlandsmyr changed the title from "Pattern Matching: Compare experimental with simulated data and keep n sorted results" to "Pattern Matching: Compare experimental with simulated data" Oct 20, 2020
@hakonanes
Member

> That sounds better and more clear. I guess I should also change patterns and templates throughout SimilarityMetric/#231, for consistency. Pattern matching is probably better since we have an implicit understanding of it being diffraction patterns. Out of context, they both would be the same thing for me. "Simulation matching" may be an alternative, but I'll stick with pattern matching for now.

I agree that simulation matching would be equally descriptive. Pattern matching is used in the literature by e.g. Gert Nolze, Aimo Winkelmann, Angus Wilkinson, and others. Therefore I think it is safe to call it that!

@hakonanes
Member

I've closed one review comment and re-commented on the remaining three. I'll go over the PR one more time after these are resolved, and then we'll see if anything remains.

I'll update this branch with master and solve any potential conflicts now.

Also removed match_result tuple
@hakonanes
Member

Great work, @onatlandsmyr!

I'm touching up formatting and API reference now, will then approve and we'll let the checks pass a last time before merging.

@onatlandsmyr
Collaborator Author

Thank you, @hakonanes, for a great review and putting the finishing touch! It's a pleasure contributing to kikuchipy.

Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
@hakonanes hakonanes marked this pull request as ready for review October 22, 2020 14:56
@hakonanes hakonanes self-requested a review October 22, 2020 14:56
@hakonanes
Member

> Thank you, @hakonanes, for a great review and putting the finishing touch! It's a pleasure contributing to kikuchipy.

Thanks, that's nice to hear (:

@hakonanes
Member

I see that two tests where you divide by zero throw warnings:

kikuchipy/indexing/tests/test_pattern_matching.py::TestPatternMatching::test_pattern_match_one_to_one
  /home/hakon/kode/kikuchipy/kikuchipy/indexing/similarity_metrics.py:402: RuntimeWarning: invalid value encountered in true_divide
    expt /= (expt ** 2).sum(axis=expt_sum_axis, keepdims=True) ** 0.5

kikuchipy/indexing/tests/test_pattern_matching.py::TestPatternMatching::test_pattern_match_one_to_one
  /home/hakon/kode/kikuchipy/kikuchipy/indexing/similarity_metrics.py:403: RuntimeWarning: invalid value encountered in true_divide
    sim /= (sim ** 2).sum(axis=sim_sum_axis, keepdims=True) ** 0.5

I would like to be warned when there are zero-intensities in my patterns, so I think this is okay. However, the tests shouldn't throw these I think, so I'll just add a np.errstate catch.
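For reference, this is roughly how `np.errstate` can silence the "invalid value" warning locally while leaving the default behaviour intact elsewhere. It is a standalone sketch: `normalize` is a hypothetical helper that mirrors the normalization line from the log above, and how the context manager is placed in the actual tests is not shown in this thread.

```python
import warnings

import numpy as np


def normalize(patterns):
    """Hypothetical helper: scale each row to unit length (nan for all-zero rows)."""
    patterns = patterns.astype(np.float64)  # float copy, safe to modify in place
    patterns /= (patterns ** 2).sum(axis=1, keepdims=True) ** 0.5
    return patterns


# The second pattern has zero intensity everywhere, so its norm is 0 (0 / 0)
patterns = np.array([[3.0, 4.0], [0.0, 0.0]])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with np.errstate(invalid="ignore"):  # suppress only inside this block
        quiet = normalize(patterns)
n_quiet = len(caught)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with np.errstate(invalid="warn"):  # NumPy's default behaviour
        noisy = normalize(patterns)
n_noisy = len(caught)

print(quiet)
print(n_quiet, n_noisy)
```

The result still contains nan for the zero-intensity pattern in both cases; `np.errstate` only controls whether the RuntimeWarning is emitted, so users calling the function outside the context manager are still warned.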

Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
@hakonanes hakonanes merged commit 191b06b into pyxem:master Oct 22, 2020
Labels
enhancement New feature or request