Pattern Matching: Compare experimental with simulated data #233

Merged
merged 59 commits into from Oct 22, 2020
1fc5bd8
feat(SimilarityMetric): ZNCC, NDP and user defined metrics
onatlandsmyr Oct 12, 2020
2e8adc9
fix(SimilarityMetric): squeeze dimensions of similarity matrix
onatlandsmyr Oct 12, 2020
9d42d70
Merge branch 'similarity-metrics'
onatlandsmyr Oct 12, 2020
ce40f63
fix(SimiliarityMetric): rechunk dask arrays after type conversion
onatlandsmyr Oct 13, 2020
8d1b68b
Merge branch 'similarity-metrics'
onatlandsmyr Oct 13, 2020
089e0d1
refactor(SimilarityMetric)
onatlandsmyr Oct 13, 2020
83e87c7
test(SimilarityMetric): Many to many ZNCC
onatlandsmyr Oct 13, 2020
0754523
refactor: removed metricscopes including ANY
onatlandsmyr Oct 13, 2020
c1d70db
wip: tests cover most cases
onatlandsmyr Oct 13, 2020
096540e
test: remove print statement
onatlandsmyr Oct 13, 2020
0d5bbb4
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 13, 2020
21c8c63
refactor: indexation -> indexing
onatlandsmyr Oct 13, 2020
405a587
Reformat docstrings, add Ole to credits, add indexing module to doc
hakonanes Oct 13, 2020
23d93d8
refactor: renamed variables and made functions private
onatlandsmyr Oct 13, 2020
aaa0f12
Merge remote-tracking branch 'origin/similarity-metrics' into similar…
onatlandsmyr Oct 13, 2020
e3f3815
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 13, 2020
9eb4310
refactor: underscore prefix
onatlandsmyr Oct 13, 2020
6aa285e
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 13, 2020
c992d0b
refactor: underscore prefix
onatlandsmyr Oct 13, 2020
fbe77be
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 13, 2020
eed1303
Update docstring table, and more
hakonanes Oct 13, 2020
9ed54a2
Merge branch 'similarity-metrics' of github.com:onatlandsmyr/kikuchip…
hakonanes Oct 13, 2020
b6f729f
Add indexing module to kikuchipy/__init__.py
hakonanes Oct 13, 2020
f6388c5
Add static dictionary indexing note in changelog
hakonanes Oct 13, 2020
be02b8d
fix: change standard shape of similarity matrix
onatlandsmyr Oct 16, 2020
635e24d
test: update tests to new output standard
onatlandsmyr Oct 16, 2020
97ea036
Merge remote-tracking branch 'origin/similarity-metrics' into similar…
onatlandsmyr Oct 16, 2020
c8035c3
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 16, 2020
a9573d8
fix: _is_compatible to be working for all scopes
onatlandsmyr Oct 16, 2020
2555edf
Clarify parameters in docstrings, some minor syntax changes, repr
hakonanes Oct 16, 2020
85aa302
Merge branch 'similarity-metrics' of github.com:onatlandsmyr/kikuchip…
hakonanes Oct 16, 2020
376e3dc
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 16, 2020
746893c
Docstring explanation of _is_compatible()
hakonanes Oct 16, 2020
ac5cc3b
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 16, 2020
dff8456
Return numpy where only numpy arrays are passed, test repr
hakonanes Oct 16, 2020
ed6104a
Improve return numpy/dask test
hakonanes Oct 16, 2020
182f3a0
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 16, 2020
a0725e9
feat(template_match): Keep n largest metric results and template indices
onatlandsmyr Oct 16, 2020
6402e33
tests: simple tests covering all lines
onatlandsmyr Oct 16, 2020
3f3e7eb
refactor: move code to give user more specific error
onatlandsmyr Oct 16, 2020
869746d
Merge remote-tracking branch 'pyxem/master' into similarity-metrics
onatlandsmyr Oct 16, 2020
1ad6d65
Merge branch 'similarity-metrics' into template-matching
onatlandsmyr Oct 16, 2020
fa0e981
refactor: removed comment
onatlandsmyr Oct 16, 2020
4a6feeb
docs: template_match and slicing of templates
onatlandsmyr Oct 17, 2020
8071674
fix: accept all scopes in template_match
onatlandsmyr Oct 17, 2020
3d1fed6
refactor: renamed variables and removed unnecessary code
onatlandsmyr Oct 17, 2020
083e390
refactor: moved code to read metric str earlier
onatlandsmyr Oct 18, 2020
192fbb0
docs: template slices and punctuation marks.
onatlandsmyr Oct 18, 2020
5a7167b
fix: raise NotImplementedError
onatlandsmyr Oct 18, 2020
bb0ad78
refactor: template_match -> pattern_match
onatlandsmyr Oct 19, 2020
08d31be
refactor(SimilarityMetric): patterns->experimental, templates->simulated
onatlandsmyr Oct 20, 2020
ad46855
docs: data -> patterns
onatlandsmyr Oct 20, 2020
754b993
test: slicing and compute=False raise NotImplementedError
onatlandsmyr Oct 20, 2020
8044627
Merge branch 'master' of https://github.com/pyxem/kikuchipy into temp…
hakonanes Oct 22, 2020
90a95dd
refactor: split pattern_match more nicely into two
onatlandsmyr Oct 22, 2020
7f34857
Merge remote-tracking branch 'origin/template-matching' into pattern-…
onatlandsmyr Oct 22, 2020
e58e405
refactor: metric_result -> scores
onatlandsmyr Oct 22, 2020
1af5fdb
Touch up docstring formatting, add to API ref, update changelog
hakonanes Oct 22, 2020
9cf3b47
Update test to not throw runtime warning
hakonanes Oct 22, 2020
7 changes: 4 additions & 3 deletions doc/changelog.rst
@@ -22,9 +22,10 @@ Contributors

Added
-----
-- Indexing of EBSD patterns through template matching of the patterns with a
-  static dictionary of simulated patterns with known orientations.
-  (`#231 <https://github.com/pyxem/kikuchipy/pull/231>`_)
+- Indexing of EBSD patterns through matching of patterns with a static
+  dictionary of simulated patterns with known orientations.
+  (`#231 <https://github.com/pyxem/kikuchipy/pull/231>`_,
+  `#233 <https://github.com/pyxem/kikuchipy/pull/233>`_)
 - Reader for EMsoft's simulated EBSD patterns returned by their ``EMEBSD.f90``
   program. (`#202 <https://github.com/pyxem/kikuchipy/pull/202>`_)
 - Modified Lambert mapping, and its inverse, from points on the unit sphere to a
15 changes: 15 additions & 0 deletions doc/reference.rst
@@ -207,15 +207,30 @@ indexing
.. currentmodule:: kikuchipy.indexing

.. autosummary::
    pattern_matching
    similarity_metrics

pattern_matching
----------------

.. currentmodule:: kikuchipy.indexing.pattern_matching

.. autosummary::
    pattern_match

.. automodule:: kikuchipy.indexing.pattern_matching
    :members:
    :undoc-members:
    :show-inheritance:

similarity_metrics
------------------

.. currentmodule:: kikuchipy.indexing.similarity_metrics

.. autosummary::
    make_similarity_metric
    MetricScope

.. automodule:: kikuchipy.indexing.similarity_metrics
    :members:
8 changes: 7 additions & 1 deletion kikuchipy/indexing/__init__.py
@@ -18,8 +18,14 @@

"""Indexing of EBSD patterns."""

-from kikuchipy.indexing.similarity_metrics import make_similarity_metric
+from kikuchipy.indexing.similarity_metrics import (
+    make_similarity_metric,
+    MetricScope,
+)
+from kikuchipy.indexing.pattern_matching import pattern_match

 __all__ = [
     "make_similarity_metric",
+    "MetricScope",
+    "pattern_match",
 ]
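
For readers new to dictionary indexing, the idea behind the exported names can be sketched in plain NumPy: score every experimental pattern against every simulated pattern with a zero-normalized cross-correlation (ZNCC), then keep the best-scoring simulations. This is only an illustration and not the code added in this PR; `zncc_many_to_many` and `best_matches` are hypothetical helper names, and the patterns are random arrays.

```python
import numpy as np


def zncc_many_to_many(experimental, simulated):
    """ZNCC between every experimental and every simulated pattern.

    Hypothetical helper, not part of kikuchipy. Inputs have shapes
    (n_exp, sy, sx) and (n_sim, sy, sx); the result has shape
    (n_exp, n_sim) with values in [-1, 1]. Constant patterns are not
    handled (they would divide by zero).
    """
    # astype() copies, so the in-place operations below leave the
    # inputs untouched
    exp = experimental.reshape(experimental.shape[0], -1).astype(np.float64)
    sim = simulated.reshape(simulated.shape[0], -1).astype(np.float64)
    # Give each pattern zero mean and unit norm
    exp -= exp.mean(axis=1, keepdims=True)
    sim -= sim.mean(axis=1, keepdims=True)
    exp /= np.linalg.norm(exp, axis=1, keepdims=True)
    sim /= np.linalg.norm(sim, axis=1, keepdims=True)
    return exp @ sim.T


def best_matches(scores, keep_n=1):
    """Indices and scores of the keep_n highest-scoring simulations."""
    keep_n = min(keep_n, scores.shape[1])
    order = np.argsort(-scores, axis=1, kind="stable")[:, :keep_n]
    return order, np.take_along_axis(scores, order, axis=1)


rng = np.random.default_rng(0)
simulated = rng.random((6, 4, 4))
# "Experimental" patterns: dictionary entries 4 and 1 after a linear
# change of intensity, which ZNCC is invariant to
experimental = np.stack([2.0 * simulated[4] + 1.0, 0.5 * simulated[1] - 3.0])

indices, scores = best_matches(
    zncc_many_to_many(experimental, simulated), keep_n=2
)
```

Here the first column of `indices` recovers the dictionary entries the experimental patterns were built from, with scores close to 1.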
270 changes: 270 additions & 0 deletions kikuchipy/indexing/pattern_matching.py
@@ -0,0 +1,270 @@
# -*- coding: utf-8 -*-
# Copyright 2019-2020 The kikuchipy developers
#
# This file is part of kikuchipy.
#
# kikuchipy is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# kikuchipy is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with kikuchipy. If not, see <http://www.gnu.org/licenses/>.

"""Matching of experimental to simulated gray-tone patterns."""

import sys
from typing import Union, Tuple

import dask.array as da
from dask.diagnostics import ProgressBar
import numpy as np

from kikuchipy.indexing.similarity_metrics import (
    SIMILARITY_METRICS,
    SimilarityMetric,
    _get_nav_shape,
    _get_number_of_simulated,
)

# TODO: Support masking signal space
# TODO: Support masking navigation space


def pattern_match(
    experimental: Union[da.Array, np.ndarray],
    simulated: Union[da.Array, np.ndarray],
    keep_n: int = 1,
    metric: Union[str, SimilarityMetric] = "zncc",
    compute: bool = True,
    n_slices: int = 1,
) -> Union[Tuple[np.ndarray, np.ndarray], Tuple[da.Array, da.Array]]:
    """Find the simulated patterns that best match the experimental
    patterns according to the given `metric`.

    Function is primarily for use in
    :class:`~kikuchipy.indexing.StaticDictionaryIndexing` and
    :class:`~kikuchipy.indexing.DynamicDictionaryIndexing`.

    Parameters
    ----------
    experimental : numpy.ndarray or dask.array.Array
        Experimental patterns.
    simulated : numpy.ndarray or dask.array.Array
        Simulated patterns.
    keep_n : int, optional
        Number of match results to keep for each pattern, by default 1.
    metric : str or SimilarityMetric, optional
        Similarity metric, by default "zncc".
    compute : bool, optional
        Whether to compute dask arrays before returning, by default
        True. Must be True if `n_slices` is greater than 1.
    n_slices : int, optional
        Number of simulated slices to process sequentially, by default
        1.

    Returns
    -------
    simulation_indices : numpy.ndarray or dask.array.Array
        Simulation indices corresponding with metric results.
    scores : numpy.ndarray or dask.array.Array
        Metric results with shape (ny*nx, keep_n). Sorted along the
        `keep_n` axis according to the metric used.
    """
    # Look up a metric given by name; other objects pass through
    metric = SIMILARITY_METRICS.get(metric, metric)
    if not isinstance(metric, SimilarityMetric):
        raise ValueError(
            f"{metric} must be one of {list(SIMILARITY_METRICS.keys())} "
            "or an instance of SimilarityMetric. See make_similarity_metric()."
        )

    # Signal data is expected on the last two axes for all scopes
    sig_data_shape = experimental.shape[-2:]
    t_sig_shape = simulated.shape[-2:]
    if sig_data_shape != t_sig_shape:
        raise OSError(
            f"The experimental {sig_data_shape} and simulated {t_sig_shape} "
            "signal shapes are not identical"
        )

    if not metric._is_compatible(experimental.ndim, simulated.ndim):
        raise OSError(
            f"The shape of experimental {experimental.shape} and simulated "
            f"{simulated.shape} are not compatible with the scope "
            f"{metric.scope} of {type(metric).__name__}"
        )

    if n_slices == 1:
        return _pattern_match_single_slice(
            experimental,
            simulated,
            keep_n=keep_n,
            metric=metric,
            compute=compute,
        )
    else:
        if not compute:
            raise NotImplementedError(
                "Slicing simulations and returning dask arrays is not "
                "implemented"
            )
        return _pattern_match_slice_simulated(
            experimental,
            simulated,
            keep_n=keep_n,
            metric=metric,
            n_slices=n_slices,
        )


def _pattern_match_single_slice(
    experimental: Union[np.ndarray, da.Array],
    simulated: Union[np.ndarray, da.Array],
    keep_n: int,
    metric: SimilarityMetric,
    compute: bool,
) -> Union[Tuple[np.ndarray, np.ndarray], Tuple[da.Array, da.Array]]:
    """See :func:`pattern_match`.

    Parameters
    ----------
    experimental : numpy.ndarray or dask.array.Array
        Experimental patterns.
    simulated : numpy.ndarray or dask.array.Array
        Simulated patterns.
    keep_n : int
        Number of results to keep.
    metric : SimilarityMetric
        Similarity metric.
    compute : bool
        Whether to compute dask arrays before returning.

    Returns
    -------
    simulation_indices : numpy.ndarray or dask.array.Array
        Simulation indices corresponding with metric results.
    scores : numpy.ndarray or dask.array.Array
        Metric results with shape (ny*nx, keep_n). Sorted along the
        `keep_n` axis according to the metric used.
    """
    similarities = metric(experimental, simulated)
    similarities = da.asarray(similarities)

    # ONE_TO_ONE: a single score, so the only simulation index is 0
    if similarities.shape == ():
        similarity = (
            np.array([similarities.compute()]) if compute else similarities
        )
        return np.array([0]), similarity

    # Cannot keep more results than there are simulated patterns
    keep_n = min(keep_n, len(simulated))

    # The sign of k selects the largest or smallest values, depending
    # on whether the metric is to be maximized or minimized
    simulated_indices = similarities.argtopk(metric.sign * keep_n, axis=-1)
    scores = similarities.topk(metric.sign * keep_n, axis=-1)

    if compute:
        with ProgressBar():
            simulated_indices, scores = da.compute(simulated_indices, scores)

    # Flatten the navigation axes if not already flat, so that outputs
    # always have the standard shape (ny*nx, keep_n)
    if not metric.flat:
        simulated_indices = simulated_indices.reshape(-1, keep_n)
        scores = scores.reshape(-1, keep_n)

    return simulated_indices, scores


def _pattern_match_slice_simulated(
    experimental: Union[np.ndarray, da.Array],
    simulated: Union[np.ndarray, da.Array],
    keep_n: int,
    metric: SimilarityMetric,
    n_slices: int,
) -> Tuple[np.ndarray, np.ndarray]:
    """See :func:`pattern_match`.

    Parameters
    ----------
    experimental : numpy.ndarray or dask.array.Array
        Experimental patterns.
    simulated : numpy.ndarray or dask.array.Array
        Simulated patterns.
    keep_n : int
        Number of results to keep.
    metric : SimilarityMetric
        Similarity metric.
    n_slices : int
        Number of simulation slices to process sequentially.

    Returns
    -------
    simulation_indices : numpy.ndarray
        Simulation indices corresponding with metric results.
    scores : numpy.ndarray
        Sorted metric results.
    """
    # Naive implementation: slice the simulated patterns into batches
    # without aligning the slices with dask chunks or rechunking. Dask
    # seems to handle the sequential slicing decently.

    nav_shape = _get_nav_shape(experimental)
    nav_size = int(np.prod(nav_shape))
    num_simulated = _get_number_of_simulated(simulated)
    slice_size = num_simulated // n_slices

    # Number of results kept per slice and returned per pattern
    n = min(keep_n, slice_size)
    simulated_indices_aggregate = np.zeros((nav_size, n_slices * n), int)
    scores_aggregate = np.zeros((nav_size, n_slices * n), metric._dtype_out)

    start = 0
    for i in range(n_slices):
        # The last slice also takes the remainder of the simulations
        end = start + slice_size if i != n_slices - 1 else num_simulated

        # Keep exactly n results per slice so that they fit the
        # aggregate arrays, also when the last slice is the largest
        simulated_indices, scores = _pattern_match_single_slice(
            experimental,
            simulated[start:end],
            keep_n=n,
            metric=metric,
            compute=False,
        )

        # Shift the per-slice simulation indices so that they refer to
        # the original simulated array
        simulated_indices += start

        result_slice = np.s_[:, i * n : (i + 1) * n]
        with ProgressBar():
            print(
                f"Matching patterns, batch {i + 1}/{n_slices}:", file=sys.stdout
            )
            da.store(
                [simulated_indices, scores],
                [
                    simulated_indices_aggregate[result_slice],
                    scores_aggregate[result_slice],
                ],
                # This should be possible, but do we gain anything?
                # regions=(slice(......))
            )

        start += slice_size

    # Merge the per-slice results: sort the aggregated scores of each
    # pattern and keep the n best
    simulated_indices = np.zeros((nav_size, n), np.int32)
    scores = np.zeros((nav_size, n), np.float32)
    for i in range(nav_size):
        indices = (metric.sign * -scores_aggregate[i]).argsort(
            kind="mergesort"
        )[:n]
        simulated_indices[i] = simulated_indices_aggregate[i][indices]
        scores[i] = scores_aggregate[i][indices]

    return simulated_indices, scores
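
The batching strategy of `_pattern_match_slice_simulated` can be checked in isolation with a small NumPy sketch. It assumes a precomputed score matrix in place of the per-batch metric evaluation and a metric to be maximized; `top_k` and `top_k_sliced` are hypothetical names used only for this illustration.

```python
import numpy as np


def top_k(scores, k):
    """Indices and values of the k largest entries per row, best first."""
    order = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    return order, np.take_along_axis(scores, order, axis=1)


def top_k_sliced(scores, keep_n, n_slices):
    """Same result as top_k(), but visiting the columns in batches."""
    n_rows, n_cols = scores.shape
    slice_size = n_cols // n_slices
    n = min(keep_n, slice_size)
    idx_agg = np.zeros((n_rows, n_slices * n), dtype=int)
    val_agg = np.zeros((n_rows, n_slices * n), dtype=scores.dtype)
    start = 0
    for i in range(n_slices):
        # The last batch also takes the remainder columns
        end = start + slice_size if i != n_slices - 1 else n_cols
        idx, val = top_k(scores[:, start:end], n)
        # Shift batch-local indices back to indices into the full matrix
        idx_agg[:, i * n:(i + 1) * n] = idx + start
        val_agg[:, i * n:(i + 1) * n] = val
        start += slice_size
    # Merge the per-batch candidates and keep the n best overall
    order, best = top_k(val_agg, n)
    return np.take_along_axis(idx_agg, order, axis=1), best


rng = np.random.default_rng(42)
scores = rng.random((5, 23))  # 5 patterns scored against 23 simulations
idx_full, val_full = top_k(scores, 3)
idx_sliced, val_sliced = top_k_sliced(scores, keep_n=3, n_slices=4)
```

Both routes give the same indices and scores, because any entry among the `n` best overall is also among the `n` best of its own batch. The batched route only ever holds one batch of scores plus `n_slices * n` candidates per pattern, which is the point of slicing a large dictionary.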