# Introduction

<div class="alert alert-info">

Info

If you use this library, please cite the corresponding paper or the Zenodo reference XXX "".

</div>

## About PyXC
`PyXC` is a point-to-point correlation tool written in Python. It aims to be a self-documenting correlation library built on top of the IPython environment.

## Purpose of this tool
This library was initially developed to correlate nano-indentation measurements with EBSD measurements. Its main goals are:

- To provide a self-explaining correlation library based on the Jupyter Notebook environment.
- To provide a flexible environment for correlating data sets sampled in two dimensions.

## Common workflow
A correlation consists of four steps, each of which is covered in its own tutorial notebook.

1. Parsing data from the data file.
2. Loading data into the library.
3. Correcting distortion between different measurements.
4. Making the correlation.
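
Step 3 is the least self-explanatory of the four. The usage example in this document passes a `Homography` transformer to the layer, so the following is a minimal NumPy sketch of what a homography does to 2-D coordinates in general. It is not PyXC's implementation; the helper `apply_homography` and the matrix `H` are hypothetical names and values chosen for illustration.

```python
import numpy as np


def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) points through a 3x3 homography H."""
    points = np.asarray(points, dtype=float)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1).
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third component to return to Cartesian coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical distortion: stretch x by 2 and shift y by 1 (an affine special case).
H = np.array(
    [
        [2.0, 0.0, 0.0],
        [0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0],
    ]
)
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
mapped_corners = apply_homography(H, corners)
```

Once such a matrix has been estimated from matching reference points, it brings the coordinates of one measurement into the frame of the other, so that the same (x, y) position refers to the same physical location in both data sets.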

## Usage
This library performs coordinate-based correlation, which makes it possible to correlate different scientific data sets with one another.

```python
# Load EBSD data
import numpy as np

EBSD = np.genfromtxt(
    "./data/SiC_in_NiSA.ctf", dtype=float, skip_header=15, delimiter="\t", names=True
)

# Load data into the layer
from pyxc.core.layer import Layer
from pyxc.core.processor.arrays import column_parser
from pyxc.core.container import Container2D
from pyxc.core.loader import ImageLoader, XYDLoader
from pyxc.transform.homography import Homography

layer_ebsd = Layer(
    data=column_parser(EBSD, format_string="dxydddddddd"),
    container=Container2D,
    dataloader=XYDLoader,
    transformer=Homography,
)
```

A layer can be queried for the data points near a given (x, y) coordinate.

```python
layer_ebsd.query(
    3,
    3,
    cutoff=2,
    output_number=5,
)
```

The first two arguments are the (x, y) coordinates of the query point. The `cutoff` parameter sets the maximum distance between the query point and the returned data points, and the `output_number` parameter sets the maximum number of data points returned. The plot below shows the result for several combinations of the two parameters.

```python
import matplotlib.pyplot as plt

xy = (1, 1)
cutoff = [0.5, 1, 2]
output_number = [1, 10, 100]
fig, ax = plt.subplots(3, 3, figsize=(10, 10), constrained_layout=True)

# Use the same axis limits in every panel so the panels are directly comparable.
half_width = np.max(cutoff) * 1.5

for i, c in enumerate(cutoff):
    for j, o in enumerate(output_number):
        # Query the layer and colour the returned points by their "BC" column.
        qr = layer_ebsd.query(*xy, cutoff=c, output_number=o)
        ax[i, j].scatter(qr["x"], qr["y"], c=qr["BC"], s=25, cmap="cividis")
        ax[i, j].set_title(f"Cut-off: {c}, Output number: {o}")

        # Mark the query point and draw the cut-off circle around it.
        ax[i, j].scatter(*xy, marker="+", color="Red")
        ax[i, j].add_patch(
            plt.Circle(xy, radius=c, edgecolor="Blue", facecolor="none")
        )
        ax[i, j].annotate(
            "Cut-off circle", (xy[0], xy[1] + c), color="Blue", ha="center", va="bottom"
        )

        ax[i, j].set_xlabel("X-coordinate")
        ax[i, j].set_ylabel("Y-coordinate")
        ax[i, j].set_aspect(1)
        ax[i, j].set_xlim(xy[0] - half_width, xy[0] + half_width)
        ax[i, j].set_ylim(xy[1] - half_width, xy[1] + half_width)
```
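
To make the interplay of `cutoff` and `output_number` concrete, here is a small NumPy sketch of a radius-limited query. It only illustrates the idea and is not how PyXC implements `query`. The helper `radius_query` is hypothetical, and two of its behaviours are assumptions: points exactly on the cut-off circle are included, and the nearest points are returned first.

```python
import numpy as np


def radius_query(points, x, y, cutoff, output_number):
    """Return indices of at most `output_number` points within `cutoff` of (x, y)."""
    points = np.asarray(points, dtype=float)
    distances = np.hypot(points[:, 0] - x, points[:, 1] - y)
    # Keep only the points inside (or on) the cut-off circle.
    inside = np.flatnonzero(distances <= cutoff)
    # Assumption: nearest points come first; a stable sort keeps ties in index order.
    order = np.argsort(distances[inside], kind="stable")
    return inside[order][:output_number]


# Hypothetical 5 x 5 grid of sample positions with unit spacing.
grid_x, grid_y = np.meshgrid(np.arange(5.0), np.arange(5.0))
points = np.column_stack([grid_x.ravel(), grid_y.ravel()])

hits = radius_query(points, 2.0, 2.0, cutoff=1.0, output_number=3)
```

In this sketch, five grid points lie within the cut-off of 1.0, but only three are returned because `output_number` caps the result. Raising `output_number` beyond five changes nothing, since `cutoff` is then the limiting factor.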

You can pass a `Reducer` to the query to reduce the queried data points with one or more functions, for example `np.mean`, instead of returning every point. This is especially useful for statistical analysis of the data.

```python
from pyxc.core.processor.reducer import Reducer
import numpy as np

layer_ebsd.query(3, 3, cutoff=5, output_number=1000, reducer=Reducer((np.mean,)))
```
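
As a rough picture of what reducing a query result with functions such as `np.mean` can look like, the sketch below applies a tuple of functions to every column of a NumPy structured array. It is a generic illustration, not PyXC's `Reducer`: `reduce_columns` is a hypothetical helper, the sample records are made up, and the output layout of the real `Reducer` may differ.

```python
import numpy as np


def reduce_columns(records, functions):
    """Apply each function to every named column of a structured array."""
    return {
        (name, func.__name__): func(records[name])
        for name in records.dtype.names
        for func in functions
    }


# Hypothetical query result with three data points and a "BC" value column.
records = np.array(
    [(1.0, 2.0, 10.0), (2.0, 2.0, 20.0), (3.0, 2.0, 30.0)],
    dtype=[("x", float), ("y", float), ("BC", float)],
)
summary = reduce_columns(records, (np.mean, np.std))
```

Each entry of `summary` is keyed by column name and function name, so `summary[("BC", "mean")]` holds the mean of the `BC` column over all queried points.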