# Tutorial 2. Loading data into the Layer object
This section covers how to properly subclass the `DataLoader` class when a specialized loader is required.

First, let's load the required data.

In [None]:
# Load EBSD data
import numpy as np

EBSD = np.genfromtxt(
    "./data/SiC_in_NiSA.ctf", dtype=float, skip_header=15, delimiter="\t", names=True
)
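To see what `skip_header` and `names=True` do in the call above, here is a minimal sketch on a toy tab-separated string. The text content is invented for illustration and is not the real `.ctf` file; only the `np.genfromtxt` arguments mirror the cell above.

```python
import io

import numpy as np

# Toy tab-separated text (invented for illustration):
# one preamble line, one header row, then two data rows.
text = "Preamble line to skip\nPhase\tX\tY\n1\t0.0\t0.0\n1\t0.5\t0.0\n"

# skip_header=1 drops the preamble; names=True reads the next line as field names.
toy = np.genfromtxt(
    io.StringIO(text), dtype=float, skip_header=1, delimiter="\t", names=True
)

print(toy.dtype.names)  # field names taken from the header row
print(toy["X"])         # columns are accessed by field name
```

The result is a structured array, which is why the later cells can refer to columns such as X and Y by name.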

## Boilerplate
We will deal with the low-level API first. This pattern will be used repeatedly. You can wrap this boilerplate in your own function for convenience.

In [None]:
# Load data into the layer
from pyxc.core.layer import Layer
from pyxc.core.processor.arrays import column_parser
from pyxc.core.container import Container2D
from pyxc.core.loader import ImageLoader, XYDLoader
from pyxc.transform.homography import Homography

layer_ebsd = Layer(
    data=column_parser(EBSD, format_string="dxydddddddd"),
    container=Container2D,
    dataloader=XYDLoader,
    transformer=Homography,
)
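One way to wrap the boilerplate above is a small factory that pre-binds the arguments that rarely change. This is only a sketch: `make_layer_factory` and `DemoLayer` are hypothetical names, and `DemoLayer` is a stand-in so the example runs without pyxc installed. The keyword names (`data`, `container`, `dataloader`, `transformer`) are the ones used in the cell above.

```python
from functools import partial


def make_layer_factory(layer_cls, **defaults):
    """Return a callable that builds layers with shared default keyword arguments."""
    return partial(layer_cls, **defaults)


# Stand-in class (hypothetical) that only records its arguments.
# In practice you would pass pyxc's Layer class instead.
class DemoLayer:
    def __init__(self, data, container, dataloader, transformer):
        self.data = data
        self.container = container
        self.dataloader = dataloader
        self.transformer = transformer


# Bind the arguments that stay the same across layers once...
make_layer = make_layer_factory(
    DemoLayer, container="Container2D", transformer="Homography"
)
# ...and supply only what differs for each layer.
layer = make_layer(data=[1, 2, 3], dataloader="XYDLoader")
```

With pyxc, the equivalent call would presumably be `make_layer_factory(Layer, container=Container2D, transformer=Homography)`, followed by `make_layer(data=..., dataloader=XYDLoader)`.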

## Using XYDLoader
The XYDLoader object is useful for loading 2-dimensional array-like data or structured arrays. The first and second columns of the array must contain the numeric X and Y values. Therefore, to use XYDLoader, the data must first be prepared in the proper format.

### Using `column_parser` function
The `column_parser` function is a utility for refining data. It reorders columns based on the provided format string. `x` and `y` mark the columns that contain x and y information, while `_` means ignore. All other characters are regarded as data.

#### 2-dimensional array-like
Let's continue with an example, starting with a 2-dimensional array-like.

In [None]:
arr = np.random.random((3, 4))
arr

Let's assume we want the second and third columns to be x and y, while the first column remains a data column. As you can see, the x and y columns are moved to the first and second columns.

In [None]:
column_parser(arr, "dxy")

Alternatively, you can exclude some columns by specifying `_`, or by explicitly setting `return_unspecified` to False.

In [None]:
column_parser(arr, "_dxy")

In [None]:
column_parser(arr, "dxy", return_unspecified=False)
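The reordering rule described above can be sketched in plain NumPy. The function below, `reorder_by_format`, is a hypothetical stand-in written for illustration; it is not pyxc's implementation. In particular, appending unspecified trailing columns after the data columns is an assumption about how `return_unspecified` behaves.

```python
import numpy as np


def reorder_by_format(arr, format_string, return_unspecified=True):
    """Hypothetical sketch: move the x and y columns to the front, then data columns."""
    arr = np.asarray(arr)
    if format_string.count("x") != 1 or format_string.count("y") != 1:
        raise ValueError("format_string needs exactly one 'x' and one 'y'")
    if len(format_string) > arr.shape[1]:
        raise ValueError("format_string is longer than the number of columns")

    x_idx = format_string.index("x")
    y_idx = format_string.index("y")
    # Every character other than 'x', 'y', and '_' marks a data column.
    data_idx = [i for i, char in enumerate(format_string) if char not in "xy_"]
    if return_unspecified:
        # Assumption: columns beyond the format string are kept at the end.
        data_idx += list(range(len(format_string), arr.shape[1]))
    return arr[:, [x_idx, y_idx] + data_idx]


demo = np.arange(12).reshape(3, 4)  # columns hold 0, 1, 2, 3 in the first row
kept_all = reorder_by_format(demo, "dxy")
skipped_first = reorder_by_format(demo, "_dxy")
no_trailing = reorder_by_format(demo, "dxy", return_unspecified=False)
```

Using `np.arange` instead of random values makes it easy to see which original column ended up where.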

#### Structured array
It works exactly the same for a structured array. Let's use the EBSD data we loaded earlier.

In [None]:
EBSD

Let's say we need X, Y, Phase, Euler1, Euler2, and Euler3. The format string should then be `dxy__ddd`. We also don't want the trailing MAD, BC, and BS columns, so we explicitly set `return_unspecified` to False.

In [None]:
column_parser(EBSD, "dxy__ddd", return_unspecified=False)
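To make the mapping between `dxy__ddd` and the field names explicit, here is a plain NumPy sketch that pairs each format character with a field and selects the matching fields. The field names `Bands` and `Error` for the two skipped columns are assumptions (the text above only names the other columns), and the zero-filled toy array stands in for the real EBSD data.

```python
import numpy as np

# Toy structured array; "Bands" and "Error" are assumed names for the skipped columns.
names = ("Phase", "X", "Y", "Bands", "Error",
         "Euler1", "Euler2", "Euler3", "MAD", "BC", "BS")
toy = np.zeros(2, dtype=[(name, float) for name in names])

format_string = "dxy__ddd"
# zip stops at the shorter input, so MAD, BC, and BS get no format character.
paired = list(zip(format_string, toy.dtype.names))

x_name = next(name for char, name in paired if char == "x")
y_name = next(name for char, name in paired if char == "y")
data_names = [name for char, name in paired if char not in "xy_"]

# x and y first, then the data fields in their original order.
selected = [x_name, y_name] + data_names
subset = toy[selected]
print(subset.dtype.names)
```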

Correctly processed data (with a proper format string) is compatible with XYDLoader. You can now load the data into the Layer.

In [None]:
# Load data into the layer
from pyxc.core.layer import Layer
from pyxc.core.processor.arrays import column_parser
from pyxc.core.container import Container2D
from pyxc.core.loader import ImageLoader, XYDLoader
from pyxc.transform.homography import Homography

layer_ebsd = Layer(
    data=column_parser(EBSD, format_string="dxy__ddd", return_unspecified=False),
    container=Container2D,
    dataloader=XYDLoader,
    transformer=Homography,
)
layer_ebsd.container

## Loading image data
Loading image data is straightforward: just use `ImageLoader`. Image data are 2- or 3-dimensional array-like objects; a 3-dimensional image has the shape (i, j, k), where k is the number of channels. Each channel of the image is stored in serialized form under the column name `Channel_{integer}`. Let's make some sample image data.

In [None]:
im3channel = np.random.random((4, 4, 3))

Then, we can easily load the data into the layer.

In [None]:
# Load data into the layer
from pyxc.core.layer import Layer
from pyxc.core.processor.arrays import column_parser
from pyxc.core.container import Container2D
from pyxc.core.loader import ImageLoader, XYDLoader
from pyxc.transform.homography import Homography

layer_im3c = Layer(
    data=im3channel,
    container=Container2D,
    dataloader=ImageLoader,
    transformer=Homography,
)
layer_im3c.container
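As a rough picture of what storing each channel in serialized form can look like, the sketch below flattens an image into one row per pixel using plain NumPy. `flatten_image` is a hypothetical helper, not pyxc's `ImageLoader`. Taking x from the column index and y from the row index is an assumption; only the `Channel_{integer}` naming comes from the text above.

```python
import numpy as np


def flatten_image(image):
    """Hypothetical sketch: turn an (i, j) or (i, j, k) image into a per-pixel table."""
    image = np.asarray(image)
    if image.ndim == 2:
        image = image[..., np.newaxis]  # treat a 2-D image as a single channel
    rows, cols, channels = image.shape

    # Assumption: x is the column index and y is the row index of each pixel.
    yy, xx = np.mgrid[0:rows, 0:cols]

    dtype = [("x", float), ("y", float)]
    dtype += [(f"Channel_{k}", image.dtype) for k in range(channels)]
    table = np.empty(rows * cols, dtype=dtype)
    table["x"] = xx.ravel()
    table["y"] = yy.ravel()
    for k in range(channels):
        table[f"Channel_{k}"] = image[..., k].ravel()
    return table


image = np.arange(24, dtype=float).reshape(2, 4, 3)
table = flatten_image(image)
print(table.dtype.names)
```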

In [None]:
x = np.linspace(0, 1, 10)
y = np.linspace(2, 10, 10)
data = np.random.random((10, 3))
example_container = Container2D(x_raw=x, y_raw=y, data=data)

As you can see, `example_container` is now initialized correctly.

In [None]:
example_container

We didn't provide a structured array, so the column names for the data are determined automatically: `Channel_0`, `Channel_1`, and `Channel_2`.

In [None]:
example_container.dtype.names
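If you prefer to control the column names yourself, one option is to build a structured array before handing the data over. The sketch below uses NumPy's `unstructured_to_structured` and reuses the `Channel_{integer}` naming purely as an example. Whether `Container2D` keeps the names of a structured `data` argument is an assumption that this notebook does not verify.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

plain = np.arange(30, dtype=float).reshape(10, 3)  # 10 rows, 3 unnamed columns

# Give every column an explicit field name.
field_names = [f"Channel_{k}" for k in range(plain.shape[1])]
named = rfn.unstructured_to_structured(plain, names=field_names)

print(named.dtype.names)
```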