# TUTORIAL: Creating your Own Function

Creating a function for nmrPype is designed to be a relatively simple and modular process. To demonstrate, I will walk through creating a new function called FLIP, which reverses the order of the directly detected dimension of the input data. This tutorial assumes that you have installed nmrPype [from source](https://github.com/PhiMykah/nmrpype?tab=readme-ov-file#building-from-source) rather than through pip.

## 1. Overview

Creating the function requires adding one new file and editing the `fn/__init__.py` file.
Below is an outline of the file structure, showing which files are involved in creating this new function.
- Highlighted in <font color="#648FFF">blue</font> are the nmrPype folders.
- Highlighted in <font color="#FFB000">yellow</font> is the new file.
- Highlighted in <font color="#785EF0">purple</font> are the files that need to be edited.

<pre>
<font color="#648FFF"><b>nmrPype</b></font>
├── <font color="#648FFF"><b>fn</b></font> <font color="#808080"><em>PUT YOUR FUNCTION IN THIS FOLDER</em></font>
│   ├── 
│   ├── DI.py
│   ├── FT.py
│   ├── function.py
│   ├── <font color="#FFB000">FLIP.py</font>
│   ├── <font color="#785EF0">__init__.py</font>
│   ├── PS.py
│   ├── SP.py
│   ├── TP.py
│   └── ZF.py
├── __init__.py
├── <font color="#648FFF"><b>nmrio</b></font>
│   ├── fileiobase.py
│   ├── __init__.py
│   ├── read.py
│   └── write.py
├── <font color="#648FFF"><b>parse</b></font>
│   ├── __init__.py
│   └── parser.py
├── pype.py
└── <font color="#648FFF"><b>utils</b></font>
    ├── DataFrame.py
    ├── errorHandler.py
    ├── <font color="#648FFF"><b>fdata</b></font>
    │   ├── datamanip.py
    │   └── __init__.py
    └── __init__.py
</pre>

Because of how the parser is designed, you do not need to modify it to add command-line arguments. Later in this tutorial you will see how to define the command-line arguments for your function.

## 2. Creating the function's .py file

When creating the .py file, it is advised to follow the [function template](../nmrPype/fn/function.py). Essentially, each function's Python file should contain a class that inherits from the DataFunction class. The class should define the \_\_init\_\_ function, an initialize function, a process function, an updateHeader function, and a command-line argument function. You are also free to override the `run` and `parallelize` functions to suit your needs.

Below is an overview of each function along with its implementation for FLIP. The command-line arguments function has its own section.
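As a rough sketch of the layout described above, the skeleton below shows the five functions side by side. The `DataFunction` class here is a minimal stand-in written for this example so that the snippet runs on its own; it is not the real nmrPype base class, and `MyFunction` and `myfn_flag` are hypothetical names.

```python
class DataFunction:
    """Stand-in for nmrPype's DataFunction (assumption: the real class does much more)."""
    def __init__(self, params: dict = None):
        self.params = params or {}

    def updateHeader(self, data):
        pass  # Default behaviour in this stand-in: nothing to update


class MyFunction(DataFunction):
    def __init__(self, myfn_flag: bool = False,
                 mp_enable: bool = False, mp_proc: int = 0, mp_threads: int = 0):
        self.myfn_flag = myfn_flag
        self.mp = [mp_enable, mp_proc, mp_threads]
        super().__init__({'myfn_flag': myfn_flag})

    def initialize(self, data):
        """Update header values and parameters before processing."""

    def process(self, array):
        """Operate on the data array and return it."""
        return array

    def updateHeader(self, data):
        """Update header values after processing."""

    @staticmethod
    def clArgs(subparser, parent_parser):
        """Register the command-line arguments."""


fn = MyFunction(myfn_flag=True)
print(fn.params, fn.mp)
```

In a real function file the stand-in class is replaced by the import of `DataFunction` from the template.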

### Note

The standard imports are as follows:
```python
from .function import DataFunction as Function
import numpy as np

# type Imports/Definitions
from ..utils import DataFrame

# Multiprocessing imports
from multiprocessing import Pool, TimeoutError
from concurrent.futures import ThreadPoolExecutor
```

### a. \_\_init\_\_

The \_\_init\_\_ function should take `self` as a parameter, followed by all of the arguments the function accepts. Additionally, the multiprocessing arguments *(mp_enable, mp_proc, and mp_threads)* should all be included **even if the function does not use multiprocessing.** Check the example for what to do with the multiprocessing arguments.

Each parameter name should start with the lowercase name of the function, followed by an underscore and then the name of the parameter. Additionally, all parameters should have a default value. For example, for the Fourier Transform function, the inverse Fourier transform parameter is named `ft_inv`. Since it is a toggle, the argument is a boolean that is false by default. Thus, `ft_inv : bool = False` is the argument passed to \_\_init\_\_.

Below is the example for the FLIP function:
```python

def __init__(self, flip_zero : bool = False, flip_shift: bool = False,
             mp_enable : bool = False, mp_proc : int = 0, mp_threads : int = 0):

    # Technically you can call these class arguments whatever you want, but I usually like to keep them consistent
    self.flip_zero = flip_zero # Flip zero will flip the entire data, including the zeros
    self.flip_shift = flip_shift # Flip shift will shift the data by 1 value to the right
    self.zero_length = 0 # Size of zero-fill represented as a negative value, set in initialize

    # Format the multi-processing
    self.mp = [mp_enable, mp_proc, mp_threads]

    # Load the parent's constructor using the params from the function to help users that utilize notebooks
    params = {'flip_zero':flip_zero, 'flip_shift':flip_shift}
    super().__init__(params)
``` 
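The naming convention described above can be sketched with a small helper. Note that `param_name` is a hypothetical function written only for illustration; nmrPype does not provide it.

```python
def param_name(function_code: str, parameter: str) -> str:
    """Build a parameter name as <lowercase function code>_<parameter>."""
    return f"{function_code.lower()}_{parameter}"


print(param_name('FT', 'inv'))      # ft_inv
print(param_name('FLIP', 'shift'))  # flip_shift
```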

### b. initialize

Initialize updates header values and parameters, and prepares for processing.

Initialization follows these steps:
- Handle function-specific arguments
- Update any header values that are independent of the data, such as flags and parameter storage, before any calculations occur

The initialize function takes the DataFrame as an argument and does not return anything.

Below is an example of the initialize for the function FLIP:

```python
def initialize(self, data : DataFrame):
    """
    Your initialize docstring goes here

    Parameters
    ----------
    data : DataFrame
        Target data to manipulate 
    """

    currDim = data.getCurrDim() # Collect directly detected dimension of the data

    # Note: NDFLIPFLAG is not a real header parameter, but this is being shown for demonstrative purposes
    flipFlag = bool(data.getParam('NDFLIPFLAG', currDim))

    # Invert flip flag
    flipFlag = not flipFlag

    # Change zero length if needed
    self.zero_length = 0 if self.flip_zero else int(data.getParam('NDZF', data.getCurrDim()))

    # Set flip flag
    data.setParam('NDFLIPFLAG', float(flipFlag), currDim)
    
```
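To see what these steps do without loading real data, here is a minimal sketch that runs the same logic against a stand-in object. `FakeFrame` is hypothetical and only mimics the three methods used above (`getCurrDim`, `getParam`, `setParam`); returning `0.0` for a missing parameter is an assumption of this sketch, not documented DataFrame behavior.

```python
class FakeFrame:
    """Hypothetical stand-in exposing only the methods that initialize uses."""
    def __init__(self, params: dict, curr_dim: int = 1):
        self._params = dict(params)
        self._curr_dim = curr_dim

    def getCurrDim(self) -> int:
        return self._curr_dim

    def getParam(self, name: str, dim: int) -> float:
        # Assumption: a missing parameter reads as 0.0 in this sketch
        return self._params.get((name, dim), 0.0)

    def setParam(self, name: str, value: float, dim: int) -> None:
        self._params[(name, dim)] = value


def initialize(data, flip_zero: bool = False) -> int:
    """Same steps as the FLIP initialize, returning zero_length for inspection."""
    currDim = data.getCurrDim()
    flipFlag = not bool(data.getParam('NDFLIPFLAG', currDim))  # Invert the flag
    zero_length = 0 if flip_zero else int(data.getParam('NDZF', currDim))
    data.setParam('NDFLIPFLAG', float(flipFlag), currDim)
    return zero_length


frame = FakeFrame({('NDZF', 1): -4.0})  # Zero-fill size stored as a negative value
zero_length = initialize(frame)
print(zero_length, frame.getParam('NDFLIPFLAG', 1))
```

Calling `initialize` a second time on the same frame toggles the flag back, which is why the flag is read before it is written.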



### c. process

The process function is the main body of the code. It is called by the parent class's `run` function. By default, it takes the data array as its argument and returns the processed data array. If you want to pass other arguments to the process function, you will have to override the `run` function.

Below is an example of the process function for the function FLIP:

```python
def process(self, array : np.ndarray) -> np.ndarray:
    """
    Enter your docstring here

    Parameters
    ----------
    array : ndarray
        Target data array to process with function

    Returns
    -------
    ndarray
        Updated array after function operation
    """
    
    zeroes = self.zero_length
    
    # This is my recommended way to process each vector one at a time. This is for compatibility with the C version,
    #   but if you don't plan to use the C version in your schemes then it should be fine to use other methods
    it = np.nditer(array, flags=['external_loop','buffered'], op_flags=['readwrite'], buffersize=array.shape[-1], order='C')
    with it:
        for x in it:
            # Only flip the non-zero portion if flip_zero is disabled
            x[...] = np.flip(x) if not zeroes else np.concatenate([np.flip(x[:zeroes]), x[zeroes:]])
            # Shift to the right by 1 if shift is enabled
            if self.flip_shift:
                x[...] = np.roll(x, 1)
    
    return array
```
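The per-vector logic is easier to follow on a small example. The sketch below applies the same NumPy calls to a single 1D vector, without the iterator; `flip_vector` is a hypothetical helper written for this illustration. It assumes, as in the FLIP class, that the zero-fill size is stored as a negative number so that `x[:zero_length]` selects the data and `x[zero_length:]` selects the trailing zeros.

```python
import numpy as np


def flip_vector(x: np.ndarray, zero_length: int = 0, shift: bool = False) -> np.ndarray:
    """Flip one vector, optionally leaving the trailing zero-fill in place."""
    if zero_length:
        # Flip only the data portion and keep the zero-fill at the end
        out = np.concatenate([np.flip(x[:zero_length]), x[zero_length:]])
    else:
        # Flip everything, including any zeros
        out = np.flip(x)
    if shift:
        out = np.roll(out, 1)  # Circular shift to the right by one point
    return out


vector = np.array([1, 2, 3, 4, 0, 0])
whole = flip_vector(vector)                                # [0 0 4 3 2 1]
data_only = flip_vector(vector, zero_length=-2)            # [4 3 2 1 0 0]
shifted = flip_vector(vector, zero_length=-2, shift=True)  # [0 4 3 2 1 0]
print(whole, data_only, shifted)
```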

### d. updateHeader

Lastly for this section, if any flags in the header need to be updated *after* processing, this is the function to do it in. It takes a DataFrame as an argument and returns nothing.

For our FLIP example, we will not need to update the header after the processing, so the parent function is sufficient.

## 3. Command-Line Arguments

Even without the command-line arguments function, the function can already be used in Python scripts and Jupyter notebooks. However, to support running the function from the command line, the function `clArgs` needs to be implemented. The function `clArgs` is called from the parser and adds a subparser. Here is a breakdown of how to set up the `clArgs` function using our function FLIP:

For the function declaration, ensure that the function is labelled with the `@staticmethod` decorator. The function should take two arguments, the `subparser` and the `parent_parser`.
```python
@staticmethod
def clArgs(subparser, parent_parser):
```

When adding the subparser, there are several things to note:
- The name of the parser is what the user will type on the command line
- If you would like to use multiple aliases, pass a list of strings to the `add_parser` function with the `aliases` keyword
- Include `parent_parser` in a list with the `parents` keyword, as seen below, to allow for all of the default arguments
- Include a help string!
```python
FLIP = subparser.add_parser('FLIP', parents=[parent_parser], help='Reverse the direction of the data.')
```

For adding arguments, refer to the argparse [documentation](https://docs.python.org/3/library/argparse.html) for all of the options. What is important to know for nmrPype:

- **MOST IMPORTANT:** Set a destination for the value to be stored in by using the `dest` argument. These should match the parameters used when initializing the function. For example, the shift option, which corresponds to the `flip_shift` parameter, should be stored in the `flip_shift` destination.
- You can use multiple aliases, but make sure that they are long enough not to conflict with default arguments. Longer names should start with two hyphens and should not contain spaces. An example is an inverse parameter: `'-inv', '--inverse', '--inverse-function'`.
- Include a help string!

```python

FLIP.add_argument('-zero', action='store_true', dest='flip_zero', help='Flip the entire data, including the zeros')
FLIP.add_argument('-shift', action='store_true', dest='flip_shift', help='Shift the data to the right by 1 value')
```
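To check how the `dest` names end up in the parsed result, the following is a minimal sketch using only the standard library. The top-level parser, the `fc` destination, and the `-out` option on the parent parser are assumptions made up for this example; the real nmrPype parser defines its own defaults.

```python
import argparse

# Hypothetical stand-ins for the parser objects that nmrPype passes to clArgs
parser = argparse.ArgumentParser(prog='demo')
parent_parser = argparse.ArgumentParser(add_help=False)
parent_parser.add_argument('-out', dest='output', default=None, help='Hypothetical shared option')
subparser = parser.add_subparsers(dest='fc')

# Same calls as in the FLIP example above
FLIP = subparser.add_parser('FLIP', parents=[parent_parser], help='Reverse the direction of the data.')
FLIP.add_argument('-zero', action='store_true', dest='flip_zero', help='Flip the entire data, including the zeros')
FLIP.add_argument('-shift', action='store_true', dest='flip_shift', help='Shift the data to the right by 1 value')

args = parser.parse_args(['FLIP', '-shift'])

# Collect the values whose destination matches the function's parameter names
flip_kwargs = {key: value for key, value in vars(args).items() if key.startswith('flip_')}
print(args.fc, flip_kwargs)
```

Because each `dest` matches an \_\_init\_\_ parameter name, the collected dictionary could be passed straight to the class as keyword arguments. Whether nmrPype filters the arguments in exactly this way is not shown here.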

## 3a. FLIP.py overview

Here is what the FLIP.py would look like:

```python
# Do not use the import below, this is for the notebook. Use this instead:
# from .function import DataFunction as Function < Original Import
from nmrPype.fn.function import DataFunction as Function

import numpy as np

# Do not use the import below, this is for the notebook. Use this instead:
# from ..utils import DataFrame < Original Import
from nmrPype.utils import DataFrame


# Multiprocessing imports
from multiprocessing import Pool, TimeoutError
from concurrent.futures import ThreadPoolExecutor

class Flip(Function):
    def __init__(self, flip_zero : bool = False, flip_shift: bool = False,
                 mp_enable : bool = False, mp_proc : int = 0, mp_threads : int = 0):

        # Technically you can call these class arguments whatever you want, but I usually like to keep them consistent
        self.flip_zero = flip_zero # Flip zero will flip the entire data, including the zeros
        self.flip_shift = flip_shift # Flip shift will shift the data by 1 value to the right
        self.zero_length = 0 # Size of zero-fill represented as a negative value

        # Format the multi-processing
        self.mp = [mp_enable, mp_proc, mp_threads]
        # Load the parent's constructor using the params from the function to help users that utilize notebooks
        params = {'flip_zero':flip_zero, 'flip_shift':flip_shift}
        super().__init__(params)

    def initialize(self, data : DataFrame):
        """
        Your initialize docstring goes here

        Parameters
        ----------
        data : DataFrame
            Target data to manipulate
        """

        currDim = data.getCurrDim() # Collect directly detected dimension of the data

        # Note: NDFLIPFLAG is not a real header parameter, but this is being shown for demonstrative purposes
        flipFlag = bool(data.getParam('NDFLIPFLAG', currDim))

        # Invert flip flag
        flipFlag = not flipFlag

        # Change zero length if needed
        self.zero_length = 0 if self.flip_zero else int(data.getParam('NDZF', data.getCurrDim()))

        # Set flip flag
        data.setParam('NDFLIPFLAG', float(flipFlag), currDim)

    def process(self, array : np.ndarray) -> np.ndarray:
        """
        Enter your docstring here

        Parameters
        ----------
        array : ndarray
            Target data array to process with function

        Returns
        -------
        ndarray
            Updated array after function operation
        """

        zeroes = self.zero_length

        # This is my recommended way to process each vector one at a time. This is for compatibility with the C version,
        #   but if you don't plan to use the C version in your schemes then it should be fine to use other methods
        it = np.nditer(array, flags=['external_loop','buffered'], op_flags=['readwrite'], buffersize=array.shape[-1], order='C')
        with it:
            for x in it:
                # Only flip the non-zero portion if flip_zero is disabled
                x[...] = np.flip(x) if not zeroes else np.concatenate([np.flip(x[:zeroes]), x[zeroes:]])
                # Shift to the right by 1 if shift is enabled
                if self.flip_shift:
                    x[...] = np.roll(x, 1)

        return array

    @staticmethod
    def clArgs(subparser, parent_parser):
        """
        Doc string goes here
        """
        FLIP = subparser.add_parser('FLIP', parents=[parent_parser], help='Reverse the direction of the data.')
        FLIP.add_argument('-zero', action='store_true', dest='flip_zero', help='Flip the entire data, including the zeros')
        FLIP.add_argument('-shift', action='store_true', dest='flip_shift', help='Shift the data to the right by 1 value')
```


## 4. Editing the \_\_init\_\_.py file

The `fn/__init__.py` file has three parts: the imports, the function dictionary, and the \_\_all\_\_ list. The imports obtain all of the function classes from within the fn folder and alias them accordingly. The function dictionary is used by the parser to specify where to look for command parameters. The \_\_all\_\_ list is used by Python to identify which names should be exported by a wildcard import.

First, for the import portion, add your function's class to the list of imports.
**NOTE:** Make sure that the class is imported as an all-caps short abbreviation, the same as the one used for the command-line function call. You can alias the class using Python's `from x import y as z` syntax to ensure this.

```python
from .function import DataFunction
from .FT import FourierTransform as FT
from .ZF import ZeroFill as ZF
# ...
from .DI import DeleteImaginary as DI
from .TP import Transpose4D as ATP
from .FLIP import Flip as FLIP # Our FLIP function added to the imports
```

Then, for the function dictionary, add your function to the dictionary with the key being the code used on the command line and the value being the function class imported. In most cases, these two will have the same name, with the key being a string.
**NOTE:** If your function has multiple aliases, insert each alias into the dictionary as a different key with the same function value.

```python
fn_list = {
    'function':DataFunction,
    'NULL':DataFunction,
    'FT':FT,
    'ZF':ZF,
    'DI':DI,
    'SP':SP,
    'PS':PS,
    'TP':YTP, 'YTP':YTP, 'XY2YX':YTP,
    'ZTP':ZTP, 'XYZ2ZYX':ZTP,
    'ATP':ATP, 'XYZA2AYZX':ATP,
    'FLIP':FLIP} # Our new function is added here, the command-line function call is the same as the name of the function class
```
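The effect of registering several keys for one function can be sketched in isolation. The classes below are empty stand-ins and `lookup` is a hypothetical helper; the actual lookup inside nmrPype may differ.

```python
class Transpose2D:
    """Empty stand-in for a transpose function class (hypothetical name)."""


class Flip:
    """Empty stand-in for the FLIP function class."""


# Alias the classes the same way the imports in fn/__init__.py do
YTP = Transpose2D
FLIP = Flip

fn_list = {
    'TP': YTP, 'YTP': YTP, 'XY2YX': YTP,  # Three command codes, one class
    'FLIP': FLIP}


def lookup(code: str):
    """Return the function class registered for a command-line code."""
    return fn_list[code]


print(lookup('TP') is lookup('XY2YX'))  # True
print(lookup('FLIP').__name__)          # Flip
```

Every alias resolves to the same class object, so the class itself only needs to be imported and exported once.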

Lastly, add the function *CLASS* to the \_\_all\_\_ list. The class is what was imported in the first part, and not the code used by the command-line. If they are named the same as I recommend, then you will not find any issues here. Functions with multiple aliases should only appear here *once*.

```python
__all__ = ['DataFunction', 'FT', 'ZF', 
           'DI','SP', 'PS', 
           'YTP', 'ZTP', 'ATP',
           'FLIP'] # Our new function is added to the list for export.
```

## 5. Publishing your new Function

The best way to share your new function with others is to create a [fork](https://docs.github.com/articles/fork-a-repo) of the nmrPype [repository](https://github.com/PhiMykah/nmrpype). Check the link for more details on how forking works with git. Depending on the new feature, it may reach the main branch as well!