# Basic data processing

In the previous chapter we saw how to use the Bokeh plotting library to get a working visualization of our data in a few lines of code.

### Import libraries, data and bkplot function

In [1]:
# import libraries
import numpy as np
from bokeh.io import output_notebook, show
from bokeh.plotting import figure, ColumnDataSource

In [2]:
output_notebook()

### Import from local module
Instead of copying code from previous chapters, we can save it in a local Python package. To do that, we create a directory to hold the Python files and add a file called **__init__.py** to it. This file tells Python to treat the directory as a package, which is what allows the import mechanism to find the modules inside it. The file can contain code or be empty; either works.

In [3]:
# import from our own local module
import procnmr.plotting as plt
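
The package layout described above can be sketched in a self-contained way. The names `demo_pkg`, `plotting.py` and `hello` below are hypothetical stand-ins, not the contents of `procnmr`; the sketch only shows that a directory with an `__init__.py` file becomes importable once its parent directory is on the import path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary directory (all names are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()

# An empty __init__.py is enough to mark the directory as a package.
(pkg / "__init__.py").write_text("")

# A module inside the package, holding the code we want to reuse.
(pkg / "plotting.py").write_text(
    "def hello():\n    return 'hello from demo_pkg.plotting'\n"
)

# Make the parent directory importable, then import the module by its dotted name.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_plotting = importlib.import_module("demo_pkg.plotting")

print(demo_plotting.hello())
```

In a notebook the package directory usually sits next to the notebook file, so the `sys.path` step is typically not needed.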

## Processing steps
The usual processing steps when dealing with NMR data are the following:
1. Apodization
2. FFT
3. Phase correction
4. Baseline correction


We'll now bring back our favorite FID and code some basic processing functions. 

In [67]:
fid = np.loadtxt('output/my_first_fid.csv', dtype='complex64', delimiter=',')

### Apodization
Apodization is a filtering technique used in signal processing. It works by applying a window function: a mathematical treatment that changes a given signal in some desired way. The most common window function used in NMR is the exponential window function. Its mathematical form is as follows:

$f(t) = e^{-lb \cdot t}$

where $lb$ is the line broadening factor and $t$ is the acquisition time. To apply a window function, we multiply the FID by it.

Since we only saved the values of the FID, with no reference to the time scale, we need to reconstruct the time array. Hopefully we remember the acquisition time and number of points we used before. Now you know the benefit of using parameter files!

In [5]:
aq = 8.0
td = 8192
t = np.linspace(0.0, aq, td)
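
One detail of this reconstruction is worth checking. The sketch below compares the `np.linspace` axis used above with an axis built from a fixed sampling interval `aq / td`. The second variant and the name `dwell` are assumptions added for comparison, not the chapter's choice; which one matches the saved FID depends on how that FID was generated.

```python
import numpy as np

aq = 8.0    # acquisition time, same value as in the cell above
td = 8192   # number of points, same value as in the cell above

# np.linspace includes the end point, so the spacing is aq / (td - 1).
t_linspace = np.linspace(0.0, aq, td)

# Assumed alternative: a fixed sampling interval of aq / td, end point excluded.
dwell = aq / td
t_dwell = np.arange(td) * dwell

print(t_linspace[1] - t_linspace[0])  # spacing of the linspace axis
print(t_dwell[1] - t_dwell[0])        # spacing of the fixed-interval axis
print(t_linspace[-1], t_dwell[-1])    # last time point of each axis
```

The two axes differ only slightly for 8192 points, but the difference matters as soon as the spacing is used to build a frequency axis.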

Now we create an array of the same length as the time scale and construct our window function on it.

We'll use a line broadening factor of 0.3.

In [6]:
lb = 0.3

In [26]:
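# exponential window function
# note: built on a normalized axis x in [0, 1] and computed as exp(-x/lb),
# which is not the same expression as exp(-lb*t) in the formula above
x = np.linspace(0, 1, td)
exp_win = np.exp(-x / lb)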

In [27]:
exp_win

array([1.        , 0.99959313, 0.99918643, ..., 0.03570304, 0.03568851,
       0.03567399])

In [75]:
fig_exp = plt.bkplot(x=t, y=exp_win)
show(fig_exp)

In [76]:
fid1 = exp_win * fid

In [77]:
fig_exp.line(x=t, y=fid1.real, color='slateblue')

# add original fid for comparison
fig_exp.line(x=t, y=fid.real, color='firebrick')
show(fig_exp)
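
The window cell above builds the window on a normalized axis and divides by `lb`. For comparison, here is a minimal sketch that evaluates the formula exactly as written, $e^{-lb \cdot t}$, on the reconstructed time axis. The synthetic `demo_fid` (a single decaying oscillation) is a hypothetical stand-in for the saved FID, used only so the example runs on its own.

```python
import numpy as np

aq, td, lb = 8.0, 8192, 0.3
t = np.linspace(0.0, aq, td)

# Hypothetical stand-in FID: one complex oscillation with a slow decay.
demo_fid = np.exp(2j * np.pi * 10.0 * t) * np.exp(-t / 4.0)

# Window evaluated exactly as in the formula: f(t) = exp(-lb * t).
window = np.exp(-lb * t)

# Apodization is a point-by-point multiplication of the FID by the window.
demo_apodized = window * demo_fid

print(window[0], window[-1])
print(abs(demo_fid[-1]), abs(demo_apodized[-1]))
```

The first point is left untouched because the window starts at 1, while the tail of the FID is damped more strongly than before.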

### FFT
We'll use NumPy's FFT algorithm, wrapped in a local function for ease of use, so we only need to pass in the data to be transformed.

In [70]:
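def fft(data):
    # transform along the last axis, keep the input dtype,
    # then shift the zero-frequency component to the center
    return np.fft.fftshift(np.fft.fft(data, axis=-1).astype(data.dtype), -1)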

In [71]:
ft1 = fft(fid1)

In [72]:
ft1.shape

(8192,)

In [78]:
fig_ft1 = plt.bkplot(x=range(td), y=ft1.real)

In [79]:
show(fig_ft1)
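
The plot above uses the point index as its x axis. A frequency axis can be built with `np.fft.fftfreq` and shifted the same way as the spectrum. The sketch below is a minimal illustration under stated assumptions: the sampling interval is taken as `aq / td`, `aq` is taken to be in seconds (so the axis is in Hz), and `demo_fid` with its frequency `f0` is a hypothetical test signal rather than the saved FID.

```python
import numpy as np

aq, td = 8.0, 8192
dwell = aq / td                # assumed sampling interval
t = np.arange(td) * dwell

# Hypothetical test signal: one decaying complex oscillation at f0.
f0 = 50.0
demo_fid = np.exp(2j * np.pi * f0 * t) * np.exp(-t)

# Same transform-then-shift pattern as the fft() wrapper above.
spectrum = np.fft.fftshift(np.fft.fft(demo_fid))

# Matching frequency axis, shifted so that zero frequency sits in the center.
freq = np.fft.fftshift(np.fft.fftfreq(td, d=dwell))

# The strongest point of the magnitude spectrum should sit close to f0.
peak_freq = freq[np.argmax(np.abs(spectrum))]
print(peak_freq)
```

Passing `freq` instead of `range(td)` as the x values would label the spectrum in frequency units rather than point indices.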

We have reached the limit of what synthetic data can show us: it is too perfect. For the phase and baseline correction steps we need imperfect data.

Go [next](04_phase_and_baseline_corrections.ipynb) or go [home](00_introduction.ipynb); there's no going [back](02_nicer_plotting.ipynb).