
# Introduction to Jscatter

#### Jscatter aims at the treatment of experimental data and models


* Reading and analyzing experimental data with associated attributes such as temperature, wavevector, comment, ...
* **Multidimensional fitting** taking attributes into account.
* Providing useful models for **neutron and X-ray scattering** such as form factors, structure factors
  and dynamic models (quasi-elastic neutron scattering), among other topics.
* Simplified plotting with paper-ready quality (preferably in xmgrace).
* Easy model building for non-programmers.
* Python scripts to document data evaluation and modelling.

For more details see the 
[Documentation](https://jscatter.readthedocs.io/en/latest/index.html), 
[Beginners Guide](https://jscatter.readthedocs.io/en/latest/BeginnersGuide.html) or
[Examples](https://jscatter.readthedocs.io/en/latest/examples.html)

Included models are documented at 
[formfactor](https://jscatter.readthedocs.io/en/latest/formfactor.html), 
[structurefactor](https://jscatter.readthedocs.io/en/latest/structurefactor.html) or
[dynamic](https://jscatter.readthedocs.io/en/latest/dynamic.html)

This notebook gives a short introduction for beginners. It lets you play with some examples and explore the possibilities without a full installation.

A typical example for data analysis is given in [Jscatter_DataAnalysis.ipynb](Jscatter_DataAnalysis.ipynb).

### Content
- Demonstration of 1D data in a **dataArray**, followed by fitting a parabola
- Demonstration of multiple 1D data in a **dataList**
- Demonstration of multiple 1D data in a **dataList** using attributes from read data
- 2D fit of data on an X,Z grid with Y values


![Jscatter](https://jscatter.readthedocs.io/en/latest/_images/Jscatter.jpeg "Jscatter")

#### Prerequisite 
Install Jscatter and its prerequisites on a read-only Jupyter server.

For installing on Linux, Mac or Windows visit the [installation instructions](https://jscatter.readthedocs.io/en/latest/Installation.html).

In [None]:
import sys
# Install jscatter (optional: "==1.1.0" forces a specific version)
!{sys.executable} -m pip install jscatter          
%matplotlib notebook

import jscatter as js
import numpy as np
js.usempl(True)   # force matplotlib, not needed in Linux

## dataArray with 1D data and later fitting 

**dataArray** is a container for matrix-like data, similar to a spreadsheet (a numpy ndarray). In addition to the spreadsheet entries, it carries attributes describing metadata, such as the temperature of the measurement, which is accessible as data.temperature.


#### Load example data into dataArray, add attributes and inspect it
Additional options allow tuning how data files are read.

In [None]:
data=js.dA(js.examples.datapath+'/exampledata0.dat')
data.pressure=200
data.q=0.96
data.comment=['Its cold here as _2 K']

The attribute temperature is found in the ASCII file as the line ``temperature -20`` and is added automatically. Attributes are identified from lines consisting of a name followed by a number (and possibly more).

In [None]:
data

In [None]:
data.X

In [None]:
data.Y

#### Inspect attributes and extract one from comment

In [None]:
data.attr    # get list of attributes

In [None]:
data.showattr()    # show attributes with content

In [None]:
data.pressure    # This may be used for model calculations.

The comment holds everything from the ASCII file that is not identified as data or as a named attribute. We can extract new attributes from the comments.

In [None]:
data.comment    

In [None]:
data.Temp=float(data.comment[0].split('_')[1].split()[0])

In [None]:
data.Temp
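
The split chain used above can be checked step by step on a plain string, without Jscatter. This is a minimal sketch in pure Python; the names `comment`, `after_marker`, `first_token` and `temp` are illustrative only.

```python
# Same comment string as assigned to data.comment above
comment = ['Its cold here as _2 K']

after_marker = comment[0].split('_')[1]   # text after the underscore: '2 K'
first_token = after_marker.split()[0]     # first whitespace-separated token: '2'
temp = float(first_token)                 # convert the token to a number
print(temp)
```

The approach relies on the underscore acting as a marker in front of the value, so it only works for comments written with that convention.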

#### Manipulate data

Calculations work with vectorized data, so data.Y is a column in the dataArray.
Numpy functions like np.sin or np.exp can deal with this and speed up computations.
Matrix operations are also feasible.

Which column serves as .Y can be changed with setColumnIndex(...).
Direct indexing works in the usual way. A copy also includes the attributes.

In [None]:
data0=data.copy()                # copy with attributes
data0.Y = data0.Y*data0.q**2     # inplace changes
data0[1] = data0[1]* data0.q**2  # same by direct indexing 
data0[1,::2]=0                   # set every second value of column 1 to zero

# create new dataArray with additional column
data2 = data0.addColumn(n=1,values=data[0]**2*np.sin(data[1]))           

#### Slicing, indexing, cut, boolean indexing

In [None]:
data4 = data[:,[1,2,3,4]]               # take index 1,2,3,4
data5 = data[:,3:-3:2]                  # cut first and last 3 and take every second
data6 = data[:,data.X>0]                # take only positive .X (Boolean or "mask" index arrays from numpy)
data6[:,data6.X>1] = data6[:,data6.X>1]*2
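
The indexing above follows numpy conventions. The following is a minimal sketch on a plain ndarray, assuming a layout with one row per data column (row 0 in the role of .X, row 1 in the role of .Y). The values are made up for illustration and no Jscatter object is involved.

```python
import numpy as np

# Row 0 plays the role of .X, row 1 of .Y (illustrative values)
arr = np.array([[-1.0, 0.0, 0.5, 1.5, 2.0],
                [10.0, 20.0, 30.0, 40.0, 50.0]])

mask = arr[0] > 0                 # boolean mask over the points
positive = arr[:, mask]           # keep only points with X > 0 (a copy)
every_second = arr[:, ::2]        # every second point (a view)

big = positive[0] > 1             # mask within the reduced array
positive[:, big] = positive[:, big] * 2   # scale all rows of the selected points
print(positive)
```

Boolean indexing returns a copy, so the changes to `positive` leave `arr` untouched, while a plain slice such as `every_second` shares memory with `arr`.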

## How to plot

In [None]:
# p=js.grace() # on Linux using Xmgrace
p=js.mplot()   # empty plot

#### Using dataArray attributes .X .Y .eY  
Needed attributes are taken automatically and used for plotting.

In [None]:
p.Plot(data,le='test $q 1/nm')   # use .X,.Y,.eY automatically

#### Explicitly given X, Y and eY components. 
The original matplotlib functionality is accessible through lowercase commands (e.g. p.title(...) ).

In [None]:
p.Plot(data.X ,np.abs(data.Y*data.X),np.ones_like(data.X)*20,le='more')
p.Plot(data.X,10*data.X**2,symbol=0,line=1,legend='xxxxx')

Pretty Up

In [None]:
p.Yaxis(label='test')
p.Title(r'This is a nonsense plot with $\int_0^\inf\xi \partial\xi$')
p.Legend()

## How to fit
#### Define model 
A model takes a number of parameters as input, one of which is typically an array of X values. The return value should be the model values Y or a **dataArray** with ``.Y``.
Here ``q`` receives the X values as a numpy array.

In [None]:
def parabola(q,a,b,c,d):
   return d*(q-a)**2+b*q+c

#### Fit it defining the model, free parameters, fixed parameters and map model names to data names
#### Use free *d* to improve fit
('X' is always the x axis.) We get some output that documents the fit progress and, if successful, the final result.

Additional options define conditions, fit methods and more.

In [None]:
data.fit(parabola ,freepar={'a':2,'b':4,'d':1}, fixpar={'c':-20}, mapNames={'q':'X'})
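
To make the idea of a fit concrete, the sketch below computes the quantity a least-squares fit typically minimizes: the sum of squared, error-weighted residuals, here divided by the degrees of freedom. It is written under assumptions: the helper `reduced_chi2` is hypothetical and not part of Jscatter, the data are synthetic, and Jscatter's own definition and optimizer may differ in detail.

```python
import numpy as np

def parabola(q, a, b, c, d):
    return d * (q - a) ** 2 + b * q + c

def reduced_chi2(x, y, ey, model, n_free, **params):
    """Hypothetical helper: weighted squared residuals per degree of freedom."""
    residuals = (y - model(x, **params)) / ey
    return np.sum(residuals ** 2) / (len(x) - n_free)

# Synthetic data generated from known parameters (not the example file)
rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 50)
ey = np.full_like(x, 0.5)
y = parabola(x, a=2, b=4, c=-20, d=1) + ey * rng.standard_normal(x.size)

# Three free parameters (a, b, d), c is fixed as in the fit above
good = reduced_chi2(x, y, ey, parabola, n_free=3, a=2, b=4, c=-20, d=1)
bad = reduced_chi2(x, y, ey, parabola, n_free=3, a=6, b=4, c=-20, d=1)
print(good, bad)
```

Parameters close to the ones that generated the data give a value of order one, while a shifted `a` gives a much larger value. A fit routine searches for the free parameters that make this number as small as possible.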

#### Show an errPlot with the result. 
Repeating the .fit command will always plot to the same errPlot. The errPlot can be created (empty) before the fit with .makeErrPlot to observe the progress graphically (only in xmgrace on Linux, as matplotlib is too slow).

We may simulate a changed parameter by setting it, as in ``data.showlastErrPlot(a=6)``.

In [None]:
data.showlastErrPlot()

The fit result is accessible in the .lastfit attribute as a dataArray and should be saved as the fit result.

1-$\sigma$ errors of free parameters are accessible under the parameter name with the suffix `_err` (e.g. `data.a_err`). Additionally, $\chi^2$ and the covariance matrix are included as attributes.

In [None]:
data.lastfit

In [None]:
data.a

In [None]:
data.a_err

## dataList of multiple 1D data

**dataList** is a container for a list of dataArrays with variable size.

#### Create some sinusoidal data with changing amplitude and phase

In [None]:
import jscatter as js
import numpy as np

x = np.r_[0:10:0.1]
data = js.dL()
ef = 0.1  # increase this to increase error bars of final result
for ff in [0.001, 0.4, 0.8, 1.2, 1.6]:
    data.append(np.c_[x, (1.234 + ff) * np.sin(x + ff) + ef * ff * np.random.randn(len(x)), x * 0 + ef * ff].T)
    data[-1].B = 0.2 * ff / 2  # add attributes

#### Sinusoidal model

In [None]:
def sinus(x, A, a, B, p):
    return A * np.sin(a * x + p) + B

#### Fit with a common parameter *p* for all data (``float`` start value), obviously wrong result

In [None]:
data.fit(sinus, {'a': 1.2, 'p': 0, 'A': 1.2}, {}, {'x': 'X'})
data.showlastErrPlot()  # show fit
data.errPlotTitle('Fit Sine with attribute and common fit parameter')

#### Improve fit using independent `p`, `A` and `B` 
Independent fit parameters are indicated by a list of start values (e.g. 'B': [0, 0.1]). 
Mixing common and independent parameters is allowed. The `[]` mark the list; the last start value in the list is repeated to provide a value for all remaining dataArrays.

The same convention applies to fixed parameters with different values.
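
The repetition rule described above can be pictured with a small pure-Python sketch. The helper `expand_start_values` is hypothetical and not part of Jscatter; it only illustrates the rule as stated in the text, for the five dataArrays created earlier.

```python
def expand_start_values(value, n_datasets):
    """Hypothetical helper, not part of Jscatter.

    A scalar stays a single common value; a list is padded by repeating
    its last entry until there is one start value per dataset.
    """
    if not isinstance(value, list):
        return value                      # common parameter
    missing = n_datasets - len(value)
    return value + [value[-1]] * max(missing, 0)

start = {'a': 1.2, 'p': [0], 'B': [0, 0.1], 'A': [1]}
expanded = {name: expand_start_values(v, 5) for name, v in start.items()}
print(expanded)
```

In this picture `a` remains one value shared by all data, while `p`, `B` and `A` each get five start values.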

In [None]:
data.fit(sinus, {'a': 1.2, 'p': [0], 'B': [0, 0.1], 'A': [1]}, {}, {'x': 'X'})

Change title.

In [None]:
data.errPlotTitle('Fit Sine with attribute and non common fit parameter')

## dataList of multiple 1D data using attributes from read data

#### Here intermediate scattering function for 16 wavevectors 
The data correspond to the measurement of a protein with translational and rotational diffusion, as measured by neutron spin echo spectroscopy.

Each dataArray has an attribute q for the wavevector that is automatically used in the diffusion model.

In [None]:
i5=js.dL(js.examples.datapath+'/iqt_1hho.dat')
print(i5)

#### Look at attribute q as wavevector 

In [None]:
print(i5.q)

#### Define Model for fitting
Use numpy functions (here np.exp), as these operate directly on numpy arrays.

Linear algebra such as matrix multiplication is also done efficiently.

In [None]:
def diffusion(A,D,t,elastic,wavevector=0):
    return A*np.exp(-wavevector**2*D*t)+elastic
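
Because the model uses np.exp, it can be evaluated for a whole array of times in one call. The sketch below does this for two wavevectors; the parameter values are made up for illustration and are not taken from the example data.

```python
import numpy as np

def diffusion(A, D, t, elastic, wavevector=0):
    return A * np.exp(-wavevector**2 * D * t) + elastic

t = np.linspace(0, 100, 6)            # times, illustrative values
slow = diffusion(A=1, D=0.05, t=t, elastic=0, wavevector=0.3)
fast = diffusion(A=1, D=0.05, t=t, elastic=0, wavevector=1.0)
print(slow)
print(fast)
```

Both curves start at `A + elastic` for `t = 0`, and the curve for the larger wavevector decays faster, as expected from the factor `wavevector**2` in the exponent.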

#### makeErrPlot: see progress of intermediate steps with residuals 
(updated every 2 seconds if used on your personal Linux computer with xmgrace; matplotlib is too slow)

Change lin/log scale by pressing k,l for x,y axis.

In [None]:
i5.makeErrPlot(title='diffusion model residual plot')

#### Fit it
We only need to define that the 'wavevector' is named 'q' in the data. The values are taken from the data attribute as found.

In [None]:
i5.fit(model=diffusion,                                 # the fit function
       freepar={'D':[0.2],'A':1},                       # start values; [..] indicate independent fit parameter
       fixpar={'elastic':0.0},                          # fixed parameters, a single value indicates a common parameter
       mapNames= {'t':'X','wavevector':'q'},            # map names of the model to names of data attributes
       condition=lambda a: (a.X>0.01) & (a.Y>0.01)  )   # a condition to include only specific values
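
The condition combines two elementwise comparisons into a boolean mask. A minimal sketch with plain numpy arrays standing in for `a.X` and `a.Y` (the values are invented, and how Jscatter applies the mask internally is not shown here):

```python
import numpy as np

X = np.array([0.0, 0.005, 0.5, 1.0, 2.0])
Y = np.array([1.0, 0.98, 0.6, 0.005, -0.02])

# Elementwise AND; the parentheses are required because & binds
# more tightly than the comparison operators
keep = (X > 0.01) & (Y > 0.01)
print(keep)
print(X[keep], Y[keep])
```

Only points that satisfy both comparisons remain, which is how very short times and values near or below zero can be excluded from a fit.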

Inspect resulting parameters and 1-$\sigma$ errors.

In [None]:
print(i5.D)
print(i5.D_err)

#### Plot the main result from the fit.

In [None]:
p=js.mplot()
p.Plot(i5.q,i5.D,i5.D_err,symbol=[1,1,3],le='trans. + rot. diffusion')

Pretty Up

In [None]:
p.Yaxis(label=r'$D\; /\; nm^2/ns$')
p.Xaxis(label=r'$Q\; /\; nm^{-1}$')
p.Plot([i5.q.min,i5.q.max],[i5.D[:4].mean()]*2 ,li=[2,3,1],sy=0, le='trans. diffusion')
p.Legend()

#### Last fit result in .lastfit  with errors and covariance matrix
Save it with i5.lastfit.save('mydata_fitdiffusionmodel.dat')

In [None]:
i5.lastfit

#### Simulate with changed parameters
We can also simulate in the above errPlot with ``i5.showlastErrPlot(D=0.1)``.

In [None]:
i5.modelValues(D=0.1)

## 2D fit of data on an X,Z grid with Y values 

Just as 1D data use the independent variable .X with the dependent variable .Y (a measurement or the output of a model), 2D and 3D data use the independent variables .X, .Z, .W and keep .Y as the name of the dependent variable. 

For fitting, the data need to have the X, Z, W and Y columns assigned.

In this example we create synthetic data.


#### We create synthetic 2D data with X,Z axes and Y values as Y=f(X,Z)

In [None]:
# This is ONE way to make a grid. For fitting it can be unordered, non-gridded X,Z data
x, z = np.mgrid[-5:5:0.25, -5:5:0.25]
xyz = js.dA(np.c_[x.flatten(), z.flatten(), 0.3 * np.sin(x * z / np.pi).flatten() + 0.01 * np.random.randn(
    len(x.flatten())), 0.01 * np.ones_like(x).flatten()].T)
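
To see what the grid construction produces, the sketch below repeats it with plain numpy and prints the shapes. The random noise is left out and no dataArray is created; the name `columns` is illustrative only.

```python
import numpy as np

x, z = np.mgrid[-5:5:0.25, -5:5:0.25]   # two 2D coordinate arrays
print(x.shape, z.shape)

y = 0.3 * np.sin(x * z / np.pi)          # function values on the grid

# Flatten each array and stack them as rows: X, Z, Y, eY
columns = np.c_[x.flatten(), z.flatten(), y.flatten(),
                0.01 * np.ones_like(x).flatten()].T
print(columns.shape)
```

Each grid point becomes one entry along the second axis, so the grid structure is no longer needed once the points are stored as columns.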

#### Set the columns where X,Z coordinates, Y values and eY errors are found

In [None]:
xyz.setColumnIndex(ix=0, iz=1, iy=2, iey=3)

#### Define model and fit it

In [None]:
def ff(x, z, a, b):
    return a * np.sin(b * x * z)


xyz.fit(model=ff, freepar={'a': 1, 'b': 1 / 3.}, fixpar={}, mapNames={'x': 'X', 'z': 'Z'})

#### Show in 2D plot

In [None]:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_title('2D Sinusoidal fit')
# plot data as points
ax.scatter(xyz.X, xyz.Z, xyz.Y)
# plot fit as contour lines
ax.tricontour(xyz.lastfit.X, xyz.lastfit.Z, xyz.lastfit.Y, cmap=cm.coolwarm, antialiased=False)
plt.show(block=False)

## For more examples please visit the Jscatter documentation.