Skip to content

Commit

Permalink
Merge pull request #2565 from SasView/customFitModels
Browse files Browse the repository at this point in the history
Custom fit models
  • Loading branch information
butlerpd committed May 11, 2024
2 parents eed82cb + 3a7d849 commit d263a86
Show file tree
Hide file tree
Showing 11 changed files with 2,080 additions and 1,592 deletions.
265 changes: 139 additions & 126 deletions src/sas/qtgui/Calculators/GenericScatteringCalculator.py

Large diffs are not rendered by default.

3,020 changes: 1,575 additions & 1,445 deletions src/sas/qtgui/Calculators/UI/GenericScatteringCalculator.ui

Large diffs are not rendered by default.

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
57 changes: 57 additions & 0 deletions src/sas/qtgui/Calculators/media/gsc_ex_customModel_data.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
.. gsc_ex_customModel_data.rst
.. _gsc_ex_customModel_data:

Example: Fit the experimental data using the calculated $P(Q)$ from a PDB file
==================================

(Pictures in this document were generated using SasView 6.0.0.) To calculate $P(Q)$ from a PDB file, use the Generic Scattering Calculator in the Tools menu. In this example, a PDB file for apoferritin was downloaded from the PDB data bank (https://www.rcsb.org/structure/6z6u). The custom model function, custom_apoferritinLong, was obtained from the scattering calculator using the Debye full avg. w/ $\beta(Q)$ option after loading the apoferritin PDB file (see print-screen below). Note that the q-range and the number of data points can be customized. The Export Model box should be checked to enable a file name for the custom plugin model that will be produced (custom_apoferritinLong). Clicking "Compute" launches the calculation of $P(Q)$ and $\beta(Q)$, and generate the file into the plugin model directory of the SasView. Once the computation is finished, the plugin model is ready to use to fit scattering data.

.. figure:: gsc_ex_customModel_data_Fig3.jpg

To use the calculated custom apoferritinLong model in Fit panel, one selects "Plugin Models" in the Category box. The custom model should already exist in the SasView directory and can be found in the drop down menu of the "Model name". This custom model, custom_apoferritinLong, returns both the normalized form factor, $\tilde{P}(Q)$, and $\beta(Q)$ caculated using the PDB file. But $\beta(Q)$ is only used if one needs to fit the data with the inter-particle structure factor, $S(Q)$, with the static decoupling approximation.

Warning: when generating the custom plugin model using the Generic Scattering Calculator, please be careful about the following two issues before using it:

1) There should be $enough$ number of data points included in the plugin model. Because the plugin model calculates the theoretical $I(Q)$ by interpolating the data during the fitting, the more data points generated by the Generic Scattering Calculator, the more accurate the theoretical curves is.

2) Because each instrument has a resolution function for each $Q$ value, the theoretical curve needs to be calculated by considering the resolution function. To make the fitting function working properly, the $Q$ range calculated for the plugin model in the Generic Scattering Calculator should be larger than the $Q$ range of the experimental data. If the $Q$ range of theoretical data is too close to (or smaller than) that of the experimental data, the calculated theoretical values may all return as "NaN". When this happens, the plugin model will not function properly to fit experimental data.

The following example is to demonstrate how to fit the data with the calculated form factor.

.. figure:: gsc_ex_customModel_data_Fig1.jpg

The scattering pattern, $I(Q)$, is calculated as

.. math::
I(Q) = \frac{scale}{V}(SLD - SLD_{solvent})^2V_{protein} \tilde{P}(Q\alpha)S_{eff}(Q) + background
$SLD$ is the scattering length density of the protein (or particle). And $SLD\_solvent$ is the scattering length density of solvent.

$V_{protein}$ is the protein volume. If the scattering length density and protein volume are assigned with correct values, $scale$ is the volume fraction of the protein ( or particle).

$\alpha$ is the swelling factor. In general, $\alpha$ should be one (or close to one). The parameter is introduced just in case that there is a slight swelling of the particle.

$S_{eff}(Q)$ is the effective structure factor with

.. math::
S_{eff}(Q) = 1 + \beta(Q\alpha)(S(Q)+1)
And $S(Q)$ is the interparticule structure factor.

For dilute solutions, it is simplified as

.. math::
I(Q) = \frac{scale}{V}(SLD - SLD_{solvent})^2V_{protein} \tilde{P}(Q\alpha) + background
The following figure shows the comparison between one experimental apoferritin protein SANS data of a dilute sample with the calculated $I(Q)$ using this model.

.. figure:: gsc_ex_customModel_data_Fig2.jpg

If one needs to fit concentrated protein solutions, an appropriate structure factor model needs to be chosen. If a protein is anisotropic, one also needs to use the static decoupling approximation. $\beta(Q)$ will be automatially used during the fitting. One fitting parameter to calculate $S(Q)$ using the models available in the SasView is the effective radius that is called radius\_effective in the SasView. One needs to choose how to correlate the effective radius with the size of a protein. Note that the effective radius is related with the interaciton between porteins, and could be different from the radius of a protein/particle. It is therefore ok, and sometimes recommended, to use unconstrained method for the "radius\_effective_mode" since there is no fitting prameter related with the protein size in this custom fitting model. If a protein is nearly spherical such as apoferritin, there is no need to use the static decoupling approximation.


*Document History*

| 2023-10-30 Yun Liu
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
45 changes: 25 additions & 20 deletions src/sas/qtgui/Calculators/media/sas_calculator_help.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
.. _SANS_Calculator_Tool:

Generic SANS Calculator Tool
Generic SAS Calculator Tool
============================

Description
Expand Down Expand Up @@ -41,7 +41,7 @@ data), in a variety of shapes, such as tetrahedra, cubes or hexahedra.

The scattering length density (SLD) is assumed uniform for each pixel or
element. Depending on the data format the property is either nuclear (in units
of 10\ :sup:`-6`\ |Ang|:sup:`-2`) (`PDB <PDB Files_>`_ file) or
of |Ang|:sup:`-2`) (`PDB <PDB Files_>`_ file) or
magnetic SLDs (`OMF <OMF Files_>`_ file) or a combination of both
(`SLD <SLD Files_>`_ and `VTK <VTK Files_>`_ files). For magnetic neutron
scattering, the :ref:`magnetism` documentation gives further details and
Expand All @@ -59,9 +59,9 @@ discretized with $N$ 3-dimensional rectangular pixels.
The elastic scattering intensity is defined as

.. math::
I(\mathbf{Q}) = \frac{1}{V}\left\lvert\sum_j^Nv_j\beta_j\exp(i\mathbf{Q}\cdot\mathbf{r_j})\right\rvert^2
I(\mathbf{Q}) = \frac{1}{V}\left\lvert\sum_j^Nv_j\rho_j\exp(i\mathbf{Q}\cdot\mathbf{r_j})\right\rvert^2
where $\beta_j$ and $\mathbf{r}_j$ are the scattering length density and
where $\rho_j$ and $\mathbf{r}_j$ are the scattering length density and
the position of the $j^\text{th}$ pixel respectively.

The total volume $V_s$ of structures different than the homogenous media is
Expand All @@ -71,9 +71,9 @@ equal to
V_s = \sum_j^N v_j
for $\beta_j \ne 0$ where $v_j$ is the volume of the $j^\text{th}$
for $\rho_j \ne 0$ where $v_j$ is the volume of the $j^\text{th}$
pixel or natural atomic volume (for .pdb). For atomic structures
$v_j \beta_j \equiv b_j$ is the scattering length of the $j^\text{th}$ atom and
$v_j \rho_j \equiv b_j$ is the scattering length of the $j^\text{th}$ atom and
the natural atomic volume is given by:

$\frac{\text{atomic mass}}{\text{natural molar density}\times\text{Avogadro number}}$
Expand All @@ -86,11 +86,11 @@ For non-magnetic, grid-type data the 1D orientationally averaged scatting intens
can also be calculated using the *Debye full average* option which uses the Debye formula:

.. math::
I(\left\lvert\mathbf{Q}\right\rvert) = \frac{1}{V}\sum_j^N v_j\beta_j \sum_k^N v_k\beta_k
I(\left\lvert\mathbf{Q}\right\rvert) = \frac{1}{V}\sum_j^N v_j\rho_j \sum_k^N v_k\rho_k
\frac{\sin\left(\left\lvert\mathbf{Q}\right\rvert\left\lvert\mathbf{r_j}-\mathbf{r_k}\right\rvert\right)}
{\left\lvert\mathbf{Q}\right\rvert\left\lvert\mathbf{r_j}-\mathbf{r_k}\right\rvert}
*NOTE:* $\beta_j$ *displayed in the GUI may be incorrect (input
*NOTE:* $\rho_j$ *displayed in the GUI may be incorrect (input
parameter* solvent_SLD *) but this will not affect the scattering computation if
the correction of the total volume V is made.*

Expand All @@ -104,15 +104,15 @@ For example this cube is formed of five finite elements:
:align: center

Each element has an associated scattering length
density ($\beta_j$) for the occupied space $V_j$ and the elastic scattering
density ($\rho_j$) for the occupied space $V_j$ and the elastic scattering
intensity is calculated as

.. math::
I(\mathbf{Q}) = \frac{1}{V}\left\lvert\sum_j^N\beta_j\iiint\limits_{V_j}\exp(i\mathbf{Q}\cdot\mathbf{r_j})\text{d}V\right\rvert^2
I(\mathbf{Q}) = \frac{1}{V}\left\lvert\sum_j^N\rho_j\iiint\limits_{V_j}\exp(i\mathbf{Q}\cdot\mathbf{r_j})\text{d}V\right\rvert^2
Note that the Fourier transform is calculated over each element - allowing
regions of space with little variation in $\beta$ to have larger finite
regions of space with little variation in $\rho$ to have larger finite
elements, and regions of interest to have much smaller finite elements, and
hence more detail.

Expand Down Expand Up @@ -175,7 +175,7 @@ How to use the Tool
-------------------
Upon loading the calculator we are shown the following interface:

.. figure:: gen_gui_help.png
.. figure:: GSC_Oct282023_GUI_Index2.jpg
:align: center

..
Expand Down Expand Up @@ -229,6 +229,7 @@ Inputs
In some circumstances these textboxes will be highlighted orange, as a
warning that with the values chosen numerical artefacts may appear due to
the Nyquist criterion, or simulation box size.
* When calculating 1D data, Q values are evenly spaced in the log scale if "Log Spacing" box is checked.

Information Panel
^^^^^^^^^^^^^^^^^
Expand Down Expand Up @@ -259,14 +260,17 @@ Information Panel
and the mousewheel used to zoom in and out.
19) This choice appears only for grid type data and without magnetic SLD.
This tool allows to either compute the fully oriented 2D scattering pattern,
or calculating the 1D orientational averaged intensity $I(Q)$ by the Debye
equation.
or calculating the 1D orientational averaged intensity $I(Q)$ by the Debye equation. One can also choose a compuation option, Debye full avg. w/ $\beta(Q)$, to calculate 1D scattering pattern together with $\beta(Q)$. $\beta(Q)$ is needed when fitting scattering patterns of concentrated solutions using the inter-particle structure factor, $S(Q)$, with the static decoupling approximation.
20) Starts the computation of the scattering pattern.
21) Reset GUI to the initial state.


As an example :ref:`here <gsc_ex_default_data>` you can find a simple demonstration of
the functionality of the Generic scattering calculator using the default
22) If a PDB file is loaded, the radius of gyration is calculated and displayed. "Rg-MASS" is the radius of gyration based on the mass of all atoms in a molecule. "RG-SLD" is the radius of gyration based on the scattering length of all atoms.
23) If the option, Debye full avg. w/ $\beta(Q)$, is chosen, one has the option to check the box "Export Model". Once checked, one can input a file name in the box below. During the computation, the program then exports the calculated normalized form factor, $P(Q)$, and $\beta(Q)$ into this file that automatically become a model in the "Plugin Models". The model name is the same as the file name given in the blox below "Export Model".

One example is given here ( Click :ref:`here <gsc_ex_customModel_data>` ) to illustrate how to calculate $P(Q)$ and $\beta(Q)$ using a PDB file of a protein. These are 1D functions after averaging over all orientiations of proteins. The program can generate a custom model function, which can be used to fit the 1D small angle scattering data.


One other example ( Click :ref:`here <gsc_ex_default_data>` ) is a simple demonstration of
the functionality of the Generic scattering calculator to calculate the 2D scattering pattern using the default
starting values with no files loaded.

After computation the result will appear in the
Expand Down Expand Up @@ -370,9 +374,9 @@ the SLD at the centre of each element. This weighted average is given by:

.. math::
\bar{\beta} = \frac{\sum\limits_j^n \beta_j r_j^{\prime -2}}{\sum\limits_j^nr_j^{\prime -2}}
\bar{\rho} = \frac{\sum\limits_j^n \rho_j r_j^{\prime -2}}{\sum\limits_j^nr_j^{\prime -2}}
Where $\bar{\beta}$ is the estimated SLD for an element and $\beta_j$, $r'_j$
Where $\bar{\rho}$ is the estimated SLD for an element and $\rho_j$, $r'_j$
are the SLDs and distances from the centre of the element of each of the $n$
vertices of the element respectively. $r'_j$ is taken as:

Expand Down Expand Up @@ -638,3 +642,4 @@ References

| 2015-05-01 Steve King
| 2021-09-14 Robert Bourne
| 2023-10-30 Yun Liu
128 changes: 127 additions & 1 deletion src/sas/sascalc/calculator/geni.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,12 @@
import os
import logging
import numpy as np
import periodictable

from typing import Union

from sas.sascalc.calculator.sas_gen import MagSLD, OMF2SLD


try:
if os.environ.get('SAS_NUMBA', '1').lower() in ('1', 'yes', 'true', 't'):
Expand Down Expand Up @@ -475,4 +481,124 @@ def _spin_weights(in_spin, out_spin):
in_spin * (1.0-out_spin) / norm, # ud
in_spin * out_spin / norm, # uu
)
return weight
return weight


def radius_of_gyration(nuc_sl_data: Union[MagSLD, OMF2SLD]) -> tuple[str, str, float]:
"""Calculate parameters related to the radius of gyration using and SLD profile.
:param nuc_sl_data: A scattering length object for a series of atomic points in space
:return: A tuple of the string representation of the radius of gyration, Guinier slope, and Rg as a float.
"""
# Calculate Center of Mass First
c_o_m = center_of_mass(nuc_sl_data)

x = nuc_sl_data.pos_x
y = nuc_sl_data.pos_y
z = nuc_sl_data.pos_z
pix_symbol = nuc_sl_data.pix_symbol
coordinates = np.array([x, y, z]).T
coherent_sls, masses = np.empty(len(pix_symbol)), np.empty(len(pix_symbol))
for i, sym in enumerate(pix_symbol):
atom = periodictable.elements.symbol(sym)
masses[i] = atom.mass
coherent_sls[i] = atom.neutron.b_c
# solvent_slds = atoms.volume() * 10**24 * float(self.txtSolventSLD.text()) * 10**5

# TODO: Implement a scientifically sound method for obtaining protein volume - Current value is a imprecise
# approximation. Until then Solvent SLD does not impact RG - SLD.
# This method only calculates RG of proteins in vacuum. Implementing the RG calcuation in solvent needs
# the input of the solvent volume.
contrast_sls = coherent_sls # femtometer
rsq = np.sum((c_o_m - coordinates)**2, axis=1)

rog_num = np.sum(masses * rsq)
rog_den = np.sum(masses)
guinier_num = np.sum(contrast_sls * rsq)
guinier_den = np.sum(contrast_sls)

if rog_den <= 0: #Should never happen as there are no zero or negative mass atoms
rog_mass = "NaN"
r_g_mass = 0.0
logging.warning("Atomic Mass is zero for all atoms in the system.")
else:
r_g_mass = np.sqrt(rog_num/rog_den)
rog_mass = (str(round(np.sqrt(rog_num/rog_den),1)) + " Å")

#Avoid division by zero - May occur through contrast matching
if guinier_den == 0:
guinier_value = "NaN"
logging.warning("Effective Coherent Scattering Length is zero for all atoms in the system.")
elif (guinier_num/guinier_den) < 0:
guinier_value = (str(round(np.sqrt(-guinier_num/guinier_den), 1)) + " Å")
logging.warning("Radius of Gyration Squared is negative. R(G)^2 is assumed to be |R(G)|* R(G).")
else:
guinier_value = (str(round(np.sqrt(guinier_num/guinier_den), 1)) + " Å")

return rog_mass, guinier_value, r_g_mass # (String, String, Float), float used for plugin model


def center_of_mass(nuc_sl_data: Union[MagSLD, OMF2SLD]) -> list[float]:
"""Calculate Center of Mass(CoM) of provided molecule using an SL profile
:param nuc_sl_data: A coordinate data object (MagSLD or OMF2SLD)
:return: A list of the calculated spatial center of mass, given as cartesian coordinates."""
masses = np.asarray([0.0, 0.0, 0.0])
densities = np.asarray([0.0, 0.0, 0.0])

# Only call periodictable once per element -> minimizes calculation time
coh_b_storage = {}

for i in range(len(nuc_sl_data.pos_x)):
coordinates = np.asarray([float(nuc_sl_data.pos_x[i]), float(nuc_sl_data.pos_y[i]), float(nuc_sl_data.pos_z[i])])

#Coh b - Coherent Scattering Length(fm)
symbol = nuc_sl_data.pix_symbol[i]
coh_b = coh_b_storage.get(symbol, periodictable.elements.symbol(symbol).neutron.b_c)
coh_b_storage[symbol] = coh_b

masses += (coordinates*coh_b)
densities += coh_b

c_o_m = np.divide(masses, densities)

return c_o_m


def create_beta_plot(q_x: np.ndarray, nuc_sl_data: Union[MagSLD, OMF2SLD], form_factor: np.ndarray) -> np.ndarray:
"""Carry out the computation of beta Q using provided & calculated data
:param q_x: The Q values where the beta will be calculated.
:param nuc_sl_data: A coordinate data object (MagSLD or OMF2SLD)
:param form_factor: The form factor calculated prior to applying the beta approximation.
:return: An array of form factor values with the beta approximation applied."""
f_q = f_of_q(q_x, nuc_sl_data)

# Center Of Mass Calculation
data_beta_q = (f_q**2) / form_factor

# Scale Beta Q to 0-1
scaling_factor = data_beta_q[0]
data_beta_q = data_beta_q / scaling_factor

return data_beta_q


def f_of_q(q_x: np.ndarray, nuc_sl_data: Union[MagSLD, OMF2SLD]) -> np.ndarray:
"""Compute the base F(Q) calculation based from the nuclear data.
:param q_x: The Q values where the beta will be calculated.
:param nuc_sl_data: A coordinate data object (MagSLD or OMF2SLD)
:return: An array of form factor data."""
c_o_m = center_of_mass(nuc_sl_data)
r_x = np.subtract(nuc_sl_data.pos_x, c_o_m[0])
r_y = np.subtract(nuc_sl_data.pos_y, c_o_m[1])
r_z = np.subtract(nuc_sl_data.pos_z, c_o_m[2])
magnitude_relative_coordinate = np.sqrt(np.power(r_x, 2) + np.power(r_y, 2) + np.power(r_z, 2))
coh_b = np.asarray([periodictable.elements.symbol(atom).neutron.b_c for atom in nuc_sl_data.pix_symbol])

f_of_q_list = [np.sum(coh_b * np.sinc(q_x[i] * magnitude_relative_coordinate / np.pi)) for i in range(len(q_x))]
f_of_q_list = np.asarray(f_of_q_list) / abs(np.sum(coh_b)) # normalization
return f_of_q_list


Loading

0 comments on commit d263a86

Please sign in to comment.