Notebooks: add short notebook on background computations

OpenSMFS · Nov 22, 2017 · 03e68ee · 03e68ee
1 parent 85783d9
commit 03e68ee
Show file tree

Hide file tree

Showing 2 changed files with 392 additions and 220 deletions.
diff --git a/notebooks/Example - Background estimation.ipynb b/notebooks/Example - Background estimation.ipynb
@@ -0,0 +1,383 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "input_collapsed": false
+   },
+   "source": [
+    "# Example - Background estimation\n",
+    "\n",
+    "*This notebook is part of smFRET burst analysis software [FRETBursts](http://opensmfs.github.io/FRETBursts/).*"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> In this notebook, we show different ways of computing, plotting and exporting background data.\n",
+    "> For a complete tutorial on burst analysis see \n",
+    "> [FRETBursts - us-ALEX smFRET burst analysis](FRETBursts - us-ALEX smFRET burst analysis.ipynb)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "from fretbursts import *"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "sns = init_notebook(apionly=True)\n",
+    "print('seaborn version: ', sns.__version__)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Tweak here matplotlib style\n",
+    "import matplotlib as mpl\n",
+    "mpl.rcParams['font.sans-serif'].insert(0, 'Arial')\n",
+    "mpl.rcParams['font.size'] = 12\n",
+    "%config InlineBackend.figure_format = 'retina'"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Retrieve the data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "url = 'http://files.figshare.com/2182601/0023uLRpitc_NTP_20dT_0.5GndCl.hdf5'\n",
+    "download_file(url, save_dir='./data')\n",
+    "full_fname = \"./data/0023uLRpitc_NTP_20dT_0.5GndCl.hdf5\"\n",
+    "\n",
+    "d = loader.photon_hdf5(full_fname)\n",
+    "loader.alex_apply_period(d)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Background estimation\n",
+    "\n",
+    "The first step of the analysis is estimating the background. \n",
+    "The assumption is that the background is a Poisson process and therefore \n",
+    "the corresponding inter photon delays are exponentially distributed. Since the \n",
+    "background can change during the measurement, a new estimation is \n",
+    "computed every `time_s` seconds (this time is called the *background period*).\n",
+    "\n",
+    "The inter-photon delay distribution contains both single-molecule signal and background. \n",
+    "The single-molecule signal produces a inter-photon delays at short time scales, while\n",
+    "the background produces an exponential tail extenting to longer timescales.\n",
+    "Hence, we need a threshold to discriminate between the exponential tail and the single-molecule peak.\n",
+    "\n",
+    "FRETBursts provides several ways to specify the minimum threshold \n",
+    "and different functions to fit the exponential tail.\n",
+    "\n",
+    "The reference documentation is the following:\n",
+    "\n",
+    "- Documentation for [`Data.calc_bg()`](http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.calc_bg)\n",
+    "- Documentation for [`background` (e.g. `bg`) module](http://fretbursts.readthedocs.org/en/latest/background.html)\n",
+    "- Documentation for [`exp_fitting` module](http://fretbursts.readthedocs.org/en/latest/background.html#module-fretbursts.fit.exp_fitting)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Single threshold"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let start with a standard Maximum Likelihood (ML) \n",
+    "background fit with a minimum tail threshold of 500 μs:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.calc_bg(bg.exp_fit, time_s=1000, tail_min_us=500)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can look at how the fit looks with:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dplot(d, hist_bg, show_fit=True);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that the fits are not very good. This is understandable because \n",
+    "we used a single threshold for all the photon streams, each one\n",
+    "having a quite different background."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Multiple thresholds"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To improve the fit, we can try specifying a threshold for each channel.\n",
+    "This method is bit ad-hoc but it may work well when the \n",
+    "thresholds are properly choosen."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.calc_bg(bg.exp_fit, time_s=1000, tail_min_us=(800, 4000, 1500, 1000, 3000))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dplot(d, hist_bg, show_fit=True);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For ALEX measurements, the tuple passed to\n",
+    "`tail_min_us` in order to define the thresholds needs to contain  \n",
+    "5 values corresponding the 5 distinct photon streams \n",
+    "(the all-photon stream plus the 4 base alternation streams).\n",
+    "The order of these 5 values need to match the order of photon streams in\n",
+    "the `Data.ph_streams` attribute:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.ph_streams"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Automatic threshold"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Finally, is possible to let FRETBursts infer the threshold automatically (recommended) with:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.calc_bg(bg.exp_fit, time_s=1000, tail_min_us='auto', F_bg=1.7)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Which results in the following fit plot:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dplot(d, hist_bg, show_fit=True);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Under the hood, this method estimates the threshold \n",
+    "according to this formula:\n",
+    "\n",
+    "    threshold_auto = F_bg / coarse_background_rate\n",
+    "\n",
+    "where `F_bg` is an input argument (by default 1.7)\n",
+    "and `coarse_background_rate` is an initial background estimation\n",
+    "performed with a fixed threshold. This method is concemptually an\n",
+    "iterative method to compute the threshold that is stopped\n",
+    "after the second iteration (this is usually more than enough for\n",
+    "accurate estimates).\n",
+    "\n",
+    "This latter method is the recommended,\n",
+    "since it works well and without user intervention in \n",
+    "a wide range of experimental conditions."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Background timetrace\n",
+    "\n",
+    "It is a good practice to monitor background rates as a function of time.\n",
+    "Here, we compute background in adjacent 30s windows (called *background periods*)\n",
+    "and plot the estimated rates as a function of time."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto', F_bg=1.7)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dplot(d, timetrace_bg);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Getting the background rates\n",
+    "\n",
+    "The background rates are stored in `Data()` attribute\n",
+    "`bg`, a dict with photon streams (`Ph_sel` objects) as key. \n",
+    "Each item in the dict contains a list of fitted background rates \n",
+    "for each channel and period."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.bg"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also get the average background for each channel:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "d.bg_mean"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "bg_data = pd.DataFrame({str(k): v[0] for k, v in d.bg.items()})\n",
+    "bg_data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Burst analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "After background estimation, you are ready to perform burts search.\n",
+    "\n",
+    "If you want more details see\n",
+    "[Burst analysis](FRETBursts - us-ALEX smFRET burst analysis.ipynb#Burst-analysis)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}