
Bayesian Model for competition assay #127

Merged
merged 15 commits into from
Apr 12, 2019

Conversation

sonyahanson
Contributor

First added simple competition model to bindingmodels.py

@sonyahanson
Contributor Author

Added an example notebook for the new competition Bayesian model; it also compares against the very slow general binding model.

@jchodera
Member

jchodera commented Dec 9, 2017

I looked through 4-Analytic-v-General-Bayesian-Model.ipynb but I have no idea what it is showing.

@sonyahanson
Contributor Author

It basically shows that the AnalyticalBindingModel (link below) works, and that it is better and faster than the general binding model.
https://github.com/choderalab/assaytools/blob/Competition_Bayes/AssayTools/bindingmodels.py#L123-L208

Let me know if you want to go through it in person.

@sonyahanson
Contributor Author

Going to add a function in our analytical binding model to prevent this error from happening (in ipynb):

/Users/hansons/anaconda2/envs/python3/lib/python3.6/site-packages/assaytools-0.3.1-py3.6.egg/assaytools/bindingmodels.py:195: RuntimeWarning: invalid value encountered in arccos
  theta = np.arccos((-2*(a**3)+9*a*b-27*c)/(2*np.sqrt((a**2-3*b)**3)))
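This `invalid value encountered in arccos` warning is typical of the arccos argument drifting just outside `[-1, 1]` through floating-point round-off. A common guard (a sketch of the kind of fix described above, not the code actually committed here) is to clip the argument before calling `np.arccos`:

```python
import numpy as np

def safe_arccos_theta(a, b, c):
    """Theta for the trigonometric solution of the binding cubic, with the
    arccos argument clipped to [-1, 1] to absorb floating-point round-off.
    (Illustrative guard, not the committed fix.)"""
    arg = (-2 * a**3 + 9 * a * b - 27 * c) / (2 * np.sqrt((a**2 - 3 * b)**3))
    arg = np.clip(arg, -1.0, 1.0)  # e.g. 1.0000000000000002 -> 1.0
    return np.arccos(arg)
```

The clip is safe because, for physically valid coefficients, the exact argument lies in `[-1, 1]`; only round-off pushes it outside.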

@sonyahanson
Contributor Author

This seems to have worked just fine: no more error, and the DeltaG trace is a bit less crazy:
(image: before-vs-after comparison of the traces; orange is the trace of the DeltaG of our competitor.)

Note that there are still other errors popping up in the notebook:

/Users/hansons/anaconda2/envs/python3/lib/python3.6/site-packages/assaytools-0.3.1-py3.6.egg/assaytools/pymcmodels.py:240: RuntimeWarning: overflow encountered in exp
  scaling[large_indices] = (1 - np.exp(-alpha[large_indices])) / alpha[large_indices]
/Users/hansons/anaconda2/envs/python3/lib/python3.6/site-packages/ipykernel_launcher.py:178: RuntimeWarning: overflow encountered in exp
/Users/hansons/anaconda2/envs/python3/lib/python3.6/site-packages/assaytools-0.3.1-py3.6.egg/assaytools/pymcmodels.py:95: RuntimeWarning: invalid value encountered in true_divide
  mu = np.log(mean**2 / np.sqrt(stddev**2 + mean**2))
/Users/hansons/anaconda2/envs/python3/lib/python3.6/site-packages/assaytools-0.3.1-py3.6.egg/assaytools/pymcmodels.py:72: RuntimeWarning: invalid value encountered in multiply
  tau = 0*mean
/Users/hansons/anaconda2/envs/python3/lib/python3.6/site-packages/assaytools-0.3.1-py3.6.egg/assaytools/pymcmodels.py:86: RuntimeWarning: divide by zero encountered in reciprocal
  tau[small_indices] = (x[small_indices] - x[small_indices]**2/2 + x[small_indices]**3/3 - x[small_indices]**4/4 + x[small_indices]**5/5)**(-1)

does this function need to be vectorised?

I used map to do what I think you are alluding to here, perhaps vectorizing would be better...
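On the "does this function need to be vectorised?" question: one alternative to a Python-level `map` is boolean-mask indexing, which processes the whole array inside numpy and lets each branch use a numerically appropriate formula. A sketch (the cutoff value is illustrative, and this is not the code in this PR):

```python
import numpy as np

def scaling_factor(alpha, cutoff=1e-4):
    """Vectorized (1 - exp(-alpha)) / alpha using boolean masks: a Taylor
    series near zero avoids the 0/0 case, and expm1 keeps precision for
    values just above the cutoff."""
    alpha = np.asarray(alpha, dtype=float)
    out = np.empty_like(alpha)
    small = np.abs(alpha) < cutoff
    # Taylor expansion of (1 - exp(-a))/a about a = 0: 1 - a/2 + a^2/6
    out[small] = 1.0 - alpha[small] / 2.0 + alpha[small]**2 / 6.0
    # 1 - exp(-a) == -expm1(-a), computed without cancellation
    out[~small] = -np.expm1(-alpha[~small]) / alpha[~small]
    return out
```

Mask-based branches like this are usually both faster than `map` and easier to audit for the RuntimeWarnings listed above.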

@sonyahanson
Contributor Author

So right now this PR includes the addition of the Analytic Competition Binding model to bindingmodels.py; to be complete, it should probably also include the appropriate changes to the make_model function in pymcmodels.py.

For now I've just been making my own simplified version of this make_model function in ipynb's for testing.

One reason I have yet to incorporate this into the real pymcmodels.py is that I'm still seeing the errors mentioned in the comment above, and am not sure if they are related to numerical instabilities and can be easily corrected.

Furthermore, I have been playing a bit with simulated data, and the fits seem to get stuck in a way that the real experimental data isn't exhibiting (see stuckness below). The latest commit in this PR is an ipynb example of this attempt at analyzing simulated data. Perhaps this is also related to some numerical instability that can be easily fixed?

(image: DeltaG trace getting stuck during sampling)

@jchodera
Member

Furthermore, I have been playing a bit with simulated data, and the fits seem to get stuck in a way that the real experimental data isn't exhibiting (see stuckness below). The latest commit in this PR is an ipynb example of this attempt at analyzing simulated data. Perhaps this is also related to some numerical instability that can be easily fixed?

I think this is the issue we talked about on Wednesday. Does your simulated data have experimental measurement noise in it? Or is it still getting stuck even with that noise?
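For reference, a minimal sketch of what measurement noise "in the right form" might look like for simulated plate-reader data: additive Gaussian detector noise plus a signal-proportional component. The function name and noise parameters are illustrative, not taken from this PR:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_fluorescence(true_signal, sigma_abs=1.0, sigma_rel=0.02):
    """Add Gaussian measurement noise to a simulated fluorescence signal:
    a constant detector floor (sigma_abs) plus a component proportional
    to the signal (sigma_rel), combined in quadrature."""
    true_signal = np.asarray(true_signal, dtype=float)
    sigma = np.sqrt(sigma_abs**2 + (sigma_rel * true_signal)**2)
    return true_signal + rng.normal(0.0, sigma)
```

If the simulated data were noiseless, the posterior can collapse onto the exact solution and the sampler can appear to "stick", which may be what the trace above is showing.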

@sonyahanson
Contributor Author

It has noise, but it is possible it is not in the right form...

@sonyahanson
Contributor Author

@jchodera did you ever get a chance to look at the numerical instabilities in this model and see if there are any easy fixes?

@jchodera
Member

Not yet, but I hope to dive in on Tuesday!

@sonyahanson
Contributor Author

Note, setting F_PL does not seem to prevent bistability. This is the delG trace when using a prior on F_PL:
(image: delG trace with a prior on F_PL)

@sonyahanson
Contributor Author

So I chatted with @chaya about some potential numerical instabilities in the analytical binding model, and these were the candidates we came up with:

@jchodera do you have any opinions about either of these?

@steven-albanese
Member

np.exp and np.sqrt would be unstable for very big or very small numbers. I'm not that suspicious of anything involving np.exp, but to check np.sqrt we should look at what values we are usually getting for a and b.

For mathematical functions in pymc3, they recommend using theano's built-in functions. Is there something equivalent for pymc, or is that numpy?

@sonyahanson
Contributor Author

@jchodera did you get a chance to take a look at these numerical instabilities? I'll probably take a stab this week if you have any quick suggestions.

@sonyahanson
Contributor Author

Since it has gotten buried in other comments, here's a rehash of the errors that I think are related to the numerical stability of this competition binding model. These are the ones I came across in a trial run today while trying to get some more insight into this, e.g. by looking at other traces besides just delG.

pymcmodels.py:240: RuntimeWarning: overflow encountered in exp
  scaling[large_indices] = (1 - np.exp(-alpha[large_indices])) / alpha[large_indices]
  
ipykernel_launcher.py:126: RuntimeWarning: overflow encountered in exp

pymcmodels.py:72: RuntimeWarning: invalid value encountered in multiply
  tau = 0*mean
  
pymcmodels.py:86: RuntimeWarning: divide by zero encountered in reciprocal
  tau[small_indices] = (x[small_indices] - x[small_indices]**2/2 + x[small_indices]**3/3 - x[small_indices]**4/4 + x[small_indices]**5/5)**(-1)
  
pymcmodels.py:95: RuntimeWarning: invalid value encountered in true_divide
  mu = np.log(mean**2 / np.sqrt(stddev**2 + mean**2))
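The `true_divide` warning on the `mu = np.log(...)` line fires when `mean` reaches zero. A moment-matched lognormal parameterization with a small positive floor on the mean keeps both `mu` and `tau` finite. This is a guard sketch under the assumption that zero means are the trigger; the floor value is illustrative:

```python
import numpy as np

def lognormal_params(mean, stddev, floor=1e-12):
    """Moment-matched lognormal parameters (mu, tau), flooring the mean so
    that log() and the division never see zero. For X ~ LogNormal(mu, 1/tau):
    E[X] = mean and Var[X] = stddev**2 when mean, stddev > 0."""
    mean = np.maximum(np.asarray(mean, dtype=float), floor)
    stddev = np.asarray(stddev, dtype=float)
    mu = np.log(mean**2 / np.sqrt(stddev**2 + mean**2))
    tau = 1.0 / np.log(1.0 + (stddev / mean)**2)
    return mu, tau
```

The `tau = 0*mean` and reciprocal warnings quoted above look like the same zero-mean inputs propagating NaN/Inf downstream, so a single floor at the source may silence several of these at once.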

@jchodera
Member

It appears that, somehow, these quantities are going negative:

    ELC_ex = epsilon_ex*path_length*concentration
    ELC_em = epsilon_em*path_length*concentration

I'm not sure how this can happen, since all of these quantities must be positive.

@jchodera
Member

Specifically, it looks like the concentrations are going negative in the notebook!

@jchodera
Member

They're not just a little negative either:

AssertionError: concentration should not be negative, but are: [ 3.29791690e-13 -1.18127773e+02  1.08564837e-07  7.92493225e-14
  9.20966796e-03 -1.82372674e+01  5.82184201e-10  3.67619973e+01
  1.51951215e-09  2.74082464e-03  2.72987858e-11  0.00000000e+00]

That's -118 Molar!

@jchodera
Member

@sonyahanson : OK, the problem is your binding model:

    221         # Check all concentrations are nonnegative
--> 222         assert np.all(P >= 0)
    223         assert np.all(L >= 0)
    224         assert np.all(PL >= 0)

I've added some assertions to the model that check to make sure the concentrations are all nonnegative, and this is failing. Since you derived this model, can you fix it?

This is the problematic function.

I'd suggest switching to the GeneralBindingModel I wrote.

@jchodera
Member

jchodera commented Mar 25, 2019

Based on our chat today:

@jchodera will

@sonyahanson will

  • Make sure the CompetitionBindingModel code implements the equations---we suspect there is a small error leading to large negative concentrations

@jchodera
Member

@sonyahanson: I think I've sorted out the issues we discussed today!

I've committed some changes that make the standard TwoComponentBindingModel more robust to negative concentrations.

I've also discovered the issue with the warnings from your use of GeneralBindingModel! The issue was that it only accepts log concentrations, but you were feeding it log(0) since some wells had zero ligand concentration in them. For now, I've set these to be the log of 1/1000 the smallest nonzero concentration, but I think I can modify the API for the general binding model to accept and return concentrations instead of log concentrations if it's inconvenient to be careful to feed it valid logarithms.
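The workaround described here (flooring zero concentrations at 1/1000 of the smallest nonzero concentration before taking logs) can be sketched like this. The function name is illustrative, not the committed code:

```python
import numpy as np

def safe_log_concentrations(conc, fraction=1e-3):
    """Replace zeros with `fraction` times the smallest nonzero concentration
    before taking logs, so that log(0) = -inf never reaches the model."""
    conc = np.asarray(conc, dtype=float)
    floor = fraction * conc[conc > 0].min()
    return np.log(np.where(conc > 0, conc, floor))
```

Example: for wells with concentrations `[0.0, 1e-6, 1e-3]`, the zero well is treated as `1e-9` (1/1000 of `1e-6`) before the log.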

I've checked in an updated Jupyter notebook with my changes, but note that I haven't fixed your CompetitiveBindingModel, so that part of the notebook still fails.

I've created an issue for me to implement a simpler API for you:
#141

The GeneralBindingModel is still super slow. I'll have to see how we can speed that up.

@sonyahanson
Contributor Author

Hmmm. Travis is still failing, but for a different reason now...

@jchodera
Member

jchodera commented Mar 26, 2019

Looks like a few things are going on here:

  • linux builds are failing with:
/home/travis/miniconda3/envs/test/lib/python2.7/site-packages/assaytools/bindingmodels.py:94: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  Ltot = np.zeros(Ltot, dtype)
unexpected array size: size=1, arr_size=0, rank=1, effrank=0, arr.nd=24, dims=[ 1 ], arr.dims=[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
  • osx builds are failing with
  File "/Users/travis/miniconda3/envs/test/lib/python2.7/site-packages/assaytools/pymcmodels.py", line 497, in top_complex_fluorescence_model
    [P_i, L_i, PL_i] = TwoComponentBindingModel.equilibrium_concentrations(DeltaG, Ptrue[:], Ltrue[:])
  File "/Users/travis/miniconda3/envs/test/lib/python2.7/site-packages/assaytools/bindingmodels.py", line 93, in equilibrium_concentrations
    Ptot = np.zeros(Ptot, dtype)
TypeError: 'numpy.float64' object cannot be interpreted as an index

I'll try to debug.
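Both tracebacks point at the same pattern: `np.zeros(Ptot, dtype)` interprets its first argument as an array *shape*, so passing a float scalar raises the Python 2.7 TypeError and passing an array triggers the deprecation and array-size errors. If the intent was to coerce scalars to 1-d float arrays, a likely-shaped fix (a guess from the traceback, not the actual commit) would be:

```python
import numpy as np

def as_1d_float_array(x):
    """Coerce a scalar or array-like to a 1-d float64 array, instead of the
    broken pattern `np.zeros(x, dtype)`, which treats x as an array shape."""
    return np.atleast_1d(np.asarray(x, dtype=np.float64))
```

With a helper like this, `Ptot` and `Ltot` can be normalized once at the top of `equilibrium_concentrations` and the rest of the function can assume arrays.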

@jchodera
Member

@sonyahanson : I think I've fixed the bug that prevented travis tests from running through.
Now I'm seeing some conda HTTP errors, so I've retriggered the builds that failed because of that---the others pass.

I understand we may see lots of HTTP errors until we migrate to conda-forge, so we may want to think about migrating assaytools to conda-forge and using only conda-forge dependencies as soon as things are a bit more stable.

@jchodera
Member

Tests pass now @sonyahanson

@jchodera
Member

@sonyahanson : I think we should be able to merge your PR!

@sonyahanson
Contributor Author

Sure, will do. I just climbed out of a grant-resubmission hole, so I still want to check whether there is an obvious answer for why you are seeing negative concentrations with the analytical binding model, but I can do that in a new PR.

@sonyahanson sonyahanson merged commit 8181150 into master Apr 12, 2019