## 9.50 
### Jocasta Manasseh Lewis and Josefina Correa 
### Project Overview 

In systems neuroscience experiments, it is often of interest to understand how neural activity relates to extrinsic covariates, such as (1) the presence of a stimulus or (2) behavior, or to intrinsic covariates, such as (1) spiking history, (2) patterns of activity in local field potentials, or (3) network activity. The *point-process generalized linear model (GLM) framework* provides a statistical tool for relating neural spiking activity to these types of covariates, and for quantifying the uncertainty in the model's estimates. It also takes into account the point-process nature of spike trains: a spike train is a sequence of discrete events occurring in continuous time, and the occurrence of each event is random.
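
As a rough orientation, the framework can be written down as follows. This is a sketch in notation assumed here (it loosely follows Truccolo et al., 2005): $N(t)$ counts spikes up to time $t$, $H_t$ is the history of spiking and covariates up to $t$, $x_i(t_k)$ are covariates evaluated in time bin $k$, $\Delta N_k$ is the spike count in bin $k$, and $\Delta$ is a bin width small enough that each bin holds at most one spike. The first expression defines the conditional intensity, the second is a GLM with a log link, and the last is the resulting discrete-time approximation to the log-likelihood.

$$
\lambda(t \mid H_t) = \lim_{\delta \to 0} \frac{\Pr\left(N(t + \delta) - N(t) = 1 \mid H_t\right)}{\delta},
\qquad
\log \lambda(t_k \mid H_k) = \beta_0 + \sum_{i=1}^{p} \beta_i \, x_i(t_k)
$$

$$
\log L(\beta) \approx \sum_{k=1}^{K} \left[ \Delta N_k \log\left(\lambda(t_k \mid H_k)\,\Delta\right) - \lambda(t_k \mid H_k)\,\Delta \right]
$$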

In this project, we will explore the point-process generalized linear model framework in detail. We will first study generalized linear models (GLMs) by implementing algorithms for fitting them. We will also study how to simulate spike trains in order to test these algorithms. This will be useful for understanding the convergence issues that may arise when fitting GLMs, along with potential solutions. Two factors that can lead to convergence problems when fitting GLMs are (1) perfect separability and (2) multicollinearity. Perfect separability within the context of point-process GLMs has been described in Farhoodi and Eden, 2021. This problem, which arises when the predictors linearly separate the response, is referred to as the problem of *perfect predictors* in Farhoodi and Eden, 2021. Specifically, given a design matrix **X**, the *i-th* column of **X** is a perfect predictor if a non-zero value in row *j* of column *i* implies a zero spike count at time *j*. This problem also arises if a linear combination of columns generates a perfect predictor. When the design matrix contains perfect predictors, the likelihood can be increased indefinitely by letting the corresponding coefficients diverge, so no finite maximum likelihood estimate exists, which leads to convergence problems.
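
The effect of a perfect predictor on the likelihood can be illustrated with a small numerical sketch. Everything below is made up for illustration: the covariate, the firing rate, and the choice of a Poisson GLM with a log link are assumptions, not data or code from Farhoodi and Eden, 2021. The covariate column is constructed so that a non-zero entry always coincides with a zero spike count, and the log-likelihood is then evaluated for increasingly negative values of that column's coefficient.

```python
import numpy as np

def poisson_loglik(beta, X, y):
    """Poisson log-likelihood with a log link, up to an additive constant."""
    eta = X @ beta
    return float(np.sum(y * eta - np.exp(eta)))

rng = np.random.default_rng(0)
n_bins = 2000

# Hypothetical binary covariate (for example, an indicator of some condition).
x = (rng.random(n_bins) < 0.3).astype(float)

# Low-rate Poisson spike counts, forced to zero whenever x is non-zero.
# This makes the covariate column a perfect predictor.
y = rng.poisson(0.05, size=n_bins).astype(float)
y[x > 0] = 0.0

X = np.column_stack([np.ones(n_bins), x])  # intercept + covariate

# Fix the intercept at the log of the empirical rate where x == 0, then
# make the covariate's coefficient more and more negative.
b0 = np.log(y[x == 0].mean())
candidates = [-1.0, -5.0, -10.0, -20.0]
logliks = [poisson_loglik(np.array([b0, b1]), X, y) for b1 in candidates]

for b1, ll in zip(candidates, logliks):
    print(f"beta_1 = {b1:6.1f}   log-likelihood = {ll:.6f}")
```

In this construction the log-likelihood keeps increasing as the coefficient becomes more negative, so an unregularized fitting routine has no finite value to converge to.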
    
To learn how to address convergence issues, we will implement alternative fitting approaches, including regularization and a Bayesian approach, as described in Farhoodi and Eden, 2021. We will study how these different approaches affect the model's predictions using simulated and real data. Additionally, we will compare how different algorithms for solving the weighted least-squares problem affect run-time, an important consideration when analyzing large neural recordings. For simulating data, we will use the methods described in Smith and Brown, 2003, to simulate a conditional intensity function, and then use the Time Rescaling Theorem (Brown et al., 2002) to generate spike trains from this simulated conditional intensity function. Analyses involving real data will use recordings collected from non-human primates during the transition to, maintenance of, and emergence from propofol-induced unconsciousness (Bastos et al., 2020).
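
A minimal sketch of the simulation step is given below. The intensity model is an assumption made for illustration (a latent AR(1) state passed through a log link, loosely in the spirit of a state-space setup, with invented parameter values); it is not taken from Smith and Brown, 2003. The spike generator relies on the Time Rescaling Theorem: the integrated intensity between consecutive spikes is exponentially distributed with mean 1, so a spike is placed whenever the running integral reaches a freshly drawn unit-exponential threshold.

```python
import numpy as np

def simulate_intensity(n_bins, rng, base_rate=10.0, rho=0.99, sigma=0.05):
    """Conditional intensity (spikes/s) driven by a latent AR(1) state.

    All parameter values are hypothetical defaults chosen for illustration.
    """
    state = np.zeros(n_bins)
    noise = rng.normal(0.0, sigma, size=n_bins)
    for k in range(1, n_bins):
        state[k] = rho * state[k - 1] + noise[k]
    return base_rate * np.exp(state)  # log link keeps the intensity positive

def simulate_spikes(lam, dt, rng):
    """Binary spike train on a regular grid, generated by time rescaling."""
    spikes = np.zeros(lam.size, dtype=int)
    threshold = rng.exponential(1.0)  # rescaled waiting time to the next spike
    integral = 0.0
    for k in range(lam.size):
        integral += lam[k] * dt       # running integral of the intensity
        if integral >= threshold:
            spikes[k] = 1
            integral -= threshold     # carry the overshoot into the next interval
            threshold = rng.exponential(1.0)
    return spikes

rng = np.random.default_rng(1)
dt = 0.001                            # 1 ms bins
lam = simulate_intensity(50_000, rng)
spikes = simulate_spikes(lam, dt, rng)

print("simulated spikes:", spikes.sum())
print("integrated intensity:", lam.sum() * dt)
```

The intensity here does not depend on the neuron's own spiking history. For a history-dependent model, the intensity would have to be recomputed inside the loop after each spike.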

### Timeline 
    
#### Week 1: March 15 - 21 
    
<ul>
<li>Reading materials: 
    <ul><li>Farhoodi and Eden, 2021</li>
        <li>Truccolo et al., 2005</li>
        <li>Smith and Brown, 2003</li>
    </ul>
</li>
</ul>

#### Weeks 2 & 3: March 22 - April 4

<ul>
<li>Reading materials: 
    <ul><li>Farhoodi and Eden, 2021</li>
        <li>Truccolo et al., 2005</li>
        <li>Smith and Brown, 2003</li>
        <li>Komarek and Moore, 2005</li>
    </ul>
</li>
<li>Implementation:
    <ul><li>Simulating spike trains via Time Rescaling Theorem</li>
        <li>Building a GLM's design matrix</li>
        <li>Iteratively Reweighted Least Squares</li>
        <li>Assessing goodness of fit via Time Rescaling Theorem</li>
    </ul>
</li>
</ul>
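
Two of the implementation items above, building a design matrix and Iteratively Reweighted Least Squares (IRLS), can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the model is a Poisson GLM with a log link, the function names `history_design_matrix` and `irls_poisson` are hypothetical, and the simulated covariate and coefficients are invented.

```python
import numpy as np

def history_design_matrix(spikes, n_lags):
    """Columns: intercept, then the spike train delayed by 1..n_lags bins."""
    X = np.zeros((spikes.size, n_lags + 1))
    X[:, 0] = 1.0
    for lag in range(1, n_lags + 1):
        X[lag:, lag] = spikes[:-lag]
    return X

def irls_poisson(X, y, n_iter=50, tol=1e-8):
    """Fit a Poisson GLM (log link) by IRLS. Returns (beta, converged)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response
        XtW = X.T * mu                   # working weights equal mu for a log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new, True
        beta = beta_new
    return beta, False

# Tiny design-matrix example: intercept plus two history lags.
X_hist = history_design_matrix(np.array([1, 0, 0, 1, 0]), n_lags=2)

# Simulated check with a hypothetical covariate and known coefficients.
rng = np.random.default_rng(2)
n_bins = 20_000
covariate = rng.normal(size=n_bins)
X = np.column_stack([np.ones(n_bins), covariate])
beta_true = np.array([-3.0, 0.5])
y = rng.poisson(np.exp(X @ beta_true)).astype(float)

beta_hat, converged = irls_poisson(X, y)
print("converged:", converged)
print("estimate:", beta_hat, "truth:", beta_true)
```

Each IRLS iteration solves one weighted least-squares problem, which is the step whose cost matters for large recordings.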

#### Weeks 4 & 5: April 5 - 18

<ul>
<li>Relevant reading materials: 
    <ul><li>Farhoodi and Eden, 2021</li>
    </ul>
</li>
<li>Implementation:
    <ul><li>Understanding the problem of perfect predictors via simulation</li>
        <li>Solving the problem of perfect predictors via logistic regression with regularization</li>
        <li>Comparing goodness of fit for models fit with vs. without regularization</li>
    </ul>
</li>
</ul>
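
The following sketch illustrates how a penalty can restore convergence in the presence of a perfect predictor. It assumes a logistic regression on binary spike indicators with an L2 (ridge) penalty fit by Newton's method; the data, the penalty value, and the function name `fit_logistic` are hypothetical, and the penalty form is one common choice rather than necessarily the one used in Farhoodi and Eden, 2021.

```python
import numpy as np

def fit_logistic(X, y, penalty=0.0, n_iter=100, tol=1e-8):
    """Newton (IRLS) fit of logistic regression with an optional L2 penalty.

    The intercept (first column) is left unpenalized. Returns (beta, converged).
    """
    n_cov = X.shape[1]
    P = penalty * np.eye(n_cov)
    P[0, 0] = 0.0
    beta = np.zeros(n_cov)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu) - P @ beta   # gradient of the penalized log-likelihood
        hess = (X.T * w) @ X + P           # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            return beta, True
    return beta, False

rng = np.random.default_rng(3)
n_bins = 2000
x = (rng.random(n_bins) < 0.3).astype(float)       # hypothetical binary covariate
y = (rng.random(n_bins) < 0.05).astype(float)      # binary spike indicator
y[x > 0] = 0.0                                     # perfect predictor
X = np.column_stack([np.ones(n_bins), x])

beta_mle, converged_mle = fit_logistic(X, y, penalty=0.0, n_iter=25)
beta_ridge, converged_ridge = fit_logistic(X, y, penalty=1.0)

print("no penalty:", beta_mle, "converged:", converged_mle)
print("ridge:     ", beta_ridge, "converged:", converged_ridge)
```

Without the penalty, the coefficient of the perfect predictor keeps drifting toward negative infinity and the step size never falls below the tolerance. With the penalty, the iteration settles on a finite estimate, at the cost of some shrinkage.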
    
    
#### Weeks 6 & 7: April 19 - May 2
    
<ul>
<li>Relevant reading materials: 
    <ul><li>Farhoodi and Eden, 2021</li>
    </ul>
</li>
<li>Implementation:
    <ul><li>Understanding the problem of perfect predictors via simulation</li>
        <li>Solving the problem of perfect predictors via Bayesian estimation</li>
        <li>Comparing goodness of fit across models fit using Bayesian estimation, regularization, and maximum likelihood (IRLS without regularization)
</li>
</ul>
</li>
</ul>
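
One way to see why a Bayesian treatment can help with perfect predictors is sketched below. The zero-mean Gaussian prior with variance $\sigma^2$ is an assumption made here for illustration; the priors considered in Farhoodi and Eden, 2021 may differ.

$$
p(\beta \mid y) \propto L(\beta)\, p(\beta), \qquad \beta \sim \mathcal{N}(0, \sigma^2 I)
$$

$$
\log p(\beta \mid y) = \log L(\beta) - \frac{1}{2\sigma^2} \lVert \beta \rVert_2^2 + \text{const}
$$

With a perfect predictor, $\log L(\beta)$ levels off as the corresponding coefficient diverges, while the prior term decreases without bound, so the posterior has a finite mode. Under this particular prior, the posterior mode coincides with a ridge-penalized estimate with penalty $1/\sigma^2$; the Bayesian approach additionally yields a full posterior distribution rather than a single point estimate.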
    
#### Week 8: May 3 - 9
<ul>
<li>Relevant reading materials: 
    <ul><li>Komarek and Moore, 2005</li>
    </ul>
</li>
<li>Implementation:
    <ul><li>Reducing run-time via conjugate gradient method</li>
        <li>Analyzing real data: do we see the problems detailed in Farhoodi and Eden, 2021?</li>
</ul>
</li>
</ul>
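
The run-time item above can be illustrated with a minimal conjugate gradient sketch for the weighted least-squares step of IRLS. It solves the normal equations `(X^T W X) beta = X^T W z` using only matrix-vector products, so the `X^T W X` matrix is never formed. The weights, working response, and problem sizes below are invented for illustration, and this is a plain conjugate gradient solver, not the TR-IRLS algorithm of Komarek and Moore, 2005.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A, given v -> A v."""
    x = np.zeros_like(b)
    r = b.copy()                 # residual b - A x for the starting point x = 0
    p = r.copy()                 # search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            break                # relative residual is small enough
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

rng = np.random.default_rng(4)
n_bins, n_cov = 5000, 20
X = rng.normal(size=(n_bins, n_cov))          # hypothetical design matrix
w = rng.uniform(0.01, 0.1, size=n_bins)       # hypothetical IRLS working weights
z = rng.normal(size=n_bins)                   # hypothetical working response

def normal_matvec(v):
    # Computes (X^T W X) v without forming X^T W X explicitly.
    return X.T @ (w * (X @ v))

rhs = X.T @ (w * z)
beta_cg = conjugate_gradient(normal_matvec, rhs)
beta_direct = np.linalg.solve((X.T * w) @ X, rhs)

print("max abs difference:", np.max(np.abs(beta_cg - beta_direct)))
```

Stopping the iteration early (a smaller `max_iter` or a looser `tol`) trades accuracy in each weighted least-squares solve for speed, which is the kind of trade-off the run-time comparison is meant to examine.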
    
#### Weeks 9 & 10: May 10 - 23
<ul>
<li>Relevant reading materials: 
    <ul><li>Komarek and Moore, 2005</li>
    </ul>
</li>
<li>Analyzing real data:
    <ul>
        <li>Scientific questions: 
            <ul><li>Is ensemble history an important predictor of a neuron's own spiking propensity?</li>
                <li>Are spikes coupled to the phases of other LFP oscillations?</li>
                <li>Are there other LFP features that are also strong correlates of a neuron's spiking propensity?
                </li>
            </ul>
        </li>
        <li>Numerical inquiries: 
            <ul><li>Do different fitting approaches affect the model's predictions?</li>
                <li>If there are additional LFP features (other than phase) that are also strong correlates of a neuron's spiking propensity, can we include them in the model, or does this lead to multicollinearity?</li>
            </ul>
        </li>
</ul>
</li>
</ul>
    
### References

Bastos, A.M., Donoghue, J.A., Brincat, S.L., Mahnke, M., Yanar, J., Correa, J., Waite, A.S., Lundqvist, M., Roy, J., Brown, E.N. and Miller, E.K., 2020. Neural effects of propofol-induced unconsciousness and its reversal using thalamic stimulation. bioRxiv.
    
Brown, E.N., Barbieri, R., Ventura, V., Kass, R.E. and Frank, L.M., 2002. The time-rescaling theorem and its application to neural spike train data analysis. Neural Computation, 14(2), pp.325-346.

Farhoodi, S. and Eden, U., 2021. The problem of perfect predictors in statistical spike train models. arXiv preprint arXiv:2102.00574.

Komarek, P. and Moore, A.W., 2005, November. Making logistic regression a core data mining tool with TR-IRLS. In Fifth IEEE International Conference on Data Mining (ICDM'05) (pp. 4-pp). IEEE.

Smith, A.C. and Brown, E.N., 2003. Estimating a state-space model from point process observations. Neural Computation, 15(5), pp.965-991.

Truccolo, W., Eden, U.T., Fellows, M.R., Donoghue, J.P. and Brown, E.N., 2005. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2), pp.1074-1089.

