# Log
This page records ideas and notes chronologically. An attempt will be made to avoid making any edits other than fixing formatting or typos.


### 2022-09-27
Decided to start writing down notes. 

At this point, I have a model that is reasonably good at predicting spikes for a single cell, given the stimulus and the spike history. This is using the distance field model output. 

I'll spend a bit of time today seeing how small I can make the model before performance degrades. Mainly I'll try reducing channel and layer counts. I don't want to be needlessly carrying around a heavy model into the next few experiments.


I tried out the experiment automation tool, Guild AI; however, it was very finicky and configuration-file heavy. It was especially hard to get relative paths working. In the end, I figured a manual approach would be more satisfying.

### 2022-09-29
From experiments 1.1.1 to 1.1.3, the distance field output seems to become more capable of pinpointing spikes as the channel count and number of layers increase. The 50 ms correlation calculated in the evaluation notebook increases in the order 1.1.1 < 1.1.2 < 1.1.3. This is despite the best loss on the validation set showing the opposite ordering (1.1.3 < 1.1.2 < 1.1.1). Also notable is the discrepancy between the notebook-calculated correlation and the Trainable one. The notebook calculation is often tweaked, so it would be good to put more effort into keeping the two in sync.

Another possible explanation is that a weight regularization term is being included in the loss. I'm using the weight regularization option (`weight_decay`) of the PyTorch optimizer; however, I was under the impression that its effects would not appear in the loss term. A little test shows that this is in fact the behaviour (no weight regularization term in the loss):

```python
import torch
import torch.nn.functional as F


def weightdecay_test():
    # Input of shape (1, 1, 5) and a fixed weight of shape (1, 5).
    a = torch.tensor([[[-1.0, 0.0, 1.0, 2.0, 3.0]]])
    W = torch.tensor([[1.0, 1.0, 1.0, 1.0, 1.0]])
    target = torch.tensor([[[0.0]]])

    def net(x):
        return F.linear(x, W)

    def loss_for(weight_decay):
        # The optimizer is constructed but never stepped; only the
        # reported loss value is of interest here.
        loss_fn = torch.nn.L1Loss(reduction='sum')
        optimizer = torch.optim.Adam(params=[W], lr=0.01, weight_decay=weight_decay)
        optimizer.zero_grad()
        return loss_fn(net(a), target)

    print(f'Without: {loss_for(0)}')
    print(f'With: {loss_for(10)}')


weightdecay_test()
```

```
Without: 5.0
With: 5.0
```
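A complementary check is to look at where `weight_decay` does act: on the parameter update rather than on the loss value. The sketch below is only an illustration under stated assumptions. It swaps in plain `torch.optim.SGD` instead of Adam, because the SGD update can be verified by hand, and the toy values (`lr`, `wd`, `w`, `x`) are hypothetical. As far as I understand, Adam likewise adds the `weight_decay * param` term to the gradient before computing its moment estimates, so the loss stays untouched there too.

```python
import torch

# Hypothetical toy values, chosen only so the arithmetic is easy to follow.
lr, wd = 0.1, 10.0
w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, -1.0])

# Plain SGD (no momentum) with weight decay; swapped in for Adam for clarity.
opt = torch.optim.SGD([w], lr=lr, weight_decay=wd)
opt.zero_grad()

loss = (w * x).sum()        # the loss contains no regularization term
loss.backward()

grad = w.grad.clone()       # gradient of the loss alone (equals x)
w_before = w.detach().clone()
opt.step()

# The decay term shows up in the update: w <- w - lr * (grad + wd * w)
expected = w_before - lr * (grad + wd * w_before)
print(loss.item())
print(torch.allclose(w.detach(), expected))
```

If the final check holds, the regularization has changed the step taken by the optimizer while the reported loss is the unregularized one, which is consistent with the test above.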


So what else could be causing the bigger models to have a higher validation loss, yet a better-looking distance field output? Maybe the answer is that "better looking" is my subjective guess, and a guess based on a very small portion of the data. But the subjective guess shouldn't be ignored, as it is a proxy for how easy by-eye spike inference is, and the current inference scripts are heavily inspired by by-eye approaches, like looking for local minima, rather than calculating the maximum likelihood solution via energy minimization. It's possible that the smaller models have a better distance field output, but that its smoothness makes inference harder when using hacky inference scripts, which the current ones definitely are.
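For reference, the by-eye style of inference mentioned above (looking for local minima in the distance field) might look roughly like the sketch below. This is not the actual inference script. It assumes the distance field is a 1D array with one value per time bin, holding the distance to the nearest spike; the function name `infer_spikes_local_minima` and the `max_dist` threshold are hypothetical.

```python
import numpy as np


def infer_spikes_local_minima(dist, max_dist=1.0):
    """Return indices of interior local minima whose value is <= max_dist.

    Hypothetical helper: a bin counts as a minimum if it is strictly lower
    than its left neighbour and no higher than its right neighbour.
    """
    dist = np.asarray(dist, dtype=float)
    interior = dist[1:-1]
    is_min = (interior < dist[:-2]) & (interior <= dist[2:])
    idx = np.flatnonzero(is_min) + 1  # shift back to indices into dist
    return idx[dist[idx] <= max_dist]


# Toy distance field built from two known spike positions (bins 5 and 12).
spikes = np.array([5, 12])
t = np.arange(20)
dist = np.abs(t[:, None] - spikes[None, :]).min(axis=1).astype(float)

inferred = infer_spikes_local_minima(dist)
print(inferred)
```

On this idealized field the minima sit exactly at the spike bins. On a smoother model output the minima would presumably be shallower and flatter, which is one way a threshold-based script like this could struggle even if the underlying distance field is accurate.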