Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Anti-phase templates #63

Closed
czuba opened this issue May 8, 2019 · 1 comment · Fixed by #595
Closed

Anti-phase templates #63

czuba opened this issue May 8, 2019 · 1 comment · Fixed by #595

Comments

@czuba
Copy link
Contributor

czuba commented May 8, 2019


NOTE: This issue is [thought to be] related to a separate post #60 documenting punctate temporal stuttering in kilosort2 output clusters


I've been having trouble achieving usable sorting results from Kilosort2, and results are actually so far from usable that I suspect that there must be some underlying initialization or parameterization problem.

The data

  • Examples are based on a single plx file recorded during part of a relatively brief [~40 min] duration & [subjectively] stable session.

    • 32 channel plexon U-probes with stereotrode configuration: 50µm within, 100µm between stereo pairs; 1.5 mm total
    • Sampled continuously 40kHz via Plexon MAP system
      (...old but stable hardware, bootstrapped into the modern era)
    • Converted to .dat files with common median referencing to remove cross channel artifacts
  • Though these examples are from a single file, similar issues are present across many attempts at applying Kilosort2 to files both long (2-3 hr) & concatenated (2-4 consecutively recorded data files), and with varying degrees of drift.

  • All kilosort parameters are set to default values or corresponding equivalents for 40kHz
    (...reverted to defaults after many attempts at tuning parameters to no avail.)

    • e.g.
      % make waveform length independent of sampling rate
      ops.nt0    = ceil( 0.0012 * ops.fs); % width of waveform templates (makes consistent N-ms 
      either side of spike, regardless of sampling rate)
      ops.nt0    = ops.nt0 + mod(ops.nt0-1, 2); % ensure oddity (...forum something about aiding 
      spike trough alignment)
      % define batches in seconds of data, not abstract bit chunks
      batchSec  = 10;  % define batch number in seconds
      ops.NT    = ceil(batchSec*ops.fs/32)*32 +ops.ntbuff; % convert to 32 count increments of 
      samples
  • Batch reordering figure:
    antiphaseWF_batchReorderFig

I'm experiencing two primary failure modes:

1) Temporal stuttering

2) Anti-phase templates

The issue of discrepancies between template and mean waveforms was brought up in the issue #31. The previous instance was thought to be related to low-amplitude waveforms (near the noise floor of the recordings) and/or an unmentioned timing issue, but the examples I'm seeing are clearly above the noise, and I've checked & rechecked timing parameters ad nauseam.

antiphaseWF_phyWfsComposite

The filtered and predicted timecourses are clearly evident in the gui display
antiphaseWF_tcFiltered+Prediction

And remnants from the ill fit template subtraction (and altogether missed spikes) are readily apparent (actually amplified) in the residual timecourses:
antiphaseWF_tcResid


...with the futility of further random walks through parameter space, I'm hoping that these dead ends might reflect issues that other users have encountered & defeated, or that the Kilosort authors might be able to chime in on potential causes/solutions.

It seems more than likely that these issues are related, and that part of the temporal stuttering is a result of runaway templates within batches that simultaneously miss & distort (by fitting new templates to the residuals) otherwise corresponding clusters across batches.
Or so one would hope.

Many thanks to any & all who might be able to shed some light on this.

@marius10p
Copy link
Contributor

The discrepancy between templates and mean waveforms probably has nothing to do with this: the templates are after data preprocessing, i,.e high-pass filtering and channel whitening, and that usually changes a little bit what they look like. That change can be larger for some waveforms depending on their spatiotemporal frequency content.

The mismatch between the filtered and predicted trace probably has a different cause. Do you observe this mismatch throughout the recording, or only on segments? If you run it for a shorter segment (i.e. 5 minutes) do you still get mismatches? I ask because the templates are continuously changing as a function of time, and while they might be good for a section of the recordings, they might somehow become mismatched during another segment.

czuba added a commit to czuba/Kilosort that referenced this issue Mar 3, 2021
…a updates for more modest channel count recordings (<100 ch)

- specifically, 32 ch. Plexon Uprobes (standard config: stereotrode, 50/100 um [within/between])

---
Overly detailed notes on merging/backport changes:

---

- _SOMETHING_ in KS30 seems to have made sig progress towards fixing batch-size-dependent template stuttering issue
 - see also: MouseLand#60  &  MouseLand#63
- have long suspected occasional [arbitrary] inversion of template/extracted PC components was a leading cause of stuttering
- while KS30's new clustering algorithm does not perform well on standard probes, _it does_ apply a new standardization to learned template polarity

From [KS30]/clustering/template_learning.m, ~line 95 (near eof):

```matlab
    %% Impose consistent waveform polarity on PCs
    % Done to avoid templates from fracturing into arbitrary batch-sized clusters ("stuttering")
    rez.W = zeros(ops.nt0,   0,3, 'single');
    rez.U = zeros(ops.Nchan, 0,3, 'single');
    rez.mu = zeros(1,0, 'single');
    for  t = 1:n0
        dWU = wPCA * gpuArray(Wpca(:,:,t));
        [w,s,u] = svdecon(dWU);
        wsign = -sign(w(ops.nt0min+1, 1));
        rez.W(:,t,:) = gather(wsign * w(:,1:3));
        rez.U(:,t,:) = gather(wsign * u(:,1:3) * s(1:3,1:3));
        rez.mu(t) = gather(sum(sum(rez.U(:,t,:).^2))^.5);
        rez.U(:,t,:) = rez.U(:,t,:) / rez.mu(t);
    end

    %%
    rez.ops.wPCA = wPCA;
```

- **_need to figure out way to safely/consistently backport this to KS25_**
 - how/where to impose it on learned templates?
  - w/in `learnAndSolve8b.m` would be natural equivalent
  - perhaps _inside_ `svdecon.m`, then would naturally propogate to all cases?
 - how/where/necessary[?] to impose similar polarity fix on extracted waveform snippets when assigning them to template clusters?
  - learnTemplates.m >> mexSVDsmall2() ?
  - w/in  mexMPnu8.cu   (...ugh, I hope not)

---

- mix of KS30 backports, and czuba updates/revisions for 32 ch probe recordings

- all transferred
- remaining differences are specific to KS3 (i.e. comments on deprecated params that are still in gui interface)

**learnTemplates.m**

- KS3 curiously initializes `WPCA` with size of 2* Nrank (??)

- handful of changes related to KS3 "warm start" not transferred
- since seems to be related to running based on existing templates from another file
  - may come back to these, but leaving them out for simplicity on first pass

**find_merges.m**

Changes not transferred:
- KS3 recomputes WPCA with set of temporal lags, then updates simscore with largest value
 - presumably helpful for eliminating subtle temporal offset duplicates
 - this is early candidate for next backport
- curious comment line (from Marius) saying "`% YOU REALLY SHOULD MAKE SURE THE PC CHANNELS MATCH HERE`"
  - hard to tell if this is related to "flippers"/stuttering, or the KS3 fix for ensuring consistent wf template polarity

**align_block2.m**

- fully synced (Mar. 1)
- Lots of datashift process updates by TBC
 - scale averaging width (i.e. original data span) across iterations so that temporally distant corrections are not propagated into new drift target [`F0`]
 - apply smoothing to to correlation matrix before extracting max alignment
  - ...good chance this is illogical/undesireable, but early attempt at reducing runaway on datasets that don't have a zillion channels [wild oversampling capiblity of neuropixels]
 - plot drift alignment during initial [main] phase
  - this phase is totally opaque in standard KS3 method; which only shows fine grained shifts during second phase

**datashift2.m**

- KS3 default Nrank is manually set to 6
 - ...maybe cause of curious doubling of Nrank during `learnTemplates.m`
- KS3 updates to also upsample x dimension
 - note the clunky implementation of this, in case of irregular spacing
  - unlikely to come up on standard Plexon Probes
  - could come up with 'in-stock' [mfg. defect] probes, or Nform arrays (longer spacing for most distal channel on each shank)

**extractPCfromSnippets.m**

- orig hardcoded to only process `1:100:Nbatches` results in overly sparse sampling when using longer batch durations
 - e.g. 32 chan probe recordings often benefit from batch duration of 5-10+ sec
 - _updated_ to use `ops.nskip` parameter instead

**get_whitening_matrix.m**

- reverted all changes to Marius original, since faster memmapping options now encapsulated in `get_whitening_matrix_faster.m`
 - NOTE: gui uses it's own implementation of whitening code (which _does_ use memmapping)

**get_whitening_matrix_faster.m**
- needs to be hardcoded as standard method in `preprocessDataSub.m`
- czuba speed updates when flag `ops.useMemMapping = 1`
 - Marius nixed standard integration b/c said memmapping problems tend to arise with different OSes
 - ...not relevant since we only run sorting on Linux (& Ubuntu >=16.04 specifically)

**preprocessDataSub.m**

- hardcoded to use `_faster` version of `get_whitening_matrix.m`
- left commented-out attempts to also memmap writing of preprocessed data
 - fwrite seemed to be separate bottle neck that precluded sig improvements seen while computing whitening filter
 - possible this was due to asymmetry of read/write speed on old sort machine (~1kMB/s read, 300-500MB/s write)

**standalone_detector.m**

- bug fixes to prevent exceeded indexes on smaller (<100) channel counts
- don't hardcode sig of gaussian decay, instead use existing ops.sig value
 - Marius not keen on this, b/c 10um is based on physiology
 - **_BUT_** in practice, adjusting this variable seems crucial when electrode sampling is more coarse than neuropixels (e.g. uprobe 50-100 um between sites)

Changes not transferred:
- KS25 hardcoded NrankPC==6 & recalc of `WPCA` using `extractTemplatesfromSnippets.m`
- KS30 seems to pull WPCA & wTEMP from existing rez. fields, and moves them to gpu

---

2021-03-03  T.Czuba
czuba added a commit to czuba/Kilosort that referenced this issue Mar 3, 2021
…a updates for more modest channel count recordings (<100 ch)

- specifically, 32 ch. Plexon Uprobes (standard config: stereotrode, 50/100 um [within/between])

_Commit update:_
- adding back markdown formatted headers that were lost in copy-pasted commit text

---
Overly detailed notes on merging/backport changes:
---

**Merging KS30 changes into KS25**

 !!!:  Standardization of waveform template polarity

- _SOMETHING_ in KS30 seems to have made sig progress towards fixing batch-size-dependent template stuttering issue
 - see also: MouseLand#60  &  MouseLand#63
- have long suspected occasional [arbitrary] inversion of template/extracted PC components was a leading cause of stuttering
- while KS30's new clustering algorithm does not perform well on standard probes, _it does_ apply a new standardization to learned template polarity

From [KS30]/clustering/template_learning.m, ~line 95 (near eof):

```matlab
    %% Impose consistent waveform polarity on PCs
    % Done to avoid templates from fracturing into arbitrary batch-sized clusters ("stuttering")
    rez.W = zeros(ops.nt0,   0,3, 'single');
    rez.U = zeros(ops.Nchan, 0,3, 'single');
    rez.mu = zeros(1,0, 'single');
    for  t = 1:n0
        dWU = wPCA * gpuArray(Wpca(:,:,t));
        [w,s,u] = svdecon(dWU);
        wsign = -sign(w(ops.nt0min+1, 1));
        rez.W(:,t,:) = gather(wsign * w(:,1:3));
        rez.U(:,t,:) = gather(wsign * u(:,1:3) * s(1:3,1:3));
        rez.mu(t) = gather(sum(sum(rez.U(:,t,:).^2))^.5);
        rez.U(:,t,:) = rez.U(:,t,:) / rez.mu(t);
    end

    %%
    rez.ops.wPCA = wPCA;
```

- **_need to figure out way to safely/consistently backport this to KS25_**
 - how/where to impose it on learned templates?
  - w/in `learnAndSolve8b.m` would be natural equivalent
  - perhaps _inside_ `svdecon.m`, then would naturally propogate to all cases?
 - how/where/necessary[?] to impose similar polarity fix on extracted waveform snippets when assigning them to template clusters?
  - learnTemplates.m >> mexSVDsmall2() ?
  - w/in  mexMPnu8.cu   (...ugh, I hope not)

---

**File changes KS25::KS30**

- mix of KS30 backports, and czuba updates/revisions for 32 ch probe recordings

---
**./gui**

- all transferred
- remaining differences are specific to KS3 (i.e. comments on deprecated params that are still in gui interface)

**./mainLoop**

**learnTemplates.m**

- KS3 curiously initializes `WPCA` with size of 2* Nrank (??)

- handful of changes related to KS3 "warm start" not transferred
- since seems to be related to running based on existing templates from another file
  - may come back to these, but leaving them out for simplicity on first pass

**./postProcess**

**find_merges.m**

Changes not transferred:
- KS3 recomputes WPCA with set of temporal lags, then updates simscore with largest value
 - presumably helpful for eliminating subtle temporal offset duplicates
 - this is early candidate for next backport
- curious comment line (from Marius) saying "`% YOU REALLY SHOULD MAKE SURE THE PC CHANNELS MATCH HERE`"
  - hard to tell if this is related to "flippers"/stuttering, or the KS3 fix for ensuring consistent wf template polarity

**./preProcess**

**align_block2.m**

- fully synced (Mar. 1)
- Lots of datashift process updates by TBC
 - scale averaging width (i.e. original data span) across iterations so that temporally distant corrections are not propagated into new drift target [`F0`]
 - apply smoothing to to correlation matrix before extracting max alignment
  - ...good chance this is illogical/undesireable, but early attempt at reducing runaway on datasets that don't have a zillion channels [wild oversampling capiblity of neuropixels]
 - plot drift alignment during initial [main] phase
  - this phase is totally opaque in standard KS3 method; which only shows fine grained shifts during second phase

**datashift2.m**

- KS3 default Nrank is manually set to 6
 - ...maybe cause of curious doubling of Nrank during `learnTemplates.m`
- KS3 updates to also upsample x dimension
 - note the clunky implementation of this, in case of irregular spacing
  - unlikely to come up on standard Plexon Probes
  - could come up with 'in-stock' [mfg. defect] probes, or Nform arrays (longer spacing for most distal channel on each shank)

**extractPCfromSnippets.m**

- orig hardcoded to only process `1:100:Nbatches` results in overly sparse sampling when using longer batch durations
 - e.g. 32 chan probe recordings often benefit from batch duration of 5-10+ sec
 - _updated_ to use `ops.nskip` parameter instead

**get_whitening_matrix.m**

- reverted all changes to Marius original, since faster memmapping options now encapsulated in `get_whitening_matrix_faster.m`
 - NOTE: gui uses it's own implementation of whitening code (which _does_ use memmapping)

**get_whitening_matrix_faster.m**
- needs to be hardcoded as standard method in `preprocessDataSub.m`
- czuba speed updates when flag `ops.useMemMapping = 1`
 - Marius nixed standard integration b/c said memmapping problems tend to arise with different OSes
 - ...not relevant since we only run sorting on Linux (& Ubuntu >=16.04 specifically)

**preprocessDataSub.m**

- hardcoded to use `_faster` version of `get_whitening_matrix.m`
- left commented-out attempts to also memmap writing of preprocessed data
 - fwrite seemed to be separate bottle neck that precluded sig improvements seen while computing whitening filter
 - possible this was due to asymmetry of read/write speed on old sort machine (~1kMB/s read, 300-500MB/s write)

**standalone_detector.m**

- bug fixes to prevent exceeded indexes on smaller (<100) channel counts
- don't hardcode sig of gaussian decay, instead use existing ops.sig value
 - Marius not keen on this, b/c 10um is based on physiology
 - **_BUT_** in practice, adjusting this variable seems crucial when electrode sampling is more coarse than neuropixels (e.g. uprobe 50-100 um between sites)

Changes not transferred:
- KS25 hardcoded NrankPC==6 & recalc of `WPCA` using `extractTemplatesfromSnippets.m`
- KS30 seems to pull WPCA & wTEMP from existing rez. fields, and moves them to gpu

---

2021-03-03  T.Czuba
carsen-stringer pushed a commit that referenced this issue Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants