# Glitch Removal with Quasiphysical Model paper

## Layout of paper

 1. Introduction; glitches are bad; questions to answer
 2. Types of glitches and FD model
 3. Details of inference
 4. Results on three glitch classes; clusters, amplitudes
 5. Interaction with BBH templates; match
 6. Sketch of practical use; how to use in a search
 7. Analysis of high-mass events
 8. Conclusion

### 1. Introduction; glitches are bad

LIGO is afflicted by many short glitches (at least several per hour per IFO) that resemble high-mass black hole signals. They complicate the data analysis, can obscure nearby signals, and can be mistaken for real signals. No auxiliary channels are able to predict their appearance, so we will develop a method to remove them based only on the strain channel.

The questions we can address with a detailed modeling approach are:

 1. How much variation is there between glitches in each class?
 2. What are the amplitude and SNR distributions of the glitches?
 3. Are glitches of the same class identical in spectral shape, with apparent variations only due to noise?
 4. Are the GravitySpy classifications supported by this model?
 5. What is the match of each glitch type with CBC templates?
 6. How completely can we remove any of these glitches?
 7. Do any of the high-mass events match one of our glitch templates better than they match a CBC signal?

### 2. Types of glitches and FD model

We will use GravitySpy to find glitches for testing and as an initial guide to their classification. There are three common types of short glitches -- blips, koi fish, and tomte -- as classified by their appearance in an Omega scan. It is not certain whether these classifications reflect an underlying reality or just quirks in the behaviour of the Omega scan.

We create a parameterised frequency-domain model of the glitches. The model is quasi-physical -- although we do not have a detailed physical model of the glitch process, our model incorporates some of the known features of the glitches. From observing the glitches in whitened strain data, we know that they are typically nearly symmetric in the time domain. They are also quite short, with less than two whole cycles of oscillation.

By plotting several whitened FFTs of glitches, we see that their spectra are fairly well matched by a log-normal in frequency, with the parameters varying between different glitches. The time symmetry and shortness in time can be captured by a purely real frequency model - there is no phase evolution.

The glitch examples will not be perfectly centered in the time domain, so we include a time offset in our model. There is an overall amplitude which is unknown. The glitches may not have perfect time symmetry, and some seem to instead be anti-symmetric, so we also include an overall phase.
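
The ingredients above (log-normal spectral shape, amplitude, time offset, overall phase) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: "log-normal" is taken to mean a Gaussian envelope in $\ln f$ that peaks at the amplitude (no $1/f$ factor), the time offset follows the NumPy FFT sign convention, and the function and parameter names are hypothetical rather than those of the actual analysis code.

```python
import numpy as np

def lognormal_glitch_fd(freqs, amplitude, f0, sigma, t0, phase):
    """Hypothetical whitened frequency-domain glitch model.

    Assumptions (not taken from the paper notes):
      - envelope is a Gaussian in ln(f), centred on f0 with log-width sigma
      - t0 delays the glitch, using the np.fft sign convention
      - phase is a constant overall phase
    """
    freqs = np.asarray(freqs, dtype=float)
    envelope = np.zeros_like(freqs)
    positive = freqs > 0  # the DC bin carries no power in this model
    log_offset = (np.log(freqs[positive]) - np.log(f0)) / sigma
    envelope[positive] = np.exp(-0.5 * log_offset**2)
    # Purely real spectrum when t0 = 0 and phase = 0.
    return amplitude * envelope * np.exp(1j * (phase - 2.0 * np.pi * freqs * t0))

# Example on a 1 s segment sampled at 4096 Hz.
n_samples = 4096
freqs = np.fft.rfftfreq(n_samples, d=1.0 / n_samples)
glitch_fd = lognormal_glitch_fd(freqs, amplitude=2.0, f0=50.0, sigma=0.5,
                                t0=0.0, phase=0.0)
glitch_td = np.fft.irfft(glitch_fd, n=n_samples)
```

With `t0=0` and `phase=0` the spectrum is purely real and the time-domain waveform is symmetric about the first sample; setting `phase=np.pi/2` instead gives an anti-symmetric waveform, which is why a single phase parameter covers both cases.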

### 3. Details of inference

We use NumPyro, with NUTS sampling, to perform Bayesian inference for our model. Running on a typical processor takes under 10 seconds, but this could be optimised. The log-likelihood is

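$$\Lambda = - \sum_k \frac{| d(f_k) - g(f_k) |^2}{2 S(f_k)}$$

where $d(f_k)$ is the data, $g(f_k)$ is the glitch model, and $S(f_k)$ is the noise power spectral density at frequency bin $f_k$. (The factor of two still needs to be checked.)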
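
The log-likelihood above translates directly into NumPy. The sketch below copies the expression as written, including the factor of two that has not yet been verified; the function name and argument names are hypothetical, and the three arrays are assumed to be sampled on the same frequency grid.

```python
import numpy as np

def log_likelihood(data_fd, model_fd, psd):
    """Gaussian log-likelihood, up to an additive constant.

    Mirrors Lambda = -sum_k |d_k - g_k|^2 / (2 S_k) exactly as written
    in the notes; the factor of two is unverified there and here.
    """
    residual = np.asarray(data_fd) - np.asarray(model_fd)
    return -np.sum(np.abs(residual) ** 2 / (2.0 * np.asarray(psd)))

# Tiny example: two frequency bins.
data_fd = np.array([1.0 + 1.0j, 2.0 + 0.0j])
model_fd = np.array([0.0 + 0.0j, 1.0 + 0.0j])
psd = np.array([1.0, 0.5])
value = log_likelihood(data_fd, model_fd, psd)
```

In NumPyro, a custom term like this is usually added to the model with `numpyro.factor`, using `jax.numpy` in place of `numpy` so that NUTS can differentiate it. Whether the actual analysis does it that way is not stated here.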


We are not simply fitting the spectral shape of the glitch: this is a matched filter with a parameterised model. Because the model also includes the amplitude, time offset, and overall phase, we can coherently subtract the glitch after inferring it.
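
As an illustration of coherent subtraction, the sketch below removes a model glitch from a whitened time series once its parameters are known. It assumes a Gaussian-in-$\ln f$ envelope, an amplitude expressed in the units of NumPy's unnormalised `rfft`, and a parameter dictionary standing in for whatever point estimate the inference returns. All names are hypothetical.

```python
import numpy as np

def lognormal_glitch_fd(freqs, amplitude, f0, sigma, t0, phase):
    """Hypothetical glitch model: Gaussian in ln(f) with offset and phase."""
    envelope = np.zeros_like(freqs)
    positive = freqs > 0
    log_offset = (np.log(freqs[positive]) - np.log(f0)) / sigma
    envelope[positive] = np.exp(-0.5 * log_offset**2)
    return amplitude * envelope * np.exp(1j * (phase - 2.0 * np.pi * freqs * t0))

def subtract_glitch(whitened_td, dt, params):
    """Subtract the model glitch from whitened data in the frequency domain.

    params is assumed to hold a point estimate (e.g. posterior medians)
    for amplitude, f0, sigma, t0 and phase.
    """
    n_samples = len(whitened_td)
    freqs = np.fft.rfftfreq(n_samples, d=dt)
    cleaned_fd = np.fft.rfft(whitened_td) - lognormal_glitch_fd(freqs, **params)
    return np.fft.irfft(cleaned_fd, n=n_samples)

# Synthetic check: white noise plus a known glitch, then remove it.
n_samples, dt = 4096, 1.0 / 4096
params = dict(amplitude=50.0, f0=40.0, sigma=0.4, t0=0.5, phase=0.3)
freqs = np.fft.rfftfreq(n_samples, d=dt)
glitch_td = np.fft.irfft(lognormal_glitch_fd(freqs, **params), n=n_samples)
noise_td = np.random.default_rng(0).normal(size=n_samples)
cleaned_td = subtract_glitch(noise_td + glitch_td, dt, params)
```

Here the true parameters are reused, so the subtraction is exact. With inferred parameters the residual depends on how well the amplitude, time offset, and phase are recovered.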

### 4. Results on three glitch classes; clusters, amplitudes

Preliminary results: the clusters are not well defined. The tomtes have frequencies in the 15-25 Hz range, but their bandwidths cover a large range. The koi and blips both have generally higher central frequencies and wider bandwidths than tomtes, but are otherwise hard to distinguish from each other.

### 5. Interaction with BBH templates; match

The overlap of tomtes with CBC templates peaks at a total mass $M_\mathrm{total}$ of 160-200 $M_\odot$, depending on the parameters of the tomte. The match does not depend much on the mass ratio, except that it drops at very asymmetric masses. The maximum match is about 75%. This likely means that the tomtes can cause a serious background issue for searches, but detecting them with our matched filter will allow them to be removed or rejected.
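
For reference, a match of this kind is the noise-weighted overlap between two waveforms, maximised over relative time shift and phase. The sketch below is a generic NumPy version of that calculation, not the code used for the numbers above. It assumes one-sided (`rfft`-style) spectra on a common grid, a strictly positive PSD, and an even number of time samples. Producing real CBC templates would need a waveform generator, which is not shown.

```python
import numpy as np

def match(a_fd, b_fd, psd):
    """Noise-weighted overlap maximised over time shift and phase.

    a_fd, b_fd: one-sided spectra (as returned by np.fft.rfft).
    psd: noise power spectral density on the same bins, all > 0.
    """
    a_fd = np.asarray(a_fd, dtype=complex)
    b_fd = np.asarray(b_fd, dtype=complex)
    psd = np.asarray(psd, dtype=float)
    n_time = 2 * (len(a_fd) - 1)  # assumes an even number of time samples
    integrand = a_fd * np.conj(b_fd) / psd
    # Leave negative frequencies at zero so the inverse FFT is complex;
    # its modulus is then the overlap maximised over phase at each shift.
    full = np.zeros(n_time, dtype=complex)
    full[: len(integrand)] = integrand
    overlap_vs_shift = n_time * np.fft.ifft(full)
    norm_a = np.sum(np.abs(a_fd) ** 2 / psd)
    norm_b = np.sum(np.abs(b_fd) ** 2 / psd)
    return float(np.abs(overlap_vs_shift).max() / np.sqrt(norm_a * norm_b))

def toy_spectrum(freqs, f0, sigma):
    """Toy Gaussian-in-ln(f) spectrum, used only to exercise match()."""
    out = np.zeros_like(freqs)
    positive = freqs > 0
    out[positive] = np.exp(-0.5 * ((np.log(freqs[positive]) - np.log(f0)) / sigma) ** 2)
    return out

n_time = 1024
freqs = np.fft.rfftfreq(n_time, d=1.0 / n_time)
flat_psd = np.ones_like(freqs)
low = toy_spectrum(freqs, f0=30.0, sigma=0.5)
high = toy_spectrum(freqs, f0=200.0, sigma=0.2)
shifted_low = low * np.exp(1j * (0.7 - 2.0 * np.pi * freqs * 10.0 / n_time))
```

Because the maximisation runs over both time and phase, `match(low, shifted_low, flat_psd)` is unity even though the second waveform is delayed and phase-rotated, while two spectra with little frequency overlap, such as `low` and `high`, give a match near zero.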

### 6. Sketch of practical use; how to use in a search

One approach is to have a template bank of a few representative types of glitches and search the data for them. When one is found, run the inference to get the best parameters. The glitch could then be subtracted from the data, or the inferred parameters could be used in the search.
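
The first step of that procedure, scanning the data with a small bank, could look like the sketch below. It assumes the data are already whitened to unit-variance white noise, that each template is a time series of the same length centred on the first sample, and that the threshold of 8 is purely illustrative. The function names are hypothetical.

```python
import numpy as np

def snr_time_series(data_td, template_td):
    """Phase-maximised matched-filter SNR for whitened, unit-variance data."""
    n_samples = len(data_td)
    data_fd = np.fft.rfft(data_td)
    template_fd = np.fft.rfft(template_td)
    # One-sided product; zero negative frequencies give a complex filter output.
    full = np.zeros(n_samples, dtype=complex)
    full[: len(data_fd)] = data_fd * np.conj(template_fd)
    filter_output = 2.0 * np.fft.ifft(full)
    # Approximately the root-sum-square of the template in the time domain.
    sigma = np.sqrt(2.0 / n_samples * np.sum(np.abs(template_fd) ** 2))
    return np.abs(filter_output) / sigma

def search_bank(data_td, bank_td, threshold=8.0):
    """Return (template index, peak sample, peak SNR) of the loudest
    template above threshold, or None if nothing crosses it."""
    best = None
    for index, template_td in enumerate(bank_td):
        snr = snr_time_series(data_td, template_td)
        peak = int(np.argmax(snr))
        if snr[peak] >= threshold and (best is None or snr[peak] > best[2]):
            best = (index, peak, float(snr[peak]))
    return best

def toy_template(n_samples, f0, sigma):
    """Toy time-symmetric template from a Gaussian-in-ln(f) spectrum."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / n_samples)
    spectrum = np.zeros_like(freqs)
    positive = freqs > 0
    spectrum[positive] = np.exp(-0.5 * ((np.log(freqs[positive]) - np.log(f0)) / sigma) ** 2)
    return np.fft.irfft(spectrum, n=n_samples)

n_samples = 4096
bank = [toy_template(n_samples, 20.0, 0.3),
        toy_template(n_samples, 100.0, 0.3),
        toy_template(n_samples, 300.0, 0.5)]
# Noise-free injection of the second template, delayed by 300 samples.
unit_template = bank[1] / np.sqrt(np.sum(bank[1] ** 2))
data = 20.0 * np.roll(unit_template, 300)
found = search_bank(data, bank)
```

The returned template index and peak sample would then seed the full inference, for example by centring the prior on the time offset around the peak.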

### 7. Analysis of high-mass events

We could do a quick check of how the few highest-mass events compare to the glitch templates.

## Things to Investigate

 1. Reparameterise model to physical frequency and bandwidth
 2. Does ADVI work? Why are the MAPs so far off?
 3. What's the overlap with a signal?