LabeledFPOP paper
LOPART = Labeled Optimal Partitioning is a changepoint detection algorithm for data sequences which
- supports 0/1 labels,
- each changepoint is unconstrained (can be either up or down),
- and is quadratic time in the number of data.
GFPOP/PeakSegDisk is a changepoint detection algorithm/package which
- does not support labels,
- each changepoint can be constrained (non-decreasing, non-increasing),
- and is log-linear time in the number of data.
In this paper we investigate a new algorithm, FLOPART = Functional LOPART, which
- supports 0/up/down labels,
- each changepoint can be constrained (non-decreasing, non-increasing),
- and is log-linear time in the number of data.
- See FLOPART repo for R package with C++ code that implements this algorithm.
- Time complexity comparison with PeakSegOptimal / LOPART.
- Accuracy comparison with PeakSegOptimal, similar to LOPART paper.
- TODO LOPART comparison with both labeled data sets.
figure-computation-graph.R makes
figure-motivation.R makes a figure that shows a real data example for which the up-down constrained Poisson model is not working,
figure-smaller-example-intervals.R makes a figure that shows that
within a label, the number of intervals is either constant or
increasing… why? theory?
.
figure-smaller-example.R makes a figure which shows a convincing a simple real data motivating example for LOPART. Without using the label constraints there is no way to get zero label errors, but with label constraints we get a model which is quite reasonable with respect to the visually obvious peaks.
figure-Mono27ac-new-labels.R makes
figure-FLOPART-boundaries.R makes
figure-Mono27ac-FLOPART.R makes
TODO more labels!!
This is a good motivating data set for the new algorithm. We can’t detect the small peak with the un-labeled algo until 13 peaks, and in that model we end up with several big peaks in the data that have two peaks where there should be one, and one peak on the left where there should be none.
figure-Mono27ac.R makes
New PDF contains V single subscript notation rather than D double subscript notation, clarify what needs to be computed for positive labels.
Typed problem and dynamic programming update rules.