commit 855c148
Showing
27 changed files
with
1,684 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
Package: fpop | ||
Type: Package | ||
Title: Segmentation using Optimal Partitioning and Function Pruning | ||
Version: 2019.08.26 | ||
Authors@R: c(person("Guillem", "Rigaill", email = "guillem.rigaill@inra.fr", | ||
role = c("aut", "cre")), | ||
person("Toby", "Hocking", | ||
role = c("aut")), | ||
person("Robert", "Maidstone", | ||
role = c("aut")), | ||
person("Michel", "Koskas", | ||
role = c("ctb")), | ||
person("Paul", "Fearnhead", | ||
role = c("aut"))) | ||
Maintainer: Guillem Rigaill <guillem.rigaill@inra.fr> | ||
Description: A dynamic programming algorithm for the fast segmentation of univariate signals into piecewise constant profiles. | ||
The 'fpop' package is a wrapper to a C++ implementation of the fpop (Functional Pruning Optimal Partitioning) algorithm described in Maidstone et al. 2017 | ||
<doi:10.1007/s11222-016-9636-3>. The problem of detecting changepoints in a univariate sequence is formulated | ||
in terms of minimising the mean squared error over segmentations. The fpop algorithm exactly minimises the mean squared error | ||
for a penalty linear in the number of changepoints. | ||
License: LGPL (>= 2.1) | ||
NeedsCompilation: yes | ||
Packaged: 2019-08-26 05:34:19 UTC; grigaill | ||
Author: Guillem Rigaill [aut, cre], | ||
Toby Hocking [aut], | ||
Robert Maidstone [aut], | ||
Michel Koskas [ctb], | ||
Paul Fearnhead [aut] | ||
Repository: CRAN | ||
Date/Publication: 2019-08-27 07:00:03 UTC |
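The Description field above states the criterion only in words: squared error plus a penalty linear in the number of changepoints. As a minimal sketch in plain R (it does not call the package), the helper below evaluates that criterion for a given vector of segment ends. The name `penalized_cost` and its arguments are assumptions made for illustration; they are not part of fpop.

```r
# Hypothetical helper, not part of fpop: evaluate the penalized
# squared-error criterion for a given set of segment ends.
penalized_cost <- function(x, ends, lambda) {
  stopifnot(length(ends) >= 1, ends[length(ends)] == length(x))
  # Each segment starts right after the previous segment's end.
  starts <- c(1, ends[-length(ends)] + 1)
  # Squared error of each segment around its own mean.
  seg_cost <- mapply(function(s, e) {
    seg <- x[s:e]
    sum((seg - mean(seg))^2)
  }, starts, ends)
  # One penalty of lambda per changepoint (number of segments minus one).
  sum(seg_cost) + lambda * (length(ends) - 1)
}

x <- c(0, 0, 0, 5, 5, 5)
penalized_cost(x, ends = c(3, 6), lambda = 1)  # one change, perfect fit
penalized_cost(x, ends = 6, lambda = 1)        # no change
```

With one change at position 3 the fit is exact and only the penalty remains; with no change the whole cost is squared error around the global mean. An exact algorithm searches over all choices of `ends` for the smallest such value.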
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
9b4eff8ad9d4f19ba5689f0189cfcb85 *DESCRIPTION | ||
f40bd2bf2b89f159be3166f925f0ab52 *NAMESPACE | ||
476cad2d426880a65a522c5cbd2efaae *NEWS | ||
3826d0297c61e36a81060f2648874c90 *R/fpaccess.R | ||
94325cedacaf3e5faf95d6e6f3e0be68 *R/multiBinSeg.R | ||
87bea7f58c72fe2da5d36b48f319d0fc *R/onLoad.R | ||
73928861863be99e9a87e7887780e34d *inst/CITATION | ||
d10e04d1d9c428e9b6dc4df884facbb5 *man/Fpop.Rd | ||
faa689e9dcb31ef355f707d0d344dc5c *man/fpop-package.Rd | ||
2c5794b971e0783e174fa9e235ec9ef6 *man/fpop_analysis.Rd | ||
c8c3800c26698b6cd361246c77b1cd17 *man/multiBinSeg.Rd | ||
7f3cc9daf28a53aadacaff82d57c6a63 *man/retour_op.Rd | ||
785fa11da3df520d40db6795e569460c *src/BinSeg_MultiDim.cpp | ||
eccfa93d1a27e4d81262330b154e09e6 *src/BinSeg_MultiDim.h | ||
7ce6a7bc13518a8ef5ee5c46a97dfaa4 *src/Call_BinSeg_MultiDim.cpp | ||
7db0b52ab414c9d3c499a4ce0e8f6e66 *src/Call_BinSeg_MultiDim.h | ||
41312de0741592e07c1bf06a41c65137 *src/Heap.cpp | ||
d019b008db337f73a2b1ad57ef11ad7f *src/Heap.h | ||
e776f5fef8fcb67dc6d6a8c983f02e41 *src/Node.cpp | ||
9efff35b99fc6973867ad4e0a35ca7f9 *src/Node.h | ||
4ad000773b727ee4c03fc5852ecd3fcf *src/Rwrappers.cc | ||
022ddf7f3f60be65f6ca082873e61979 *src/colibri.cc | ||
da5d55c6c4ec95d7b9ac4731af5caeaa *src/colibri.h | ||
ad92d9bcb1c9bff7e25142448f9e7d5e *src/liste.cc | ||
849b053f9e0ef904c9c8ead64b229d28 *src/liste.h | ||
a44d1eb38268a0ca23bcff9bfc01d186 *src/polynome2.h |
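Each line of the manifest above pairs a 32-character MD5 hash with a file path prefixed by `*`. A rough sketch of how such a manifest could be checked with base R's `tools::md5sum` follows. The function name `check_md5` is an assumption for illustration, and the demonstration runs on a temporary file rather than on the package sources.

```r
# Sketch: check files against an MD5 manifest in the "hash *path" layout.
# check_md5 is a hypothetical helper, not part of fpop or of R itself.
check_md5 <- function(manifest_lines, root = ".") {
  parts <- regmatches(manifest_lines,
                      regexec("^([0-9a-f]{32}) \\*(.+)$", manifest_lines))
  expected <- vapply(parts, `[`, character(1), 2)
  paths <- vapply(parts, `[`, character(1), 3)
  # md5sum returns NA for files that cannot be read.
  actual <- unname(tools::md5sum(file.path(root, paths)))
  setNames(!is.na(actual) & actual == expected, paths)
}

# Self-contained demonstration on a temporary directory.
root <- tempfile("pkg")
dir.create(root)
writeLines("useDynLib(fpop)", file.path(root, "NAMESPACE"))
hash <- unname(tools::md5sum(file.path(root, "NAMESPACE")))
check_md5(paste0(hash, " *NAMESPACE"), root)
```

The result is a named logical vector, one entry per manifest line, so mismatching or missing files can be listed with `names(which(!result))`.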
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
useDynLib(fpop) | ||
export(Fpop,multiBinSeg) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
2019.01.19 | ||
|
||
Remove dependency on cghseg. | ||
|
||
Register routines in src/Rwrappers.cc | ||
|
||
2016.10.25 | ||
|
||
Fixed | ||
* checking package dependencies ... ERROR | ||
Namespace dependency not required: ‘cghseg’ | ||
|
||
2016.10.18 | ||
|
||
Some improved documentation for the return values of Fpop. | ||
|
||
Suggests: cghseg (instead of Depends). | ||
|
||
packageStartupMessage (instead of cat) and .onAttach (instead of .onLoad). | ||
|
||
2016.10.16 | ||
|
||
Remove GSL requirement, instead use INFINITY which is defined in | ||
math.h | ||
|
||
2016.10.03 | ||
|
||
package and docs updates to pass R CMD check. | ||
|
||
2014.7.16 | ||
|
||
multiBinSeg memory leak fixed. | ||
|
||
passes R CMD check with no errors, no warnings. | ||
|
||
0.0.1 | ||
|
||
first version with fast fpop code. | ||
|
||
multiBinSeg produced a memory leak. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
retour_op <- function | ||
### This function is used by the Fpop function to recover the best | ||
### segment ends from 1:n from the C output. | ||
(path | ||
### the path vector returned by the C function "colibri_op_R_c" | ||
){ | ||
chaine <- integer(1) | ||
chaine[1] <- length(path) | ||
j <- 2 | ||
while(chaine[j-1] > 0){ | ||
chaine[j] <- path[chaine[j-1]] | ||
j <- j + 1 | ||
} | ||
rev(chaine)[-1] | ||
### a vector with the best segment ends. | ||
} | ||
|
||
Fpop <- structure(function | ||
### Calls the fpop algorithm, which uses functional pruning and | ||
### optimal partitioning to recover the best segmentation with respect | ||
### to the L2 loss with a per change-point penalty of lambda. More | ||
### precisely, this function computes the solution to argmin_m | ||
### sum_{i=1}^n (x_i-m_i)^2 + lambda * sum_{i=1}^{n-1} I(m_i != | ||
### m_{i+1}), where the indicator function I counts the number of | ||
### changes in the mean vector m. | ||
(x, | ||
### A vector of double : the signal to be segmented | ||
lambda, | ||
### Value of the penalty | ||
mini=min(x), | ||
### Min value for the mean parameter of the segment | ||
maxi=max(x) | ||
### Max value for the mean parameter of the segment | ||
){ | ||
n <- length(x) | ||
A <- .C("colibri_op_R_c", signal=as.double(x), n=as.integer(n), | ||
lambda=as.double(lambda), min=as.double(mini), | ||
max=as.double(maxi), path=integer(n), cost=double(n) | ||
, PACKAGE="fpop") | ||
A$t.est <- retour_op(A$path) | ||
A$K <- length(A$t.est) | ||
A$J.est <- A$cost[n] - (A$K+1)*lambda + sum(x^2) | ||
return(A); | ||
### Named list with the following elements: input data (signal, n, | ||
### lambda, min, max), path (best previous segment end up to each data | ||
### point), cost (optimal penalized cost up to each data point), t.est | ||
### (vector of overall optimal segment ends), K (optimal number of | ||
### segments), J.est (total un-penalized cost of optimal model). To | ||
### see how cost relates to J.est, see definition of J.est in the R | ||
### source code for this function. | ||
}, ex=function(){ | ||
set.seed(1) | ||
N <- 100 | ||
data.vec <- c(rnorm(N), rnorm(N, 2), rnorm(N)) | ||
fit <- Fpop(data.vec, N) | ||
end.vec <- fit$t.est | ||
change.vec <- end.vec[-length(end.vec)] | ||
start.vec <- c(1, change.vec+1) | ||
segs.list <- list() | ||
for(seg.i in seq_along(start.vec)){ | ||
start <- start.vec[seg.i] | ||
end <- end.vec[seg.i] | ||
seg.data <- data.vec[start:end] | ||
seg.mean <- mean(seg.data) | ||
segs.list[[seg.i]] <- data.frame( | ||
start, end, | ||
mean=seg.mean, | ||
seg.cost=sum((seg.data-seg.mean)^2)) | ||
} | ||
segs <- do.call(rbind, segs.list) | ||
plot(data.vec) | ||
with(segs, segments(start-0.5, mean, end+0.5, mean, col="green")) | ||
with(segs[-1,], abline(v=start-0.5, col="green", lty="dotted")) | ||
}) | ||
|
||
fpop_analysis <- function | ||
### A function to count the number of intervals and/or candidate | ||
### segmentations at each step of fpop (under development) | ||
(x, | ||
### A vector of double : the signal to be segmented | ||
lambda, | ||
### Value of the penalty | ||
mini=min(x), | ||
### Min value for the mean parameter of the segment | ||
maxi=max(x) | ||
### Max value for the mean parameter of the segment | ||
){ | ||
n <- length(x) | ||
A <- .C("colibri_op_R_c_analysis", signal=as.double(x), n=as.integer(n), | ||
lambda=as.double(lambda), min=as.double(mini), max=as.double(maxi), | ||
path=integer(n), cost=double(n), nbCandidate=integer(n), | ||
PACKAGE="fpop") | ||
A$t.est <- retour_op(A$path) | ||
return(A); | ||
### Returns a list containing t.est, a vector with the positions of the change-points. | ||
} | ||
|
||
|
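The `path` vector documented above stores, for each data point, the end of the best previous segment, and `retour_op` walks it backwards from `n` to recover the segment ends. As a sketch of that idea, the plain-R code below builds a `path` vector by ordinary O(n^2) dynamic programming and then backtracks through it. It is a hypothetical reference implementation, not the package's pruned C++ code. The names `naive_op` and `backtrack` are assumptions, and the `cost` it returns is not claimed to use the same offset as the C routine's `cost` output.

```r
# Hypothetical reference implementation, not the package's C++ code:
# optimal partitioning by plain dynamic programming, O(n^2) time.
naive_op <- function(x, lambda) {
  n <- length(x)
  s1 <- c(0, cumsum(x))      # prefix sums of x
  s2 <- c(0, cumsum(x^2))    # prefix sums of x^2
  # Squared error of x[(a+1):b] around its mean (vectorized over a).
  seg_cost <- function(a, b) {
    (s2[b + 1] - s2[a + 1]) - (s1[b + 1] - s1[a + 1])^2 / (b - a)
  }
  best <- numeric(n)   # best[t]: optimal penalized cost of x[1:t]
  path <- integer(n)   # path[t]: end of the previous segment (0 = none)
  for (t in seq_len(n)) {
    prev <- 0:(t - 1)
    # Using -lambda for "no previous segment" makes the first segment free.
    prev_cost <- c(-lambda, best)[prev + 1]
    cand <- prev_cost + lambda + seg_cost(prev, t)
    k <- which.min(cand)
    best[t] <- cand[k]
    path[t] <- prev[k]
  }
  list(path = path, cost = best)
}

# Walk the path vector backwards from n, as retour_op does.
backtrack <- function(path) {
  ends <- length(path)
  while (path[ends[1]] > 0) ends <- c(path[ends[1]], ends)
  ends
}

fit <- naive_op(c(0, 0, 0, 5, 5, 5), lambda = 1)
backtrack(fit$path)  # segment ends, the last one being n
```

On small inputs, a reference like this can serve as a cross-check for a pruned implementation, since both target the same penalized criterion.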
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
multiBinSeg <- function | ||
### Binary segmentation of p profiles using the L2 loss | ||
(geno, | ||
### A matrix with n rows and p columns; each column is one profile | ||
Kmax | ||
### Maximum number of change-points | ||
){ | ||
if(is.matrix(geno)){ | ||
nRow <- nrow(geno) | ||
nCol <- ncol(geno) | ||
} else { | ||
nRow <- length(geno) | ||
nCol <- 1 | ||
} | ||
|
||
A <- .C("Call_BinSeg", | ||
x_i= as.double((geno)), | ||
K= as.integer(Kmax), | ||
n= as.integer(nRow), | ||
P= as.integer(nCol), | ||
t.est= integer(Kmax), | ||
J.est = double(Kmax), | ||
PACKAGE="fpop") | ||
##A$Cost <- sum(geno^2) - sum(apply(geno, 2, sum)^2/nRow) + c(0, cumsum(A$RupturesCost)) | ||
A | ||
### Returns a list with t.est, the successive change-points found by binary segmentation, and J.est, the corresponding L2 cost | ||
} | ||
|
||
|
||
|
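`multiBinSeg` hands the work to a compiled routine, so the greedy procedure itself is not visible in the R code above. The sketch below shows one common form of binary segmentation under the L2 loss, summed over the columns of a matrix. It is a hypothetical plain-R illustration, not the package's C++ routine. The name `bin_seg_sketch` is an assumption, and the code recomputes segment costs from scratch, so it is far slower than a real implementation.

```r
# Hypothetical sketch, not the package's C++ routine: greedy binary
# segmentation of the columns of a matrix under the L2 loss.
bin_seg_sketch <- function(geno, Kmax) {
  geno <- as.matrix(geno)          # a plain vector becomes one column
  n <- nrow(geno)
  # Squared error of rows a..b around the column means, summed over columns.
  sse <- function(a, b) {
    block <- geno[a:b, , drop = FALSE]
    sum(sweep(block, 2, colMeans(block))^2)
  }
  bounds <- c(0, n)                # current segment ends; 0 is a sentinel
  changes <- integer(0)
  for (k in seq_len(Kmax)) {
    best_gain <- -Inf
    best_t <- NA_integer_
    for (i in seq_len(length(bounds) - 1)) {
      a <- bounds[i] + 1
      b <- bounds[i + 1]
      if (b <= a) next             # a single point cannot be split
      whole <- sse(a, b)
      for (t in a:(b - 1)) {
        # Reduction in squared error from splitting rows a..b after row t.
        gain <- whole - sse(a, t) - sse(t + 1, b)
        if (gain > best_gain) {
          best_gain <- gain
          best_t <- t
        }
      }
    }
    if (is.na(best_t)) break       # nothing left to split
    changes <- c(changes, best_t)
    bounds <- sort(c(bounds, best_t))
  }
  changes
}

profile <- c(rep(0, 4), rep(10, 4), rep(3, 4))
bin_seg_sketch(profile, Kmax = 2)
```

Change-points come back in the order they were found, largest gain first, which matches the "successive change-points" wording in the documentation. Unlike optimal partitioning, this greedy search is not guaranteed to find the best segmentation.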
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
.onAttach <- function(lib, pkg, ...) { | ||
packageStartupMessage("Welcome to the fpop package. | ||
This package implements the FPOP algorithm (http://arxiv.org/abs/1409.1842), | ||
see the Fpop function.") | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
|
||
bibentry(bibtype = "Article", | ||
title = "On optimal multiple changepoint algorithms for large data", | ||
author = c(person("Robert", "Maidstone"), | ||
person("Toby", "Hocking"), | ||
person("Guillem", "Rigaill"), | ||
person("Paul", "Fearnhead")), | ||
journal="Statistics and Computing", | ||
year = 2017, | ||
volume = 27, | ||
url = "https://link.springer.com/article/10.1007/s11222-016-9636-3", | ||
publisher="Springer") |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
\name{Fpop} | ||
\alias{Fpop} | ||
\title{Fpop} | ||
\description{Calls the fpop algorithm, which uses functional pruning and | ||
optimal partitioning to recover the best segmentation with respect | ||
to the L2 loss with a per change-point penalty of lambda. More | ||
precisely, this function computes the solution to argmin_m | ||
sum_{i=1}^n (x_i-m_i)^2 + lambda * sum_{i=1}^{n-1} I(m_i != | ||
m_{i+1}), where the indicator function I counts the number of | ||
changes in the mean vector m.} | ||
\usage{Fpop(x, lambda, mini = min(x), maxi = max(x))} | ||
\arguments{ | ||
\item{x}{A vector of double : the signal to be segmented} | ||
\item{lambda}{Value of the penalty} | ||
\item{mini}{Min value for the mean parameter of the segment} | ||
\item{maxi}{Max value for the mean parameter of the segment} | ||
} | ||
|
||
\value{Named list with the following elements: input data (signal, n, | ||
lambda, min, max), path (best previous segment end up to each data | ||
point), cost (optimal penalized cost up to each data point), t.est | ||
(vector of overall optimal segment ends), K (optimal number of | ||
segments), J.est (total un-penalized cost of optimal model). To | ||
see how cost relates to J.est, see definition of J.est in the R | ||
source code for this function.} | ||
|
||
\author{Guillem Rigaill, Toby Dylan Hocking} | ||
|
||
|
||
|
||
|
||
\examples{ | ||
set.seed(1) | ||
N <- 100 | ||
data.vec <- c(rnorm(N), rnorm(N, 2), rnorm(N)) | ||
fit <- Fpop(data.vec, N) | ||
end.vec <- fit$t.est | ||
change.vec <- end.vec[-length(end.vec)] | ||
start.vec <- c(1, change.vec+1) | ||
segs.list <- list() | ||
for(seg.i in seq_along(start.vec)){ | ||
start <- start.vec[seg.i] | ||
end <- end.vec[seg.i] | ||
seg.data <- data.vec[start:end] | ||
seg.mean <- mean(seg.data) | ||
segs.list[[seg.i]] <- data.frame( | ||
start, end, | ||
mean=seg.mean, | ||
seg.cost=sum((seg.data-seg.mean)^2)) | ||
} | ||
segs <- do.call(rbind, segs.list) | ||
plot(data.vec) | ||
with(segs, segments(start-0.5, mean, end+0.5, mean, col="green")) | ||
with(segs[-1,], abline(v=start-0.5, col="green", lty="dotted")) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
\name{fpop-package} | ||
\alias{fpop-package} | ||
\alias{fpop} | ||
\docType{package} | ||
\title{Segmentation using optimal partitioning and functional pruning} | ||
\description{A wrapper to a C++ implementation of optimal partitioning with functional pruning} | ||
\details{ | ||
\tabular{ll}{Package: \tab fpop\cr | ||
Type: \tab Package\cr | ||
Title: \tab Segmentation using optimal partitioning and functional pruning\cr | ||
Version: \tab 2019.08.26\cr | ||
Date: \tab 2014-02-26\cr | ||
Author: \tab Guillem Rigaill\cr | ||
Maintainer: \tab Guillem Rigaill <rigaill@evry.inra.fr>\cr | ||
License: \tab LGPL (>= 2.1)\cr} | ||
} | ||
\author{Guillem Rigaill} | ||
|
||
\keyword{ package } | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
\name{fpop_analysis} | ||
\alias{fpop_analysis} | ||
\title{fpop analysis} | ||
\description{A function to count the number of intervals and/or candidate | ||
segmentations at each step of fpop (under development)} | ||
\usage{fpop_analysis(x, lambda, mini = min(x), maxi = max(x))} | ||
\arguments{ | ||
\item{x}{A vector of double : the signal to be segmented} | ||
\item{lambda}{Value of the penalty} | ||
\item{mini}{Min value for the mean parameter of the segment} | ||
\item{maxi}{Max value for the mean parameter of the segment} | ||
} | ||
|
||
\value{Returns a list containing t.est, a vector with the positions of the change-points.} | ||
|
||
\author{Guillem Rigaill, Toby Dylan Hocking} | ||
|
||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
\name{multiBinSeg} | ||
\alias{multiBinSeg} | ||
\title{multiBinSeg} | ||
\description{Binary segmentation of p profiles using the L2 loss} | ||
\usage{multiBinSeg(geno, Kmax)} | ||
\arguments{ | ||
\item{geno}{A matrix with n rows and p columns; each column is one profile} | ||
\item{Kmax}{Maximum number of change-points} | ||
} | ||
|
||
\value{Returns a list with t.est, the successive change-points found by binary segmentation, and J.est, the corresponding L2 cost} | ||
|
||
\author{Guillem Rigaill, Toby Dylan Hocking} | ||
|
||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
\name{retour_op} | ||
\alias{retour_op} | ||
\title{retour op} | ||
\description{This function is used by the Fpop function to recover the best | ||
segment ends from 1:n from the C output.} | ||
\usage{retour_op(path)} | ||
\arguments{ | ||
\item{path}{the path vector returned by the C function "colibri_op_R_c"} | ||
} | ||
|
||
\value{a vector with the best segment ends.} | ||
|
||
\author{Guillem Rigaill, Toby Dylan Hocking} | ||
|
||
|
||
|
||
|
||
|