version 2019.08.26
guillemr authored and cran-robot committed Aug 27, 2019
0 parents commit 855c148
Showing 27 changed files with 1,684 additions and 0 deletions.
30 changes: 30 additions & 0 deletions DESCRIPTION
@@ -0,0 +1,30 @@
Package: fpop
Type: Package
Title: Segmentation using Optimal Partitioning and Function Pruning
Version: 2019.08.26
Authors@R: c(person("Guillem", "Rigaill", email = "guillem.rigaill@inra.fr",
role = c("aut", "cre")),
person("Toby", "Hocking",
role = c("aut")),
person("Robert", "Maidstone",
role = c("aut")),
person("Michel", "Koskas",
role = c("ctb")),
person("Paul", "Fearnhead",
role = c("aut")))
Maintainer: Guillem Rigaill <guillem.rigaill@inra.fr>
Description: A dynamic programming algorithm for the fast segmentation of univariate signals into piecewise constant profiles.
The 'fpop' package is a wrapper to a C++ implementation of the fpop (Functional Pruning Optimal Partitioning) algorithm described in Maidstone et al. 2017
<doi:10.1007/s11222-016-9636-3>. The problem of detecting changepoints in a univariate sequence is formulated
in terms of minimizing the mean squared error over segmentations. The fpop algorithm exactly minimizes the sum of squared errors
plus a penalty that is linear in the number of changepoints.
License: LGPL (>= 2.1)
NeedsCompilation: yes
Packaged: 2019-08-26 05:34:19 UTC; grigaill
Author: Guillem Rigaill [aut, cre],
Toby Hocking [aut],
Robert Maidstone [aut],
Michel Koskas [ctb],
Paul Fearnhead [aut]
Repository: CRAN
Date/Publication: 2019-08-27 07:00:03 UTC
26 changes: 26 additions & 0 deletions MD5
@@ -0,0 +1,26 @@
9b4eff8ad9d4f19ba5689f0189cfcb85 *DESCRIPTION
f40bd2bf2b89f159be3166f925f0ab52 *NAMESPACE
476cad2d426880a65a522c5cbd2efaae *NEWS
3826d0297c61e36a81060f2648874c90 *R/fpaccess.R
94325cedacaf3e5faf95d6e6f3e0be68 *R/multiBinSeg.R
87bea7f58c72fe2da5d36b48f319d0fc *R/onLoad.R
73928861863be99e9a87e7887780e34d *inst/CITATION
d10e04d1d9c428e9b6dc4df884facbb5 *man/Fpop.Rd
faa689e9dcb31ef355f707d0d344dc5c *man/fpop-package.Rd
2c5794b971e0783e174fa9e235ec9ef6 *man/fpop_analysis.Rd
c8c3800c26698b6cd361246c77b1cd17 *man/multiBinSeg.Rd
7f3cc9daf28a53aadacaff82d57c6a63 *man/retour_op.Rd
785fa11da3df520d40db6795e569460c *src/BinSeg_MultiDim.cpp
eccfa93d1a27e4d81262330b154e09e6 *src/BinSeg_MultiDim.h
7ce6a7bc13518a8ef5ee5c46a97dfaa4 *src/Call_BinSeg_MultiDim.cpp
7db0b52ab414c9d3c499a4ce0e8f6e66 *src/Call_BinSeg_MultiDim.h
41312de0741592e07c1bf06a41c65137 *src/Heap.cpp
d019b008db337f73a2b1ad57ef11ad7f *src/Heap.h
e776f5fef8fcb67dc6d6a8c983f02e41 *src/Node.cpp
9efff35b99fc6973867ad4e0a35ca7f9 *src/Node.h
4ad000773b727ee4c03fc5852ecd3fcf *src/Rwrappers.cc
022ddf7f3f60be65f6ca082873e61979 *src/colibri.cc
da5d55c6c4ec95d7b9ac4731af5caeaa *src/colibri.h
ad92d9bcb1c9bff7e25142448f9e7d5e *src/liste.cc
849b053f9e0ef904c9c8ead64b229d28 *src/liste.h
a44d1eb38268a0ca23bcff9bfc01d186 *src/polynome2.h
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -0,0 +1,4 @@
useDynLib(fpop)
export(Fpop,multiBinSeg)


40 changes: 40 additions & 0 deletions NEWS
@@ -0,0 +1,40 @@
2019.01.19

Remove dependency on cghseg.

Register routines in src/Rwrappers.cc

2016.10.25

Fixed
* checking package dependencies ... ERROR
Namespace dependency not required: ‘cghseg’

2016.10.18

Some improved documentation for the return values of Fpop.

Suggests: cghseg (instead of Depends).

packageStartupMessage (instead of cat) and .onAttach (instead of .onLoad).

2016.10.16

Remove GSL requirement, instead use INFINITY which is defined in
math.h

2016.10.03

package and docs updates to pass R CMD check.

2014.7.16

multiBinSeg memory leak fixed.

passes R CMD check with no errors, no warnings.

0.0.1

first version with fast fpop code.

multiBinSeg produced a memory leak.
96 changes: 96 additions & 0 deletions R/fpaccess.R
@@ -0,0 +1,96 @@
retour_op <- function
### This function is used by the Fpop function to recover the best
### segment ends from 1:n from the C output.
(path
### the path vector returned by the C function "colibri_op_R_c"
 ){
  chaine <- integer(1)
  chaine[1] <- length(path) # start from the last data point
  j <- 2
  while(chaine[j-1] > 0){ # follow the back-pointers until a value <= 0
    chaine[j] <- path[chaine[j-1]]
    j <- j + 1
  }
  rev(chaine)[-1] # increasing order, terminal value dropped
### a vector with the best segment ends.
}
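
The traceback done by retour_op can be illustrated on a small hand-built path vector. This is a minimal sketch and not part of the package: the name trace_segment_ends and the example path are hypothetical, and it assumes, as the loop above does, that an entry of 0 marks a segment starting at the first data point.

```r
## Hypothetical stand-alone version of the back-pointer traceback.
## path[t] is assumed to hold the end of the segment preceding the
## segment that ends at t, with 0 meaning "no previous segment".
trace_segment_ends <- function(path) {
  ends <- length(path)            # the last segment always ends at n
  while (ends[1] > 0) {
    ends <- c(path[ends[1]], ends)  # prepend the previous segment end
  }
  ends[-1]                        # drop the terminal 0
}

## Six points, two segments: 1..3 and 4..6.
example.path <- c(0L, 0L, 0L, 3L, 3L, 3L)
example.ends <- trace_segment_ends(example.path)
print(example.ends)
```

Only the entries visited by the traceback (here positions 6 and 3) influence the result; the other entries describe optimal solutions for shorter prefixes of the data.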

Fpop <- structure(function
### Function calling the fpop algorithm, which uses functional pruning
### and optimal partitioning to recover the best segmentation with
### respect to the L2 loss with a per change-point penalty of lambda.
### More precisely, this function computes the solution to argmin_m
### sum_{i=1}^n (x_i-m_i)^2 + lambda * sum_{i=1}^{n-1} I(m_i !=
### m_{i+1}), where the indicator function I counts the number of
### changes in the mean vector m.
(x,
### A vector of doubles: the signal to be segmented
 lambda,
### Value of the penalty
 mini=min(x),
### Min value for the mean parameter of the segment
 maxi=max(x)
### Max value for the mean parameter of the segment
 ){
  n <- length(x)
  A <- .C("colibri_op_R_c", signal=as.double(x), n=as.integer(n),
          lambda=as.double(lambda), min=as.double(mini),
          max=as.double(maxi), path=integer(n), cost=double(n)
        , PACKAGE="fpop")
  A$t.est <- retour_op(A$path) # segment ends recovered from the back-pointers
  A$K <- length(A$t.est) # number of segments
  A$J.est <- A$cost[n] - (A$K+1)*lambda + sum(x^2)
  return(A)
### Named list with the following elements: input data (signal, n,
### lambda, min, max), path (best previous segment end up to each data
### point), cost (optimal penalized cost up to each data point), t.est
### (vector of overall optimal segment ends), K (optimal number of
### segments), J.est (total un-penalized cost of optimal model). To
### see how cost relates to J.est, see definition of J.est in the R
### source code for this function.
}, ex=function(){
set.seed(1)
N <- 100
data.vec <- c(rnorm(N), rnorm(N, 2), rnorm(N))
fit <- Fpop(data.vec, N)
end.vec <- fit$t.est
change.vec <- end.vec[-length(end.vec)]
start.vec <- c(1, change.vec+1)
segs.list <- list()
for(seg.i in seq_along(start.vec)){
start <- start.vec[seg.i]
end <- end.vec[seg.i]
seg.data <- data.vec[start:end]
seg.mean <- mean(seg.data)
segs.list[[seg.i]] <- data.frame(
start, end,
mean=seg.mean,
seg.cost=sum((seg.data-seg.mean)^2))
}
segs <- do.call(rbind, segs.list)
plot(data.vec)
with(segs, segments(start-0.5, mean, end+0.5, mean, col="green"))
with(segs[-1,], abline(v=start-0.5, col="green", lty="dotted"))
})
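
The objective documented for Fpop can be checked on small inputs with a plain quadratic-time optimal partitioning recursion. The sketch below is an assumption-laden reference, not the package's C++ code: the name op_reference is hypothetical, it applies no functional pruning, and its cost convention (squared error plus lambda per change) follows the formula in the documentation above rather than the internal convention of "colibri_op_R_c".

```r
## Hypothetical O(n^2) reference for
##   min_m sum_i (x_i - m_i)^2 + lambda * (number of changes in m).
op_reference <- function(x, lambda) {
  n <- length(x)
  cs <- c(0, cumsum(x))      # cs[t + 1]  = sum(x[1:t])
  cs2 <- c(0, cumsum(x^2))   # cs2[t + 1] = sum(x[1:t]^2)
  ## Squared error of x[s:e] around its own mean (vectorized over s).
  seg_cost <- function(s, e) {
    len <- e - s + 1
    (cs2[e + 1] - cs2[s]) - (cs[e + 1] - cs[s])^2 / len
  }
  best <- numeric(n + 1)     # best[t + 1]: optimal cost of x[1:t]
  best[1] <- -lambda         # so the first segment carries no penalty
  path <- integer(n)         # path[t]: end of the previous segment, 0 if none
  for (t in seq_len(n)) {
    starts <- seq_len(t)     # candidate start of the last segment
    cand <- best[starts] + seg_cost(starts, t) + lambda
    k <- which.min(cand)
    best[t + 1] <- cand[k]
    path[t] <- k - 1L
  }
  list(path = path, cost = best[-1])
}

toy <- op_reference(c(0, 0, 0, 10, 10, 10), lambda = 1)
print(toy$path)
print(toy$cost)
```

On the toy signal the optimum places a single change after the third point, so the last back-pointer is 3 and the final cost equals the penalty for one change. Functional pruning, as described in the cited paper, aims to reach the same optimum without scanning every candidate start at each step.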

fpop_analysis <- function
### A function to count the number of intervals and/or candidate
### segmentations at each step of fpop (under development)
(x,
### A vector of doubles: the signal to be segmented
 lambda,
### Value of the penalty
 mini=min(x),
### Min value for the mean parameter of the segment
 maxi=max(x)
### Max value for the mean parameter of the segment
 ){
  n <- length(x)
  A <- .C("colibri_op_R_c_analysis", signal=as.double(x), n=as.integer(n), lambda=as.double(lambda), min=as.double(mini),
          max=as.double(maxi), path=integer(n), cost=double(n), nbCandidate=integer(n), PACKAGE="fpop")
  A$t.est <- retour_op(A$path)
  return(A)
### a list of the .C outputs plus t.est, the vector of positions of the change-points
}


30 changes: 30 additions & 0 deletions R/multiBinSeg.R
@@ -0,0 +1,30 @@
multiBinSeg <- function
### Binary segmentation of p profiles using the L2 loss
(geno,
### A matrix with n rows and p columns; each column is one profile
 Kmax
### Maximum number of change-points
 ){
  if(is.matrix(geno)){ # class(geno) has length 2 for matrices in R >= 4.0
    nRow <- nrow(geno)
    nCol <- ncol(geno)
  } else {
    nRow <- length(geno)
    nCol <- 1
  }

  A <- .C("Call_BinSeg",
          x_i= as.double((geno)),
          K= as.integer(Kmax),
          n= as.integer(nRow),
          P= as.integer(nCol),
          t.est= integer(Kmax),
          J.est = double(Kmax),
          PACKAGE="fpop")
  ##A$Cost <- sum(geno^2) - sum(apply(geno, 2, sum)^2/nRow) + c(0, cumsum(A$RupturesCost))
  A
### a list with the successive change-points found by binary segmentation (t.est) and the L2 cost (J.est)
}
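
Binary segmentation repeatedly applies one building block: finding the single change-point that most reduces the L2 loss summed over all profiles. The sketch below shows only that step in plain R. It is an illustration under stated assumptions, not the heap-based C++ routine called by multiBinSeg: the name best_split is hypothetical, and the input is assumed to have at least two rows.

```r
## Hypothetical single-split search over p profiles (columns of geno).
best_split <- function(geno) {
  geno <- as.matrix(geno)
  n <- nrow(geno)
  p <- ncol(geno)
  csum <- matrix(apply(geno, 2, cumsum), nrow = n)  # column-wise cumulative sums
  total <- csum[n, ]                                # column totals
  t <- seq_len(n - 1)                               # candidate change-points
  left <- csum[t, , drop = FALSE]                   # sums of rows 1..t
  right <- matrix(total, nrow = n - 1, ncol = p, byrow = TRUE) - left
  ## Reduction in squared error when splitting after row t, summed over profiles.
  gain <- rowSums(left^2) / t + rowSums(right^2) / (n - t) - sum(total^2) / n
  list(t.est = which.max(gain), gain = max(gain))
}

toy.geno <- cbind(c(0, 0, 0, 10, 10, 10), rep(1, 6))
toy.split <- best_split(toy.geno)
print(toy.split)
```

A full binary segmentation would apply the same search recursively to the two resulting pieces and always keep the split with the largest gain, up to Kmax change-points. Unlike Fpop, this greedy procedure is not guaranteed to find the optimal segmentation.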



5 changes: 5 additions & 0 deletions R/onLoad.R
@@ -0,0 +1,5 @@
.onAttach <- function(lib, pkg, ...) {
packageStartupMessage("Welcome to the fpop package.
This package implements the FPOP algorithm (http://arxiv.org/abs/1409.1842),
see the Fpop function.")
}
12 changes: 12 additions & 0 deletions inst/CITATION
@@ -0,0 +1,12 @@

bibentry(bibtype = "Article",
title = "On optimal multiple changepoint algorithms for large data",
author = c(person("Robert", "Maidstone"),
person("Toby", "Hocking"),
person("Guillem", "Rigaill"),
person("Paul", "Fearnhead")),
journal="Statistics and Computing",
year = 2017,
volume = 27,
url = "https://link.springer.com/article/10.1007/s11222-016-9636-3",
publisher="Springer")
55 changes: 55 additions & 0 deletions man/Fpop.Rd
@@ -0,0 +1,55 @@
\name{Fpop}
\alias{Fpop}
\title{Fpop}
\description{Function calling the fpop algorithm, which uses functional pruning and
optimal partitioning to recover the best segmentation with respect
to the L2 loss with a per change-point penalty of lambda. More
precisely, this function computes the solution to argmin_m
sum_{i=1}^n (x_i-m_i)^2 + lambda * sum_{i=1}^{n-1} I(m_i !=
m_{i+1}), where the indicator function I counts the number of
changes in the mean vector m.}
\usage{Fpop(x, lambda, mini = min(x), maxi = max(x))}
\arguments{
\item{x}{A vector of doubles: the signal to be segmented}
\item{lambda}{Value of the penalty}
\item{mini}{Min value for the mean parameter of the segment}
\item{maxi}{Max value for the mean parameter of the segment}
}

\value{Named list with the following elements: input data (signal, n,
lambda, min, max), path (best previous segment end up to each data
point), cost (optimal penalized cost up to each data point), t.est
(vector of overall optimal segment ends), K (optimal number of
segments), J.est (total un-penalized cost of optimal model). To
see how cost relates to J.est, see definition of J.est in the R
source code for this function.}

\author{Guillem Rigaill, Toby Dylan Hocking}




\examples{
set.seed(1)
N <- 100
data.vec <- c(rnorm(N), rnorm(N, 2), rnorm(N))
fit <- Fpop(data.vec, N)
end.vec <- fit$t.est
change.vec <- end.vec[-length(end.vec)]
start.vec <- c(1, change.vec+1)
segs.list <- list()
for(seg.i in seq_along(start.vec)){
start <- start.vec[seg.i]
end <- end.vec[seg.i]
seg.data <- data.vec[start:end]
seg.mean <- mean(seg.data)
segs.list[[seg.i]] <- data.frame(
start, end,
mean=seg.mean,
seg.cost=sum((seg.data-seg.mean)^2))
}
segs <- do.call(rbind, segs.list)
plot(data.vec)
with(segs, segments(start-0.5, mean, end+0.5, mean, col="green"))
with(segs[-1,], abline(v=start-0.5, col="green", lty="dotted"))
}
23 changes: 23 additions & 0 deletions man/fpop-package.Rd
@@ -0,0 +1,23 @@
\name{fpop-package}
\alias{fpop-package}
\alias{fpop}
\docType{package}
\title{Segmentation using optimal partitioning and functional pruning}
\description{A wrapper to a C++ implementation of optimal partitioning with functional pruning}
\details{
\tabular{ll}{Package: \tab fpop\cr
Type: \tab Package\cr
Title: \tab Segmentation using optimal partitioning and functional pruning\cr
Version: \tab 2014.7.16\cr
Depends: \tab methods, cghseg\cr
SystemRequirements: \tab GNU GSL\cr
Date: \tab 2014-02-26\cr
Author: \tab Guillem Rigaill\cr
Maintainer: \tab Guillem Rigaill <rigaill@evry.inra.fr>\cr
License: \tab LGPL (>= 2.1)\cr}
}
\author{Guillem Rigaill}

\keyword{ package }


21 changes: 21 additions & 0 deletions man/fpop_analysis.Rd
@@ -0,0 +1,21 @@
\name{fpop_analysis}
\alias{fpop_analysis}
\title{fpop analysis}
\description{A function to count the number of intervals and/or candidate
segmentations at each step of fpop (under development)}
\usage{fpop_analysis(x, lambda, mini = min(x), maxi = max(x))}
\arguments{
\item{x}{A vector of doubles: the signal to be segmented}
\item{lambda}{Value of the penalty}
\item{mini}{Min value for the mean parameter of the segment}
\item{maxi}{Max value for the mean parameter of the segment}
}

\value{a list of the .C outputs plus t.est, the vector of positions of the change-points}

\author{Guillem Rigaill, Toby Dylan Hocking}





18 changes: 18 additions & 0 deletions man/multiBinSeg.Rd
@@ -0,0 +1,18 @@
\name{multiBinSeg}
\alias{multiBinSeg}
\title{multiBinSeg}
\description{Binary segmentation of p profiles using the L2 loss}
\usage{multiBinSeg(geno, Kmax)}
\arguments{
\item{geno}{A matrix with n rows and p columns; each column is one profile}
\item{Kmax}{Maximum number of change-points}
}

\value{a list with the successive change-points found by binary segmentation (t.est) and the L2 cost (J.est)}

\author{Guillem Rigaill, Toby Dylan Hocking}





18 changes: 18 additions & 0 deletions man/retour_op.Rd
@@ -0,0 +1,18 @@
\name{retour_op}
\alias{retour_op}
\title{retour op}
\description{This function is used by the Fpop function to recover the best
segment ends from 1:n from the C output.}
\usage{retour_op(path)}
\arguments{
\item{path}{the path vector returned by the C function "colibri_op_R_c"}
}

\value{a vector with the best segment ends.}

\author{Guillem Rigaill, Toby Dylan Hocking}




