dois
LeoEgidi committed Jun 11, 2024
1 parent dbcf867 commit 11684a1
Showing 2 changed files with 27 additions and 15 deletions.
paper/paper.bib: 14 additions & 7 deletions
@@ -24,7 +24,8 @@ @incollection{egidi2018maxima
year = 2018,
editor={C. Perna and M. Pratesi and A. Ruiz-Gazen},
publisher={Springer},
-pages={71--81}
+pages={71--81},
+doi = {10.1007/978-3-319-73906-9_7}
}


@@ -37,7 +38,8 @@ @article{egidi2018relabelling
number={4},
pages={957--969},
year={2018},
-publisher={Springer}
+publisher={Springer},
+doi = {10.1007/s11222-017-9774-2}
}


@@ -48,7 +50,8 @@ @ARTICLE{JMLR02
journal={Journal of Machine Learning Research},
year={2002},
volume={3},
-pages={583--617}
+pages={583--617},
+doi = {10.1162/153244303321897735}
}


@@ -60,7 +63,8 @@ @article{fredJain05
volume = {27},
number = {6},
year = {2005},
-pages = {835--850}
+pages = {835--850},
+doi = {10.1109/TPAMI.2005.113}
}
@@ -79,7 +83,8 @@ @incollection{puolamaki2009bayesian
booktitle={Advances in Intelligent Data Analysis VIII},
pages={381--392},
year={2009},
-publisher={Springer}
+publisher={Springer},
+doi = {10.1007/978-3-642-03915-7_33}
}

@article{fritsch2009improved,
@@ -90,7 +95,8 @@ @article{fritsch2009improved
number={2},
pages={367--391},
year={2009},
-publisher={International Society for Bayesian Analysis}
+publisher={International Society for Bayesian Analysis},
+doi = {10.1214/09-BA414}
}

@article{stephens2000dealing,
@@ -101,7 +107,8 @@ @article{stephens2000dealing
number={4},
pages={795--809},
year={2000},
-publisher={Wiley Online Library}
+publisher={Wiley Online Library},
+doi = {10.1111/1467-9868.00265}
}

@article{yao2012bayesian,
paper/paper.md: 13 additions & 8 deletions
@@ -34,6 +34,7 @@ affiliations:
---

# Summary


We introduce the `R` package `pivmet`, software that implements various pivotal methods for identifying, extracting, and using
the so-called pivotal units that are chosen from a partition of data points to represent the groups to which they belong.
@@ -46,7 +47,7 @@ model-based clustering through sparse finite mixture models (SFMM) [@malsiner201
which may improve classical clustering techniques---e.g. the classical $k$-means---via a careful seeding;
and Dirichlet process mixture models (DPMM) in Bayesian nonparametrics [@ferguson1973bayesian; @escobar1995bayesian; @neal2000markov].

-## Installation
+# Installation

The stable version of the package can be installed from the [Comprehensive R Archive Network (CRAN)](http://CRAN.R-project.org/package=pivmet):
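The install chunk itself is collapsed in this diff view; a minimal sketch of what it presumably contains, assuming the standard CRAN workflow:

```r
# standard CRAN installation (a sketch; the paper's exact chunk is collapsed in this diff)
install.packages("pivmet")
library(pivmet)
```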

@@ -58,14 +59,15 @@ library(pivmet)
However, before installing the package, the user should make sure to download the JAGS program at
[https://sourceforge.net/projects/mcmc-jags/](https://sourceforge.net/projects/mcmc-jags/).


# Statement of need

In the modern *big-data* and *machine learning* age, summarizing some essential information from a dataset is often valuable and can
help simplify the data pre-processing steps. The advantage of identifying representative units of a group---hereafter *pivotal units*
or *pivots*---chosen in such a way that they are as far as possible from units in the other groups and/or as similar as possible to the units in the same
group, is that they may convey relevant information about the group they belong to while avoiding wasteful operations.
Despite the lack of a strict theoretical framework behind their characterization, the pivots may be beneficial in many machine learning settings,
-such as clustering, classification, and mixture modelling when the interest is in deriving reliable estimates in mixture models and/or finding a partition of the data points. The theoretical framework concerning the pivotal methods implemented in the `pivmet` package is provided in [@egidi2018relabelling].
+such as clustering, classification, and mixture modelling when the interest is in deriving reliable estimates in mixture models and/or finding a partition of the data points. The theoretical framework concerning the pivotal methods implemented in the `pivmet` package is provided in @egidi2018relabelling.

The `pivmet` package for `R` is available from the Comprehensive `R` Archive Network (CRAN) at
[http://CRAN.R-project.org/package=pivmet](http://CRAN.R-project.org/package=pivmet) [@pivmet] and implements various pivotal selection criteria to
@@ -106,8 +108,9 @@ such as the number of consensus partitions.

# Example 1: relabelling for dealing with label switching


The Fishery dataset in the `bayesmix` [@bayesmix] package has been previously used by @titterington1985statistical and @papastamoulis2016label.
-It consists of 256 snapper length measurements---see left plot of \autoref{fig:example1} for the data histogram, along with an estimated
+It consists of 256 snapper length measurements---see \autoref{fig:example1} for the data histogram, along with an estimated
kernel density. Analogously to some previous works, we assume a Gaussian mixture model with $k=5$ groups, where $\mu_j$, $\sigma_j$ and $\eta_j$
are respectively the mean, the standard deviation and the weight of group $j = 1, \dots, k$. We fit our model by simulating $15000$ samples from the
posterior distribution of $(\mathbf{z}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\eta})$, selecting the default argument `software="rjags"`;
@@ -141,18 +144,20 @@ cat(res_stan$model)

![Histogram of the Fishery data. The blue line represents the estimated kernel density. \label{fig:example1}](fish_hist.png)

-![Fishery dataset: traceplots of the parameters $(\mathbf{\mu}, \mathbf{\sigma}, \mathbf{\eta})$ obtained via the `rjags` option for the
-`piv_MCMC` function (Gibbs sampling, 15000 MCMC iterations). Top row: Raw MCMC outputs.
-Bottom row: relabelled MCMC samples. \label{fig:example2}](fish_chains.pdf){width=60%}


-\autoref{fig:example2} displays the traceplots for the parameters $(\mathbf{\mu}, \mathbf{\sigma}, \mathbf{\eta})$. From the first row
+\autoref{fig:example2} displays the traceplots for the parameters $(\mu, \sigma, \eta)$. From the first row
showing the raw MCMC outputs as given by the Gibbs sampling, we note that label switching clearly occurred. Our algorithm is able to fix label switching
and reorder the means $\mu_j$ and the weights $\eta_j$, for $j=1,\ldots,k$, as emerges from the second row of the plot.
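A minimal sketch of this fit, assuming the `fish` dataset shipped with `bayesmix` and the `piv_MCMC`, `piv_rel`, and `piv_plot` interfaces as documented on CRAN (the paper's exact script is collapsed in this diff view):

```r
library(bayesmix)  # provides the Fishery (snapper length) data
library(pivmet)

data(fish)
y <- fish[, 1]  # 256 length measurements

# Gibbs sampling via rjags, k = 5 groups, 15000 MCMC iterations
res <- piv_MCMC(y = y, k = 5, nMC = 15000, software = "rjags")

# pivotal relabelling of the raw chains, then traceplots
rel <- piv_rel(mcmc = res)
piv_plot(y = y, mcmc = res, rel_est = rel, type = "chains")
```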

+![Fishery dataset: traceplots of the parameters $(\mu, \sigma, \eta)$ obtained via the `rjags` option for the
+`piv_MCMC` function (Gibbs sampling, 15000 MCMC iterations). Top row: Raw MCMC outputs.
+Bottom row: relabelled MCMC samples. \label{fig:example2}](fish_chains.pdf){width=70%}



# Example 2: consensus clustering


As is widely known, one of the drawbacks of the $k$-means algorithm is its inefficiency in distinguishing between groups of unbalanced sizes.
The recent literature on clustering methods has explored some approaches to combine several partitions via consensus clustering, which may improve the solution obtained from a single run of a clustering algorithm.
Here, we consider a consensus clustering technique based on $k$-means and pivotal methods used for a careful initial seeding, as sketched below.
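A minimal sketch of pivotal seeding for $k$-means on deliberately unbalanced synthetic groups; the data and the `piv_KMeans` call are illustrative assumptions (including that `piv_KMeans` returns a `kmeans`-like object with a `cluster` field), not the paper's exact example:

```r
library(pivmet)
set.seed(1)

# two Gaussian groups of very different sizes (20 vs 100 points)
x <- rbind(matrix(rnorm(2 * 20,  mean = 0), ncol = 2),
           matrix(rnorm(2 * 100, mean = 4), ncol = 2))

# k-means initialised through pivotal units
res <- piv_KMeans(x, centers = 2)
table(res$cluster)  # recovered cluster sizes
```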
