Revisions for CRAN Upload #32

Open
wants to merge 7 commits into
base: master
1 change: 0 additions & 1 deletion .Rprofile

This file was deleted.

4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: alto
Type: Package
Title: Aligns Topics across LDA Models
Version: 0.1.0
Version: 0.1.1
Authors@R: c(
person("Kris", "Sankaran", email = "ksankaran@wisc.edu",role = c("aut","cre")),
person("Laura","Symul", email = "laura.symul@uclouvain.be", role = c("aut")),
@@ -15,7 +15,7 @@ Description: "alto" is an R package that aligns topics from different LDA
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.2.3
RoxygenNote: 7.3.1
Imports:
T4transport,
dplyr,
15 changes: 13 additions & 2 deletions R/align_topics.R
@@ -318,7 +318,7 @@ product_weights <- function(gammas, ...) {
#' @export
transport_weights <- function(gammas, betas, reg = 0.1, ...) {
betas_mat <- do.call(rbind, betas)
costs <- suppressMessages(JSD(betas_mat))
costs <- suppressMessages(2 ^ JSD(betas_mat))
ix <- seq_len(nrow(betas[[1]]))

a <- colSums(gammas[[1]])
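
Reviewer note: the one-line change above swaps the transport cost from `JSD(betas_mat)` to `2 ^ JSD(betas_mat)`. The diff does not show which package provides `JSD` or why the change was made, so the following is only a sketch of its numerical effect. It assumes a base-2 Jensen-Shannon divergence and uses a hand-written helper, `jsd_matrix()`, which is hypothetical and not part of alto.

```r
# Hypothetical helper (not part of alto): pairwise Jensen-Shannon divergence
# between the rows of a matrix of probability vectors, using log base 2.
jsd_matrix <- function(p) {
  kl <- function(a, b) {
    keep <- a > 0  # terms with a == 0 contribute 0 by convention
    sum(a[keep] * log2(a[keep] / b[keep]))
  }
  n <- nrow(p)
  out <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      m <- (p[i, ] + p[j, ]) / 2  # mixture of the two rows
      out[i, j] <- 0.5 * kl(p[i, ], m) + 0.5 * kl(p[j, ], m)
    }
  }
  out
}

# Three toy "topics" over four words (each row sums to 1).
betas_mat <- rbind(
  c(0.70, 0.10, 0.10, 0.10),
  c(0.10, 0.70, 0.10, 0.10),
  c(1.00, 0.00, 0.00, 0.00)
)

costs_old <- jsd_matrix(betas_mat)  # zero on the diagonal, bounded by 1
costs_new <- 2 ^ costs_old          # one on the diagonal, bounded by 2
```

Under these assumptions the transformation is monotone, so the ranking of topic pairs by cost is unchanged, while every entry, including the diagonal, becomes strictly positive.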
@@ -429,29 +429,38 @@ setClass("alignment",
#' @export
setMethod("show", "alignment", print_alignment)

#' Generic - Extract Weights
#' @param x An alignment object output from \code{align_topics}.
setGeneric("weights", function(x) standardGeneric("weights"))
#' Weights Accessor for Alignment Class
#' @param x An alignment object output from \code{align_topics}.
#' @import methods
#' @export
setMethod("weights", "alignment", function(x) x@weights)

setGeneric("n_models", function(x) standardGeneric("n_models"))

#' Generic - Number of Models
#' @param x An alignment object output from \code{align_topics}.
setGeneric("n_models", function(x) standardGeneric("n_models"))
#' Number of Models Method for Alignment Class
#' @param x An alignment object output from \code{align_topics}.
#' @import methods
#' @export
setMethod("n_models", "alignment", function(x) nlevels(x@weights$m))


#' Generic - Number of Topics
#' @param x An alignment object output from \code{align_topics}.
setGeneric("n_topics", function(x) standardGeneric("n_topics"))
#' Number of Topics Method for Alignment Class
#' @param x An alignment object output from \code{align_topics}.
#' @import methods
#' @export
setMethod("n_topics", "alignment", function(x) nrow(x@topics))


#' Generic - Extract Models
#' @param x An alignment object output from \code{align_topics}.
setGeneric("models", function(x) standardGeneric("models"))
#' Extract Models underlying Alignment
#' @param x An alignment object output from \code{align_topics}.
@@ -460,6 +469,8 @@ setGeneric("models", function(x) standardGeneric("models"))
setMethod("models", "alignment", function(x) x@models)


#' Generic - List of Topics and their Summaries
#' @param x An alignment object output from \code{align_topics}.
setGeneric("topics", function(x) standardGeneric("topics"))
#' Extract List of Topics and their Summaries
#' @param x An alignment object output from \code{align_topics}.
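
Reviewer note: the hunks above add a roxygen block above each `setGeneric()` call, alongside the existing documentation for the matching `setMethod()`. As a minimal, self-contained sketch of that generic-plus-method pattern, the example below uses a toy class, `toy_alignment`, whose slot layout is an assumption made for illustration and is not alto's actual `alignment` class.

```r
library(methods)

# Toy stand-in for alto's "alignment" class (slot layout is assumed).
setClass(
  "toy_alignment",
  slots = c(weights = "data.frame", topics = "data.frame")
)

#' Generic - Number of Models
#' @param x A toy_alignment object.
setGeneric("n_models", function(x) standardGeneric("n_models"))

#' Number of Models Method for the Toy Class
#' @param x A toy_alignment object.
setMethod("n_models", "toy_alignment", function(x) nlevels(x@weights$m))

#' Generic - Number of Topics
#' @param x A toy_alignment object.
setGeneric("n_topics", function(x) standardGeneric("n_topics"))

#' Number of Topics Method for the Toy Class
#' @param x A toy_alignment object.
setMethod("n_topics", "toy_alignment", function(x) nrow(x@topics))

# Two models (factor levels of `m`) and five topics (rows of `topics`).
aln <- new(
  "toy_alignment",
  weights = data.frame(
    m = factor(c("K2", "K3", "K3")),
    weight = c(0.5, 0.3, 0.2)
  ),
  topics = data.frame(k = 1:5)
)

n_models(aln)
n_topics(aln)
```

Keeping one documented generic per accessor means each generic gets its own help entry, separate from the class-specific method.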
16 changes: 16 additions & 0 deletions R/plot_alignment.R
@@ -598,6 +598,22 @@ discrepancy <- function(p, lambda = 1e-7) {
#' encode? Defaults to 'path'. Other possible arguments are 'coherence',
#' 'refinement', or 'topic'.
#' @param model_name_repair_fun How should names be repaired before plotting?
#' @param label_topics (optional, default = \code{FALSE}) A \code{logical}
#' specifying if topics should be labeled with the \code{"color_by"} information.
#' @param add_leaves (optional, default = \code{FALSE}) A \code{logical}
#' specifying if the topic composition of leaf topics should be printed.
#' @param leaves_text_size (optional, default = \code{10}) specifies the font
#' size of leaf annotations in \code{pt} if \code{add_leaves} is \code{TRUE}.
#' @param n_features_in_leaves (optional, default = \code{3}) specifies the
#' maximum number of features that should be included in the leaf annotations
#' if \code{add_leaves} is \code{TRUE}.
#' @param min_feature_prop (optional, default = \code{0.1}) specifies the
#' minimum proportion of a feature in a topic for that feature to be included
#' in the leaf annotations if \code{add_leaves} is \code{TRUE}.
#' @param top_n_edges (optional, \code{integer}, default = \code{NULL}) specifies
#' the number of edges that should be drawn between the topics of subsequent models.
#' The \code{top_n_edges} with the highest weights are drawn. If \code{NULL}
#' (default), all edges are drawn.
#' @seealso align_topics
#' @return A \code{ggplot2} object describing the alignment weights across
#' models.
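
Reviewer note: the newly documented `top_n_edges`, `n_features_in_leaves`, and `min_feature_prop` parameters all describe filtering steps. The sketch below illustrates one plausible reading in base R. Both helpers (`keep_top_edges()` and `leaf_features()`) and the column names `m`, `k`, `k_next`, and `weight` are hypothetical, and the sketch assumes the edge cutoff is applied separately to each pair of subsequent models, which the documentation does not spell out.

```r
# Hypothetical helper: keep the `top_n` highest-weight edges within each
# group of edges (one group per value of `m`). NULL keeps every edge.
keep_top_edges <- function(edges, top_n = NULL) {
  if (is.null(top_n)) return(edges)
  groups <- split(edges, edges$m, drop = TRUE)
  kept <- lapply(groups, function(g) {
    g <- g[order(g$weight, decreasing = TRUE), , drop = FALSE]
    head(g, top_n)
  })
  out <- do.call(rbind, kept)
  rownames(out) <- NULL
  out
}

# Hypothetical helper: names of the features to print for one leaf topic,
# given a named vector of feature proportions.
leaf_features <- function(beta, n_features = 3, min_prop = 0.1) {
  beta <- sort(beta[beta >= min_prop], decreasing = TRUE)
  names(head(beta, n_features))
}

# Toy edges: three between models K2 and the next, two after K3.
edges <- data.frame(
  m = c("K2", "K2", "K2", "K3", "K3"),
  k = c(1, 1, 2, 1, 2),
  k_next = c(1, 2, 3, 1, 2),
  weight = c(0.6, 0.1, 0.3, 0.2, 0.8)
)
top_edges <- keep_top_edges(edges, top_n = 2)

# Toy topic: only features a, c, and d reach the 0.1 threshold.
beta <- c(a = 0.50, b = 0.05, c = 0.25, d = 0.12, e = 0.08)
leaf_labels <- leaf_features(beta)
```

With `top_n = NULL` the edge table is returned unchanged, mirroring the documented default of drawing all edges.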
17 changes: 11 additions & 6 deletions README.Rmd
@@ -16,15 +16,16 @@ knitr::opts_chunk$set(
# alto

<!-- badges: start -->
[![badge](https://img.shields.io/badge/launch-binder-579aca.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFkAAABZCAMAAABi1XidAAAB8lBMVEX///9XmsrmZYH1olJXmsr1olJXmsrmZYH1olJXmsr1olJXmsrmZYH1olL1olJXmsr1olJXmsrmZYH1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olJXmsrmZYH1olL1olL0nFf1olJXmsrmZYH1olJXmsq8dZb1olJXmsrmZYH1olJXmspXmspXmsr1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olLeaIVXmsrmZYH1olL1olL1olJXmsrmZYH1olLna31Xmsr1olJXmsr1olJXmsrmZYH1olLqoVr1olJXmsr1olJXmsrmZYH1olL1olKkfaPobXvviGabgadXmsqThKuofKHmZ4Dobnr1olJXmsr1olJXmspXmsr1olJXmsrfZ4TuhWn1olL1olJXmsqBi7X1olJXmspZmslbmMhbmsdemsVfl8ZgmsNim8Jpk8F0m7R4m7F5nLB6jbh7jbiDirOEibOGnKaMhq+PnaCVg6qWg6qegKaff6WhnpKofKGtnomxeZy3noG6dZi+n3vCcpPDcpPGn3bLb4/Mb47UbIrVa4rYoGjdaIbeaIXhoWHmZYHobXvpcHjqdHXreHLroVrsfG/uhGnuh2bwj2Hxk17yl1vzmljzm1j0nlX1olL3AJXWAAAAbXRSTlMAEBAQHx8gICAuLjAwMDw9PUBAQEpQUFBXV1hgYGBkcHBwcXl8gICAgoiIkJCQlJicnJ2goKCmqK+wsLC4usDAwMjP0NDQ1NbW3Nzg4ODi5+3v8PDw8/T09PX29vb39/f5+fr7+/z8/Pz9/v7+zczCxgAABC5JREFUeAHN1ul3k0UUBvCb1CTVpmpaitAGSLSpSuKCLWpbTKNJFGlcSMAFF63iUmRccNG6gLbuxkXU66JAUef/9LSpmXnyLr3T5AO/rzl5zj137p136BISy44fKJXuGN/d19PUfYeO67Znqtf2KH33Id1psXoFdW30sPZ1sMvs2D060AHqws4FHeJojLZqnw53cmfvg+XR8mC0OEjuxrXEkX5ydeVJLVIlV0e10PXk5k7dYeHu7Cj1j+49uKg7uLU61tGLw1lq27ugQYlclHC4bgv7VQ+TAyj5Zc/UjsPvs1sd5cWryWObtvWT2EPa4rtnWW3JkpjggEpbOsPr7F7EyNewtpBIslA7p43HCsnwooXTEc3UmPmCNn5lrqTJxy6nRmcavGZVt/3Da2pD5NHvsOHJCrdc1G2r3DITpU7yic7w/7Rxnjc0kt5GC4djiv2Sz3Fb2iEZg41/ddsFDoyuYrIkmFehz0HR2thPgQqMyQYb2OtB0WxsZ3BeG3+wpRb1vzl2UYBog8FfGhttFKjtAclnZYrRo9ryG9uG/FZQU4AEg8ZE9LjGMzTmqKXPLnlWVnIlQQTvxJf8ip7VgjZjyVPrjw1te5otM7RmP7xm+sK2Gv9I8Gi++BRbEkR9EBw8zRUcKxwp73xkaLiqQb+kGduJTNHG72zcW9LoJgqQxpP3/Tj//c3yB0tqzaml05/+orHLksVO+95kX7/7qgJvnjlrfr2Ggsyx0eoy9uPzN5SPd86aXggOsEKW2Prz7du3VID3/tzs/sSRs2w7ovVHKtjrX2pd7ZMlTxAYfBAL9jiDwfLkq55Tm7ifhMlTGPyCAs7RFRhn47JnlcB9RM5T97ASuZXIcVNuUDIndpDbdsfrqsOppeXl5Y+XVKdjFCTh+zGaVuj0d9zy05PPK3QzBamxdwtTCrzyg/2Rvf2EstUjordGwa/kx9mSJLr8mLLtCW8HHGJc2R5hS219IiF6PnTusOqcMl57gm0Z8kanKMAQg0qSyuZfn7zIt
sbGyO9QlnxY0eCuD1XL2ys/MsrQhltE7Ug0uFOzufJFE2PxBo/YAx8XPPdDwWN0MrDRYIZF0mSMKCNHgaIVFoBbNoLJ7tEQDKxGF0kcLQimojCZopv0OkNOyWCCg9XMVAi7ARJzQdM2QUh0gmBozjc3Skg6dSBRqDGYSUOu66Zg+I2fNZs/M3/f/Grl/XnyF1Gw3VKCez0PN5IUfFLqvgUN4C0qNqYs5YhPL+aVZYDE4IpUk57oSFnJm4FyCqqOE0jhY2SMyLFoo56zyo6becOS5UVDdj7Vih0zp+tcMhwRpBeLyqtIjlJKAIZSbI8SGSF3k0pA3mR5tHuwPFoa7N7reoq2bqCsAk1HqCu5uvI1n6JuRXI+S1Mco54YmYTwcn6Aeic+kssXi8XpXC4V3t7/ADuTNKaQJdScAAAAAElFTkSuQmCC)](https://mybinder.org/v2/gh/krisrs1128/alto_demo/HEAD?urlpath=rstudio)
<!-- badges: end -->

[`alto`](https://lasy.github.io/alto/index.html) for aligning topics across a collection of LDA models. It provides functions to support the most common tasks in the analysis workflow,
[`alto`](https://lasy.github.io/alto/index.html) is an R package for aligning topics across a collection of LDA models. It provides functions to support the most common tasks in the analysis workflow,

* `run_lda_models()` fits a collection of LDA models across resolution levels
* `align_topics()` aligns topics across a collection of LDA models
* `topics()` provides metrics of alignment quality
* `plot()` shows the flow of alignment weight across resolution levels
* `plot_beta()` shows the topics associated with each underlying model
- `run_lda_models()` fits a collection of LDA models across resolution levels
- `align_topics()` aligns topics across a collection of LDA models
- `topics()` provides metrics of alignment quality
- `plot()` shows the flow of alignment weight across resolution levels
- `plot_beta()` shows the topics associated with each underlying model

Alignment can be used for (multiresolution) exploratory analysis.
By highlighting topics that are robust across choices of K, it can also support evaluation of LDA models.
@@ -34,6 +35,10 @@ By highlighting topics that are robust across choices of K, it can also support
<figcaption>The figure above shows an example alignment between topic models. See the vignette "Using `alto` on vaginal microbiome data" to reproduce this figure.</figcaption>
</figure>

You can learn more about alto's algorithm and use cases in our paper:

> Fukuyama, J., Sankaran, K., & Symul, L. (2022). Multiscale analysis of count data through topic alignment. In Biostatistics (Vol. 24, Issue 4, pp. 1045–1065). Oxford University Press (OUP). https://doi.org/10.1093/biostatistics/kxac018


## Installation

35 changes: 17 additions & 18 deletions README.md
@@ -4,48 +4,47 @@
# alto

<!-- badges: start -->

[![badge](https://img.shields.io/badge/launch-binder-579aca.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFkAAABZCAMAAABi1XidAAAB8lBMVEX///9XmsrmZYH1olJXmsr1olJXmsrmZYH1olJXmsr1olJXmsrmZYH1olL1olJXmsr1olJXmsrmZYH1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olJXmsrmZYH1olL1olL0nFf1olJXmsrmZYH1olJXmsq8dZb1olJXmsrmZYH1olJXmspXmspXmsr1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olLeaIVXmsrmZYH1olL1olL1olJXmsrmZYH1olLna31Xmsr1olJXmsr1olJXmsrmZYH1olLqoVr1olJXmsr1olJXmsrmZYH1olL1olKkfaPobXvviGabgadXmsqThKuofKHmZ4Dobnr1olJXmsr1olJXmspXmsr1olJXmsrfZ4TuhWn1olL1olJXmsqBi7X1olJXmspZmslbmMhbmsdemsVfl8ZgmsNim8Jpk8F0m7R4m7F5nLB6jbh7jbiDirOEibOGnKaMhq+PnaCVg6qWg6qegKaff6WhnpKofKGtnomxeZy3noG6dZi+n3vCcpPDcpPGn3bLb4/Mb47UbIrVa4rYoGjdaIbeaIXhoWHmZYHobXvpcHjqdHXreHLroVrsfG/uhGnuh2bwj2Hxk17yl1vzmljzm1j0nlX1olL3AJXWAAAAbXRSTlMAEBAQHx8gICAuLjAwMDw9PUBAQEpQUFBXV1hgYGBkcHBwcXl8gICAgoiIkJCQlJicnJ2goKCmqK+wsLC4usDAwMjP0NDQ1NbW3Nzg4ODi5+3v8PDw8/T09PX29vb39/f5+fr7+/z8/Pz9/v7+zczCxgAABC5JREFUeAHN1ul3k0UUBvCb1CTVpmpaitAGSLSpSuKCLWpbTKNJFGlcSMAFF63iUmRccNG6gLbuxkXU66JAUef/9LSpmXnyLr3T5AO/rzl5zj137p136BISy44fKJXuGN/d19PUfYeO67Znqtf2KH33Id1psXoFdW30sPZ1sMvs2D060AHqws4FHeJojLZqnw53cmfvg+XR8mC0OEjuxrXEkX5ydeVJLVIlV0e10PXk5k7dYeHu7Cj1j+49uKg7uLU61tGLw1lq27ugQYlclHC4bgv7VQ+TAyj5Zc/UjsPvs1sd5cWryWObtvWT2EPa4rtnWW3JkpjggEpbOsPr7F7EyNewtpBIslA7p43HCsnwooXTEc3UmPmCNn5lrqTJxy6nRmcavGZVt/3Da2pD5NHvsOHJCrdc1G2r3DITpU7yic7w/7Rxnjc0kt5GC4djiv2Sz3Fb2iEZg41/ddsFDoyuYrIkmFehz0HR2thPgQqMyQYb2OtB0WxsZ3BeG3+wpRb1vzl2UYBog8FfGhttFKjtAclnZYrRo9ryG9uG/FZQU4AEg8ZE9LjGMzTmqKXPLnlWVnIlQQTvxJf8ip7VgjZjyVPrjw1te5otM7RmP7xm+sK2Gv9I8Gi++BRbEkR9EBw8zRUcKxwp73xkaLiqQb+kGduJTNHG72zcW9LoJgqQxpP3/Tj//c3yB0tqzaml05/+orHLksVO+95kX7/7qgJvnjlrfr2Ggsyx0eoy9uPzN5SPd86aXggOsEKW2Prz7du3VID3/tzs/sSRs2w7ovVHKtjrX2pd7ZMlTxAYfBAL9jiDwfLkq55Tm7ifhMlTGPyCAs7RFRhn47JnlcB9RM5T97ASuZXIcVNuUDIndpDbdsfrqsOppeXl5Y+XVKdjFCTh+zGaVuj0d9zy05PPK3QzBamxdwtTCrzyg/2Rvf2EstUjordGwa/kx9mSJLr8mLLtCW8HHGJc2R5hS219IiF6PnTusOqcMl57gm0Z8kanKMAQg0qSyuZfn7zIt
sbGyO9QlnxY0eCuD1XL2ys/MsrQhltE7Ug0uFOzufJFE2PxBo/YAx8XPPdDwWN0MrDRYIZF0mSMKCNHgaIVFoBbNoLJ7tEQDKxGF0kcLQimojCZopv0OkNOyWCCg9XMVAi7ARJzQdM2QUh0gmBozjc3Skg6dSBRqDGYSUOu66Zg+I2fNZs/M3/f/Grl/XnyF1Gw3VKCez0PN5IUfFLqvgUN4C0qNqYs5YhPL+aVZYDE4IpUk57oSFnJm4FyCqqOE0jhY2SMyLFoo56zyo6becOS5UVDdj7Vih0zp+tcMhwRpBeLyqtIjlJKAIZSbI8SGSF3k0pA3mR5tHuwPFoa7N7reoq2bqCsAk1HqCu5uvI1n6JuRXI+S1Mco54YmYTwcn6Aeic+kssXi8XpXC4V3t7/ADuTNKaQJdScAAAAAElFTkSuQmCC)](https://mybinder.org/v2/gh/krisrs1128/alto_demo/HEAD?urlpath=rstudio)
<!-- badges: end -->

[`alto`](https://lasy.github.io/alto/index.html) is an R package for aligning
topics across a collection of LDA models. It provides functions to support the
most common tasks in the analysis workflow,
[`alto`](https://lasy.github.io/alto/index.html) is an R package for
aligning topics across a collection of LDA models. It provides functions
to support the most common tasks in the analysis workflow,

- `run_lda_models()` fits a collection of LDA models across resolution
levels
- `align_topics()` aligns topics across a collection of LDA models
- `topics()` provides metrics of alignment quality
- `plot()` shows the flow of alignment weight across resolution levels
- `plot_beta()` shows the topics associated with each underlying model
- `run_lda_models()` fits a collection of LDA models across resolution
levels
- `align_topics()` aligns topics across a collection of LDA models
- `topics()` provides metrics of alignment quality
- `plot()` shows the flow of alignment weight across resolution levels
- `plot_beta()` shows the topics associated with each underlying model

Alignment can be used for (multiresolution) exploratory analysis. By
highlighting topics that are robust across choices of K, it can also
support evaluation of LDA models.

<figure>

<img src="man/figures/README-alignment-viz-2.png" width="75%"/>

<figcaption>

The figure above shows an example alignment between topic models. See
the vignette “Using `alto` on vaginal microbiome data” to reproduce this
figure.

</figcaption>

</figure>

You can learn more about alto’s algorithm and use cases in our paper:

> Fukuyama, J., Sankaran, K., & Symul, L. (2022). Multiscale analysis of
> count data through topic alignment. In Biostatistics (Vol. 24, Issue
> 4, pp. 1045–1065). Oxford University Press (OUP).
> <https://doi.org/10.1093/biostatistics/kxac018>

## Installation

<!-- You can install the released version of alto from [CRAN](https://CRAN.R-project.org) with: -->

<!-- ``` r -->

<!-- install.packages("alto") -->

<!-- ``` -->

<!-- And -->

You can install the development version from
2 changes: 1 addition & 1 deletion docs/404.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/LICENSE-text.html

2 changes: 1 addition & 1 deletion docs/LICENSE.html