
Submission: BaseSet  #359

Submitting Author: Lluís (@llrs)
Repository: llrs/BaseSet
Version submitted: 0.0.10
Editor: @annakrystalli
Reviewer 1: @arendsee
Reviewer 2: @j23414
Archive: TBD
Version accepted: TBD


  • Paste the full DESCRIPTION file inside a code block below:
Package: BaseSet
Title: Provides classes for working with sets
Version: 0.0.10
Authors@R: 
    person(given = "Lluís ",
           family = "Revilla Sancho",
           role = c("aut", "cre"),
           email = "lluis.revilla@gmail.com")
Description: A set collection, while not "tidy" in itself, can
    be thought of as three tidy data frames describing sets, elements and
    relations respectively. 'BaseSet' provides an approach to manipulate,
    load and use these virtual data frames.
License: MIT + file LICENSE
URL: https://github.com/llrs/BaseSet
BugReports: https://github.com/llrs/BaseSet/issues
Depends: 
    R (>= 3.6.0)
Imports: 
    dplyr (>= 0.7.8),
    magrittr,
    methods,
    rlang,
    utils,
    xml2
Suggests: 
    BiocStyle,
    covr,
    forcats,
    ggplot2,
    GO.db,
    GSEABase,
    knitr,
    org.Hs.eg.db,
    reactome.db,
    rmarkdown,
    spelling,
    testthat (>= 2.1.0),
    Biobase
VignetteBuilder: 
    knitr
Encoding: UTF-8
Language: en-US
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.0.2
Collate: 
    'AllClasses.R'
    'AllGenerics.R'
    'GMT.R'
    'GeneSetCollection.R'
    'activate.R'
    'add.R'
    'add_column.R'
    'add_relation.R'
    'adjacency.R'
    'arrange.R'
    'basesets-package.R'
    'cartesian.R'
    'complement.R'
    'data_frame.R'
    'deactivate.R'
    'droplevels.R'
    'elements.R'
    'filter.R'
    'group.R'
    'group_by.R'
    'head.R'
    'incidence.R'
    'independent.R'
    'operations.R'
    'intersection.R'
    'length.R'
    'list.R'
    'move_to.R'
    'mutate.R'
    'names.R'
    'naming.R'
    'nested.R'
    'obo.R'
    'power_set.R'
    'print.R'
    'pull.R'
    'relations.R'
    'remove.R'
    'remove_column.R'
    'rename.R'
    'select.R'
    'set.R'
    'size.R'
    'subtract.R'
    'tidy-set.R'
    'union.R'
    'utils-pipe.R'
    'xml.R'
    'zzz.R'

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • database access
    • data munging
    • data deposition
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

The package implements methods for working with sets, such as intersection, union, complement and other set operations, in a "tidy" way. It can also import several file formats used in the life sciences, such as GMT, GAF, and the OBO format for ontologies.
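As a rough illustration of what operating on sets in a "tidy" (long) way can look like, here is a minimal base R sketch. It deliberately does not use BaseSet's own functions; the toy data, the column names `elements` and `sets`, and the result names `A_or_B` and `A_and_B` are assumptions made only for this example.

```r
# Hypothetical toy data: one row per (element, set) relation, in long format.
relations <- data.frame(
  elements = c("a", "b", "c", "b", "c", "d"),
  sets     = c("A", "A", "A", "B", "B", "B"),
  stringsAsFactors = FALSE
)

# The other two tables (sets and elements) can be derived from the relations.
sets_df     <- data.frame(sets = unique(relations$sets),
                          stringsAsFactors = FALSE)
elements_df <- data.frame(elements = unique(relations$elements),
                          stringsAsFactors = FALSE)

# Elements belonging to each set.
in_A <- relations$elements[relations$sets == "A"]
in_B <- relations$elements[relations$sets == "B"]

# Set operations expressed as new rows in the same long format.
ab_union <- data.frame(elements = union(in_A, in_B), sets = "A_or_B",
                       stringsAsFactors = FALSE)
ab_inter <- data.frame(elements = intersect(in_A, in_B), sets = "A_and_B",
                       stringsAsFactors = FALSE)

# The derived sets live alongside the original ones in a single table.
result <- rbind(relations, ab_union, ab_inter)
```

Keeping the derived sets as extra rows in the same table is what lets one object hold many sets at once.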

  • Who is the target audience and what are scientific applications of this package?

The package is intended for working with gene sets and signatures in scRNAseq, or with pathways and ontologies, but it may also be useful in other fields.

The sets package implements a more general approach that can store functions or lists as elements of a set (whereas BaseSet only allows character or factor elements), but it is harder to operate on in a tidy/long way. Its intersection and union operations must also happen between two different objects, while a TidySet object (the class implemented in BaseSet) can store a single set or thousands of them.
In BaseSet it is easier to operate on sets and to implement new fuzzy logic operations. BaseSet is also developed openly on GitHub, whereas I couldn't find out how sets is being developed.
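To make the fuzzy logic point concrete, the sketch below shows the standard (Zadeh) fuzzy union and intersection, which take the maximum and minimum membership degree respectively. It is written in base R and is not BaseSet's implementation; the `fuzzy` column name, the helper `membership()` and the toy values are assumptions for illustration, and other fuzzy logics would swap in different functions for `pmax` and `pmin`.

```r
# Hypothetical fuzzy relations: `fuzzy` is the membership degree in [0, 1].
relations <- data.frame(
  elements = c("a", "b", "b", "c"),
  sets     = c("A", "A", "B", "B"),
  fuzzy    = c(1.0, 0.4, 0.7, 0.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: membership of every known element in one set,
# using 0 for elements that do not appear in that set.
membership <- function(rel, set) {
  elems <- sort(unique(rel$elements))
  m <- setNames(numeric(length(elems)), elems)
  rows <- rel[rel$sets == set, ]
  m[rows$elements] <- rows$fuzzy
  m
}

m_A <- membership(relations, "A")
m_B <- membership(relations, "B")

# Standard fuzzy union: element-wise maximum of the memberships.
fuzzy_union <- pmax(m_A, m_B)

# Standard fuzzy intersection: element-wise minimum of the memberships.
fuzzy_intersection <- pmin(m_A, m_B)
```

Because each operation reduces to one element-wise function over the membership vectors, trying a different fuzzy logic only requires replacing that function.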

The GSEABase package partially implements this, but it doesn't allow storing fuzzy sets, and it is also quite slow because it creates several classes for annotating each set. The BiocSet package doesn't use fuzzy set logic either.

There is also the hierarchicalSets package, which focuses on clustering sets that are inside other sets and on visualization. BaseSet, in contrast, focuses on storing and manipulating sets, including hierarchical sets.

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Most of the replies are copied from #339, handled by @melvidoni.

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options
  • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct
