Skip to content

lawremi/rdocdb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Introduction

When analyzing data, it is usually most convenient to model the data as a table, e.g., a data.frame. However, a table is a very rigid structure, and many modern formats (XML, JSON, YAML, etc) and databases (MongoDb, Solr, etc) follow the more flexible recursive list model, which can represent ragged, less structured data.

When analyzing a hierarchical dataset, we retain the desire to query it as if it were a table. We could project the list onto a table, but that may be inefficient if the data are sparse. Thus, we need a sparse table representation, i.e., something that resembles a data.frame on the outside but is actually a list.

The rdocdb package defines an abstraction for working with document collections in R. The API extends the base R API for data.frames and lists so that one can treat list data as hierarchical or tabular, without coercion. Many of the extensions have semantics specific to document collections. The package also implements the extensions on top of data.frame, so that one can choose the most efficient implementation with minimal adaptation.

Usage

TODO

About

Document collections in R

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages