Merged
3 changes: 3 additions & 0 deletions RcppTskit/NEWS.md
@@ -28,6 +28,9 @@ and releases adhere to [Semantic Versioning](https://semver.org/spec/v2.0.0.html
- Added `TableCollection$build_index()` to build indexes and
`TableCollection$drop_index()` to drop indexes.
- Added `TableCollection$num_*()` getters for the number of rows in the tables.
- Added `rtsk_individual_table_add_row()` and
`TableCollection$individual_table_add_row()` to append individual rows from
R, mirroring `tsk_individual_table_add_row()`.
- TODO

### Changed
49 changes: 49 additions & 0 deletions RcppTskit/R/Class-TableCollection.R
@@ -143,6 +143,55 @@ TableCollection <- R6Class(
rtsk_table_collection_get_num_individuals(self$xptr)
},

#' @description Add a row to the individuals table.
#' @param flags integer flags for the new individual.
#' @param location numeric vector with the location of the new individual.
#' @param parents integer vector with parent individual IDs (0-based).
#' @param metadata for the new individual; accepts \code{NULL},
#' a raw vector, or a character of length 1.
#' @details See the \code{tskit Python} equivalent at
#' \url{https://tskit.dev/tskit/docs/stable/python-api.html#tskit.IndividualTable.add_row}.
#' The function casts inputs to the expected class.
#' @return Integer row ID (0-based) of the newly added individual.
#' @examples
#' ts_file <- system.file("examples/test.trees", package = "RcppTskit")
#' tc <- tc_load(ts_file)
#' n_before <- tc$num_individuals()
#' new_id <- tc$individual_table_add_row()
#' new_id <- tc$individual_table_add_row(location = c(5, 8))
#' new_id <- tc$individual_table_add_row(flags = 0L)
#' new_id <- tc$individual_table_add_row(parents = c(0L, 2L))
#' new_id <- tc$individual_table_add_row(metadata = "abc")
#' new_id <- tc$individual_table_add_row(metadata = charToRaw("cba"))
Collaborator commented:

I think metadata should be a nested list rather than a string. For example:
In Python:

tb.individuals.add_row(metadata={'file_id':33})

In R with reticulate:

tb$individuals$add_row(metadata=list(file_id=as.integer(33)))

Member Author commented:

@LynxJinyangii thanks for raising this. I am not really clued in about metadata, so this part is very hazy for me. In the C function https://tskit.dev/tskit/docs/stable/c-api.html#c.tsk_individual_table_add_row metadata is a character vector. In the Python function https://tskit.dev/tskit/docs/stable/python-api.html#tskit.IndividualTable.add_row metadata is documented as "Any object that is valid metadata for the table’s schema. Defaults to the default metadata value for the table’s schema. This is typically {}. For no schema, None", and I am unsure what counts as an object that is valid metadata. Do you know, and could you suggest what to use here? Is it indeed a list()? @bryo-han, any thoughts from your end?

Collaborator commented:

Okay, I guess we'd better not handle this at the C/C++ layer. The C API docs say: "The main area of difference is, unlike the Python API, the C API doesn’t do any decoding, encoding or schema validation of [Metadata] fields, instead only handling the byte string representation of the metadata. Metadata is therefore never used directly by any tskit C API method, just stored" (https://tskit.dev/tskit/docs/stable/c-api.html). Perhaps we can use https://jeroen.r-universe.dev/jsonlite/doc/manual.html in R. @gregorgorjanc could you please give me an example of how to access a row in the individual table? Then I can try writing metadata as a list/dictionary (things like https://tskit.dev/pyslim/docs/latest/metadata.html) in R, storing it as binary in C, and decoding it again.
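
As an illustration of the split quoted above, here is a minimal, self-contained C++ sketch in which the storage layer only keeps metadata bytes and the caller is responsible for encoding and decoding. It does not use tskit or Rcpp; `MetadataColumn`, `add_row`, `get_row`, `to_bytes`, and `from_bytes` are hypothetical names made up for this example, and the JSON text is assumed to have been produced by some higher layer (for instance jsonlite in R).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a table's metadata column (not a tskit type).
// It stores each row's metadata as raw bytes and never interprets them.
class MetadataColumn {
public:
  // Append one row and return its 0-based row ID.
  // A null pointer is only allowed together with a zero length.
  int add_row(const std::uint8_t *metadata, std::size_t length) {
    assert(metadata != nullptr || length == 0);
    if (length == 0) {
      rows_.emplace_back(); // empty metadata for this row
    } else {
      rows_.emplace_back(metadata, metadata + length);
    }
    return static_cast<int>(rows_.size()) - 1;
  }

  // Return the stored bytes for a row; throws std::out_of_range on a bad ID.
  const std::vector<std::uint8_t> &get_row(int id) const {
    return rows_.at(static_cast<std::size_t>(id));
  }

private:
  std::vector<std::vector<std::uint8_t>> rows_;
};

// Caller-side helper (hypothetical): view already-encoded text as bytes.
std::vector<std::uint8_t> to_bytes(const std::string &text) {
  return std::vector<std::uint8_t>(text.begin(), text.end());
}

// Caller-side helper (hypothetical): turn stored bytes back into text.
// Parsing that text as JSON is left to the caller.
std::string from_bytes(const std::vector<std::uint8_t> &bytes) {
  return std::string(bytes.begin(), bytes.end());
}
```

With this sketch, passing the bytes of `{"file_id":33}` to `add_row()` and reading them back with `get_row()` yields the same text; interpreting that text as JSON (or anything else) stays entirely on the caller's side.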

Member Author commented:

I don't know much about JSON (though it looks simple enough), which is why I struggle with the metadata side of things.

As to examples of rtsk_individual_table_add_row() and tc$individual_table_add_row(), see https://github.com/HighlanderLab/RcppTskit/pull/122/changes#diff-b8bc9e42f1189821e14b71369310c9d873f56ac1337fa3a2f766817ccb09341aR156 and https://github.com/HighlanderLab/RcppTskit/pull/122/changes#diff-912ff421309575a5784c260f0fdfa9bfa88fb4f5c48acc539b1ca1542f16bc3cR1256 (these examples are part of this PR).

Member Author commented:

@LynxJinyangii @bryo-han we could set metadata aside for now and just assume it will be a character, then sort it out later as part of #36 and #24. Once we figure out the best way of handling metadata, we can propose a solution for that aspect, instead of getting bogged down in it while we are trying to add the add_row methods. Yes, let's focus on the add_row methods for all the tables first, and worry about the metadata later!

#' n_after <- tc$num_individuals()
individual_table_add_row = function(
  flags = 0L,
  location = NULL,
  parents = NULL,
  metadata = NULL
) {
  if (is.null(metadata)) {
    metadata_raw <- NULL
  } else if (is.raw(metadata)) {
    metadata_raw <- metadata
  } else if (
    is.character(metadata) && length(metadata) == 1L && !is.na(metadata)
  ) {
    metadata_raw <- charToRaw(metadata)
  } else {
    stop(
      "metadata must be NULL, a raw vector, or a length-1 non-NA character string!"
    )
  }
  rtsk_individual_table_add_row(
    tc = self$xptr,
    flags = as.integer(flags),
    location = if (is.null(location)) NULL else as.numeric(location),
    parents = if (is.null(parents)) NULL else as.integer(parents),
    metadata = metadata_raw
  )
},

#' @description Get the number of nodes in a table collection.
#' @return A signed 64 bit integer \code{bit64::integer64}.
#' @examples
8 changes: 8 additions & 0 deletions RcppTskit/R/RcppExports.R
@@ -203,6 +203,10 @@ rtsk_table_collection_metadata_length <- function(tc) {
.Call(`_RcppTskit_rtsk_table_collection_metadata_length`, tc)
}

rtsk_individual_table_add_row <- function(tc, flags = 0L, location = NULL, parents = NULL, metadata = NULL) {
    .Call(`_RcppTskit_rtsk_individual_table_add_row`, tc, flags, location, parents, metadata)
}

test_tsk_bug_assert_c <- function() {
invisible(.Call(`_RcppTskit_test_tsk_bug_assert_c`))
}
@@ -235,3 +239,7 @@ test_rtsk_table_collection_build_index_forced_error <- function(tc) {
invisible(.Call(`_RcppTskit_test_rtsk_table_collection_build_index_forced_error`, tc))
}

test_rtsk_individual_table_add_row_forced_error <- function(tc) {
    invisible(.Call(`_RcppTskit_test_rtsk_individual_table_add_row_forced_error`, tc))
}

5 changes: 5 additions & 0 deletions RcppTskit/inst/include/RcppTskit_public.hpp
@@ -57,5 +57,10 @@ void rtsk_table_collection_build_index(SEXP tc, int options = 0);
void rtsk_table_collection_drop_index(SEXP tc, int options = 0);
Rcpp::List rtsk_table_collection_summary(SEXP tc);
Rcpp::List rtsk_table_collection_metadata_length(SEXP tc);
int rtsk_individual_table_add_row(
    SEXP tc, int flags = 0,
    Rcpp::Nullable<Rcpp::NumericVector> location = R_NilValue,
    Rcpp::Nullable<Rcpp::IntegerVector> parents = R_NilValue,
    Rcpp::Nullable<Rcpp::RawVector> metadata = R_NilValue);

#endif
71 changes: 71 additions & 0 deletions RcppTskit/man/TableCollection.Rd

Some generated files are not rendered by default.

27 changes: 27 additions & 0 deletions RcppTskit/src/RcppExports.cpp
@@ -534,6 +534,21 @@ BEGIN_RCPP
return rcpp_result_gen;
END_RCPP
}
// rtsk_individual_table_add_row
int rtsk_individual_table_add_row(const SEXP tc, const int flags, const Rcpp::Nullable<Rcpp::NumericVector> location, const Rcpp::Nullable<Rcpp::IntegerVector> parents, const Rcpp::Nullable<Rcpp::RawVector> metadata);
RcppExport SEXP _RcppTskit_rtsk_individual_table_add_row(SEXP tcSEXP, SEXP flagsSEXP, SEXP locationSEXP, SEXP parentsSEXP, SEXP metadataSEXP) {
BEGIN_RCPP
    Rcpp::RObject rcpp_result_gen;
    Rcpp::RNGScope rcpp_rngScope_gen;
    Rcpp::traits::input_parameter< const SEXP >::type tc(tcSEXP);
    Rcpp::traits::input_parameter< const int >::type flags(flagsSEXP);
    Rcpp::traits::input_parameter< const Rcpp::Nullable<Rcpp::NumericVector> >::type location(locationSEXP);
    Rcpp::traits::input_parameter< const Rcpp::Nullable<Rcpp::IntegerVector> >::type parents(parentsSEXP);
    Rcpp::traits::input_parameter< const Rcpp::Nullable<Rcpp::RawVector> >::type metadata(metadataSEXP);
    rcpp_result_gen = Rcpp::wrap(rtsk_individual_table_add_row(tc, flags, location, parents, metadata));
    return rcpp_result_gen;
END_RCPP
}
// test_tsk_bug_assert_c
void test_tsk_bug_assert_c();
RcppExport SEXP _RcppTskit_test_tsk_bug_assert_c() {
@@ -612,6 +627,16 @@ BEGIN_RCPP
return R_NilValue;
END_RCPP
}
// test_rtsk_individual_table_add_row_forced_error
void test_rtsk_individual_table_add_row_forced_error(const SEXP tc);
RcppExport SEXP _RcppTskit_test_rtsk_individual_table_add_row_forced_error(SEXP tcSEXP) {
BEGIN_RCPP
    Rcpp::RNGScope rcpp_rngScope_gen;
    Rcpp::traits::input_parameter< const SEXP >::type tc(tcSEXP);
    test_rtsk_individual_table_add_row_forced_error(tc);
    return R_NilValue;
END_RCPP
}

static const R_CallMethodDef CallEntries[] = {
{"_RcppTskit_test_validate_options", (DL_FUNC) &_RcppTskit_test_validate_options, 2},
@@ -661,6 +686,7 @@ static const R_CallMethodDef CallEntries[] = {
{"_RcppTskit_rtsk_table_collection_drop_index", (DL_FUNC) &_RcppTskit_rtsk_table_collection_drop_index, 2},
{"_RcppTskit_rtsk_table_collection_summary", (DL_FUNC) &_RcppTskit_rtsk_table_collection_summary, 1},
{"_RcppTskit_rtsk_table_collection_metadata_length", (DL_FUNC) &_RcppTskit_rtsk_table_collection_metadata_length, 1},
{"_RcppTskit_rtsk_individual_table_add_row", (DL_FUNC) &_RcppTskit_rtsk_individual_table_add_row, 5},
{"_RcppTskit_test_tsk_bug_assert_c", (DL_FUNC) &_RcppTskit_test_tsk_bug_assert_c, 0},
{"_RcppTskit_test_tsk_bug_assert_cpp", (DL_FUNC) &_RcppTskit_test_tsk_bug_assert_cpp, 0},
{"_RcppTskit_test_tsk_trace_error_c", (DL_FUNC) &_RcppTskit_test_tsk_trace_error_c, 0},
@@ -669,6 +695,7 @@ static const R_CallMethodDef CallEntries[] = {
{"_RcppTskit_test_rtsk_treeseq_copy_tables_forced_error", (DL_FUNC) &_RcppTskit_test_rtsk_treeseq_copy_tables_forced_error, 1},
{"_RcppTskit_test_rtsk_treeseq_init_forced_error", (DL_FUNC) &_RcppTskit_test_rtsk_treeseq_init_forced_error, 1},
{"_RcppTskit_test_rtsk_table_collection_build_index_forced_error", (DL_FUNC) &_RcppTskit_test_rtsk_table_collection_build_index_forced_error, 1},
{"_RcppTskit_test_rtsk_individual_table_add_row_forced_error", (DL_FUNC) &_RcppTskit_test_rtsk_individual_table_add_row_forced_error, 1},
{NULL, NULL, 0}
};
