Refactor large string vector buffer allocation and creation #400

Merged
eddelbuettel merged 4 commits into master from de/sc-16919/large_char_vec
Apr 19, 2022

@eddelbuettel
Contributor

This PR updates and refactors how character vectors are handled in write queries. Instead of returning a temporary to R (where R's size limits apply), it keeps everything at the C++ level. In the process we can also remove one helper function.

This helps with the issue documented in tiledbsc #43.
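
For readers unfamiliar with the layout, the loop quoted in the review suggests a buffer made of one concatenated string plus a vector of start offsets. The following is a minimal, self-contained sketch of that idea, not the code from this PR: `VlcBufSketch` and `make_vlc_buf` are hypothetical names, and a `std::vector<const char*>` stands in for the Rcpp `CharacterVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the PR's buffer type; the real vlc_buf_t in
// src/libtiledb.cpp may have different or additional fields.
struct VlcBufSketch {
    std::vector<std::uint64_t> offsets;  // start of each string within `str`
    std::string str;                     // all strings concatenated, no separators
};

// Build the buffer entirely in C++ from C strings (assumed non-null).
VlcBufSketch make_vlc_buf(const std::vector<const char*>& vec) {
    VlcBufSketch buf;
    buf.offsets.resize(vec.size());
    std::uint64_t cumlen = 0;
    for (std::size_t i = 0; i < vec.size(); i++) {
        assert(vec[i] != nullptr);
        std::string s(vec[i]);       // first copy: out of the source element
        buf.offsets[i] = cumlen;     // this element starts at the running total
        buf.str += s;                // second copy: into the shared buffer
        cumlen += s.size();
    }
    return buf;
}
```

Because the offsets and the data live in a C++ object, nothing in this sketch passes through an R vector on the way to the write query.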

Member

@ihnorton left a comment

One question about allocations, otherwise LGTM.

Comment thread src/libtiledb.cpp
for (size_t i=0; i<n; i++) {
    std::string s(vec[i]);
    bufptr->offsets[i] = cumlen;
    bufptr->str += s;
Member

@ihnorton Apr 19, 2022

Does this copy s into vlc_buf_t? Just wondering if there's any way we can save one allocation and do the copy directly from the CharacterVector (since we already have one copy in the std::string s construction above).

Member

Is there a CharacterVector -> string_view conversion available? That would also work to avoid the allocation.

Contributor Author

@eddelbuettel Apr 19, 2022

Yes. It copies. R holds 'strings' in another place so I have always relied on creating a fresh std::string. That is different from the int or double vector case.

Rcpp has no string_view converter matching the string case. PRs welcome :)
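
One way the extra allocation discussed above might be avoided, sketched without Rcpp: assuming each element can be reached as a NUL-terminated `const char*`, a `std::string_view` (C++17) can wrap it without allocating, and the bytes are then appended to the buffer in a single copy. `VlcBufSketch` and `make_vlc_buf_nocopy` are hypothetical names, and the `std::vector<const char*>` input is a stand-in for the `CharacterVector`; this is not what the merged PR does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the PR's buffer type (offsets plus concatenated data).
struct VlcBufSketch {
    std::vector<std::uint64_t> offsets;
    std::string str;
};

// Append each element straight into the buffer, with no temporary std::string.
VlcBufSketch make_vlc_buf_nocopy(const std::vector<const char*>& vec) {
    VlcBufSketch buf;
    buf.offsets.reserve(vec.size());
    for (const char* p : vec) {
        assert(p != nullptr);
        std::string_view sv(p);                  // non-owning view; measures length, no allocation
        buf.offsets.push_back(buf.str.size());   // element starts at the current end of the buffer
        buf.str.append(sv.data(), sv.size());    // the only copy of the bytes
    }
    return buf;
}
```

The view is only valid while the underlying characters are alive, so in an R setting this would rely on the source vector staying protected for the duration of the loop.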

@eddelbuettel merged commit 5a543c6 into master Apr 19, 2022
@eddelbuettel deleted the de/sc-16919/large_char_vec branch April 19, 2022 13:51
@eddelbuettel mentioned this pull request May 16, 2022