Refactor large string vector buffer allocation and creation#400
Refactor large string vector buffer allocation and creation#400eddelbuettel merged 4 commits intomasterfrom
Conversation
|
This pull request has been linked to Shortcut Story #16919: Create string buffer without intermediates variables returned to R. |
ihnorton
left a comment
There was a problem hiding this comment.
One question about allocations, otherwise LGTM.
| for (size_t i=0; i<n; i++) { | ||
| std::string s(vec[i]); | ||
| bufptr->offsets[i] = cumlen; | ||
| bufptr->str += s; |
There was a problem hiding this comment.
Does this copy s into vlc_buf_t? Just wondering if there's any way we can save one allocation and do the copy directly from the CharacterVector (since we already have one copy in the std::string s construction above).
There was a problem hiding this comment.
Is there a CharacterVector -> string_view conversion available? That would also work to avoid the allocation.
There was a problem hiding this comment.
Yes. It copies. R holds 'strings' in another place so I have always relied on creating a fresh std::string. That is different from the int or double vector case.
Rcpp has no string_view converter matching the string case. PRs welcome :)
This PR updates and refactors how character vectors are handles in write queries, and replaces a return of a temporary back to R (where R's size limits apply) by keeping everything at the C++ level. In the process we can also remove one helper function.
This helps with the issue documented in
tiledbsc#43.