String encoding is dropped in C++ round-trip through String class #988

@clauswilke

Description

Is this behavior expected? It causes bugs that only show up on Windows, see e.g. here.

library(Rcpp)

cppFunction('CharacterVector test1(const String &s) {
  CharacterVector v(s);
  return v;
}')

cppFunction('CharacterVector test2(CharacterVector c) {
  String s = c[0];
  CharacterVector v(s);
  return v;
}')

cppFunction('CharacterVector test3(CharacterVector c) {
  CharacterVector v(c);
  return v;
}')

# the following fails on Windows for test1() and test2()
x <- "special char: \u03bc"
test1(x)
#> [1] "special char: μ"
test2(x)
#> [1] "special char: μ"
test3(x)
#> [1] "special char: μ"

# The encoding is lost for functions test1() and test2()
Encoding(x)
#> [1] "UTF-8"
Encoding(test1(x))
#> [1] "unknown"
Encoding(test2(x))
#> [1] "unknown"
Encoding(test3(x))
#> [1] "UTF-8"

Created on 2019-08-11 by the reprex package (v0.3.0)
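
To help tell whether only the encoding mark is dropped or the underlying bytes change as well, the check can be extended with a byte-level comparison. This is a minimal sketch using base R only; `describe_string()` is a hypothetical helper introduced here for illustration and is not part of Rcpp.

```r
# Hypothetical helper (not part of Rcpp): report the declared encoding
# and the raw bytes of a length-one character vector.
describe_string <- function(s) {
  stopifnot(is.character(s), length(s) == 1L)
  list(encoding = Encoding(s), bytes = charToRaw(s))
}

x <- "special char: \u03bc"
describe_string(x)

# With test1() from the reprex above defined, the same helper can be
# applied to its result and the bytes compared directly:
# describe_string(test1(x))
# identical(charToRaw(x), charToRaw(test1(x)))
```

If the bytes match and only the mark differs, the string content survived the round-trip and just the encoding declaration was lost; if the bytes differ, a conversion took place somewhere along the way.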
