Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need to mark the string returned as UTF-8 #278

Merged
merged 2 commits into from Oct 26, 2018

Conversation

Projects
None yet
3 participants
@shrektan
Copy link
Contributor

shrektan commented Oct 25, 2018

Hi @eddelbuettel , first of all, thanks for making this great package.

Occasionally, we need to return non-English characters from Bloomberg (as the example below). However, because the current version doesn't mark the encoding as UTF8 explicitly, it causes trouble on a Windows machine, where the native encoding varies.

This PR will fix this. I have manually compiled the PR version and tested it on the Bloomberg computer in the office and find no problems by far.

If there's more need to be done, please let me know.

Thanks.

Example

library(Rblpapi)
blpConnect()
out <- bdp(securities = c("BBG006YQMFQ5", "BBG005CMJQ71"),
           fields = c("NAME_CHINESE_SIMPLIFIED", "ISSUE_DT"),
           overrides = c("SECURITY_NAME_LANG" = "10"))
Encoding(out$NAME_CHINESE_SIMPLIFIED)
# before this PR, it says "unknown" and it causes trouble, because we have 
# to change the encoding by hand like `Encoding(out$NAME_CHINESE_SIMPLIFIED) <- "UTF-8"`
# in order to display the strings correctly.
out$ISSUE_DT
out$NAME_CHINESE_SIMPLIFIED
@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 25, 2018

Sorry, there's a bug... The NA Date value now returns "1970-01-01". I'll fix this and let you know...

@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 25, 2018

It's weird... If I compiled the package from the master branch (w/o this PR) on my computer, it still returns "1970-01-01" but the CRAN version returns the correct value "NA".

I have no clue at all.

So I think it's unrelated to this PR and the PR itself should be safe...

@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 25, 2018

I believe the "Date Issue" is caused by the PR #273

Rblpapi/src/bds.cpp

Lines 134 to 136 in 6e52cec

case BLPAPI_DATATYPE_DATE:
ans = Rcpp::DateVector(n);
break;

The default value created by Rcpp::DateVector() is 1970-01-01 rather than NA

Rcpp::cppFunction("SEXP test() { 
SEXP out;
out = Rcpp::DateVector(5);
return out;
}")
test()
#> [1] "1970-01-01" "1970-01-01" "1970-01-01" "1970-01-01" "1970-01-01"

Created on 2018-10-25 by the reprex package (v0.2.1)

@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 25, 2018

@eddelbuettel I address the above issue in another PR #279

And as usual, I tested it and works on my computer.

@eddelbuettel

This comment has been minimized.

Copy link
Member

eddelbuettel commented Oct 25, 2018

Yes, marking as utf-8 is a good idea. Having a similar issue in another project as it happens.

I cannot currently test. @johnlaing Any chance you could take a peek?

Show resolved Hide resolved DESCRIPTION Outdated
@johnlaing

This comment has been minimized.

Copy link
Contributor

johnlaing commented Oct 25, 2018

Not going to be able to get to this during the day, will try to have a look this evening.

@johnlaing johnlaing merged commit 1a24cbb into Rblp:master Oct 26, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@johnlaing

This comment has been minimized.

Copy link
Contributor

johnlaing commented Oct 26, 2018

@shrektan This is great, thanks for your contribution here and also on the date regression.

@eddelbuettel

This comment has been minimized.

Copy link
Member

eddelbuettel commented Oct 26, 2018

Seconded, and FWIW I just got a very similar "mark as UTF-8" PR for RcppTOML. Being where we are we something forget not that all encodings as as "baked in" as the one we use here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.