Stata (before 13) needs optional encoding parameter #163
If a version 13 dta file has labels defined in a non-UTF-8 encoding, version 0.2.0.9000 fails to recognize the variable labels.
I think these lines should be commented out in
@hadley I've extended the C API to allow manual specification of the file encoding. The trouble is that pre-14 Stata uses the system encoding (usually Win 1252) but does not indicate what that encoding is anywhere in the file. For kicks I also allow specifying the output encoding, which defaults to UTF-8. Here's the API diff from WizardMac/ReadStat@c4e0d48:
// Usually inferred from the file, but sometimes a manual override is desirable.
// In particular, pre-14 Stata uses the system encoding, which is usually Win 1252
// but could be anything. `encoding' should be an iconv-compatible name.
readstat_error_t readstat_set_input_character_encoding(readstat_parser_t *parser, const char *encoding);

// Defaults to UTF-8. Pass in NULL to disable transliteration.
readstat_error_t readstat_set_output_character_encoding(readstat_parser_t *parser, const char *encoding);
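To illustrate what an "iconv-compatible name" buys you, here is a minimal sketch of the kind of conversion involved, written directly against POSIX `iconv` rather than ReadStat. The helper `convert_label` is hypothetical and not part of ReadStat. It assumes the platform's iconv recognizes `CP1252` as a name for Windows-1252, and it says nothing about how ReadStat performs the conversion internally.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of ReadStat: converts the NUL-terminated
 * string `src` from `from_enc` to `to_enc` (both iconv-compatible names)
 * and writes a NUL-terminated result into `dst`.
 * Returns 0 on success, -1 on any failure (unknown encoding name,
 * invalid input bytes, or output buffer too small). */
int convert_label(const char *from_enc, const char *to_enc,
                  const char *src, char *dst, size_t dst_size) {
    assert(src != NULL && dst != NULL && dst_size > 0);

    /* Note the argument order: target encoding first, source second. */
    iconv_t cd = iconv_open(to_enc, from_enc);
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)src;          /* iconv takes a non-const pointer */
    size_t in_left = strlen(src);
    char *out = dst;
    size_t out_left = dst_size - 1;  /* reserve one byte for the NUL */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *out = '\0';
    return 0;
}
```

For example, calling `convert_label("CP1252", "UTF-8", "caf\xE9", buf, sizeof buf)` turns the single Windows-1252 byte `0xE9` (é) into the two-byte UTF-8 sequence `0xC3 0xA9`. With the API above, the same encoding name would presumably be passed to `readstat_set_input_character_encoding` for a pre-14 Stata file.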