write_dta not working for data.frame whose column names contain Unicode characters #383
Starting with v14, Stata supports Unicode letters in column names, so legal column names may include _, 0-9, and Unicode letters (not only Latin characters).
However, the code in haven.R used to validate whether the names are legal/valid:
This is not correct. The check should take another parameter, version, and for version >= 14 it could use the following code:
However, validate_dta is not the only function that validates the column names.
The function 'dta_validate_name' in readstat_dta_write.c also checks the column names. I tried commenting out these lines:
It seems to work, but I am not sure, given my limited experience in C.
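Rather than commenting the check out entirely, a version-aware check on the C side might look roughly like the sketch below. This is only an illustration, not ReadStat's actual dta_validate_name: the function name `sketch_validate_name`, its signature, and the `stata_version` parameter are all hypothetical. It also makes two simplifying assumptions: any non-ASCII byte is treated as part of a Unicode letter (no real UTF-8 validation or letter classification), and names are limited to 32 characters. Reserved words are not checked.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of ReadStat.
 * Returns 1 if `name` looks like a legal Stata variable name for the given
 * Stata release (assumed to be e.g. 13 or 14), otherwise 0. */
int sketch_validate_name(const char *name, int stata_version) {
    /* Assumption taken from the issue: Unicode letters are legal from v14 on */
    int allow_unicode = (stata_version >= 14);
    size_t i, n_chars = 0;

    if (name == NULL || name[0] == '\0')
        return 0;

    for (i = 0; name[i] != '\0'; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c >= 0x80) {
            /* Non-ASCII byte: only acceptable when Unicode names are allowed.
             * Simplification: every such byte is assumed to belong to a letter. */
            if (!allow_unicode)
                return 0;
            /* Count UTF-8 lead bytes only, so one character is counted once */
            if ((c & 0xC0) != 0x80)
                n_chars++;
            continue;
        }

        n_chars++;
        if (c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            continue;
        /* Digits are allowed anywhere except the first position */
        if (c >= '0' && c <= '9' && i > 0)
            continue;
        return 0;
    }

    /* Assumed limit: at most 32 characters (not bytes) */
    return n_chars <= 32;
}
```

With this sketch, `sketch_validate_name("var_1", 13)` is accepted for either version, while a name made of non-ASCII letters is accepted only when `stata_version` is 14 or higher. The R-side check in validate_dta would need the same version distinction for the two layers to agree.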