[iesave] comma in label breaks csv output #278

Closed
kbjarkefur opened this issue Aug 3, 2022 · 1 comment
Labels
resolved but not yet published Issue is fixed, but not yet published on SSC



A comma in a variable label, such as Milage , mpg, is allowed in Stata, but in the csv output it splits the label into two cells, so the remaining data points for that variable are shifted one column in the data table.
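To illustrate the shift, here is a minimal Python sketch (not Stata, and not the iesave code). It joins the example row from this issue with commas and parses it back with a standard csv reader. The variable names are made up for the illustration.

```python
import csv
import io

# Example row from this issue; the second cell is the variable label
cells = ["mpg", "Milage , mpg", "byte", "74"]

# Naive output: join the cells with commas, without any quoting
naive_line = ",".join(cells)

# A csv reader splits on every comma, so the label becomes two cells
parsed = next(csv.reader(io.StringIO(naive_line)))

print(naive_line)
print(len(cells), "cells written,", len(parsed), "cells read back")
```

Everything after the label ends up one column further along than intended.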

The csv solution would be to enclose all cells in double quotes, so that mpg,Milage , mpg,byte,74 becomes "mpg","Milage , mpg","byte","74". However, all strings would then need to be handled as compound double-quoted `" "' strings. Also, these extra quotes should not be added in the .md format.

In iebaltab I avoided that by writing a tab-separated temp file, importing it into Stata as a data file, and then using Stata's native csv export, which takes care of the quoting. I am not sure that is the best approach here, as the header is different and there is no equivalent for .md.
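The round trip can be sketched in Python as an analogy, not as the iebaltab implementation. The function name tsv_to_csv is hypothetical, and the sketch assumes that no cell contains a tab character.

```python
import csv
import os
import tempfile


def tsv_to_csv(rows, csv_path):
    """Write rows to a tab-separated temp file, then re-export them as csv."""
    # Step 1: tab-separated temp file; commas in labels are harmless here
    with tempfile.NamedTemporaryFile(
        "w", suffix=".tsv", delete=False, newline=""
    ) as tmp:
        for row in rows:
            tmp.write("\t".join(row) + "\n")
        tmp_path = tmp.name
    try:
        # Step 2: read the temp file back and let the csv library
        # decide which cells need quotes when exporting
        with open(tmp_path, newline="") as src, \
                open(csv_path, "w", newline="") as dst:
            reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
            csv.writer(dst).writerows(reader)
    finally:
        os.remove(tmp_path)
```

The exporting library adds quotes only where they are needed, so the code that builds the rows never has to handle them.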

Another approach is to not mix the code that generates the data points with the code that writes the output. If the data point code just creates all the values, then other code can specialize in output. In the current version of iesave in #276, the code is structured the same way as the old approach in iebaltab. iesave is likely not to be as complex, so it could be ok, but it may be worth trying to avoid.

Both the csv and the md output are structured enough that it should be possible to abstract this away in some functions. It would be best if Stata had support for lists or arrays.
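As a rough illustration of that separation, here is a Python sketch in which one function only produces values and two others only format them. Python is used because it has lists. All function names are hypothetical and the row is hard-coded from the example in this issue.

```python
import csv
import io


def collect_rows():
    """Data step: return plain values only, with no delimiters or quotes."""
    # Hard-coded for illustration; a real command would read these
    # values from the data set
    return [["mpg", "Milage , mpg", "byte", "74"]]


def to_csv(rows):
    """Output step for csv: the csv module quotes cells where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_md(rows):
    """Output step for md: pipe-separated cells, no extra quotes."""
    return "".join("| " + " | ".join(row) + " |\n" for row in rows)


rows = collect_rows()
print(to_csv(rows), end="")
print(to_md(rows), end="")
```

With this split, the quoting rule lives only in the csv writer and the md writer never sees it.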


kbjarkefur commented Aug 3, 2022

Why is it that you always have the best idea as soon as you hit submit? 😄

The only places a comma can appear are the variable label and the user name (rare and idiotic but possible). So let's just enclose these in " " and handle that properly. For the csv file you will then have mpg,"Milage , mpg",byte,74 and Milage , mpg will be in a single column. The md file will have | mpg | "Milage , mpg" | byte | 74 and the quotation marks will show, but I think that is ok.

This still requires that the local line is always handled as a compound double-quoted string `" "'.
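The proposal above can be sketched in Python (again an illustration, not the Stata code in iesave). Only the label is wrapped in double quotes, and the same cells are reused for both formats. The helper name quote is hypothetical, and the sketch assumes the label itself contains no double quote.

```python
import csv
import io


def quote(text):
    """Enclose a value in double quotes, as proposed for labels and user names."""
    return '"' + text + '"'


# Example values from this issue
name, label, vartype, value = "mpg", "Milage , mpg", "byte", "74"

# Only the label is quoted; the other cells cannot contain a comma
cells = [name, quote(label), vartype, value]

csv_line = ",".join(cells)
md_line = "| " + " | ".join(cells)

# A standard csv reader now keeps the label in a single column
parsed = next(csv.reader(io.StringIO(csv_line)))

print(csv_line)
print(md_line)
print(parsed)
```

The quotes are part of the cell text, so they also show up in the md line, which is the trade-off accepted above.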

luizaandrade added a commit that referenced this issue Aug 3, 2022
luizaandrade added the "resolved but not yet published" label (Issue is fixed, but not yet published on SSC) Aug 3, 2022