Minor suggestion: add option to sanitize atypical unicode #55

Open
owasow opened this issue Feb 22, 2021 · 1 comment
Comments


owasow commented Feb 22, 2021

I was working with some survey data in which a number of open-ended text responses included atypical unicode characters that broke latex compilation (though they might have worked with xelatex). These text strings tended to be nonsense input from users, so it wasn't essential to include them in the codebook, but I found it hard to find the right code to strip them out before running dataReporter. Ultimately, it turned out to be easy with `gsub('[^\x20-\x7E]', '', text)`, which removes every character outside the printable ASCII range, but it took me a while to locate this particular solution on StackOverflow (see link below).
https://stackoverflow.com/questions/38828620/how-to-remove-strange-characters-using-gsub-in-r

I wonder if an option/argument to automatically sanitize character strings would make sense given that this will likely be an issue for a wide range of data sets.
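
For anyone landing here with the same problem, here is a minimal sketch of how that `gsub` call could be applied to every character column of a data frame before building the report. The helper names `strip_non_ascii` and `sanitize_text_columns` and the toy `survey` data are made up for illustration and are not part of dataReporter. Note that this changes the data: accented letters and emoji are dropped along with the junk.

```r
# Hypothetical helpers, not part of dataReporter.
# Remove every character outside printable ASCII (space through tilde).
strip_non_ascii <- function(x) {
  gsub("[^\x20-\x7E]", "", x)
}

# Apply the stripping to character columns only; numeric and factor
# columns are left untouched.
sanitize_text_columns <- function(data) {
  is_text <- vapply(data, is.character, logical(1))
  data[is_text] <- lapply(data[is_text], strip_non_ascii)
  data
}

# Toy data standing in for the survey responses (made up for illustration).
survey <- data.frame(
  id = 1:3,
  response = c("fine", "caf\u00e9 \u2603", "\U0001F600ok"),
  stringsAsFactors = FALSE
)

clean <- sanitize_text_columns(survey)
```

The cleaned copy (`clean`) is what would then be passed to `makeDataReport`, leaving the original `survey` object as it was.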

@annennenne
Collaborator

Thanks for reaching out!

We are a bit hesitant about implementing features that alter the data, even if it's just for the sake of formatting.

Would you mind providing a bit more info about the encoding problem you were having? What did the text strings look like?

It sounds like you have found a good workaround. But as an alternative, you could also run `makeDataReport` with `standAlone = FALSE`, which gives you a .rmd document without the YAML header, and then write the YAML header yourself. This would allow you to choose which latex engine to use, include a .tex header, and more.
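
A rough sketch of that second route, under some assumptions: `add_yaml_header` is a hypothetical helper (not a dataReporter function) that prepends a hand-written header to a header-less .rmd file, and the `latex_engine` field is the standard rmarkdown `pdf_document` option. The `makeDataReport` call at the bottom is left commented out because, apart from `standAlone`, its arguments and the file name are assumptions that should be checked against the package documentation.

```r
# Hypothetical helper, not part of dataReporter: prepend a YAML header
# that selects the latex engine to an .rmd file that has no header yet.
add_yaml_header <- function(rmd_path, title = "Codebook",
                            latex_engine = "xelatex") {
  header <- c(
    "---",
    paste0("title: \"", title, "\""),
    "output:",
    "  pdf_document:",
    paste0("    latex_engine: ", latex_engine),
    "---",
    ""
  )
  body <- readLines(rmd_path, warn = FALSE)
  writeLines(c(header, body), rmd_path)
  invisible(rmd_path)
}

# Assumed workflow (requires dataReporter and rmarkdown; the file name
# and every argument other than standAlone are assumptions):
# makeDataReport(survey, standAlone = FALSE, file = "codebook.Rmd")
# add_yaml_header("codebook.Rmd")
# rmarkdown::render("codebook.Rmd")
```

With xelatex as the engine, the unusual characters may compile without being stripped, so the data itself stays unaltered.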
