question about encoding in the catalog_file #312

antuki opened this issue Sep 6, 2017 · 5 comments

feature a feature request or enhancement


antuki commented Sep 6, 2017


I'm using the very useful "read_sas" function to load the results of a French survey.

read_sas(data_file, catalog_file = NULL, encoding = NULL,
cols_only = NULL)

The "encoding" option works well when I want to change the data_file encoding.
But the catalog_file I'm using does not seem to have the same encoding as my data_file, and the encoding option doesn't seem to have any effect on the catalog_file's encoding.

Do you know if there is a way to change the catalog_file encoding from R?

Thank you

hadley commented Jan 7, 2018

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you!

If you've never heard of a reprex before, start by reading "What is a reprex", and follow the advice further down the page. Please make sure your reprex is created with the reprex package as it gives nicely formatted output and avoids a number of common pitfalls.

(@evanmiller just in case, does this ring any bells? It certainly looks like readstat_parse_sas7bcat uses parser->input_encoding)

@hadley hadley added the reprex needs a minimal reproducible example label Jan 7, 2018

evanmiller commented Jan 7, 2018

@hadley It sounds like you may want to provide an additional catalog_encoding option that you pass to the catalog parser.

hadley commented Jan 7, 2018

Oh hmmm. That might be a better reading of the question than mine.

antuki commented Jan 8, 2018

I think @evanmiller has understood my question correctly.
Here is a reprex.
The reprex package gave me a strange encoding for the French output ("compilé", for example) in my browser, so I changed it manually so that you can see what I get in my R session.
You can download the two files I used here (I don't know whether SAS files can be read directly from GitHub):
the dataset: dataset_format.sas7bdat
the formats: formats.sas7bcat
As you can see in the result, "à1" is written correctly in the dataset but not in the label, which comes from the formats ("<e0>1").

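      library(haven)
      #> Warning: le package 'haven' a été compilé avec la version R 3.3.3
      # Read the dataset together with its format catalog; encoding is left at its default.
      sas_base <- read_sas(data_file = "dataset_format.sas7bdat", catalog_file = "formats.sas7bcat", encoding = NULL, cols_only = NULL)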
      #> <Labelled character>
      #> [1] à1 à2 à3
      #> Labels:
      #>  value             label
      #>     à1 modalit<e9> <e0>1
      #>     à2 modalit<e9> <e0>2

      #> Session info -------------------------------------------------------------
      #>  setting  value                       
      #>  version  R version 3.3.2 (2016-10-31)
      #>  system   x86_64, mingw32             
      #>  ui       RTerm                       
      #>  language (EN)                        
      #>  collate  French_France.1252          
      #>  tz       Europe/Paris                
      #>  date     2018-01-08
      #> Packages -----------------------------------------------------------------
      #>  package   * version    date       source                          
      #>  backports   1.1.1      2017-09-25 CRAN (R 3.3.3)                  
      #>  base      * 3.3.2      2016-10-31 local                           
      #>  datasets  * 3.3.2      2016-10-31 local                           
      #>  devtools    1.13.3     2017-08-02 CRAN (R 3.3.3)                  
      #>  digest      0.6.12     2017-01-27 CRAN (R 3.3.3)                  
      #>  evaluate    0.10.1     2017-06-24 CRAN (R 3.3.3)                  
      #>  graphics  * 3.3.2      2016-10-31 local                           
      #>  grDevices * 3.3.2      2016-10-31 local                           
      #>  htmltools   0.3.6      2017-04-28 CRAN (R 3.3.3)                  
      #>  knitr       1.17       2017-08-10 CRAN (R 3.3.3)                  
      #>  magrittr    1.5        2014-11-22 CRAN (R 3.3.3)                  
      #>  memoise     1.1.0      2017-04-21 CRAN (R 3.3.3)                  
      #>  methods   * 3.3.2      2016-10-31 local                           
      #>  Rcpp        0.12.13    2017-09-28 CRAN (R 3.3.3)                  
      #>  rmarkdown   1.8        2017-11-17 CRAN (R 3.3.3)                  
      #>  rprojroot   1.2        2017-01-16 CRAN (R 3.3.3)                  
      #>  stats     * 3.3.2      2016-10-31 local                           
      #>  stringi     1.1.5      2017-04-07 CRAN (R 3.3.3)                  
      #>  stringr     1.2.0      2017-02-18 CRAN (R 3.3.3)                  
      #>  tools       3.3.2      2016-10-31 local                           
      #>  utils     * 3.3.2      2016-10-31 local                           
      #>  withr                  2018-01-08 Github (jimhester/withr@df18523)
      #>  yaml        2.1.14     2016-11-12 CRAN (R 3.3.3)

Thanks!
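
For anyone hitting the same symptom, one possible workaround is to re-encode the label text after reading. This is only a minimal sketch: it assumes the catalog labels are stored as Latin-1 bytes (which is what the `<e9>` and `<e0>` escapes suggest), `fix_label_encoding` is a hypothetical helper and not part of haven, and the vector below is built by hand to mimic the output above rather than read from the two files.

```r
# Hypothetical helper (not part of haven): re-encode the names of the
# "labels" attribute, which hold the label text shown when printing.
fix_label_encoding <- function(x, from = "latin1") {
  labels <- attr(x, "labels")
  if (!is.null(labels)) {
    names(labels) <- iconv(names(labels), from = from, to = "UTF-8")
    attr(x, "labels") <- labels
  }
  x
}

# Build a label as raw Latin-1 bytes (0xe9 = e-acute, 0xe0 = a-grave) with no
# declared encoding, to imitate what comes out of the catalog file.
raw_label <- function(n) {
  rawToChar(as.raw(c(0x6d, 0x6f, 0x64, 0x61, 0x6c, 0x69, 0x74, 0xe9,
                     0x20, 0xe0, utf8ToInt(n))))
}

# Hand-built stand-in for the column: the values are already correct,
# only the label text is mis-encoded.
values <- c("\u00e01", "\u00e02")
x <- values
attr(x, "labels") <- stats::setNames(values, c(raw_label("1"), raw_label("2")))

fixed <- fix_label_encoding(x)
names(attr(fixed, "labels"))
```

The helper only touches the label names, so values that were already decoded correctly via the "encoding" option are left alone. If the catalog was written in a different encoding, the `from` argument would need to change accordingly.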

@hadley hadley added feature a feature request or enhancement and removed reprex needs a minimal reproducible example labels Jan 8, 2018
@hadley hadley closed this as completed in a1065fd Jan 16, 2018

lock bot commented Jul 15, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue.

@lock lock bot locked and limited conversation to collaborators Jul 15, 2018