question about encoding in the catalog_file #312

Closed
antuki opened this Issue Sep 6, 2017 · 5 comments

antuki commented Sep 6, 2017

Hi,

I'm trying the very useful "read_sas" function to load the results of a French survey.

read_sas(data_file, catalog_file = NULL, encoding = NULL,
cols_only = NULL)

The "encoding" option works well when I want to change the data_file encoding.
But it seems that the catalog_file I'm using does not have the same encoding as my data_file, and the encoding option doesn't seem to have any effect on the catalog_file's encoding.

Do you know if there is a way to change the catalog_file encoding from R?

Thank you

Member

hadley commented Jan 7, 2018

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you!

If you've never heard of a reprex before, start by reading "What is a reprex", and follow the advice further down the page. Please make sure your reprex is created with the reprex package as it gives nicely formatted output and avoids a number of common pitfalls.

(@evanmiller just in case, does this ring any bells? It certainly looks like readstat_parse_sas7bcat uses parser->input_encoding)

@hadley hadley added the reprex label Jan 7, 2018

Contributor

evanmiller commented Jan 7, 2018

@hadley It sounds like you may want to provide an additional catalog_encoding option that you pass to the catalog parser.

Member

hadley commented Jan 7, 2018

Oh hmmm. That might be a better reading of the question than mine.


antuki commented Jan 8, 2018

I think @evanmiller may have a good understanding of my question.
Here is a reprex.
The reprex package gave me a strange encoding for the French output ("compilé", for example) in my browser, so I changed it manually so that you can see what I get in my R session.
You can download the two files I used here: https://github.com/antuki/reprex/tree/master/tidyverse_haven_312 (I don't know whether SAS files can be read directly from GitHub.)
the dataset: dataset_format.sas7bdat
the formats: formats.sas7bcat
As you can see in the result, "à1" is written correctly in the dataset but not in the label that comes from the formats ("<e0>1").

      library(haven)                                                                                                                
      #> Warning: le package 'haven' a été compilé avec la version R 3.3.3
      sas_base <- read_sas(data_file="dataset_format.sas7bdat", catalog_file = "formats.sas7bcat", encoding = NULL,cols_only = NULL)
      sas_base$a                                                                                                                    
      #> <Labelled character>
      #> [1] à1 à2 à3
      #> 
      #> Labels:
      #>  value             label
      #>     à1 modalit<e9> <e0>1
      #>     à2 modalit<e9> <e0>2


      devtools::session_info()
      #> Session info -------------------------------------------------------------
      #>  setting  value                       
      #>  version  R version 3.3.2 (2016-10-31)
      #>  system   x86_64, mingw32             
      #>  ui       RTerm                       
      #>  language (EN)                        
      #>  collate  French_France.1252          
      #>  tz       Europe/Paris                
      #>  date     2018-01-08
      #> Packages -----------------------------------------------------------------
      #>  package   * version    date       source                          
      #>  backports   1.1.1      2017-09-25 CRAN (R 3.3.3)                  
      #>  base      * 3.3.2      2016-10-31 local                           
      #>  datasets  * 3.3.2      2016-10-31 local                           
      #>  devtools    1.13.3     2017-08-02 CRAN (R 3.3.3)                  
      #>  digest      0.6.12     2017-01-27 CRAN (R 3.3.3)                  
      #>  evaluate    0.10.1     2017-06-24 CRAN (R 3.3.3)                  
      #>  graphics  * 3.3.2      2016-10-31 local                           
      #>  grDevices * 3.3.2      2016-10-31 local                           
      #>  htmltools   0.3.6      2017-04-28 CRAN (R 3.3.3)                  
      #>  knitr       1.17       2017-08-10 CRAN (R 3.3.3)                  
      #>  magrittr    1.5        2014-11-22 CRAN (R 3.3.3)                  
      #>  memoise     1.1.0      2017-04-21 CRAN (R 3.3.3)                  
      #>  methods   * 3.3.2      2016-10-31 local                           
      #>  Rcpp        0.12.13    2017-09-28 CRAN (R 3.3.3)                  
      #>  rmarkdown   1.8        2017-11-17 CRAN (R 3.3.3)                  
      #>  rprojroot   1.2        2017-01-16 CRAN (R 3.3.3)                  
      #>  stats     * 3.3.2      2016-10-31 local                           
      #>  stringi     1.1.5      2017-04-07 CRAN (R 3.3.3)                  
      #>  stringr     1.2.0      2017-02-18 CRAN (R 3.3.3)                  
      #>  tools       3.3.2      2016-10-31 local                           
      #>  utils     * 3.3.2      2016-10-31 local                           
      #>  withr       2.1.1.9000 2018-01-08 Github (jimhester/withr@df18523)
      #>  yaml        2.1.14     2016-11-12 CRAN (R 3.3.3)

Thanks !
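
The `<e9>` and `<e0>` escapes in the labels above are how R prints bytes it cannot interpret in the current encoding. A minimal base R sketch of that effect follows. It assumes the catalog stores its labels in Latin-1 / Windows-1252 (suggested by the `French_France.1252` collation in the session info, but not confirmed in this thread) and rebuilds the label bytes by hand instead of reading the real file.

```r
# Assumption: the catalog label is stored as Latin-1 / Windows-1252 bytes.
# Rebuild the bytes of "modalit<e9> <e0>1" by hand (0xe9 = é, 0xe0 = à).
label_bytes <- c(charToRaw("modalit"), as.raw(0xe9),
                 charToRaw(" "), as.raw(0xe0), charToRaw("1"))
raw_label <- rawToChar(label_bytes)

# The byte sequence is not valid UTF-8, which is why it cannot be
# displayed as-is once it is treated as UTF-8 text.
is_valid_utf8 <- validUTF8(raw_label)

# Declaring the source encoding and converting to UTF-8 recovers the text.
fixed <- iconv(raw_label, from = "latin1", to = "UTF-8")
print(fixed)
```

This only illustrates the symptom; the actual fix has to happen where the catalog is parsed, because the labels are already mis-decoded by the time they reach R.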

@hadley hadley added feature and removed reprex labels Jan 8, 2018

@hadley hadley closed this in a1065fd Jan 16, 2018
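
For readers landing here later: the suggestion above was a separate `catalog_encoding` option, and later haven releases document a `catalog_encoding` argument on `read_sas()`. The sketch below assumes your installed haven has that argument and that the catalog is Windows-1252 (neither is confirmed by this thread). `read_sas_with_catalog` is a hypothetical helper name, not part of haven, and it returns `NULL` when haven or the files are not available.

```r
# Hypothetical helper (not part of haven): read a SAS dataset and its
# catalog, passing a separate encoding for the catalog when supported.
read_sas_with_catalog <- function(data_file, catalog_file,
                                  encoding = NULL,
                                  catalog_encoding = "windows-1252") {
  # Nothing to do if haven is not installed or the files are missing.
  if (!requireNamespace("haven", quietly = TRUE)) return(NULL)
  if (!file.exists(data_file) || !file.exists(catalog_file)) return(NULL)

  # Assumption: only newer haven versions expose catalog_encoding.
  has_arg <- "catalog_encoding" %in% names(formals(haven::read_sas))
  if (!has_arg) {
    warning("This haven version has no catalog_encoding argument.")
    return(haven::read_sas(data_file, catalog_file = catalog_file,
                           encoding = encoding))
  }

  haven::read_sas(data_file, catalog_file = catalog_file,
                  encoding = encoding,
                  catalog_encoding = catalog_encoding)
}

# File names taken from the reprex above; returns NULL if they are absent.
sas_base <- read_sas_with_catalog("dataset_format.sas7bdat",
                                  "formats.sas7bcat")
```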


lock bot commented Jul 15, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jul 15, 2018
