catalog_file ignored in read_sas() when file created on Unix system #696

Closed
maike2011 opened this issue Sep 16, 2022 · 1 comment · Fixed by #713

maike2011 commented Sep 16, 2022

While working with SAS files created on a Unix system (SAS 9.4), we observed the following when using read_sas() with a catalog_file (haven 2.5.1, R 4.2.1):

The catalog file (sas7bcat) seems to be ignored (no message, no error) if it was created on Unix, while read_sas() works as expected for catalog files created on Windows, irrespective of the system on which the corresponding data file was created.

The attached zip contains a reproducible example with recreations of haven's example SAS data sets hadley.sas7bdat and formats.sas7bcat in different variations. Both were recreated twice, using either

  • Windows: WINDOWS_64 or
  • Unix: HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64

All data sets have wlatin1 encoding.

Both data files can be read with the Windows catalog file with formats being applied as expected, but no formats are available when using the Unix catalog file.
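
The comparison described above could be sketched roughly as follows. The file names are hypothetical placeholders (the names inside the attached zip are not listed in this thread), and the `has_labels()` helper is introduced here only for illustration. The sketch assumes that formats applied from a catalog show up as labelled columns in the result.

```r
library(haven)

# Hypothetical file names: substitute the files from the attached zip.
data_file   <- "hadley_windows.sas7bdat"
cat_windows <- "formats_windows.sas7bcat"
cat_unix    <- "formats_unix.sas7bcat"

# Illustrative helper: flags each column that carries value labels.
has_labels <- function(df) vapply(df, is.labelled, logical(1))

# Read the same data file once with each catalog file.
with_windows <- read_sas(data_file, catalog_file = cat_windows)
with_unix    <- read_sas(data_file, catalog_file = cat_unix)

# Compare whether any formats were attached in each case.
any(has_labels(with_windows))
any(has_labels(with_unix))
```

If the behaviour reported here is reproduced, the second check would show no labelled columns, with no warning or error raised by read_sas().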

@maike2011 maike2011 changed the title Catalog file ignored in read_sas() when created on Unix system catalog_file ignored in read_sas() when file created on Unix system Sep 16, 2022
gorcha (Member) commented Nov 24, 2022

Hi @maike2011, thanks for the bug report and for the example files!

This is likely an issue in ReadStat, the underlying C library. SAS don't publish any info on their file formats so all open source SAS readers rely on reverse engineering to support the various different file structures.

I'll have a look and see if there's an obvious cause, but it might take a while for any necessary changes to be made to ReadStat and flow downstream to haven.

gorcha added a commit that referenced this issue Feb 21, 2023
Maintains iconv hack from c1f9f19 and solaris hack from 4a878a1.

* Fix various SAS catalog file reading bugs (fix #529, fix #653, fix #680, fix #696, fix #705).
* Increase maximum SAS page file size to 16MB (fix #697).
* Ignore invalid SAV timestamp strings (fix #683).
* Fix compiler warnings (fix #707).
@gorcha gorcha closed this as completed in 196e8eb Feb 22, 2023