Add support for variable formats #123

Closed
wants to merge 6 commits into base: master

3 participants
@gorcha
Contributor

gorcha commented Oct 21, 2015

Feature request (closes #119): add support for variable formats when reading and writing. As with variable labels, formats are stored as an attribute on each vector. Missing or invalid formats (e.g. a string format assigned to a numeric variable) are replaced with the default format for the variable type.
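
A minimal base-R sketch of the behaviour described above, not the code in this PR. The helper `set_format()`, the attribute name `format`, and the SPSS-style defaults `A8` and `F8.2` are all assumptions made for illustration.

```r
# Hypothetical helper: attach a display format to a vector as an attribute,
# falling back to a type-appropriate default when the format is missing or
# does not match the vector type.
set_format <- function(x, format = NULL) {
  default <- if (is.character(x)) "A8" else "F8.2"  # assumed defaults
  # Treat formats starting with "A" as string formats (SPSS-style assumption).
  is_string_format <- !is.null(format) && grepl("^A", format)
  valid <- !is.null(format) && (is_string_format == is.character(x))
  attr(x, "format") <- if (valid) format else default
  x
}

income <- set_format(c(1200, 3400), "F10.0")  # valid numeric format is kept
bad <- set_format(c(1, 2), "A10")             # string format on a numeric

attr(income, "format")
attr(bad, "format")
```

Here `income` keeps the requested format, while `bad` receives the assumed numeric default because a string format was supplied for a numeric vector.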

@evanmiller

Contributor

evanmiller commented Oct 21, 2015

It is perhaps worth noting that the variable formats used by SAV, DTA, and SAS7BDAT files are incompatible with one another. In your current implementation, a format read from an SAV file and then (unwittingly) written to a DTA file may produce unexpected results.

Ideally these formats would be converted to some kind of intermediate representation either in ReadStat or in haven, but this of course will be a lot of work. A workaround would be to store format strings in separate attributes, format.spss, format.stata, etc.
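
The workaround could look roughly like the base-R sketch below. The lookup function `format_for()` is hypothetical, and the Stata fallback `%9.0g` is only an assumed default. The point is that a writer consults only the attribute for its own file type, so an SPSS format never leaks into a DTA file.

```r
# Hypothetical writer-side lookup: only the attribute matching the target
# file type is used; formats recorded for other file types are ignored.
format_for <- function(x, file_type, default = NULL) {
  fmt <- attr(x, paste0("format.", file_type), exact = TRUE)
  if (is.null(fmt)) default else fmt
}

age <- c(23, 41, 35)
attr(age, "format.spss") <- "F8.0"  # as if read from an SAV file

spss_fmt <- format_for(age, "spss")                        # stored SPSS format
stata_fmt <- format_for(age, "stata", default = "%9.0g")   # no Stata format set
```

Writing `age` to a DTA file would then use the fallback rather than the SPSS format string.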

@gorcha

Contributor

gorcha commented Oct 22, 2015

Thanks Evan, good idea. I've modified the PR so the attributes are FileType-specific.

@hadley

Member

hadley commented May 30, 2016

I think this is a reasonable idea, but it will need to be rewritten a little to work with #145, which I think is a good approach in the long term.

@hadley

Member

hadley commented May 31, 2016

Overall, I like this idea, although it's going to need some updates to work with the changes I've made to the internals. Do you have time to have a stab at updating this PR in the next week or so?

I think it might also be useful to implement a zap_formats() function that would drop all the format attributes, which you can then use in the unit tests.
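
A rough base-R sketch of what such a function might do, under stated assumptions: the name `zap_formats_sketch()` is hypothetical, and the attribute names follow the `format.spss` / `format.stata` convention suggested earlier in this thread, with `format.sas` assumed by analogy.

```r
# Hypothetical sketch: drop file-type-specific format attributes from a
# vector, or from every column of a data frame.
zap_formats_sketch <- function(x) {
  if (is.data.frame(x)) {
    x[] <- lapply(x, zap_formats_sketch)  # keep the data frame structure
    return(x)
  }
  for (name in c("format.spss", "format.stata", "format.sas")) {
    attr(x, name) <- NULL  # removing a missing attribute is a no-op
  }
  x
}

df <- data.frame(age = c(23, 41))
attr(df$age, "format.spss") <- "F8.0"

zapped <- zap_formats_sketch(df)
is.null(attr(zapped$age, "format.spss"))
```

In a unit test, a round-tripped data frame could be passed through a function like this before comparison so that format attributes do not cause spurious mismatches.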

@hadley

Member

hadley commented Jun 6, 2016

This needs quite a bit of work after other changes, so I'll take it on.

@hadley hadley closed this in 3707524 Jun 6, 2016

@lock lock bot locked and limited conversation to collaborators Jun 26, 2018
