Support for SPSS multiple response categories #25

Open
Berndvanderwielen opened this issue May 24, 2019 · 8 comments
Labels
enhancement (New feature or request) · requires changes in Readstat (waiting for changes in the C library Readstat)

Comments

@Berndvanderwielen

Support for retrieving (meta)data on MRVs / multiple-answer question groupings would be great.

@ofajardo
Collaborator

Sorry, I have no idea what MRVs are. Can you please provide an example SPSS file and explain what they are and what information you are trying to retrieve?

@Berndvanderwielen
Author

The official name is "Multiple Response sets". URL: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_23.0.0/spss/base/multiple_response_intro.html

If the SPSS documentation is not enough, I can provide an example SPSS file.

@ofajardo
Collaborator

ofajardo commented May 27, 2019

Yes, a sample file will be needed, along with a plain-text description of its contents, because the job is to work out where in the binary file the content you are looking for is stored.

This will require changes to the Readstat C library. I can file an issue over there, or you can do it yourself if you prefer. There is no guarantee that they will do it, nor any timeline.

It would help a lot if there is a description somewhere of how these fields are represented in the binary file. If you could find one, that would be great, because otherwise it will be very difficult to implement. An example of such a specification is here or here, but these don't seem to explain the feature you are requesting (can you see it in them?). Other libraries, in Python or other languages, that can do the job would also be useful to look at.

@ofajardo
Collaborator

Closed due to lack of example files.

@SamMousa

SamMousa commented Nov 9, 2020

@ofajardo, sorry for the ping, but I assume you're not receiving comments on closed issues.

Could this be reopened?

The specs for this record in the SPSS file are actually part of the spec you linked: https://www.gnu.org/software/pspp/pspp-dev/html_node/Multiple-Response-Sets-Records.html

Essentially, the record specifies the relationship between multiple questions (variables) that should be interpreted as a single question with multiple values.
example.sav.zip

Attached is an example file that contains 2 sets: one multiple category set and one multiple dichotomy set. For details you could check the docs here (https://www.gnu.org/software/pspp/manual/html_node/MRSETS.html#MRSETS), but that is not relevant for the implementation.

Let me know if I can be of further assistance. I'm not familiar with Python at all, but I have spent many hours hating the binary file format that is SPSS SAV...
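
To illustrate the difference between the two kinds of set, here is a minimal Python sketch. The variable names and data are made up for illustration (they are not taken from the attached file), and this only shows the concept, not how Readstat or pyreadstat would expose the sets.

```python
# Hypothetical survey rows; all variable names and values are assumptions.
rows = [
    {"q1_tv": 1, "q1_radio": 0, "q1_web": 1, "brand_1": "A", "brand_2": "C"},
    {"q1_tv": 0, "q1_radio": 0, "q1_web": 1, "brand_1": "B", "brand_2": None},
]


def dichotomy_set(row, variables, counted_value=1):
    """Multiple dichotomy: one yes/no variable per answer option.

    An option counts as selected when its variable equals the counted value.
    """
    return [name for name in variables if row[name] == counted_value]


def category_set(row, variables):
    """Multiple category: each variable holds one of the chosen answers.

    The answers are the non-missing values across the variables.
    """
    return [row[name] for name in variables if row[name] is not None]


# Three separate variables read as one question "which media do you use?"
media = [dichotomy_set(r, ["q1_tv", "q1_radio", "q1_web"]) for r in rows]

# Two separate variables read as one question "which brands do you know?"
brands = [category_set(r, ["brand_1", "brand_2"]) for r in rows]

print(media)
print(brands)
```

The multiple response set record only stores which variables belong together (plus the counted value and labels); the data itself stays in the individual variables, so a reader needs that metadata to do this kind of grouping.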

@ofajardo ofajardo reopened this Nov 9, 2020
@ofajardo
Collaborator

ofajardo commented Nov 9, 2020

Hi there,

I haven't looked into it in detail yet, but this will require that Readstat (the C library behind pyreadstat) implements it.

Could you therefore open an issue there? (I am sure they will appreciate your insights into the binary file format.) Once it is implemented in Readstat, I will be able to bring it into pyreadstat.

@slobodan-ilic

I've opened a new PR to address this: #259. As agreed with our team at Crunch.io and Evan Miller, we'll also open a PR on Readstat, so this won't be immediately available. The idea is to rebase the PR above once the Readstat changes are shipped.

@arsoni20

This feature will be hugely appreciated.
