This repository has been archived by the owner on Jun 23, 2020. It is now read-only.

Survey Datasets Transformation and Re-Configuration #26

Closed
evaristoc opened this issue May 2, 2016 · 8 comments

@evaristoc
Collaborator

evaristoc commented May 2, 2016

We are working on data transformation to fit the requirements for the analysis.

People who have been working on this so far:

We are currently working on @erictleung's fork but reporting through this channel to preserve history.

@evaristoc
Collaborator Author

evaristoc commented May 2, 2016

@erictleung identified several discrepancies, mostly related to data parsing, and has already fixed the ones he found. We should keep tracking this until we are sure that all corrections are complete.

@evaristoc
Collaborator Author

evaristoc commented May 2, 2016

These are the main tasks:

We are currently working out the best way to encode variables and values to facilitate analysis. This covers:

  • Variable renaming and/or creation
  • Variable labelling
  • Value encoding
  • Value labelling
  • Missing value encoding and labelling
  • Metadata File Management

We should take care that any change made to values or variables is also recorded in the metadata file manager.
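
To illustrate the point about keeping the metadata in step with the data, here is a minimal sketch. It is written in Python only for illustration (the actual processing in this thread is being done in R), and the question text, the variable name `is_employed`, the codes, and the helper functions are all hypothetical, not taken from the survey files. It assumes that blank or unrecognized answers are encoded as missing.

```python
# Hypothetical sketch: every name, label, and code below is illustrative,
# not taken from the actual survey files.
MISSING = None  # single sentinel used for missing values

# Metadata record: one entry per variable, updated alongside every change.
metadata = {}

def rename_variable(rows, old, new, label):
    """Rename a column in every row and record the change in the metadata."""
    for row in rows:
        row[new] = row.pop(old, MISSING)
    metadata[new] = {"original_name": old, "label": label, "values": {}}

def encode_values(rows, name, codes, labels):
    """Replace raw answers with codes; blank or unknown answers become MISSING."""
    for row in rows:
        raw = row.get(name)
        key = raw.strip().lower() if isinstance(raw, str) else raw
        row[name] = codes.get(key, MISSING)
    # Store the value labels next to the variable they describe.
    metadata[name]["values"] = dict(labels)

rows = [
    {"Are you employed?": "Yes"},
    {"Are you employed?": "no "},
    {"Are you employed?": ""},
]
rename_variable(rows, "Are you employed?", "is_employed", "Currently employed")
encode_values(rows, "is_employed", {"yes": 1, "no": 0}, {1: "Yes", 0: "No"})
```

Because the data and the metadata are changed in the same function call, a rename or re-encoding cannot be applied to one without the other.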

@QuincyLarson
Contributor

@evaristoc @erictleung Before you do this, I recommend you get a final dump of the survey. We've had several hundred additional responses since I last did a dump. I will upload the files now.

@QuincyLarson
Contributor

@erictleung @evaristoc OK - I've added the updated data.

@erictleung
Member

Just for reference, here is the branch of my fork of the repo to clean and combine the data. I'm primarily doing the data processing in R, for those interested.

@evaristoc
Collaborator Author

We found several inconsistencies for the answers given to:

  • Resources : Others
  • PodCasts : Others
  • Code Events : Others

We are parsing the data and trying to make these inconsistent answers consistent, based on general assumptions. This also means that answers that cannot be resolved under those assumptions will be treated as missing.

This procedure might not be perfect but it is the best we can do.
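
As a rough illustration of this kind of parsing, here is a minimal Python sketch (the actual work is being done in R). The alias table and the function `normalize_other` are hypothetical. The real spelling variants would have to come from inspecting the free-text answers. The assumptions here are that answers may list several items separated by commas, semicolons, slashes, or "and", and that anything not matching a known alias is dropped, so an answer with no recognizable item becomes missing.

```python
import re

# Hypothetical canonical names and spelling variants; the real lists would
# come from inspecting the survey's free-text "Others" answers.
ALIASES = {
    "codecademy": "Codecademy",
    "code academy": "Codecademy",
    "stack overflow": "Stack Overflow",
    "stackoverflow": "Stack Overflow",
}

def normalize_other(answer):
    """Split a free-text answer into canonical names; None means missing."""
    if not answer or not answer.strip():
        return None
    # Assumed separators: comma, semicolon, slash, or the word "and".
    parts = re.split(r"[,;/]|\band\b", answer.lower())
    found = []
    for part in parts:
        # Collapse repeated whitespace and trim spaces and trailing periods.
        key = re.sub(r"\s+", " ", part).strip(" .")
        name = ALIASES.get(key)
        if name and name not in found:
            found.append(name)
    # No recognizable item at all: treat the answer as missing.
    return found or None
```

For example, `normalize_other("Code Academy, stackoverflow")` yields `["Codecademy", "Stack Overflow"]`, while `normalize_other("n/a")` yields `None`.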

@evaristoc
Collaborator Author

evaristoc commented May 12, 2016

Sorry for the two recent messages (5 min ago): they were deleted and the discussion moved to #29.

@SamAI-Software
Member

Reference for questions about data #41
