Missing values problem read_dta() #43

Closed
aftollefsen opened this Issue Mar 10, 2015 · 1 comment

aftollefsen commented Mar 10, 2015

Great package. However, I have a problem reading Stata files: missing values are not identified as missing (NA) after import.
My example data is here: https://www.dropbox.com/s/msbz5f7d5p84k2d/BFIR21FL.zip?dl=0

Code replicating my issue:

bfir21fl <- read_dta(path = "BFIR21FL.DTA")
str(bfir21fl$v121)
Class 'labelled' atomic [1:6354] 1 1 1 1 ...
..- attr(*, "label")= chr "has television"
..- attr(*, "labels")= Named int [1:2] 0 1
.. ..- attr(*, "names")= chr [1:2] "no" "yes"
sum(is.na(bfir21fl$v121))
[1] 0
sum(is.na(as_factor(bfir21fl$v121)))
[1] 0

Using read.dta() from the foreign package instead gives:

bfir21fl_1 <- read.dta("BFIR21FL.DTA")
str(bfir21fl_1$v121)
Factor w/ 2 levels "no","yes": 2 2 2 2 2 1 1 1 1 1 ...
length(bfir21fl_1$v121)
[1] 6354
sum(is.na(bfir21fl_1$v121))
[1] 61

61 is the correct count. Am I doing something wrong, or is read_dta() not converting missing values correctly (or at least not the way I expect)?

Contributor

evanmiller commented Mar 11, 2015

@hadley this file is imported correctly with ReadStat, with 61 values identified as missing in column v121.

@hadley hadley closed this in 45c559e Apr 7, 2015

@lock lock bot locked and limited conversation to collaborators Jun 27, 2018
