
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, : 'names' attribute [16] must be the same length as the vector [1] #131

Closed
chilochibi opened this issue May 8, 2022 · 12 comments · Fixed by #132


@chilochibi

Hello, I have seen that this issue has been addressed before, but I am still unable to download the datasets. I have access to two projects on the DHS site, yet I am getting the error below. I followed the step-by-step example provided at the link below. https://www.rdocumentation.org/packages/rdhs/versions/0.7.3

Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line,  : 
  'names' attribute [16] must be the same length as the vector [1]
@jeffeaton
Collaborator

Thanks for reporting this. I cannot tell from the above what you are trying to do or what commands are giving rise to the error.

Can you please post the code that you are trying to run that gives rise to the error and the full console output?

Thanks,
Jeff

@chilochibi
Author

Thanks @jeffeaton for the response. Below is the console output. I am trying to download full DHS datasets using the get_datasets function.

`> library(rdhs)

sc <- dhs_survey_characteristics()
sc[grepl("Malaria", sc$SurveyCharacteristicName), ]
SurveyCharacteristicID SurveyCharacteristicName
71 96 Malaria DBS
72 90 Malaria microscopy
73 124 Malaria microscopy
74 119 Malaria microscopy - thin smear
75 57 Malaria questions
76 89 Malaria RDT
ids <- dhs_countries(returnFields=c("CountryName", "DHS_CountryCode"))

survs <- dhs_surveys(surveyCharacteristicIds = 89, countryIds = c("CD","TZ"), surveyYearStart = 2013)
datasets <- dhs_datasets(surveyIds = survs$SurveyId, fileFormat = "FL", fileType = "PR")
str(datasets)
'data.frame': 3 obs. of 13 variables:
$ FileFormat : chr "Flat ASCII data (.dat)" "Flat ASCII data (.dat)" "Flat ASCII data (.dat)"
$ FileSize : int 6595349 6491292 2171918
$ DatasetType : chr "Survey Datasets" "Survey Datasets" "Survey Datasets"
$ SurveyNum : int 421 485 529
$ SurveyId : chr "CD2013DHS" "TZ2015DHS" "TZ2017MIS"
$ FileType : chr "Household Member Recode" "Household Member Recode" "Household Member Recode"
$ FileDateLastModified: chr "September, 19 2016 09:58:23" "September, 28 2019 17:58:28" "June, 11 2019 15:38:22"
$ SurveyType : chr "DHS" "DHS" "MIS"
$ SurveyYearLabel : chr "2013-14" "2015-16" "2017"
$ SurveyYear : chr "2013" "2015" "2017"
$ DHS_CountryCode : chr "CD" "TZ" "TZ"
$ FileName : chr "CDPR61FL.ZIP" "TZPR7BFL.ZIP" "TZPR7IFL.ZIP"
$ CountryName : chr "Congo Democratic Republic" "Tanzania" "Tanzania"

set_rdhs_config(email = "myemaill@gmail.com",
project = "Net ownership by individual",
config_path = "rdhs.json",
cache_path = "project_one",
password_prompt = TRUE,
global = FALSE)
Writing your configuration to:
-> rdhs.json

microbenchmark::microbenchmark(dhs_surveys(surveyYear = 2015),times = 1)
Unit: milliseconds
expr min lq mean median uq max neval
dhs_surveys(surveyYear = 2015) 3.0491 3.0491 3.0491 3.0491 3.0491 3.0491 1

microbenchmark::microbenchmark(dhs_surveys(surveyYear = 2015), times = 1)
Unit: milliseconds
expr min lq mean median uq max neval
dhs_surveys(surveyYear = 2015) 3.1685 3.1685 3.1685 3.1685 3.1685 3.1685 1

downloads <- get_datasets(datasets$FileName)
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, :
'names' attribute [16] must be the same length as the vector [1]
`

@jeffeaton
Collaborator

Thanks very much. I am not familiar with that error unfortunately.

From reviewing the console output, it looks like you are trying to download three MIS datasets, correct? Could you try the following code to see if that gives you the same error?

library(rdhs)
datasets <- c("CDPR61FL.ZIP", "TZPR7BFL.ZIP", "TZPR7IFL.ZIP")
downloads <- get_datasets(datasets)

@chilochibi
Author

Yes, that's correct. I have tried running the code above, but I am still getting the same error. See the output below.

`library(rdhs)

datasets <- c("CDPR61FL.ZIP", "TZPR7BFL.ZIP", "TZPR7IFL.ZIP")
downloads <- get_datasets(datasets)
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, :
'names' attribute [16] must be the same length as the vector [1]`

@jeffeaton
Collaborator

Thanks for checking. It looks like the function is failing while creating the list of available datasets that have been approved for your project:

rdhs/R/authentication.R

Lines 101 to 110 in c368bdb

filedatatypelist_DHS_line <- grep("name=\"filedatatypelist_", y, value = TRUE)
filedatatypelist_DHS <- qdapRegex::rm_between(
  filedatatypelist_DHS_line, '"', '"', extract = TRUE
) %>% lapply(function(x) x[3])
names(filedatatypelist_DHS) <- paste0(
  "filedatatypelist_",
  qdapRegex::rm_between(filedatatypelist_DHS_line,
                        "filedatatypelist_", "\" value", extract = TRUE)
)
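
For readers unfamiliar with the message: base R raises it whenever the vector of names being assigned is longer than the object itself. The sketch below reproduces that situation without rdhs or qdapRegex. The values and the helper `safe_set_names` are made up for illustration and are not part of the package.

```r
# Made-up stand-ins: one parsed value but sixteen candidate names.
values <- list("DHS")
labels <- paste0("filedatatypelist_", seq_len(16))

# Assigning more names than there are elements fails in base R.
msg <- tryCatch(
  {
    names(values) <- labels
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical defensive helper: check lengths before assigning names,
# so a mismatch is reported in terms of what was expected.
safe_set_names <- function(x, nm) {
  if (length(x) != length(nm)) {
    stop(sprintf("expected %d names, got %d", length(x), length(nm)))
  }
  names(x) <- nm
  x
}
```

A length check of this kind would not fix the underlying parsing problem, but it would make a mismatch between the parsed values and their names easier to diagnose.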

Are you able to download these data sets when you login via the DHS webpage? E.g. from here: https://dhsprogram.com/data/dataset/Tanzania_MIS_2017.cfm?flag=0

To make sure the authentication for your account and project is working, can you try the following and check that it returns a valid proj_id:

library(rdhs)
my_config <- get_rdhs_config()
rdhs:::authenticate_dhs(my_config)

This is what I get back:

> rdhs:::authenticate_dhs(my_config)
Logging into DHS website...
$user_name
[1] "jeffrey.eaton@imperial.ac.uk"

$user_pass
[1] "<REDACTED>"

$proj_id
[1] "75312"

(Make sure not to copy/paste the full output -- it contains your account login password)
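
One way to avoid pasting the password by accident is to mask it before printing. This is a minimal sketch that assumes the returned object is a list with the `user_name`, `user_pass` and `proj_id` elements shown in the output above. `redact_auth` is a hypothetical helper, not an rdhs function, and the example values are made up.

```r
# Hypothetical helper (not part of rdhs): mask the password field before printing.
redact_auth <- function(auth) {
  if (!is.null(auth$user_pass)) {
    auth$user_pass <- "<REDACTED>"
  }
  auth
}

# Made-up list shaped like the output above.
fake_auth <- list(
  user_name = "user@example.org",
  user_pass = "secret",
  proj_id = "12345"
)
print(redact_auth(fake_auth))

# With a real configuration this would be (requires a DHS login):
# auth <- rdhs:::authenticate_dhs(get_rdhs_config())
# print(redact_auth(auth))
```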

@chilochibi
Author

chilochibi commented May 9, 2022

Yes, I am able to download all DHS/MIS datasets manually. Below is what I get after checking authentication.

`

my_config <- get_rdhs_config()
rdhs:::authenticate_dhs(my_config)
Logging into DHS website...
$user_name
[1] "cchiziba@gmail.com"

$user_pass
[1] "REDACTED"

$proj_id
[1] "159623"`

@horaciochacon

I am having the exact same problem when trying to download datasets for which I have access:

> dhs_datasets <- get_datasets(dataset_filenames = "PKPR71FL.ZIP")
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, :
  'names' attribute [16] must be the same length as the vector [1]

@jeffeaton
Collaborator

Thanks very much. I'm a bit stumped on this and not able to reproduce.

@OJWatson -- any ideas?

OJWatson added a commit that referenced this issue May 9, 2022
Fix for #131. Bug due to new version of `qdapRegex` on CRAN
@OJWatson
Collaborator

OJWatson commented May 9, 2022

Hey @jeffeaton, @horaciochacon, @chilochibi,

Thanks all for the helpful debugging. I think this was due to a new version of qdapRegex on CRAN (which would not get picked up by the CRAN checks, as available_datasets requires a DHS login).

I have just merged a fix for this. Please try reinstalling rdhs v0.7.4 and let me know if this fixes it.

OJ
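
For anyone unsure how to pick up the fixed version, the following is a rough sketch. It assumes the development version is hosted at `ropensci/rdhs` on GitHub and that the `remotes` package is installed. `needs_update` and `update_rdhs` are hypothetical helpers written for this example.

```r
# Hypothetical helper: TRUE when the installed version predates the fixed release.
needs_update <- function(installed, fixed = "0.7.4") {
  package_version(installed) < package_version(fixed)
}

# Assumption: the fix is available from the ropensci/rdhs GitHub repository.
update_rdhs <- function() {
  remotes::install_github("ropensci/rdhs")
}

# Usage, in an interactive session (restart R afterwards):
# if (needs_update(as.character(utils::packageVersion("rdhs")))) update_rdhs()
```

Once the fixed release is on CRAN, `install.packages("rdhs")` should work as well.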

@OJWatson OJWatson reopened this May 9, 2022
@horaciochacon

Hi @OJWatson, this definitely solved the issue. Thanks!

@OJWatson
Collaborator

OJWatson commented May 9, 2022

Great to hear. I will close this now.

@OJWatson OJWatson closed this as completed May 9, 2022
@mohankhanal19

How do I download rdhs v0.7.4?
