-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix bug where retmode is not passed down to make_entrez_query in entr… #121
Conversation
…ez_fetch. bump version
Thanks for this @gmbecker, and apologies for the bug. Let me check a few things and I will try to get this up on CRAN today |
Hey @gmbecker , I will improve the way cat(entrez_fetch(db='nuccore', rettype='gb',retmode='text', id=104)) LOCUS CAA29094 357 aa linear MAM 14-JUL-2016
DEFINITION beta-subunit, partial [Bos taurus].
ACCESSION CAA29094
VERSION CAA29094.1
DBSOURCE embl accession X05605.1
KEYWORDS .
SOURCE Bos taurus (cattle)
... |
@dwinter. Thanks for looking at this so quickly!
It still fails for me with rettype "gbwithparts" (which is what genbankr
uses) because of the missing retmode. In the first try below I just inspect
args and let it run, which fails, in the second, I manually insert
retmode="text" into the args, which works:
debug(entrez_fetch)
cat(entrez_fetch(db='nuccore', rettype='gbwithparts',retmode='text',
id=104))
debugging in: entrez_fetch(db = "nuccore", rettype = "gbwithparts", retmode
= "text",
id = 104)
debug: {
identifiers <- id_or_webenv()
if (parsed) {
if (!is_xml_record(rettype, retmode)) {
msg <- paste("At present, entrez_fetch can only parse XML
records, got",
rettype)
stop(msg)
}
}
args <- c(list("efetch", db = db, rettype = rettype, config = config,
...), identifiers)
records <- do.call(make_entrez_query, args)
if (parsed) {
return(parse_response(records, rettype))
}
records
}
Browse[2]> n
debug: identifiers <- id_or_webenv()
Browse[2]>
debug: if (parsed) {
if (!is_xml_record(rettype, retmode)) {
msg <- paste("At present, entrez_fetch can only parse XML records,
got",
rettype)
stop(msg)
}
}
Browse[2]>
debug: args <- c(list("efetch", db = db, rettype = rettype, config = config,
...), identifiers)
Browse[2]>
debug: records <- do.call(make_entrez_query, args)
Browse[2]> args
[[1]]
[1] "efetch"
$db
[1] "nuccore"
$rettype
[1] "gbwithparts"
$config
NULL
$id
[1] 104
Browse[2]> n
*Error: HTTP failure: 400*
*<?xml version="1.0" encoding="UTF-8" ?>*
*<!DOCTYPE eEfetchResult PUBLIC "-//NLM//DTD efetch 20131226//EN"
"https://eutils.ncbi.nlm.nih.gov/eutils/dtd/20131226/efetch.dtd
<https://eutils.ncbi.nlm.nih.gov/eutils/dtd/20131226/efetch.dtd>">*
*<eFetchResult>*
* <ERROR> Error: CEFetchPApplication::proxy_stream(): ID list is empty! In
it there are neither IDs nor accessions.</ERROR>*
*</eFetchResult>*
debug(entrez_fetch)
cat(entrez_fetch(db='nuccore', rettype='gbwithparts',retmode='text',
id=104))
debugging in: entrez_fetch(db = "nuccore", rettype = "gbwithparts", retmode
= "text",
id = 104)
debug: {
identifiers <- id_or_webenv()
if (parsed) {
if (!is_xml_record(rettype, retmode)) {
msg <- paste("At present, entrez_fetch can only parse XML
records, got",
rettype)
stop(msg)
}
}
args <- c(list("efetch", db = db, rettype = rettype, config = config,
...), identifiers)
records <- do.call(make_entrez_query, args)
if (parsed) {
return(parse_response(records, rettype))
}
records
}
Browse[2]> n
debug: identifiers <- id_or_webenv()
Browse[2]> n
debug: if (parsed) {
if (!is_xml_record(rettype, retmode)) {
msg <- paste("At present, entrez_fetch can only parse XML records,
got",
rettype)
stop(msg)
}
}
Browse[2]> n
debug: args <- c(list("efetch", db = db, rettype = rettype, config = config,
...), identifiers)
Browse[2]> n
debug: records <- do.call(make_entrez_query, args)
Browse[2]> args
[[1]]
[1] "efetch"
$db
[1] "nuccore"
$rettype
[1] "gbwithparts"
$config
NULL
$id
[1] 104
*Browse[2]> args$retmode = "text"*
*Browse[2]> n*
debug: if (parsed) {
return(parse_response(records, rettype))
}
Browse[2]> n
debug: records
Browse[2]> n
exiting from: entrez_fetch(db = "nuccore", rettype = "gbwithparts", retmode
= "text",
id = 104)
LOCUS CAA29094 357 aa linear MAM
14-JUL-2016
DEFINITION beta-subunit, partial [Bos taurus].
ACCESSION CAA29094
VERSION CAA29094.1
DBSOURCE embl accession X05605.1
KEYWORDS .
SOURCE Bos taurus (cattle)
*<snip>*
*> sessionInfo()*
* R version 3.4.0 (2017-04-21) Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6 Matrix products: default BLAS:
/Users/beckerg4/local/R-3.4.0/R.framework/Versions/3.4/Resources/lib/libRblas.dylib
LAPACK:
/Users/beckerg4/local/R-3.4.0/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats graphics grDevices utils
datasets methods base other attached packages: [1] rentrez_1.2.0
loaded via a namespace (and not attached): [1] httr_1.2.1
compiler_3.4.0 R6_2.2.2 tools_3.4.0 curl_2.7 [6]
jsonlite_1.5 XML_3.98-1.9 tcltk_3.4.0 *
*~G*
…On Tue, Feb 20, 2018 at 3:11 PM, David Winter ***@***.***> wrote:
Hey @gmbecker <https://github.com/gmbecker> , I will improve the way
retmode and rettype are documented and used. But for now I think you will
just need to specify both arguments (the NCBI is now returning xml for
genbank by default). This works for me on the CRAN version
cat(entrez_fetch(db='nuccore', rettype='gb',retmode='text', id=104))
LOCUS CAA29094 357 aa linear MAM 14-JUL-2016
DEFINITION beta-subunit, partial [Bos taurus].
ACCESSION CAA29094
VERSION CAA29094.1
DBSOURCE embl accession X05605.1
KEYWORDS .
SOURCE Bos taurus (cattle)
...
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#121 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AA3dscPXQR45GPCDIrB6KT6yUUNYevbPks5tW1EZgaJpZM4SMsAK>
.
--
Gabriel Becker, Ph.D
Scientist
Bioinformatics and Computational Biology
Genentech Research
|
Hi @gmbecker, Seems like a few behaviours have changed recently at NCBI (so even with this fix some tests fail). Making a complete fix for this will need a few tweaks, and I would like to update the docs to make the retmode/rettypes a bit clearer. That probably means a CRAN release is a couple of days away. In the meantime, I have merged the relevant bits of this PR into the develop branch. If you have users being struck by this you can get them to install the dev branch: devtools::install_github("ropensci/rentrez", ref = "develop") Thanks again for pointing this our and finding the solution. Will let you know when a release goes to CRAN. |
Hi @gmbecker , sorry for the delay in getting a fix to this on CRAN. The develop branch now was a complete fix for this with updates to tests and documentation. Can you confirm the version on the development branch works with genbankr? If so I can get it on CRAN and you'll stop being bugged by the Bioconductor testing platform :) |
@dwinter Looks looks good to go, vignette and tests now passing for release version of genbankr (they weren't before). Can you confirm that 1.2.1 is the version number that will be used when submitting to CRAN? if so I'll push a versioned dep into the DESCRIPTION file for release and devel versions of genbankr and we should be good. |
Hi @gmbecker. Yup, 1.2.1 uploaded to CRAN now, will hopefully be available in the next little while. |
The retmode parameter is ignored in entrez_fetch, which causes xml to be returned when a genbank file was asked for. This breaks integration with genbankr. If you could push a narrow bugfix to cran for this soon I would appreciate it, as genbankr is going to continue to fail in the Bioc build system until either this is fixed or I disable the integration (which Id on't want to do, since I think it's a very powerful feature).
I went ahead and bumped the version in the DESCRIPTION file because I need to be able to put a versioned dependency in genbankr, but if you'd prefer those happen separately I can recreate the PR without that.