unable to fetch all the documents from the api #24

Closed
jmohit13 opened this issue May 30, 2022 · 1 comment

Comments

@jmohit13

Hi, I am using Bio version 1.3.7 to retrieve documents from the PubMed database. I observed a mismatch between the number of search results returned by the PubMed website and by the Bio API.

from Bio import Entrez

DB = "pubmed"
QUERY = "((cSCC) OR (Cutaneous squamous cell carcinoma)) AND ((relapse) OR (relapse rate) OR (treatment progression))"

Entrez.email = "test@gmail.com"
Entrez.api_key = "<API_KEY>"  # placeholder; substitute a real NCBI API key

# First search: read the total number of hits.
handle = Entrez.esearch(db=DB, term=QUERY, rettype="medline")
record = Entrez.read(handle)

# Second search: request all hits at once.
count = int(record["Count"])
handle = Entrez.esearch(db=DB, term=QUERY, retmax=count, rettype="medline")
record = Entrez.read(handle)

id_list = record["IdList"]

No. of results from Bio api = 1552
No. of results from Pubmed search = 1749

For a few other queries, the difference was considerably larger. Could you please look into this? Thanks.
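To narrow down a mismatch like this one, it can help to check whether the `Count` reported by the API already differs from the website, or whether the ID list simply came back shorter than `Count`. A minimal sketch, assuming `record` is the dictionary-like result of `Entrez.read` with the `Count` and `IdList` keys used in the snippet above; the helper name `summarize_search` and the sample record are made up for illustration:

```python
def summarize_search(record):
    """Compare the hit count reported by esearch with the IDs actually returned."""
    reported = int(record["Count"])
    ids = list(record["IdList"])
    return {
        "reported": reported,
        "returned": len(ids),
        "unique": len(set(ids)),
        "missing": reported - len(ids),
    }


# Hypothetical record, only to show the shape of the output.
sample = {"Count": "5", "IdList": ["11", "12", "13"]}
print(summarize_search(sample))
```

If `missing` is zero, the retrieval itself is complete and the difference lies in the count that NCBI reports to the API, not in how the IDs were fetched.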

@ialbert
Owner

ialbert commented May 30, 2022

This issue occasionally comes up on Biostar.

Note that the problem comes from NCBI, not from bio or Entrez Direct in general. NCBI returns a different number of results depending on whether you connect from the command line or via the web.
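This thread does not say why the numbers differ. One way to compare the two sides, offered as an assumption and not as something suggested in this thread, is to look at how NCBI expanded the query for the API call and compare that with what the PubMed website shows for the same search. A minimal standard-library sketch; the helper names are hypothetical, and the `esearchresult`, `count`, and `querytranslation` fields are assumed from the JSON that esearch returns with `retmode=json`:

```python
import json
import urllib.parse
import urllib.request

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="pubmed", retmax=0):
    """Build an esearch URL that asks for a JSON response."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def parse_esearch(payload):
    """Extract the hit count and the expanded query from an esearch JSON payload."""
    result = json.loads(payload)["esearchresult"]
    return {
        "count": int(result["count"]),
        "translation": result.get("querytranslation", ""),
    }


def run_esearch(term, db="pubmed"):
    """Send the request to NCBI (needs network access) and parse the reply."""
    with urllib.request.urlopen(build_esearch_url(term, db=db)) as response:
        return parse_esearch(response.read().decode("utf-8"))
```

Calling `run_esearch(QUERY)` with the query from the report would return the API-side count together with the translated query, which can then be compared by eye with the website's version of the same search.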

@ialbert ialbert closed this as completed Nov 16, 2022