retmax #4

Closed
vid opened this Issue Aug 22, 2014 · 3 comments

Comments

2 participants
@vid

vid commented Aug 22, 2014

Are you planning to support an option for specifying the number of results to return?

Thanks.

@bmpvieira


Member

bmpvieira commented Aug 22, 2014

Yes, I need to add two options: one for limiting the number of results (right now it returns everything for a particular search), and another for changing the number of items requested per internal bionode-ncbi request to NCBI servers (currently 50). The latter doesn't affect the number of results, since bionode-ncbi paginates internally until it has returned everything, but it can affect performance and stability.

So if you do a search that returns 1000 items, bionode-ncbi currently makes 20 sequential requests to NCBI. Increasing retmax to, for example, 500, so that it makes only 2 requests, can improve speed. However, if you're running it in a pipeline, in some cases it's better/faster to make many small requests and pipe frequently to the downstream steps than to wait for NCBI to process 500 items and then pipe all of them at once to your downstream processing.
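To make the request arithmetic above concrete, here is a minimal sketch of how a paginated search could be planned. The `planPages` helper is hypothetical and is not part of bionode-ncbi; it only assumes the NCBI E-utilities convention of paging with `retstart` (offset) and `retmax` (page size), plus an optional `limit` on the total number of results.

```javascript
// Hypothetical helper, NOT part of bionode-ncbi's API.
// Returns the { retstart, retmax } pairs needed to page through `total`
// results, stopping early if an optional `limit` is given.
function planPages(total, retmax, limit) {
  // Number of items we actually want to fetch.
  const wanted = limit === undefined ? total : Math.min(total, limit);
  const pages = [];
  for (let retstart = 0; retstart < wanted; retstart += retmax) {
    // The last page may be smaller than retmax.
    pages.push({ retstart: retstart, retmax: Math.min(retmax, wanted - retstart) });
  }
  return pages;
}

console.log(planPages(1000, 50).length);  // 20 requests
console.log(planPages(1000, 500).length); // 2 requests
console.log(planPages(1000, 50, 10));     // [ { retstart: 0, retmax: 10 } ]
```

With 1000 matching items, a page size of 50 yields 20 requests and a page size of 500 yields 2, which matches the numbers in the comment. A limit smaller than the page size collapses the plan to a single short request.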

Another reason to ask for fewer items per request is that, for some NCBI databases, each item can contain a lot of data, so asking for 500 can actually cause the request to time out.

So I'll probably keep the number of items per request low, or adjust it to the average item size for each type of database (e.g., sra, pubmed, biosample), but I will provide an option to override it so that advanced users can tweak it.
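One way the per-database default with a user override could look is sketched below. Everything here is an assumption for illustration: the `resolveRetmax` function, the `RETMAX_BY_DB` table, and its per-database numbers are hypothetical placeholders, not values used by bionode-ncbi. Only the fallback of 50 comes from the comment above.

```javascript
// Fallback page size; 50 is the value mentioned in this thread.
const DEFAULT_RETMAX = 50;

// Hypothetical per-database page sizes. The numbers are placeholders chosen
// for illustration (smaller pages for databases with larger items).
const RETMAX_BY_DB = { sra: 20, pubmed: 100, biosample: 50 };

// Hypothetical helper, NOT part of bionode-ncbi's API.
// An explicit override always wins; otherwise use the per-database value,
// falling back to the global default for unknown databases.
function resolveRetmax(db, override) {
  if (override !== undefined) {
    return override;
  }
  return Object.prototype.hasOwnProperty.call(RETMAX_BY_DB, db)
    ? RETMAX_BY_DB[db]
    : DEFAULT_RETMAX;
}

console.log(resolveRetmax('sra'));       // per-database placeholder value
console.log(resolveRetmax('sra', 500));  // explicit override
console.log(resolveRetmax('taxonomy'));  // unknown database, global default
```

Keeping the override as a separate argument means the defaults can be tuned per database later without changing what advanced users pass in.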

@vid


vid commented Aug 23, 2014

Great. I'm interested in the first option. I am not using dat currently, but I will replace my own NCBI search with this and work on transitioning to dat.

@bmpvieira


Member

bmpvieira commented Aug 27, 2014

Option added:
bionode-ncbi search human --limit 10 (or just -l)

or in JavaScript:

var ncbi = require('bionode-ncbi')
ncbi.search({ db: 'sra', term: 'human', limit: 10 }).on('data', console.log)

@bmpvieira bmpvieira closed this Aug 27, 2014

@bmpvieira bmpvieira self-assigned this Apr 5, 2017
