Expose all download history through the API #557

Closed
Eh2406 opened this Issue Feb 15, 2017 · 11 comments

Contributor

Eh2406 commented Feb 15, 2017

Hi,

What is the best way to get the download history for all crates? The data is publicly available by scraping the graphs on each crate's page, but that is rude and inelegant.

If the data is available, how can I improve the docs to make it easier to find?
If it is not available, what can I do to add that functionality?

Member

alexcrichton commented Feb 15, 2017

Right now the only route for this is the downloads route but that just provides the data you see rendered already. I don't believe there's a route for a historical paginated version of this. Adding one would be fine though!

Contributor Author

Eh2406 commented Feb 15, 2017

Thank you for that link! I will try to grok the code and open a PR. So far I mostly have questions. :-P

I think the comment may be out of date, or I don't know how to read it: the link /crates/quadrature/downloads just gets me an error. It seems to be matching L89: "/crates/:crate_id/:version" instead of L94: "/crates/:crate_id/downloads".

Is there some kind of caching for tx.prepare(..), or are we rebuilding it with each request?

Is there some kind of caching for the website? All the download counts (except for today's data) are going to be static, seems a shame to hit the database repeatedly.

Is there a schema for the tables that we can query?

Member

alexcrichton commented Feb 15, 2017

Nah currently we don't have any caching, everything hits the database. Also there's currently no caching around tx.prepare(..). The schema is probably best looked at by following the README to prepare a local database and exploring that.
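
To illustrate the caching idea raised in the question (counts for past days being static), here is a minimal sketch. It is not crates.io code: `DownloadCache`, the integer day numbers, and the `query` closure standing in for the database call are all hypothetical, and it assumes counts for past days never change once the day is over.

```rust
use std::collections::HashMap;

/// In-memory cache of per-day download counts, keyed by (crate name, day number).
/// Hypothetical type for illustration; not part of crates.io.
struct DownloadCache {
    entries: HashMap<(String, i64), u64>,
}

impl DownloadCache {
    fn new() -> Self {
        DownloadCache { entries: HashMap::new() }
    }

    /// Returns the count for `krate` on `day`. Days before `today` are assumed
    /// immutable and are served from the cache after the first lookup; `today`
    /// (or later) always goes through `query`, which stands in for the database.
    fn get<F>(&mut self, krate: &str, day: i64, today: i64, mut query: F) -> u64
    where
        F: FnMut(&str, i64) -> u64,
    {
        if day >= today {
            return query(krate, day);
        }
        let key = (krate.to_string(), day);
        // Only call `query` when the key is not cached yet.
        *self.entries.entry(key).or_insert_with(|| query(krate, day))
    }
}
```

If counts for earlier days can still be adjusted after the fact, the assumption breaks and entries would need an expiry.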

Contributor Author

Eh2406 commented Feb 15, 2017

I will experiment with a local instance when I have a chance. :-)

How do I hit the "/crates/:crate_id/:version" target? Everything I try just gets me error messages.

Member

alexcrichton commented Feb 15, 2017

This should do the trick:

curl -H 'Content-Type: application/json' https://crates.io/api/v1/crates/libc/downloads

carols10cents changed the title from "programmatically got downlod history?" to "Expose all download history through the API" Feb 21, 2017

Contributor Author

Eh2406 commented Feb 24, 2017

Concrete suggestion:
Add an offset query parameter to version::downloads. The value is how many days ago the most recent result should be from.

// `offset` is in days; a missing or unparsable value falls back to 0.
let offset: i64 = req.query().get("offset").and_then(|s| s.parse().ok()).unwrap_or(0);
let cutoff_date_end = ::now() + Duration::days(-offset);
let cutoff_date_start = cutoff_date_end + Duration::days(-90);

This is the smallest change I can think of that makes the data available.
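
A standalone sketch of the same arithmetic, using plain day numbers instead of the `time` types so it runs without the crates.io code base. `parse_offset`, `window_for_offset`, and `WINDOW_DAYS` are hypothetical names, and rejecting negative offsets is an added assumption, not something proposed above.

```rust
/// Length of the window the endpoint returns, in days (90 in the snippet above).
const WINDOW_DAYS: i64 = 90;

/// Parses the raw `offset` query value, falling back to 0 when it is missing,
/// not a number, or negative.
fn parse_offset(raw: Option<&str>) -> i64 {
    raw.and_then(|s| s.parse::<i64>().ok())
        .filter(|n| *n >= 0)
        .unwrap_or(0)
}

/// Returns `(start_day, end_day)` as day numbers, where `today` is the current
/// day number and `offset` is how many days ago the window should end.
fn window_for_offset(today: i64, offset: i64) -> (i64, i64) {
    let end = today - offset;
    let start = end - WINDOW_DAYS;
    (start, end)
}
```

With `today = 1000`, an offset of 30 gives the window from day 880 to day 970, so a client can walk back through history by increasing the offset in steps of 90.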

Contributor Author

Eh2406 commented Mar 3, 2017

Alternative suggestion:
Add a page query parameter to version::downloads. The value is how many 90-day windows back to fetch.

// `page` counts 90-day windows back from today; a missing or unparsable value falls back to 0.
let page: i64 = req.query().get("page").and_then(|s| s.parse().ok()).unwrap_or(0);
let cutoff_date_end = ::now() + Duration::days(-90 * page);
let cutoff_date_start = cutoff_date_end + Duration::days(-90);

Is either of these ideas acceptable? How can they be improved?
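
For comparison, a sketch of the paged variant with plain day numbers. `window_for_page`, `pages_needed`, and `PAGE_DAYS` are hypothetical names; whether the boundary day belongs to one page or both is left open here, as it is in the suggestion itself.

```rust
/// Size of one page, in days.
const PAGE_DAYS: i64 = 90;

/// Window covered by `page` (0 = the most recent 90 days), as
/// `(start_day, end_day)` day numbers relative to `today`.
fn window_for_page(today: i64, page: i64) -> (i64, i64) {
    let end = today - PAGE_DAYS * page;
    (end - PAGE_DAYS, end)
}

/// Number of pages a client would have to request to reach back to `first_day`.
fn pages_needed(today: i64, first_day: i64) -> i64 {
    let span = (today - first_day).max(0);
    // Round up so a partial final window still counts as a page.
    (span + PAGE_DAYS - 1) / PAGE_DAYS
}
```

Consecutive pages tile the history without gaps: the start of page `n` is the end of page `n + 1`, so a client can request pages 0, 1, 2, ... until it has covered the crate's lifetime.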

Member

carols10cents commented Mar 9, 2017

Sure, paging sounds great and would match other endpoints' interfaces.

Contributor

sgrif commented Mar 9, 2017

Random note if that endpoint is using Diesel by the time someone gets to it -- it can be done entirely in SQL as

use diesel::expression::dsl::*;

let cutoff_end_date = now - (90 * offset).days();
let cutoff_start_date = cutoff_end_date - 90.days();

Contributor

sgrif commented Mar 9, 2017

> Also there's currently no caching around tx.prepare(..).

All of the endpoints moved to Diesel do! ;) https://github.com/diesel-rs/diesel/blob/428db9515e5c7769a9313b4cf9bc14f1ced290e7/diesel/src/connection/statement_cache.rs
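
The general idea behind a statement cache can be sketched as follows. This is not Diesel's implementation (see the linked file for that); `PreparedStatement`, `StatementCache`, and the `prepare_calls` counter are hypothetical stand-ins that only show each distinct SQL string being prepared once.

```rust
use std::collections::HashMap;

/// Stand-in for a database's prepared-statement handle (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct PreparedStatement {
    sql: String,
}

/// Caches prepared statements by SQL text so each distinct query is prepared once.
struct StatementCache {
    statements: HashMap<String, PreparedStatement>,
    /// Counts cache misses, i.e. how often a statement had to be prepared.
    prepare_calls: usize,
}

impl StatementCache {
    fn new() -> Self {
        StatementCache { statements: HashMap::new(), prepare_calls: 0 }
    }

    /// Returns the cached statement for `sql`, preparing it on first use.
    fn cached(&mut self, sql: &str) -> &PreparedStatement {
        if !self.statements.contains_key(sql) {
            // Cache miss: a real connection would prepare the statement here.
            self.prepare_calls += 1;
            self.statements
                .insert(sql.to_string(), PreparedStatement { sql: sql.to_string() });
        }
        &self.statements[sql]
    }
}
```

Repeated requests that issue the same SQL then reuse the stored handle instead of preparing it again.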

Contributor Author

Eh2406 commented Mar 13, 2017

Closed in #611

@Eh2406 Eh2406 closed this Mar 13, 2017
