Help me understand the "releases"? #35

Closed
jennybc opened this issue Jun 4, 2015 · 9 comments

@jennybc

jennybc commented Jun 4, 2015

I'm looking at the number of releases in the ~6500 packages on CRAN around the end of April. There's a huge number with 0 releases (see attached figure). I don't think this is real or right, yes? For example dplyr is one of those packages. Can you help me figure out what I'm looking at? BTW I computed number of releases from the length of the releases vector. We are getting this info through the API but below I just use crandb.

> library(crandb)
> package("dplyr")$releases
Argument unicode has been deprecated. YAJL always parses unicode.
list()
> package("lattice")$releases
Argument unicode has been deprecated. YAJL always parses unicode.
list()
> package("grid")$releases
Argument unicode has been deprecated. YAJL always parses unicode.
 [1] "2.0.0"  "2.0.1"  "2.1.0"  "2.1.1"  "2.2.0"  "2.2.1"  "2.3.0"  "2.3.1" 
 [9] "2.4.0"  "2.4.1"  "2.5.0"  "2.5.1"  "2.6.0"  "2.6.1"  "2.6.2"  "2.7.0" 
[17] "2.7.1"  "2.7.2"  "2.8.0"  "2.8.1"  "2.9.0"  "2.9.1"  "2.9.2"  "2.10.0"
[25] "2.10.1" "2.11.0" "2.11.1" "2.12.0" "2.12.1" "2.12.2" "2.13.0" "2.13.1"
[33] "2.13.2" "2.14.0" "2.14.1" "2.14.2" "2.15.0" "2.15.1" "2.15.2" "2.15.3"
[41] "3.0.0"  "3.0.1" 

[Figure: n_rel]

@Ironholds @dgrtwo

@gaborcsardi

A package version is included in a release if it was the current version when that R version was released, as if CRAN had been snapshotted at the time of each R release.

The only problem is that recent releases are missing, because I essentially gave up on the idea. Without any support from CRAN, it is too error-prone. You can see that 3.1.1 is the last "release".

So the reasons for the many packages without any releases are:

  • their most recent version was released after 3.1.1, and
  • possibly errors.

The majority is the former, I guess.

If you want, I can add 3.1.2, 3.1.3 and 3.2.0, but the packages that were published after 3.2.0 will still have no release.
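The snapshot rule described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the package, its versions, and all dates below are made up for the example, and `current_at()` is a hypothetical helper, not part of crandb.

```r
# Illustrative data only: a hypothetical package and illustrative dates.
pkg_versions <- data.frame(
  version = c("0.1", "0.2", "0.3"),
  date = as.Date(c("2014-01-16", "2014-05-21", "2014-10-04")),
  stringsAsFactors = FALSE
)
r_releases <- data.frame(
  r_version = c("3.0.2", "3.1.0", "3.1.1"),
  date = as.Date(c("2013-09-25", "2014-04-10", "2014-07-10")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the package version that was current on a given day,
# i.e. the latest version published on or before that day (NA if none yet).
current_at <- function(r_date, pkg) {
  published <- pkg[pkg$date <= r_date, ]
  if (nrow(published) == 0) return(NA_character_)
  published$version[which.max(as.numeric(published$date))]
}

# One "snapshot" per R release: which package version was current then.
snapshot <- vapply(
  seq_len(nrow(r_releases)),
  function(i) current_at(r_releases$date[i], pkg_versions),
  character(1)
)
names(snapshot) <- r_releases$r_version

# Invert the mapping: for each package version, the R releases it belongs to.
keep <- !is.na(snapshot)
releases_by_version <- split(names(snapshot)[keep], snapshot[keep])

print(snapshot)
print(releases_by_version)
```

In this toy data, version 0.3 was published after the last recorded R release, so it belongs to no release at all. That is the same situation as a package whose most recent version appeared after 3.1.1.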

@jennybc

jennybc commented Jun 4, 2015

Ooohhh. I completely didn't understand this field.

So when I look at a page like this:

http://cran.r-project.org/src/contrib/Archive/lattice/

or this:

https://github.com/cran/lattice/releases

and see the release history … that's not systematically exposed in this API?

Do you have any suggestions for obtaining that info? Given the above, I can think of various web-scrapy hacks / usage of the GitHub API but wonder if you have any thoughts on the matter.

@gaborcsardi

See e.g. http://crandb.r-pkg.org/dplyr/all and search for timeline. Or from the command line:

crandb::package("dplyr", version = "all")
#> ...
#> Other versions: 0.4.0, 0.3.0.2, 0.3.0.1, 0.3, 0.2, 0.1.3, 0.1.2, 0.1.1,
#>   0.1
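Counting versions from such a record could look roughly like the sketch below. It does not call the API; `rec` is a hand-built stand-in, and the assumption that `timeline` is a named list mapping version strings to timestamps is mine and should be checked against a real response. The package name and dates are invented.

```r
# Hypothetical stand-in for a record with a `timeline` field
# (assumed shape: version string -> publication timestamp).
rec <- list(
  name = "examplepkg",
  timeline = list(
    "0.1"   = "2014-01-01T00:00:00+00:00",
    "0.1.1" = "2014-02-01T00:00:00+00:00",
    "0.2"   = "2014-06-01T00:00:00+00:00"
  )
)

# Number of published versions = number of timeline entries.
n_versions <- length(rec$timeline)

# Publication dates, keeping only the YYYY-MM-DD part of each timestamp.
release_dates <- as.Date(substr(unlist(rec$timeline), 1, 10))
names(release_dates) <- names(rec$timeline)

print(n_versions)
print(release_dates)
```

Counting timeline entries, rather than the length of `releases`, gives the number of CRAN versions regardless of which R releases they coincided with.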

@jennybc

jennybc commented Jun 4, 2015

Got it. Thanks. I can re-download the metadata we're getting from your API to get all versions (sadly, not our current default).

@jennybc jennybc closed this as completed Jun 4, 2015
@gaborcsardi

You can just get http://cran-------------/allall?limit=5 without the limit, and that will give you everything. No need to download package by package.

@jennybc

jennybc commented Jun 4, 2015

OK that worked! And only took several seconds actually. This should keep me busy :).

@jennybc

jennybc commented Jun 4, 2015

For posterity, wanna correct the url? http://cran----------------- (you've got two "all"s above).

@gaborcsardi

Sorry, I removed that URL because I don't want robots hitting it; it is heavy on the site. The allall is not a mistake: it also includes archived packages. See https://github.com/metacran/crandb#-allall-complete-records-for-all-packages-including-archived-ones

Not the best name, I agree.....

@jennybc

jennybc commented Jun 4, 2015

OK got it. Sorry and thanks again.

jennybc pushed a commit to Ironholds/practice that referenced this issue Jun 6, 2015
if we want all metadata, then get it all at once via the API

r-hub/crandb#35 (comment)

by default, get metadata for all versions of a package, as opposed to only the latest