Ruby (MRI) scraper #3

Closed
jasonkarns opened this issue Jan 12, 2016 · 8 comments · Fixed by #9

Comments

@jasonkarns
Owner

No description provided.

@hsbt

hsbt commented Jan 17, 2016

I saw the rbenv/ruby-build#882 patch. Did you scrape from https://rubinius-releases-rubinius-com.s3.amazonaws.com/index.txt ?

If so, I will prepare a digest hash list for CRuby in a format compatible with the rubinius one.

@jasonkarns
Owner Author

yep! the rubinius scraper is in #2 (https://github.com/jasonkarns/ruby-build-update-defs/pull/2/files#diff-83ea838d1f1ef90b299cc1478d161c33)

But honestly, I think an easier format might be a simple TSV

<version> <tarball_url> <checksum>

or perhaps:

<version> <tarball_url> <checksum_algorithm> <checksum>
...

The second format would allow easily publishing multiple checksums in various formats (md5, sha2-256, sha2-512, etc) for each release.
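As a rough illustration of why the second format is convenient, here is a minimal Python sketch that reads such lines and groups the checksums per release. It assumes tab-separated columns in the order shown above; the function name `parse_manifest`, the sample URLs, and the checksum values are all hypothetical placeholders, not real release data.

```python
from collections import defaultdict

# Hypothetical sample in the proposed four-column layout
# (tab-separated: version, tarball URL, checksum algorithm, checksum).
# The URLs and checksum values are placeholders, not real release data.
SAMPLE = (
    "1.0.0\thttps://example.invalid/pkg-1.0.0.tar.gz\tmd5\tPLACEHOLDER_MD5\n"
    "1.0.0\thttps://example.invalid/pkg-1.0.0.tar.gz\tsha2-256\tPLACEHOLDER_SHA256\n"
    "1.1.0\thttps://example.invalid/pkg-1.1.0.tar.gz\tsha2-256\tPLACEHOLDER_SHA256_B\n"
)


def parse_manifest(text):
    """Group checksums by (version, url); each line contributes one algorithm."""
    releases = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        version, url, algorithm, checksum = line.split("\t")
        releases[(version, url)][algorithm] = checksum
    return dict(releases)


releases = parse_manifest(SAMPLE)
for (version, url), checksums in releases.items():
    print(version, url, sorted(checksums))
```

Because each line carries exactly one checksum, adding a new algorithm only means adding lines; readers that do not know the algorithm can skip those lines.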

Basically, I've set up this plugin to support multiple scrapers, assuming that each ruby would require its own scraper. There's no need to base it off of rubinius' format, but rather, do what's best for your project and build process. Copying rubinius' format might make this scraper trivial in the short term, but I don't think their format is the most robust. So it's entirely up to you! Feel free to post notes here as you have drafts of the release manifest (for lack of a better term), and I'll take a stab at the scraper for it, providing any feedback from there.

@jasonkarns
Owner Author

@hsbt any progress on the CRuby manifest file (for scraping)?

@hsbt

hsbt commented Apr 26, 2016

No progress :( Please wait several weeks...

@hsbt

hsbt commented Apr 27, 2016

@jasonkarns
Owner Author

jasonkarns commented May 10, 2016

@hsbt awesome!

Might I suggest just a couple tweaks?

  • drop the file extension from the version name (so that it's just the version/release name). the file extension in the url sufficiently distinguishes the releases, IMO.
  • only a single release (for a given file format) per line. this makes line-oriented tools much more useful in parsing the file. it requires placing the checksums one after the other on the same line.
  • to help support ^^, the first line can be a header line acting as column names

Using 2.3.0 as an example:

name    url sha1        sha256  sha512
ruby-2.3.0  https://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.0.tar.gz  2dfcf7f33bda4078efca30ae28cb89cd0e36ddc4    ba5ba60e5f1aa21b4ef8e9bf35b9ddb57286cb546aac4b5a28c71f459467e507    914d0201ecefaeb67aca0531146d2e89900833d8d2a597ec8a19be94529ab6b4be367f9b0cee2868b407288896cc14b64d96150223cac0aef8aafc46fc3dd7cc
ruby-2.3.0  https://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.0.tar.xz  96e620e38af351c8da63e40cfe217ec79f912ba1    70125af0cfd7048e813a5eecab3676249582bfb65cfd57b868c3595f966e4097    d893c5e6db5a0533e0da48d899c619964388852ef90e7d1b92a4363d5f189cd2dba32a009581f62b9f42a8e6027975fc3c18b64faf356f5e3ac43a8d69ec5327
ruby-2.3.0  https://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.0.zip 3f88617568d9a4f491e8b32dca532363f73eaa71    8270bdcbc6b62a18fdf1b75bd28d5d6fc0fc26b9bd778d422393a1b98006020a    a3f397bb3c9c19d9b797552c5d60bb01c82db884cfa966df84881125bea35713cffd99f88fb86b271bae72d9cfb09ad9b33838cffcf6365c091459479914fdef

Many thanks for publishing the manifest file. These tweaks would make parsing a bit easier, at least for POSIX shell tooling.
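For comparison with shell tooling, a header line also lets a generic reader pick columns by name. The Python sketch below assumes the manifest is tab-separated with the column names from the example above; the checksum values are shortened placeholders rather than the real digests, and the helper names `read_manifest` and `pick` are hypothetical.

```python
import csv
import io

# Sample modeled on the 2.3.0 example above. The checksum values are
# shortened placeholders (assumption), and columns are tab-separated.
MANIFEST = (
    "name\turl\tsha1\tsha256\tsha512\n"
    "ruby-2.3.0\thttps://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.0.tar.gz\tSHA1_GZ\tSHA256_GZ\tSHA512_GZ\n"
    "ruby-2.3.0\thttps://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.0.tar.xz\tSHA1_XZ\tSHA256_XZ\tSHA512_XZ\n"
    "ruby-2.3.0\thttps://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.0.zip\tSHA1_ZIP\tSHA256_ZIP\tSHA512_ZIP\n"
)


def read_manifest(text):
    """Return one dict per data line, keyed by the column names in the header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def pick(rows, name, suffix):
    """Find the row for a release name whose URL ends with the given suffix."""
    for row in rows:
        if row["name"] == name and row["url"].endswith(suffix):
            return row
    return None


rows = read_manifest(MANIFEST)
row = pick(rows, "ruby-2.3.0", ".tar.xz")
print(row["url"], row["sha256"])
```

With the extension dropped from the name column, the URL suffix is what distinguishes the three files for one release, which is why `pick` filters on it.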

Edit:

Apologies. I just noticed that there are multiple files published per version, so I've updated my comment above.

@jasonkarns jasonkarns mentioned this issue May 10, 2016
@hsbt

hsbt commented May 11, 2016

I applied your suggestions. Please confirm http://cache.ruby-lang.org/pub/ruby/index.txt

@jasonkarns
Owner Author

@hsbt Great! I've finished and merged the scraper in #9. Running the scraper also found a bunch of old rubies that didn't exist in ruby-build. I've opened rbenv/ruby-build#949 for them.
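Finding rubies that are missing from ruby-build amounts to a set difference between the manifest names and the existing definition names. The Python sketch below is only an illustration of that idea, not the merged scraper: it assumes definitions are named after the version without the `ruby-` prefix, and the input lists and function names are hypothetical.

```python
def definition_name(release_name):
    """Map a manifest name such as 'ruby-2.3.0' to a definition name ('2.3.0').

    Assumption: definitions are named after the version without the
    'ruby-' prefix.
    """
    prefix = "ruby-"
    if release_name.startswith(prefix):
        return release_name[len(prefix):]
    return release_name


def missing_definitions(manifest_names, existing_definitions):
    """Return versions present in the manifest but absent from the definitions."""
    wanted = {definition_name(name) for name in manifest_names}
    # sorted() orders the names as plain strings, not as version numbers.
    return sorted(wanted - set(existing_definitions))


# Hypothetical inputs: names as they would appear in the manifest's first
# column (repeated once per published file), and definition names that
# already exist locally.
manifest_names = ["ruby-2.2.4", "ruby-2.3.0", "ruby-2.3.0", "ruby-2.3.1"]
existing = ["2.2.4", "2.3.0"]
print(missing_definitions(manifest_names, existing))
```

Using a set collapses the repeated names that come from having one manifest line per file format.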

Ideally, this now allows the ruby-build definition process to be fully scripted, as I've done with node-build. I intend to open a PR to ruby-build soon with a script for the full process.

Thanks!
