Regression and/or documentation lackage: cannot parse version from rubygems index #3

Open
AMDmi3 opened this issue Jun 17, 2019 · 10 comments

Comments

@AMDmi3

AMDmi3 commented Jun 17, 2019

Here's a simple program to parse the rubygems index, which I've used successfully with rubymarshal 1.0.3:

#!/usr/bin/env python3
  
import gzip
import requests
import rubymarshal.reader

data = requests.get('https://api.rubygems.org/latest_specs.4.8.gz').content
data = gzip.decompress(data)

for (name, ver, gemplat), _ in zip(rubymarshal.reader.loads(data), range(10)):
    print(name, ver, gemplat)

Its output with 1.0.3:

_ UsrMarshal:Gem::Version(['1.4']) ruby
- UsrMarshal:Gem::Version(['1']) b'ruby'
0mq UsrMarshal:Gem::Version(['0.5.3']) b'ruby'
0xdm5 UsrMarshal:Gem::Version(['0.1.0']) b'ruby'
0xffffff UsrMarshal:Gem::Version(['0.1.0']) b'ruby'
10to1-crack UsrMarshal:Gem::Version(['0.1.3']) b'ruby'
1234567890_ UsrMarshal:Gem::Version(['1.2']) b'ruby'
12_hour_time UsrMarshal:Gem::Version(['0.0.4']) b'ruby'
16watts-fluently UsrMarshal:Gem::Version(['0.3.1']) b'ruby'
189seg UsrMarshal:Gem::Version(['0.0.1']) b'ruby'

With 1.2.6 it looks like this:

_ UsrMarshal({}) ruby
- UsrMarshal({}) ruby
0mq UsrMarshal({}) ruby
0xdm5 UsrMarshal({}) ruby
0xffffff UsrMarshal({}) ruby
10to1-crack UsrMarshal({}) ruby
1234567890_ UsrMarshal({}) ruby
12_hour_time UsrMarshal({}) ruby
16watts-fluently UsrMarshal({}) ruby
189seg UsrMarshal({}) ruby

The nice thing is that the unicode problem is gone, but the bad thing is that the custom object is no longer parsed.

At the very least, this requires a major version bump.

Next, the documentation is unclear or wrong about how this can be parsed now. Changing the program the way an example suggests:

#!/usr/bin/env python3
  
import gzip
import requests
import rubymarshal.reader
from rubymarshal.classes import RubyObject, registry


data = requests.get('https://api.rubygems.org/latest_specs.4.8.gz').content
data = gzip.decompress(data)

class GemVersion(RubyObject):
    ruby_class_name = "Gem::Version"

registry.register(GemVersion)

for (name, ver, gemplat), _ in zip(rubymarshal.reader.loads(data), range(10)):
    print(name, ver, gemplat)

doesn't change a thing.

In fact, this cannot work (at least with this data file): ClassRegistry keys its classes by str name, but Reader.read reads the class name as Symbol("Gem::Version"), which hashes differently, so self.registry.get(class_name, UsrMarshal) always returns UsrMarshal.
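
The mismatch can be reproduced without rubymarshal. The sketch below is only an illustration: `Symbol`, `UsrMarshal`, and `GemVersion` here are hypothetical stand-ins (not the library's own classes), and the registry is a plain dict keyed by str, as described above. A symbol-like object that does not compare equal to the str it wraps misses the dict lookup and falls through to the default.

```python
# Hypothetical stand-in for a symbol type; NOT rubymarshal's actual class.
class Symbol:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Equal only to another Symbol with the same name, never to a plain str.
        return isinstance(other, Symbol) and self.name == other.name

    def __hash__(self):
        return hash(("Symbol", self.name))


class UsrMarshal:  # placeholder for the fallback class
    pass


class GemVersion:  # placeholder for a registered custom class
    pass


# Registry keyed by plain str, mirroring the behavior described in the issue.
registry = {"Gem::Version": GemVersion}

class_name = Symbol("Gem::Version")  # what the reader passes to the lookup

by_symbol = registry.get(class_name, UsrMarshal)    # misses, falls back
by_str = registry.get(class_name.name, UsrMarshal)  # hits once converted to str

print(by_symbol.__name__, by_str.__name__)
```

Converting the symbol to its str name before the lookup (or registering under the symbol) is one way the two sides could be made to agree; which of these the library should do is left open here.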

I've worked around this by calling ver.marshal_dump() instead, but I don't think it's the correct solution.
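
A minimal sketch of that workaround, assuming `marshal_dump()` returns a list whose first element is the version string (the 1.0.3 output above suggests this shape, but it is not confirmed here). `version_string` is a hypothetical helper, and `FakeVersion` only stands in for the parsed object so the snippet runs without rubymarshal or network access.

```python
def version_string(ver):
    """Return the version text from a parsed Gem::Version-like object.

    Assumption: ver.marshal_dump() yields a list such as ['1.4'].
    """
    dumped = ver.marshal_dump()
    if not dumped:
        raise ValueError("marshal_dump() returned no data")
    first = dumped[0]
    # Some strings may come back as bytes (the 1.0.3 output shows b'ruby'),
    # so decode defensively.
    return first.decode("utf-8") if isinstance(first, bytes) else str(first)


class FakeVersion:
    """Stand-in for the object the reader returns; for illustration only."""

    def __init__(self, data):
        self._data = data

    def marshal_dump(self):
        return self._data


print(version_string(FakeVersion(["0.5.3"])))
```

In the original program this would replace `print(name, ver, gemplat)` with something like `print(name, version_string(ver), gemplat)`.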

@jayvdb

jayvdb commented Jul 7, 2019

The only other code I can find that uses this for Gem Version is https://github.com/d9pouces/Moneta/blob/master/moneta/repositories/ruby.py and https://github.com/ATIX-AG/pulp_gem/blob/master/pulp_gem/specs.py written by @mdellweg, which may also be affected or might hold the answer for how to work around this.

@d9pouces
Owner

d9pouces commented Jul 8, 2019

You're right.
I'll update the doc and create two versions:

  • 1.3 -> back to the previous behavior
  • 2.0 -> the new behavior, which is closer to the Ruby implementation.

@AMDmi3
Author

AMDmi3 commented Jul 31, 2019

Still, what's the correct way to get Gem::Version value now?

@jayvdb
Copy link

jayvdb commented Dec 30, 2019

For anyone who only wants a solution for Ruby versions and only needs py35+ support, https://github.com/dephell/dephell_specifier includes a fairly good Ruby version parser.
I haven't run a full scan of rubygems, so I do expect there are some oddballs, and I'll be happy to help fix any issues raised.

@pombredanne

@d9pouces do you need some help to fix this? (and as an aside, thank you ++ for this library 🙇 )

@pombredanne

@jayvdb re:

The only other code I can find which uses this for Gem Version ...

Actually I have a tool that's about to be released at last that uses this for Gem Version too.

@pombredanne

FWIW I ran a quick git bisect and the commit that introduced the problem is at 7197a3d.... but I cannot fathom why this makes things fail. @d9pouces would you have some idea of where to poke?

@AMDmi3
Author

AMDmi3 commented Aug 30, 2021

so I guess there is a way?

I've mentioned it right in the issue.

@mdellweg

I got it to work with this class. It needs to inherit from UsrMarshal, because Gem::Version uses marshal_dump/marshal_load on the Ruby side:

import rubymarshal.classes
from rubymarshal.classes import UsrMarshal


class GemVersion(UsrMarshal):
    ruby_class_name = "Gem::Version"

    @property
    def version(self):
        # The dumped payload is a list; its first element is the version string.
        return self._private_data[0]

    def __repr__(self):
        return f"{self.ruby_class_name}('{self.version}')"

    def __str__(self):
        return f"{self.ruby_class_name}('{self.version}')"

    def __eq__(self, other):
        # Compare against the other instance's data, not our own.
        return isinstance(other, self.__class__) and self._private_data == other._private_data


rubymarshal.classes.registry.register(GemVersion)


5 participants