Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI for inspection, and conversion to skops format, of sklearn pickle files #241

Open
BenjaminBossan opened this issue Dec 5, 2022 · 8 comments
Assignees
Labels
persistence Secure persistence feature

Comments

@BenjaminBossan
Copy link
Collaborator

It would be useful for skops to provide a CLI that allows to find the untrusted types in an sklearn pickle file, and to convert it to the skops format.

Details have yet to be discussed, but it could look something like this:

$ ls
my-model.pkl my-other-model.pkl
$ python -m skops inspect my-model.pkl
untrusted types: ...
$ python -m skops convert my-model.pkl
error: untrusted types ... found, pass trusted=true if you trust them or pass a list of explicitly trusted types
$ python -m skops convert my-model.pkl truste=true  # works
$ ls
my-model.pkl my-model.skops my-other-model.pkl
$ python -m skops convert *.pkl  # convert multiple models
$ ls
my-model.pkl my-model.skops my-other-model.pkl my-other-model.skops
@BenjaminBossan BenjaminBossan added the persistence Secure persistence feature label Dec 5, 2022
@merveenoyan
Copy link
Collaborator

It will simply be a script with argument parser? Is there anything else on top of this?

@BenjaminBossan
Copy link
Collaborator Author

Maybe we could add a utility function to skops.io so that the conversion can also be called from Python, but apart from that, it should be really straightforward. Probably for this purpose, argparse would be sufficient.

@adrinjalali
Copy link
Member

@E-Aho
Copy link
Collaborator

E-Aho commented Dec 5, 2022

Picking this up :)

@E-Aho
Copy link
Collaborator

E-Aho commented Dec 5, 2022

So I'm working on this, and I want to double check I'm not missing something:

To check a pkl file is "trusted" or not, or convert it, we need to call pkl.load() on a pickled object, then construct a tree for it, right?

Double checking I've not missed a method or API we already have for this.

@BenjaminBossan
Copy link
Collaborator Author

Yes, it involves loading the pickled object. That's a good point. So the help of those commands should emphasize that this will happen and what it implies.

@adrinjalali
Copy link
Member

Right now the tree can only be constructed from a skops file, not a pickle file/object, and there isn't much of a point to try and check security once the object is loaded anyway. If the pickle file is compromised, the act of loading it runs arbitrary code.

So as far as security goes, we only tell users that loading arbitrary pickle files can run arbitrary code, and then we convert the file. Once the file is converted, we can tell them which types they'd need to pass as trusted to load the file. This is also what I did in this space: https://huggingface.co/spaces/adrin/pickle-to-skops

@BenjaminBossan
Copy link
Collaborator Author

Yes, it should be framed as "unknown types", not "untrusted".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
persistence Secure persistence feature
Projects
None yet
Development

No branches or pull requests

4 participants