Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

source key definition collision #85

Closed
oeway opened this issue May 31, 2021 · 8 comments
Closed

source key definition collision #85

oeway opened this issue May 31, 2021 · 8 comments

Comments

@oeway
Copy link
Contributor

oeway commented May 31, 2021

We have source key in both RDF and model.yaml, which has distinct meaning, for RDF it means the original source of the resource item, but in the model.yaml, it means the source code.

We should rename one of then.

@constantinpape
Copy link
Collaborator

Yes, we should def. fix this. Does this

original source of the resource item

mean the url (or doi) where the resource can be found?

I think I would rather change this in the rdf than in the model, because we currently have more models than other things.
But I don't have a very strong opinion on this.

@oeway
Copy link
Contributor Author

oeway commented May 31, 2021

Yes, we should def. fix this. Does this

original source of the resource item

mean the url (or doi) where the resource can be found?

Yes.

I think I would rather change this in the rdf than in the model, because we currently have more models than other things.
But I don't have a very strong opinion on this.

I am fine with both too. We need to think a bit more on what key to use then. WIth that said, I think the source key in the model.yaml is not very descriptive though, maybe it's just me, I feel it hard to directly link to source code when I see source. And here we are actually more specifically mean the model architecture.

@FynnBe
Copy link
Member

FynnBe commented May 31, 2021

If a model is supposed to be a valid RDF as well, we also would need to add the 'new' source field to the model, right? That might be better than renaming the model.source to something else and than adding the 'new' model.source, unless we rename both source fields to avoid confusion.

@oeway
Copy link
Contributor Author

oeway commented May 31, 2021

After thinking a bit more in the other issue about the model source key, wouldn't it make more sense that we move the source code to the pytorch weights format itself?

Let's say we have 2 weights formats in weights for the same model: 1) keras weights only file 2) torch state dict file.

Now it becomes very tricky because both weights format contains only weights, they both require python source code to construct a model. And we won't be able to support both because there is only one source key. The solution for this is obvious, we can simply move the source to the weights format itself. Make it a non-standard key that only required for certain weights format. What do you think? @constantinpape @FynnBe

@constantinpape
Copy link
Collaborator

After thinking a bit more in the other issue about the model source key, wouldn't it make more sense that we move the source code to the pytorch weights format itself?

Yes, I totally agree with this. We should move it to the weight format because it is relevant for loading the weights only.

This opens up another question though: what happens to language and framework? These are currently tied to the general source field.
I would vote for just removing these fields. Afaik they don't provide any information and they can also be inconsistent if we provide multiple weights.

@oeway
Copy link
Contributor Author

oeway commented May 31, 2021

This opens up another question though: what happens to language and framework? These are currently tied to the general source field.
I would vote for just removing these fields. Afaik they don't provide any information and they can also be inconsistent if we provide multiple weights.

Removing them make total sense, we can also infer framework (and language ?) from the weights format. should we also move dependencies into weights since it's also source code and weights format specific? E.g. for tensorflow models we depends on tensorflow 2, and for pytorch model we depends on torch?

Would you want to take the chance to rename source key into a more descriptive one? Maybe architecture or model_class since it's specifically a model class we are describing. Then perhaps we can keep the top-level source key for RDF?

@constantinpape
Copy link
Collaborator

Removing them make total sense, we can also infer framework (and language ?)

Exactly.

from the weights format. should we also move dependencies into weights since it's also source code and weights format specific? E.g. for tensorflow models we depends on tensorflow 2, and for pytorch model we depends on torch?

Yes, I agree; that makes sense.

Would you want to take the chance to rename source key into a more descriptive one? Maybe architecture or model_class since it's specifically a model class we are describing. Then perhaps we can keep the top-level source key for RDF?

I agree, we can rename it. I like architecture, but don't have a strong opinion here.

@FynnBe
Copy link
Member

FynnBe commented Jun 1, 2021

👍 to all above. I like architecture, too (It might as well point to a getter function)

@FynnBe FynnBe added this to To Do in RDF Releases via automation Aug 4, 2021
@FynnBe FynnBe moved this from To Do to Model RDF 0.4.0 in RDF Releases Aug 4, 2021
@FynnBe FynnBe mentioned this issue Nov 29, 2021
@FynnBe FynnBe closed this as completed Nov 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
RDF Releases
Model RDF 0.4.0
Development

No branches or pull requests

3 participants