Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hasDataFile would benefit from a better description #690

Closed
maxhbr opened this issue Apr 3, 2024 · 10 comments · Fixed by #815
Closed

hasDataFile would benefit from a better description #690

maxhbr opened this issue Apr 3, 2024 · 10 comments · Fixed by #815
Milestone

Comments

@maxhbr
Copy link
Member

maxhbr commented Apr 3, 2024

The following line does not help to understand what it means and how it should be used. It is missing context and data file is a very generic term.

- hasDataFile: The `from` Element treats each `to` Element as a data file

@goneall
Copy link
Member

goneall commented Apr 3, 2024

Looks like it was just copied from the SPDX 2.X definitions.

@maxhbr If you want to create a PR, we can get this into 3.0, otherwise I'll target it for 3.1

@maxhbr
Copy link
Member Author

maxhbr commented Apr 3, 2024

If someone can explain to me, what a "data file" is, sure. But then it might even be easier to create the PR instead of explaining it to me.

@goneall
Copy link
Member

goneall commented Apr 3, 2024

It has been a very long time since this was discussed.

I'll move it to 3.1 unless someone wants to volunteer to write a PR.

@goneall goneall added this to the 3.1 milestone Apr 3, 2024
@kestewart
Copy link
Contributor

@rgopikrishnan91 - do you want to take a pass at this description?

@bact
Copy link
Collaborator

bact commented Jul 30, 2024

In this example SBOM, I use it to documented that a classifier script has a classifier model (data file).

[ predict.py ] --hasDataFile--> [ model.bin ]

{
    "type": "Relationship"
    "relationshipType": "hasDataFile",
    "from": "https://spdx.org/spdxdocs/File11",
    "to": [
        "https://spdx.org/spdxdocs/File10"
    ],
}

@bact
Copy link
Collaborator

bact commented Jul 30, 2024

Open PR #815 to get more suggestions

@rgopikrishnan91
Copy link
Contributor

I left a review on the PR, but here are my thoughts

Generic database fie --> File
I am unsure if saying hasDataFile is the right name? Should it hasAsset? because hasDataFile for a model executable and log file seems odd and non-intitutive.
Also, I am unsure and maybe we should clarify how its different from dependsOn relationship?

@bact @kestewart

@goneall
Copy link
Member

goneall commented Jul 30, 2024

Should it hasAsset?

In other parts of the SPDX spec, we've used the term Artifact to mean something more general than a file.

@bact
Copy link
Collaborator

bact commented Jul 30, 2024

I agree with @rgopikrishnan91 that we should clarify how hasDataFile (or hasAsset / hasArtifact) is different from dependsOn.

I can see that hasDataFile doesn't imply dependency, while dependsOn explicitly does.

hasDataFile also suggests that the to Element should be a File (although I don't think it is enforced by the model), while the to of dependsOn can be anything.

In this sense, dependsOn works more at the abstract level. While hasDataFile details the implementation level.

@bact
Copy link
Collaborator

bact commented Aug 1, 2024

Yesterday AI team meeting (2024-07-31), we settled with this:

  • hasDataFile: The from Element treats each to Element as a data file. A data file is an artifact that stores data required or optional for the from Element's functionality. A data file can be a database file, an index file, a log file, an AI model file, a calibration data file, a temporary file, a backup file, and more. For AI training dataset, test dataset, test artifact, configuration data, build input data, and build output data, please consider using the more specific relationship types: trainedOn, testedOn, hasTest, configures, hasInputs, and hasOutputs, respectively. This relationship does not imply dependency.

(see #815) @maxhbr do you think this is sufficient? thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants