Transform6: outputs: profile data #33

Closed
kcoyle opened this issue May 27, 2021 · 5 comments

@kcoyle
Collaborator

kcoyle commented May 27, 2021

  • Output format(s)?
    • JSON
    • YAML
    • ??
  • Output structure(s)?
    • structured, with shapes and their statements (Tom's example):
      ShapeA
        statement1
        statement2
      ShapeB
        statement3
      etc.
    • unstructured, same as rows (as output from csv-to-json), i.e.
      ShapeA
      statement1

      ShapeA
      statement2


  • TAPs with no shapes
    • output just the statements?
    • use an empty shape structure? (as per Tom's example)
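
The difference between the two candidate structures can be sketched in a few lines of Python. This is only an illustration: the key names `shapeID`, `propertyID`, `shapes`, and `statement_constraints` are assumptions made for the example, not an agreed output format, and `to_structured` is a hypothetical helper.

```python
import json

# Hypothetical "unstructured" rows, one per statement, as a CSV-to-JSON
# step might emit them. All key names are assumptions for illustration.
rows = [
    {"shapeID": "ShapeA", "propertyID": "statement1"},
    {"shapeID": "ShapeA", "propertyID": "statement2"},
    {"shapeID": "ShapeB", "propertyID": "statement3"},
]


def to_structured(rows):
    """Group flat rows into a list of shapes, each holding its statements."""
    shapes = {}  # dicts preserve insertion order, so shapes keep row order
    for row in rows:
        shape = shapes.setdefault(
            row["shapeID"],
            {"shapeID": row["shapeID"], "statement_constraints": []},
        )
        shape["statement_constraints"].append({"propertyID": row["propertyID"]})
    return {"shapes": list(shapes.values())}


structured = to_structured(rows)
print(json.dumps(structured, indent=2))
```

In the structured form each shape identifier appears once, with its statements nested beneath it; in the unstructured form the identifier is repeated on every row.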
@tombaker
Collaborator

In yesterday's call, Nishad and I argued strongly for having just one expected form of output from a DCTAP instance: shapes with statement constraints. The minimal DCTAP instance is a list of property identifiers, but even in that case the output should show those statement constraints grouped under a default shape, because a shape is always expected to be present.

We had some discussion about what identifier to use for the default shape. I do not think it really matters much what identifier is used. In the 'dctap-python' utility, the default is set to be :default, but this should be configurable by the user. My readthedocs documentation puts it this way: "A default shape identifier is assigned if not provided in the CSV... In a 'shape-less' application, the shape identifier can simply be ignored."

Assigning an empty string as the default shape identifier does not seem like a good idea, as it would break DCTAP instances in which a shape identifier is declared only after a series of "shape-less" statement constraints.
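
A minimal sketch of the grouping described above, using only the Python standard library. It is not the 'dctap-python' implementation: the function `group_rows`, the column names `shapeID` and `propertyID`, and the rule that a blank shape cell continues the current shape are all assumptions made for illustration.

```python
import csv
import io

# ":default" is the default identifier quoted above; it is a parameter
# here so that a user could override it.
DEFAULT_SHAPE_ID = ":default"


def group_rows(csv_text, default_shape_id=DEFAULT_SHAPE_ID):
    """Group statement rows under shapes, starting in the default shape."""
    shapes = {}
    current = default_shape_id
    for row in csv.DictReader(io.StringIO(csv_text)):
        shape_id = (row.get("shapeID") or "").strip()
        if shape_id:
            # A declared identifier starts (or resumes) a named shape.
            current = shape_id
        # Assumption: a blank cell means "same shape as the row above".
        shapes.setdefault(current, []).append(row["propertyID"])
    return shapes


# Two "shape-less" statements, followed by a shape declared later on.
csv_text = (
    "shapeID,propertyID\n"
    ",dcterms:title\n"
    ",dcterms:creator\n"
    ":book,dcterms:date\n"
    ",dcterms:subject\n"
)
shapes = group_rows(csv_text)
print(shapes)
```

Under these assumptions the first two statements land in `:default` and the last two in `:book`, so every statement belongs to some shape even when the table starts without one.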

@kcoyle
Collaborator Author

kcoyle commented Jun 10, 2021

My only comment is that a program would need to be able to distinguish between more than one default shapeID in the same table. Whether the identifiers are "default", "default1", ... "defaultn" or something else is of no consequence, but that possibility needs to be there.
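
One way to keep several default identifiers distinct is to hand them out from a generator. The sketch below is hypothetical: the function `default_shape_ids`, its `taken` and `base` parameters, and the "default, default1, default2" pattern are illustrative only, and it leaves open the question of when a table should start a new default shape.

```python
from itertools import count


def default_shape_ids(taken=(), base="default"):
    """Yield base, base1, base2, ... skipping identifiers already in use."""
    taken = set(taken)
    for n in count():
        candidate = base if n == 0 else f"{base}{n}"
        if candidate not in taken:
            yield candidate


# "default1" is already declared in the table, so it is skipped.
ids = default_shape_ids(taken={"default1"})
first_three = [next(ids) for _ in range(3)]
print(first_three)  # ['default', 'default2', 'default3']
```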

@tombaker
Collaborator

tombaker commented Jun 10, 2021 via email

@tombaker
Collaborator

The script now outputs JSON and YAML, e.g.:

dctap inspect --json some.csv

If there is any question about anonymous shapes, perhaps we could open a separate issue (and close this one).

@kcoyle
Collaborator Author

kcoyle commented Jun 28, 2021

The decision was to use structured output, with JSON and YAML as the first formats offered. In TAPs with only statements (no shape), the statements will be subordinate to a default shape. If we go beyond this, we can open a new, more specific issue. I will open an issue for future consideration of validating labels without IDs. This would be just a warning: if people want to have a profile with only labels, and that works for them, so be it.

@kcoyle kcoyle closed this as completed Jun 28, 2021