[PROCESS] Term Data Model and Form #24

vinomaster · 2020-10-07T20:46:35Z

Issue/Feature Description

We need to simplify the current Issue Template for Terms.

Proposed Solution

Use Markdown

Term/Phrase: <value>
Definition: <value>
Usage example: <external reference>
Relevant Communities: <value>
Tags: <value>

We should eliminate the existing Scope and Concept forms
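
To make the proposed form concrete, here is a minimal sketch of how a submission in this format could be read into a record. It assumes each field sits on a single `Label: value` line; the dictionary keys and the `parse_term` function are hypothetical names, not part of any existing CTWG tooling, and multi-line definitions are not handled.

```python
# Field labels come from the proposed template; the record keys are
# hypothetical names chosen for this illustration.
FIELDS = {
    "Term/Phrase": "term",
    "Definition": "definition",
    "Usage example": "usage_example",
    "Relevant Communities": "communities",
    "Tags": "tags",
}


def parse_term(text: str) -> dict:
    """Parse 'Label: value' lines into a dict; unrecognized lines are ignored."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        label, sep, value = line.partition(":")
        key = FIELDS.get(label.strip())
        if sep and key:
            record[key] = value.strip()
    return record
```

A submission with all five fields would yield a five-key dictionary; fields missing from the file are simply absent from the result.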


vinomaster · 2020-10-08T10:43:37Z

@dhh1128 Do we want each submission to come in via GitHub? What about batch submissions, and how would we handle them? I suggest a PR approach via a dedicated folder in the repo.

- All Issue-based submissions can be accepted by CTWG and dropped into that folder for handling.
- All batch submissions would create discrete .md files, one per term, in that folder.
- We would not accept term.md files that contain multiple terms.

Thoughts?
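
The one-term-per-file rule could be checked mechanically before a PR is accepted. The sketch below is only an illustration: it assumes each term entry starts with a `Term/Phrase:` line, as in the proposed template, and the function names are hypothetical.

```python
from pathlib import Path


def count_terms(text: str) -> int:
    """Count the lines that start a term entry (the 'Term/Phrase:' label)."""
    return sum(
        1 for line in text.splitlines()
        if line.strip().startswith("Term/Phrase:")
    )


def find_invalid_files(folder: str) -> list[str]:
    """Return the names of .md files that do not contain exactly one term."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.md")
        if count_terms(path.read_text(encoding="utf-8")) != 1
    )
```

Running `find_invalid_files` on the submission folder would list both files with several terms and files with none, so either kind could be sent back to the submitter.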

RieksJ · 2020-10-13T06:11:50Z

Are we already clear about what it is we will be doing with those inputs, what the results of our doing will be and who will be using such results (for what purposes)? Did I miss something?

vinomaster · 2020-10-14T14:47:38Z

No, we are not clear; those are process questions. Since we do not have a process in place, I suggested that we place the .md file in a submission folder.

We discussed as a team that we will take a first pass at grabbing Sovrin (automated input) and Bedrock (manual input) as an initial test case.

vinomaster · 2020-10-19T16:50:03Z

As per our last meeting, I have submitted sample content from the BBU Glossary. Instead of using a separate issue for each term, I prepared a PR that places candidate terms in the submissions folder. The CTWG can now process these submissions and insert the content into whatever internal data store tool is used.

I believe @dhh1128 will be submitting Sovrin content to continue this sample exercise.

Issue #24 Bedrock submission samples.

vinomaster · 2020-10-19T16:52:30Z

Will keep this issue open to allow @dhh1128 to submit his Sovrin changes.

dhh1128 · 2020-10-19T18:03:00Z

ETA on my part = EOD today

dhh1128 · 2020-10-20T01:20:42Z

Okay, I have a PR that represents a first pass extraction from the Sovrin Glossary: #26. The extraction was done with a script that I can modify and re-run; I'd like to improve the content before a merge, so please don't merge until we discuss.

Some questions I have, and things I want to discuss/fix before this sort of thing gets merged:

1. Is "submissions" a simple triage bucket, or does it deserve to be a permanent archive? (I'm assuming that submitted terms get turned into canonical data through a combination of manual and automated transformations, and that the glossary generation process runs off canonical data, not raw submissions. Do we agree?)
2. Many of the terms I've extracted need hyperlinks to one another. I believe @RieksJ has a way to do that with a slight tweak to the markup, to match the cross-linking feature he has from the Docusaurus approach. I'd like to discuss whether this can be added to our template.
3. Capitalization. The Sovrin Glossary follows the convention of capitalizing all proper terms (a convention also used in many legal contracts in English). I don't like this convention because it's not natural. It shows up in the filenames I generated for the markdown.
4. I didn't include any usage examples. However, I could probably generate examples from the Sovrin Governance Framework. Is that valuable?
5. I didn't check, but I believe a few terms in my submission might overlap terms from Bedrock that Dan submitted. The communities are different (though perhaps somewhat overlapping). How do we avoid confusion when two labels have the same value but point to different things? (I know how to do it later, in canonical data; I'm just asking how to handle it during submission.)
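
On the overlap question, one way to at least see the collisions during submission is to group labels by community. This is a rough sketch under stated assumptions: labels are compared case-insensitively after trimming whitespace, the input is a list of (community, label) pairs, and `find_overlaps` is a hypothetical helper rather than an existing script.

```python
from collections import defaultdict


def find_overlaps(submissions):
    """Map each normalized label to the communities that submitted it,
    keeping only labels submitted by more than one community.

    `submissions` is an iterable of (community, label) pairs.
    """
    seen = defaultdict(set)
    for community, label in submissions:
        # Assumed normalization: trim whitespace and ignore case.
        seen[label.strip().lower()].add(community)
    return {
        label: sorted(communities)
        for label, communities in seen.items()
        if len(communities) > 1
    }
```

The result only reports that the same label appears in several communities; deciding whether the entries mean the same thing would still be a manual step.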

RieksJ · 2020-10-20T14:08:50Z

I've also created a PR #28, from work that is being done in eSSIF-Lab. I haven't followed the template format everywhere – sometimes stuff is missing, in other cases additional stuff is added. The idea is that it might provide test-cases for exercising any procedures we entertain.

@dhh1128 question 1: Seems fair enough. Since it is git, people can go back if they really need to.

@dhh1128 question 4: Sometimes an example is useful, sometimes it isn't. Developing terminology/documentation isn't like writing software which (ideally) needs to be complete. Documentation, particularly terminology, sometimes works all right if details are missing. The nice thing is that we can add such details no earlier than when the need for them arises.

@dhh1128 question 5: There may also be overlaps with eSSIF-Lab terms – I, too, didn't check.

Issue #24 moved bedrock terms to bbu folder

vinomaster · 2020-10-21T16:29:23Z

My POV:

Is "submissions" a simple triage bucket, or does it deserve to be a permanent archive?
- The glossary generation process will run off canonical data, not raw submissions.
- The submissions folder is for capturing raw data prior to injecting it into the CTWG internal storage data model (TBD).
- There are several submission input mechanisms:
  1. GitHub Issue, which would need to be manually copied into a submission folder entry OR directly into the CTWG internal storage data model.
  2. Pull-request approach (batch), where "n" new terms are submitted as raw data for evaluation. This comes in several flavors:
     - Manual generation of terms in submissions/xxx, where xxx is the name of a sub-folder containing new terms.
     - Automated extraction of terms in submissions/xxx, where xxx is the name of a sub-folder containing new terms. This is accompanied by a new entry in the code/yyy folder, where yyy contains the code specific to this extraction effort.
     - Automated extraction of terms in submissions/xxx, where xxx is the name of a sub-folder containing new terms. This is accompanied by a new entry in the code/yyy folder, where yyy contains the code specific to this extraction effort. This entry also comes with some degree of job scheduling capability and management.
Many of the terms I've extracted need hyperlinks to one another. I believe @RieksJ has a way to do that with a slight tweak to the markup, to match the cross-linking feature he has from the Docusaurus approach. I'd like to discuss whether this can be added to our template.
- I am ok with template changes for this. We will have several until we get the model and process down.
- Maybe the proper step is a script that runs against the submission folder, preps new terms, submits them into the CTWG internal storage data model, and then deletes them from the submission folder. To me, this is an internal CTWG team process activity for managing the path from raw to canonical.
Capitalization. The Sovrin Glossary follows the convention of capitalizing all proper terms (a convention also used in many legal contracts in English). I don't like this convention because it's not natural. It shows up in the filenames I generated for the markdown.
- I do not like caps in file names, so I always use the filename convention "lowercaseword_lowercaseword".
I didn't include any usage examples. However, I could probably generate examples from the Sovrin Governance Framework. Is that valuable?
- Adding sample usage is a process thing. Do we require it? It makes sense for Issue-based submissions, but for autogenerated terms I would waive that and deal with it inside the canonical maturation process.
I didn't check, but I believe a few terms in my submission might overlap terms from Bedrock that Dan submitted. The communities are different (though perhaps somewhat overlapping). How do we avoid confusion when two labels have the same value but point to different things? (I know how to do it later, in canonical data; I'm just asking how to handle it during submission.)
- I would assume (at least initially) we allow for overlaps and deal with it via links in canonical data. These overlaps or relationships are something of a positive data point (artifact) that comes from this exercise.
I was unable to determine the best way to list tags. We need a convention that makes downstream search and glossary generation easier.
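
For the "lowercaseword_lowercaseword" filename convention, an extraction script could derive filenames from term labels along these lines. This is a sketch only: `term_filename` is a hypothetical helper, and it assumes that anything other than ASCII letters and digits can be treated as a word separator, which may not suit every term.

```python
import re


def term_filename(term: str) -> str:
    """Turn a term label into a lowercase, underscore-separated .md filename."""
    # Assumption: runs of ASCII letters/digits are words; everything else separates them.
    words = re.findall(r"[a-z0-9]+", term.lower())
    if not words:
        raise ValueError(f"no usable characters in term: {term!r}")
    return "_".join(words) + ".md"
```

With this rule, a capitalized label such as "Verifiable Credential" would map to `verifiable_credential.md`, so the capitalization convention of a source glossary would not leak into the filenames.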

vinomaster assigned vinomaster and dhh1128 Oct 8, 2020

vinomaster added a commit to vinomaster/concepts-and-terminology that referenced this issue Oct 19, 2020

Issue trustoverip#24 Bedrock submission samples.

2bf5294

vinomaster added a commit that referenced this issue Oct 19, 2020

Merge pull request #25 from vinomaster/master

64362b7

Issue #24 Bedrock submission samples.

vinomaster added a commit to vinomaster/concepts-and-terminology that referenced this issue Oct 21, 2020

Issue trustoverip#24 moved bedrock terms to bbu folder

3564e25

vinomaster added a commit that referenced this issue Oct 21, 2020

Merge pull request #29 from vinomaster/master

8211937

Issue #24 moved bedrock terms to bbu folder

vinomaster mentioned this issue Mar 9, 2021

[PROCESS] ToIP Deliverables Process compliant glossary proposal #43

Open
