This repository has been archived by the owner on Aug 5, 2021. It is now read-only.

Sneak peek at the new draft metadata schema! [Work in Progress] #273

Closed
okamanda opened this issue Jul 18, 2017 · 8 comments

Comments

@okamanda
Contributor

Hi folks,

As we build an updated version of the metadata schema, we want to give you the opportunity to share your feedback. We've started incorporating comments from the previous metadata schema thread, #250. This draft is an early iteration of both the metadata schema and the sample code.json file. You'll note a number of syntactic changes in the newer draft, and we've provided a diff below to show them, but we specifically want to draw your attention to two new fields: costSavings and openSourceMeasureType.

costSavings: How should we measure the impact of Code.gov? The costSavings field could provide a meaningful and flexible metric for agencies to demonstrate how much money has been saved by creating or using each repo. But we know this isn't the only way to show value. Some alternatives to consider are laborHours, value, and cost. Which of these, if any, makes the most sense to you, and what are the arguments for using it instead of costSavings?

openSourceMeasureType: The Federal Source Code Policy requires that 20% of an agency's code inventory be made available as open source. To comply, agencies must (1) identify a method for measuring the 20% metric and (2) apply the measurement method to the total enterprise code inventory to establish a baseline. We've added a field for this to the updated draft of the metadata schema. Agencies can use a few different methods for measuring their source code inventories as described here, which include:

  • Source lines of code
  • Number of self-contained modules
  • Cost
  • Number of software projects
  • System certification and accreditation boundaries

You can take a look at the changes to the sample code.json and the draft version 1.0.2 metadata schema below.

Changes to Sample Code.json in Version 1.0.2.

--- sample_code.json	2017-07-12 16:22:02.000000000 -0400
+++ draft_v102_sample_code.json	2017-07-14 16:20:35.000000000 -0400
@@ -1,49 +1,56 @@
 {
-    "version":"1.0.1",
+    "version":"1.0.2",
     "agency": "DOABC",
     "projects": [
         {
             "name": "mygov",
             "organization": "XYZ Department",
             "description": "A Platform for Connecting People and Government",
-            "license": "https://path.to/license",
-            "openSourceProject": 1,
-            "governmentWideReuseProject": 0,
+            "license": {
+              "URL": "https://path.to/license",
+              "name": "CC0"
+          
+            },
+            "openSourceProject": true,
+            "governmentWideReuseProject": false,
 
             "tags": [
               "platform",
               "government",
               "connecting",
               "people"
             ],
             "contact": {
               "email": "project@agency.gov",
               "name": "Project Coordinator Name",
               "URL": "https://twitter.com/projectname",
               "phone": "2025551313"
             },
             "status": "Alpha",
             "vcs": "git",
-            "repository": "https://github.com/presidential-innovation-fellows/mygov",
-            "homepage": "https://agency.gov/project-homepage",
+            "repositoryURL": "https://github.com/presidential-innovation-fellows/mygov",
+            "homepageURL": "https://agency.gov/project-homepage",
             "downloadURL": "https://agency.gov/project/dist.tar.gz",
             "languages": [
               "java",
               "python"
             ],
+            "costSavings": "1000.00",
+            "openSourceMeasureType": "lines of code",
             "partners": [
                 {
                     "name": "DOXYZ",
                     "email": "project@doxyz.gov"
                 }
             ],
             "exemption": null,
             "exemptionText": "No exemption requested",
-            "updated": {
-                "lastCommit": "2016-04-30",
-                "metadataLastUpdated": "2016-04-13",
-                "lastModified": "2016-04-12"
+            "date": {
+                "created": "2016-04-12",
+                "lastModified": "2016-04-12",
+                "metadataLastUpdated": "2016-04-13"
+                
             }
         }
     ]
 }
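
The field changes in the diff above can be read as a small migration. What follows is only a sketch under assumptions, not an official Code.gov tool: the function name `migrate_project` is hypothetical, the license `name` is left as `None` because a 1.0.1 entry carries only the URL, and the new `created` date is omitted because it cannot be derived from the old `updated` block.

```python
import copy


def migrate_project(old):
    """Return a copy of a 1.0.1 project entry reshaped to the 1.0.2 draft."""
    new = copy.deepcopy(old)

    # "license" changes from a bare URL string to an object with URL and name.
    if isinstance(new.get("license"), str):
        new["license"] = {"URL": new["license"], "name": None}  # name is unknown here

    # The 0/1 flags become JSON booleans.
    for flag in ("openSourceProject", "governmentWideReuseProject"):
        if flag in new:
            new[flag] = bool(new[flag])

    # "repository" and "homepage" gain a URL suffix.
    for old_key, new_key in (("repository", "repositoryURL"), ("homepage", "homepageURL")):
        if old_key in new:
            new[new_key] = new.pop(old_key)

    # "updated" is renamed to "date"; "lastCommit" is dropped in the draft sample.
    if "updated" in new:
        updated = new.pop("updated")
        new["date"] = {
            key: updated[key]
            for key in ("lastModified", "metadataLastUpdated")
            if key in updated
        }
    return new
```

Fields that did not change (for example `tags` or `contact`) pass through untouched, and the input dictionary is not modified.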

Draft Metadata Schema in Version 1.0.2.

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"Code.gov Inventory",
  "description":"A federal source code catalog",
  "type":"object",
  "properties":{
    "version":{
      "type":"string"
    },
    "agency":{
      "type":"string"
    },
    "projects":{
      "type":"object",
      "properties":{
        "name":{
          "type":"string"
        },
        "organization":{
          "type":"string"
        },
        "description":{
          "type":"string"
        },
        "license":{
          "type":"array",
          "items":{
            "type":"object",
            "properties":{
            "URL":{
              "type":"uri"
              },
            "name":{
              "type":"string"
              }
            }
          }
        },
        "openSourceProject":{
          "type":"boolean"
        },
        "governmentWideReuseProject":{
          "type":"boolean"
        },
        "tags":{
          "type":"array",
          "items":{
            "type":"string"
          }
        },
        "contact":{
          "type":"object",
          "properties":{
            "email":{
              "type":"string"
            },
            "name":{
              "type":"string"
            },
            "URL":{
              "type":"uri"
            },
            "phone":{
              "type":"number"
            }
          }
        },
        "status":{
          "type":"string"
        },
        "vcs":{
          "type":"string"
        },
        "repositoryURL":{
          "type":"uri"
        },
        "homepageURL":{
          "type":"uri"
        },
        "downloadURL":{
          "type":"uri"
        },
        "languages":{
          "type":"array",
          "items":{
            "type":"string"
          }
        },
        "costSavings":{
          "type":"number"
        },
        "openSourceMeasureType":{
          "type":"string"
        },
        "partners":{
          "type":"object",
          "properties":{
            "name":{
              "type":"string"
            },
            "email":{
              "type":"string"
            }
          }
        },
        "exemption":{
          "type":"number"
        },
        "exemptionText":{
          "type":[
            "null",
            "number"
          ]
        },
        "date":{
          "type":"object",
          "properties":{
            "created":{
              "type":"date"
            },
            "lastModified":{
              "type":"date"
            },
            "metadataLastUpdated":{
              "type":"date"
            }
          }
        }
      },
      "required":[
        "name",
        "repositoryURL",
        "description",
        "licenseURL",
        "openSourceProject",
        "governmentWideReuseProject",
        "tags",
        "contact",
        "email"
      ]
    }
  },
  "required":[
    "version",
    "agency",
    "projects"
  ]
}
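
One quick way to review a draft like this is to compare the types it declares with the values in the sample file. The snippet below is a rough, standard-library-only sketch, not a JSON Schema validator: `type_mismatches` and `JSON_TYPES` are hypothetical names, and only a few fields from the schema and sample above are copied in.

```python
# Map JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def type_mismatches(instance, properties):
    """List the keys of `instance` whose value does not match the declared type."""
    problems = []
    for key, value in instance.items():
        declared = properties.get(key, {}).get("type")
        # Skip undeclared fields and type names this sketch does not handle
        # (type lists, or non-standard names such as "uri" and "date").
        expected = JSON_TYPES.get(declared) if isinstance(declared, str) else None
        if expected is None:
            continue
        # bool is a subclass of int in Python, so reject it for "number".
        if declared == "number" and isinstance(value, bool):
            problems.append(key)
        elif not isinstance(value, expected):
            problems.append(key)
    return problems


# Declarations copied from the draft schema above.
draft_properties = {
    "costSavings": {"type": "number"},
    "openSourceMeasureType": {"type": "string"},
    "openSourceProject": {"type": "boolean"},
}
# Values copied from the draft sample code.json above.
sample_project = {
    "costSavings": "1000.00",
    "openSourceMeasureType": "lines of code",
    "openSourceProject": True,
}
print(type_mismatches(sample_project, draft_properties))  # prints ['costSavings']
```

The sample stores `costSavings` as the string "1000.00" while the schema declares a number. The same check applied to `contact` flags `phone`, which is a string in the sample and a number in the schema.
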
@apyle

apyle commented Jul 18, 2017

@okamanda Should we bump the version up to 2.0.0 based on semantic versioning?

@okamanda
Contributor Author

@apyle yeah that's a good call.

@iadgovuser1

@okamanda

Is this correct:

"required":[
"name",
"repositoryURL",
"description",
"licenseURL", <<<<<<<<<

Should it be license.URL or some other JSON schema construct? I'm not familiar with JSON schema.
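
For context on the question above: in JSON Schema, `required` lists property names of the object it sits in, so a nested field is normally made mandatory with a `required` array inside the nested object rather than with a dotted or combined name. A name that matches no declared property is still legal, but it is usually a slip. The sketch below finds such names; `unknown_required` and the trimmed-down `fragment` are hypothetical, and the fragment types `URL` as a plain string for simplicity.

```python
def unknown_required(schema, path="#"):
    """Yield (path, name) for each required name with no matching property."""
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in properties:
            yield path, name
    # Recurse into nested object schemas and into array item schemas.
    for name, sub in properties.items():
        if isinstance(sub, dict):
            yield from unknown_required(sub, f"{path}/{name}")
            items = sub.get("items")
            if isinstance(items, dict):
                yield from unknown_required(items, f"{path}/{name}/items")


# A trimmed-down fragment shaped like the draft's project object.
fragment = {
    "properties": {
        "name": {"type": "string"},
        "license": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"URL": {"type": "string"}, "name": {"type": "string"}},
                "required": ["URL"],  # assumption: one way to require the nested URL
            },
        },
    },
    "required": ["name", "licenseURL"],
}
print(list(unknown_required(fragment)))  # prints [('#', 'licenseURL')]
```

Under this reading, the top-level list would name `license`, and the nested `required` inside the license items would name `URL`.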

@apyle

apyle commented Jul 21, 2017

@okamanda

  1. I think the name changes like repositoryURL, homepageURL, etc. make sense.
  2. I also like the change from 0 and 1 to false and true.
  3. Making license an array is needed for a couple of our cases so glad that is included.
  4. Since each agency should be using only one way to measure their code, the openSourceMeasureType should move out of the projects object and be at the top level as a peer of agency and version.
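
As a sketch of what point 4 would look like in practice, assuming every project in one inventory reports the same measure type, the field could be lifted to the top level next to `agency` and `version`. The helper name `hoist_measure_type` is hypothetical.

```python
def hoist_measure_type(inventory):
    """Move a shared openSourceMeasureType from the projects to the top level."""
    projects = inventory.get("projects", [])
    values = {
        p["openSourceMeasureType"] for p in projects if "openSourceMeasureType" in p
    }
    if len(values) > 1:
        # An agency is expected to use a single method, so refuse to guess.
        raise ValueError(f"projects disagree on measure type: {sorted(values)}")

    hoisted = {k: v for k, v in inventory.items() if k != "projects"}
    if values:
        hoisted["openSourceMeasureType"] = next(iter(values))
    # Re-add the projects last, without the per-project field.
    hoisted["projects"] = [
        {k: v for k, v in p.items() if k != "openSourceMeasureType"} for p in projects
    ]
    return hoisted
```

Raising an error on conflicting values keeps the one-method-per-agency assumption visible instead of silently picking one.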

Question: Is costSavings the value to be realized if the project is reused once or for all known reuses? For instance, if a project represented 1000 laborHours and was being reused by 5 agencies, would the value be 1000 or 5000?

@okamanda
Contributor Author

Thanks @apyle

That's a good question. The costSavings field is anticipated to be the agency's measurement of cost savings across all known (internal) instances.

@IanLee1521
Contributor

IanLee1521 commented Jul 27, 2017

@okamanda -- did this discussion move to a different issue? I know I'm coming in late. Edit: Found it here: #280

One thought (though more minor), in the schema, license is an array, but in your sample it is an object (I suggest the schema is correct). Edit: I see this was fixed in the other issue.

Another minor clarification question: are the license.name values expected to be selected from a list? If not, I would worry that you might get differences in spellings / expansions, e.g. "GPL" vs "GPL2" vs "GNU Public License" vs ...

@styfle

styfle commented Jul 27, 2017

@IanLee1521 I would expect the license to use a SPDX Short Identifier.
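
If SPDX short identifiers were adopted for license.name, a submission check could be as simple as a set lookup. This is a minimal sketch: `KNOWN_SPDX_IDS` is a hand-picked subset chosen for illustration (the real SPDX License List is much longer), and `unrecognized_license_names` is a hypothetical helper.

```python
# A small, hand-picked subset of SPDX short identifiers, for illustration only.
KNOWN_SPDX_IDS = {
    "Apache-2.0",
    "BSD-3-Clause",
    "CC0-1.0",
    "GPL-2.0-only",
    "GPL-3.0-only",
    "MIT",
}


def unrecognized_license_names(names):
    """Return the names, in input order, that are not in the known SPDX subset."""
    return [name for name in names if name not in KNOWN_SPDX_IDS]


submitted = ["CC0", "GPL", "GPL2", "GNU Public License", "MIT", "CC0-1.0"]
print(unrecognized_license_names(submitted))
# prints ['CC0', 'GPL', 'GPL2', 'GNU Public License']
```

Note that the draft sample's "CC0" would be flagged under this scheme, since the SPDX identifier is "CC0-1.0".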

@IanLee1521
Contributor

@styfle that seems reasonable to me, but I didn't notice that being explicitly stated... perhaps I missed it though?
