This repository has been archived by the owner on Aug 5, 2021. It is now read-only.

Organization member at wrong level for Agency-wide inventories #187

Closed
apyle opened this issue Nov 20, 2016 · 12 comments

Comments

@apyle

apyle commented Nov 20, 2016

When an Agency has multiple sub-agencies or bureaus, the organization field causes a JSON duplicate key error when we try to combine those sub-agencies' inventories into the Agency inventory. For instance, if we combine the USDA OCIO inventory

{
	"agency": "USDA",
	"organization": "OCIO",
	"project": [{
		"name": "DigitalGov Analytics",
		"snip": "happens"
	}, {
		"name": "Gov-Drupal",
		"snip": "happens"
	}]
}

with the USDA APHIS inventory

{
	"agency": "USDA",
	"organization": "APHIS",
	"project": [{
		"name": "Animal Disease Spread Model",
		"snip": "happens"
	}]
}

and create

{
	"agency": "USDA",
	"organization": "OCIO",
	"project": [{
		"name": "DigitalGov Analytics",
		"snip": "happens"
	}, {
		"name": "Gov-Drupal",
		"snip": "happens"
	}],
	"organization": "APHIS",
	"project": [{
		"name": "Animal Disease Spread Model",
		"snip": "happens"
	}]
}

we get the duplicate key error. The current specification offers no way to combine these inventories and keep each organization associated with its projects. Since version 1.0 of the specification seems to be set (see #46), a solution for Agency-wide inventories is to use the agency name for the mandatory organization member and then add an organization member to each project object. This will still associate the sub-agency with the project details.

{
	"agency": "USDA",
	"organization": "USDA",
	"project": [{
		"organization": "OCIO",
		"name": "DigitalGov Analytics",
		"snip": "happens"
	}, {
		"organization": "OCIO",
		"name": "Gov-Drupal",
		"snip": "happens"
	}, {
		"organization": "APHIS",
		"name": "Animal Disease Spread Model",
		"snip": "still happens"
	}]
}
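
As a rough illustration of the workaround above, the following Python sketch merges per-sub-agency inventories into one Agency-wide inventory. The function name `merge_inventories` is hypothetical, and the `project` key simply follows the examples in this comment rather than any published schema.

```python
import copy


def merge_inventories(agency, inventories):
    """Combine sub-agency inventories into one agency-wide inventory.

    Assumes each inventory has a top-level "organization" string and a
    "project" list of dicts, as in the examples in this comment.
    """
    merged = {"agency": agency, "organization": agency, "project": []}
    for inventory in inventories:
        if inventory["agency"] != agency:
            raise ValueError(f"unexpected agency: {inventory['agency']!r}")
        for project in inventory["project"]:
            # Copy so the caller's data is not mutated, then tag the
            # project with the sub-agency it came from.
            tagged = copy.deepcopy(project)
            tagged["organization"] = inventory["organization"]
            merged["project"].append(tagged)
    return merged


ocio = {
    "agency": "USDA",
    "organization": "OCIO",
    "project": [{"name": "DigitalGov Analytics"}, {"name": "Gov-Drupal"}],
}
aphis = {
    "agency": "USDA",
    "organization": "APHIS",
    "project": [{"name": "Animal Disease Spread Model"}],
}
agency_wide = merge_inventories("USDA", [ocio, aphis])
```
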

Anyone have another approach?

@okamanda
Contributor

@mattbailey0 and @michael-balint, I think @apyle is right. The organization field should probably be nested inside the projects object, right?

@apyle
Author

apyle commented Nov 21, 2016

@okamanda, a more radical restructuring might be in order. If we create organizations as an array, each holding its own array of projects, it would eliminate the duplication of organization in every project. It would also make it easier to bring in the different sub-agencies' inventories. Something along the lines of

{
	"agency": "USDA",
	"organizations": [{
		"name": "OCIO",
		"projects": [{
			"name": "DigitalGov Analytics",
			"snip": "happens"
		}, {
			"name": "Gov-Drupal",
			"snip": "happens"
		}]
	}, {
		"name": "APHIS",
		"projects": [{
			"name": "Animal Disease Spread Model",
			"snip": "will it ever end?"
		}]
	}]
}

This is definitely something for a new schema version and not addressing the immediate issue.
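
A minimal sketch of how the nested layout proposed above might be built from sub-agency inventories. This assumes the inputs still use the top-level `organization` and `project` keys from the first comment; `nest_by_organization` is a hypothetical name, not part of any code.gov tooling.

```python
def nest_by_organization(agency, inventories):
    """Build the proposed {"agency", "organizations": [...]} structure."""
    organizations = {}
    for inventory in inventories:
        # Group by organization name; dicts preserve first-seen order.
        entry = organizations.setdefault(
            inventory["organization"],
            {"name": inventory["organization"], "projects": []},
        )
        # Project dicts are reused as-is (not copied).
        entry["projects"].extend(inventory["project"])
    return {"agency": agency, "organizations": list(organizations.values())}


ocio = {
    "agency": "USDA",
    "organization": "OCIO",
    "project": [{"name": "DigitalGov Analytics"}, {"name": "Gov-Drupal"}],
}
aphis = {
    "agency": "USDA",
    "organization": "APHIS",
    "project": [{"name": "Animal Disease Spread Model"}],
}
nested = nest_by_organization("USDA", [ocio, aphis])
```

Because the organization name appears once per group, adding another sub-agency's inventory only appends one more entry to `organizations`.
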

@michael-balint
Contributor

Yes, if we're listing sub-agencies as organizations, it obviously wouldn't make sense to create duplicate organization keys. The latter suggestion (to nest projects in organizations and organizations in agencies) follows the DRY convention that we've set - otherwise, if we choose to list the organization in each project, we might as well also list the agency and keep everything denormalized for consistency's sake.
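
For reference, how duplicate keys behave depends on the parser, and the thread does not say which tool reported the error. The sketch below uses Python's standard `json` module, which keeps the last value silently by default; the `reject_duplicates` hook is a hypothetical helper that turns a repeated key into an error.

```python
import json

DUPLICATED = '{"agency": "USDA", "organization": "OCIO", "organization": "APHIS"}'

# The standard-library parser does not raise; the last value wins.
print(json.loads(DUPLICATED)["organization"])  # APHIS


def reject_duplicates(pairs):
    """object_pairs_hook that raises on a repeated key within one object."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


try:
    json.loads(DUPLICATED, object_pairs_hook=reject_duplicates)
except ValueError as error:
    print(error)  # duplicate key: 'organization'
```
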

@IanLee1521
Contributor

IanLee1521 commented Nov 24, 2016

I guess I'd thought of the agency + organization together as the key for the object. So that they wouldn't get collapsed, and instead would remain un-combined.

Therefore, from the original example above, the "USDA OCIO" and "USDA APHIS" inventories would each have their own projects and not actually get combined, at least in the code.json file.

@apyle
Author

apyle commented Nov 24, 2016

@IanLee1521, I'm not quite following you. Could you provide a code.json snippet to show me what you're thinking?

@IanLee1521
Contributor

Sure, I was thinking something like:

[{
    "agency": "USDA",
    "organization": "OCIO",
    "projects": [{
        "name": "DigitalGov Analytics",
        "snip": "happens"
    }, {
        "name": "Gov-Drupal",
        "snip": "happens"
    }]
}, {
    "agency": "USDA",
    "organization": "APHIS",
    "projects": [{
        "name": "Animal Disease Spread Model",
        "snip": "will it ever end?"
    }]
}]

@apyle
Author

apyle commented Nov 30, 2016

@IanLee1521, gotcha. That would work too and would be pretty simple to build from sub-agency inventories.

IanLee1521 added a commit to IanLee1521/code-gov-web that referenced this issue Dec 1, 2016
@IanLee1521
Contributor

IanLee1521 commented Dec 1, 2016

That's what I was thinking, though @mattbailey0 or others should probably chime in with the official answer.

FWIW, I created a simpler example over on: #196 (comment)

Edit: I should also add that I posted a sample code.json for DOE @LLNL projects which I generated from our @llnl/scraper tool as a gist for reference:

https://gist.github.com/IanLee1521/b7d7c0c2d8c24b10dd04edd5e8cab6c4

Note it is just the JSON object for DOE LLNL, and another process would have to append them into a JSON array.
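
The "another process" mentioned above could be as small as the following sketch. It is not the @llnl/scraper tool; `combine_into_array` is a hypothetical helper that assumes each input file holds a single JSON object, and it treats agency plus organization as the unique key, as suggested earlier in the thread.

```python
import json
from pathlib import Path


def combine_into_array(paths):
    """Load one inventory object per file and return them as a list."""
    inventories = []
    seen = set()
    for path in paths:
        inventory = json.loads(Path(path).read_text(encoding="utf-8"))
        if not isinstance(inventory, dict):
            raise ValueError(f"{path}: expected a single JSON object")
        # Treat (agency, organization) as the identity of an inventory.
        key = (inventory.get("agency"), inventory.get("organization"))
        if key in seen:
            raise ValueError(f"{path}: duplicate inventory for {key}")
        seen.add(key)
        inventories.append(inventory)
    return inventories
```

The result can be written out with `json.dumps(combine_into_array(paths), indent=4)`, where `paths` is whatever list of per-organization files is available.
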

lukad03 pushed a commit that referenced this issue Dec 12, 2016
@IanLee1521
Contributor

IanLee1521 commented Jan 11, 2017

FYI -- This was also mentioned just today:

The issue that v1.0.0 of the schema had with the organization field was resolved by moving organization into each of the objects in projects. I liked your suggestion to resolve the issue by allowing the code.json to have an array of json objects, but unfortunately, that's not where we shook out for v1.0.1 of the spec.
-- @michael-balint on #217 (comment)

Perhaps the change to move the organization key could be made in v1.1.0 of the schema? (Though according to semantic versioning, maybe that should actually appear in v2.0 of the schema).

Currently, v1.0.1 defines it as:

{
	"agency": "USDA",
	"projects": [{
		"organization": "OCIO",
		"name": "DigitalGov Analytics",
		...
	}, {
		"organization": "OCIO",
		"name": "Gov-Drupal",
		...
	}, {
		"organization": "APHIS",
		"name": "Animal Disease Spread Model",
		...
	}]
}
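
A minimal sketch of converting an inventory with a top-level organization into the shape shown above. The input is assumed to use the `projects` key, as in the array example earlier in the thread, and `move_organization_into_projects` is a hypothetical name rather than an official migration tool.

```python
def move_organization_into_projects(inventory):
    """Return a copy with "organization" moved onto each project."""
    organization = inventory.get("organization")
    projects = []
    for project in inventory.get("projects", []):
        updated = dict(project)  # shallow copy; nested values are shared
        if organization is not None:
            # Keep an organization that is already set on the project.
            updated.setdefault("organization", organization)
        projects.append(updated)
    # Carry over every other top-level key unchanged.
    converted = {
        key: value
        for key, value in inventory.items()
        if key not in ("organization", "projects")
    }
    converted["projects"] = projects
    return converted


old_style = {
    "agency": "USDA",
    "organization": "OCIO",
    "projects": [{"name": "DigitalGov Analytics"}, {"name": "Gov-Drupal"}],
}
new_style = move_organization_into_projects(old_style)
```
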

@mattbailey0
Contributor

Hi all - sorry to have been so absent on this discussion. A few things:

  • with the move to version 1.0.1 of the schema, we've triaged the immediate issue by moving organization into projects.
  • we are definitely seeing a use case for organizations or even teams within agencies that want to post their own code.json files. In some cases this is because it is hard from a workflow standpoint within the agency to automate a centralized collection of the data, in some cases this is because the organizations are actually pretty much independent of one another (e.g. R&D labs), and in some cases this is because you just have all-star teams who are really excited to participate but are within agencies that are slow adopters.
  • we're planning a more significant update to the schema in a couple of months that will be an opportunity to make a bigger change; in the meantime we are working with the agencies on nuts and bolts compliance and want to keep the schema as stable as possible - hence the triage approach.
  • based on the conversation to date, I really like both @apyle's and @IanLee1521's approaches. The big question, though, is going to be how much we actually care about organization. While we definitely want to support organizations posting their own inventories, it's not clear to me that org matters in terms of the UX. Most users don't care about the internal structure of agencies when trying to find software they can use or contribute to. So in terms of the data structure, we probably don't want to get to a place where we require agencies that post an agency-level inventory to tag repos to a particular organization (and what if a repo is maintained by several organizations?). So I think of organization as more of a namespacing issue.
  • this is a very long comment.

@IanLee1521
Contributor

Thanks @mattbailey0, some responses:

we're planning a more significant update to the schema in a couple of months that will be an opportunity to make a bigger change ...

I totally understand wanting to keep this stable. Perhaps a way to move forward into the 1.2 version of the schema and get input from the community would be to create the new schema file alongside the current version (in the source code repo, and not published to code.gov), so we could work on it in code rather than only in discussion.

... the big question though is going to be how much we actually care about organization ...

My 2 cents are that the schema should handle organizations under agencies within an agency's code.json file, even if that gets flattened to just the agency when visualized on code.gov. I think your point of this being primarily a namespacing issue is correct.

That would give independent organizations under an agency the flexibility to host their own code.json files, which the parent agency could then roll up from its child organizations to host the agency.gov/code.json file.

... we probably don't want to get to a place where we are requiring agencies that are posting an agency-level inventory to tag repos to a particular organization ...

My feeling is that would be solved by keeping the organization field optional, as it currently is in version 1.0.1.
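
To make the namespacing idea above concrete, here is a small sketch that treats `organization` as optional when building a display identifier. The `/` separator and the `namespaced_name` helper are assumptions for illustration, not anything code.gov defines.

```python
def namespaced_name(agency, project):
    """Build a display identifier; "organization" is treated as optional."""
    parts = [agency]
    organization = project.get("organization")
    if organization:
        parts.append(organization)
    parts.append(project["name"])
    return "/".join(parts)


inventory = {
    "agency": "USDA",
    "projects": [
        {"organization": "OCIO", "name": "Gov-Drupal"},
        {"name": "Animal Disease Spread Model"},
    ],
}
names = [namespaced_name(inventory["agency"], p) for p in inventory["projects"]]
print(names)
# ['USDA/OCIO/Gov-Drupal', 'USDA/Animal Disease Spread Model']
```

A project without an organization still gets a usable name, so an agency-level inventory is never forced to tag each repo.
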

@okamanda
Contributor

Resolved in version 1.0.1 of the schema

BalajiJBcs pushed a commit to BalajiJBcs/code-gov-web that referenced this issue Jan 3, 2018
6 participants