Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Declaring the consolidated PPP extension #142

Closed
duncandewhurst opened this issue May 24, 2018 · 30 comments
Closed

Declaring the consolidated PPP extension #142

duncandewhurst opened this issue May 24, 2018 · 30 comments

Comments

@duncandewhurst
Copy link
Member

The technical guidance section of the docs includes the following statement;

To produce and validate OCDS for PPPs data, create an OCDS 1.1 file, and declare the consolidated OCDS for PPPs extensions as part of a release or record package.

There is a consolidated extension but I don't think there is currently a way of declaring it.

Instead the example data declares all the constituent extensions.

  "extensions": [
    "https://raw.githubusercontent.com/open-contracting/public-private-partnerships/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_process_title_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_location_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_requirements_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_budget_breakdown_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_budget_projects_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_documentation_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_metrics_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds-riskAllocation-extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds-shareholders-extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_finance_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_qualification_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_tariffs_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_performance_failures/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_contract_signatories_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_charges_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_transactions_relatedMilestone_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_bid_extension/master/extension.json",
    "https://raw.githubusercontent.com/open-contracting/ocds_milestone_documents_extension/master/extension.json"
  ],

That approach is messy and also makes it harder for publishers to pin to a specific version of the PPP profile.

Is there a way we can adapt the build process for the profile so we end up with an extension.json for the consolidated extension built somewhere?

(I think that would also involve pulling together the codelists from all the constituent extensions into one folder.)

Thoughts @jpmckinney @timgdavies @kindly ?

@jpmckinney
Copy link
Member

jpmckinney commented May 24, 2018

It seems reasonable to build a single extension as part of the build process. This may simply involve changing/copying the file paths (and anything that refers to the current paths) of the built ppp-extension.json and consolidated CSV files in compiledCodelists.

The apply_extensions method writes compiledCodelists and *-extension.json. We can either leave them where they are, or put them in, e.g. profile/release-schema.json and profile/codelists, and add a profile/extension.json file.

@timgdavies
Copy link
Contributor

Agreed this should be fairly straightforward, and happy to have a look at it this week.

@duncandewhurst
Copy link
Member Author

duncandewhurst commented Jun 7, 2018

This may simply involve changing/copying the file paths (and anything that refers to the current paths) of the built ppp-extension.json and consolidated CSV files in compiledCodelists.

The codelist CSVs in compiledCodelists are patched version of the base codelists, but for the consolidated ppp extension, we need to collect together the codelists from each extension which makes up the PPP profile (i.e. not the patched codelists, just the extensions)

The extension codelists are currently written to docs/codelists by apply-extensions.py but were missing the codelists from the ppp-extension itself, so I've:

To try and avoid confusing the terminology, I've used 'consolidatedExtension' rather than profile, so that there is a distinction between ppp extension in this repo, all the extensions which make up the profile (the consolidated extension) and the profile itself (the extended schema, codelists and docs)

Note: the build will fail on the consolidated-extension branch unless you update your local requirements file to pull in the profile-consolidated-extension branch of the documentation-support repo.

To do:

@duncandewhurst duncandewhurst self-assigned this Jun 7, 2018
@timgdavies
Copy link
Contributor

I've been reviewing this (and it's taken a while to unpick where things are happening in all the refactored scripts: @jpmckinney I wonder if we wouldn't be better recognising that profiles are still an evolving idea, and we don't want to over-optimise at this stage for maintenance whilst we're still learning how they will evolve?)

With apologies for adding issues late in the day, and hopeful most should be fairly easy to resolve:

  • (1) With the new way (I think) the extension registry has been set-up, I'm not sure that having the consolidated extension, and the bespoke pieces of PPP extension in the same repository now fully works (if version information is being picked up from Git repos, this forces us to version the two things together, which we may not want to do)

  • (2) The consolidated extension.json doesn't 100% reflect the codelists we want to have in the PPP extension. For example, it ends up pulling in a '+documentType.csv', and then straight after, replacing that completely with 'documentType.csv'. In this case, the ordering means we should get the anticipated behaviour (just using the PPP extensions documentType codelist), but I'm not sure we should be including the '+documentType.csv' at all here (accepting, as we have, that OCDS for PPPs does remove content in a way that other extensions would not be permitted to do).

  • (3) I'm not certain right now how far we can pin to certain versions of extensions? If an extension changes - do we risk pulling that change into the PPP profile on a re-build? What controls will we need on doing this?

On (1) my proposal would be to have a separate ocds_ppp_profile_specific_extension with a README that basically says 'don't use this - use the profile!' and then to make the extension.json and release-schema.json in the root of this repo into the consolidated extension, with the README explaining this is a frozen snapshot of a set of other extensions. I.e. you can use them independently if you want, but to do OCDS for PPPs, this is the extension you should declare and use.

In terms then of the location of the consolidated extension, it would be in this repo, and we wouldn't point to it anywhere else for now.

(There is one other issue to work out there re: translation of extensions schema... but let's leave that for the moment).

@timgdavies
Copy link
Contributor

@duncandewhurst If you get chance to give thoughts on the above, we can try and perhaps chat later in the week to work out full plan?

@jpmckinney
Copy link
Member

jpmckinney commented Jun 12, 2018

To respond to my mention: In terms of the refactor, there is only one new dependency (documentation-support), which moved code that previously resided in both the standard repository and this repository (or, in the case of apply_extensions, exclusively in this repository) into a single repository. The main difference to that code was to add more comments. As described in #121 (which I carefully reviewed on a call with @duncandewhurst), the refactor was a very fruitful exercise that fixed a large number of issues.

The only other big change was to move to Make commands instead of one long script (see open-contracting/standard#666). The Make commands are described in the handbook. Rather than read the Make files, I highly recommend simply running Make commands with -n (i.e. --dry-run), which will show the commands that would be run if -n were omitted.

@duncandewhurst
Copy link
Member Author

@timgdavies I'm in favour of separating out the ppp extension and ppp profile into separate repositories as I think having one repo being two extensions is confusing.

Regarding versioning, I might be missing out on a nuance here, but I think the controls are:

  1. apply-extensions.py is run manually, so we can run the build without necessarily pulling in updated extensions, though that doesn't give us the flexibility to only pull in updates to a particular extension

  2. Copying the consolidated extension to the _static folder in the built version of the profile and referencing that, rather than the copy in repository, in the docs and the extension registry means we can strictly version it along with the profile (when we start doing that)

Regarding codelists, I think documentType is a special case, because the ppp extension includes a full copy of the final codelist, whereas for all other codelists it only includes added or removed codes, as an extension normally would, so it might be a case of updating apply-extensions.py so that +documentType isn't copied over (there are already some other special cases for documentType in the script)

So, yes, let's talk it through.

@duncandewhurst
Copy link
Member Author

apply-extensions.py is run manually, so we can run the build without necessarily pulling in updated extensions, though that doesn't give us the flexibility to only pull in updates to a particular extension

I forgot about the extension_versions key in conf.py which pins the ppp build to certain versions of extensions.

Currently most extensions are pointing at the master branch, so we may want to create a release and pin to it once the current round of updates to the PPP extension are complete (and now that there are a few different publishers working on implementations)

@duncandewhurst
Copy link
Member Author

@jpmckinney I'm trying to check that declaring the consolidated extension works in the validator, but the build for the consolidated-extension branch is failing, I think due to a test:

=================================== FAILURES ===================================
tmp/test_json.py:769: assert 1 == 0

I ran the tests locally against the public-private-partnerships repo and test_indent fails on schema/consolidatedExtension/extension.json, so I ran ocdskit indent against that file, but the test still fails. Can you help?

@duncandewhurst duncandewhurst changed the title Declaring the PPP extension Declaring the consolidated PPP extension Jun 14, 2018
@jpmckinney
Copy link
Member

jpmckinney commented Jun 14, 2018

Ah, yeah, the error message for that test is not helpful, and the more verbose error message is getting mixed with a lot of informative warning messages. I'll update the scripts to prefix "ERROR:" to error messages. Anyhow, the error message is:

ERROR: release-schema.json has unused codelists: votingRights.csv, preQualificationStatus.csv, bidStatistics.csv, riskAllocation.csv, financeCategory.csv, financeType.csv, riskCategory.csv, responseSource.csv, relatesTo.csv, chargePaidBy.csv, bidStatus.csv, dataType.csv

This is because, in an extension, the tests first discover which codelists are in the repository, and then check whether the schema actually makes use of all of them. This caught some real errors in the past. Anyhow, in OCDS for PPPs, the logic (for now anyway) needs to be a bit different, as the release-schema for the PPP extension and PPP profile are in the same repo. Once we move the PPP extension out of the PPP profile, we may be able to restore such logic.

Once I merge open-contracting/standard-maintenance-scripts#102 you can re-run the build on Travis, and tests should pass.

@duncandewhurst
Copy link
Member Author

@jpmckinney thanks for that, it's still failing on a different test though:

ERROR: release-schema.json has mismatch between enum and codelist at /properties/tag; added {'qualification', 'shortlist', 'qualificationUpdate', 'qualificationCancellation', 'qualificationAmendment', 'shortlistUpdate'}

These codes are added to releaseTag.csv by the qualification extension, but added to the enum for release/tag by the PPP extension. I'm not sure if splitting them across the two repositories was intentional.

I tried moving the changes to the enum into the qualification extension repository, which stops that test failing, but then the qualification extension build fails with:

AssertionError: Exception: unexpectedly overwrites /properties/tag/items/enum /home/travis/build/open-contracting/ocds_qualification_extension/release-schema.json

Presumably because the tests don't allow additions to a closed codelist (noting that this is probably another reason the extension is described as incompatible with OCDS 1.X in it's description).

I also tried the other way round (moving +releaseTag.csv from the qualification extension into the PPP extension) but the PPP build still fails with:

ERROR: release-schema.json has mismatch between enum and codelist at /properties/tag; added {'shortlistUpdate', 'qualificationCancellation', 'qualificationUpdate', 'qualificationAmendment', 'shortlist', 'qualification'}

Any ideas on a solution?

I've closed the PR on the qualification extension because I think it might make more sense to keep changes which a normal extension wouldn't be permitted to make in one place (the ppp extension)

@duncandewhurst
Copy link
Member Author

Noting that with json_merge_patch it seems you can't patch an enum and instead need to replace the whole list

@jpmckinney
Copy link
Member

jpmckinney commented Jun 15, 2018

The broad issue is that the tests don't anticipate a consolidated extension. I think it may be easiest to setup the PPPs extension and PPPs profile in the way that we think is correct, and then I can update the tests to work with the new arrangement. In the meantime, if it helps, you can comment out lines in .travis.yml. We can still run the tests manually during development.

I forget if it's stated anywhere, but if I remember correctly, extensions shouldn't modify closed codelists. I don't think the test scripts fully cover this, as the qualifications extension passes the tests, while still having +releaseTag.csv. I created open-contracting/standard-maintenance-scripts#103 to follow up on that.

In sum, we'll probably need to make a special case in the tests to allow OCDS for PPPs to modify the closed codelist.

@duncandewhurst
Copy link
Member Author

#148 is ready to review and merge (open-contracting-extensions/ocds_qualification_extension#23 and open-contracting-archive/documentation-support#10 should also be merged)

@jpmckinney
Copy link
Member

jpmckinney commented Jun 18, 2018

Are we still planning on moving the root release-schema.json and extension.json files into their own repository? I think it will be cleaner and clearer than having both an extension and a profile expressed in the same repository.

If we make that change, I can propose some changes to the layout of this repo that I think will improve the new consolidated extension pattern, for both this profile and future profiles. I can also more easily restore common tests.

@duncandewhurst
Copy link
Member Author

Yes we are, but we can do that at a later stage, since it won't affect how users declare the profile now that we have created the consolidated extension.

Since deploying the consolidated extension is blocking providing feedback to a couple of publishers, that's the immediate priority and we can work one splitting out the repository afterwards

@jpmckinney
Copy link
Member

jpmckinney commented Jun 18, 2018

I'm thinking that if we move the extension, then we can simply have the 'consolidated extension' in the root of the profile – the same as how an extension is in the root of its repo. However, moving the consolidated extension would change how it's declared.

It doesn't seem like it would add too much time to split it out - would it?

We can schedule a call to better understand the level of urgency for providing feedback.

@duncandewhurst
Copy link
Member Author

The consolidated extension is copied to the build/_static directory so the URL the user declares isn't dependent on the structure of the Github repo (e.g. http://standard.open-contracting.org/profiles/ppp/master/en/_static/consolidatedExtension/extension.json).

Regarding splitting out the extension, it shouldn't be too much work, but if we keep this repository as the profile/consolidated extension it will involve:

  • Creating a new repository for the PPP extension
  • Updating apply-extensions.py in the documentation-support repo so it no longer processes the repo it is called from and to change the location the consolidated extension is built to
  • Updating apply-extension.py in this repo to add the new extension repo
  • Updating/writing readmes
  • Updating the makefiles so that the consolidated extension is copied from the root of the repository to the build/_static/consolidatedExtension folder (I will need your assistance for this bit!)
  • Updating the extensions registry

@duncandewhurst
Copy link
Member Author

duncandewhurst commented Jun 19, 2018

@jpmckinney pointed out that:

http://standard.open-contracting.org/profiles/ppp/latest/en/_static/consolidatedExtension/extension.json

is quite a long URL and we should aim to have something similar to the schema URLs used in the core standard, e.g

http://standard.open-contracting.org/schema/1__1__3/release-schema.json

Since the PPP profile doesn't yet have a version number yet we can't construct exactly that type of URL, however in the core standard the schema files also reside at http://standard.open-contracting.org/latest/en/release-schema.json (this is the URL at which the schema is referenced from the docs and the location from which it is copied to /schema/{{version}}/ in this step of the deployment process).

So, for consistency we should make the consolidated extension available at:

http://standard.open-contracting.org/profiles/ppp/latest/en/extension.json

I've staged updates to the docs and sample data on the update-extension-URL branch based on using the above URL, so I think the process to get this round of updates deployed now looks like:

@Bjwebb and @timgdavies if you're able to work on this and get it deployed today that would be great.

@jpmckinney
Copy link
Member

Sounds good, @duncandewhurst, and thanks for summarizing our call. I agree with this process.

Bjwebb added a commit that referenced this issue Jun 19, 2018
Thsi copies files there to the root directory of the docs.
@Bjwebb
Copy link
Contributor

Bjwebb commented Jun 19, 2018

Update the update-extension-URL branch so that the contents of the schema/consolidatedExtension directory are copied to http://standard.open-contracting.org/profiles/ppp/latest/en/ @Bjwebb

Done in 3422070, we now have http://standard.open-contracting.org/profiles/ppp/update-extension-URL/en/extension.json which will become http://standard.open-contracting.org/profiles/ppp/latest/en/extension.json when this goes live.

This is done via adding a line to the Sphinx conf.py, which I think is the easiest way for now. We'll need something different to handle translation of the schema, but I'm assuming this isn't a blocker for this going live, as we don't have this currently.

@timgdavies
Copy link
Contributor

I'm just working through steps here. I've hit one issue with extension.json missing initiationType.csv which I'm adding on a branch and testing before we're ready for the deploy.

I wasn't sure it this was generated by apply-extension.py or not (my understanding is not?) - but in testing re-running apply-extension.json I was getting major diffs against the master branch, which was unexpected.

I thought this might be down to not all places in the apply-extension.py script using orderedDict, but even using a commit that handles for that, I get major diffs. The other possible explanation I can see for this is the order in which extensions are applied?

As I understand we are now deferring to the extension registry for this - instead of a list within the profile. We seem to have really broken the principle of reproducible builds and profiles with all the abstraction and refactoring that has taken, and I think will have some work to discuss and hopefully untangle this.

@timgdavies
Copy link
Contributor

Ok. I've merged the changes I need. Over to @Bjwebb for the final deploy.

@jpmckinney
Copy link
Member

@duncandewhurst I'm working on a fix, which will be available shortly.

This issue can be swiftly corrected locally by using an OrderedDict when specifying the extensions to merge in conf.py, which had recently changed from a list to a dict.

In general, the best way to ensure no hiccups is to slow things down. Duncan and I had a good relationship working through some of the challenges and uncertainties around profiles, so I was happy to proceed in a fashion that would achieve a quicker result, with the risk that things break for short periods of time. I am happy with the process thus far.

As I’ve tried to point out before, the apply_extensions method is complex, but it is just as complex as it was before it was moved to documentation-support (in fact, it’s a little better documented than it was). There is no significant new abstraction or complexity.

I would be eager to work out kinks in the process with Duncan, who has been deep in this work and who I believe will have fruitful observations and experiences to share. I trust that we can work out any issues.

@jpmckinney
Copy link
Member

jpmckinney commented Jun 19, 2018

See #155 for a fix.

I also noticed that I couldn't reproduce and hadn't encountered this issue, because I use Python 3.6, where dicts are insertion ordered. (In Python 3.7, to be released soon, insertion order will become a language feature, not only an implementation detail.)

@jpmckinney
Copy link
Member

Also, to clarify, profiles don't defer any logic to the extension registry. Previously, profiles would defer the choice of version of each extension to the registry. This is no longer the case.

@Bjwebb
Copy link
Contributor

Bjwebb commented Jun 20, 2018

This is deployed live now http://standard.open-contracting.org/profiles/ppp/latest/en/

@Bjwebb Bjwebb closed this as completed Jun 20, 2018
@jpmckinney
Copy link
Member

🎉

@jpmckinney
Copy link
Member

@duncandewhurst What updates did you anticipate to the standard development handbook (in your checklist)? You can create an issue there if appropriate.

Also, is the checklist item about the spreadsheet template logged as an issue in the CRM (or elsewhere)? Just want to be sure we don't forget it.

@duncandewhurst
Copy link
Member Author

@jpmckinney I've made the handbook updates in open-contracting/standard-development-handbook#178 and updated the spreadsheet template (simply changing the extensions key in the meta tab)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants