Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature openapi3 #479

Merged
merged 34 commits into from
Feb 12, 2020
Merged

Feature openapi3 #479

merged 34 commits into from
Feb 12, 2020

Conversation

wilko77
Copy link
Collaborator

@wilko77 wilko77 commented Feb 5, 2020

ok, this all started with me wanting to extend the clk upload endpoint to alternatively accept clk'n'blocks.

I think that the oneOf operator would be nice and elegant. However, this is only supported from openapi 3 onwards.

In version 3, the clks endpoint would then look like this:

'/projects/{project_id}/clks':
    post:
      operationId: entityservice.views.project.project_clks_post
      summary: Upload encoded PII data to a linkage project.

      description: ...

      parameters:
        - $ref: '#/components/parameters/project_id'
        - $ref: '#/components/parameters/token'
      requestBody:
        description: the clks as json
        required: true
        content:
          application/json:
            schema:
              oneOf:
                - $ref: '#/components/schemas/CLKUpload'
                - $ref: '#/components/schemas/CLKnBlockUpload'
          application/octet-stream:
            schema:
              type: string
              format: binary

Unfortunately, connexion is not helping.
Currently, if an endpoint is defined to accept more than one kind of content types, connexion
ignores this information and assumes that they are all application/json. See for instance here.
There is some work in progress that started in 2018 to address this.

So for now, I addressed this by creating a separate endpoint (binaryclks) to accept the binary format.

Another annoyance: I added the swaggerUI, thought that would be handy to play with, and read about the api. Unfortunately, connexion does some magic to the servers definition in the api spec such that (in our case) they disappear.

Can you have a look at this so far and tell me what you think?

@wilko77 wilko77 requested a review from hardbyte February 5, 2020 01:40
Copy link
Collaborator

@hardbyte hardbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is on the right track. Down with swagger, up with OpenAPI.

This endpoint can be used with the Content-Type: application/json and uses the `CLKUpload`
structure of a JSON array of base64 encoded strings.

### Binary Upload
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binary uploads using the existing endpoint will still be supported? If not this section can now be removed

'503':
$ref: '#/components/responses/RateLimited'

'/projects/{project_id}/binaryclks':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're feeling really brave you could setup nginx to inspect requests to POST /clks and if they have Content-Type: application/octet-stream send the request here.

description: Base64 encoded CLK data

CLKnBlockUpload:
description: Array of this party's Bloom Filters
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding more detail describing this object either in the endpoint docs, or the description here.

# TODO actually stream the upload data straight to Minio. Currently we can't because
# connexion has already read the data before our handler is called!
# https://github.com/zalando/connexion/issues/592
# stream = get_stream()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shame this hasn't been fixed :-/

@wilko77
Copy link
Collaborator Author

wilko77 commented Feb 7, 2020

Another connexion weirdness:

I defined the CLKnBlocks upload like this:

CLKnBlockUpload:
      description: Array of this party's Bloom Filters including blocking information
      type: object
      required: [clknblocks]
      properties:
        clknblocks:
          type: array
          items:
            type: array
            items:
              allOf:
                - type: string
                  format: byte
                  description: Base64 encoded CLK data
                - type: array
                  items:
                    oneOf:
                      - type: string
                      - type: integer

connexion is not impressed though:

'clknblocks':[['IugCCIixXAEECgS4LCUUREgswuUBaIAwiqJIQAAZqsCAjLYBCOkBABACgOWPpngwhloFgwDLEj1RIKMQAC1ALWIU5QQSqAQiQKBvZAIOKDFEK4igATRIAJcAGiRYGglYEBA1AjhAHFATu1VBRV49hEQIJDxVEisWDyQ0DJDaXIg=', ['000011010110100111100', '111100000101001010100', '200011010001000001001', '310000001000000000001', '400011001100111100010', '501000000100001101110', '610000000111001010010', '700001101011000000000', '800000000100001001010', '900000000101000101000', '1001000101000110000000', '1100000010110100000001', '1210011010000100010010', '1300100011110010100100', '1401000100100000000010', '1510010101100000011010', '1601000011010010010010', '1700110000000010001000', '1800000000000000000101', '1910000010000001110000', '2000000010010010011001', '2100000010010000000011', '2201100101110110110000', '2310010100000110001000', '2411001000000110000000', '2500100010000001000000', '2601110010010000101000', '2710001010100000100101', '2811000000111000000010', '2900010100000100100000']]]} is not valid under any of the given schemas

If I simplify the definition to this:

CLKnBlockUpload:
      description: Array of this party's Bloom Filters including blocking information
      type: object
      required: [clknblocks]
      properties:
        clknblocks:
          type: array
          items:
            type: array
            items:
              anyOf:
                - type: string
                  format: byte
                  description: Base64 encoded CLK data
                - type: string
                - type: integer

it works. However, this definition is not as clean, as it does not require to have either clk or block info.

@wilko77 wilko77 marked this pull request as ready for review February 10, 2020 23:22
@wilko77
Copy link
Collaborator Author

wilko77 commented Feb 10, 2020

OK, I officially declare this branch ready for review.

Main work:

  • I separated binary from json upload, because connexion does not support multiple content types This has the advantage, that for the binary upload endpoint, we can now define and enforce the extra headers.
  • the json upload endpoint expects either a CLKUpload or CLKnBlocksUpload request body. As described in the comment above, I am not totally happy with the CLKnBlocksUpload definition. If someone can tame connexion to accept a better format, I'm all up for it.

Copy link
Collaborator

@hardbyte hardbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge, but I have a few suggestions & questions inline.

@@ -3,7 +3,7 @@ bitmath==1.3.1.2
celery==4.4.0
clkhash==0.15.1
colorama==0.4.1 # required for structlog
connexion[swagger-ui]==2.6
connexion==2.6.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we saw warnings logged if we didn't include swagger-ui?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two options:

  1. disable swagger-ui.
con_app.add_api(pathlib.Path("openapi.yaml"),
                base_path='/',
                options={'swagger_ui': False},
                strict_validation=config.CONNEXION_STRICT_VALIDATION,
                validate_responses=config.CONNEXION_RESPONSE_VALIDATION)
  1. use swagger_ui. For this, we have to solve two problems. First the default url to the spec is not working in our case.
con_app.add_api(pathlib.Path("openapi.yaml"),
                base_path='/',
                options={'swagger_ui_config': {'url': '/api/v1/openapi.json'}},
                strict_validation=config.CONNEXION_STRICT_VALIDATION,
                validate_responses=config.CONNEXION_RESPONSE_VALIDATION)

Yay. Secondly, connexion removes the server description/url. (See above) Therefore you cannot actually use the UI to hit the endpoints.

Because 2 doesn't work, I favor 1.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in favor of option 1 👍

name: 'Confidential Computing, Data61 | CSIRO'
email: confidential-computing@csiro.au
description: >-
Allows two organisations to carry out private record linkage -
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Allows two organisations to carry out private record linkage -
Allows multiple organisations to carry out private record linkage -

`project_id` and a valid `upload_token` in order to contribute data.

The data uploaded must be of one of the following formats.
- CLKs only upload: An array of base64 encoded CLKs (./concepts.html#cryptographic-longterm-keys), one per
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use markdown for the link

tags:
- Project
description: |
An experimental api has been added for uploading CLKs as a binary file. This is to allow for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
An experimental api has been added for uploading CLKs as a binary file. This is to allow for
An experimental api for uploading CLKs as a binary file. This is to allow for

description: Base64 encoded CLK data

CLKnBlockUpload:
description: Array of this party's Bloom Filters including blocking information
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an object containing a single element which is an array. Do we need/want the extra layer just so you can say clknblocks?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the clks only, we upload a json of the form {"clks": [...]}.
Therefore I thought that for better separation, we will give the new format a new key.
That simplifies figuring out the format on the receiving end.
I also like to have the clks under a key and not top level, such that we can easily add meta info without breaking anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would still be nice for the description to be accurate - i.e. object rather than Array

@wilko77 wilko77 merged commit 81d6850 into develop Feb 12, 2020
@wilko77 wilko77 deleted the feature_openapi3 branch February 12, 2020 04:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants