Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open API spec nullability inconsistencies with Airbyte API responses #10774

Closed
azhard opened this issue Mar 1, 2022 · 6 comments
Closed

Open API spec nullability inconsistencies with Airbyte API responses #10774

azhard opened this issue Mar 1, 2022 · 6 comments
Labels
area/platform issues related to the platform autoteam community frozen Not being actively worked on team/compose type/bug Something isn't working

Comments

@azhard
Copy link
Contributor

azhard commented Mar 1, 2022

Enviroment

  • Airbyte version: 0.30.22-alpha
  • OS Version / Instance: GCP n2.
  • Deployment: GCP Compute Engine
  • Source Connector and version: N/A
  • Destination Connector and version: N/A
  • Severity: Medium
  • Step where error happened: Deploy

Current Behavior

As a follow up to #7466, I've noticed that there are further inconsistencies with the OpenAPI spec and a few different object schemas where an object should be marked as nullable because the upstream requests to the Airbyte API that return those schemas pass null in for those objects.

Specific example looking at OperatorDbt and OperatorRead schemas.
Definition from the open API spec:

    OperationRead:
      type: object
      required:
        - operationId
        - name
        - operatorConfiguration
        - workspaceId
      properties:
        workspaceId:
          $ref: "#/components/schemas/WorkspaceId"
        operationId:
          $ref: "#/components/schemas/OperationId"
        name:
          type: string
        operatorConfiguration:
          $ref: "#/components/schemas/OperatorConfiguration"

    OperatorConfiguration:
      type: object
      required:
        - operatorType
      properties:
        # Instead of this type field, we would prefer a json schema "oneOf" but unfortunately,
        # the jsonschema2pojo does not seem to support it yet: https://github.com/joelittlejohn/jsonschema2pojo/issues/392
        operatorType:
          $ref: "#/components/schemas/OperatorType"
        normalization:
          $ref: "#/components/schemas/OperatorNormalization"
        dbt:
          $ref: "#/components/schemas/OperatorDbt"

    OperatorDbt:
      type: object
      required:
        - gitRepoUrl
      properties:
        gitRepoUrl:
          type: string
        gitRepoBranch:
          type: string
        dockerImage:
          type: string
        dbtArguments:
          type: string

Note that OperatorDbt is not marked as nullable and dbt isn't required for OperatorConfiguration.

When I make a request to the v1/operations/list endpoint I get back a list of operations for a connection that only has one operation with basic normalization enabled, I get this response JSON:

{
   "operations":[
      {
         "workspaceId":"X",
         "operationId":"X",
         "name":"Normalization",
         "operatorConfiguration":{
            "operatorType":"normalization",
            "normalization":{
               "option":"basic"
            },
            "dbt":null
         }
      }
 

So dbt is being returned here and its value is null. Pulling this into an OperatorConfiguration object from the generated client library causes an exception to be thrown because dbt is included but set to null even though it was never marked as non-nullable.

An example of a fix is the work I did in #10107 to fix nullable connection schedules in the Open API spec. However an alternative would be in the Airbyte API itself, don't return non-required objects when their value is null.

Logs

N/A

Steps to Reproduce

  1. Create client library using generator tool, e.g. https://github.com/OpenAPITools/openapi-generator
  2. Get an operation using code snippet like this one
from airbyte_api.airbyte_api import Client
from airbyte_api.airbyte_api.api.operation import get_operation
from airbyte_api.airbyte_api.models import OperationIdRequestBody

client = Client(
    base_url=f"http://localhost:8000/api",
    timeout=20,
    verify_ssl=False,
)

get_operation.sync(
    client=client,
    json_body=OperationIdRequestBody(
        operation_id="X"
    ),
)
  1. Observe error

Are you willing to submit a PR?

Sure but would like some guidance about how to approach as I imagine this is an issue for most if not all of the schemas in the spec.

@azhard
Copy link
Contributor Author

azhard commented Mar 17, 2022

@alafanechere can I bug you to take a look at this as you approved my last semi-related PR

@alafanechere
Copy link
Contributor

Hi @azhard I exactly encountered the same problem using an Open API generated client for the octavia-cli project.
I did not want to make too many changes to the open API spec while working on this project, so I disabled the check on the return types from the client (e.g

return self._search_fn(self.api_instance, self.search_payload, _check_return_type=check_return_type)
).

@alafanechere
Copy link
Contributor

An example of a fix is the work I did in #10107 to fix nullable connection schedules in the Open API spec. However an alternative would be in the Airbyte API itself, don't return non-required objects when their value is null.

If it's a blocker for you I'd suggest you to open a PR to set the nullable fields on the endpoint you need to work with. I agree that it could be better to not return non-required objects but I need to escalate this to the rest of the team to understand the impact.

@azhard
Copy link
Contributor Author

azhard commented Mar 29, 2022

Thanks for getting back to me @alafanechere, for reference we've implemented our own version of octavia_cli to version control and enable CI/CD of our Airbyte configs / deployments, so that's where we're running into issues.

Right now we're getting around the inconsistencies by making some manual changes to the generated open api client which is enough for now. Would be great if it could be fixed upstream (server side) or in the spec across the board though!

@alafanechere
Copy link
Contributor

@pmossman shared guidance here on how to fix it server side.

@igrankova igrankova added the area/platform issues related to the platform label Jun 7, 2023
@sh4sh sh4sh removed needs-triage team/tse Technical Support Engineers labels Oct 3, 2023
@bleonard bleonard added the frozen Not being actively worked on label Mar 22, 2024
@timroes
Copy link
Collaborator

timroes commented Apr 4, 2024

Closing this as no longer planned, since we deprecated Custom dbt transformations (#34860).

@timroes timroes closed this as not planned Won't fix, can't repro, duplicate, stale Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform issues related to the platform autoteam community frozen Not being actively worked on team/compose type/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

8 participants