Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Collect the first feedback on documentation bundling approaches #183019

Closed
4 tasks done
maximpn opened this issue May 9, 2024 · 10 comments
Closed
4 tasks done
Assignees
Labels
Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.15.0

Comments

@maximpn
Copy link
Contributor

maximpn commented May 9, 2024

Epic: https://github.com/elastic/security-team/issues/9407

Summary

Introduction of a versioned router allows to version Kibana's API endpoints with different granularity. It has a direct impact on bundling OpenAPI specifications (OAS) for API documentation purposes since documentation reflects the version. Each bundle should group API endpoints of the same version.

The goal of this ticket is to collect the first feedback from Docs Engineering and Kibana Core teams on OAS bundling approaches after opening PRs for

  • bundling all Security Solution OAS into one spec
  • bundling each Security Solution domain OAS (like Endpoints, Detection Engine etc) into each individual spec

Details

Kibana is a large monorepo consisting of different plugins. From business point of view it's separated into Enterprise Search, Observability and Security solutions. Currently Kibana's API endpoints are explicitly versioned (as it's required for zero downtime Serverless upgrades) so public endpoints have version 2023-10-31 (there is only one version at the moment) and internal endpoints have version 1 (there is only one version at the moment too). We consider only public Kibana API endpoints for the purpose of public API documentation.

API consumers have to specify API version via an HTTP header elastic-api-version as it's required by versioned router. OAS doesn't provide a way to declare different request and response schemas per version in one OAS document when an HTTP header like elastic-api-version is used to discriminate API versions . Technically it requires creation of OAS per API endpoint version.

Docs Engineering team prefers to have a simple and straightforward approach to consume Kibana OAS. One of the ideas is to have a single OAS per Kibana so it can be uploaded to the documentation platform. Obviously this approach isn't feasible taking into account limitations of OAS regarding API versioning via elastic-api-version HTTP header.

Additionally there is some discrepancy for schema's info.version between OpenAPI specification and practical usage. Official documentation states that info.version is an OAS document version

The version of the OpenAPI document (which is distinct from the OpenAPI Specification version or the API implementation version).

while the majority of practical examples (1, 2, 3) use info.version as an API implementation version.

Security Solution relies on OAS for code generation (Zod schemas, TS types, API clients) where schema's info.version is used as an API implementation version which is currently 2023-10-31. Taking into account this approach it's definitely problematic to bundle the all Kibana OAS because

  • Conflicts are inevitable (Docs Engineering needs to have a single OAS file). For example Id schemas which is number in one solution and string it the other.
  • Bumping API version for one of the solutions e.g. Observability will cause bumping version for all solutions even there are no changes. Engineers will have to manually spec and/or routes to add a new version. Users will be confused since there are no difference in the API docs. Additionally Serverless provides separate deployment per solution.

Taking above into account OAS bundles should be segregated. The main question is "What should be granularity of this segregation?". Wether bundles should be segregated by solution (e.g. Observability or Security) or domain (e.g. Security Endpoint or Security Detection Engine).

The goal of this ticket is to get is to collect the first feedback from Docs Engineering and Kibana Core teams on OAS bundling approaches where

  • The first approach to produce multiple OAS bundles per Security Solution domain
  • The second approach to produce a single OAS bundle for Security Solution

From DX point of view it's convenient and logical to have a bundle per API domain. Each domain/bundle would version independently.

Action items

  • Create a PR to produce a single OAS bundle for Security Solution (PR)
  • Create a PR to produce multiple OAS bundles per Security Solution domain (PR)
  • Request review
  • Summarize the feedback
@maximpn maximpn added Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Project:Serverless Work as part of the Serverless project for its initial release v8.15.0 labels May 9, 2024
@maximpn maximpn self-assigned this May 9, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@jloleysens
Copy link
Contributor

jloleysens commented May 13, 2024

Taking above into account OAS bundles should be segregated. The main question is "What should be granularity of this segregation?". Wether bundles should be segregated by solution (e.g. Observability or Security) or domain (e.g. Security Endpoint or Security Detection Engine).

This is a good question. Technically I don't see a problem with either approach for security solution, but my gut feel is that starting with a single bundle per version would be simpler.

Taking a step back: would it make sense to bundle Kibana Core generated OAS and your existing OAS into a single file for Kibana?

We still need to persist generated OAS to a file somewhere, and once it is we should probably bundle all our JSON together into a single, global file. Maybe a new location like <kibana-root>/oas_docs/<version>_bundle.json. This will likely need to all form part of the same sequence of steps where:

(1) We generate OAS from Kibana /api/oas to some file (per this issue)
(2) We take any post-steps (like bundling in security solution), we don't currently support generating OAS from Zod

WDYT @lcawl @maximpn ?

@lukeelmers
Copy link
Member

From DX point of view it's convenient and logical to have a bundle per API domain. Each domain/bundle would version independently.

Separate bundles would enable independent versioning, but not require it. At this point I don't think it is likely we will have a future with small subsets of APIs being versioned separately as their own domain. The plan as of now is still to just have solution-level domain groupings (platform + 3 solutions). Whether or not they live as separate bundles is an implementation detail, as long as they aren't being versioned in a way that's out of sync with each other.

would it make sense to bundle Kibana Core generated OAS and your existing OAS into a single file for Kibana?

I think a "post-build" step like this worth exploring, especially if having a single OAS file makes things easier for the docs team.

@maximpn
Copy link
Contributor Author

maximpn commented May 14, 2024

Hi @jloleysens and @lukeelmers,

Thanks for the review!

I see your point that you lean towards solution bundles and justification has valid points. At the same time domain bundles might be much more flexible. Let me share my thoughts

  • We can always merge domain bundles into bigger bundles if there are no conflicts between them (shared schemas may have the same name in two or more documents which causes conflicts). For example run bundler on domain bundles.
  • API documentation UX. @lcawl shared example API documentation page with API groups which are actually domains. It's achieved by using OAS tags. Looking at the whole picture any more or less large API documentation should look like that with separation APIs by domain. I guess the main problem in uploading a bundle for each domain/section separately is documentation platform implementation.

image

  • Domain bundles will be smaller in size. It might be beneficial in ownership and manual reviewing bundles when necessary.
  • In case there is a necessity for a team to bundle API version due to incompatible API changes it will be possible to make it per domain, per solution or per whole Kibana.

With bigger scope bundles there is a question who is gonna be pinged when all Kibana or solution bundle is changed? Who is gonna review it? An answer may be we don't need to review it since it's an automatically produced artifact but in some rare case we'll have troubles with that.

On top of that I'd like to highlight how ESS documentation looks like. Security Solution APIs are listed by domains.

image

It looks like we need an extra folding level. So documentation doesn't have to present the whole Kibana APIs but rather split them to core, platform and solutions.


And there are answers on questions above.

Taking a step back: would it make sense to bundle Kibana Core generated OAS and your existing OAS into a single file for Kibana?

To answer this question it's good to understand why it might be beneficial to bundle Kibana Core and Security Solution generated OAS bundles. I see the following

  • If we follow the same as ESS API documentation approach documentation platform should be able to display subcategories under Kibana like Core, Observability or Security. In this case merging bundles don't give anything.
  • Serverless is deployed per project so it's solution API's + core API's. In this case merging bundles will be necessary to display documentation on one page if documentation platform won't support uploading bundles individually.

(1) We generate OAS from Kibana /api/oas to some file (per #181992 issue)

What will /api/oas output? Is it the whole Kibana's OAS?

@jloleysens @kbn/openapi-bundler supports functionality you need in #181992 via custom x- properties.

@lcawl
Copy link
Contributor

lcawl commented May 14, 2024

OAS doesn't provide a way to declare different request and response schemas per version in one OAS document when an HTTP header like elastic-api-version is used to discriminate API versions . Technically it requires creation of OAS per API endpoint version.

I don't think this is accurate, since when I generate the automated output related to #180056, I happily see the versioning represented in the header parameter and the request and response bodies. For example:

...
"parameters": [
          {
            "in": "header",
            "name": "elastic-api-version",
            "description": "The version of the API to use",
            "schema": {
              "type": "string",
              "enum": [
                "2023-10-31"
              ],
              "default": "2023-10-31"
            }
          },
...
        "requestBody": {
          "content": {
            "application/json; Elastic-Api-Version=2023-10-31": {
              "schema": {
                "anyOf": [
                  {
                    "type": "object",
...

From a docs point of view, this level of version-specific detail is great! Per side discussions with @jloleysens, if we want to be able to say which versions support specific API parameters, we might need to request support for a new custom extension like "x-availability-since" or something like that, but otherwise I'm happy with the automated Kibana API docs progress so far.

On a related note, I don't see the header parameter values or version qualifiers in the request/response bodies in either #183021 or #183026 yet, but ideally they'd contain the same level of detail.

On top of that I'd like to highlight how ESS documentation looks like. Security Solution APIs are listed by domains.

The tool that we'll use to publish our API documentation can group the endpoints either by tag or else automatically by deriving a grouping from the path. So far we've been advocating that teams explicitly add tags rather than relying on auto-generated groups. In particular, we advise that teams "define a set of tags that group related endpoints in a meaningful way, such as by feature or logical objects. Use human-readable tag names and add tag descriptions". Right now in #183021 or #183026 the tags don't align with the types of groupings in the old docs (e.g. tags are "Rules API", "Import/Export API", etc whereas it would be more consistent with what we're doing in other areas to use tag names like "detections", "exceptions" or maybe even "security detections", etc) @natasha-moore-elastic would likely know this content area better than I do to offer specific tag name suggestions.

Taking a step back: would it make sense to bundle Kibana Core generated OAS and your existing OAS into a single file for Kibana?

That's what I'm currently doing to get a single Kibana Serverless OpenAPI document for documentation purposes and it would be great to have this accomplished automatically. I use the redocly join tool, which can qualify any components with duplicated names, but there are likely lots of alternative bundle/join tools. From a docs point of view, we'd just ask that the end document (1) be a valid OAS file, and (2) pass our linting rules for the minimum things required for good docs. If you're taking multiple generated and non-generated files and joining them together automatically, it likely makes sense to have the linting performed on each team's individual file and any failures fixed at that stage rather than having any open-endedness about who "owns" the final file.

If we follow the same as ESS API documentation approach documentation platform should be able to display subcategories under Kibana like Core, Observability or Security...

Our plan at this point is for the docs to categorize APIs via tags that reflect the feature or domain, rather than the solution.

Serverless is deployed per project so it's solution API's + core API's. In this case merging bundles will be necessary to display documentation on one page if documentation platform won't support uploading bundles individually.

Ditto previous answer. At this time we do not intend to subdivide the Kibana Serverless API docs by solution. If there are errors that occur if you try to run an Kibana Security (serverless) API against a Search or Observability project, my expectation is that the OpenAPI document would contain that error code and a clear explanation.

If I've missed addressing any questions in this thread, happy to meet and discuss further.

@maximpn
Copy link
Contributor Author

maximpn commented May 17, 2024

@lcawl

I tried different variants in Bump.sh to discover it's capabilities. Bump.sh allows to create hubs and additionally groups API documentations inside hubs via Hub category

image

Domain level documentation

I used hubs and hub category to create Kibana (Serverless) example documentation based on [Security Solution][POC] Bundle Security Solution OAS per domain.

image

Screen.Recording.2024-05-16.at.12.41.18.mov

Versions for Detections Engine API documentation are handled via branches

version 2023-10-31
image

version 2024-05-15
image

Solution level documentation

I also created a solution level documentation example by created Kibana (Serverless) hub based on [Security Solution][POC] Bundle Security Solution OAS(without using hub categories).

Screen.Recording.2024-05-16.at.12.42.45.mov

version 2023-10-31
image

version 2024-05-15
image

Please don't pay attention to Observability API documentation since it's just for grouping demonstration purposes.

Why are there just two API endpoints having 2024-05-15 version?

You may notice that there are just two endpoints with a new 2024-05-15 version. This reflects real Kibana server behavior. There are route handler registrations for /api/detection_engine/rules/_import and /api/detection_engine/rules/_find with the new 2024-05-15 version. It becomes clear that the bigger domain the harder to bump its version. It's gonna be immense amount of work to go over all API endpoints even in Security Solution.

This is important to understand that solution level API versioning can easily lead to bad DX where hundreds files need to be changed even if there are not so much changes. Check out how many files added just for bumping two API endpoints version.

cc @jloleysens @lukeelmers

@maximpn
Copy link
Contributor Author

maximpn commented May 17, 2024

@lcawl

I don't think this is accurate, since when I generate the automated output related to #180056, I happily see the versioning represented in the header parameter and the request and response bodies.

I'm afraid there is misunderstanding here. Let me elaborate my arguments.

OAS Response Media type

Checking OAS specification you may find Response Object documentation. It has content field which has Map[string, [Media Type Object](https://swagger.io/specification/v3/#media-type-object)] type that is a map from Media type to Media Type Object. There are some well known and popular media types like application/json, text/html and text/xml.

I doubled checked the documentation and it states that Media types can have a parameter like type/subtype;parameter=value.

Considering OAS produced by /api/oas we get media types like application/json; Elastic-Api-Version=2023-10-31. It means there is a media type application/json with Elastic-Api-Version parameter.

Taking into account Kibana's API versioning works via HTTP header Elastic-Api-Version it has nothing in common with Media types. It just adds a discrepancy between real behavior and OAS describing this behavior. For example importing such an OAS to Postman gives me what you can see below

image

You can notice that version is duplicated twice. In fact Media type parameter Elastic-Api-Version gets ignored by Kibana server and only the HTTP header Elastic-Api-Version matters.

When Media type parameter approach does not work?

I added a breaking change example with a necessary version bump to the PRs above ([Security Solution][POC] Bundle Security Solution OAS and [Security Solution][POC] Bundle Security Solution OAS per domain). It demonstrates really close to real life API changes where

  • /api/detection_engine/rules/_import changed request body media type from multipart/form-data to application/json and changed response as well
  • /api/detection_engine/rules/_find changed it's query and response

It may be possible to find some tricks (maybe with x-availability-since) to describe changes to /api/detection_engine/rules/_find. But I don't see how can Media type parameters guarantee that a specific request body corresponds to a specific response body in /api/detection_engine/rules/_import.

Let's look at an example:

version 2023-10-31

/api/detection_engine/rules/_import:
    post:
      ...
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                file:
                  type: string
                  format: binary
                  description: The `.ndjson` file containing the rules.
      ...
      responses:
        200:
          description: Indicates a successful call.
          content:
            application/json:
              schema:
                type: object
                additionalProperties: false
                required:
                  - exceptions_success
                  - exceptions_success_count
                  - exceptions_errors
                  - rules_count
                  - success
                  - success_count
                  - errors
                  - action_connectors_errors
                  - action_connectors_warnings
                  - action_connectors_success
                  - action_connectors_success_count
                properties:
                  exceptions_success:
                    type: boolean
                  exceptions_success_count:
                    type: integer
                    minimum: 0
                  exceptions_errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'
                  rules_count:
                    type: integer
                    minimum: 0
                  success:
                    type: boolean
                  success_count:
                    type: integer
                    minimum: 0
                  errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'
                  action_connectors_errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'
                  action_connectors_warnings:
                    type: array
                    items:
                      $ref: '../../model/warning_schema.schema.yaml#/components/schemas/WarningSchema'
                  action_connectors_success:
                    type: boolean
                  action_connectors_success_count:
                    type: integer
                    minimum: 0

and a new added version 2024-05-15

/api/detection_engine/rules/_import:
    post:
      ...
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: array
              items:
                $ref: '../../model/rule_schema/rule_schemas.schema.yaml#/components/schemas/RuleCreateProps'
      ...
      responses:
        200:
          description: Indicates a successful call.
          content:
            application/json:
              schema:
                type: object
                additionalProperties: false
                required:
                  - imported_rules_count
                  - success
                  - errors
                properties:
                  imported_rules_count:
                    type: integer
                    minimum: 0
                  success:
                    type: boolean
                  errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'

You can notice that request body and response have changed. Request body became application/json Media type in 2024-05-15 version and response has changed as well.

When it's described with Media type parameter Elastic-Api-Version it becomes

/api/detection_engine/rules/_import:
    post:
      ...
      requestBody:
        required: true
        content:
          'multipart/form-data; Elastic-Api-Version=2023-10-31':
            schema:
              type: object
              properties:
                file:
                  type: string
                  format: binary
                  description: The `.ndjson` file containing the rules.
          'application/json; Elastic-Api-Version=2024-05-15':
              schema:
                type: array
                items:
                  $ref: '../../model/rule_schema/rule_schemas.schema.yaml#/components/schemas/RuleCreateProps'
      ...
      responses:
        200:
          description: Indicates a successful call.
          content:
            'application/json; Elastic-Api-Version=2023-10-31':
              schema:
                type: object
                additionalProperties: false
                required:
                  - exceptions_success
                  - exceptions_success_count
                  - exceptions_errors
                  - rules_count
                  - success
                  - success_count
                  - errors
                  - action_connectors_errors
                  - action_connectors_warnings
                  - action_connectors_success
                  - action_connectors_success_count
                properties:
                  exceptions_success:
                    type: boolean
                  exceptions_success_count:
                    type: integer
                    minimum: 0
                  exceptions_errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'
                  rules_count:
                    type: integer
                    minimum: 0
                  success:
                    type: boolean
                  success_count:
                    type: integer
                    minimum: 0
                  errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'
                  action_connectors_errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'
                  action_connectors_warnings:
                    type: array
                    items:
                      $ref: '../../model/warning_schema.schema.yaml#/components/schemas/WarningSchema'
                  action_connectors_success:
                    type: boolean
                  action_connectors_success_count:
                    type: integer
                    minimum: 0
            'application/json; Elastic-Api-Version=2024-05-15':
              schema:
                type: object
                additionalProperties: false
                required:
                  - imported_rules_count
                  - success
                  - errors
                properties:
                  imported_rules_count:
                    type: integer
                    minimum: 0
                  success:
                    type: boolean
                  errors:
                    type: array
                    items:
                      $ref: '../../model/error_schema.schema.yaml#/components/schemas/ErrorSchema'

The latter example makes it impossible (according to OAS) to match specific request body (version 2023-10-31 or 2024-05-15) to a specific response body (version 2023-10-31 or 2024-05-15) since it represents many to many relationship.

Taking all the above into account using Media type parameter Elastic-Api-Version looks like a hack to overcome bump.sh's limitations in documenting different API versions. It adds discrepancy between OAS and real Kibana's behavior which makes it harder to reuse the same OAS to consume Kibana's API (for example via Postman). On top of that It reduces documentation UX forcing users to track which request body version corresponds to which response body. Ideal UX would be to have tabs shown for each API endpoint having multiple versions where each tab's content shows only specific API version request/response.

FYI @jloleysens

@lcawl
Copy link
Contributor

lcawl commented May 17, 2024

Taking all the above into account using Media type parameter Elastic-Api-Version looks like a hack to overcome bump.sh's limitations in documenting different API versions...

To be clear, none of the guidelines we're proposing for publishing our API docs are at all limited by who we're partnering with. We are very excited about this opportunity to fully document our APIs to a degree that we've never managed to date. We'll get to address some long-standing user pain points and be strategic about how we handle the ever-growing API docs landscape and increasingly divergent versioning strategy.

The difficulty of relating specific requests and responses to query/header parameter values seems to be a long-standing issue (e.g. https://stackoverflow.com/questions/74305458/openapi-3-x-x-having-different-response-with-the-change-of-optional-parameters). I'm hopeful that it will be revisited in the API signature improvements in Moonwalk, but until then mapping Elastic-Api-Version values to requests and responses with matching qualifiers seems sufficient from a docs point of view.

@banderror
Copy link
Contributor

@maximpn Could you please summarize the feedback and the decisions made and close this ticket?

@maximpn
Copy link
Contributor Author

maximpn commented Jun 3, 2024

After discussing the first feedback on documentation bundling approaches with Docs Engineering and Kibana Core teams the following decisions have been made

  • API reference documentation for Kibana (Serverless) will be generated from a single Kibana OpenAPI spec (OAS) file. Security Solution OAS needs to be merged/bundled with the rest Kibana OAS to produce a single Kibana OAS.

@lcawl (Docs Engineering team) highlighted that it's an already made decision on their side to have a whole Kibana OAS. Kibana Core team doesn't have objections here. Kibana Core team's effort in this direction is made in #183338.

  • API versioning granularity and API reference documentation have to be considered independently as suggested by @lukeelmers.
  • API versioning approach should be discussed broader by creating an RFC which will be handled by @elastic/security-detection-rule-management team.
  • To avoid any potential version conflicts currently Security Solution will produce OAS with versions represented as MIME type parameter like Elastic-Api-Version=<public version>, e.g.
responses:
    '200':
      content:
        application/json; Elastic-Api-Version=2023-10-31:
          schema:
            ...

@maximpn maximpn closed this as completed Jun 3, 2024
@banderror banderror removed the Project:Serverless Work as part of the Serverless project for its initial release label Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.15.0
Projects
None yet
Development

No branches or pull requests

6 participants