feat: create README.mdx generation command #59

Merged

@jvallesm jvallesm commented Feb 21, 2024

What is the purpose of this PR?

Because

  • Component docs get outdated easily and keeping them up-to-date is a (mostly) mechanical task.

This PR

  • Introduces compogen, a README.mdx generation tool (and potentially more) for components.

compogen

compogen is a generation tool for Instill AI component schemas. It uses the
information in a component schema to automatically generate the component
documentation.

Installation

git clone https://github.com/instill-ai/component
cd component/tools/compogen
go install .

Generate the documentation of a component

compogen can generate the README of a component by reading its schemas. The
format of the documentation is MDX, so the generated files can directly be used
in the Instill AI website.

compogen readme path/to/component/config path/to/component/README.mdx

Validation & guidelines

In order to successfully build the README of a component, the definitions.json
and tasks.json files must be present in the component configuration directory.

The definitions.json file must contain an array with one object in which the
following fields must be present and comply with the following guidelines:

  • id.
  • title.
  • description - It should contain a single sentence describing the component.
    The template will use it next to the component title ({{ .Title }}{{ .Description }}.) so it must be written in third person, present tense.
  • version - Must be valid SemVer 2.0.0.
  • type - Connector definitions must contain this field and its value must
    match one of the (string) values defined in protobufs.
  • available_tasks - This array must have at least one value, which should be
    one of the root-level keys in the tasks.json file.
  • source_url - Must be a valid URL. It must not end with a slash, as the
    definitions path will be appended.
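
The guidelines above can be sketched as a small Go check. This is a minimal sketch, not the validation code shipped in compogen: the `definition` struct, the `validateDefinition` function and the `taskKeys` parameter are hypothetical names, the SemVer pattern is the one suggested at semver.org, and the connector-only `type` check is left out because it depends on the protobuf enum values.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// definition mirrors only the fields listed above. It is a hypothetical
// struct for illustration, not the type used by compogen.
type definition struct {
	ID             string   `json:"id"`
	Title          string   `json:"title"`
	Description    string   `json:"description"`
	Version        string   `json:"version"`
	AvailableTasks []string `json:"available_tasks"`
	SourceURL      string   `json:"source_url"`
}

// semverRe is the regular expression suggested at semver.org for SemVer 2.0.0.
var semverRe = regexp.MustCompile(`^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$`)

// validateDefinition collects every guideline violation into a single error.
// taskKeys holds the root-level keys found in tasks.json.
func validateDefinition(def definition, taskKeys map[string]bool) error {
	var problems []string
	if def.ID == "" {
		problems = append(problems, "id is required")
	}
	if def.Title == "" {
		problems = append(problems, "title is required")
	}
	if def.Description == "" {
		problems = append(problems, "description is required")
	}
	if !semverRe.MatchString(def.Version) {
		problems = append(problems, fmt.Sprintf("version %q is not valid SemVer 2.0.0", def.Version))
	}
	if len(def.AvailableTasks) == 0 {
		problems = append(problems, "available_tasks must have at least one value")
	}
	for _, t := range def.AvailableTasks {
		if !taskKeys[t] {
			problems = append(problems, fmt.Sprintf("task %q is not a root-level key in tasks.json", t))
		}
	}
	u, err := url.Parse(def.SourceURL)
	switch {
	case err != nil || u.Scheme == "" || u.Host == "":
		problems = append(problems, "source_url must be a valid URL")
	case strings.HasSuffix(def.SourceURL, "/"):
		problems = append(problems, "source_url must not end with a slash")
	}
	if len(problems) > 0 {
		return errors.New(strings.Join(problems, "; "))
	}
	return nil
}

func main() {
	// Placeholder values, not taken from a real component.
	def := definition{
		ID:             "example",
		Title:          "Example",
		Description:    "does something useful",
		Version:        "v1.0", // invalid: leading "v" and missing patch number
		AvailableTasks: []string{"TASK_EXAMPLE"},
		SourceURL:      "https://example.com/component/",
	}
	fmt.Println(validateDefinition(def, map[string]bool{"TASK_EXAMPLE": true}))
}
```

Reporting all problems at once, rather than stopping at the first one, is a design choice that suits a generation tool: a component author can fix the whole file in one pass.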

Certain optional fields modify the document behaviour:

  • public, when true, will set the draft property to false.
  • For connector components, the content of prerequisites will be displayed in
    an info block next to the resource configuration details.
    • Note that this section only applies when a connector is being documented,
      i.e. when the --connector flag is passed.
  • A table will be built for the spec.resource_specification properties. They
    must contain an instillUIOrder field so the row order is deterministic.
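
The reason an explicit order field is needed can be shown with a short Go sketch: JSON object properties decode into a Go map, and Go map iteration order is not stable between runs. The `property` struct and the `orderedKeys` function below are hypothetical names, and the flat JSON shape is an assumption made for illustration, not the exact layout of spec.resource_specification.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// property keeps only the field that drives row ordering. A pointer is used
// so that a missing instillUIOrder can be told apart from an explicit 0.
type property struct {
	Order *int `json:"instillUIOrder"`
}

// orderedKeys returns the property names sorted by instillUIOrder, falling
// back to the name itself when two properties share the same order value.
func orderedKeys(raw []byte) ([]string, error) {
	var props map[string]property
	if err := json.Unmarshal(raw, &props); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(props))
	for name, p := range props {
		if p.Order == nil {
			return nil, fmt.Errorf("property %q has no instillUIOrder", name)
		}
		keys = append(keys, name)
	}
	sort.Slice(keys, func(i, j int) bool {
		oi, oj := *props[keys[i]].Order, *props[keys[j]].Order
		if oi != oj {
			return oi < oj
		}
		return keys[i] < keys[j]
	})
	return keys, nil
}

func main() {
	// Placeholder properties, not taken from a real connector.
	raw := []byte(`{
		"organization": {"instillUIOrder": 1},
		"api_key": {"instillUIOrder": 0}
	}`)
	keys, err := orderedKeys(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(keys)
}
```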

TODO

  • Support oneOf schemas for resource properties, present in, e.g., the Airbyte
    or the REST API connectors.
  • In the "supported tasks" tables, provide better documentation for nested
    arrays and objects (currently the type doesn't support nesting).
  • If task definitions contain examples for the (required) input and output
    fields, generate param samples as in https://github.com/instill-ai/instill.tech/blob/dedaaa3/docs/v0.12.0-beta/vdp/ai-connectors/openai.en.mdx#L148
  • Implement a way to inject extra sections if a component needs further
    documentation (e.g. by adding a doc.json file with a structured array that
    describes the position and content of the new section).

Next steps

  • compogen validate might be used to validate any component configuration.
  • compogen new [--operator] might be used to generate the skeleton of a component.

@jvallesm jvallesm self-assigned this Feb 21, 2024
linear bot commented Feb 21, 2024

codecov bot commented Feb 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 46.15%. Comparing base (52804d4) to head (ef16230).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #59      +/-   ##
==========================================
+ Coverage   45.89%   46.15%   +0.26%     
==========================================
  Files           6        6              
  Lines        1120     1118       -2     
==========================================
+ Hits          514      516       +2     
+ Misses        517      513       -4     
  Partials       89       89              
Flag Coverage Δ
unittests 46.15% <100.00%> (+0.26%) ⬆️


Adds a command to generate the README file from the `definitions.json`
schema of a component. The command generates basic, dummy
documentation with some hardcoded parts. The main purpose of this commit
is to define the basic structure of the command.
@jvallesm jvallesm force-pushed the jvalles/ins-3584-create-component-readme-through-templates branch 4 times, most recently from e5740ba to 59233f0 Compare February 22, 2024 14:44
@jvallesm jvallesm force-pushed the jvalles/ins-3584-create-component-readme-through-templates branch 2 times, most recently from 9cd941c to b7de654 Compare February 23, 2024 11:07
@jvallesm jvallesm force-pushed the jvalles/ins-3584-create-component-readme-through-templates branch from b7de654 to 3d67a10 Compare February 23, 2024 11:43
@jvallesm jvallesm marked this pull request as ready for review February 23, 2024 11:51
@jvallesm jvallesm force-pushed the jvalles/ins-3584-create-component-readme-through-templates branch from 969789d to 383e14d Compare February 23, 2024 12:06
Avoids stuttering like in:

```
Archetype AI AI connector resources can be created in two ways:
```
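
One way to avoid this kind of stutter is sketched below. The commit's actual change is not shown on this page, so this is only an illustration under assumptions: `joinWithoutStutter` is a hypothetical helper, and it assumes the heading is built by joining the component title with a suffix such as "AI connector".

```go
package main

import (
	"fmt"
	"strings"
)

// joinWithoutStutter joins title and suffix with a space, dropping the first
// word of suffix when it repeats the last word of title (case-insensitive).
func joinWithoutStutter(title, suffix string) string {
	t := strings.Fields(title)
	s := strings.Fields(suffix)
	if len(t) > 0 && len(s) > 0 && strings.EqualFold(t[len(t)-1], s[0]) {
		s = s[1:]
	}
	return strings.Join(append(t, s...), " ")
}

func main() {
	fmt.Println(joinWithoutStutter("Archetype AI", "AI connector") + " resources can be created in two ways:")
}
```

The comparison works on whole words, so a title such as "OpenAI" is left untouched because its last word is not exactly "AI".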
@jvallesm jvallesm merged commit c814c05 into main Feb 26, 2024
10 checks passed
@jvallesm jvallesm deleted the jvalles/ins-3584-create-component-readme-through-templates branch February 26, 2024 06:25
jvallesm added a commit to instill-ai/operator that referenced this pull request Feb 26, 2024
# What is the purpose of this PR?

![CleanShot 2024-02-23 at 12 31
08](https://github.com/instill-ai/operator/assets/3977183/efc49ebc-4023-4845-9ed8-d79bb44eb82a)

Because

- Component docs get outdated easily and keeping them up-to-date is a
(mostly) mechanical task.

This PR

- Leverages `compogen readme` (see
instill-ai/component#59) to automatically
generate a `README.mdx` file for each operator.

## Excluded operators

- `start` and `end`, as their structure is special and shouldn't be
treated as operators
- `image`, as `$ref` schema isn't supported by `compogen` yet

## Next steps / improvements

The main goal of `compogen` is minimizing the maintenance cost of the
component docs at
[instill.tech](https://www.instill.tech/docs/v0.10.0-beta/vdp/operators/json).
That's why the output is an MDX document. However, having a Markdown
version of the document would be beneficial for developers using this
repository, removing the need to access an external website (or offering
the rendered Markdown in GitHub at the root of the component).
Transforming MDX to Markdown should be
[feasible](https://www.npmjs.com/package/mdx-to-md) but I didn't manage
to automate the process.

It would be interesting to add input / output code samples, either by
adding an example field at the task level in `tasks.json` or by checking
if all the required fields have an `example` field.
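
The second option (checking that every required field has an `example`) could look roughly like the Go sketch below. The shape assumed for `tasks.json` (task names as root-level keys, each with `input` and `output` schemas holding `required` and `properties`) is an assumption, and `tasksReadyForSamples` and the types around it are hypothetical names, not part of `compogen`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// field keeps only the example attached to a property, if any.
type field struct {
	Example json.RawMessage `json:"example"`
}

// schema is the assumed subset of a task input or output schema.
type schema struct {
	Required   []string         `json:"required"`
	Properties map[string]field `json:"properties"`
}

// task is the assumed shape of each root-level entry in tasks.json.
type task struct {
	Input  schema `json:"input"`
	Output schema `json:"output"`
}

// hasExamples reports whether every required property carries an example.
func hasExamples(s schema) bool {
	for _, name := range s.Required {
		p, ok := s.Properties[name]
		if !ok || len(p.Example) == 0 {
			return false
		}
	}
	return true
}

// tasksReadyForSamples returns, in alphabetical order, the tasks whose
// required input and output fields all have an example.
func tasksReadyForSamples(raw []byte) ([]string, error) {
	var tasks map[string]task
	if err := json.Unmarshal(raw, &tasks); err != nil {
		return nil, err
	}
	var ready []string
	for name, t := range tasks {
		if hasExamples(t.Input) && hasExamples(t.Output) {
			ready = append(ready, name)
		}
	}
	sort.Strings(ready)
	return ready, nil
}

func main() {
	// Placeholder tasks, not taken from a real component.
	raw := []byte(`{
		"TASK_WITH_EXAMPLES": {
			"input":  {"required": ["text"], "properties": {"text": {"example": "hello"}}},
			"output": {"required": ["size"], "properties": {"size": {"example": 5}}}
		},
		"TASK_WITHOUT_EXAMPLES": {
			"input":  {"required": ["text"], "properties": {"text": {}}},
			"output": {}
		}
	}`)
	ready, err := tasksReadyForSamples(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ready)
}
```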

If this proves useful and production-ready, we can implement a GitHub
Action that generates the documents automatically when changes are
introduced in `tasks.json` or `definitions.json`. We can even push these
changes to the website repo.
donch1989 pushed a commit that referenced this pull request Feb 29, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.12.0-beta](v0.11.0-beta...v0.12.0-beta)
(2024-02-27)


### Features

* create README.mdx generation command
([#59](#59))
([c814c05](c814c05))
* extract task title generation
([#58](#58))
([52804d4](52804d4))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
namwoam pushed a commit to namwoam/component that referenced this pull request Jun 24, 2024
# What is the purpose of this PR?

![CleanShot 2024-02-23 at 12 31
08](https://github.com/instill-ai/operator/assets/3977183/efc49ebc-4023-4845-9ed8-d79bb44eb82a)

Because

- Component docs get outdated easily and keeping them up-to-date is a
(mostly) mechanical task.

This PR

- Introduces `compogen`, a `README.mdx` generation tool (and potentially
more) for components.

# `compogen`

`compogen` is a generation tool for Instill AI component schemas. It
uses the
information in a component schema to automatically generate the
component
documentation.

## Installation

```shell
git clone https://github.com/instill-ai/component
cd component/tools/compogen
go install .
```

## Generate the documentation of a component

`compogen` can generate the README of a component by reading its
schemas. The
format of the documentation is MDX, so the generated files can directly
be used
in the Instill AI website.

```shell
compogen readme path/to/component/config path/to/component/README.mdx
```

### Validation & guidelines

In order to successfully build the README of a component, the
`definitions.json`
and `tasks.json` files must be present in the component configuration
directory.

The `definitions.json` file must contain an array with one object in
which the
following fields must be present and comply with the following
guidelines:

- `id`.
- `title`.
- `description` - It should contain a single sentence describing the
component.
  The template will use it next to the component title (`{{ .Title }}{{
.Description }}.`) so it must be written in third person, present tense.
- `version` - Must be valid SemVer 2.0.0.
- `type` - Connector definitions must contain this field and its value
must
match one of the (string) values defined in
[protobufs](https://github.com/instill-ai/protobufs/blob/main/vdp/pipeline/v1beta/connector_definition.proto).
- `available_tasks` - This array must have at least one value, which
should be
  one of the root-level keys in the `tasks.json` file.
- `source_url` - Must be a valid URL. It must not end with a slash, as
the
  definitions path will be appended.

Certain optional fields modify the document behaviour:

- `public`, when `true`, will set the `draft` property to `false`.
- For connector components, the content of `prerequisites` will be
displayed in
  an info block next to the resource configuration details.
  - Note that this section only applies when a connector is being
    documented, i.e. when the `--connector` flag is passed.
- A table will be built for the `spec.resource_specification`
properties. They
must contain an `instillUIOrder` field so the row order is
deterministic.

## TODO

- Support `oneOf` schemas for resource properties, present in, e.g., the
[Airbyte](https://github.com/instill-ai/connector/blob/main/pkg/airbyte/v0/config/definitions.json#L15)
or the [REST
API](https://github.com/instill-ai/connector/blob/main/pkg/restapi/v0/config/definitions.json#L26)
connectors.
  - We might leverage some Go implementation of JSON schema. Some
    candidates:
    - [santhosh-tekuri/jsonschema](https://pkg.go.dev/github.com/santhosh-tekuri/jsonschema/v5#Schema)
    - [omissis/go-jsonschema](https://github.com/omissis/go-jsonschema/blob/934012d/pkg/schemas/model.go#L107)
    - [invopop/jsonschema](https://github.com/invopop/jsonschema/blob/a446707/schema.go#L14)
    - [swaggest/jsonschema-go](https://pkg.go.dev/github.com/swaggest/jsonschema-go#Schema)
  - The schema loading carried out by the `component/base` package in
    `LoadConnectorDefinitions` or `LoadOperatorDefinitions` might also be
    useful, although it is oriented to transforming the data to a
    `structpb.Struct` rather than to defining the object structure.
- In the "supported tasks" tables, provide better documentation for
nested
  arrays and objects (currently the type doesn't support nesting).
- If task definitions contain examples for the (required) input and
output
fields, generate param samples as in
https://github.com/instill-ai/instill.tech/blob/dedaaa3/docs/v0.12.0-beta/vdp/ai-connectors/openai.en.mdx#L148
- Implement a way to inject extra sections if a component needs further
  documentation (e.g. by adding a `doc.json` file with a structured array
  that describes the position and content of the new section).

## Next steps

- `compogen validate` might be used to validate any component
configuration.
- `compogen new [--operator]` might be used to generate the skeleton of
a component.
namwoam pushed a commit to namwoam/component that referenced this pull request Jun 24, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.12.0-beta](instill-ai/component@v0.11.0-beta...v0.12.0-beta)
(2024-02-27)


### Features

* create README.mdx generation command
([instill-ai#59](instill-ai#59))
([c814c05](instill-ai@c814c05))
* extract task title generation
([instill-ai#58](instill-ai#58))
([52804d4](instill-ai@52804d4))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).