Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

airbyte-lib: Connector documentation #33063

Merged
merged 24 commits into from
Jan 23, 2024

Conversation

flash1293
Copy link
Contributor

@flash1293 flash1293 commented Dec 4, 2023

Add capabilities to effectively document airbyte-lib-enabled connectors to our doc system:

  • Check the metadata whether the connector is published on pypi and adds an icon to the availability row in the header decoration
  • Introduce the components AirbyteLibExample and SpecSchema which can be used in documentation files to render a code sample on how to use the connector and a collapsible element describing the full config schema

The new components are picked up by a new remark plugin that's injecting the spec of the connector as a prop to the component so it can be used to render out a generated sample config / the spec schema.

This adds the following dependencies:

  • json-schema-faker to generate an example config object only based on the schema
  • @headlessui/react for the collapsible implementation
  • sanitize-html to safely render HTML from the schema in the docs

See the following connector doc pages for examples:

  • Google Drive: Complex config
  • Stripe: Medium config
  • Apify Dataset: Simple config

Copy link

vercel bot commented Dec 4, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 23, 2024 9:43am

@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Dec 4, 2023
@flash1293 flash1293 changed the base branch from master to flash1293/update-docusaurus December 4, 2023 15:30
Base automatically changed from flash1293/update-docusaurus to master December 11, 2023 16:03
@flash1293 flash1293 changed the title Documentation changes for airbyte_lib airbyte_lib: Connector documentation Jan 8, 2024
@flash1293 flash1293 changed the title airbyte_lib: Connector documentation airbyte-lib: Connector documentation Jan 8, 2024
@flash1293
Copy link
Contributor Author

TODO: Reimplement json schema rendering

@nataliekwong
Copy link
Contributor

Honestly looks way better designed than the out of the box solution! Looks great.
Some minor things I don't feel we have control over, but maybe worth noting: fields are listed in order of appearance (I think), and that means that we list optional fields before required ones. We also mix the schema configuration with the connector configuration settings, so a user is context switching between the two "types" of configuration options.

@flash1293
Copy link
Contributor Author

flash1293 commented Jan 17, 2024

Thanks @nataliekwong !

fields are listed in order of appearance (I think), and that means that we list optional fields before required ones

We can actually control this, added some logic to show the required ones first (this is the advantage of using a custom-built solution here)

We also mix the schema configuration with the connector configuration settings, so a user is context switching between the two "types" of configuration options

Not sure I follow, could you elaborate on that point?

@nataliekwong
Copy link
Contributor

Oh nice, thanks for reorganizing! I think it may have resolved itself (in this example, at least) with that change:
Google-Drive-Airbyte-Documentation

I think previously the streams were listed between two other settings, but now it's at the top (Should it be at the bottom? That would then break the "list required fields at the top" rule). I think of the connector's overall configuration settings differently from the streams, because I'd expect the connector configuration to determine the streams I have available. But perhaps that's an incorrect assumption in the case of airbyte-lib. Anyway, following that assumption, I'd then expect to set the connector-level settings before defining specific streams.

@flash1293
Copy link
Contributor Author

@timroes I adopted @nora-airbyte s design and made it more table-y:
Screenshot 2024-01-18 at 15 46 30

IMHO it looks pretty cool. Could you give it a review?


## Usage with airbyte-lib

For document file type streams, make sure the `tesseract` and `pdftotext` libraries are installed. On MacOS, you can install them with `brew install poppler tesseract`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we putting MacOS specifically here as an example to install those, but no other system?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just an example, I will remove it from this PR before merging (only the pure implementation of the components will go on, rolling it out to connectors on a separate PR)


for record in result.cache.streams["my_stream:name"]:
print(record)`} </CodeBlock>
<p>You can find more information in the airbyte_lib quickstart guide.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this quickstart guide already exist, so we could link to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't yet - I will work on this on a separate PR. As mentioned, this component will not be rendered yet, in this PR I just want to get the first version of the components ready.

@timroes
Copy link
Collaborator

timroes commented Jan 18, 2024

Generally I feel this looks pretty good. The only thing irking me a bit is how we display arrays. We somehow nest a items[x] key once you expand the actual array, so it looks like the following:

2024-01-18_09-06

Specifically for arrays of primitive datatypes like string[] this looks imho more confusing then helpful:

2024-01-18_09-09

Is there any chance for simple arrays to render it as array<string> as a type and don't have the collapsed items[x]. Also for complex ones I think we should not have the separate items[x], since imho it makes it more confusing and looks like the path would be streams.items[0] to access something. Could we just render them as array<object> and have the child properties directly nested when expanding the array?

@flash1293
Copy link
Contributor Author

Thanks @timroes !

Your suggestions for arrays make a lot of sense to me - I will implement that.

@flash1293
Copy link
Contributor Author

@timroes Done - it's now showing the properties of the array items directly. I did a similar thing for constant values - they are rendered as the "type" now:
Screenshot 2024-01-19 at 17 40 28

@flash1293
Copy link
Contributor Author

If you are OK with the basic component I will remove the actual usages so we can get the pre-requisite in.

@octavia-squidington-iii octavia-squidington-iii removed the area/documentation Improvements or additions to documentation label Jan 23, 2024
@flash1293 flash1293 marked this pull request as ready for review January 23, 2024 09:41
@flash1293 flash1293 merged commit 84ce3b1 into master Jan 23, 2024
24 of 26 checks passed
@flash1293 flash1293 deleted the flash1293/airbyte_lib_documentation branch January 23, 2024 13:41
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants