Update Pipelines folder structure #31525

Merged
merged 38 commits into master from ella-ben/pipelines/refactor-folder-struct
Oct 18, 2023

Conversation

@erohmensing erohmensing commented Oct 17, 2023

Overview

This is an update to the folder/file structure of our pipelines project.

The goal of this change is to improve readability and context sharing for new and existing developers.

However, this is NOT an update to any logic or code. That is out of scope for now.

We want to merge this fast so we don't end up in rebase hell.

Changes

These are the broad, high-level changes and the reasoning behind them.

1. The colocation of pipelines and commands

We believe the pipeline and command concepts map 1:1, both currently and in the future.

Ideally we would be able to merge the abstractions (like in aircmd), but for now we believe a good first step is to hold them in the same location.

This allows

  1. The folder structure to show what commands are available
  2. A clear pairing between the pipeline code that is run and the CLI arguments that feed it
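To illustrate the first point, here is a minimal sketch of how the colocated layout makes the available commands discoverable from the folder structure alone. It assumes every command folder holds a commands.py next to its pipeline.py; the helper names (discover_commands, build_demo_tree) and the tiny demo tree are hypothetical and not part of this PR.

```python
import tempfile
from pathlib import Path


def discover_commands(root: Path) -> list[str]:
    """Return one entry per folder under root that contains a commands.py."""
    found = []
    for commands_file in root.rglob("commands.py"):
        relative = commands_file.parent.relative_to(root)
        # "connectors/test" on disk reads as the invocation "connectors test".
        found.append(" ".join(relative.parts))
    return sorted(found)


def build_demo_tree(root: Path) -> None:
    """Create a small, made-up subset of the layout for illustration only."""
    for folder in ("connectors", "connectors/test", "connectors/publish", "metadata"):
        package = root / folder
        package.mkdir(parents=True, exist_ok=True)
        (package / "commands.py").write_text("# CLI arguments live here\n")
        (package / "pipeline.py").write_text("# pipeline code lives here\n")


with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    build_demo_tree(demo_root)
    available = discover_commands(demo_root)

print(available)
```

Because each commands.py sits beside its pipeline.py, the same walk also tells a reader where the code behind each command lives.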

2. A top level folder for CLI entry points

Even though commands are located beside their respective pipelines, the entry point to airbyte-ci should be visible and as close to the root as possible.

This allows

  1. Any new developer to know where to start tracing code
  2. A single area to collect CLI helpers like telemetry and custom click classes

3. The reduction of large files into smaller single purpose files

This makes the code easier to read and trace.

We've split

  1. bases.py into models
  2. utils.py into helpers
  3. environments.py into dagger/actions and dagger/containers

4. Steps get more generic as you go up the file structure

The idea is that there is a taxonomy of steps, separated by how widely each step is intended to be reused.

  1. Step base class: now found at models/steps.py
  2. Generic steps intended to be reused (e.g. PoetryRunStep, SimpleDockerStep): now found at pipeline/steps
  3. Single-purpose general steps (e.g. GitPushChanges): now found at pipeline/steps
  4. Specialized, pipeline-specific steps (e.g. BuildConnectorDistributionTar): found at pipeline/connectors/build/steps

Separating steps along this taxonomy lets developers infer how widespread a step's usage is, and whether it is intended to work in any pipeline or only in a specific set of pipelines.
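As a rough illustration of the taxonomy, the sketch below shows a base class, a generic reusable step, and a pipeline-specific step. The class names come from the list above, but the bodies, constructor arguments, and the run_steps helper are simplified assumptions. The real steps run inside Dagger containers and have different signatures.

```python
import asyncio
from abc import ABC, abstractmethod


class Step(ABC):
    """Stand-in for the base class the PR places in models/steps.py."""

    title: str = "unnamed step"

    @abstractmethod
    async def run(self) -> str:
        """Do the work and return a short result message."""


class PoetryRunStep(Step):
    """Generic step: configurable, so any pipeline can reuse it."""

    def __init__(self, title: str, command: list[str]) -> None:
        self.title = title
        self.command = command

    async def run(self) -> str:
        # A real step would execute in a container; this sketch only formats the command.
        return "poetry run " + " ".join(self.command)


class BuildConnectorDistributionTar(Step):
    """Pipeline-specific step: only meaningful inside the connector build pipeline."""

    title = "Build connector tar"

    async def run(self) -> str:
        return "built distribution tar"


async def run_steps(steps: list[Step]) -> list[str]:
    """Run steps in order and collect '<title>: <result>' lines."""
    return [f"{step.title}: {await step.run()}" for step in steps]


results = asyncio.run(
    run_steps([PoetryRunStep("Unit tests", ["pytest", "tests"]), BuildConnectorDistributionTar()])
)
print(results)
```

The generic step takes its behavior from constructor arguments, while the specialized step hard-codes it. That difference is what the folder depth is meant to signal.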

5. Containers separated from actions

environments.py had a mix of two different type signatures:

  • Container -> Container: functions that install something, analogous to Dockerfile commands
  • Context|str|any -> Container: functions that create/initialize a container, analogous to a Dockerfile base image

We broke these into actions and containers respectively. This makes the distinction clearer at a glance, and leans on the fact that developers may already be used to a similar distinction if they've used Docker before.
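The two signatures can be sketched as follows. Container and Context here are hypothetical stand-ins (plain dataclasses), not the Dagger SDK types, and the function names are illustrative. The point is only the shape: containers/ functions create a Container, actions/ functions take one and return a modified one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Container:
    """Hypothetical stand-in for a container; it just records what was done to it."""

    base_image: str
    commands: tuple[str, ...] = ()

    def with_exec(self, command: str) -> "Container":
        # Return a new Container with the command appended, leaving the original untouched.
        return replace(self, commands=self.commands + (command,))


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for a pipeline context."""

    python_version: str


# containers/: Context -> Container (analogous to choosing a base image)
def with_python_base(context: Context) -> Container:
    return Container(base_image=f"python:{context.python_version}-slim")


# actions/: Container -> Container (analogous to a Dockerfile command)
def with_pip_packages(container: Container, packages: list[str]) -> Container:
    return container.with_exec("pip install " + " ".join(packages))


container = with_pip_packages(with_python_base(Context(python_version="3.10")), ["poetry"])
print(container)
```

Because actions share the Container -> Container shape, they compose freely on top of any function from containers/.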

File Structure

.
├── README.md
├── pipelines
│   ├── __init__.py
│   ├── cli
│   │   ├── airbyte_ci.py
│   │   ├── dagger_pipeline_command.py
│   │   ├── dagger_run.py
│   │   └── telemetry.py
│   ├── consts.py
│   ├── dagger
│   │   ├── __init__.py
│   │   ├── actions
│   │   │   ├── __init__.py
│   │   │   ├── connector
│   │   │   │   ├── hooks.py
│   │   │   │   └── normalization.py
│   │   │   ├── python
│   │   │   │   ├── __init__.py
│   │   │   │   ├── common.py
│   │   │   │   ├── pipx.py
│   │   │   │   └── poetry.py
│   │   │   ├── remote_storage.py
│   │   │   ├── secrets.py
│   │   │   └── system
│   │   │       ├── __init__.py
│   │   │       ├── common.py
│   │   │       └── docker.py
│   │   └── containers
│   │       ├── __init__.py
│   │       ├── internal_tools.py
│   │       ├── java.py
│   │       └── python.py
│   ├── hacks.py
│   ├── helpers
│   │   ├── __init__.py
│   │   ├── connectors
│   │   │   ├── __init__.py
│   │   │   ├── metadata_change_helpers.py
│   │   │   └── modifed.py
│   │   ├── gcs.py
│   │   ├── git.py
│   │   ├── github.py
│   │   ├── sentry_utils.py
│   │   ├── slack.py
│   │   ├── steps.py
│   │   └── utils.py
│   ├── internal_tools
│   │   └── internal.py
│   ├── models
│   │   ├── bases.py
│   │   ├── contexts.py
│   │   ├── reports.py
│   │   └── steps.py
│   └── pipeline
│       ├── __init__.py
│       ├── connectors
│       │   ├── __init__.py
│       │   ├── commands.py
│       │   ├── context.py
│       │   ├── pipeline.py
│       │   ├── builds
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   ├── pipeline.py
│       │   │   └── steps
│       │   ├── bump_version
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   └── pipeline.py
│       │   ├── format
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   ├── pipeline.py
│       │   │   └── steps
│       │   ├── list
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   └── pipeline.py
│       │   ├── migrate_to_base_image
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   └── pipeline.py
│       │   ├── publish
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   ├── context.py
│       │   │   └── pipeline.py
│       │   ├── reports.py
│       │   ├── test
│       │   │   ├── __init__.py
│       │   │   ├── commands.py
│       │   │   ├── pipeline.py
│       │   │   └── steps
│       │   └── upgrade_base_image
│       │       ├── __init__.py
│       │       ├── commands.py
│       │       └── pipeline.py
│       ├── metadata
│       │   ├── __init__.py
│       │   ├── commands.py
│       │   └── pipeline.py
│       ├── steps
│       │   ├── __init__.py
│       │   ├── docker.py
│       │   ├── git.py
│       │   ├── gradle.py
│       │   ├── no_op.py
│       │   └── poetry.py
│       └── test
│           ├── __init__.py
│           ├── commands.py
│           └── pipeline.py
├── poetry.lock
├── pyproject.toml
└── tests
    ├── __init__.py
    ├── conftest.py
    ├── test_actions
    │   └── test_environments.py
    ├── test_bases.py
    ├── test_builds
    │   ├── dummy_build_customization.py
    │   └── test_python_connectors.py
    ├── test_commands
    │   ├── __init__.py
    │   └── test_groups
    │       ├── __init__.py
    │       └── test_connectors.py
    ├── test_gradle.py
    ├── test_publish.py
    ├── test_steps
    │   ├── __init__.py
    │   └── test_simple_docker_step.py
    ├── test_tests
    │   ├── __init__.py
    │   ├── test_common.py
    │   └── test_python_connectors.py
    ├── test_utils.py
    └── utils.py

36 directories, 105 files


@@ -42,9 +41,12 @@ async def download(context: ConnectorContext, gcp_gsm_env_variable_name: str = "
Returns:
Directory: A directory with the downloaded secrets.
"""
# temp - fix circular import
Contributor Author

@bnchrch this is the one circular import I just couldn't figure out how to implement in a decent way. It's also imported here. Maybe you can take a crack at it
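For readers unfamiliar with the workaround, the temporary fix flagged in the diff above appears to be a function-level (deferred) import. Below is a minimal, self-contained sketch of that general pattern, assuming two hypothetical modules, demo_a and demo_b, that depend on each other. It is not the actual code from this PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical modules, written to a temp dir so the sketch runs as-is.
MODULE_A = '''
from demo_b import describe_context  # top-level import: demo_a depends on demo_b

def download():
    return "secrets for " + describe_context()
'''

MODULE_B = '''
def describe_context():
    return "connector context"

def run_download():
    # Deferred import: demo_a imports demo_b at load time, so importing
    # demo_a at the top of this module would create an import cycle.
    from demo_a import download
    return download()
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_a.py").write_text(MODULE_A)
    Path(tmp, "demo_b.py").write_text(MODULE_B)
    sys.path.insert(0, tmp)
    try:
        demo_b = importlib.import_module("demo_b")
        result = demo_b.run_download()
    finally:
        sys.path.remove(tmp)

print(result)
```

The deferred import works because demo_b is fully loaded by the time run_download is called. A more durable fix is usually to move the shared piece into a third module that both sides import.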

@bnchrch bnchrch changed the title Ella ben/pipelines/refactor folder struct Update Pipelines folder structure Oct 18, 2023
Contributor

@alafanechere alafanechere left a comment

Very cool changes! Thank you!!!!!!
Just minor suggestions.
Feel free to merge if the unit tests pass.
A pre-release + a java and python connector test run would also be reassuring (feel free to ignore if you tested this locally).


@connectors.command(cls=DaggerPipelineCommand, help="List all selected connectors.")
@click.pass_context
def list(
Contributor

This command is not a pipeline but only printing stuff. Do you think a subpackage like utils_commands would make sense?

Contributor Author

good point - for what it's worth, the test command (running tests for airbyte ci) is not a pipeline either, nor will the new format be. I don't think these are utils though, maybe it is just a naming problem.

@bnchrch you talked about pipelines, not commands, being the first class citizens here, somehow, right? I can't remember how that conversation went

Contributor

🤔 good call out.

I don't have a good answer.

This feels like one of those puzzle pieces that can't fit

  1. If we leave it as is, there's a command with no pipeline under pipeline
  2. If we move it to util_commands, the folder structure no longer matches the airbyte-ci hierarchy

Currently I'm thinking we leave it as is but perhaps find a new name for pipeline

maybe even airbyte-ci?

Contributor Author

Currently I'm thinking we leave it as is but perhaps find a new name for pipeline
maybe even airbyte-ci?

This was going to be my exact suggestion, i say we go for it 😄

@erohmensing erohmensing marked this pull request as ready for review October 18, 2023 19:42
@erohmensing erohmensing force-pushed the ella-ben/pipelines/refactor-folder-struct branch from c795238 to ddf6e74 Compare October 18, 2023 19:48
@erohmensing erohmensing force-pushed the ella-ben/pipelines/refactor-folder-struct branch from ddf6e74 to a42f2f2 Compare October 18, 2023 20:01
@erohmensing erohmensing marked this pull request as draft October 18, 2023 20:01
@erohmensing erohmensing marked this pull request as ready for review October 18, 2023 20:13
@erohmensing
Contributor Author

Some successful Java and Python connector test runs here: #31586 (after fixing something, so definitely worth testing!)

Successful prerelease here: https://github.com/airbytehq/airbyte/actions/runs/6567115837

Contributor

@pedroslopez pedroslopez left a comment

Nice! I kept having to hop around and this feels like it would make things much clearer!

Contributor

@pedroslopez pedroslopez Oct 18, 2023

nit: why is this folder plural (builds)? I think I would've expected it to line up with the command build if i understand the explanation correctly

Contributor

Good catch! created an exception in our gitignore and renamed

Contributor

just kidding, that still messes with our pipeline exclude logic.

went for build_image instead

@bnchrch bnchrch enabled auto-merge (squash) October 18, 2023 23:35
@bnchrch bnchrch merged commit a53a020 into master Oct 18, 2023
20 checks passed
@bnchrch bnchrch deleted the ella-ben/pipelines/refactor-folder-struct branch October 18, 2023 23:46
@sentry-io

sentry-io bot commented Oct 19, 2023

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ ExecuteTimeoutError: Request timed out. Try setting a higher value in 'execute_timeout' config for this dagger.Connec... pipelines.airbyte_ci.connectors.pipeline in run... View Issue
  • ‼️ DaggerError: Dagger Command test failed. pipelines.cli.dagger_pipeline_command in invoke View Issue
  • ‼️ QueryError: sync /runner/_work/airbyte/airbyte/airbyte-integrations/connectors/source-paypal-transaction: fai... pipelines.airbyte_ci.connectors.context in get_... View Issue
  • ‼️ QueryError: file size 260503070 exceeds limit 134217728 pipelines.helpers.utils in get_container_output View Issue
  • ‼️ ExecError: process "git diff --diff-filter=MADRT --name-only origin/master...748211b... pipelines.helpers.git in get_modified_files_in_... View Issue


ariesgun pushed a commit to ariesgun/airbyte that referenced this pull request Oct 20, 2023
Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: erohmensing <erohmensing@users.noreply.github.com>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
ariesgun pushed a commit to ariesgun/airbyte that referenced this pull request Oct 23, 2023
Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: erohmensing <erohmensing@users.noreply.github.com>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>