diff --git a/docs/7-cicd/gitlab/gitlab-pat-guide.md b/docs/2-getting-started/gitlab-pat-guide.md similarity index 94% rename from docs/7-cicd/gitlab/gitlab-pat-guide.md rename to docs/2-getting-started/gitlab-pat-guide.md index f69d18f..5f61255 100644 --- a/docs/7-cicd/gitlab/gitlab-pat-guide.md +++ b/docs/2-getting-started/gitlab-pat-guide.md @@ -80,5 +80,5 @@ Once you have your Personal Access Token: ## Related Documentation -- [Cloud-based Git Platform Integration](../2-getting-started/start-free-with-cloud.md#git-integration) -- [CI/CD Automation](../2-getting-started/start-free-with-cloud.md#cicd-automation) +- [Cloud-based Git Platform Integration](./start-free-with-cloud.md#git-integration) +- [CI/CD Getting Started](../7-cicd/ci-cd-getting-started.md) diff --git a/docs/2-getting-started/start-free-with-cloud.md b/docs/2-getting-started/start-free-with-cloud.md index dc2152e..e637914 100644 --- a/docs/2-getting-started/start-free-with-cloud.md +++ b/docs/2-getting-started/start-free-with-cloud.md @@ -49,7 +49,7 @@ Connect your repository to track pull requests/merge requests and validate chang | GitHub | GitLab | |--------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 1. Navigate to settings
2. Connect GitHub repository
3. Authorize Recce access
4. Select repository | 1. Navigate to settings
2. Connect GitLab by providing a Personal Access Token ([see our directions here](../7-cicd/gitlab-pat-guide.md))
3. Connect a project by adding a GitLab Project or URL to a Recce Project | +| 1. Navigate to settings
2. Connect GitHub repository
3. Authorize Recce access
4. Select repository | 1. Navigate to settings
2. Connect GitLab by providing a Personal Access Token ([see our directions here](./gitlab-pat-guide.md))
3. Connect a project by adding a GitLab Project or URL to a Recce Project | ### How to Use PR/MR Tracking @@ -109,8 +109,8 @@ See the CI/CD sections for complete setup guides: - [Setup CI for GitHub](../7-cicd/github/setup-ci.md) - [Setup CD for GitHub](../7-cicd/github/setup-cd.md) - GitLab CI/CD - - [Setup CI for Gitlab](../7-cicd/gitlab/setup-ci.md) - - [Setup CD for Gitlab](../7-cicd/gitlab/setup-cd.md) + - [Setup CI for GitLab](../7-cicd/gitlab/setup-ci.md) + - [Setup CD for GitLab](../7-cicd/gitlab/setup-cd.md) ### Automation Benefits diff --git a/docs/7-cicd/ci-cd-getting-started.md b/docs/7-cicd/ci-cd-getting-started.md index 1798cad..bec9faa 100644 --- a/docs/7-cicd/ci-cd-getting-started.md +++ b/docs/7-cicd/ci-cd-getting-started.md @@ -49,33 +49,43 @@ Both CI and CD workflows follow the same pattern: ## Getting Started with your CI/CD -Recce currently integrates with both GitHub Actions and GitLab CI/CD. If you use another CI/CD product and interested in Recce, [let us know](https://cal.com/team/recce/chat). +Recce integrates with both GitHub Actions and GitLab CI/CD using the lightweight `recce-cloud` CLI. If you use another CI/CD platform and are interested in Recce, [let us know](https://cal.com/team/recce/chat). ## Prerequisites Before setting up, ensure you have: -- **Recce Cloud account** You can signup and start your free trial [here](https://cloud.reccehq.com/) -- **Repository connected** to Recce Cloud ([setup guide](../2-getting-started/start-free-with-cloud.md#git-integration)) -- **dbt artifacts generated** (`manifest.json` and `catalog.json`) from your project +- ✅ **Recce Cloud account** - [Start free trial](https://cloud.reccehq.com/) +- ✅ **Repository connected** to Recce Cloud - [Git integration guide](../2-getting-started/start-free-with-cloud.md#git-integration) + - For GitLab: [Create a Personal Access Token](../2-getting-started/gitlab-pat-guide.md) if not already done +- ✅ **dbt artifacts** - Know how to generate `manifest.json` and `catalog.json` from your project -### GitHub -If your dbt project uses GitHub: +## Setup Steps -1. [Setup CD](./github/setup-cd.md) - Auto-update baseline on merge to main -2. [Setup CI](./github/setup-ci.md) - Auto-validate changes in every PR +Both GitHub and GitLab follow the same simple pattern: -### GitLab -If your dbt project uses GitLab: +### 1. Setup CD - Auto-update baseline +[**Setup CD Guide**](./setup-cd.md) - Configure automatic baseline updates when you merge to main -1. [Setup CD](./gitlab/setup-cd.md) - Auto-update baseline on merge to main -2. [Setup CI](./gitlab/setup-ci.md) - Auto-validate changes in every MR -3. [GitLab Personal Access Token Guide](./gitlab/gitlab-pat-guide.md) - Required for GitLab integration +- Updates your production baseline artifacts automatically +- Runs on merge to main + optional scheduled updates +- Works with both GitHub Actions and GitLab CI/CD -## Next steps +### 2. Setup CI - Auto-validate PRs/MRs +[**Setup CI Guide**](./setup-ci.md) - Enable automatic validation for every PR/MR -1. Start with relevant CD setup ([Gitlab](./gitlab/setup-cd.md) or [Github](./github/setup-cd.md)) to establish automatic baseline (production artifacts) updates. -2. Configure CI setup ([Gitlab](./gitlab/setup-ci.md) or [Github](./github/setup-ci.md)) to enable PR/MR validation +- Validates data changes in every pull request or merge request +- Catches issues before they reach production +- Works with both GitHub Actions and GitLab CI/CD + +## Why This Order? + +Start with **CD first** to establish your baseline (production artifacts), then add **CI** for PR/MR validation. CI validation compares your PR/MR changes against the baseline created by CD. + +## Next Steps + +1. **[Setup CD](./setup-cd.md)** - Establish automatic baseline updates +2. **[Setup CI](./setup-ci.md)** - Enable PR/MR validation 3. Review [best practices](./best-practices-prep-env.md) for environment preparation ## Related workflows diff --git a/docs/7-cicd/github/scenario-ci.md b/docs/7-cicd/github/scenario-ci.md deleted file mode 100644 index a0cbeba..0000000 --- a/docs/7-cicd/github/scenario-ci.md +++ /dev/null @@ -1,211 +0,0 @@ ---- -title: Setup CI in Open Source ---- - -# Recce CI integration with GitHub Action - -Recce provides the `recce run` command for CI/CD pipeline. You can integrate Recce with GitHub Actions (or other CI tools) to compare the data models between two environments when a new pull-request is created. The below image describes the basic architecture. - -![ci/cd architecture](/assets/images/7-cicd/ci-cd.png){: .shadow} - -The following guide demonstrates how to configure Recce in GitHub Actions. - -## Prerequisites - -Before integrating Recce with GitHub Actions, you will need to configure the following items: - -- Set up **two environments** in your data warehouse. For example, one for base and another for pull request. - -- Provide the **credentials profile** for both environments in your `profiles.yml` so that Recce can access your data warehouse. You can put the credentials in a `profiles.yml` file, or use environment variables. - -- Set up the **data warehouse credentials** in your [GitHub repository secrets](https://docs.github.com/en/actions/reference/encrypted-secrets). - -## Set up Recce with GitHub Actions - -We suggest setting up two GitHub Actions workflows in your GitHub repository. One for the base environment and another for the PR environment. - -- **Base environment workflow**: Triggered on every merge to the `main branch`. This ensures that base artifacts are readily available for use when a PR is opened. - -- **PR environment workflow**: Triggered on every push to the `pull-request branch`. This workflow will compare base models with the current PR environment. - -### Base Workflow (Main Branch) - -This workflow will perform the following actions: - -1. Run dbt on the base environment -2. Upload the generated DBT artifacts to [GitHub workflow artifacts](https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts) for later use - -```yaml -name: Recce CI Base Branch - -on: - workflow_dispatch: - push: - branches: - - main - -concurrency: - group: recce-ci-base - cancel-in-progress: true - -jobs: - build: - runs-on: ubuntu-latest - - steps: - - uses: actions/checkout@v3 - - - name: Set up Python - uses: actions/setup-python@v2 - with: - python-version: "3.10.x" - - - name: Install dependencies - run: | - pip install -r requirements.txt - - - name: Run DBT - run: | - dbt deps - dbt seed --target ${{ env.DBT_BASE_TARGET }} - dbt run --target ${{ env.DBT_BASE_TARGET }} - dbt docs generate --target ${{ env.DBT_BASE_TARGET }} - env: - DBT_BASE_TARGET: "prod" - - - name: Upload DBT Artifacts - uses: actions/upload-artifact@v4 - with: - name: target - path: target/ -``` - -!!! note - - Please place the above file in `.github/workflows/dbt_base.yml`. This workflow path will also be used in the next PR workflow. If you place it in a different location, please remember to make the corresponding changes in the next step. - -### PR Workflow (Pull Request Branch) - -This workflow will perform the following actions: - -1. Run dbt on the PR environment. -2. Download previously generated base artifacts from base workflow. -3. Use Recce to compare the PR environment with the downloaded base artifacts. - - -````yaml -name: Recce CI PR Branch - -on: - pull_request: - branches: [main] - -jobs: - check-pull-request: - name: Check pull request by Recce CI - runs-on: ubuntu-latest - steps: - - name: Checkout repository - uses: actions/checkout@v3 - with: - fetch-depth: 0 - - name: Merge Base Branch into PR - uses: DataRecce/PR-Update@v1 - with: - baseBranch: ${{ github.event.pull_request.base.ref }} - autoMerge: false - - name: Set up Python - uses: actions/setup-python@v4 - with: - python-version: "3.10.x" - - name: Install dependencies - run: | - pip install -r requirements.txt - pip install recce - - name: Prepare dbt Base environment - run: | - gh repo set-default ${{ github.repository }} - base_branch=${{ github.base_ref }} - run_id=$(gh run list --workflow ${WORKFLOW_BASE} --branch ${base_branch} --status success --limit 1 --json databaseId --jq '.[0].databaseId') - echo "Download artifacts from run $run_id" - gh run download ${run_id} -n target -D target-base - env: - GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} - WORKFLOW_BASE: ".github/workflows/dbt_base.yml" - - name: Prepare dbt Current environment - run: | - git checkout ${{ github.event.pull_request.head.sha }} - dbt deps - dbt seed --target ${{ env.DBT_CURRENT_TARGET}} - dbt run --target ${{ env.DBT_CURRENT_TARGET}} - dbt docs generate --target ${{ env.DBT_CURRENT_TARGET}} - env: - DBT_CURRENT_TARGET: "dev" - - - name: Run Recce CI - run: | - recce run --github-pull-request-url ${{ github.event.pull_request.html_url }} - - - name: Upload DBT Artifacts - uses: actions/upload-artifact@v4 - with: - name: target - path: target/ - - - name: Upload Recce State File - uses: actions/upload-artifact@v4 - id: recce-artifact-uploader - with: - name: recce-state-file - path: recce_state.json -```` - - - -## Review the Recce State File - -Review the downloaded Recce [state file](../../8-technical-concepts/state-file.md) with the following command: - -```bash -recce server --review recce_state.json -``` - -In the Recce server `--review` mode, you can review the comparison results of the data models between the base and current environments. It will contain the row counts of modified data models. - diff --git a/docs/7-cicd/github/setup-cd.md b/docs/7-cicd/github/setup-cd.md deleted file mode 100644 index 810f628..0000000 --- a/docs/7-cicd/github/setup-cd.md +++ /dev/null @@ -1,109 +0,0 @@ ---- -title: Setup CD for GitHub ---- - -# Setup CD - -Set up automatic updates for your Recce Cloud base sessions. Keep your data comparison baseline current every time you merge to main, with no manual work required. - -## Purpose - -**Automated Base Session Management** eliminates manual baseline maintenance. - -- **Triggers**: PR merge to main + scheduled updates -- **Action**: Auto-update base Recce session -- **Benefit**: Current comparison baseline for future PRs - -## Prerequisites - -You need `manifest.json` and `catalog.json` files (dbt artifacts) for Recce Cloud. See [Start Free with Cloud](../../2-getting-started/start-free-with-cloud.md) for instructions on preparing these files. - -## Implementation - -### 1. Core Workflow - -Create `.github/workflows/cd-workflow.yml`: - -```yaml -name: Update Base Recce Session - -on: - push: - branches: ["main"] - schedule: - - cron: "0 2 * * *" # Daily at 2 AM UTC - workflow_dispatch: - -concurrency: - group: ${{ github.workflow }} - cancel-in-progress: true - -jobs: - update-base-session: - runs-on: ubuntu-latest - timeout-minutes: 30 - - steps: - - name: Checkout code - uses: actions/checkout@v4 - - - name: Setup Python - uses: actions/setup-python@v5 - with: - python-version: "3.11" - cache: "pip" - - - name: Install dependencies - run: | - pip install -r requirements.txt - - - name: Prepare dbt artifacts - run: | - # Install dbt packages - dbt deps - - # Optional: Build tables to ensure they're materialized and updated - # dbt build --target prod - - # Required: Generate artifacts (provides all we need) - dbt docs generate --target prod - env: - DBT_ENV_SECRET_KEY: ${{ secrets.DBT_ENV_SECRET_KEY }} - - - name: Update Recce Cloud Base Session - uses: DataRecce/recce-cloud-cicd-action@v0.1 - # This action automatically uploads artifacts to Recce Cloud -``` - -### 2. Artifact Preparation Options - -**Default: Fresh Build** (shown in example above) - -- `dbt docs generate` is required and provides the needed `manifest.json` and `catalog.json` artifacts. -- `dbt build` is optional but ensures tables are materialized and updated. - -**Alternative Methods:** - -- **External Download**: Download from dbt Cloud, Paradime, or other platforms -- **Pipeline Integration**: Use existing dbt build workflows - - -### 3. Verification - -#### Manual Trigger Test - -1. Go to **Actions** tab in your repository -2. Select "Update Base Recce Session" workflow -3. Click **Run workflow** button -4. Monitor the run for successful completion - -#### Verify Success - -- ✅ **Workflow completes** without errors in Actions tab -- ✅ **Base session updated** in Recce Cloud - -![Recce Cloud showing updated base sessions](/assets/images/7-cicd/verify-setup-cd.png){: .shadow} - -## Next Steps - -**[Setup CI](./setup-ci.md)** to automatically validate PR changes against your updated base session. This completes your CI/CD pipeline by adding automated data validation for every pull request. diff --git a/docs/7-cicd/github/setup-ci.md b/docs/7-cicd/github/setup-ci.md deleted file mode 100644 index 730386c..0000000 --- a/docs/7-cicd/github/setup-ci.md +++ /dev/null @@ -1,111 +0,0 @@ ---- -title: Setup CI for GitHub ---- - -# Setup CI - -Automatically validate your data changes in every pull request using Recce Cloud. Catch data issues before they reach production, with validation results right in your PR. - -## Purpose - -**Automated PR Validation** prevents data regressions before merge. - -- **Triggers**: PR opened/updated against main -- **Action**: Auto-update Recce session for PR validation -- **Benefit**: Automated data validation and comparison - -## Prerequisites - -You need `manifest.json` and `catalog.json` files (dbt artifacts) for Recce Cloud. See [Start Free with Cloud](../../2-getting-started/start-free-with-cloud.md) for instructions on preparing these files. - -## Implementation - -### 1. Core Workflow - -Create `.github/workflows/ci-workflow.yml`: - -```yaml -name: Validate PR Changes - -on: - pull_request: - branches: ["main"] - -concurrency: - group: ${{ github.workflow }}-${{ github.ref }} - cancel-in-progress: true - -jobs: - validate-changes: - runs-on: ubuntu-latest - timeout-minutes: 45 - - steps: - - name: Checkout PR branch - uses: actions/checkout@v4 - with: - fetch-depth: 2 - - - name: Setup Python - uses: actions/setup-python@v5 - with: - python-version: "3.11" - cache: "pip" - - - name: Install dependencies - run: | - pip install -r requirements.txt - - # Step 1: Prepare current branch artifacts - - name: Build current branch artifacts - run: | - # Install dbt packages - dbt deps - - # Optional: Build tables to ensure they're materialized - # dbt build --target ci - - # Required: Generate artifacts for comparison - dbt docs generate --target ci - env: - DBT_ENV_SECRET_KEY: ${{ secrets.DBT_ENV_SECRET_KEY }} - - - name: Update Recce PR Session - uses: DataRecce/recce-cloud-cicd-action@v0.1 - # This action automatically creates a PR session in Recce Cloud -``` - -### 2. Artifact Preparation Options - -**Default: Fresh Build** (shown in example above) - -- `dbt docs generate` is required and provides all needed artifacts. -- `dbt build` is optional but ensures tables are materialized and updated. - -**Alternative Methods:** - -- **External Download**: Download from dbt Cloud, Paradime, or other platforms -- **Pipeline Integration**: Use existing dbt build workflows - -### 3. Verification - -#### Test with a PR - -1. Create a test PR with small data changes -2. Check **Actions** tab for CI workflow execution -3. Verify validation runs successfully - -#### Verify Success - -- ✅ **Workflow completes** without errors in Actions tab -- ✅ **PR session updated** in Recce Cloud - -![Recce Cloud showing PR validation session](/assets/images/7-cicd/verify-setup-ci.png){: .shadow} - -#### Review PR Session - -To analyze the PR changes in detail: - -- Go to your [Recce Cloud](https://cloud.reccehq.com) -- Find the PR session that was created -- Launch Recce instance to explore data differences diff --git a/docs/7-cicd/gitlab-pat-guide.md b/docs/7-cicd/gitlab-pat-guide.md deleted file mode 100644 index f69d18f..0000000 --- a/docs/7-cicd/gitlab-pat-guide.md +++ /dev/null @@ -1,84 +0,0 @@ ---- -title: GitLab Personal Access Token ---- - -# GitLab Personal Access Token Setup - -To integrate Recce with your GitLab project, you'll need to create a Personal Access Token (PAT) with appropriate -permissions. - -## Token Scope Requirements - -Recce supports two different permission levels depending on your needs: - -| Scope | Permissions | Features Available | -|------------|----------------------------------|-------------------------------------------------------------------------------------------------------------------------------------| -| `api` | Full API access (read and write) | • View and track merge requests
• **Receive generated summaries and notes on MRs from Recce**
• Full integration capabilities | -| `read_api` | Read-only API access | • View and track merge requests
• **Cannot receive generated summaries and notes on MRs from Recce** | - -!!!warning "Important: Choose the Right Scope" - If you want Recce to automatically post validation summaries and notes directly to your merge requests, you **must** use - the `api` scope. The `read_api` scope only allows Recce to read merge request information but cannot write comments or - summaries back to GitLab. - -## How to Create a Personal Access Token - -Follow these steps to create a Personal Access Token in GitLab: - -1. **Navigate to [Personal Access Token Settings in GitLab](https://gitlab.com/-/user_settings/personal_access_tokens)** - -2. **Create New Token** - - Click **Add new token** button - - Enter a descriptive **Token name** (e.g., "Recce Integration") - - Set an **Expiration date** - -3. **Select Scopes** - - Choose one of the following based on your needs: - - **Option A: Full Integration (Recommended)**
- - ✅ `api` scope
- - This enables Recce to post validation summaries and notes to your merge requests - - **Option B: Read-Only Access**
- - ✅ `read_api` scope
- - ⚠️ You will **not** receive generated PR summaries and notes on your MRs from Recce - -4. **Generate Token** - - Click **Create personal access token** - - **Important**: Copy the token immediately - you won't be able to see it again! - -5. **Save Token Securely** - - Store the token in a secure location - - -## Using Your Token with Recce - -Once you have your Personal Access Token: - -1. Navigate to Recce settings -2. Select GitLab integration -3. Paste your Personal Access Token -4. Complete the connection setup - -## Troubleshooting - -**Token not working?** - -- Verify you've selected the correct scope (`api` or `read_api`) -- Check that the token hasn't expired -- Ensure you have appropriate project permissions (Maintainer or Owner role) - -**Not receiving summaries on merge requests?** - -- Verify your token uses the `api` scope (not just `read_api`) -- Check that Recce has write permissions to your project - -**Still having issues?** - -- Please reach out to us on [our Discord](https://discord.gg/HUUx9fyphJ) or via email at `help@reccehq.com` - -## Related Documentation - -- [Cloud-based Git Platform Integration](../2-getting-started/start-free-with-cloud.md#git-integration) -- [CI/CD Automation](../2-getting-started/start-free-with-cloud.md#cicd-automation) diff --git a/docs/7-cicd/gitlab/setup-cd.md b/docs/7-cicd/gitlab/setup-cd.md deleted file mode 100644 index b9f3ecb..0000000 --- a/docs/7-cicd/gitlab/setup-cd.md +++ /dev/null @@ -1,260 +0,0 @@ ---- -title: Setup CD for GitLab ---- - -# Setup CD - -Set up automatic updates for your Recce Cloud base sessions. Keep your data comparison baseline current every time you merge to main, with no manual work required. - -## Purpose - -**Automated Base Session Management** eliminates manual baseline maintenance. - -- **Triggers**: MR merge to main + scheduled updates -- **Action**: Auto-update base Recce session -- **Benefit**: Current comparison baseline for future MRs - -## Prerequisites - -You need `manifest.json` and `catalog.json` files (dbt artifacts) for Recce Cloud. See [Start Free with Cloud](../../2-getting-started/start-free-with-cloud.md) for instructions on preparing these files. - -## Implementation - -### 1. Core Workflow - -GitLab's CD setup uses the same Recce Cloud component as CI, but with different trigger rules. Add to your `.gitlab-ci.yml`: -```yaml -include: - - component: gitlab.com/recce/recce-cloud-cicd-component/recce-cloud@1.2.0 - inputs: - stage: upload - -stages: - - build - - upload - -variables: - DBT_TARGET_PROD: "prod" - -# Disable the default component job -recce-cloud-upload: - rules: - - when: never - -# Production build - runs on schedule or manual trigger -prod-build: - stage: build - image: python:3.11-slim - script: - - pip install -r requirements.txt - - dbt deps - - # Optional: Build tables to ensure they're materialized and updated - # - dbt build --target $DBT_TARGET_PROD - - # Required: Generate artifacts - - dbt docs generate --target $DBT_TARGET_PROD - artifacts: - paths: - - target/ - expire_in: 7 days - rules: - - if: $CI_PIPELINE_SOURCE == "schedule" - - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH - - if: $CI_PIPELINE_SOURCE == "web" - when: manual - -# Production Recce upload -recce-cloud-upload-prod: - extends: recce-cloud-upload - needs: - - job: prod-build - artifacts: true - rules: - - if: $CI_PIPELINE_SOURCE == "schedule" - - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH - - if: $CI_PIPELINE_SOURCE == "web" - when: manual -``` - -This configuration: - -- **Scheduled updates**: Runs automatically on schedule (configure in CI/CD → Schedules) -- **Post-merge updates**: Runs when commits are pushed to main branch -- **Manual triggers**: Available via web UI for on-demand updates - -### 2. Unified CI/CD Configuration - -The Recce Cloud component can handle both CI (MR validation) and CD (base session updates) in a single configuration. Here's the combined approach: -```yaml -include: - - component: gitlab.com/recce/recce-cloud-cicd-component/recce-cloud@1.2.0 - inputs: - stage: upload - -stages: - - build - - upload - -# Disable the default component job -recce-cloud-upload: - rules: - - when: never - -# MR build - runs on merge requests -mr-build: - stage: build - image: python:3.11-slim - script: - - pip install -r requirements.txt - - dbt deps - - dbt build --target dev - - dbt docs generate --target dev - artifacts: - paths: - - target/ - expire_in: 7 days - rules: - - if: $CI_PIPELINE_SOURCE == "merge_request_event" - -# Production build - runs on schedule or main branch -prod-build: - stage: build - image: python:3.11-slim - script: - - pip install -r requirements.txt - - dbt deps - - dbt build --target prod - - dbt docs generate --target prod - artifacts: - paths: - - target/ - expire_in: 7 days - rules: - - if: $CI_PIPELINE_SOURCE == "schedule" - - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH - -# MR Recce upload -recce-cloud-upload-mr: - extends: recce-cloud-upload - needs: - - job: mr-build - artifacts: true - rules: - - if: $CI_PIPELINE_SOURCE == "merge_request_event" - -# Production Recce upload -recce-cloud-upload-prod: - extends: recce-cloud-upload - needs: - - job: prod-build - artifacts: true - rules: - - if: $CI_PIPELINE_SOURCE == "schedule" - - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH -``` - -This unified approach: - -- Uses the same component for both CI and CD -- Separates MR validation from production updates via `rules` -- Maintains different dbt targets for each environment -- Reduces configuration duplication - -### 3. Schedule Configuration - -To enable automatic baseline updates: - -1. Go to **CI/CD → Schedules** in your GitLab project -2. Click **New schedule** -3. Configure schedule: - - **Description**: "Daily Recce Base Session Update" - - **Interval Pattern**: `0 2 * * *` (Daily at 2 AM UTC) - - **Target Branch**: `main` -4. Save schedule - -### 4. Artifact Preparation Options - -**Default: Fresh Build** (shown in examples above) - -- `dbt docs generate` is required and provides needed `manifest.json` and `catalog.json` artifacts -- `dbt build` is optional but ensures tables are materialized and updated - -**Alternative Methods:** - -- **External Download**: Download from dbt Cloud, Paradime, or other platforms -- **Pipeline Integration**: Use existing dbt build workflows - -### 5. Verification - -#### Manual Trigger Test - -1. Go to **CI/CD → Pipelines** in your project -2. Click **Run pipeline** -3. Select **main** branch -4. Click **Run pipeline** button -5. Monitor the pipeline for successful completion - -#### Verify Success - -- ✅ **Pipeline completes** without errors in CI/CD → Pipelines -- ✅ **Base session updated** in Recce Cloud - -![Recce Cloud showing updated base sessions](../../assets/images/7-cicd/verify-setup-cd.png){: .shadow} - -#### Verify Scheduled Runs - -1. Go to **CI/CD → Schedules** -2. Check **Last pipeline** status for your schedule -3. Verify regular updates appear in pipeline history - -## Troubleshooting - -### Schedule not triggering - -**Issue**: Scheduled pipeline doesn't run - -**Solutions**: - -1. Verify schedule is **Active** in CI/CD → Schedules -2. Check schedule timezone settings (UTC by default) -3. Ensure target branch exists and is protected if required -4. Review project's CI/CD minutes quota - -### Branch protection issues - -**Error**: Pipeline fails on protected branch - -**Solutions**: - -1. Configure protected branch settings to allow scheduled pipelines -2. Ensure CI/CD variables are available to protected branches -3. Verify schedule owner has push permissions - -### Artifact conflicts - -**Issue**: Wrong artifacts uploaded for production - -**Solutions**: - -1. Ensure `needs` dependencies are correct in upload jobs -2. Verify artifact paths match between build and upload jobs: -```yaml -prod-build: - artifacts: - paths: - - target/ # Must match component's dbt_target_path - -recce-cloud-upload-prod: - needs: - - job: prod-build - artifacts: true # Required -``` - -## Complete Example - -See the [complete working example](https://gitlab.com/recce/jaffle-shop-snowflake/-/blob/main/.gitlab-ci.yml) showing unified CI/CD configuration with the Recce Cloud component. - -## Next Steps - -If you haven't already, **[Setup CI](./setup-ci.md)** to automatically validate MR changes. The unified configuration above handles both CI and CD together, giving you a complete automated validation pipeline. diff --git a/docs/7-cicd/gitlab/setup-ci.md b/docs/7-cicd/gitlab/setup-ci.md deleted file mode 100644 index a6a13ff..0000000 --- a/docs/7-cicd/gitlab/setup-ci.md +++ /dev/null @@ -1,219 +0,0 @@ ---- -title: Setup CI for GitLab ---- - -# Setup CI - -Automatically validate your data changes in every merge request using Recce Cloud. Catch data issues before they reach production, with validation results right in your MR. - -## Purpose - -**Automated MR Validation** prevents data regressions before merge. - -- **Triggers**: MR opened/updated against main -- **Action**: Auto-update Recce session for MR validation -- **Benefit**: Automated data validation and comparison - -## Prerequisites - -You need `manifest.json` and `catalog.json` files (dbt artifacts) for Recce Cloud. See [Start Free with Cloud](../../2-getting-started/start-free-with-cloud.md) for instructions on preparing these files. - -## Implementation - -### 1. Core Workflow - -Add to your `.gitlab-ci.yml`: -```yaml -include: - - component: gitlab.com/recce/recce-cloud-cicd-component/recce-cloud@1.2.0 - inputs: - stage: test - -stages: - - build - - test - -variables: - DBT_TARGET: "ci" - -dbt-build: - stage: build - image: python:3.11-slim - script: - - pip install -r requirements.txt - - # Install dbt packages - - dbt deps - - # Optional: Build tables to ensure they're materialized - # - dbt build --target $DBT_TARGET - - # Required: Generate artifacts for comparison - - dbt docs generate --target $DBT_TARGET - artifacts: - paths: - - target/ - expire_in: 1 week - rules: - - if: $CI_PIPELINE_SOURCE == "merge_request_event" -``` - -The included Recce Cloud component automatically: - -- Creates a session in Recce Cloud for the merge request -- Uploads your dbt artifacts (`manifest.json` and `catalog.json`) -- Provides session URL for validation review - -### 2. Component Configuration Options - -The component accepts optional inputs for customization: -```yaml -include: - - component: gitlab.com/recce/recce-cloud-cicd-component/recce-cloud@1.2.0 - inputs: - stage: test # Pipeline stage (default: test) - dbt_target_path: target # Path to dbt artifacts (default: target) - base_branch: main # Base branch for comparison (default: main) - gitlab_token: $CUSTOM_GITLAB_TOKEN # Custom GitLab token (default: $CI_JOB_TOKEN) -``` - -**Default Configuration** (shown in example above): - -- Component uses `$CI_JOB_TOKEN` automatically (no manual token setup required) -- Uploads from `target/` directory by default -- Compares against `main` branch - -**Custom Token** (optional): - -If you need to use a custom GitLab token instead of the default `$CI_JOB_TOKEN`: - -1. Create a [Project Access Token](../gitlab-pat-guide.md) with `api` scope -2. Add it as a [CI/CD variable](https://docs.gitlab.com/ee/ci/variables/) in your project -3. Reference it in the component inputs: -```yaml -include: - - component: gitlab.com/recce/recce-cloud-cicd-component/recce-cloud@1.2.0 - inputs: - gitlab_token: $CUSTOM_GITLAB_TOKEN -``` - -### 3. Artifact Preparation Options - -**Default: Fresh Build** (shown in example above) - -- `dbt docs generate` is required and provides all needed artifacts -- `dbt build` is optional but ensures tables are materialized and updated - -**Alternative Methods:** - -- **External Download**: Download from dbt Cloud, Paradime, or other platforms -- **Pipeline Integration**: Use existing dbt build workflows - -### 4. Verification - -#### Test with an MR - -1. Create a test MR with small data changes -2. Check **CI/CD → Pipelines** for workflow execution -3. Verify validation runs successfully - -#### Verify Success - -- ✅ **Pipeline completes** without errors in CI/CD → Pipelines -- ✅ **MR session updated** in Recce Cloud -- ✅ **Session URL** appears in pipeline job output - -![Recce Cloud showing MR validation session](../../assets/images/7-cicd/verify-setup-ci.png){: .shadow} - -#### Review MR Session - -To analyze the MR changes in detail: - -- Go to your [Recce Cloud](https://cloud.reccehq.com) -- Find the MR session that was created -- Launch Recce instance to explore data differences - -Or use the session launch URL from the pipeline output: -```bash -# Pipeline output example -RECCE_SESSION_LAUNCH_URL: https://cloud.reccehq.com/launch/abc123 -``` - -## Troubleshooting - -### Missing dbt files - -**Error**: `Missing manifest.json` or `Missing catalog.json` - -**Solution**: Ensure `dbt docs generate` runs successfully before the Recce component: -```yaml -dbt-build: - script: - - dbt build - - dbt docs generate # Required - artifacts: - paths: - - target/ -``` - -### Authentication issues - -**Error**: `Failed to create session: 401 Unauthorized` - -**Solutions**: - -1. Verify Recce Cloud GitLab integration is set up for your project -2. Check that your project is connected in [Recce Cloud settings](https://cloud.reccehq.com/settings) -3. For custom tokens, ensure the token has `api` scope ([setup guide](../gitlab-pat-guide.md)) - -### Upload failures - -**Error**: `Failed to upload manifest/catalog` - -**Solutions**: - -1. Check network connectivity to Recce Cloud -2. Verify artifact files exist in `target/` directory -3. Review pipeline job logs for detailed error messages -4. Ensure artifacts are passed between jobs: -```yaml -dbt-build: - artifacts: - paths: - - target/ # Must include dbt artifacts -``` - -## Complete Example - -Here's a full working example combining dbt build and Recce validation: -```yaml -include: - - component: gitlab.com/recce/recce-cloud-cicd-component/recce-cloud@1.2.0 - inputs: - stage: test - -stages: - - build - - test - -variables: - DBT_TARGET: "ci" - -dbt-build: - stage: build - image: python:3.11-slim - before_script: - - pip install -r requirements.txt - script: - - dbt deps - - dbt build --target $DBT_TARGET - - dbt docs generate --target $DBT_TARGET - artifacts: - paths: - - target/ - expire_in: 1 week - rules: - - if: $CI_PIPELINE_SOURCE == "merge_request_event" -``` - -See the [complete example project](https://gitlab.com/recce/jaffle-shop-snowflake/-/blob/main/.gitlab-ci.yml) for a full working configuration. diff --git a/docs/7-cicd/setup-cd.md b/docs/7-cicd/setup-cd.md index 1a7f866..5a799c3 100644 --- a/docs/7-cicd/setup-cd.md +++ b/docs/7-cicd/setup-cd.md @@ -2,29 +2,33 @@ title: Setup CD --- -# Setup CD +# Setup CD - Auto-Update Baseline Set up automatic updates for your Recce Cloud base sessions. Keep your data comparison baseline current every time you merge to main, with no manual work required. -## Purpose +## What This Does -**Automated Base Session Management** eliminates manual baseline maintenance. +**Automated Base Session Management** eliminates manual baseline maintenance: -- **Triggers**: PR merge to main + scheduled updates -- **Action**: Auto-update base Recce session -- **Benefit**: Current comparison baseline for future PRs +- **Triggers**: Merge to main + scheduled updates + manual runs +- **Action**: Auto-update base Recce session with latest production artifacts +- **Benefit**: Current comparison baseline for all future PRs/MRs ## Prerequisites -You need `manifest.json` and `catalog.json` files (dbt artifacts) for Recce Cloud. See [Start Free with Cloud](../2-getting-started/start-free-with-cloud.md) for instructions on preparing these files. +Before setting up CD, ensure you have: -## Implementation +- ✅ **Recce Cloud account** - [Start free trial](https://cloud.reccehq.com/) +- ✅ **Repository connected** to Recce Cloud - [Git integration guide](../2-getting-started/start-free-with-cloud.md#git-integration) +- ✅ **dbt artifacts** - Know how to generate `manifest.json` and `catalog.json` from your dbt project -### 1. Core Workflow +## Setup + +### GitHub Actions Create `.github/workflows/cd-workflow.yml`: -```yaml +```yaml linenums="1" hl_lines="42-43" name: Update Base Recce Session on: @@ -54,56 +58,282 @@ jobs: cache: "pip" - name: Install dependencies - run: | - pip install -r requirements.txt + run: pip install -r requirements.txt - name: Prepare dbt artifacts run: | - # Install dbt packages dbt deps - - # Optional: Build tables to ensure they're materialized and updated - # dbt build --target prod - - # Required: Generate artifacts (provides all we need) + # Optional: dbt build --target prod dbt docs generate --target prod env: DBT_ENV_SECRET_KEY: ${{ secrets.DBT_ENV_SECRET_KEY }} - - name: Update Recce Cloud Base Session - uses: DataRecce/recce-cloud-cicd-action@v0.1 - # This action automatically uploads artifacts to Recce Cloud + - name: Upload to Recce Cloud + run: | + pip install recce-cloud + recce-cloud upload --type prod +``` + +**Key points:** + +- Authentication is automatic via `GITHUB_TOKEN` +- `recce-cloud upload --type prod` tells Recce this is a baseline session +- `dbt docs generate` creates the required `manifest.json` and `catalog.json` + +### GitLab CI/CD + +Add to your `.gitlab-ci.yml`: + +```yaml linenums="1" hl_lines="30-31" +stages: + - build + - upload + +variables: + DBT_TARGET_PROD: "prod" + +# Production build - runs on schedule or main branch push +prod-build: + stage: build + image: python:3.11-slim + script: + - pip install -r requirements.txt + - dbt deps + # Optional: dbt build --target $DBT_TARGET_PROD + - dbt docs generate --target $DBT_TARGET_PROD + artifacts: + paths: + - target/ + expire_in: 7 days + rules: + - if: $CI_PIPELINE_SOURCE == "schedule" + - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH + +# Upload to Recce Cloud +recce-upload-prod: + stage: upload + image: python:3.11-slim + script: + - pip install recce-cloud + - recce-cloud upload --type prod + dependencies: + - prod-build + rules: + - if: $CI_PIPELINE_SOURCE == "schedule" + - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH +``` + +**Key points:** + +- Authentication is automatic via `CI_JOB_TOKEN` +- Configure schedule in **CI/CD → Schedules** (e.g., `0 2 * * *` for daily at 2 AM UTC) +- `recce-cloud upload --type prod` tells Recce this is a baseline session + +### Platform Comparison + +| Aspect | GitHub Actions | GitLab CI/CD | +| -------------------- | ----------------------------------- | ------------------------------------------------------------------------------ | +| **Config file** | `.github/workflows/cd-workflow.yml` | `.gitlab-ci.yml` | +| **Trigger on merge** | `on: push: branches: ["main"]` | `if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH` | +| **Schedule setup** | In workflow YAML (`schedule:`) | In UI: **CI/CD → Schedules** | +| **Authentication** | Automatic (`GITHUB_TOKEN`) | Automatic (`CI_JOB_TOKEN`) | +| **Manual trigger** | `workflow_dispatch:` | Pipeline run from UI | + +## Verification + +### Test the Workflow + +**GitHub:** + +1. Go to **Actions** tab → Select "Update Base Recce Session" +2. Click **Run workflow** → Monitor for completion + +**GitLab:** + +1. Go to **CI/CD → Pipelines** → Click **Run pipeline** +2. Select **main** branch → Monitor for completion + +### Verify Success + +Look for these indicators: + +- ✅ **Workflow/Pipeline completes** without errors +- ✅ **Base session updated** in [Recce Cloud](https://cloud.reccehq.com) + +**GitHub:** + +![Recce Cloud showing updated base sessions](../assets/images/7-cicd/verify-setup-github-cd.png){: .shadow} + +**GitLab:** + +![Recce Cloud showing updated base sessions](../assets/images/7-cicd/verify-setup-gitlab-cd.png){: .shadow} + +### Expected Output + +When the upload succeeds, you'll see output like this in your workflow logs: + +**GitHub:** + +```hl_lines="2 3 13" +─────────────────────────── CI Environment Detection ─────────────────────────── +Platform: github-actions +Session Type: prod +Commit SHA: def456ab... +Source Branch: main +Repository: your-org/your-repo +Info: Using GITHUB_TOKEN for platform-specific authentication +────────────────────────── Creating/touching session ─────────────────────────── +Session ID: abc123-def456-ghi789 +Uploading manifest from path "target/manifest.json" +Uploading catalog from path "target/catalog.json" +Notifying upload completion... +──────────────────────────── Uploaded Successfully ───────────────────────────── +Uploaded dbt artifacts to Recce Cloud for session ID "abc123-def456-ghi789" +Artifacts from: "/home/runner/work/your-repo/your-repo/target" +``` + +**GitLab:** + +```hl_lines="2 3 13" +─────────────────────────── CI Environment Detection ─────────────────────────── +Platform: gitlab-ci +Session Type: prod +Commit SHA: a1b2c3d4... +Source Branch: main +Repository: your-org/your-project +Info: Using CI_JOB_TOKEN for platform-specific authentication +────────────────────────── Creating/touching session ─────────────────────────── +Session ID: abc123-def456-ghi789 +Uploading manifest from path "target/manifest.json" +Uploading catalog from path "target/catalog.json" +Notifying upload completion... +──────────────────────────── Uploaded Successfully ───────────────────────────── +Uploaded dbt artifacts to Recce Cloud for session ID "abc123-def456-ghi789" +Artifacts from: "/builds/your-org/your-project/target" ``` -### 2. Artifact Preparation Options +## Advanced Options + +### Custom Artifact Path + +If your dbt artifacts are in a non-standard location: + +```bash +recce-cloud upload --type prod --target-path custom-target +``` + +### External Artifact Sources + +You can download artifacts from external sources before uploading: + +```yaml +# GitHub example +- name: Download from dbt Cloud + run: | + # Your download logic here + # Artifacts should end up in target/ directory + +- name: Upload to Recce Cloud + run: | + pip install recce-cloud + recce-cloud upload --type prod +``` + +### Dry Run Testing + +Test your configuration without actually uploading: + +```bash +recce-cloud upload --type prod --dry-run +``` + +## Troubleshooting + +### Missing dbt artifacts + +**Error**: `Missing manifest.json` or `Missing catalog.json` + +**Solution**: Ensure `dbt docs generate` runs successfully before upload: + +**GitHub:** + +```yaml +- name: Prepare dbt artifacts + run: | + dbt deps + dbt docs generate --target prod # Required +``` + +**GitLab:** + +```yaml +prod-build: + script: + - dbt deps + - dbt docs generate --target $DBT_TARGET_PROD # Required + artifacts: + paths: + - target/ +``` + +### Authentication issues + +**Error**: `Failed to create session: 401 Unauthorized` + +**Solutions**: + +1. Verify your repository is connected in [Recce Cloud settings](https://cloud.reccehq.com/settings) +2. **For GitHub**: Ensure workflow has default `GITHUB_TOKEN` permissions +3. **For GitLab**: Verify project has GitLab integration configured + - Check that you've created a [Personal Access Token](../2-getting-started/gitlab-pat-guide.md) + - Ensure the token has appropriate scope (`api` or `read_api`) + - Verify the project is connected in Recce Cloud settings + +### Upload failures + +**Error**: `Failed to upload manifest/catalog` + +**Solutions**: -**Default: Fresh Build** (shown in example above) +1. Check network connectivity to Recce Cloud +2. Verify artifact files exist in `target/` directory +3. Review workflow/pipeline logs for detailed error messages +4. **For GitLab**: Ensure artifacts are passed between jobs: -- `dbt docs generate` is required and provides the needed `manifest.json` and `catalog.json` artifacts. -- `dbt build` is optional but ensures tables are materialized and updated. + ```yaml + prod-build: + artifacts: + paths: + - target/ # Must include dbt artifacts -**Alternative Methods:** + recce-upload-prod: + dependencies: + - prod-build # Required to access artifacts + ``` -- **External Download**: Download from dbt Cloud, Paradime, or other platforms -- **Pipeline Integration**: Use existing dbt build workflows +### Session not appearing +**Issue**: Upload succeeds but session doesn't appear in Recce Cloud -### 3. Verification +**Solutions**: -#### Manual Trigger Test +1. Check you're viewing the correct repository in Recce Cloud +2. Verify you're looking at the production/base sessions (not PR/MR sessions) +3. Check session filters in Recce Cloud (may be hidden by filters) +4. Refresh the Recce Cloud page -1. Go to **Actions** tab in your repository -2. Select "Update Base Recce Session" workflow -3. Click **Run workflow** button -4. Monitor the run for successful completion +### Schedule not triggering (GitLab only) -#### Verify Success +**Issue**: Scheduled pipeline doesn't run -- ✅ **Workflow completes** without errors in Actions tab -- ✅ **Base session updated** in Recce Cloud +**Solutions**: -![Recce Cloud showing updated base sessions](../assets/images/7-cicd/verify-setup-cd.png){: .shadow} +1. Verify schedule is **Active** in CI/CD → Schedules +2. Check schedule timezone settings (UTC by default) +3. Ensure target branch (`main`) exists +4. Review project's CI/CD minutes quota +5. Verify schedule owner has appropriate permissions ## Next Steps -**[Setup CI](./setup-ci.md)** to automatically validate PR changes against your updated base session. This completes your CI/CD pipeline by adding automated data validation for every pull request. +**[Setup CI](./setup-ci.md)** to automatically validate PR/MR changes against your updated base session. This completes your CI/CD pipeline by adding automated data validation for every pull request or merge request. diff --git a/docs/7-cicd/setup-ci.md b/docs/7-cicd/setup-ci.md index 4a9e9b0..38b4039 100644 --- a/docs/7-cicd/setup-ci.md +++ b/docs/7-cicd/setup-ci.md @@ -2,29 +2,34 @@ title: Setup CI --- -# Setup CI +# Setup CI - Auto-Validate PRs/MRs -Automatically validate your data changes in every pull request using Recce Cloud. Catch data issues before they reach production, with validation results right in your PR. +Automatically validate your data changes in every pull request or merge request using Recce Cloud. Catch data issues before they reach production, with validation results right in your PR/MR. -## Purpose +## What This Does -**Automated PR Validation** prevents data regressions before merge. +**Automated PR/MR Validation** prevents data regressions before merge: -- **Triggers**: PR opened/updated against main -- **Action**: Auto-update Recce session for PR validation -- **Benefit**: Automated data validation and comparison +- **Triggers**: PR/MR opened or updated against main +- **Action**: Auto-update Recce session for validation +- **Benefit**: Automated data validation and comparison visible in your PR/MR ## Prerequisites -You need `manifest.json` and `catalog.json` files (dbt artifacts) for Recce Cloud. See [Start Free with Cloud](../2-getting-started/start-free-with-cloud.md) for instructions on preparing these files. +Before setting up CI, ensure you have: -## Implementation +- ✅ **Recce Cloud account** - [Start free trial](https://cloud.reccehq.com/) +- ✅ **Repository connected** to Recce Cloud - [Git integration guide](../2-getting-started/start-free-with-cloud.md#git-integration) +- ✅ **dbt artifacts** - Know how to generate `manifest.json` and `catalog.json` from your dbt project +- ✅ **CD configured** - [Setup CD](./setup-cd.md) to establish baseline for comparisons -### 1. Core Workflow +## Setup + +### GitHub Actions Create `.github/workflows/ci-workflow.yml`: -```yaml +```yaml linenums="1" hl_lines="41-42" name: Validate PR Changes on: @@ -53,59 +58,218 @@ jobs: cache: "pip" - name: Install dependencies - run: | - pip install -r requirements.txt + run: pip install -r requirements.txt - # Step 1: Prepare current branch artifacts - name: Build current branch artifacts run: | - # Install dbt packages dbt deps - - # Optional: Build tables to ensure they're materialized - # dbt build --target ci - - # Required: Generate artifacts for comparison + # Optional: dbt build --target ci dbt docs generate --target ci env: DBT_ENV_SECRET_KEY: ${{ secrets.DBT_ENV_SECRET_KEY }} - - name: Update Recce PR Session - uses: DataRecce/recce-cloud-cicd-action@v0.1 - # This action automatically creates a PR session in Recce Cloud + - name: Upload to Recce Cloud + run: | + pip install recce-cloud + recce-cloud upload +``` + +**Key points:** + +- Authentication is automatic via `GITHUB_TOKEN` +- `recce-cloud upload` (without `--type`) auto-detects this is a PR session +- `dbt docs generate` creates the required `manifest.json` and `catalog.json` + +### GitLab CI/CD + +Add to your `.gitlab-ci.yml`: + +```yaml linenums="1" hl_lines="29-30" +stages: + - build + - upload + +variables: + DBT_TARGET: "ci" + +# MR build - runs on merge requests +dbt-build: + stage: build + image: python:3.11-slim + script: + - pip install -r requirements.txt + - dbt deps + # Optional: dbt build --target $DBT_TARGET + - dbt docs generate --target $DBT_TARGET + artifacts: + paths: + - target/ + expire_in: 1 week + rules: + - if: $CI_PIPELINE_SOURCE == "merge_request_event" + +# Upload to Recce Cloud +recce-upload: + stage: upload + image: python:3.11-slim + script: + - pip install recce-cloud + - recce-cloud upload + dependencies: + - dbt-build + rules: + - if: $CI_PIPELINE_SOURCE == "merge_request_event" ``` -### 2. Artifact Preparation Options +**Key points:** -**Default: Fresh Build** (shown in example above) +- Authentication is automatic via `CI_JOB_TOKEN` +- `recce-cloud upload` (without `--type`) auto-detects this is an MR session +- `dbt docs generate` creates the required `manifest.json` and `catalog.json` -- `dbt docs generate` is required and provides all needed artifacts. -- `dbt build` is optional but ensures tables are materialized and updated. +### Platform Comparison -**Alternative Methods:** +| Aspect | GitHub Actions | GitLab CI/CD | +| -------------------- | ----------------------------------- | -------------------------------------------------- | +| **Config file** | `.github/workflows/ci-workflow.yml` | `.gitlab-ci.yml` | +| **Trigger** | `on: pull_request:` | `if: $CI_PIPELINE_SOURCE == "merge_request_event"` | +| **Authentication** | Automatic (`GITHUB_TOKEN`) | Automatic (`CI_JOB_TOKEN`) | +| **Session type** | Auto-detected from PR context | Auto-detected from MR context | +| **Artifact passing** | Not needed (single job) | Use `artifacts:` + `dependencies:` | -- **External Download**: Download from dbt Cloud, Paradime, or other platforms -- **Pipeline Integration**: Use existing dbt build workflows +## Verification -### 3. Verification +### Test with a PR/MR -#### Test with a PR +**GitHub:** 1. Create a test PR with small data changes 2. Check **Actions** tab for CI workflow execution 3. Verify validation runs successfully -#### Verify Success +**GitLab:** + +1. Create a test MR with small data changes +2. Check **CI/CD → Pipelines** for workflow execution +3. Verify validation runs successfully + +### Verify Success + +Look for these indicators: + +- ✅ **Workflow/Pipeline completes** without errors +- ✅ **PR/MR session created** in [Recce Cloud](https://cloud.reccehq.com) +- ✅ **Session URL** appears in workflow/pipeline output + +**GitHub:** + +![Recce Cloud showing PR validation session](../assets/images/7-cicd/verify-setup-github-ci.png){: .shadow} + +**GitLab:** + +![Recce Cloud showing MR validation session](../assets/images/7-cicd/verify-setup-gitlab-ci.png){: .shadow} + +### Expected Output + +When the upload succeeds, you'll see output like this in your workflow logs: + +**GitHub:** + +```hl_lines="2 5 16" +─────────────────────────── CI Environment Detection ─────────────────────────── +Platform: github-actions +PR Number: 42 +PR URL: https://github.com/your-org/your-repo/pull/42 +Session Type: cr +Commit SHA: abc123de... +Base Branch: main +Source Branch: feature/your-feature +Repository: your-org/your-repo +Info: Using GITHUB_TOKEN for platform-specific authentication +────────────────────────── Creating/touching session ─────────────────────────── +Session ID: f8b0f7ca-ea59-411d-abd8-88b80b9f87ad +Uploading manifest from path "target/manifest.json" +Uploading catalog from path "target/catalog.json" +Notifying upload completion... +──────────────────────────── Uploaded Successfully ───────────────────────────── +Uploaded dbt artifacts to Recce Cloud for session ID "f8b0f7ca-ea59-411d-abd8-88b80b9f87ad" +Artifacts from: "/home/runner/work/your-repo/your-repo/target" +Change request: https://github.com/your-org/your-repo/pull/42 +``` + +**GitLab:** + +```hl_lines="2 5 16" +─────────────────────────── CI Environment Detection ─────────────────────────── +Platform: gitlab-ci +MR Number: 4 +MR URL: https://gitlab.com/your-org/your-project/-/merge_requests/4 +Session Type: cr +Commit SHA: c928e3d5... +Base Branch: main +Source Branch: feature/your-feature +Repository: your-org/your-project +Info: Using CI_JOB_TOKEN for platform-specific authentication +────────────────────────── Creating/touching session ─────────────────────────── +Session ID: f8b0f7ca-ea59-411d-abd8-88b80b9f87ad +Uploading manifest from path "target/manifest.json" +Uploading catalog from path "target/catalog.json" +Notifying upload completion... +──────────────────────────── Uploaded Successfully ───────────────────────────── +Uploaded dbt artifacts to Recce Cloud for session ID "f8b0f7ca-ea59-411d-abd8-88b80b9f87ad" +Artifacts from: "/builds/your-org/your-project/target" +Change request: https://gitlab.com/your-org/your-project/-/merge_requests/4 +``` + +### Review PR/MR Session + +To analyze the changes in detail: + +1. Go to your [Recce Cloud](https://cloud.reccehq.com) +2. Find the PR/MR session that was created +3. Launch Recce instance to explore data differences + +## Advanced Options + +### Custom Artifact Path + +If your dbt artifacts are in a non-standard location: + +```bash +recce-cloud upload --target-path custom-target +``` + +### Dry Run Testing + +Test your configuration without actually uploading: + +```bash +recce-cloud upload --dry-run +``` + +## Troubleshooting + +If CI is not working, the issue is likely in your CD setup. Most problems are shared between CI and CD: + +**Common issues:** + +- Missing dbt artifacts +- Authentication failures +- Upload errors +- Sessions not appearing + +**→ See the [Setup CD Troubleshooting section](./setup-cd.md#troubleshooting)** for detailed solutions. -- ✅ **Workflow completes** without errors in Actions tab -- ✅ **PR session updated** in Recce Cloud +**CI-specific tip:** If CD works but CI doesn't, verify: -![Recce Cloud showing PR validation session](../assets/images/7-cicd/verify-setup-ci.png){: .shadow} +1. PR/MR trigger conditions in your workflow configuration +2. The PR/MR is targeting the correct base branch (usually `main`) +3. You're looking at PR/MR sessions in Recce Cloud (not production sessions) -#### Review PR Session +## Next Steps -To analyze the PR changes in detail: +After setting up CI, explore these workflow guides: -- Go to your [Recce Cloud](https://cloud.reccehq.com) -- Find the PR session that was created -- Launch Recce instance to explore data differences +- [PR/MR review workflow](./scenario-pr-review.md) - Collaborate with teammates using Recce +- [Preset checks](./preset-checks.md) - Configure automatic validation checks +- [Best practices](./best-practices-prep-env.md) - Environment preparation tips diff --git a/docs/assets/images/7-cicd/verify-setup-cd.png b/docs/assets/images/7-cicd/verify-setup-cd.png deleted file mode 100644 index 487d8ab..0000000 Binary files a/docs/assets/images/7-cicd/verify-setup-cd.png and /dev/null differ diff --git a/docs/assets/images/7-cicd/verify-setup-ci.png b/docs/assets/images/7-cicd/verify-setup-ci.png deleted file mode 100644 index 1c9c4b6..0000000 Binary files a/docs/assets/images/7-cicd/verify-setup-ci.png and /dev/null differ diff --git a/docs/assets/images/7-cicd/verify-setup-github-cd.png b/docs/assets/images/7-cicd/verify-setup-github-cd.png new file mode 100644 index 0000000..3192eb3 Binary files /dev/null and b/docs/assets/images/7-cicd/verify-setup-github-cd.png differ diff --git a/docs/assets/images/7-cicd/verify-setup-github-ci.png b/docs/assets/images/7-cicd/verify-setup-github-ci.png new file mode 100644 index 0000000..2f61afe Binary files /dev/null and b/docs/assets/images/7-cicd/verify-setup-github-ci.png differ diff --git a/docs/assets/images/7-cicd/verify-setup-gitlab-cd.png b/docs/assets/images/7-cicd/verify-setup-gitlab-cd.png new file mode 100644 index 0000000..f7cd363 Binary files /dev/null and b/docs/assets/images/7-cicd/verify-setup-gitlab-cd.png differ diff --git a/docs/assets/images/7-cicd/verify-setup-gitlab-ci.png b/docs/assets/images/7-cicd/verify-setup-gitlab-ci.png new file mode 100644 index 0000000..0321f62 Binary files /dev/null and b/docs/assets/images/7-cicd/verify-setup-gitlab-ci.png differ diff --git a/mkdocs.yml b/mkdocs.yml index 478e23b..3854676 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -76,29 +76,15 @@ nav: - 6-collaboration/checklist.md - 6-collaboration/share.md - CI/CD: - # Brief intro explaining GitHub vs GitLab paths - 7-cicd/ci-cd-getting-started.md - - # GitHub CI/CD - - Using GitHub: - - 7-cicd/github/setup-ci.md - #- 7-cicd/cloud-recce-summary.md - #- 7-cicd/cloud-preset-checks.md -# - 7-cicd/github/scenario-ci.md - - 7-cicd/github/setup-cd.md - - # GitLab CI/CD - - Using GitLab: - - 7-cicd/gitlab/setup-ci.md - - 7-cicd/gitlab/setup-cd.md - - 7-cicd/gitlab/gitlab-pat-guide.md + - 7-cicd/setup-cd.md + - 7-cicd/setup-ci.md - 7-cicd/pr_mr_summary.md #- 7-cicd/recce-debug.md # content outdated - 7-cicd/scenario-dev.md - 7-cicd/scenario-pr-review.md - - 7-cicd/best-practices-prep-env.md - #- 7-cicd/recce-summary.md content outdated - 7-cicd/preset-checks.md + - 7-cicd/best-practices-prep-env.md - Technical Concepts: - 8-technical-concepts/state-file.md