Skip to content

[ci](fix) license check#62534

Merged
hello-stephen merged 3 commits intoapache:masterfrom
hello-stephen:0415-license-check
Apr 16, 2026
Merged

[ci](fix) license check#62534
hello-stephen merged 3 commits intoapache:masterfrom
hello-stephen:0415-license-check

Conversation

@hello-stephen
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Copy Markdown
Contributor Author

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Copy link
Copy Markdown
Contributor Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor Author

skip buildall

zclllyybb
zclllyybb previously approved these changes Apr 15, 2026
@zclllyybb
Copy link
Copy Markdown
Contributor

/review

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 15, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  1. : changed-file discovery now relies on . GitHub documents that endpoint as returning a maximum of 3000 files total, so and are truncated on very large PRs. In that case the incremental license config silently skips files after the cutoff, and the workflow can falsely pass unlicensed additions. The previous local path did not have this cap.
  2. : the previous implementation intentionally fell back to a full scan when changed-file discovery failed. Under the default Actions shell (), a transient failure now aborts the step before any fallback is written, turning a recoverable discovery failure into a hard CI failure.

Critical Checkpoint Conclusions

  • Goal of the change: fix license-check failures by adding missing headers to Cloud , exempting non-source artifacts, and tightening PR changed-file detection. The header and ignore-list updates look correct for the touched files, but the workflow rewrite does not fully achieve the goal because it can miss changed files on large PRs and is less robust on discovery failures.
  • Small, clear, focused: mostly yes, but the workflow change introduces a repo-wide CI behavior regression beyond the touched Cloud files.
  • Concurrency: not applicable for these YAML, CMake, and config-file changes.
  • Lifecycle and static initialization: not applicable.
  • Configuration items: no runtime configuration added.
  • Compatibility and incompatible changes: no FE/BE/storage compatibility impact.
  • Parallel code paths: the push and manual paths still run the full scan; the PR path now becomes less complete than those paths.
  • Special conditional checks: the changed-file filtering now depends on a REST API contract with a hard file-count limit and needs an uncapped or fallback approach.
  • Test coverage: I did not see automated coverage or validation for the new changed-file enumeration path.
  • Test result changes: none.
  • Observability: sufficient for this scope.
  • Transaction, persistence, data writes, and FE-BE variable passing: not applicable.
  • Performance: the new API approach is lighter than fetching the base ref, but correctness regresses; correctness should take priority here.
  • Other issues: I did not find problems in the added Cloud license headers or in the new ignore entries themselves.

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking findings:

  1. .github/workflows/license-eyes.yml: changed-file discovery now relies on GET /pulls/{number}/files, which GitHub documents as returning at most 3000 files total even with pagination. Large PRs will silently skip license checks on files past that cap, and the previous fallback to a full .licenserc.yaml scan on enumeration failure is also gone.
  2. .licenserc.yaml: conf/ubsan_ignorelist.txt and extension/dbt-doris/dev-requirements.txt are Doris-owned text configs that support # comments. Ignoring them weakens header enforcement instead of fixing the missing headers in-place.

Critical checkpoints:

  • Goal of current task: Partially achieved. The PR adds missing ASF headers to Cloud CMakeLists.txt files and updates the workflow/config, but the new workflow regresses correctness for large PRs and the config broadens ignore coverage for first-party files.
  • Minimality/focus: Mostly focused on license-check remediation, but the two new ignores above are broader than necessary.
  • Concurrency: N/A for these CMake/CI/config changes; no concurrency-sensitive runtime code is touched.
  • Lifecycle/static initialization: N/A.
  • Configuration changes: Yes. The workflow/config changes are not fully safe because changed-file enumeration can now be incomplete or fail hard.
  • Compatibility/incompatible change: No runtime, storage, or FE/BE protocol compatibility impact.
  • Parallel code paths: The same pull_request_target license-check path is used for future PRs, so the changed-file enumeration regression affects all large PRs on that path.
  • Special conditional checks: The .licenserc.yaml shortcut itself is fine, but the underlying file list source is now bounded by the API limit.
  • Test coverage: No tests were added. For CI-only changes that can be acceptable, but there is no validation here for large-PR or API-failure behavior.
  • Test result changes: None.
  • Observability: No additional observability is needed; the main issue is correctness/reliability.
  • Transaction/persistence/data writes/FE-BE variable passing: N/A.
  • Performance: Acceptable; the concern is correctness and robustness, not runtime cost.
  • Other issues: None beyond the two findings above.

Requesting changes.

Comment thread .github/workflows/license-eyes.yml Outdated
echo "config_file=.licenserc.yaml" >> "$GITHUB_OUTPUT"
exit 0
fi
all=$(gh api repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}/files \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using GET /pulls/{number}/files here introduces a correctness regression. GitHub documents that this endpoint returns at most 3000 files total even with pagination, so on any larger PR all/acmr becomes incomplete and skywalking-eyes will silently skip license checking for files past that cap. It also removes the previous fallback to a full .licenserc.yaml scan when enumeration fails, because a transient gh api error now aborts the step under bash -e. Could we keep the local git diff "${base_sha}...HEAD" path (or another uncapped source) and only use the API as a fallback?

Comment thread .licenserc.yaml
- "docker/thirdparties/docker-compose/iceberg/spark-defaults.conf.tpl"
- "docker/thirdparties/docker-compose/postgresql/certs/**"
- "conf/mysql_ssl_default_certificate/*"
- "conf/mysql_ssl_default_certificate/client_certificate/ca.pem"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think these first-party text configs should be excluded from header checking. Both conf/ubsan_ignorelist.txt and extension/dbt-doris/dev-requirements.txt accept # comments, and comparable files in-tree already carry ASF headers (pytest/requirements.txt, tools/cost_model_evaluate/requirements.txt, extension/dbt-doris/tox.ini, etc.). Adding them to paths-ignore weakens license coverage instead of fixing the missing headers in-place.

yiguolei
yiguolei previously approved these changes Apr 16, 2026
@hello-stephen hello-stephen dismissed stale reviews from yiguolei and zclllyybb via 9199eca April 16, 2026 03:27
@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 16, 2026
@hello-stephen
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking finding:

  1. .github/workflows/license-eyes.yml: the PR path now makes a second gh api pulls/{number}/files call to compute acmr, but unlike the first call it has no fallback. GitHub Actions runs run: steps under bash -e, so a transient API failure here (for example a 5xx or secondary rate limit) aborts the job even though we already had enough information to fall back to the full .licenserc.yaml scan. The old local git diff --diff-filter=ACMR path did not add this extra remote-failure point.

Critical checkpoint conclusions:

  • Goal of current task: Mostly achieved. The PR adds the missing ASF headers and the new ignore entries for the touched data/license files look reasonable, but the workflow rewrite still leaves the PR license-check path less robust than intended.
  • Minimality/focus: Yes. The changes are focused on license-check remediation.
  • Concurrency: N/A for these workflow/config/CMake/text-file changes.
  • Lifecycle/static initialization: N/A.
  • Configuration changes: Yes. The workflow/config behavior changed, and the remaining issue is in that CI configuration path.
  • Compatibility/incompatible change: No runtime, storage, or FE/BE compatibility impact.
  • Parallel code paths: The pull_request_target incremental-check path is affected; push/manual paths still use the full scan.
  • Special conditional checks: The new .licenserc.yaml and >= 3000 files fallbacks are good. The remaining concern is the unguarded second API call.
  • Test coverage: No dedicated validation was added for the new workflow branch or its failure handling.
  • Test result changes: None.
  • Observability: Existing workflow logs are sufficient for this scope.
  • Transaction/persistence/data writes/FE-BE variable passing: N/A.
  • Performance: Acceptable overall, but the extra API call adds avoidable failure surface.
  • Other issues: I did not find problems in the added ASF headers themselves.

Requesting changes.

Comment thread .github/workflows/license-eyes.yml Outdated
@hello-stephen
Copy link
Copy Markdown
Contributor Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 16, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking findings.

Critical checkpoint conclusions:

  • Goal of the current task: Achieved. The PR adds the missing ASF headers to the reported files and updates the pull_request_target license workflow to enumerate changed files through the GitHub API, with full-scan fallbacks for API failure, .licenserc.yaml changes, and the documented 3000-file API cap. The current License Check run passed on this PR.
  • Minimality/focus: Yes. The change stays narrowly scoped to license-check remediation.
  • Concurrency: Not applicable for these workflow, config, CMake, and text-file changes.
  • Lifecycle/static initialization: Not applicable.
  • Configuration changes: Yes, CI configuration changed. The new behavior is dynamically effective without restart and retains a safe full-scan fallback when incremental enumeration is incomplete or unavailable.
  • Compatibility/incompatible changes: No runtime, FE/BE protocol, storage-format, or rolling-upgrade impact.
  • Parallel code paths: The pull_request_target incremental path remains aligned with the push/manual full-scan behavior through the fallback path; I did not find another parallel path that also needs updating.
  • Special conditional checks: The .licenserc.yaml and >= 3000 files guards are appropriate because they prevent running an incomplete incremental scan.
  • Test coverage: No dedicated tests were added, which is acceptable for this CI-only fix. Residual risk is limited to fallback branches that are not directly exercised by this PR.
  • Test result changes: None.
  • Observability: Existing workflow logs are sufficient for this scope.
  • Transaction/persistence/data writes/FE-BE variable passing: Not applicable.
  • Performance: Acceptable. The API-based file enumeration avoids the extra base-branch fetch and is not on a product hot path.
  • Other issues: I did not find blocking problems in the new .licenserc.yaml ignore entries or the added ASF headers.

From an automated review perspective, this PR is approvable.

@hello-stephen
Copy link
Copy Markdown
Contributor Author

skip buildall

@hello-stephen hello-stephen merged commit 1e0af3f into apache:master Apr 16, 2026
32 checks passed
@hello-stephen hello-stephen deleted the 0415-license-check branch April 16, 2026 06:44
hakanuzum added a commit to hakanuzum/doris that referenced this pull request Apr 16, 2026
### What problem does this PR solve?

Issue Number:

Related PR: apache#62536

Problem Summary: Empty commit to retrigger GitHub Actions so License Check uses the updated workflow merged in apache#62534.

### Release note

None

### Check List (For Author)

- Test: No need to test (CI)
- Behavior changed: No
- Does this need documentation: No
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants