Skip to content

Conversation

anlowee
Copy link
Member

@anlowee anlowee commented Jun 19, 2025

Description

As mentioned in #8 :
For clang-15 build, there is an error cause the CI failure. By gcc --help=warnings | grep conversion there is no -Wimplicit-const-int-float-conversion, but in clang, there is such a flag (added in commit). So the root cause is the way that velox configures the flags is first using -Wall then manually disabling some specific flags, so this -Wimplicit-const-int-float-conversion is missed for clang due to the difference of sets of warning flags. This is why it can be built with gcc but not with clang. Since we don't modify the code, this issue should also exist in main repo's 0.293, will try to submit a fix PR to the OSS as well.

A few more things about the error mentioned above. After applying the fix similar to this commit, there is another place triggers the error (in this CI failure).

To fix this, to minimize the modification to the source code, we simply exempt this warning flag.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Passed the CI: https://github.com/y-scope/velox/actions/runs/15760719147/job/44426525776?pr=12

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Introduced pre-commit hook configuration for automated code quality checks.
    • Added configuration files for code style, linting, and static analysis.
    • Added internal Spark SQL function to enforce string length limits during writes.
  • Documentation

    • Updated coding style guide to describe pre-commit workflow.
    • Made extensive whitespace and formatting corrections across documentation.
    • Added or corrected license headers in several files.
  • Chores

    • Added or updated configuration files for CI workflows, formatting, and linting.
    • Improved environment and workflow YAML formatting.
    • Updated CI to run pre-commit checks and enhanced ccache handling.
  • Refactor

    • Removed Makefile targets and legacy scripts for formatting and header checks, consolidating to pre-commit.
    • Modernized type hints in Python interface files.
  • Style

    • Standardized formatting in CMake, Docker, and Python files.
    • Cleaned up whitespace and comments in various scripts and documentation.
  • Bug Fixes

    • Fixed minor typos in documentation and comments.
  • Tests

    • Updated test output formatting and variable naming for clarity.
  • Revert

    • None.

assignUser and others added 20 commits June 18, 2025 07:04
Summary:
This PR adds a [`.pre-commit-config.yaml`](https://pre-commit.com) with a number of hooks that mirror our previous `check.py` and some new ones to further improve code quality.

Hooks to replace check.py:
- cmake-format
- clang-format
- license header check
- ruff (python formatting and linting)
- clang-tidy (has to be called manually)

New Hooks:
- general quality checks
  - trailing white space removal
  - end of file fixer
  - large file check (is in the default config, we can probably drop it?)
  - check that files with shebangs are actually executable
  - and that executable scripts have shebangs
- Several checks to improve our GitHub Actions quality
  - zizmor, a security focused linter
  - action-validator, make sure actions are valid! (we have had issues with that before) also checks if paths globs are correct
  - yamlfmt/yamllint make sure we use a unified style in our yaml/workflows

I have added all fixes in this PR ~~but we could also merge them as a separate PR so it can be easier ignored when blaming.~~ Actually I think it is better to just handle everything in this PR to avoid causing a bunch of merge conflicts.

Compared to `check.py` this is also ***much*** faster. Running all hooks on all files in the repo takes under ten seconds!

Pull Request resolved: facebookincubator#13361

Reviewed By: kKPulla

Differential Revision: D76274374

Pulled By: kgpai

fbshipit-source-id: ba23f75fd1ff6e05d0e2a41d791028ac0220da56
Copy link

coderabbitai bot commented Jun 19, 2025

Walkthrough

This update introduces a comprehensive migration to pre-commit for code formatting, linting, and license header checks, replacing previous Makefile and Python script-based workflows. It adds and configures pre-commit, clang-tidy, yamllint, and related tools, updates documentation accordingly, and removes redundant scripts and Makefile targets. Numerous files receive minor formatting, whitespace, and licensing header corrections.

Changes

File(s) / Group Change Summary
.pre-commit-config.yaml, .clang-tidy, .yamllint.yml, .yamlfmt.yml Added new configuration files for pre-commit, clang-tidy, YAML linting, and YAML formatting.
Makefile, scripts/check.py, scripts/git-clang-format Removed Makefile targets and Python scripts for formatting, header, and Docker container checks; replaced by pre-commit workflow.
CODING_STYLE.md Updated documentation to describe pre-commit workflow, replacing previous Makefile/script instructions.
.github/workflows/preliminary_checks.yml Added new GitHub Actions workflow to run pre-commit on PRs and pushes.
.github/workflows/linux-build.yml, .github/workflows/linux-build-base.yml, .github/workflows/docs.yml, .github/workflows/scheduled.yml Reformatted YAML, updated triggers, added/adjusted ccache steps, switched to Clang for builds, removed workflow_dispatch.
.github/ISSUE_TEMPLATE/*.yml, .github/dependabot.yml, .github/CODEOWNERS Reformatted YAML: removed quotes, adjusted lists, fixed whitespace.
CMake/Findlz4.cmake, CMake/Findlzo2.cmake, CMake/Findzstd.cmake, velox/exec/tests/utils/TableScanTestBase.cpp Added Apache License 2.0 headers or updated license comments.
CMake/Findglog.cmake, CMake/FindSodium.cmake Reformatted CMake commands and/or added license headers.
CMakeLists.txt Added Clang warning suppression flag for -Wno-implicit-const-int-float-conversion.
CONTRIBUTING.md, NOTICE.txt, various docs and scripts Fixed trailing spaces, blank lines, or minor typos.
python/pyvelox/__init__.py, python/pyvelox/utils/__init__.py, python/pyvelox/utils/data_generator.py, python/test/test_plan_builder.py, python/test/test_vector.py Minor linter suppressions, import aliasing, whitespace, and test expectation formatting.
scripts/checks/license-header.py Changed default license header path.
scripts/checks/run-clang-tidy.py, scripts/checks/util.py, scripts/ci/benchmark-runner.py, scripts/ci/bm-report/build-metrics.py Minor import, type check, or exception handling improvements.
velox/python/plan_builder/plan_builder.pyi Switched type hints from List/Dict to built-in list/dict.
velox/docs/functions/spark/string.rst Added documentation for new internal function varchar_type_write_side_check.
Various velox/docs/, velox/python/*.pyi, docker-compose.yml, environment YAMLs, and other scripts Whitespace, formatting, or shebang line removals; minor doc updates.

Sequence Diagram(s)

sequenceDiagram
    participant Developer
    participant Pre-commit
    participant Git
    participant CI

    Developer->>Git: git commit / git push
    activate Pre-commit
    Pre-commit->>Developer: Run hooks (formatting, lint, license, etc.)
    Pre-commit->>Developer: Stash unstaged files, check/fix staged files
    Pre-commit-->>Developer: Block commit if changes made or issues found
    deactivate Pre-commit
    Git->>CI: Push triggers CI
    CI->>Pre-commit: Run all hooks on changed files
    Pre-commit->>CI: Report status
Loading

Suggested reviewers

  • kirkrodrigues

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65fc6b2 and ad575dc.

📒 Files selected for processing (1)
  • .github/workflows/linux-build.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
🔇 Additional comments (3)
.github/workflows/linux-build.yml (3)

20-20: Standardize branch name quoting in triggers.

The branch pattern is now unquoted (- presto-0.293-clp-connector), aligning with other workflows. This is syntactically valid in YAML and improves consistency.


22-30: Confirm removal of the third_party/** path filter.

The push trigger’s path list has been reformatted (quotes removed) and notably no longer includes third_party/**. Please verify that omitting third_party changes from triggering CI is intentional and doesn’t disable necessary builds for that directory.


35-44: Align pull_request path filters with push configuration.

The pull_request paths were likewise de-quoted and now exclude third_party/**. Confirm that this matches the intended trigger strategy and preserves coverage for critical code changes.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 27

🔭 Outside diff range comments (2)
.pre-commit-config.yaml (2)

1-117: Mismatch between PR objectives and the provided changes
The PR description indicates a fix for the clang build failure due to -Wimplicit-const-int-float-conversion in the CMake build setup, but this new file only adds pre-commit hooks for formatting and linting. No build-flag adjustments or CMake modifications are present here. Please ensure the intended warning-suppression changes are included in the appropriate build configuration.


113-117: Action-validator hook needs file filtering
By default it may run on all YAML files. You could scope it to .github/workflows/.*\.(yml|yaml)$ to limit its coverage to GitHub Actions workflows.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 82aa6bd and 65fc6b2.

⛔ Files ignored due to path filters (8)
  • static/icon.svg is excluded by !**/*.svg
  • static/logo.svg is excluded by !**/*.svg
  • website/static/img/banner-pattern.svg is excluded by !**/*.svg
  • website/static/img/icon-commits.svg is excluded by !**/*.svg
  • website/static/img/icon-contributors.svg is excluded by !**/*.svg
  • website/static/img/icon-github-star.svg is excluded by !**/*.svg
  • website/static/img/logo.svg is excluded by !**/*.svg
  • website/static/img/velox-logo.svg is excluded by !**/*.svg
📒 Files selected for processing (107)
  • .clang-tidy (1 hunks)
  • .cmake-format.yaml (2 hunks)
  • .github/CODEOWNERS (3 hunks)
  • .github/ISSUE_TEMPLATE/bug.yml (2 hunks)
  • .github/ISSUE_TEMPLATE/build.yml (2 hunks)
  • .github/ISSUE_TEMPLATE/enhancement.yml (1 hunks)
  • .github/ISSUE_TEMPLATE/fuzzer.yml (2 hunks)
  • .github/dependabot.yml (1 hunks)
  • .github/workflows/docs.yml (3 hunks)
  • .github/workflows/linux-build-base.yml (1 hunks)
  • .github/workflows/linux-build.yml (2 hunks)
  • .github/workflows/preliminary_checks.yml (1 hunks)
  • .github/workflows/scheduled.yml (1 hunks)
  • .github/zizmor.yml (1 hunks)
  • .pre-commit-config.yaml (1 hunks)
  • .yamlfmt.yml (1 hunks)
  • .yamllint.yml (1 hunks)
  • CMake/FindSodium.cmake (2 hunks)
  • CMake/Findglog.cmake (2 hunks)
  • CMake/Findlz4.cmake (1 hunks)
  • CMake/Findlzo2.cmake (1 hunks)
  • CMake/Findzstd.cmake (1 hunks)
  • CMakeLists.txt (1 hunks)
  • CODING_STYLE.md (1 hunks)
  • CONTRIBUTING.md (6 hunks)
  • Makefile (0 hunks)
  • NOTICE.txt (1 hunks)
  • docker-compose.yml (6 hunks)
  • pyproject.toml (0 hunks)
  • python/pyvelox/__init__.py (1 hunks)
  • python/pyvelox/utils/__init__.py (1 hunks)
  • python/pyvelox/utils/data_generator.py (1 hunks)
  • python/test/test_plan_builder.py (1 hunks)
  • python/test/test_vector.py (3 hunks)
  • scripts/check.py (0 hunks)
  • scripts/checks/license-header.py (1 hunks)
  • scripts/checks/run-clang-tidy.py (0 hunks)
  • scripts/checks/util.py (1 hunks)
  • scripts/ci/benchmark-runner.py (2 hunks)
  • scripts/ci/bm-report/build-metrics.py (1 hunks)
  • scripts/ci/bm-report/report.qmd (3 hunks)
  • scripts/ci/hdfs-client.xml (1 hunks)
  • scripts/ci/presto/etc/hive.properties (1 hunks)
  • scripts/ci/presto/start-prestojava.sh (0 hunks)
  • scripts/ci/signature.py (1 hunks)
  • scripts/docker/pyvelox.dockerfile (0 hunks)
  • scripts/git-clang-format (0 hunks)
  • scripts/setup-centos9.sh (0 hunks)
  • scripts/setup-classpath.sh (0 hunks)
  • scripts/setup-helper-functions.sh (0 hunks)
  • scripts/velox_env_linux.yml (2 hunks)
  • scripts/velox_env_mac.yml (2 hunks)
  • velox/docs/bindings/python/arrow.rst (1 hunks)
  • velox/docs/bindings/python/index.rst (0 hunks)
  • velox/docs/bindings/python/legacy.rst (1 hunks)
  • velox/docs/bindings/python/runners.rst (1 hunks)
  • velox/docs/bindings/python/vector.rst (1 hunks)
  • velox/docs/conf.py (1 hunks)
  • velox/docs/develop/aggregations.rst (3 hunks)
  • velox/docs/develop/debugging/print-plan-with-stats.rst (0 hunks)
  • velox/docs/develop/dynamic-loading.rst (1 hunks)
  • velox/docs/develop/operators.rst (4 hunks)
  • velox/docs/ext/issue.py (1 hunks)
  • velox/docs/ext/pr.py (1 hunks)
  • velox/docs/ext/spark.py (1 hunks)
  • velox/docs/functions/presto/binary.rst (2 hunks)
  • velox/docs/functions/presto/window.rst (1 hunks)
  • velox/docs/functions/spark/aggregate.rst (3 hunks)
  • velox/docs/functions/spark/array.rst (1 hunks)
  • velox/docs/functions/spark/binary.rst (1 hunks)
  • velox/docs/functions/spark/bitwise.rst (2 hunks)
  • velox/docs/functions/spark/comparison.rst (2 hunks)
  • velox/docs/functions/spark/coverage.rst (1 hunks)
  • velox/docs/functions/spark/datetime.rst (0 hunks)
  • velox/docs/functions/spark/decimal.rst (1 hunks)
  • velox/docs/functions/spark/json.rst (2 hunks)
  • velox/docs/functions/spark/misc.rst (1 hunks)
  • velox/docs/functions/spark/string.rst (10 hunks)
  • velox/docs/monitoring.rst (1 hunks)
  • velox/docs/monthly-updates/2021/december-2021.rst (1 hunks)
  • velox/docs/monthly-updates/2021/november-2021.rst (1 hunks)
  • velox/docs/monthly-updates/2022/april-2022.rst (1 hunks)
  • velox/docs/monthly-updates/2022/august-2022.rst (1 hunks)
  • velox/docs/monthly-updates/2022/july-2022.rst (1 hunks)
  • velox/docs/monthly-updates/2022/june-2022.rst (1 hunks)
  • velox/docs/monthly-updates/2022/may-2022.rst (1 hunks)
  • velox/docs/monthly-updates/2022/october-2022.rst (1 hunks)
  • velox/docs/monthly-updates/2023/august-2023.rst (1 hunks)
  • velox/docs/monthly-updates/2023/december-2023.rst (1 hunks)
  • velox/docs/monthly-updates/2023/november-2023.rst (1 hunks)
  • velox/docs/monthly-updates/january-2024.rst (0 hunks)
  • velox/docs/monthly-updates/july-2024.rst (1 hunks)
  • velox/dwio/common/tests/Lemire/FastPFor/LICENSE (1 hunks)
  • velox/dwio/dwrf/test/CommonTests.cpp (2 hunks)
  • velox/dwio/parquet/thrift/ParquetThriftTypes.cpp (28 hunks)
  • velox/exec/tests/utils/TableScanTestBase.cpp (1 hunks)
  • velox/experimental/breeze/test/generator_common.py (1 hunks)
  • velox/experimental/breeze/test/kernel_generator.py (2 hunks)
  • velox/experimental/breeze/test/test_fixture_generator.py (5 hunks)
  • velox/experimental/cudf/.clang-format (1 hunks)
  • velox/experimental/wave/README.md (1 hunks)
  • velox/functions/prestosql/coverage/README.md (1 hunks)
  • velox/python/arrow/arrow.pyi (0 hunks)
  • velox/python/file/file.pyi (0 hunks)
  • velox/python/plan_builder/plan_builder.pyi (2 hunks)
  • velox/python/runner/runner.pyi (0 hunks)
  • velox/python/type/type.pyi (0 hunks)
⛔ Files not processed due to max files limit (17)
  • velox/python/vector/vector.pyi
  • velox/substrait/tests/data/q1_first_stage.json
  • velox/substrait/tests/data/q6_first_stage.json
  • website/README.md
  • website/blog/2023-03-15-build-experience.mdx
  • website/blog/2024-05-31-optimize-try-more.mdx
  • website/blog/2024-08-23-ci-migration.mdx
  • website/blog/2025-03-25-velox-primer-part-2.mdx
  • website/blog/2025-05-12-velox-primer-part-3.mdx
  • website/blog/authors.yml
  • website/docs/community/01-design-philosophy.md
  • website/docs/community/02-technical-governance.md
  • website/docs/community/03-components-and-maintainers.md
  • website/docs/community/index.md
  • website/src/components/HomepageFeatures/index.js
  • website/src/components/VeloxConBanner/index.js
  • website/src/pages/index.js
💤 Files with no reviewable changes (18)
  • scripts/docker/pyvelox.dockerfile
  • velox/python/runner/runner.pyi
  • velox/docs/monthly-updates/january-2024.rst
  • scripts/checks/run-clang-tidy.py
  • velox/python/arrow/arrow.pyi
  • velox/docs/develop/debugging/print-plan-with-stats.rst
  • scripts/setup-centos9.sh
  • scripts/ci/presto/start-prestojava.sh
  • velox/python/file/file.pyi
  • pyproject.toml
  • velox/python/type/type.pyi
  • scripts/setup-classpath.sh
  • velox/docs/functions/spark/datetime.rst
  • scripts/setup-helper-functions.sh
  • velox/docs/bindings/python/index.rst
  • Makefile
  • scripts/git-clang-format
  • scripts/check.py
🧰 Additional context used
🧠 Learnings (1)
CONTRIBUTING.md (1)
Learnt from: anlowee
PR: y-scope/velox#1
File: CMakeLists.txt:458-483
Timestamp: 2025-03-10T05:55:18.722Z
Learning: When contributing to the Velox project, prefer adding new code on top of existing scripts rather than restructuring the original code organization, especially for dependency management.
🪛 Pylint (3.3.7)
velox/docs/conf.py

[warning] 33-34: No exception type(s) specified

(W0702)

python/pyvelox/utils/__init__.py

[convention] 15-15: Import alias does not rename original package

(C0414)

python/pyvelox/utils/data_generator.py

[convention] 160-160: Line too long (122/100)

(C0301)


[error] 160-160: Instance of 'dict' has no 'row_count' member

(E1101)


[error] 160-160: Instance of 'dict' has no 'file_count' member

(E1101)


[error] 160-160: Instance of 'dict' has no 'output_path' member

(E1101)

scripts/checks/util.py

[convention] 40-40: Use isinstance() rather than type() for a typecheck.

(C0123)

velox/experimental/breeze/test/generator_common.py

[error] 74-74: Unable to import 'clang.cindex'

(E0401)


[convention] 74-74: Import "import clang.cindex" should be placed at the top of the module

(C0413)


[error] 75-75: Unable to import 'clang.cindex'

(E0401)


[convention] 75-75: Import "from clang.cindex import CursorKind" should be placed at the top of the module

(C0413)

python/pyvelox/__init__.py

[error] 15-15: Unable to import 'pyvelox.legacy'

(E0401)


[error] 16-16: Unable to import 'pyvelox.legacy'

(E0401)


[convention] 16-16: Import alias does not rename original package

(C0414)

scripts/ci/benchmark-runner.py

[warning] 84-84: Catching too general exception Exception

(W0718)

scripts/ci/signature.py

[convention] 149-149: Line too long (143/100)

(C0301)

python/test/test_vector.py

[error] 191-191: Module 'pyvelox' has no 'from_list' member

(E1101)


[error] 194-194: Module 'pyvelox' has no 'from_list' member

(E1101)


[error] 197-197: Module 'pyvelox' has no 'from_list' member

(E1101)


[error] 274-274: Module 'pyvelox' has no 'from_list' member

(E1101)


[error] 275-275: Module 'pyvelox' has no 'from_list' member

(E1101)

🪛 Ruff (0.11.9)
scripts/checks/license-header.py

32-32: Trailing comma missing

Add trailing comma

(COM812)

python/pyvelox/utils/data_generator.py

160-160: Logging statement uses f-string

(G004)


160-160: Trailing comma missing

Add trailing comma

(COM812)

scripts/ci/benchmark-runner.py

84-84: Do not catch blind exception: Exception

(BLE001)

scripts/ci/bm-report/build-metrics.py

127-127: Missing return type annotation for private function ms2sec

(ANN202)


127-127: Missing type annotation for function argument x

(ANN001)

python/test/test_vector.py

193-193: Use pytest.raises instead of unittest-style assertRaises

Replace assertRaises with pytest.raises

(PT027)


196-196: Use pytest.raises instead of unittest-style assertRaises

Replace assertRaises with pytest.raises

(PT027)


340-340: Use a regular assert instead of unittest-style assertEqual

Replace assertEqual(...) with assert ...

(PT009)


342-342: Use a regular assert instead of unittest-style assertEqual

Replace assertEqual(...) with assert ...

(PT009)


344-344: Use pytest.raises instead of unittest-style assertRaises

Replace assertRaises with pytest.raises

(PT027)

🪛 LanguageTool
CONTRIBUTING.md

[uncategorized] ~210-~210: A comma might be missing here.
Context: ...h information at a time; beyond that threshold the ability to find bugs and other flaw...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)


[style] ~235-~235: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...s possible, try to avoid functions with very large bodies. In the (rare) cases where la...

(EN_WEAK_ADJECTIVE)


[uncategorized] ~241-~241: Possible missing article found.
Context: ...upport your claims. * As needed, add micro-benchmark to objectively evaluate perfo...

(AI_HYDRA_LEO_MISSING_A)

velox/experimental/wave/README.md

[typographical] ~28-~28: Use a comma after an introductory phrase.
Context: ...al CUDA driver installed (like our CI). For this reason we need to use the base functions to cr...

(COMMA_INTRODUCTORY_WORDS_PHRASES)

CODING_STYLE.md

[uncategorized] ~45-~45: Possible missing comma found.
Context: ...The clang-tidy hook will not be run automatically as it takes a long time and requires CM...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 actionlint (1.7.7)
.github/workflows/linux-build-base.yml

29-29: label "yscope-gh-runner" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-2019", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "ubuntu-20.04", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

🪛 markdownlint-cli2 (0.17.2)
CODING_STYLE.md

27-27: Multiple consecutive blank lines
Expected: 1; Actual: 2

(MD012, no-multiple-blanks)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build with Clang / Ubuntu release with resolve_dependency
  • GitHub Check: pre-commit
🔇 Additional comments (131)
scripts/ci/hdfs-client.xml (1)

7-7: Add missing newline at end-of-file for POSIX compliance
Ensures tools expecting a trailing newline don’t raise warnings and avoids unnecessary diff noise.

velox/docs/ext/issue.py (1)

16-16: Remove unused utils import from docutils.
The module doesn’t reference utils, so dropping it keeps dependencies lean and code clearer.

velox/docs/ext/pr.py (1)

16-16: Remove unused utils import from docutils.
As utils isn’t used here, its removal declutters the import list without side effects.

velox/docs/monitoring.rst (2)

1-10: Inconsistent AI summary The AI-generated summary details a broad migration to pre-commit and tooling removal, but this file only had a formatting update (newline/whitespace adjustment).

Likely an incorrect or invalid review comment.


9-9: Formatting-only change Added a trailing newline for POSIX compliance.

scripts/ci/presto/etc/hive.properties (1)

4-4: Verify Hive connector property The line adds hive.allow-drop-table=true. Confirm that this key is supported by your Hive connector version and update documentation if needed.

velox/docs/monthly-updates/2022/may-2022.rst (1)

67-67: Trailing newline added Inserted a newline at end-of-file for consistent formatting.

velox/docs/monthly-updates/2022/october-2022.rst (1)

64-64: Trailing newline added Inserted a newline at end-of-file for consistent formatting.

velox/docs/monthly-updates/2022/july-2022.rst (1)

58-58: Trailing newline normalization Removed extraneous newline at end-of-file to match repository conventions.

velox/dwio/common/tests/Lemire/FastPFor/LICENSE (1)

191-192: Add missing newline at EOF.

A trailing newline ensures POSIX compliance and maintains consistency across files.

velox/docs/monthly-updates/2023/november-2023.rst (1)

105-105: Add missing newline at EOF.

Ensures consistent file termination in documentation.

velox/docs/monthly-updates/2022/august-2022.rst (1)

88-88: Add missing newline at EOF.

Uniform trailing newline for formatting consistency.

velox/docs/monthly-updates/2023/august-2023.rst (1)

85-85: Add missing newline at EOF.

Standardize file endings across monthly update docs.

velox/docs/monthly-updates/july-2024.rst (1)

128-128: Add missing newline at EOF.

Uniform trailing newline for compliance with repository style.

velox/docs/functions/presto/window.rst (1)

171-171: Add trailing newline for POSIX compliance
This ensures the file ends with a newline character, matching the repository’s formatting conventions.

velox/docs/monthly-updates/2021/december-2021.rst (1)

38-38: Remove trailing whitespace and add newline
Formatting only – cleans up a stray space and ensures the file ends with a newline for consistency.

velox/docs/bindings/python/vector.rst (1)

8-8: Ensure file ends with a newline
Purely a formatting change to maintain consistent end-of-file newlines.

velox/docs/bindings/python/runners.rst (1)

8-8: Add missing newline at EOF
Matches the repository’s style guidelines for documentation files.

velox/docs/monthly-updates/2023/december-2023.rst (1)

99-99: Insert newline to terminate file properly
Minor formatting update to ensure consistent file endings across docs.

velox/experimental/cudf/.clang-format (1)

27-27: Add missing trailing newline.
This insertion ensures POSIX-compliant EOF handling and avoids warnings in some tools.

velox/docs/monthly-updates/2022/april-2022.rst (1)

59-59: Add missing trailing newline.
Ensures consistent file termination and clean diffs.

NOTICE.txt (1)

13-13: Remove duplicated article “the.”
Corrects the typo in the Arrow project attribution.

velox/docs/bindings/python/arrow.rst (1)

8-8: Ensure trailing newline at EOF.
Adds a final newline for consistent formatting across documentation files.

velox/docs/monthly-updates/2022/june-2022.rst (1)

67-67: Add missing trailing newline.
Maintains uniform file endings and avoids tooling hiccups.

velox/docs/monthly-updates/2021/november-2021.rst (1)

38-38: Add newline at EOF.
Ensures proper file termination and consistency across docs.

velox/docs/functions/spark/bitwise.rst (2)

7-7: Remove trailing whitespace.
No semantic change; aligns with pre-commit formatting rules.


50-50: Add newline at EOF.
Ensures consistent file termination.

velox/docs/develop/dynamic-loading.rst (1)

15-15: Remove trailing whitespace.
No substantive change; maintains consistent formatting under pre-commit.

velox/docs/bindings/python/legacy.rst (1)

12-12: Add newline at EOF.
Aligns with repository-wide formatting conventions.

velox/docs/functions/spark/binary.rst (1)

35-37: Minor whitespace cleanup.
Removed trailing spaces on the might_contain description. No semantic changes.

CMake/Findlzo2.cmake (1)

2-14: Add Apache 2.0 license header.
Standard Apache License, Version 2.0 header inserted correctly. No functional changes.

CMake/Findlz4.cmake (1)

2-14: Add Apache 2.0 license header.
Standard Apache License, Version 2.0 header applied consistently across CMake modules.

velox/docs/functions/spark/misc.rst (1)

7-7: Minor whitespace cleanup.
Removed trailing space in the at_least_n_non_nulls function description. No content change.

CMake/Findzstd.cmake (1)

2-14: Add Apache 2.0 license header.
Standard header matches other CMake find modules and introduces no logical changes.

velox/experimental/breeze/test/generator_common.py (1)

74-75: LGTM! Proper use of linter suppression directives.

The # noqa E402 comments correctly suppress import placement warnings since these imports must occur after the dynamic path setup code above. This is a legitimate use case for the noqa directive.

scripts/velox_env_mac.yml (1)

25-25: LGTM! Consistent comment formatting.

The addition of spaces before comment markers improves consistency with YAML formatting conventions.

Also applies to: 39-39

velox/docs/functions/spark/array.rst (1)

265-265: LGTM! Documentation whitespace cleanup.

Removal of trailing whitespace improves the document's consistency and aligns with code quality standards.

velox/docs/develop/operators.rst (1)

319-319: LGTM! Documentation whitespace cleanup.

Removal of trailing spaces from text and code blocks improves consistency and aligns with documentation quality standards.

Also applies to: 350-350, 392-393, 400-400, 412-412

.github/dependabot.yml (1)

17-18: LGTM! Improved YAML formatting.

Removing unnecessary double quotes around string values aligns with YAML best practices and improves consistency with other GitHub configuration files in the repository.

Also applies to: 20-20, 22-22

scripts/velox_env_linux.yml (1)

25-25: LGTM! Good formatting consistency.

The addition of spaces after # in comments improves readability and aligns with YAML style conventions.

Also applies to: 40-40

.github/CODEOWNERS (1)

19-19: LGTM! Good whitespace cleanup.

Removing trailing whitespace improves code cleanliness and consistency.

Also applies to: 32-32

velox/exec/tests/utils/TableScanTestBase.cpp (1)

1-15: LGTM! Proper license header update.

The Apache 2.0 license header provides clear licensing terms and aligns with the repository's licensing standardization effort.

velox/experimental/breeze/test/kernel_generator.py (1)

109-109: LGTM! Good Python style improvement.

Using is not None is more idiomatic than != None and aligns with PEP 8 recommendations for None comparisons.

velox/docs/functions/spark/comparison.rst (1)

30-30: Whitespace cleanup approved
This change simply trims trailing space; no content alteration.

velox/experimental/wave/README.md (1)

20-20: Formatting fix approved
Removed trailing whitespace; documentation content remains unchanged.

CONTRIBUTING.md (1)

7-7: Whitespace cleanup approved
All trailing spaces removed; instructions and formatting remain intact.

Also applies to: 37-37, 44-44, 51-52, 57-58, 64-64, 107-107, 109-109, 115-115, 197-197, 203-203, 209-209, 226-226, 268-268, 270-270, 274-274, 278-278

scripts/ci/bm-report/report.qmd (1)

225-225: Whitespace cleanup approved
Removed extraneous blank lines and trailing spaces; no logic changes.

Also applies to: 323-323, 334-334

scripts/ci/signature.py (1)

149-149: Syntax fix for f-string quoting
The switch to double quotes around new_repeat resolves the string literal conflict.

velox/docs/functions/spark/json.rst (1)

24-33: LGTM! Documentation formatting improvements.

The whitespace cleanup and formatting improvements enhance consistency without affecting the functional content of the documentation.

scripts/ci/benchmark-runner.py (2)

84-84: Good improvement to exception handling specificity.

Changing from a bare except: clause to except Exception: is a best practice as it prevents catching system-exiting exceptions like KeyboardInterrupt and SystemExit, while still handling standard exceptions appropriately.


458-458: Minor formatting improvement to help text.

The single-line help string formatting is more concise while maintaining the same meaning.

velox/docs/ext/spark.py (1)

656-656: String formatting consistency improvement.

The quote usage adjustment aligns with the broader code style standardization effort.

velox/dwio/dwrf/test/CommonTests.cpp (1)

98-98: Whitespace cleanup in test vectors.

Removing trailing whitespace after comma-separated pairs improves code cleanliness without affecting test functionality.

Also applies to: 148-148

.github/zizmor.yml (1)

14-17: Configuration modernization with security analysis rules.

The new YAML structure introduces zizmor security analysis configuration with appropriate ignore rules for trusted publishing. This aligns well with the broader shift towards declarative configuration and modern CI practices.

velox/docs/functions/spark/coverage.rst (1)

84-152: LGTM! Trailing whitespace cleanup improves consistency.

However, there's an inconsistency: the PR objectives mention fixing a clang build failure due to -Wimplicit-const-int-float-conversion, but these changes are purely formatting improvements for trailing whitespace removal.

Likely an incorrect or invalid review comment.

velox/docs/functions/spark/decimal.rst (1)

163-168: LGTM! Documentation formatting improvements enhance readability.

The trailing space removal and code block whitespace cleanup aligns with the broader formatting standardization effort mentioned in the AI summary.

CMake/FindSodium.cmake (3)

270-272: LGTM! Improved formatting enhances readability.

Moving the properties to separate lines with consistent indentation makes the set_target_properties calls much easier to read and maintain.


277-280: LGTM! Consistent formatting improvement.

The multi-line property formatting matches the style used elsewhere and improves maintainability.


297-300: LGTM! Windows-specific properties formatted consistently.

The formatting aligns with the other set_target_properties calls in the file.

velox/experimental/breeze/test/test_fixture_generator.py (5)

37-41: LGTM! Removed unnecessary f-string formatting.

Since there are no variables to interpolate in this string, using a regular multi-line string is more appropriate than an f-string.


266-266: LGTM! Consistent quote style improvement.

Standardizing the quote style enhances code consistency.


417-417: LGTM! Maintains quote style consistency.

This change aligns with the quote style standardization seen elsewhere in the file.


433-438: LGTM! Simplified includes method return type.

Changing from an f-string to a plain triple-quoted string is appropriate since no variable interpolation is needed, making the code cleaner.


544-549: LGTM! Consistent includes method simplification.

This matches the same improvement made to the OpenCL test fixture includes method.

velox/dwio/parquet/thrift/ParquetThriftTypes.cpp (1)

848-863: Skip non-functional formatting changes in Statistics::printTo.
These out << ", " << "field=" consolidations are purely stylistic and don’t alter behavior.

.yamllint.yml (5)

1-13: License header consistency.

The Apache License header matches other project files; no changes required here.


14-24: YAML spacing rules look good.

The braces and brackets settings enforce consistent space usage inside delimiters, aligning with the repository's style.


33-37: Strict float notation may be too restrictive.

Forbidding scientific notation and requiring a numeral before the decimal (e.g. .5) could block valid YAML values. Confirm that these float constraints align with your team’s style guidelines.


41-43: Validate the extra-allowed regex.

The pattern '.*\$\{\{.*\}\}.*' permits unquoted GitHub Actions variables but may be overly permissive. Ensure it doesn’t inadvertently allow other unquoted expressions.


44-46: Truthy value warning.

Limiting truthy values to true, false, and on will flag common alternatives like yes/no. Verify this aligns with existing CI YAML or adjust allowed-values accordingly.

docker-compose.yml (2)

31-31: Consistent unquoted CCACHE_DIR.

Switching CCACHE_DIR to an unquoted path is correct and matches other YAML configs.

Also applies to: 54-54, 81-81, 113-113, 134-134, 155-155


56-58: Refactor to block-style EXTRA_CMAKE_FLAGS.

Using a block scalar (>) for multi-line CMake flags improves readability. Ensure downstream tools correctly consume the flags without unexpected line breaks.

Also applies to: 82-84

.clang-tidy (2)

55-89: Naming conventions align with style guide.

Identifier naming rules correctly reflect the CODING_STYLE.md definitions.


90-96: Modernize options look good.

Settings for modernize-use-nullptr and modernize-use-using match project conventions.

velox/docs/functions/spark/string.rst (1)

444-459: Review varchar_type_write_side_check documentation.

Verify that the examples and error messages accurately reflect the implementation’s behaviour (e.g. trimming logic vs exception wording) and that reST formatting is correct.

velox/docs/conf.py (1)

33-33: LGTM - Appropriate linting suppression.

The noqa comment properly addresses the pylint warning about the bare except clause while maintaining the defensive programming approach for setting the system attribute.

velox/docs/develop/aggregations.rst (1)

121-123: LGTM - Clean whitespace formatting.

Good cleanup of trailing whitespace that improves consistency with formatting standards.

Also applies to: 137-139, 297-297

velox/functions/prestosql/coverage/README.md (1)

18-18: LGTM - Clean formatting.

Good removal of trailing whitespace for consistent formatting.

CMakeLists.txt (2)

1-1: AI summary mismatch
The AI-generated summary describes a pre-commit migration, but this PR only adds a clang warning suppression flag to fix the -Wimplicit-const-int-float-conversion build error.

Likely an incorrect or invalid review comment.


367-372: Add clang warning suppression for implicit conversions
Introducing -Wno-implicit-const-int-float-conversion under the Clang-specific warnings resolves the clang-15 build failure as intended.

.github/ISSUE_TEMPLATE/enhancement.yml (1)

17-17: Standardize YAML label array syntax
Switching from quoted strings to unquoted array elements improves readability and aligns with other issue templates.

velox/docs/functions/presto/binary.rst (1)

80-85: Remove extraneous trailing spaces
Whitespace cleanup around the lpad and rpad documentation ensures consistent formatting across the docs.

.github/ISSUE_TEMPLATE/fuzzer.yml (1)

17-29: Clean up YAML quoting styles
Updating labels and the placeholder to unquoted/single-quoted values standardizes formatting across templates without altering functionality.

velox/docs/functions/spark/aggregate.rst (4)

29-31: Remove unnecessary trailing whitespace
Cleaning up whitespace in the bloom_filter_agg example section improves doc consistency.


51-53: Align function signature block
Fixed stray spaces around the bloom_filter_agg(hash) -> varbinary signature to enhance readability.


165-167: Consolidate description lines
Merged lines in the mode function docs to remove an awkward blank line and trailing spaces.


169-170: Adjust blank line before example
Removed extra blank line preceding the example in the mode docs for uniform formatting.

.github/ISSUE_TEMPLATE/bug.yml (2)

17-17: YAML formatting improvement looks good.

Removing unnecessary quotes from the labels array is a valid YAML simplification that maintains functionality.


29-29: Inconsistent PR content flagged.

While the quote style change is valid, the actual file changes appear to be formatting improvements rather than fixing clang build failures as described in the PR objectives.

Likely an incorrect or invalid review comment.

python/pyvelox/__init__.py (2)

15-15: Appropriate linter warning suppression.

The # noqa: F403 comment correctly suppresses the wildcard import warning, which is reasonable for this type of re-export module.


16-16: Import aliasing style change looks good.

The explicit aliasing syntax clarifies the import intention, even though the Pylint C0414 warning suggests it's unnecessary. This may be intentional for code clarity.

.cmake-format.yaml (3)

27-28: YAML formatting improvements look good.

Removing unnecessary quotes from simple string values (canonical, unchanged) is a valid YAML simplification.


42-45: Consistent indentation improvement.

The change from 2-space to 4-space indentation for flag lists improves consistency and readability.

Also applies to: 52-55


35-35: Minor comment formatting fix.

Good catch on removing the trailing space in the comment.

python/test/test_plan_builder.py (1)

58-58: Test formatting improvement looks good.

Consolidating the multiline string into a single line improves readability while maintaining the same assertion logic.

python/test/test_vector.py (1)

191-191: Appropriate unused variable suppression.

Adding underscore prefixes to variables within assertRaises contexts correctly indicates these are intentionally unused variables, suppressing linter warnings while maintaining test functionality.

Also applies to: 194-194, 197-197, 274-275

.github/ISSUE_TEMPLATE/build.yml (1)

17-17: Styling change approved
Unquoting the label array aligns with other issue templates and maintains consistency.

.github/workflows/scheduled.yml (1)

20-29: Ensure YAML patterns parse correctly
Removing quotes from patterns with leading ! (e.g. !velox/docs/**) may cause YAML to interpret them as tags rather than exclusion globs. Confirm that GitHub Actions still honours these as intended or wrap them in quotes to be safe.

Also applies to: 35-44

.yamlfmt.yml (1)

14-22: YAML formatter configuration looks good
The doublestar match type, exclusions, and indent settings align with pre-commit hook requirements.

CMake/Findglog.cmake (6)

2-13: Added Apache License header
Standard library header now mirrors other CMake modules and ensures consistent licensing.


25-27: Reformatted find_library calls
Splitting arguments across lines improves readability without changing behaviour.


28-28: Reformatted find_path invocation
Consistent with surrounding code style.


32-33: Aligned find_package_handle_standard_args arguments
Better alignment aids maintainability; behaviour is unchanged.


35-35: Marked variables as advanced
Mirrors other CMake find modules.


40-46: Refined imported target setup formatting
Indentation and property ordering now match project conventions.

.github/workflows/docs.yml (2)

40-40: Environment variable declaration approved
Using an unquoted string for CCACHE_DIR is valid and consistent with other workflows.


42-46: Verify ccache caching logic
The restore step is explicitly disabled (if: false) while the save step remains active. Confirm this is intentional—caching won’t be effective if restoration never runs.

Also applies to: 68-73

.github/workflows/linux-build.yml (4)

20-20: Confirm branch filter alignment

The branch pattern presto-0.293-clp-connector may be too specific. Verify that all relevant branches needing Clang builds match this pattern or consider a more general wildcard.


22-31: Approve path filters for push events

The set of included Velox paths and the exclusion of velox/docs/** look correct for GCC/Clang build triggers.


35-44: Approve path filters for pull_request events

Consistency with push triggers ensures the same files gate PR builds.


59-60: Correctly set Clang input

The with: use-clang: true parameter correctly delegates to the base workflow for Clang builds.

.github/workflows/preliminary_checks.yml (7)

1-13: The added standard Apache license header is correct and unchanged.


14-14: Approve workflow naming

The workflow is clearly named Run Checks to indicate pre-commit enforcement.


16-23: Verify push trigger intent

Using push: {} will run this workflow on all pushes (including tags). Confirm that this broad scope is desired, or restrict it to specific branches/tags to reduce CI noise.


25-27: Approve permissions scope

Limiting contents: read is a good security practice for preliminary checks.


28-31: Review concurrency cancel condition

The cancellation condition targets refs/heads/presto-0.293-clp-connector, which might not apply broadly. Ensure this is the correct branch reference or generalize if needed.


33-37: Approve pre-commit installation step

Installing pre-commit via pip before checkout is valid since it does not depend on repo content.


38-42: Approve checkout configuration

Using fetch-depth: 2 and disabling credential persistence balances speed and security.

.github/workflows/linux-build-base.yml (4)

21-21: Approve input description

The description: Use Clang to compile the project. clearly explains the use-clang input.


56-56: Approve unquoted MAKEFLAGS

Removing quotes around MAX_HIGH_MEM_JOBS=4 MAX_LINK_JOBS=3 is consistent with other unquoted assignments.


61-64: Approve post-build cache stats

Running ccache -vs to display verbose statistics after the build is helpful for cache performance insights.


31-33: [web_search]

What is the correct environment variable name for setting the maximum cache size in ccache, and does it accept suffixes such as "5G"?
velox/python/plan_builder/plan_builder.pyi (2)

18-18: Confirm PEP 585 support

Swapping List/Dict for list[...]/dict[...] requires Python 3.9+. Verify that your stub consumers target a compatible interpreter version.


53-53: Approve trailing comma

Adding a trailing comma after connector_id: str = "tpch", is a harmless formatting choice in stub files.

.pre-commit-config.yaml (7)

1-16: License header is correctly applied
The Apache 2.0 header is present and well-formatted at the top of the file.


21-25: Meta hooks configuration looks good
Including check-hooks-apply and check-useless-excludes will help maintain hygiene of this config.


26-35: Standard pre-commit-hooks set is appropriate
Trailing-whitespace, EOF fixer, large-file checks, and shebang validation cover common pitfalls.


59-77: Custom license-header hook is comprehensive
The exclusion list covers third-party and generated code appropriately.


86-93: Ruff hooks for Python linting/formatting
Auto-fixing with --fix is useful. Ensure your pyproject.toml or equivalent config is aligned with ruff’s settings.


94-101: YAML linting for GitHub Actions
Using --strict and parsable format will catch errors early. Excluding clang-related files is sensible.


102-107: YAML formatting with yamlfmt
Configuration is straightforward. No major issues spotted.


logging.info(
f"Written {result.row_count} records to {result.file_count} output files at '{result.output_path}'" # pyre-ignore
f"Written {result.row_count} records to {result.file_count} output files at '{result.output_path}'" # pyre-ignore
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix incorrect dict attribute access and optimise logging call.
Accessing result.row_count on a dict will raise an error. Also prefer lazy formatting to avoid unnecessary string interpolation when logging is disabled.

Apply this diff:

-    logging.info(
-        f"Written {result.row_count} records to {result.file_count} output files at '{result.output_path}'"  # pyre-ignore
-    )
+    logging.info(
+        "Written %d records to %d output files at '%s'",  # pyre-ignore
+        result["row_count"],
+        result["file_count"],
+        result["output_path"],
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
f"Written {result.row_count} records to {result.file_count} output files at '{result.output_path}'" # pyre-ignore
logging.info(
"Written %d records to %d output files at '%s'", # pyre-ignore
result["row_count"],
result["file_count"],
result["output_path"],
)
🧰 Tools
🪛 Pylint (3.3.7)

[convention] 160-160: Line too long (122/100)

(C0301)


[error] 160-160: Instance of 'dict' has no 'row_count' member

(E1101)


[error] 160-160: Instance of 'dict' has no 'file_count' member

(E1101)


[error] 160-160: Instance of 'dict' has no 'output_path' member

(E1101)

🪛 Ruff (0.11.9)

160-160: Logging statement uses f-string

(G004)


160-160: Trailing comma missing

Add trailing comma

(COM812)

🤖 Prompt for AI Agents
In python/pyvelox/utils/data_generator.py at line 160, the code incorrectly
accesses dictionary keys using attribute syntax (result.row_count) which will
raise an error. Change these to use bracket notation (result['row_count']) to
correctly access dictionary values. Additionally, modify the logging call to use
lazy formatting by passing the format string and arguments separately to avoid
unnecessary string interpolation when logging is disabled.

Comment on lines 127 to 129
def ms2sec(x):
return x / 1000

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Good refactoring, but add type annotations.

The conversion from lambda to a regular function improves readability. However, the static analysis correctly identifies missing type annotations.

Apply this diff to add type annotations:

-def ms2sec(x):
-    return x / 1000
+def ms2sec(x: int) -> float:
+    return x / 1000
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def ms2sec(x):
return x / 1000
def ms2sec(x: int) -> float:
return x / 1000
🧰 Tools
🪛 Ruff (0.11.9)

127-127: Missing return type annotation for private function ms2sec

(ANN202)


127-127: Missing type annotation for function argument x

(ANN001)

🤖 Prompt for AI Agents
In scripts/ci/bm-report/build-metrics.py around lines 127 to 129, the function
ms2sec lacks type annotations. Add appropriate type annotations to the function
signature, specifying the input parameter type (likely float or int) and the
return type (float), to improve code clarity and support static analysis.

Comment on lines 27 to 28
Any library that statically links against the stubs **can not** run on a
machine without an actual CUDA driver installed (like our CI).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Typographical nitpick: missing comma and spelling
Consider changing “can not” to “cannot” and adding a comma after the parenthetical phrase:

- Any library that statically links against the stubs **can not** run on a
- machine without an actual CUDA driver installed (like our CI).
+ Any library that statically links against the stubs **cannot** run on a
+ machine without an actual CUDA driver installed, like our CI.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Any library that statically links against the stubs **can not** run on a
machine without an actual CUDA driver installed (like our CI).
Any library that statically links against the stubs **cannot** run on a
machine without an actual CUDA driver installed, like our CI.
🧰 Tools
🪛 LanguageTool

[typographical] ~28-~28: Use a comma after an introductory phrase.
Context: ...al CUDA driver installed (like our CI). For this reason we need to use the base functions to cr...

(COMMA_INTRODUCTORY_WORDS_PHRASES)

🤖 Prompt for AI Agents
In velox/experimental/wave/README.md around lines 27 to 28, replace "can not"
with the correct spelling "cannot" and add a comma after the parenthetical
phrase "(like our CI)" to improve readability and correctness.

error_message = ""
for rep_change in delta["repetition_change"]:
error_message += f"""'{rep_change.get_root_key()}{rep_change.t1}' is repeated {rep_change.repetition['new_repeat']} times.\n"""
error_message += f"""'{rep_change.get_root_key()}{rep_change.t1}' is repeated {rep_change.repetition["new_repeat"]} times.\n"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Optional: improve readability by splitting long f-string
To respect a 100-character limit, consider breaking the line:

-    error_message += f"""'{rep_change.get_root_key()}{rep_change.t1}' is repeated {rep_change.repetition["new_repeat"]} times.\n"""
+    error_message += (
+        f"'{rep_change.get_root_key()}{rep_change.t1}' "
+        f"is repeated {rep_change.repetition['new_repeat']} times.\n"
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
error_message += f"""'{rep_change.get_root_key()}{rep_change.t1}' is repeated {rep_change.repetition["new_repeat"]} times.\n"""
error_message += (
f"'{rep_change.get_root_key()}{rep_change.t1}' "
f"is repeated {rep_change.repetition['new_repeat']} times.\n"
)
🧰 Tools
🪛 Pylint (3.3.7)

[convention] 149-149: Line too long (143/100)

(C0301)

🤖 Prompt for AI Agents
In scripts/ci/signature.py at line 149, the f-string is too long and exceeds the
100-character limit. Split the f-string into multiple concatenated parts or use
implicit string concatenation across multiple lines to improve readability and
comply with the line length guideline.

Comment on lines 25 to 29
comments: disable
comments-indentation: disable
document-end: disable
document-start: disable
empty-lines: disable
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consider re-enabling indentation rules.

Rules like comments-indentation, document-start, and empty-lines are disabled. Since you use yamlfmt for formatting, you might re-enable these to catch accidental manual edits that bypass formatter.

🤖 Prompt for AI Agents
In the .yamllint.yml file around lines 25 to 29, several indentation-related
rules such as comments-indentation, document-start, and empty-lines are
disabled. To improve YAML linting and catch accidental manual formatting errors,
re-enable these rules by setting them to enable or removing the disable flags.
This will ensure yamllint enforces consistent formatting alongside yamlfmt.

Comment on lines 48 to 58
- id: clang-tidy
name: clang-tidy
description: Run clang-tidy on C/C++ files
stages:
- manual # Needs compile_commands.json
entry: clang-tidy
language: python
types_or: [c++, c]
additional_dependencies: [clang-tidy==18.1.8]
require_serial: true

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Clang-tidy hook staged as manual
Good to prevent failures when compile_commands.json is missing. You might want to add a note in the CONTRIBUTING or CODING_STYLE guide about how to generate that file for local runs.

🤖 Prompt for AI Agents
In the .pre-commit-config.yaml file around lines 48 to 58, the clang-tidy hook
is set to manual to avoid failures when compile_commands.json is missing. To
improve developer experience, add a note in the CONTRIBUTING or CODING_STYLE
documentation explaining how to generate compile_commands.json locally, such as
running CMake with the appropriate flags or using build tools that produce this
file.

Comment on lines 17 to 20
# General excludes, files can also be excluded on a hook level
exclude: .*\.patch|scripts/tests/.*|velox/external/.*|CMake/third-party/.*
default_install_hook_types: [pre-commit, pre-push]
repos:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Validate global exclude pattern anchoring
The exclude regex may unintentionally match substrings in deeper paths. Consider anchoring patterns (e.g. (^|/)scripts/tests/.*) to avoid over-excluding.

🤖 Prompt for AI Agents
In the .pre-commit-config.yaml file around lines 17 to 20, the global exclude
regex patterns are not anchored, which can cause unintended matches in deeper
directory paths. Update each pattern in the exclude line by anchoring them with
(^|/) at the start to ensure they only match the intended directory names or
file paths, preventing over-exclusion of files.

Comment on lines 108 to 112
- repo: https://github.com/zizmorcore/zizmor-pre-commit
rev: v1.7.0
hooks:
- id: zizmor

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Zizmor hook inclusion
Integration of zizmor is fine. Consider specifying files: \.editorconfig$ or other patterns if you want to target specific file types.

🤖 Prompt for AI Agents
In .pre-commit-config.yaml around lines 108 to 112, the zizmor hook is included
without specifying target file patterns. To limit the hook to specific files,
add a 'files' key with a regex pattern like '\.editorconfig$' under the zizmor
hook configuration. This ensures the hook only runs on intended file types.

Comment on lines 36 to 47
- repo: local
hooks:
- id: cmake-format
name: cmake-format
description: Format CMake files.
entry: cmake-format
language: python
files: (CMakeLists.*|.*\.cmake|.*\.cmake.in)$
args: [--in-place]
require_serial: false
additional_dependencies: [cmake-format==0.6.13, pyyaml]

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

CMake formatting hook is well-pinned
Pinning cmake-format==0.6.13 ensures reproducible formatting. Consider documenting any project-specific style options in a config file.

🤖 Prompt for AI Agents
In .pre-commit-config.yaml around lines 36 to 47, the cmake-format hook is
pinned to version 0.6.13 for reproducibility, but there is no project-specific
style configuration documented. Create a cmake-format configuration file (e.g.,
.cmake-format.yaml) in the project root to specify any custom style options, and
mention this config file in the project documentation or README to guide
contributors on the formatting standards.

Comment on lines 79 to 85
- repo: https://github.com/pre-commit/mirrors-clang-format
rev: v18.1.3
hooks:
- id: clang-format
# types_or: [c++, c, cuda, metal, objective-c]
files: \.(cpp|cc|c|h|hpp|inc|cu|cuh|clcpp|mm|metal)$

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

clang-format pattern is broad enough
Optionally, you could uncomment and use types_or for explicit clarity, but the file glob is sufficient.

🤖 Prompt for AI Agents
In .pre-commit-config.yaml around lines 79 to 85, the clang-format hook uses a
file glob pattern that is broad and sufficient for matching relevant files. The
comment suggests optionally uncommenting and using the types_or field for
explicit clarity, but it is not necessary. You can leave the current file glob
pattern as is or uncomment and add the types_or line if you want to be more
explicit about the file types clang-format should target.

@anlowee anlowee requested a review from kirkrodrigues June 19, 2025 17:13
Copy link
Member

@kirkrodrigues kirkrodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

build: Ignore clang's `implicit-const-int-float-conversion` warning since the existing Velox code contains violations.

@anlowee anlowee changed the title ci: Fix clang build failure due to -Wimplicit-const-int-float-conversion. build: Ignore clang's implicit-const-int-float-conversion warning since the existing Velox code contains violations. Jun 20, 2025
@anlowee anlowee merged commit ec15456 into y-scope:presto-0.293-clp-connector Jun 20, 2025
6 checks passed
@anlowee anlowee deleted the xwei/fix-clang branch June 20, 2025 03:49
wraymo pushed a commit that referenced this pull request Jun 26, 2025
…ger-overflow (facebookincubator#13831)

Summary:
Pull Request resolved: facebookincubator#13831

This avoids the following errors:

```
fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_abs.h:56:41: runtime error: negation of -9223372036854775808 cannot be represented in type 'long'; cast to an unsigned type to negate this value to itself
    #0 0x000000346ce5 in std::abs(long) fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_abs.h:56
    #1 0x000000345879 in std::shared_ptr<facebook::velox::BiasVector<facebook::velox::test::EvalTypeHelper<long>::Type>> facebook::velox::test::VectorMaker::biasVector<long>(std::vector<std::optional<long>, std::allocator<std::optional<long>>> const&) fbcode/velox/vector/tests/utils/VectorMaker-inl.h:58
    #2 0x000000344d34 in facebook::velox::test::BiasVectorErrorTest::errorTest(std::vector<std::optional<long>, std::allocator<std::optional<long>>>) fbcode/velox/vector/tests/BiasVectorTest.cpp:39
    #3 0x00000033ec99 in facebook::velox::test::BiasVectorErrorTest_checkRangeTooLargeError_Test::TestBody() fbcode/velox/vector/tests/BiasVectorTest.cpp:44
    #4 0x7fe0a2342c46 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) fbsource/src/gtest.cc:2727
    #5 0x7fe0a234275d in testing::Test::Run() fbsource/src/gtest.cc:2744
    #6 0x7fe0a2345fb3 in testing::TestInfo::Run() fbsource/src/gtest.cc:2890
    #7 0x7fe0a234c8eb in testing::TestSuite::Run() fbsource/src/gtest.cc:3068
    #8 0x7fe0a237b52b in testing::internal::UnitTestImpl::RunAllTests() fbsource/src/gtest.cc:6059
    #9 0x7fe0a237a0a2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) fbsource/src/gtest.cc:2727
    #10 0x7fe0a23797f5 in testing::UnitTest::Run() fbsource/src/gtest.cc:5599
    #11 0x7fe0a2239800 in RUN_ALL_TESTS() fbsource/gtest/gtest.h:2334
    #12 0x7fe0a223952c in main fbcode/common/gtest/LightMain.cpp:20
    #13 0x7fe09ec2c656 in __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #14 0x7fe09ec2c717 in __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:409:3
    #15 0x00000033d8b0 in _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:116

UndefinedBehaviorSanitizer: signed-integer-overflow fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_abs.h:56:41
```
Avoid overflow by using the expression (static_cast<uint64_t>(1) + ~static_cast<uint64_t>(min)) to calculate the absolute value of min without using std::abs

Reviewed By: dmm-fb, peterenescu

Differential Revision: D76901449

fbshipit-source-id: 7eb3bd0f83e42f44cdf34ea1759f3aa9e1042dae
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants