WIP: Dummy PR to check maint-17.0.0 status #43113

Open. Wants to merge 16 commits into base: main.

raulcd (Member) commented Jul 2, 2024

DO NOT MERGE.

This PR tracks some crossbow jobs to validate the status of the maintenance branch before creating the first RC.

…43093)

### Rationale for this change

The newer version of LLVM on AlmaLinux 8 fails on the pyarrow.gandiva tests

### What changes are included in this PR?

Temporarily remove Gandiva on Python checks for AlmaLinux 8.

### Are these changes tested?

Via CI
### Are there any user-facing changes?

No
* GitHub Issue: #43059

Authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>

github-actions bot commented Jul 2, 2024

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

@github-actions github-actions bot added the "awaiting committer review" label Jul 2, 2024
zeroshade and others added 3 commits July 3, 2024 16:19
…2199)

### Rationale for this change
Ensuring that creating IPC payloads works correctly for non-CPU data by
utilizing `CopyBufferSliceToCPU`.

### What changes are included in this PR?
Adding calls to `CopyBufferSliceToCPU` to the IPC writer for base binary
types and for list types, to avoid calls to `value_offset` in those
cases.

### Are these changes tested?
Yes. Tests are added to cuda_test.cc

### Are there any user-facing changes?
No.

* GitHub Issue: #42198
…43127)

### Rationale for this change

We can't use http://mirrorlist.centos.org because CentOS 7 reached EOL.

### What changes are included in this PR?

Use https://vault.centos.org/ instead.
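
As an illustration only, the switch from the mirrorlist to the vault can be sketched as a rewrite of a yum `.repo` file. The repo file layout, the `mirror.centos.org` base URL and the helper name `point_repo_at_vault` are assumptions for this sketch, not the contents of the actual patch.

```python
import re

# Hypothetical helper: disable mirrorlist lookups and point baseurl at the vault.
# The pattern assumes the stock CentOS 7 repo layout, where baseurl is commented out.
BASEURL_PATTERN = re.compile(r"^#?\s*baseurl=http://mirror\.centos\.org")


def point_repo_at_vault(repo_text: str) -> str:
    rewritten = []
    for line in repo_text.splitlines():
        if line.startswith("mirrorlist="):
            # mirrorlist.centos.org no longer resolves after EOL, so comment it out.
            rewritten.append("#" + line)
        elif BASEURL_PATTERN.match(line):
            # Enable baseurl and swap the host for the vault archive.
            rewritten.append(
                BASEURL_PATTERN.sub("baseurl=https://vault.centos.org", line)
            )
        else:
            rewritten.append(line)
    return "\n".join(rewritten) + "\n"


sample_repo = (
    "[base]\n"
    "name=CentOS-$releasever - Base\n"
    "mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os\n"
    "#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/\n"
)
print(point_repo_at_vault(sample_repo))
```

Lines that are neither `mirrorlist=` nor a matching `baseurl=` pass through unchanged, so the sketch is safe to run on repo files that were already updated.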

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: #43122

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…e been deprecated (#43121)

### Rationale for this change

Jobs are failing to find mirrorlist.centos.org

### What changes are included in this PR?

Updating repos based on solution from: #43119 (comment)

### Are these changes tested?

Via archery

### Are there any user-facing changes?
No
* GitHub Issue: #43119

Lead-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
… segfault (#43071)

### Rationale for this change

See #43070

### What changes are included in this PR?

Checks that the ciphertext length is at least enough to hold the length (if written), nonce and GCM tag for the GCM cipher type.

Also enforces that the input ciphertext length parameter is provided (is > 0) and verifies that the ciphertext size read from the file isn't going to cause reads beyond the end of the ciphertext buffer.
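
A rough sketch of the checks described above, written in Python rather than the C++ of the actual change. The sizes (4-byte little-endian length prefix, 12-byte nonce, 16-byte GCM tag) follow the Parquet modular encryption layout; the function name `gcm_ciphertext_error` and its return convention are hypothetical.

```python
from typing import Optional

# Layout assumed by this sketch: [length (4 bytes, optional)] [nonce] [ciphertext] [tag]
LENGTH_PREFIX_SIZE = 4
NONCE_SIZE = 12
GCM_TAG_SIZE = 16


def gcm_ciphertext_error(buf: bytes, length_prefixed: bool = True) -> Optional[str]:
    """Return a description of the problem, or None if the buffer is long enough."""
    if len(buf) == 0:
        return "ciphertext length must be provided (> 0)"

    header = LENGTH_PREFIX_SIZE if length_prefixed else 0
    minimum = header + NONCE_SIZE + GCM_TAG_SIZE
    if len(buf) < minimum:
        return f"ciphertext too short: {len(buf)} bytes, need at least {minimum}"

    if length_prefixed:
        # The written length covers everything after the prefix itself.
        written = int.from_bytes(buf[:LENGTH_PREFIX_SIZE], "little")
        if written > len(buf) - LENGTH_PREFIX_SIZE:
            return "written length would read past the end of the buffer"
    return None
```

The point of ordering the checks this way is that the length prefix is only decoded once the buffer is known to be large enough to contain it.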

### Are these changes tested?

Yes, I've added new unit tests for this.

### Are there any user-facing changes?

No
* GitHub Issue: #43070

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: mwish <maplewish117@gmail.com>
kou and others added 5 commits July 5, 2024 11:01
### Rationale for this change

`google_cloud_cpp_mocks` depends on `GTest::gmock_main` but it's built without `BUILD_TESTING`. google-cloud-cpp finds GoogleTest only with `BUILD_TESTING`.

### What changes are included in this PR?

The recent google-cloud-cpp doesn't build `google_cloud_cpp_mocks` without `BUILD_TESTING`.

Note that we can't use 2.23.0 or later because they can't be built with MinGW-w64. See also:
* mingw-w64/mingw-w64#49
* googleapis/google-cloud-cpp#14436

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: #43134

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… large memory test (#43128)

### Rationale for this change

This test consumes more than 4 GB of memory and causes an OOM kill when running with TSAN, as reported in #43116.

### What changes are included in this PR?

Limit when it runs by marking it as a large memory test.

### Are these changes tested?

The change itself only touches a test.

### Are there any user-facing changes?

None.

* GitHub Issue: #43116

Authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
pyarrow knows about ARROW_ENABLE_THREADING and doesn't use threads if they are not enabled in libarrow.

Split from #37696 

* GitHub Issue: #41910

Lead-authored-by: Joe Marshall <joe.marshall@nottingham.ac.uk>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… Stream 8 (#43159)

### Rationale for this change

Because json-devel on them doesn't provide nlohmann/json_fwd.h, which is required by google-cloud-cpp.

The upstream issue:
googleapis/google-cloud-cpp#14438

### What changes are included in this PR?

Use bundled nlohmann/json instead.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: #43158

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
### Rationale for this change

This also has a workaround for https://issues.apache.org/jira/browse/ORC-1732 .

### What changes are included in this PR?

ORC 2.0.1 has a dependency detection problem. We can't override the detection with ExternalProject but can override the detection with FetchContent.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: #42149

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
raulcd (Member, Author) commented Jul 8, 2024

Revision: 12be569

Submitted crossbow builds: ursacomputing/crossbow @ maint-17.0.0-nightly-tests-2

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-build-vcpkg-win GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-cython2 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-1.26 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.10-pandas-nightly-numpy-nightly GitHub Actions
test-conda-python-3.10-spark-v3.5.0 GitHub Actions
test-conda-python-3.10-substrait GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-upstream_devel-numpy-nightly GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.8 GitHub Actions
test-conda-python-3.8-pandas-1.0-numpy-1.19 GitHub Actions
test-conda-python-3.8-spark-v3.5.0 GitHub Actions
test-conda-python-3.9 GitHub Actions
test-conda-python-3.9-pandas-latest-numpy-latest GitHub Actions
test-conda-python-emscripten GitHub Actions
test-cuda-cpp GitHub Actions
test-cuda-python GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-debian-12-docs GitHub Actions
test-debian-12-go-1.21 GitHub Actions
test-debian-12-go-1.22 GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-debian-c-glib GitHub Actions
test-debian-ruby GitHub Actions
test-fedora-39-cpp GitHub Actions
test-fedora-39-python-3 GitHub Actions
test-r-arrow-backwards-compatibility GitHub Actions
test-r-clang-sanitizer GitHub Actions
test-r-depsource-bundled Azure
test-r-depsource-system GitHub Actions
test-r-dev-duckdb GitHub Actions
test-r-devdocs GitHub Actions
test-r-gcc-11 GitHub Actions
test-r-gcc-12 GitHub Actions
test-r-install-local GitHub Actions
test-r-install-local-minsizerel GitHub Actions
test-r-linux-as-cran GitHub Actions
test-r-linux-rchk GitHub Actions
test-r-linux-valgrind GitHub Actions
test-r-minimal-build Azure
test-r-offline-maximal GitHub Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-release-latest Azure
test-r-rocker-r-ver-latest Azure
test-r-rstudio-r-base-4.1-opensuse155 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 GitHub Actions
test-r-versions GitHub Actions
test-skyhook-integration GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-20.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-20.04-cpp-thread-sanitizer GitHub Actions
test-ubuntu-20.04-python-3 GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-c-glib GitHub Actions
test-ubuntu-r-sanitizer GitHub Actions
test-ubuntu-ruby GitHub Actions

raulcd (Member, Author) commented Jul 8, 2024

Revision: 12be569

Submitted crossbow builds: ursacomputing/crossbow @ maint-17.0.0-nightly-packaging-2

Task Status
almalinux-8-amd64 GitHub Actions
almalinux-8-arm64 GitHub Actions
almalinux-9-amd64 GitHub Actions
almalinux-9-arm64 GitHub Actions
amazon-linux-2023-amd64 GitHub Actions
amazon-linux-2023-arm64 GitHub Actions
centos-7-amd64 GitHub Actions
centos-8-stream-amd64 GitHub Actions
centos-8-stream-arm64 GitHub Actions
centos-9-stream-amd64 GitHub Actions
centos-9-stream-arm64 GitHub Actions
conan-maximum GitHub Actions
conan-minimum GitHub Actions
conda-clean Azure
conda-linux-aarch64-cpu-py3 Azure
conda-linux-aarch64-cuda-py3 Azure
conda-linux-ppc64le-cpu-py3 Azure
conda-linux-ppc64le-cuda-py3 Azure
conda-linux-x64-cpu-py3 Azure
conda-linux-x64-cuda-py3 Azure
conda-osx-arm64-cpu-py3 Azure
conda-osx-x64-cpu-py3 Azure
conda-win-x64-cpu-py3 Azure
conda-win-x64-cuda-py3 Azure
debian-bookworm-amd64 GitHub Actions
debian-bookworm-arm64 GitHub Actions
debian-trixie-amd64 GitHub Actions
debian-trixie-arm64 GitHub Actions
homebrew-cpp GitHub Actions
java-jars GitHub Actions
nuget GitHub Actions
python-sdist GitHub Actions
r-binary-packages GitHub Actions
ubuntu-focal-amd64 GitHub Actions
ubuntu-focal-arm64 GitHub Actions
ubuntu-jammy-amd64 GitHub Actions
ubuntu-jammy-arm64 GitHub Actions
ubuntu-noble-amd64 GitHub Actions
ubuntu-noble-arm64 GitHub Actions
wheel-macos-big-sur-cp310-arm64 GitHub Actions
wheel-macos-big-sur-cp311-arm64 GitHub Actions
wheel-macos-big-sur-cp312-arm64 GitHub Actions
wheel-macos-big-sur-cp38-arm64 GitHub Actions
wheel-macos-big-sur-cp39-arm64 GitHub Actions
wheel-macos-catalina-cp310-amd64 GitHub Actions
wheel-macos-catalina-cp311-amd64 GitHub Actions
wheel-macos-catalina-cp312-amd64 GitHub Actions
wheel-macos-catalina-cp38-amd64 GitHub Actions
wheel-macos-catalina-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp38-amd64 GitHub Actions
wheel-manylinux-2-28-cp38-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp38-amd64 GitHub Actions
wheel-manylinux-2014-cp38-arm64 GitHub Actions
wheel-manylinux-2014-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-arm64 GitHub Actions
wheel-windows-cp310-amd64 GitHub Actions
wheel-windows-cp311-amd64 GitHub Actions
wheel-windows-cp312-amd64 GitHub Actions
wheel-windows-cp38-amd64 GitHub Actions
wheel-windows-cp39-amd64 GitHub Actions

raulcd (Member, Author) commented Jul 8, 2024

Revision: 12be569

Submitted crossbow builds: ursacomputing/crossbow @ maint-17.0.0-nightly-release-1

Task Status
verify-rc-source-cpp-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-cpp-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-cpp-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-cpp-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-cpp-macos-amd64 GitHub Actions
verify-rc-source-cpp-macos-arm64 GitHub Actions
verify-rc-source-cpp-macos-conda-amd64 GitHub Actions
verify-rc-source-csharp-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-csharp-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-csharp-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-csharp-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-csharp-macos-amd64 GitHub Actions
verify-rc-source-csharp-macos-arm64 GitHub Actions
verify-rc-source-go-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-go-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-go-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-go-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-go-macos-amd64 GitHub Actions
verify-rc-source-go-macos-arm64 GitHub Actions
verify-rc-source-integration-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-integration-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-integration-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-integration-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-integration-macos-amd64 GitHub Actions
verify-rc-source-integration-macos-arm64 GitHub Actions
verify-rc-source-integration-macos-conda-amd64 GitHub Actions
verify-rc-source-java-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-java-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-java-macos-amd64 GitHub Actions
verify-rc-source-js-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-js-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-js-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-js-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-js-macos-amd64 GitHub Actions
verify-rc-source-js-macos-arm64 GitHub Actions
verify-rc-source-python-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-python-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-python-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-python-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-python-macos-amd64 GitHub Actions
verify-rc-source-python-macos-arm64 GitHub Actions
verify-rc-source-python-macos-conda-amd64 GitHub Actions
verify-rc-source-ruby-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-ruby-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-ruby-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-ruby-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-ruby-macos-amd64 GitHub Actions
verify-rc-source-ruby-macos-arm64 GitHub Actions
verify-rc-source-windows GitHub Actions

sgilmore10 and others added 3 commits July 9, 2024 17:25
… should not include the release candidate number in the name of the tarball's top-level directory. (#43200)

### Rationale for this change

`dev/release/utils-create-release-tarball.sh` should not include the release candidate number in the name of the tarball's top-level directory. If the release candidate number is included, the binaries and the release verification tasks fail because the tarball entries have an unexpected folder hierarchy. See #43188 (comment).

### What changes are included in this PR?

1. Modified `dev/release/utils-create-release-tarball.sh` to not include the release candidate number in the name of the source directory from which the release tarball is created.

### Are these changes tested?

Manually verified this change fixes the bug:

```bash
$ dev/release/utils-create-release-tarball.sh 17.0.0 1
$ tar zxvf apache-arrow-17.0.0.tar.gz
...
$ ls 
apache-arrow-17.0.0/       apache-arrow-17.0.0.tar.gz
```
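
The manual check above could also be scripted. The following is a sketch only: `top_level_dirs`, `expected_root` and `make_sample_tarball` are hypothetical helpers, and the `-rc1` suffix used in the usage note is an assumed example of the naming that broke verification.

```python
import io
import tarfile


def top_level_dirs(tar_bytes: bytes) -> set:
    """Collect the first path component of every entry in a .tar.gz archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        return {name.split("/", 1)[0] for name in tar.getnames()}


def expected_root(version: str) -> str:
    """Top-level directory name without any release candidate number."""
    return f"apache-arrow-{version}"


def make_sample_tarball(root: str) -> bytes:
    """Build a tiny in-memory archive with one file under `root`, for demonstration."""
    payload = b"placeholder\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=f"{root}/README.md")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()
```

With these helpers, `top_level_dirs(make_sample_tarball("apache-arrow-17.0.0"))` equals `{expected_root("17.0.0")}`, while an archive rooted at `apache-arrow-17.0.0-rc1` does not.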

### Are there any user-facing changes?

No

* GitHub Issue: #43199

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…42003)

### Rationale for this change

This PR is complementary to #41638 .

The prior PR reduces reallocations in `PooledBufferWriter`. However the
problematic formula it addressed is still used in other functions.

In addition to this, `(*PooledBufferWriter).Reserve()` simply doubles
the capacity of buffers regardless of its argument `nbytes`. This may
result in excessive allocations in some cases.

### What changes are included in this PR?

- Applied the fixed formula to `(*BufferWriter).Reserve()`.
- Updated the new capacity passed to `(*memory.Buffer).Reserve()`.
- Now using `bitutil.NextPowerOf2(b.pos + nbytes)` to avoid
reallocations when adding `nbytes`.
- Replaced `math.Max` with `utils.Max` in
`(*bufferWriteSeeker).Reserve()` to avoid unnecessary type conversions.
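
To illustrate why sizing from `b.pos + nbytes` matters, here is a simplified Python model of the two growth policies. It is not the Go implementation: `next_power_of_2` is a stand-in for `bitutil.NextPowerOf2` (its rounding for inputs that are already powers of two is an assumption), and both `reserve_*` functions are hypothetical.

```python
def next_power_of_2(x: int) -> int:
    """Smallest power of two that is >= x (assumed rounding, see lead-in)."""
    return 1 if x <= 1 else 1 << (x - 1).bit_length()


def reserve_doubling(capacity: int, pos: int, nbytes: int) -> int:
    """Model of the old policy: double the capacity without looking at nbytes."""
    if pos + nbytes <= capacity:
        return capacity
    return max(capacity * 2, 1)


def reserve_next_pow2(capacity: int, pos: int, nbytes: int) -> int:
    """Model of the new policy: grow straight to a size that fits pos + nbytes."""
    if pos + nbytes <= capacity:
        return capacity
    return next_power_of_2(pos + nbytes)


# A 1 KiB buffer with 1000 bytes written, about to receive 5000 more bytes.
print(reserve_doubling(1024, 1000, 5000))   # still too small, so another reallocation follows
print(reserve_next_pow2(1024, 1000, 5000))  # large enough in one step
```

In this model the doubling policy needs several rounds of growth for one large write, whereas the power-of-two policy sized from the required length reallocates once.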

### Are these changes tested?

Yes. The following commands pass.

```
$ export PARQUET_TEST_DATA=$PWD/cpp/submodules/parquet-testing/data
$ (cd go && go test ./...)
```

### Are there any user-facing changes?

No, but it may reduce the number of allocations and improve the
throughput.

Before:

```
$ go test -test.run='^$' -test.bench='^BenchmarkWriteColumn$' -benchmem ./parquet/pqarrow/...
goos: linux
goarch: arm64
pkg: github.com/apache/arrow/go/v17/parquet/pqarrow
BenchmarkWriteColumn/int32_not_nullable-10                  1190           1016705 ns/op        4125.39 MB/s     5443579 B/op        240 allocs/op
BenchmarkWriteColumn/int32_nullable-10                        52          24780561 ns/op         169.26 MB/s    12048944 B/op        249 allocs/op
BenchmarkWriteColumn/int64_not_nullable-10                   632           1717090 ns/op        4885.36 MB/s     5445954 B/op        265 allocs/op
BenchmarkWriteColumn/int64_nullable-10                        51          22949770 ns/op         365.52 MB/s    12209860 B/op        262 allocs/op
BenchmarkWriteColumn/float32_not_nullable-10                 519           2234718 ns/op        1876.88 MB/s     5452627 B/op       1263 allocs/op
BenchmarkWriteColumn/float32_nullable-10                      56          23423793 ns/op         179.06 MB/s    12057540 B/op       1272 allocs/op
BenchmarkWriteColumn/float64_not_nullable-10                 416           2761247 ns/op        3037.98 MB/s     5507068 B/op       1292 allocs/op
BenchmarkWriteColumn/float64_nullable-10                      51          25767881 ns/op         325.55 MB/s    12059614 B/op       1285 allocs/op
PASS
ok      github.com/apache/arrow/go/v17/parquet/pqarrow  10.592s
```

After:

```
$ go test -test.run='^$' -test.bench='^BenchmarkWriteColumn$' -benchmem ./parquet/pqarrow/...
goos: linux
goarch: arm64
pkg: github.com/apache/arrow/go/v17/parquet/pqarrow
BenchmarkWriteColumn/int32_not_nullable-10                  1196            959528 ns/op        4371.22 MB/s     5420349 B/op        238 allocs/op
BenchmarkWriteColumn/int32_nullable-10                        51          23017598 ns/op         182.22 MB/s    14138480 B/op        248 allocs/op
BenchmarkWriteColumn/int64_not_nullable-10                   690           1671710 ns/op        5017.98 MB/s     5419878 B/op        263 allocs/op
BenchmarkWriteColumn/int64_nullable-10                        50          23196051 ns/op         361.64 MB/s    13728465 B/op        261 allocs/op
BenchmarkWriteColumn/float32_not_nullable-10                 540           2185075 ns/op        1919.52 MB/s     5459392 B/op       1261 allocs/op
BenchmarkWriteColumn/float32_nullable-10                      54          21796783 ns/op         192.43 MB/s    14150622 B/op       1271 allocs/op
BenchmarkWriteColumn/float64_not_nullable-10                 418           2708292 ns/op        3097.38 MB/s     5455095 B/op       1290 allocs/op
BenchmarkWriteColumn/float64_nullable-10                      51          22174952 ns/op         378.29 MB/s    14142791 B/op       1283 allocs/op
PASS
ok      github.com/apache/arrow/go/v17/parquet/pqarrow  10.210s
```

* GitHub Issue: #41541
…3208)

### Rationale for this change

Currently our java-jars and some wheels jobs are failing because they download the wrong version of Apache Thrift, based on the 0.20.0 branch instead of the tag. That branch contains a new commit that makes the SHA validation fail.

### What changes are included in this PR?

Apply the Thrift patch that was applied on vcpkg here: microsoft/vcpkg#39787

### Are these changes tested?
Via archery

### Are there any user-facing changes?
No
* GitHub Issue: #43204

Authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@raulcd raulcd requested a review from zeroshade as a code owner July 11, 2024 08:55