DO NOT MERGE: Testing create Java-jars against 15.0.0 commit release to test Linux dataset errors #40003
Closed
davisusanibar wants to merge 22 commits into apache:main
### Rationale for this change We sometimes need to use a more modern cmake. Before this change, although we downloaded a functioning cmake on macos, we didn't have the correct path for it. ### What changes are included in this PR? Resolves apache#38811 so that cmake is useable when downloaded on macos. This also restores the local source build jobs to be testing that source builds work (which is what the CI jobs say they are doing). I believe these jobs started using binaries when we overhauled the build system last release. ### Are these changes tested? Yes, in CI with the local (source) install jobs in crossbow. ### Are there any user-facing changes? * Closes: apache#38811 Authored-by: Jonathan Keane <jkeane@gmail.com> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
…ghtly CI build (apache#39498) Update version checks and assertions of pyarrow array equality for pandas failing tests on the CI: [test-conda-python-3.10-pandas-nightly](https://github.com/ursacomputing/crossbow/actions/runs/7391976015/job/20109720695) * Closes: apache#39437 Lead-authored-by: AlenkaF <frim.alenka@gmail.com> Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com> Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
…ct (apache#39522) ### Rationale for this change With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'. ### What changes are included in this PR? Force a sequential build for jemalloc by setting -j1. ### Are these changes tested? CI ### Are there any user-facing changes? No. * Closes: apache#39517 Authored-by: Jacob Wujciak-Jens <jacob@wujciak.de> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
### Rationale for this change The CRAN check on `fedora clang devel` builds with clang against libc++ and has a system re2 installed that was built with the C++11 ABI, which causes linking to fail due to the [abi:cxx11]-symbol annotation on the system version. A user could manually use the bundled build or path hint a clang version of the library. To avoid extra work for the CRAN maintainers we can just default to the bundled build. The re2 build is small enough that users building from source will not really feel the difference and can still opt to use the system re2 via `EXTRA_CMAKE_FLAGS`. ### What changes are included in this PR? Default to use our bundled build to prevent the problems. ### Are these changes tested? On a local dev container replicating the CRAN env. ### Are there any user-facing changes? Source builds now default to the bundled re2 version; this can be overridden. Authored-by: Jacob Wujciak-Jens <jacob@wujciak.de> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
…mpty and test_view (apache#39534) Skipping dask tests `test_dataframe.py::test_describe_empty` and `test_dataframe.py::test_view` on our CI to stop the nightly dask test jobs from failing. * Closes: apache#39531 Authored-by: AlenkaF <frim.alenka@gmail.com> Signed-off-by: AlenkaF <frim.alenka@gmail.com>
…requirements for the 15.x release branch (apache#39538) ### Rationale for this change PyArrow wheels for the 15.0.0 release will not be compatible with future numpy 2.0 packages, therefore it is recommended to add this upper pin now for _releases_. We will keep the more flexible pin on the development branch (by reverting this commit on main, but so it can be cherry-picked in the release branch) * Closes: apache#39537 Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
…pache#39535) ### Rationale for this change Removing usage of `np.core`, as that is deprecated and will be removed in numpy 2.0. For this specific case, we can just hardcode the list of data types instead of using a numpy api (this list doesn't typically change). * Closes: apache#39533 Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
…e#39516) ### Rationale for this change For Iceberg we want to add metadata to the type (the field-id), therefore we need to pass in the type, analogous to what we do for `ListArray.from_arrays(self, offsets, values, DataType type=None, MemoryPool pool=None, mask=None)`. ### What changes are included in this PR? Updated a keyword argument for the `type`, and made sure that the static method to create the MapType is exposed from the cpp side. ### Are these changes tested? I've added a simple test. ### Are there any user-facing changes? * Closes: apache#39515 Authored-by: Fokko Driesprong <fokko@tabular.io> Signed-off-by: AlenkaF <frim.alenka@gmail.com>
…to run integration tests (apache#39502) ### Rationale for this change Integration verification tasks are currently failing on CI. ### What changes are included in this PR? Install jpype and build JNI c-data to run integration tests. ### Are these changes tested? Yes, via archery. ### Are there any user-facing changes? No. * Closes: apache#38470 Lead-authored-by: Raúl Cumplido <raulcumplido@gmail.com> Co-authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
) ### Rationale for this change The version currently set on the maintenance branch is incorrect for the Java BOM. ### What changes are included in this PR? Suggested changes to explicitly set the version for the BOM and Maven. ### Are these changes tested? I will trigger java-jars via archery but I think this is currently only reproducible on the maintenance branch. So we will have to merge and validate there. ### Are there any user-facing changes? No * Closes: apache#39564 Authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
… to fix macOS build with conda (apache#39589) ### Rationale for this change CI job has been failing since we added integration tests. ### What changes are included in this PR? Add `CGO_ENABLED=1` to go build cdata_integration on the verification script. ### Are these changes tested? Yes via archery. ### Are there any user-facing changes? No * Closes: apache#39588 Authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…apache#39602) See apache#39601 ### Are these changes tested? Existing CI should pass. This should also pass on macbuilder without downloading cmake, and if hardcoding `download_ok <- FALSE`, it should exit cleanly and informatively. ### Are there any user-facing changes? Define "user". * Closes: apache#39601 Authored-by: Neal Richardson <neal.p.richardson@gmail.com> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
### Rationale for this change Resolves apache#39584 ### What changes are included in this PR? We now only check the checksum after the download succeeded, and try to be quieter about it when we do. We also use bundled boost and lz4 source on macos by default (to avoid system versions of each on cran that seem to have issues) ### Are these changes tested? I submitted a download-malignant (and verbose) build to [CRAN's macbuilder](https://mac.r-project.org/macbuilder/results/1705088784-991a5beacf4ec26e/) and it succeeds. ### Are there any user-facing changes? In principle the macos source build is slightly altered + we have a cleaner path when file downloads fail. But both of these should be relatively non-impactful since most macos users are getting binaries from CRAN. Most importantly it helps us stay on CRAN. **This PR contains a "Critical Fix".** * Closes: apache#39584 Lead-authored-by: Jonathan Keane <jkeane@gmail.com> Co-authored-by: Jacob Wujciak-Jens <jacob@wujciak.de> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
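The ordering this commit describes (check the checksum only after the download has succeeded, and stay quiet when it has not) can be sketched roughly as below. This is a minimal Python illustration, not the R package's actual code: `fetch_and_verify`, the `download` callable, and the convention of returning `None` on failure are all hypothetical names chosen for the example.

```python
import hashlib
from typing import Callable, Optional


def fetch_and_verify(
    download: Callable[[], Optional[bytes]],
    expected_sha256: str,
) -> Optional[bytes]:
    """Return the payload only if the download succeeded and its checksum matches.

    `download` is a hypothetical callable that returns the bytes on success
    and None on failure (for example, when the network is unavailable).
    """
    payload = download()
    if payload is None:
        # Download failed: skip the checksum step entirely and fail quietly.
        return None
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        # Download succeeded, but the content is not what was expected.
        return None
    return payload
```

A caller would treat a `None` result as "fall back or exit cleanly", without a checksum error being reported for a file that was never downloaded.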
…pache#39625) ### Rationale for this change CMake is now a sysreq and we don't want to default to using nightly builds in CI ### Are these changes tested? Crossbow * Closes: apache#39624 Authored-by: Jacob Wujciak-Jens <jacob@wujciak.de> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
### What changes are included in this PR? The verification script is modified to look for the versions of .NET now supported by the package. ### Are these changes tested? Manually tested the verification command. * Closes: apache#39598 Authored-by: Curt Hagenlocher <curt@hagenlocher.org> Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
…_filtering (apache#39632) ### Rationale for this change `ParquetFileFragment` stores a `SchemaManifest` that has a raw pointer to a `SchemaDescriptor`. The `SchemaDescriptor` is originally provided by a `FileMetadata` instance but, in some cases, the `FileMetadata` instance can be destroyed while the `ParquetFileFragment` is still in use. This can typically lead to bugs or crashes. ### What changes are included in this PR? Ensure that `ParquetFileFragment` keeps an owning pointer to the `FileMetadata` instance that provides its `SchemaManifest`'s schema descriptor. ### Are these changes tested? An assertion is added that would fail deterministically in the Python test suite. ### Are there any user-facing changes? No. * Closes: apache#39562 Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
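The lifetime problem in this commit can be illustrated with a loose Python analogy. This is only a sketch: the actual fix is in C++, where a dangling raw pointer is undefined behaviour rather than a clean `None`. Here a weak reference stands in for the raw pointer, and all class and function names (`Fragment`, `descriptor_survives`, the `keep_metadata` flag) are hypothetical stand-ins, not Arrow APIs.

```python
import gc
import weakref


class SchemaDescriptor:
    """Stand-in for the schema description owned by the file metadata."""

    def __init__(self, columns):
        self.columns = columns


class FileMetadata:
    """Owns the SchemaDescriptor, as in the commit description."""

    def __init__(self, columns):
        self.schema = SchemaDescriptor(columns)


class SchemaManifest:
    """Holds a non-owning (weak) reference, mimicking a raw pointer."""

    def __init__(self, descriptor):
        self._descriptor = weakref.ref(descriptor)

    def descriptor(self):
        # Returns None once the owning FileMetadata has been destroyed.
        return self._descriptor()


class Fragment:
    def __init__(self, metadata, keep_metadata):
        self.manifest = SchemaManifest(metadata.schema)
        # The idea of the fix: also hold an owning reference to the metadata.
        self.metadata = metadata if keep_metadata else None


def descriptor_survives(keep_metadata):
    metadata = FileMetadata(["a", "b"])
    fragment = Fragment(metadata, keep_metadata)
    del metadata  # the caller drops its own reference
    gc.collect()
    return fragment.manifest.descriptor() is not None
```

With `keep_metadata=False` the descriptor disappears as soon as the caller drops the metadata; with `keep_metadata=True` the fragment keeps it alive for as long as the manifest needs it.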
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.
@assignUser is it possible to run the java-jars task again to rebuild the jars using the v15 release?
@github-actions crossbow submit java-jars
@github-actions crossbow submit java-jars
Revision: 0853744 Submitted crossbow builds: ursacomputing/crossbow @ actions-33a76f212b
Rationale for this change
We are seeing errors when using the Apache Arrow Java Dataset module in Linux environments. This PR builds the Java jars against the 15.0.0 release commit to test those Linux dataset errors.
What changes are included in this PR?
The same changes as the v15 release: https://github.com/apache/arrow/tree/a61f4af724cd06c3a9b4abd20491345997e532c0
Are these changes tested?
Yes
Are there any user-facing changes?
No