-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-41134: [GLib] Support building arrow-glib with MSVC #41599
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
.github/workflows/ruby.yml
Outdated
env: | ||
# We can invalidate the current cache by updating this. | ||
CACHE_VERSION: "2024-05-09" | ||
- name: Build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- name: Build | |
- name: Build C++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also use vcpkg to install dependencies of C++?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into this and got it working on another branch (https://github.com/adamreeve/arrow/tree/msvc_glib_vcpkg), but it's quite slow to build all the vcpkg dependencies (48 minutes on my GitHub actions run) and this isn't integrated with ccache so will be slow every time. If you think it's worth doing I can look at caching the vcpkg builds too? We do this in ParquetSharp so it shouldn't be too hard to add: https://github.com/G-Research/ParquetSharp/blob/ef8e7c447ecd5d84205eb73293162f5689e977b8/.github/workflows/ci.yml#L99-L118
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Let's work on it as a separated task.
FYI: We have a job that uses VCPKG_BINARY_SOURCES
:
VCPKG_BINARY_SOURCES: 'clear;nuget,GitHub,readwrite' |
It may also work.
c_glib/arrow-glib/array-builder.h
Outdated
GArrowArray * | ||
garrow_array_builder_finish(GArrowArrayBuilder *builder, GError **error); | ||
|
||
GARROW_AVAILABLE_IN_2_0 | ||
GARROW_EXPORT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add this macro to GARROW_AVAILABLE_IN_XXX
. GLib does it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I was originally thinking of doing, but then I was worried about how this works with the different garrow libraries. We'd probably need separate available/deprecated macros per library like:
GARROW_AVAILABLE_IN_XXX
GARROW_DEPRECATED_IN_XXX
GARROW_DATASET_AVAILABLE_IN_XXX
GARROW_DATASET_DEPRECATED_IN_XXX
GARROW_FLIGHT_AVAILABLE_IN_XXX
GARROW_FLIGHT_DEPRECATED_IN_XXX
GARROW_FLIGHT_SQL_AVAILABLE_IN_XXX
GARROW_FLIGHT_SQL_DEPRECATD_IN_XXX
...
That way when you build eg. arrow-dataset-glib and include headers from arrow-glib, they can correctly get the dllimport
attribute while code in arrow-dataset-glib gets dllexport
.
Maybe I should look at integrating that Python script from GLib to reduce the boilerplate required. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. We need separated version macros per library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I could have this PR only change the arrow-glib library and do the others in a follow up to keep the changes manageable, but the MinGW build fails with link errors like "redeclared without dllimport attribute after being referenced with dll linkage", so I might need to add the separated version macros as part of this PR too, or make the visibility attributes MSVC only for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the macro definitions to only add dllimport/dllexport for MSVC for now, so this works as currently only the arrow-glib library is built for MSVC. But I can follow up to also add them under MinGW GCC later after adding version macros per-library. Does that work for you or would you rather include the per-library macros in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about opening a separated PR that adds version macros per-library, merging it and rebasing on main?
(I'm OK with a follow up approach.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea, I've made #41681 for this and will make a PR for that before updating this PR.
b1ba360
to
beffceb
Compare
ARROW_FLIGHT: OFF | ||
ARROW_FLIGHT_SQL: OFF | ||
ARROW_HDFS: OFF | ||
ARROW_HOME: "${{ github.workspace }}/dist" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to change ARROW_HOME
to be in the workspace directory rather than /usr
, as /usr
becomes C:\Program Files\Git\usr
as a Windows path and I was getting errors when linking like:
[47/54] Linking target arrow-glib/arrow-glib-1700.dll
FAILED: arrow-glib/arrow-glib-1700.dll arrow-glib/arrow-glib-1700.pdb
"link" @arrow-glib/arrow-glib-1700.dll.rsp
LINK : fatal error LNK1181: cannot open input file 'Files\Git\usr\bin.obj'
I didn't get to the bottom of where this was going wrong but it seems like something in meson is failing to quote the linker command inputs.
@github-actions crossbow submit -g release |
|
@github-actions crossbow submit -g nightly-release |
Revision: 4dc31f8 Submitted crossbow builds: ursacomputing/crossbow @ actions-012c96d84c |
I've rebased on top of the changes from #41721 to add per-library version headers, and added per-library visibility headers too. I've had to keep the Gandiva build disabled due to #36958 and also disabled the Flight libraries due to a missing zlib dependency, but I'm hoping that should be fixed by switching to vcpkg dependencies for the C++ library build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -25,7 +25,7 @@ if [ "$#" -lt 1 ]; then | |||
fi | |||
|
|||
arrow_dir=$(cd -- "$(dirname -- "$0")/../.." && pwd -P) | |||
default_vcpkg_version=$(cat "${arrow_dir}/.env" | grep "VCPKG" | cut -d "=" -f2 | tr -d '"') | |||
default_vcpkg_version=$(source "${arrow_dir}/.env" && echo "$VCPKG" || echo "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove || echo ""
?
default_vcpkg_version=$(source "${arrow_dir}/.env" && echo "$VCPKG" || echo "") | |
default_vcpkg_version=$(source "${arrow_dir}/.env" && echo "$VCPKG") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was needed because there are some tasks that run in Docker where the .env
file isn't included (it's not explicitly included in .dockerignore
), eg:
RUN arrow/ci/scripts/install_vcpkg.sh ${VCPKG_ROOT} ${vcpkg} |
From what I can see these explicitly set the vcpkg version so default_vcpkg_version
isn't used. Maybe it would make more sense to only try to read the vcpkg_version from .env
if $2
isn't set?
This wasn't a problem previously because the pipefail option isn't set, so the failure calling cat
was ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
Maybe it would make more sense to only try to read the vcpkg_version from
.env
if$2
isn't set?
+1
…ified (#41805) ### Rationale for this change See #41599 (comment) This is a small tidy up of `install_vcpkg.sh` based on code review in #41599 after it was merged. Some uses of `install_vcpkg.sh` are in Docker containers where the `.env` file hasn't been copied. Rather than try to read it and ignore any errors, only read the `.env` file if the vcpkg version wasn't specified as an argument to the script. This way if there is an error reading the `.env` file and we do need the default version, the error should be more helpful. ### What changes are included in this PR? Update `install_vcpkg.sh` to only try to read the vcpkg version from `.env` if it isn't specified as an argument and don't ignore any errors. ### Are these changes tested? Yes, this script already runs as part of CI. ### Are there any user-facing changes? No Authored-by: Adam Reeve <adreeve@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 799021a. There were 5 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 33 possible false positives for unstable benchmarks that are known to sometimes produce them. |
… library (apache#41721) ### Rationale for this change This is to support later using the `*_AVAILABLE_IN_*` macros to add `dllexport/dllimport` attributes required for building these libraries with MSVC (apache#41134) ### What changes are included in this PR? * Add a Python script that generates `DEPRECATED_IN` and `AVAILABLE_IN` macros for each GLib library * Add missing `AVAILABLE_IN` annotations to some methods in the GLib libraries (except the main arrow-glib library as this is being done in apache#41599) ### Are these changes tested? This doesn't include any behaviour change that can be unit tested. ### Are there any user-facing changes? No * GitHub Issue: apache#41681 Lead-authored-by: Adam Reeve <adreeve@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…41599) ### Rationale for this change Allow Windows users to more easily build the GLib libraries. ### What changes are included in this PR? * Minor fixes to allow building with MSVC: * Changes some uses of variable length arrays to `std::vector`, because MSVC doesn't support variable length arrays * Moves some function definitions that use C++ types outside of `G_BEGIN_DECLS`/`G_END_DECLS` blocks (which expand to `extern C { }`), because this caused MSVC to error * Fix libraries not having any exported symbols with MSVC, which defaults to hiding symbols * Add `visibility.h` which defines a new `GARROW_EXTERN` macro that adds `dllimport` or `dllexport` attributes when using MSVC. * Include the `GARROW_EXTERN` macro in the definitions of the `GARROW_AVAILABLE_IN_*` macros. * Add a new CI job that builds the GLib libraries with MSVC on Windows, using vcpkg to install pkgconfig and glib. * For now only `arrow-glib` is built, I can follow up with the other libraries after this PR. That will require introducing new per-library version macros. ### Are these changes tested? The build will be tested in CI but I've only done some quick manual tests that the built library works correctly, I haven't got the ruby tests running against the build yet. ### Are there any user-facing changes? No? Eventually some documentation should be updated when all the GLib libraries can be built with MSVC though * GitHub Issue: apache#41134 Lead-authored-by: Adam Reeve <adreeve@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ified (apache#41805) ### Rationale for this change See apache#41599 (comment) This is a small tidy up of `install_vcpkg.sh` based on code review in apache#41599 after it was merged. Some uses of `install_vcpkg.sh` are in Docker containers where the `.env` file hasn't been copied. Rather than try to read it and ignore any errors, only read the `.env` file if the vcpkg version wasn't specified as an argument to the script. This way if there is an error reading the `.env` file and we do need the default version, the error should be more helpful. ### What changes are included in this PR? Update `install_vcpkg.sh` to only try to read the vcpkg version from `.env` if it isn't specified as an argument and don't ignore any errors. ### Are these changes tested? Yes, this script already runs as part of CI. ### Are there any user-facing changes? No Authored-by: Adam Reeve <adreeve@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Rationale for this change
Allow Windows users to more easily build the GLib libraries.
What changes are included in this PR?
std::vector
, because MSVC doesn't support variable length arraysG_BEGIN_DECLS
/G_END_DECLS
blocks (which expand toextern C { }
), because this caused MSVC to errorvisibility.h
which defines a newGARROW_EXTERN
macro that addsdllimport
ordllexport
attributes when using MSVC.GARROW_EXTERN
macro in the definitions of theGARROW_AVAILABLE_IN_*
macros.arrow-glib
is built, I can follow up with the other libraries after this PR. That will require introducing new per-library version macros.Are these changes tested?
The build will be tested in CI but I've only done some quick manual tests that the built library works correctly, I haven't got the ruby tests running against the build yet.
Are there any user-facing changes?
No? Eventually some documentation should be updated when all the GLib libraries can be built with MSVC though