Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start with cmake build #629

Merged
merged 30 commits into from
Sep 22, 2022
Merged

Start with cmake build #629

merged 30 commits into from
Sep 22, 2022

Conversation

zklaus
Copy link
Contributor

@zklaus zklaus commented May 17, 2022

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Closes #625

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@zklaus
Copy link
Contributor Author

zklaus commented May 17, 2022

@conda-forge-admin, please rerender

2 similar comments
@zklaus
Copy link
Contributor Author

zklaus commented May 17, 2022

@conda-forge-admin, please rerender

@zklaus
Copy link
Contributor Author

zklaus commented May 18, 2022

@conda-forge-admin, please rerender

@zklaus
Copy link
Contributor Author

zklaus commented May 19, 2022

@conda-forge/gdal, I think this is about as far as I can go.

The main open question for me now is how to address the Python bindings. Currently, they are built the old way with explicit calls to setup.py. In principle, I would like to build them via CMake as well, but I don't know how to achieve this with the splitting.

Then it would be good to double-check that the feature set is not changing and if there is a need to add more parameters to the CMake call.

Finally, I wonder what people here think about bringing the https://github.com/conda-forge/gdal-csharp-feedstock into this multi build.

All thoughts, comments, and commits are welcome.

@ocefpaf
Copy link
Member

ocefpaf commented May 19, 2022

I would like to build them via CMake as well, but I don't know how to achieve this with the splitting.

It is a bit annoying but we can list the files that goes into each package. Something like this:

https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#specifying-files-to-include-in-output

So you would build once and then split the resulting files into the outputs.

Thanks for all your effort BTW! I'm a nit tied up at the day job but I'll try to see if I can implement that here.

@gillins
Copy link
Contributor

gillins commented May 19, 2022

Thanks for pushing on with this @zklaus. So all the same drivers are built as the old build?

Do you know why the test failure on Windows?

Happy to consider other language bindings, but gdal-csharp appears to be a separate project (as opposed to the Python bindings which are part of the gdal tree). Maybe it would make more sense to have this as a separate feedstock?

@zklaus
Copy link
Contributor Author

zklaus commented May 20, 2022

I would like to build them via CMake as well, but I don't know how to achieve this with the splitting.

It is a bit annoying but we can list the files that goes into each package. Something like this:
https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#specifying-files-to-include-in-output
So you would build once and then split the resulting files into the outputs.
Thanks for all your effort BTW! I'm a nit tied up at the day job but I'll try to see if I can implement that here.

Good idea. I activated build artifact storage so that we can compare the files with and without Python bindings. Let's see where that gets us; perhaps it's easy.

Thanks for pushing on with this @zklaus. So all the same drivers are built as the old build?

Good question. I am not sure. From what I see, the cmake build system is a lot smarter about finding things on its own, largely due to the one CMAKE_PREFIX_PATH compared to all the individual --with-...=${PREFIX} in the old system. That makes the build script more compact, but less informative. The output of the configure stage of the cmake build, however, is excellent, listing all the found and configured options. So we should compare line by line the options between the two build system to make sure. What is not completely clear to me here, is whether the list of --with-... is the complete list of activated drivers, or wether some automatically found ones were added in addition. Do you know?

Do you know why the test failure on Windows?

This is a problem in the Python bindings. The linker doesn't find gdal_i.lib in numerous places. That may have something to do with

if %vc% LEQ 9 set MSVC_VER=1500
if %vc% GTR 9 set MSVC_VER=1900
, which I removed along with the rest of the BLD_OPTS.

Happy to consider other language bindings, but gdal-csharp appears to be a separate project (as opposed to the Python bindings which are part of the gdal tree). Maybe it would make more sense to have this as a separate feedstock?

I don't think that's true. At least, it uses the same source (see https://github.com/conda-forge/gdal-csharp-feedstock/blob/2d1c8ab84fea6685619bbb697947b4ff795fb32e/recipe/meta.yaml#L9) and the whole csharp binding is in the tree we use. On the other hand, the other feedstock exists and I wouldn't want to infringe on their turf. They do struggle with the cmake based build system as well right now in one of their PRs, but the build script is also rather more involved than what we have here.

@zklaus
Copy link
Contributor Author

zklaus commented May 20, 2022

@conda-forge-admin, please rerender

@gillins
Copy link
Contributor

gillins commented May 21, 2022

Do the tests dump the result of gdalinfo --formats and gdalinfo --formats (if they don't, they should)? If you could compare the results from the current build that'd be great (sorry, travelling atm and only have a very small screen). Happy to take your word on the result. Would be bad if we dropped a few important drivers accidentally...

Regarding Windows, is this a problem that needs to be reported upstream? Sounds like it should just work with cmake.

Happy to defer to @ocefpaf on the csharp question - probably need to resolve that at a higher level.

@ocefpaf
Copy link
Member

ocefpaf commented May 23, 2022

I don't think that's true. At least, it uses the same source (see https://github.com/conda-forge/gdal-csharp-feedstock/blob/2d1c8ab84fea6685619bbb697947b4ff795fb32e/recipe/meta.yaml#L9) and the whole csharp binding is in the tree we use. On the other hand, the other feedstock exists and I wouldn't want to infringe on their turf. They do struggle with the cmake based build system as well right now in one of their PRs, but the build script is also rather more involved than what we have here.

If you think building the csharp bindings here is the best option, b/c it is in the source tree and it is easier to maintain, we should:

  1. Notify the other feedstock team we are doing it and invite them to this one if they wish to continue maintaining it.
  2. Fix the outputs feedstock to point the new source of that package.
  3. Archive the other feedstock.

With that said. I don't use gdal-csharp and I don't have any skin in the game, so I'll leave it to you and the other feedstock team to decided on what is best for all. I can take care of steps 2 and 3 once you make a decision.

@zklaus
Copy link
Contributor Author

zklaus commented May 23, 2022

Alright, let's focus on getting the existing things in order first. I'll start a dialog with the other team in parallel. I am still puzzled by the problems in win and the osx-arm64 cross build. Those may be due to the fact that the original osx-arm64 migration was done with the autoconf build.

@zklaus
Copy link
Contributor Author

zklaus commented May 31, 2022

@conda-forge-admin, please rerender

@gillins
Copy link
Contributor

gillins commented Jun 3, 2022

copying build\lib.win-amd64-cpython-38\osgeo\_gdal.cp38-win_amd64.pyd -> D:bld\gdal-split_1654181135955\_h_env\Library\Lib\site-packages\osgeo

Seems to be copying into the wrong spot? Library shouldn't be there (for Python bindings, ok for the libs etc).

@zklaus
Copy link
Contributor Author

zklaus commented Jun 3, 2022

@conda-forge-admin, please rerender

@zklaus
Copy link
Contributor Author

zklaus commented Jun 3, 2022

Just rebasing to solve the conflicts.

@zklaus
Copy link
Contributor Author

zklaus commented Jun 3, 2022

copying build\lib.win-amd64-cpython-38\osgeo\_gdal.cp38-win_amd64.pyd -> D:bld\gdal-split_1654181135955\_h_env\Library\Lib\site-packages\osgeo

Seems to be copying into the wrong spot? Library shouldn't be there (for Python bindings, ok for the libs etc).

Very possible! As you may see from the patches, the last problem I addressed comes from the "Windows native" paths, i.e. the backslashes that are in cmake occasionally interpreted as escape characters, leading to problems like here:

CMake Warning (dev) at build_ext.cmake:1 (execute_process):
  Syntax error in cmake code at

    D:/bld/gdal-split_1654095001842/work/build/swig/python/build_ext.cmake:1

  when parsing string

    %PREFIX%\python.exe

  Invalid escape sequence \b

  Policy CMP0010 is not set: Bad variable reference syntax is an error.  Run
  "cmake --help-policy CMP0010" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
This warning is for project developers.  Use -Wno-dev to suppress it.

I mention this because one problem with the path you showed is that there is no separator between the drive letter and the rest of the path. If I recall correctly from my Windows experience some 15 years ago, that might be interpreted as a path relative to the CWD on D:, which I don't know what that is.

@zklaus
Copy link
Contributor Author

zklaus commented Jun 6, 2022

@conda-forge-admin, please rerender

@gillins
Copy link
Contributor

gillins commented Jul 7, 2022

@zklaus where did this get to? Is this more likely to work now that 3.5.1 is out?

@zklaus
Copy link
Contributor Author

zklaus commented Jul 8, 2022

@zklaus where did this get to? Is this more likely to work now that 3.5.1 is out?

Actually, I think it was fully working. But I did not finish checking that it reproduces the exact same drivers and plugins.
I'll rebase and we can take the check from there, ok?

@zklaus
Copy link
Contributor Author

zklaus commented Jul 8, 2022

@conda-forge-admin, please rerender

@zklaus zklaus marked this pull request as ready for review July 8, 2022 12:14
@zklaus
Copy link
Contributor Author

zklaus commented Sep 22, 2022

Alright, let's get this over with. The last open question seems to be from @gillins

Do we need to have a migrator just to be cautious, or are we happy the new libs will just work against existing builds?

I am not familiar enough with migrators to have an opinion on this. What do others think about it? Is that a reason to wait with the merge or can we do that regardless and set up the migrator after that?

@xylar
Copy link
Contributor

xylar commented Sep 22, 2022

As far as I understand it, this will just produce a new build of an existing version of gdal. No migration is possible or necessary. Downstream packages that already use this version will now pull in this build. This is why it's very important that the autotools and cmake versions produce compatible libraries (e.g. the same so).

I think we can just merge.

@akrherz
Copy link
Contributor

akrherz commented Sep 22, 2022

I smashing the big green button here and can take all the blame if this blows up :)

@djhoese
Copy link

djhoese commented Sep 22, 2022

Hey so this seems to have broken rasterio on Windows and OSX, but not Linux (as far as I can tell). I've been able to reproduce it on CI and am asking coworkers with macs to test it. You can see my ramblings in #648 which I thought was the original problem.

Edit: Can work on an issue when I get confirmation of the reproducer.

conda create -n test_davidh_gdal -c conda-forge --strict-channel-priority python=3.10 gdal rasterio
conda activate test_davidh_gdal
python -c "import osgeo.gdal; import rasterio"

Edit 2: Having trouble getting this reproducer to fail on real metal intel mac or M1 mac. Only failed so far on github actions osx-latest.

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Sep 22, 2022

FYI this also broke source installs for pyogrio on Windows because of the library name change (gdal_i.lib -> gdal.lib), see geopandas/pyogrio#158 (comment) (but don't yet know if conda-forge binaries are failing as well)

@jorisvandenbossche
Copy link
Member

I can confirm that this has also broken Fiona on Windows and MacOS. In both cases, I see the following error on geopandas' CI:

     import fiona._loading
>   with fiona._loading.add_gdal_dll_directories():
E   AttributeError: partially initialized module 'fiona' has no attribute '_loading' (most likely due to a circular import)

@jorisvandenbossche
Copy link
Member

Pyogrio is also failing. This gives a more useful error message compared to the one from Fiona above, at least on macOS (https://github.com/geopandas/geopandas/actions/runs/3108750437/jobs/5038295919):

dlopen(/Users/runner/miniconda3/envs/test/lib/python3.8/site-packages/pyogrio/_ogr.cpython-38-darwin.so, 2): Library not loaded: @rpath/libgdal.31.dylib
Referenced from: /Users/runner/miniconda3/envs/test/lib/python3.8/site-packages/pyogrio/_ogr.cpython-38-darwin.so
Reason: Incompatible library version: _ogr.cpython-38-darwin.so requires version 32.0.0 or later, but libgdal.31.dylib provides version 31.0.0

So it seems that on MacOS the library name actually changed in the end.

And on Windows (https://github.com/geopandas/geopandas/actions/runs/3108750437/jobs/5038296097):

ImportError: DLL load failed while importing _ogr: The specified module could not be found.

@isuruf
Copy link
Member

isuruf commented Sep 22, 2022

@rouault
Copy link
Contributor

rouault commented Sep 22, 2022

Reason: Incompatible library version: _ogr.cpython-38-darwin.so requires version 32.0.0 or later, but libgdal.31.dylib provides version 31.0.0

_ogr.cpython-38-darwin.so was not rebuilt against the updated GDAL conda-forge package, right ? It is well possible that there are differences in shared object versionning between cmake and libtool. We tried to emulate libtool SONAME behaviour with cmake for Linux, but libtool behaviour is platform dependent, so probably there's a difference for Mac. So it is likely that a rebuild of GDAL reverse dependencies is needed on conda-forge

@akrherz
Copy link
Contributor

akrherz commented Sep 22, 2022

It sounds like the path forward here is to back out this PR from main branch and then wait to take it once GDAL 3.6 is released so that we can take advantage of the pinning against that new release for dependent packages?

@gillins
Copy link
Contributor

gillins commented Sep 22, 2022

It sounds like the path forward here is to back out this PR from main branch and then wait to take it once GDAL 3.6 is released so that we can take advantage of the pinning against that new release for dependent packages?

I agree

akrherz added a commit to akrherz/gdal-feedstock that referenced this pull request Sep 23, 2022
@hobu
Copy link
Contributor

hobu commented Sep 23, 2022

I was rebuilding PDAL with build 2 (which I know was declared broken and pulled out), and I was getting an error on windows about msodbcsql17.dll

I think GDAL's CMake is maybe greedily pulling in ODBC, but we don't have the windows system dependency needed available to run, just to build. This is another thing to run down as part of the CMake migration.

image

github-actions bot added a commit that referenced this pull request Sep 23, 2022
automerged PR by conda-forge/automerge-action
@gillins
Copy link
Contributor

gillins commented Sep 25, 2022

So how do we re-open this one again?

@isuruf
Copy link
Member

isuruf commented Sep 25, 2022

You'll need to set MACHO_COMPATIBILITY_VERSION and MACHO_CURRENT_VERSION.
See https://cmake.org/cmake/help/latest/prop_tgt/MACHO_COMPATIBILITY_VERSION.html

@isuruf
Copy link
Member

isuruf commented Sep 25, 2022

See also https://gitlab.kitware.com/cmake/cmake/-/issues/17652 for an explanation

@akrherz akrherz mentioned this pull request Sep 25, 2022
4 tasks
@zklaus
Copy link
Contributor Author

zklaus commented Sep 26, 2022

Ouch, sorry for all the trouble, and thanks for the swift rollback.

I am not completely sure that we should wait for 3.6.0, but let's take that as the target for now. I think there are a few points to be addressed:

  1. The Osx and Windows builds need to be tested as thoroughly as the Linux one. I will try to add as much of the guidance from the documentation on moving to Cmake to the test section of the recipe in a reference PR, but if that doesn't get us all the way, it would be good to find someone with an Osx and Windows machine, respectively. Any takers?
  2. On Windows, the library name changed from gdal_i.lib to gdal.lib. I think that change is deliberate, but it does seem to brake some downstream builds. Can we confirm that we want this change? @rouault, was this a deliberate upstream decision?
  3. Rasterio, Fiona, Pyogrio all seem to suffer from the same change of the Osx libraries. I'll try to check that extra carefully with 1
  4. The Odbc issue with Pdal is actually straightforward. On Windows it is autodetected, so it must be explicitly disabled.

@rouault
Copy link
Contributor

rouault commented Sep 26, 2022

2. On Windows, the library name changed from gdal_i.lib to gdal.lib. I think that change is deliberate, but it does seem to brake some downstream builds. Can we confirm that we want this change? @rouault, was this a deliberate upstream decision?

"Deliberate" in the sense that we just use CMake default settings regarding this and not try to emulate exactly previous behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use cmake for build