Skip to content

feat(c/driver_manager): add new function to allow loading by manifest #2918

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 63 commits into from
Jun 23, 2025

Conversation

zeroshade
Copy link
Member

Following the discussions on the mailing list and on https://docs.google.com/document/d/1ANeoDI7MCi9xFt_q8z726xKweWS2vgqO8Ut6BSTt5zM/edit?tab=t.0#heading=h.rd6unnsl7uy this PR implements the proposed behavior by adding a new function to the driver manager rather than changing the existing AdbcLoadDriver (for now).

The new function, AdbcFindLoadDriver adds a parameter for LoadOptions along with behavior to search directories at multiple levels (environment, user, system) to search for either a manifest or a driver library. This should improve the developer experience and make it easier for users and applications to leverage the driver manager.

Copy link
Member Author

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tagged everyone who commented on the google doc or on the mailing list as reviewers so that this can be reviewed and discussed here.

Copy link
Member Author

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate comment, oops

@zeroshade
Copy link
Member Author

@paleolimbot I think i'm gonna need a little help getting the R build to recognize the vendored toml++ header, as far as I can tell looking at bootstrap-c.R it should be copying it over correctly from the vendor dir, so I'm not sure what's wrong. Any ideas?

@zeroshade zeroshade requested a review from kou as a code owner June 3, 2025 23:18
@zeroshade
Copy link
Member Author

@paleolimbot looks like i figured it out! 😄

@zeroshade zeroshade requested a review from CurtHagenlocher June 4, 2025 00:14
#include <dlfcn.h>
#endif // defined(_WIN32)
ADBC_EXPORT
std::vector<std::filesystem::path> AdbcParsePath(const std::string_view& path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use InternalAdbc instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has to start with Adbc because of the symbols.map, so we can't use InternalAdbc unless we change the symbols.map

We could change it to AdbcInternal if you like, or just not have the separate unit test for it and not export it, but I prefer to test it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, the symbols.map accepts InternalAdbc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, must have missed that. Why doesn't the other internal function use that too then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't necessarily need to be exposed. This is only for internal functions which we want to expose for unit testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'm talking about AdbcDriverManagerDefaultEntrypoint which is the other function we only expose for unit testing. It's not using InternalAdbc that's why I followed the same pattern. It feels like either both should be InternalAdbc or neither should

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can rename it. That probably predates the symbols.map.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll rename both then

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I needed to add InternalAdbc* to the extern "C++" portion of the symbols.map for this to work, otherwise it didn't export the symbol

std::filesystem::path UserConfigDir() {
std::filesystem::path config_dir;
#if defined(_WIN32)
auto dir = std::getenv("AppData");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is dealing with filesystem paths, I would recommend that you use the Unicode API functions (either _wgetenv or GetEnvironmentVariableW) and then encode to utf8.

Generally I think you should use the convention that std::string and std::string_view paths on Windows are utf8-encoded. If you use the bytewise ("ASCII") APIs provided by Windows, they use the current codepage which is brittle and limited (it cannot represent everything).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A realistic test for this would be to create a Windows user with a é (e.g.) in the name and ensure that this works (at least that's what happened the last time I had to debug this!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, does this need to use the Windows API? (The only other place I've seen this does:)

https://github.com/r-lib/rappdirs/blob/955f2e7997ed6d65cc47a3d77d0cf324c2685897/src/win-path.c#L13

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...create a Windows user with a é (e.g.)

Everyone goes through this ritual with Windows programming, huh? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paleolimbot I don't think we need to use the Windows API that you're pointing at there, as the only available options for that are %LOCALAPPDATA%\Desktop, %LOCALAPPDATA%\Documents, %LOCALAPPDATA%\Favorites and %LOCALAPPDATA%\ProgramData so it looks like you can't just get the %AppData% folder with that API

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for handling the R builds! This is going to be great and I'm excited for the opportunities it will create. My comments here are mostly from trying to wrap various spatial libraries that have various types of search paths (e.g., PROJ looking for its database or data files, trying to get GDAL built against PROJ to look in the right place, etc.)...writing bindings to those libraries from R required some flexibility and I was glad the libraries exposed this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pointing out the change in scope introduced here (the current driver manager is ~2000 lines, and this is ~18,000 lines of vendored parser). Opting out of the driver manager is pretty easy (e.g., how we were able to get GDAL to support ADBC in its default build) and so I think this is OK, but there's a chance it could result in more opting out of the driver manager.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the release build of the driver manager with these changes and updates is only 320K, so I don't think this will necessarily result in people opting out of the driver manager. Unless I'm missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose GDAL didn't want to vendor the driver manager even when it was 2000 lines, so perhaps there's no change here. As another data point, QGIS is using vcpkg where this wouldn't be a problem (as long as it can use the vcpkg version of the toml parser).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about vcpkg configs but I can probably set up some option so that it can use the vcpkg version instead of the vendored one? Right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think you can (no need to do it in this PR).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paleolimbot So, I've also seen smaller toml libraries in C or C++ if that's a concern (but they tend not to be header only) such as https://github.com/cktan/tomlc17

If you think it's worthwhile I can rework this to use a smaller toml library instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like there are a few options if this comes up as an issue (it may not)!

std::filesystem::path UserConfigDir() {
std::filesystem::path config_dir;
#if defined(_WIN32)
auto dir = std::getenv("AppData");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A realistic test for this would be to create a Windows user with a é (e.g.) in the name and ensure that this works (at least that's what happened the last time I had to debug this!)

std::filesystem::path UserConfigDir() {
std::filesystem::path config_dir;
#if defined(_WIN32)
auto dir = std::getenv("AppData");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, does this need to use the Windows API? (The only other place I've seen this does:)

https://github.com/r-lib/rappdirs/blob/955f2e7997ed6d65cc47a3d77d0cf324c2685897/src/win-path.c#L13

std::filesystem::path UserConfigDir() {
std::filesystem::path config_dir;
#if defined(_WIN32)
auto dir = std::getenv("AppData");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...create a Windows user with a é (e.g.)

Everyone goes through this ritual with Windows programming, huh? :)

}
#endif

const std::string& current_arch() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth mirroring DuckDB's implementation (since that code has seen some action). (Can also be sorted in a follow-up to improve this later on):

https://github.com/duckdb/duckdb/blob/985150a688889981dd039822174569e262b79b72/src/include/duckdb/common/platform.hpp#L80

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a look at theirs and improve this :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like what they were doing is mostly the same as what I've got here, there were a few extra macros they check so I added those here so that we're basically the same as them

@zeroshade
Copy link
Member Author

Hey @kou any ideas on how to fix the std::filesystem issue on almalinux-8? I've been toying around locally trying to add -lstdc++fs but it doesn't seem to fix the linking issue which is odd. Hoping you might have some ideas on how to get it building properly as I really don't want to drop std::filesystem::path and have to do all the file path nonsense myself.

@kou
Copy link
Member

kou commented Jun 10, 2025

I think I can fix it. :-)
I'll take a look at it later.

@kou
Copy link
Member

kou commented Jun 11, 2025

Fixed:

@zeroshade
Copy link
Member Author

@pitrou @paleolimbot @lidavidm Any further comments here?

@zeroshade zeroshade force-pushed the driver-manager-configs branch from 1579bb4 to 12feba7 Compare June 20, 2025 14:51
@zeroshade
Copy link
Member Author

Thanks everyone for all the comments and work here! Glad to get this in. I'll follow this up with changes for the docs and updates to the bindings

@zeroshade zeroshade merged commit 8e1b028 into apache:main Jun 23, 2025
86 checks passed
@zeroshade zeroshade deleted the driver-manager-configs branch June 23, 2025 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants