Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to access file_path in FileOpener #8484

Merged
merged 12 commits into from
Sep 4, 2023

Conversation

stephaniewang526
Copy link
Contributor

This allows us to support multiple secrets and do path matching, etc.

@github-actions github-actions bot marked this pull request as draft August 4, 2023 18:15
@Mytherin
Copy link
Collaborator

Mytherin commented Aug 7, 2023

Thanks for the PR!

This will not work correctly as the FileOpener is a shared structure (that lives in the ClientData). If multiple files are opened in different threads because of e.g. UNION pipelines that query different files this will result in a race condition. This should be solved in a different manner, e.g. by adding the file path as a parameter to TryGetCurrentSetting or so.

@stephaniewang526 stephaniewang526 changed the title add file_path to FileOpener add the ability to access file_path in FileOpener Aug 7, 2023
@stephaniewang526 stephaniewang526 changed the title add the ability to access file_path in FileOpener Add the ability to access file_path in FileOpener Aug 7, 2023
@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 7, 2023 19:45
@stephaniewang526
Copy link
Contributor Author

Thanks for the review. I'm not so sure where/how we could get and then set the file_path (in this case, glob_pattern) in TryGetCurrentSetting and how it helps with the concurrency problem? I tried a different way (where a new ScopedOpener object is constructed and lives for the scope/duration of just the Glob method). Please let me know if further changes are needed. Thanks!

@Mytherin
Copy link
Collaborator

Mytherin commented Aug 8, 2023

I'm not a big fan of this solution as I can see this becoming tricky w.r.t. memory management of the multiple openers as you are returning a raw pointer and not an owning pointer - I think a cleaner solution would be to expand TryGetCurrentSetting in the FileOpener with a new parameter (FileOpenerInfo or so) that contains a file path and potentially other info (e.g. whether this is a Glob or an OpenFile or so).

@github-actions github-actions bot marked this pull request as draft August 8, 2023 19:50
@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 8, 2023 19:56
virtual ~FileOpener() {};

virtual bool TryGetCurrentSetting(const string &key, Value &result) = 0;
virtual bool TryGetCurrentSetting(const string &key, Value &result, FileOpenerInfo &info) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can make new argument an optional_ptr and default it to a nullptr for most subclasses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at usage again. It might be cleaner to have two virtual TryGetCurrentSetting methods one with the FileOpenerInfo and one without. This way users can only provide FileOpenerInfo when it makes sense, and avoids potentially confusing nullptrs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay makes sense!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can default this method to disregard FileOpenerInfo to make life easier on library writer (though I am personally inclined towards forcing libraries to be explicit).

src/common/file_system.cpp Outdated Show resolved Hide resolved
@github-actions github-actions bot marked this pull request as draft August 9, 2023 00:16
@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 9, 2023 00:18
@github-actions github-actions bot marked this pull request as draft August 9, 2023 00:23
}

bool FileOpener::TryGetCurrentSetting(const string &key, Value &result, FileOpenerInfo &info) {
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this default to calling the none FileOpenerInfo method i.e.

bool FileOpener::TryGetCurrentSetting(const string &key, Value &result, FileOpenerInfo &info) {
	return this->TryGetCurrentSetting(key, value);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Corrected.

@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 9, 2023 00:29
@stephaniewang526
Copy link
Contributor Author

Thanks for the pointer. I made another attempt. Please let me know if this makes sense.

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes! LGTM. Could you just have a look at the failing CI?

@stephaniewang526 stephaniewang526 marked this pull request as draft August 28, 2023 15:13
@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 28, 2023 15:13
@github-actions github-actions bot marked this pull request as draft August 28, 2023 15:17
@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 28, 2023 15:24
@github-actions github-actions bot marked this pull request as draft August 29, 2023 15:51
@stephaniewang526 stephaniewang526 marked this pull request as ready for review August 29, 2023 15:51
@Mytherin Mytherin merged commit 4efd3e7 into duckdb:main Sep 4, 2023
53 of 54 checks passed
@Mytherin
Copy link
Collaborator

Mytherin commented Sep 4, 2023

Thanks! LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants