DRILL-8358: Storage plugin for querying other Apache Drill clusters #2709
super("Query timed out in " + timeoutValueInSeconds + " seconds");
}
}
package org.apache.drill.exec.store.drill.plugin;
So Git decided that this was the renaming of a file 😏
try {
  String urlSuffix = connection.substring(CONNECTION_STRING_PREFIX.length());
  Properties props = ConnectStringParser.parse(urlSuffix, properties);
  props.putAll(credentialsProvider.getUserCredentials(userName));
This getUserCredentials(String username) method is meant to fetch per-query-user credentials for plugins that are in user translation auth mode, while the nullary method getUserCredentials() is meant for shared credentials. Only the plain and Vault providers currently support per-user credentials. You can see some logic for deciding which one to call (via UsernamePasswordCredentials objects) in JdbcStorageConfig on line 142.
Those APIs wound up being a little ugly :/
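To make the distinction concrete, here is a minimal sketch of choosing between the two calls based on the plugin's auth mode. Everything in it is an assumption for illustration: `SimpleCredentialsProvider`, the `AuthMode` enum, `connectionProperties`, `demoProvider` and the sample user names are hypothetical stand-ins written for this comment, not Drill's real classes and not the code in JdbcStorageConfig.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Sketch only: every type here is a simplified, hypothetical stand-in,
// not one of Drill's real classes.
class CredentialsSelectionSketch {

  // Mirrors the two methods discussed above, reduced to plain maps.
  interface SimpleCredentialsProvider {
    // Shared credentials: the same for every query user.
    Map<String, String> getUserCredentials();

    // Per-query-user credentials, for user translation auth mode.
    Map<String, String> getUserCredentials(String queryUser);
  }

  // Assumed auth modes for this sketch.
  enum AuthMode { SHARED_USER, USER_TRANSLATION }

  // Picks the per-user call only when the plugin is in user translation mode.
  static Properties connectionProperties(
      SimpleCredentialsProvider provider, AuthMode mode, String queryUser) {
    Map<String, String> credentials = mode == AuthMode.USER_TRANSLATION
        ? provider.getUserCredentials(queryUser)
        : provider.getUserCredentials();
    Properties props = new Properties();
    props.putAll(credentials);
    return props;
  }

  // In-memory provider with made-up sample data, used only for the demo.
  static SimpleCredentialsProvider demoProvider() {
    final Map<String, String> shared = new HashMap<>();
    shared.put("user", "svc_drill");
    shared.put("password", "shared-secret");

    final Map<String, String> alice = new HashMap<>();
    alice.put("user", "alice");
    alice.put("password", "alice-secret");

    return new SimpleCredentialsProvider() {
      @Override
      public Map<String, String> getUserCredentials() {
        return shared;
      }

      @Override
      public Map<String, String> getUserCredentials(String queryUser) {
        // Unknown users get no credentials in this sketch.
        return "alice".equals(queryUser) ? alice : new HashMap<String, String>();
      }
    };
  }

  public static void main(String[] args) {
    SimpleCredentialsProvider provider = demoProvider();
    Properties shared = connectionProperties(provider, AuthMode.SHARED_USER, "alice");
    Properties perUser = connectionProperties(provider, AuthMode.USER_TRANSLATION, "alice");
    System.out.println("shared mode connects as: " + shared.getProperty("user"));
    System.out.println("user translation connects as: " + perUser.getProperty("user"));
  }
}
```

The point of the sketch is only that the query user's name should reach the provider when the plugin is in user translation mode; in shared mode the nullary call is the one to use, whoever runs the query.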
Thanks, fixed.
Thanks @vvysotskyi for this. I had a usage question as well.
Let's say that I have two Drill clusters, drill1 and drill2, and that drill2 is connected to a file system called dfs2 which I want to query from drill1. What would the query look like?
Would it be something like:
SELECT *
FROM drill2.dfs.ws.`file`
@JsonCreator
public DrillSubScan(
    @JsonProperty("userName") String userName,
    @JsonProperty("mongoPluginConfig") StoragePluginConfig pluginConfig,
Is this supposed to be mongoPluginConfig?
No, it isn't. Thanks, fixed it.
@cgivre, yes, you can create a plugin in drill1 with the name drill2 and use it to query, from drill1, every plugin that drill2 has configured. So if drill2 has a file system plugin called dfs2, the query on drill1 will be the following:
SELECT *
FROM drill2.dfs2.ws.`file`
Thanks for making changes.
.recordCount();

assertEquals(50L, recordCount);
}
Can we have a test of a schema path that descends through directories in a filesystem plugin on the remote Drill cluster? E.g.
select * from drill.`dfs.tmp`.`/path/to/foo.parquet`
Yes, I'll add a unit test for it in one of the future pull requests.
DRILL-8358: Storage plugin for querying other Apache Drill clusters
Description
Uses the native client to query other Drill clusters. Adds logic to perform various pushdowns when possible.
Fixed the addition of an extra project for star-column queries.
Fixed ignoring of columns with an empty name in the Excel format.
Documentation
See README.md
Testing
Tested manually; added unit tests.