Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRILL-7573: Support htpasswd based authentication #1977

Closed
wants to merge 1 commit into from

Conversation

dobesv
Copy link
Contributor

@dobesv dobesv commented Feb 11, 2020

DRILL-7573: Support htpasswd based authentication

Description

This allows you to specify htpasswd as your authentication implementation. In this case, users will be authenticated using usernames and password taken from a text file in htpasswd format.

This gives some more flexibility compared to the PAM authenticator. For example, in docker / kubernetes you can mount a folder with an htpasswd file and update that file when you want to add/remove users, without any concern about interfering with the contents of /etc/passwd and /etc/shadow.

Documentation

Using a password file for authentication

Apache Drill allows you to store valid usernames and passwords in a text file in the popular "htpasswd" format.

This can be more convenient than using PAM in containerized environments, because you do not have to modify any system files like passwd, shadow, or files in pam.d. Instead you can mount a volume with the htpasswd file in it and tell drill to use that file for authentication.

To configure this feature:

  1. Create an htpasswd file and copy/mount it to/on the drillbit machines/containers:

     $ htpasswd /path/to/htpasswd $USER
    
  2. Add the following configuration to the drill.exec block in the <DRILL_HOME>/conf/drill-override.conf file:

           drill.exec: {
             security.auth.mechanisms : ["PLAIN"],
             security.user.auth: {
               enabled: true,
               packages += "org.apache.drill.exec.rpc.user.security",
               impl: "htpasswd",
               htpasswd: { path: "/path/to/htpasswd" }
             }
           }
    
  3. Restart the drillbit(s)

  4. Now you must use a username/password from the htpasswd file when logging into Drill

Note: Currently the crypt and bcrypt algorithms are not supported, you should probably use the MD5 hashing algorithm used by default by the htpasswd command.

Testing

I created an htpasswd file using htpasswd, configured the auth mechanism as shown above, and testing logging in with both valid and invalid passwords with MD5, SHA-1, and plantext password hashes in the htpasswd files.

No automated tests so far, but I'm open to advice on how/where to add them. Still very new to the code base.

@dobesv
Copy link
Contributor Author

dobesv commented Feb 11, 2020

I'm not sure if there's anything I can do about that build failure:

https://github.com/apache/drill/pull/1977/checks#step:6:1748

Copy link
Contributor

@paul-rogers paul-rogers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dobesv, thank you for the contribution. Very useful!

There are a few minor comments. To address them, make your changes and push a new commit to your branch. No need to create a new PR. Later, as we get ready to commit, we'll "squash" the commits into one.

@cgivre cgivre added documentation PRs relating to documentation enhancement PRs that add a new functionality to Drill labels Feb 11, 2020
@paul-rogers
Copy link
Contributor

Thanks for the detailed usage description. In the config file shown in the description consistent with the implementation?

Description:

           impl: "htpasswd",
           htpasswd: { file: "/path/to/htpasswd" }

Implementation:

  public static final String HTPASSWD_AUTHENTICATOR_PATH = "drill.exec.security.user.auth.htpasswd.path";

Notice the file vs path difference.

@dobesv
Copy link
Contributor Author

dobesv commented Feb 11, 2020

Good catch on the file/path issue in the doc

@paul-rogers
Copy link
Contributor

@dobesv, the key failures to check for are any checkstyle errors. If anything else fails, it is probably due to problems with the (new) builds or master; we'll have to check.

This PR includes no tests. It would be great to add some. Some example test cases:

  • htpasswd enabled, but no file path or default file found.
  • Enabled, and valid file in default location.
  • Enabled, and valid file in custom location.
  • File contents change to become invalid.
  • File contents change and are valid.

To test you'll need to first set up the server to enable the web server, which is turned off by default in tests. For testing, we use something called the ClusterFixture. Most tests are based off of ClusterTest which set up the server once per test file. For your tests, you may want to start the server in each test case, each with a different configuration. StatusResourcesTest is a good example.

The config you need is probably something like:

    ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher).
      configProperty(ExecConstants.HTTP_ENABLE, true).
      configProperty(ExecConstants.HTTP_PORT_HUNT, true).
      configProperty(ExecConstants.HTPASSWD_AUTHENTICATOR_PATH, "whatever");

Next you'll need a way to talk to the REST server. There is a RestClientFixture class which looks promising, though I've not used it myself.

You will need a place to create and update the file. The BaseDirTestWatcher class does this, often provided in tests as dirTestWatcher. See TestDotDrillUtil for an example of getting a test-specific temp directory and writing a file to that location. You can then use the temp directory as your custom auth file location when configuring the Drillbit.

With this, you can test the "happy path":

  • Enabled, and valid file in custom location.
  • File contents change and are valid.

A bit more work is needed for the other cases. To test this one:

  • Enabled, and valid file in default location.

You can put the htpasswd in src/test/resources which will be on the classpath when running tests. The challenge is that the file is static; if we have the file, there is no way to test errors if the file does not exist. Sigh.

Finally, checking the error cases is a nice-to-have, but is rather tricky. The errors don't kill the server, they only leave a trace in the log file. So, you can use two other tricks. One is LogFixture which lets you temporarily turn on logging. You won't need that because your code logs errors at the error level which is always enabled.

To capture log messages, you can try something like TestOperatorDump: create an in-memory log appender. Yes, it would be nice if this capability were part of LogFixture; a nice little project when someone has the time.

My recommendation is to at least test the "happy path"; add the others only if you feel so inclined.

@dobesv
Copy link
Contributor Author

dobesv commented Feb 11, 2020

@paul-rogers

I added some tests and cleaned up the logic slightly.

I didn't add tests for the REST API, since the password logic is shared between both systems I don't think this requires separate testing.

I couldn't figure out a sane way to test the stuff related to the default path and I didn't do anything related to loading the file from the classpath. I think that might be more than I am capable of with the time budget I have for this task right now. I don't think the "default path" is particularly import nor is it particularly prone to regressions.

I didn't attempt tests related to logging, it just doesn't seem worth it. I'm not so familiar with the project that I can make a test like that in a reasonable amount of time. Perhaps someone core to the team can add that if it seems important.

@paul-rogers
Copy link
Contributor

@dobesv, thanks for the updates. I agree with limiting the scope of the tests.

Looks like the failure above is due to something in this PR. If you can take a look at the issue, we'll be good to go from my perspective.

[ERROR] Failures: 
[ERROR]   TestHtpasswdFileUserAuthenticator.detectsChanges:114->tryCredentials:145 Expected connect to fail because of incorrect username / password combination, but it succeeded

@dobesv
Copy link
Contributor Author

dobesv commented Feb 13, 2020

If I'm interpreting the checks correctly, the code works in Java 11 & 12, but not 1.8? Very strange, I wonder why that would be.

@paul-rogers
Copy link
Contributor

Excellent! There is also one minor race condition to fix.

@paul-rogers
Copy link
Contributor

@vvysotskyi, in this case the sleep is necessary. As I read it, this code uses the file system for synchronization. We write a file, then make a request that will check the file timestamp. If that request is too fast, the code will not detect the change.

Maybe 1 second is too long; maybe 100ms is plenty.

But, without the delay, we need some other mechanism for synchronization, which then defeats the purpose of the code and test.

You propose changing the file modification timestamp. This can be done, but will not accurately reproduce the use case that we want to test.

Are you OK with a shorter delay?

@vvysotskyi
Copy link
Member

@paul-rogers, thanks for explanation. I think we cannot specify a shorter delay due to JDK-8177809. Let's leave it as it is (but with passing unit tests).

@dobesv
Copy link
Contributor Author

dobesv commented Feb 13, 2020

FWIW a 1 second delay is required because that version of the JVM only offers 1-second resolution on file timestamps.

I've changed it to detect changes based on file size and mod time. That should work when changes are made rapidly in systems where the modification time is rounded to the nearest second.

In containerized environments, PAM based authentication is not convenient.

This provides a simple mechanism for setting up users' passwords
that can be managed using docker volume mounts.
@dobesv
Copy link
Contributor Author

dobesv commented Feb 13, 2020

Seems like the failure is in another component now:

[ERROR] Tests run: 30, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 77.4 s <<< FAILURE! - in org.apache.drill.exec.udf.dynamic.TestDynamicUDFSupport
[ERROR] org.apache.drill.exec.udf.dynamic.TestDynamicUDFSupport.testDropFunction  Time elapsed: 1.878 s  <<< ERROR!
org.apache.drill.common.exceptions.UserRemoteException: 
FUNCTION ERROR: Failure reading Function class.

Function Class org.apache.drill.udf.dynamic.CustomLowerFunction
Fragment 0:0

[Error Id: 9834cc92-5ec4-4a1a-83e0-deee12f02447 on fv-az89:31064]
Caused by: java.io.IOException: Failure trying to locate source code for class org.apache.drill.udf.dynamic.CustomLowerFunction, tried to read on classpath location /org/apache/drill/udf/dynamic/CustomLowerFunction.java

Was something related to this merged into master recently?

@cgivre
Copy link
Contributor

cgivre commented Feb 14, 2020

I'm working on a PR and encountered the same problem. @vvysotskyi could you take a look?

@vvysotskyi
Copy link
Member

@cgivre, sorry, currently I'm running out of time with other tasks, so I can't resolve this issue now.

@cgivre
Copy link
Contributor

cgivre commented Feb 14, 2020

@vvysotskyi No worries mate.. When you have time, please take a look as all the PRs are running into the same issue.

Copy link
Contributor

@paul-rogers paul-rogers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Let's get it merged so others can try it.
+1

@asfgit asfgit closed this in 39457bf Feb 21, 2020
@dobesv dobesv deleted the htpasswd-file-auth branch February 21, 2020 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation PRs relating to documentation enhancement PRs that add a new functionality to Drill
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants