Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide new Crypto interface & impl #560

Merged
merged 22 commits into from Aug 13, 2018

Conversation

@milleruntime
Copy link
Contributor

commented Jul 13, 2018

  • Replaced old CryptoModule properties and code
  • Modified Initialize to use site config
  • Changed BCFile and DFSLogger to use getParameters
  • Added setProperty to Mini for testing
  • Made crypto properties instance level
  • Implement AES/GCM crypto
  • Added a custom exception and crypto info prints
  • Made Crypto Params a byte array
  • Implement parameters parser for crypto service (#9)

Co-authored-by: Nick Felts 31989480+PircDef@users.noreply.github.com

TODO: Add new way to print crypto params from file in rfile-info command before reading the file.

Provide new Crypto interface & impl
* Replaced old CryptoModule properties and code
* Modified Initialize to use site config
* Changed BCFile and DFSLogger to use getParameters
* Added setProperty to Mini for testing
* Implement AES/GCM crypto
* Added a custom exception and crypto info prints
* Updated service to make decisions
* Implement parameters parser for crypto service (#9)

Co-authored-by: Nick Felts <31989480+PircDef@users.noreply.github.com>
@milleruntime

This comment has been minimized.

Copy link
Contributor Author

commented Jul 13, 2018

FYI mvn verify -Psunny is passing. But not all ITs are passing. https://jenkins.revelc.net/job/Accumulo-Miller/lastBuild/testReport/

@mikewalch
Copy link
Member

left a comment

Nice PR! I may take another pass at reviewing this.

"org.apache.accumulo.core.security.crypto.impl.NoCryptoService", PropertyType.CLASSNAME,
"The class which executes on-disk file encryption. The default does nothing. To enable "
+ "encryption, replace this classname with an implementation of the"
+ "org.apache.accumulo.core.security.crypto.CryptoService interface."),

This comment has been minimized.

Copy link
@mikewalch

mikewalch Jul 13, 2018

Member

could use {% jlink -f org.apache.accumulo.core.security.crypto.CryptoService %}

/**
* Example implementation of AES encryption for Accumulo
*/
public class AESCryptoService implements CryptoService {

This comment has been minimized.

Copy link
@mikewalch

mikewalch Jul 13, 2018

Member

It would be nice if there was a On-disk Encryption page in the Accumulo 2.0 documentation that gave an overview of the Encryption service and described how to use this class.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Jul 31, 2018

Author Contributor

Where does this exist? I can at least open a ticket to have this done once this stuff gets merged.

This comment has been minimized.

Copy link
@milleruntime

This comment has been minimized.

Copy link
@milleruntime

milleruntime Jul 31, 2018

Author Contributor

Created a follow on issue for this: apache/accumulo-website#101

cfg.setProperty(Property.TABLE_CRYPTO_SERVICE,
"org.apache.accumulo.core.security.crypto.impl.AESCryptoService");
cfg.setProperty(TABLE_CRYPTO_PREFIX.getKey() + "kekId", keyPath);
cfg.setProperty(TABLE_CRYPTO_PREFIX.getKey() + "keyManager", "uri");

This comment has been minimized.

Copy link
@mikewalch

mikewalch Jul 13, 2018

Member

It looks like keyManager property is just set to uri. Does it not need to be set?

this.encryptingKeyManager);
return (cm.getDecrypter(fek));

// TODO I suspect we want to leave "U+1F47B" and create an internal NoFileDecrypter

This comment has been minimized.

Copy link
@mikewalch

mikewalch Jul 13, 2018

Member

I am OK with the comment bu there should be a corresponding issue if this is left in.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Jul 31, 2018

Author Contributor

Removed this. I don't think we want the NoCryptoService.VERSION in here anyway.

ByteArrayOutputStream out = new ByteArrayOutputStream();
DataOutputStream dataOut = new DataOutputStream(out);
OutputStream encrypted = encrypter.encryptStream(out);
// System.out.println("after enc out Bytes written " + out.size());

This comment has been minimized.

Copy link
@mikewalch

mikewalch Jul 13, 2018

Member

Should remove these or convert to log.trace or log.debug

milleruntime added some commits Jul 13, 2018

@milleruntime

This comment has been minimized.

Copy link
Contributor Author

commented Jul 17, 2018

@mikewalch Thanks for the feedback. I am still working through some kinks but I will get to your suggestions as soon I have some more fixes up.

PircDef and others added some commits Jul 25, 2018

@keith-turner
Copy link
Contributor

left a comment

I have only looked at the interfaces so far. Plan to continue looking at impl and test.

byte[] bytes;
try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
DataOutputStream params = new DataOutputStream(baos)) {
params.writeUTF("org.apache.accumulo.core.security.crypto.impl.AESCryptoService");

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Could do AESCryptoService.class.getName()

*
* @since 2.0
*/
void init(Map<String,String> conf) throws CryptoException;

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

I noticed that CryptoEnvironment has a getConf method. Is the expectation that the conf passed here and that in CryptoEnvironment are always the same? Does the user need to handle the conf changing between the call to init and getFileEncrypter? Would it make sense to drop this init method since CryptoEnvironment has the conf?

If a user is implementing this, under what circumstances should they throw CryptoException?

This comment has been minimized.

Copy link
@milleruntime

milleruntime Jul 31, 2018

Author Contributor

I modified CryptoEnvironment to be an interface with only two methods to get scope and decryptionParams (the new name to params). Does this make more sense with init? Or do you think I should mention something about configuration in the javadoc.

*
* @since 2.0
*/
FileEncrypter getFileEncrypter(CryptoEnvironment environment);

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

IT would be nice to provide some info about lifecycle in the javadocs. I assume this method is called once per file? Also could mention that impl need to be threadsafe.

*
* @since 2.0
*/
InputStream decryptStream(InputStream inputStream) throws CryptoService.CryptoException;

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Could provide lifecycle info in javadocs, like how often this is called for RFiles and WAL files. Not sure if this the best place for that info.

/**
* Encrypt the OutputStream.
*
* @since 2.0

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

You can put @since 2.0 at class level javadoc and all method will inherit it unless overridden.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Jul 31, 2018

Author Contributor

Ok cool. Will do.

/**
* Initialize the FileEncrypter for the environment and return
*
* @since 2.0

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Do not need since tag here because class has it.

*
* @since 2.0
*/
FileDecrypter getFileDecrypter(CryptoEnvironment environment);

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Would it make more sense to drop parameter info from env and change this method to :

FileDecrypter getFileDecrypter(CryptoEnvironment environment, byte[] parameters);
@@ -238,6 +168,20 @@
PropertyType.STRING,
"One-line configuration property controlling the network locations "
+ "(hostnames) that are allowed to impersonate other users"),
// Crypto-related properties

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Really nice simplification!!!!

import java.util.Map;
import java.util.Objects;

public class CryptoEnvironment {

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

For API could make this an interface with only getter methods. Should add since tags

* License for the specific language governing permissions and limitations under
* the License.
*/
package org.apache.accumulo.core.security.crypto;

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Could move all of the crypto APIs to org.apache.accumulo.core.spi.crypto

*
* @since 2.0
*/
void init(Map<String,String> conf) throws CryptoException;

This comment has been minimized.

Copy link
@keith-turner

keith-turner Jul 30, 2018

Contributor

Could possibly pass an interface like CryptoEnv to init inorder to support easier API evolution. Not sure what to name it.

Also could possibly replace init with a factory like :

interface CyrptoServiceFactory {
  CryptoService newCryptoService(Map<String, String> conf);
}

I am not sure if this better than init, its just something to consider.

/**
* Load the singleton class configured in {@link Property#INSTANCE_CRYPTO_SERVICE}
*/
public static CryptoService getConfigured(AccumuloConfiguration conf) {

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

This method has multiple concurrency issues. There is a race condition where one thread may use singleton while another thread is calling init. Also, because the variable is not volatile processors with different caches will not see each others updates. It would be nice to avoid synchronization, the following is one possible way to avoid these issue w/o sync.

private static AtomicReference<CryptoService> singleton = new AtomicReference<>(null);

  /**
   * Load the singleton class configured in {@link Property#INSTANCE_CRYPTO_SERVICE}
   */
  public static CryptoService getConfigured(AccumuloConfiguration conf) {
    CryptoService cyptoService = singleton.get();
    String configuredClass = conf.get(Property.INSTANCE_CRYPTO_SERVICE.getKey());
    if (cyptoService == null || !cyptoService.getClass().getName().equals(configuredClass)) {
      CryptoService newCryptoService = loadCryptoService(configuredClass);
      newCryptoService.init(conf.getAllPropertiesWithPrefix(Property.INSTANCE_CRYPTO_PREFIX));
      singleton.compareAndSet(cyptoService, newCryptoService);
      return singleton.get();
    }
    return cyptoService;
  }

It would probably be better to not have the singleton in the first place. Tservers could create a single cryptoservice.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

Does it need to be thread safe? Tservers shouldn't have different configuration and should fail the instance start up so the class shouldn't be different.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

I guess the issue with thread safety is with calling CryptoService.init(), since you only want that to ever be called once.

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

I saw the code being called in BCFile. BCFile is called by multiple threads.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

Where in TabletServer would cryptoService get created just once?

* Load the singleton class configured in {@link Property#INSTANCE_CRYPTO_SERVICE}
*/
public static CryptoService getConfigured(AccumuloConfiguration conf) {
String configuredClass = conf.get(Property.INSTANCE_CRYPTO_SERVICE.getKey());

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

For performance reasons, properties that are frequently accessed like this one should be added to the hot path properties enum set in the Property class.

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

The reason I was thinking this is frequently accessed is I noticed this code is called from RFile and/or BCFile. Which means for each scan it may be called for each of a tablets rfiles.

ParsedCryptoParameters parsed = new ParsedCryptoParameters();
try (ByteArrayInputStream bais = new ByteArrayInputStream(parameters);
DataInputStream params = new DataInputStream(bais)) {
parsed.setCryptoServiceName(params.readUTF());

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

It seems like this method is expected to handle a file created by the NoCryptoService, however looking at the code I think it would fail when given that input. Seems like the multiple calls to readUTF will fail. Is there a test that ensures this code can read an unencrypted file? Would be nice to have a test that compacts unencrypted files to encrypted files to run through the use case or turning on encryption on an existing system.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

I think you are right. I will add a Test for this, good idea.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

Good eye! I created a test to read an unencrypted RFile with an AESCryptoService and got an error on that line. I will fix this case and push the Test with the fix.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

Do you think we also need to test with old versions of RFile like RFileTest.testOldVersions() ?

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

Do you think we also need to test with old versions of RFile like RFileTest.testOldVersions() ?

More test are good, but not sure if its needed. I think it depends on what happens with the encryption params when reading the old versions of the files.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 2, 2018

Author Contributor

It wasn't too bad, I was able to make it work with the old data. Was able to fix a few NPEs when reading the old files and having the new crypto configured.

CryptoService newCryptoService = loadCryptoService(configuredClass);
newCryptoService.init(conf.getAllPropertiesWithPrefix(Property.INSTANCE_CRYPTO_PREFIX));
singleton.compareAndSet(cyptoService, newCryptoService);
return singleton.get();

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 2, 2018

Contributor

May be better to return newCryptoService here, because in the case that compare and set fails the crypto service from singleton.get() may not match config. For example if there are different configs in the JVM (for example in the case of someone using the RFile API with multiple configs).

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 6, 2018

Author Contributor

Hmm OK I hadn't thought of that.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 6, 2018

Author Contributor

Wondering if I should check the compareAndSet result. And if so what should I do if it fails? Is it worth logging a warning?

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 6, 2018

Contributor

Wondering if I should check the compareAndSet result. And if so what should I do if it fails? Is it worth logging a warning?

I don't think its worth logging or we care if it fails, its kinda of a cache w/o any strong guarantees. Don't even have to do compare and set, could just set. W/o sync there is no way to prevent multiple service objects for the same config. Even w/ sync, if there are multiple active configs in the JVM there is no way to prevent multiple services objects for the same config w/o sync+map (where map key is something unique derived from config).

This comment has been minimized.

Copy link
@keith-turner

keith-turner Aug 6, 2018

Contributor

For a single config in the JVM, could avoid multiple service object for same config with compareAndSet + loop. This would not help with multiple config situation though.

This comment has been minimized.

Copy link
@milleruntime

milleruntime Aug 6, 2018

Author Contributor

This still doesn't prevent concurrency issues though does it? There is still the possibility of a race condition where CryptoService.init() gets called twice.

@milleruntime

This comment has been minimized.

Copy link
Contributor Author

commented Aug 6, 2018

@keith-turner Take a look at my latest changes to CryptoSeErviceFactory in c8310f9. I changed it to only load the CryptoService once when the factory class is loaded and made getConfigured() just check against the latest config and throw an error if the class changed. If you think this would work, we also may not need AtomicReference... I am not sure though.

@keith-turner

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2018

@milleruntime Yeah I don't think the atomic ref is needed after that change. That change fixes the concurrency issue and it handles multiple config in a predictable way (it only compares the class name and not the opts though... however comparing the opts for each call would inefficient). More static stuff in Accumulo is unfortunate. It would be nice to open up a follow on issue to look into removing the static stuff.

@milleruntime

This comment has been minimized.

Copy link
Contributor Author

commented Aug 7, 2018

More static stuff in Accumulo is unfortunate. It would be nice to open up a follow on issue to look into removing the static stuff.

I will either rebase or create a follow on to incorporate the creation of CryptoService into the Instance work @mikewalch is currently working on.

@milleruntime

This comment has been minimized.

Copy link
Contributor Author

commented Aug 8, 2018

Opened #584 relating to the static objects in CryptoServiceFactory.

@milleruntime

This comment has been minimized.

Copy link
Contributor Author

commented Aug 8, 2018

@PircDef Are there any more changes you would like to make before I merge this PR? I feel like it is close but I am going to continue testing today.

@PircDef

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2018

@milleruntime I have no immediate objections.

@milleruntime milleruntime merged commit 5e14af6 into apache:master Aug 13, 2018

2 checks passed

Jenkins This pull request looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@milleruntime milleruntime deleted the milleruntime:crypto-rebased branch Sep 7, 2018

@ctubbsii ctubbsii added the v2.0.0 label Sep 15, 2018

@ctubbsii ctubbsii added this to Done in 2.0.0 Jun 14, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
5 participants
You can’t perform that action at this time.