Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-factor IdentityServer init. to fix race conditions + tests #55

Merged

Conversation

Kortanul
Copy link
Member

@Kortanul Kortanul commented Jun 3, 2018

Several issues being fixed:

  • FR was statically initializing the IDENTITY_SERVER AtomicReference to a placeholder instance that provided different information to other IDM components during initialization than the final instance did after initialization.
  • FR was using an AtomicBoolean to guard updates to the the IDENTITY_SERVER AtomicReference, without explicitly using a synchronized critical section. So, any calls into the IdentityService instance between the time the server was marked "initialized" and the time the instance was set would actually be interacting with the default (placeholder) instance without realizing it.
  • Because of the behaviors above, code like SelfService came to rely on being able to get an IdentityServer instance during its static class initializer. This was bad -- depending upon whether the SelfService class was loaded before, during, or after IdentityServer was fully initialized, it would be getting informaton from either the default (placeholder) instance of IdentityServer or the actual instance.
  • IdentityServer.getInstance() was checking for initialization with a null check that would never fail, because the value was always initialized to a non-null value.
  • IdentitySever.initInstance(IdentityServer) would not throw a IllegalStateException if it was passed a null value.

Now:

  • Both the the IDENTITY_SERVER AtomicReference and the INITIALISED AtomicBoolean are gone.
  • There's just one volatile static instance variable that's initialized in a synchronized critical section.
  • All components of IDM that need an instance of IdentityServer can only interact with it once it's been initialized; this is now enforced properly by the getInstance() method.
  • SelfService uses a shared key alias that is lazily-initialized by a double-checked locking pattern, so that it doesn't need the IdentityServer to be initialized at class loading time.
  • There are tests that actually cover both overloads of initInstance().
  • A utility class now exists for tests to manipulate the state of IdentityServer.
  • Fixed broken tests by making sure all tests that depend on IdentityServer initialize it before or during the test.
  • Fixed test failures in openidm-security that were dependent on file order. Here's what was happening:
    • If EntryResourceProviderTest was loaded by the JVM before PrivateKeyResourceProviderTest, then all tests would pass.
    • If EntryResourceProviderTest was loaded by the JVM after PrivateKeyResourceProviderTest, then tests for EntryResourceProviderTest, KeystoreResourceProviderTest, and PrivateKeyResourceProviderTest would fail.

Several issues being fixed:
  - FR was statically initializing the `IDENTITY_SERVER` `AtomicReference` to a placeholder instance that provided different information to other IDM components _during_ initialization than the final instance did _after_ initialization.
  - FR was using an AtomicBoolean to guard updates to the the `IDENTITY_SERVER` `AtomicReference`, without explicitly using a `synchronized` critical section. So, any calls into the `IdentityService` instance between the time the server was marked "initialized" and the time the instance was set would actually be interacting with the default (placeholder) instance without realizing it.
  - Because of the behaviors above, code like `SelfService` came to rely on being able to get an `IdentityServer` instance during its static class initializer. This was bad -- depending upon whether the `SelfService` class was loaded before, during, or after `IdentityServer` was fully initialized, it would be getting informaton from either the default (placeholder) instance of `IdentityServer` or the actual instance.
  - `IdentityServer.getInstance()` was checking for initialization with a `null` check that would never fail, because the value was always initialized to a non-null value.
  - `IdentitySever.initInstance(IdentityServer)` would not throw a `IllegalStateException` if it was passed a `null` value.

Now:
  - Both the the `IDENTITY_SERVER` `AtomicReference` and the `INITIALISED` `AtomicBoolean` are gone.
  - There's just one volatile static instance variable that's initialized in a `synchronized` critical section.
  - All components of IDM that need an instance of `IdentityServer` can only interact with it once it's been initialized; this is now enforced properly by the `getInstance()` method.
  - `SelfService` uses a shared key alias that is lazily-initialized by a double-checked locking pattern, so that it doesn't need the `IdentityServer` to be initialized at class loading time.
  - There are tests that actually cover both overloads of `initInstance()`.
  - A utility class now exists for tests to manipulate the state of `IdentityServer`.
- Fix broken tests by making sure all tests that depend on `IdentityServer` initialize it before or during the test.
- Fix test failures in `openidm-security` that were dependent on file order. Here's what was happening:
  - If `EntryResourceProviderTest` was loaded by the JVM before `PrivateKeyResourceProviderTest`, then all tests would pass.
  - If `EntryResourceProviderTest` was loaded by the JVM after `PrivateKeyResourceProviderTest`, then tests for `EntryResourceProviderTest`, `KeystoreResourceProviderTest`, and `PrivateKeyResourceProviderTest` would fail.
The code that was committed in e824e61 to fix OPENIDM-1004 was mostly backed out by 0f4b937, since the root cause of the issue was that the server was getting initialized twice. With the new approach, the CLI now needs to initialize the server before interacting with it.
siepkes
siepkes previously approved these changes Jun 4, 2018
Copy link
Member

@siepkes siepkes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't build it (deferring the thorough review to @pavelhoral) but the concurrency part looks good.

Just as an FYI and not specifically applicable to this situation but a good pattern to know for lazy initialization is the Initialization-on-demand holder pattern. Allows for safe and lock-free (!) lazy initialization.

@pavelhoral
Copy link
Member

Just a quick question before I dive into the changes: What was the observed issue being solved here? Is this about the "strangely failing" unit tests? Because in normal operation openidm-system is being activated on the first level, long before any other openidm bundle.

@siepkes
Copy link
Member

siepkes commented Jun 4, 2018

@pavelhoral I don't know the exact semantics but I would be cautious to rely on the OSGi start order (ie. OSGi run level). The bundle run level only guards the starting of the bundle which in turn guarantees the bundle's context activator has been called and has returned. However I suspect (haven't checked it) that most stuff in IDM gets activated by SCR. And OSGi will be starting bundles with other run levels even though the bundles with a lower run level might still be starting their SCR stuff.

@pavelhoral
Copy link
Member

pavelhoral commented Jun 4, 2018

AFAIK that is not the case of openidm-system and IdentityServer component - https://github.com/WrenSecurity/wrenidm/blob/master/openidm-system/src/main/java/org/forgerock/openidm/core/internal/Activator.java#L57

@Kortanul
Copy link
Member Author

Kortanul commented Jun 4, 2018

As requested, here's all the context about the issue (from Gitter chat):

@karelmaxa or @pavelhoral have you guys seen build failures like these before, for IDM 5.5?

Tests run: 18, Failures: 18, Errors: 0, Skipped: 0, Time elapsed: 0.825 sec <<< FAILURE! - in TestSuite
testReadEntry(org.forgerock.openidm.security.impl.PrivateKeyResourceProviderTest)  Time elapsed: 0.019 sec  <<< FAILURE!
java.lang.NullPointerException: null
        at org.forgerock.openidm.security.impl.SecurityTestUtils.createKeyStore(SecurityTestUtils.java:49)
        at org.forgerock.openidm.security.impl.SecurityTestUtils.createKeyStores(SecurityTestUtils.java:43)
        at org.forgerock.openidm.security.impl.PrivateKeyResourceProviderTest.testReadEntry(PrivateKeyResourceProviderTest.java:57)

overall I'm seeing this:

Failed tests:
  EntryResourceProviderTest.attemptToCreateAliasThatAlreadyExists:233->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.attemptToCreateWithNoResourceId:256->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.attemptToDeleteEntryThatDoesNotExist:284->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.attemptToReadEntryThatDoesNotExist:270->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testActionCollection:184->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testActionInstance:198 ▒ NullPointer
  EntryResourceProviderTest.testCreatingEntry:90->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testDeleteEntry:128->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testPatchEntry:170->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testQueryCollection:217->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testReadEntry:107->createEntryResourceProvider:296 ▒ NullPointer
  EntryResourceProviderTest.testUpdateEntry:149->createEntryResourceProvider:296 ▒ NullPointer
  KeystoreResourceProviderTest.testActionGenerateCSR:105 ▒ NullPointer
  KeystoreResourceProviderTest.testActionGenerateCert:82 ▒ NullPointer
  KeystoreResourceProviderTest.testActionGenerateCertWithAliasAlreadyInUse:153 ▒ NullPointer
  KeystoreResourceProviderTest.testActionGenerateCertWithoutAlias:132 ▒ NullPointer
  PrivateKeyResourceProviderTest.testReadEntry:57 ▒ NullPointer
  PrivateKeyResourceProviderTest.testReadPrivateKey:41 ▒ NullPointer

this is on Java 1.8.0_161

I'm continuing to work through understanding the test failures I'm getting. If my project is in one location on disk (~/host/wrenidm), they do not occur. In another location (~/wrenidm) they do occur. I've also tried /opt/src/openidm and they occur there too. When they occur, they happen with both Maven 3.5.2 and 3.3.9.
the really strange part is that when the failures are present, IDM thinks it was launched without a project.
here's a log (print statements added to spit out the property look-ups):
image
I literally do not understand how the project location can be null. I also can't figure out how it ends up with a value in the first place. I see logic that processes this info in the FR launcher ZIP, but cannot quite see how values from that OSGi service make their way into properties that are looked-up in the IdentityServer class.

@Kortanul
Copy link
Member Author

Kortanul commented Jun 4, 2018

also, @pavelhoral, there's a bit more detail at the end of my PR description:

Fixed test failures in openidm-security that were dependent on file order. Here's what...

pavelhoral
pavelhoral previously approved these changes Jun 5, 2018
Copy link
Member

@pavelhoral pavelhoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a few comments which I don't consider to be "blocking the PR":

  • I am not a super fan of powermock dependency. I would rather add some additional methods to IdentityServer.
  • I have zero experience with test-jar dependencies, so I am a bit cautious about those.

Yeah, and then there are my formatting "sigh" comments.. you can ignore those as I just wanted to express my discomfort reading that code.

* what has been provided.
*/
public static void initInstanceForTest(final String projectLocation,
final String installLocation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting makes me sad.

Whitebox.invokeConstructor(
IdentityServer.class,
new Class[] { PropertyAccessor.class },
new Object[] { null });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting makes me sad.

* what has been provided.
*/
public static void verifyServerInitPaths(final String projectLocation,
final String installLocation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting makes me sad.

String rev = capturedAfter.get(ResourceResponse.FIELD_CONTENT_REVISION).asString();
assertThat(rev).isEqualTo("2");
}
finally {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting makes me sad.

@Kortanul
Copy link
Member Author

Kortanul commented Jun 7, 2018

I'll address a few of the formatting concerns. My goal is still to keep lines < 100 chars when we can (generics tend to be my only exception), since alas > 120 chars makes PR review on GH painful and doesn't fit well on my screens.

cleans up some issues pointed out during review.
Changes approach to using a package-private method for clearing state, and a new method for checking initialization status.
Removing "\n" on Windows leaves "\r", which causes each line of the certificate to overwrite the previous one. This fix ensures that the test works properly on both environments.
@Kortanul Kortanul dismissed stale reviews from pavelhoral and siepkes via 01205ef June 14, 2018 04:31
@Kortanul
Copy link
Member Author

Kortanul commented Jun 14, 2018

@pavelhoral back to you. my changes:

  • Cleaned-up some of the formatting.
  • Eliminated the use of PowerMock (this isn't always possible, especially with singletons and legacy code from FR, but it was possible here).
  • Corrected a remaining test failure for Keystore on Windows.

Test JARs are nothing to worry about -- I'm using it here just so we can re-use the test utility class across projects. Test classpaths are not transitive in Maven, so you have to export any test classes like this in order to use them accross modules.

@pavelhoral pavelhoral merged commit 834a17c into WrenSecurity:master Jun 26, 2018
pavelhoral pushed a commit that referenced this pull request Jun 26, 2018
cleans up some issues pointed out during review.
pavelhoral pushed a commit that referenced this pull request Jun 26, 2018
Changes approach to using a package-private method for clearing state, and a new method for checking initialization status.
@Kortanul Kortanul deleted the feature/fix-identityserver-init branch July 6, 2018 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants