
Feature: Repository Cache Keep-Alive #29

Merged: 12 commits from feature/repo-cache-extender into master on Jul 8, 2020

Conversation

@AFaust AFaust (Collaborator) commented May 28, 2020

This PR adds a new sub-module for a standalone Java application that can extend Repository-tier caches and keep their data alive even across complete restarts of all Repository-tier server nodes.

Adding this feature required some refactoring to consolidate more shared code / components within our sub-module for common code, and to introduce additional grid node attributes that identify a node's tier / role. These attributes allow a security processor to validate that a node is connecting to a grid of the correct tier, and they allow registration handling to be taken over by other nodes if a new node does not or cannot handle it on its own. A minimal, illustrative sketch of how such attributes can be handled via Ignite's node attribute API follows below.
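
For illustration only: a minimal sketch of publishing and checking a tier attribute using Ignite's standard node attribute API. The attribute key "gridTier", its value and the surrounding class are invented for this sketch and do not reflect the actual aldica attribute names or the security processor implementation.

import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.configuration.IgniteConfiguration;

public class GridTierExample
{
    public static void main(final String[] args)
    {
        final IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setIgniteInstanceName("repositoryGrid");
        // "gridTier" / "repository" are invented names for this sketch - the real
        // aldica attribute keys and values differ
        cfg.setUserAttributes(Collections.singletonMap("gridTier", "repository"));

        try (Ignite ignite = Ignition.start(cfg))
        {
            // a security processor could perform a check like this when a node joins,
            // rejecting nodes that advertise a different tier
            for (final ClusterNode node : ignite.cluster().forServers().nodes())
            {
                final String tier = node.attribute("gridTier");
                if (!"repository".equals(tier))
                {
                    throw new IllegalStateException("Node " + node.id() + " advertises unexpected tier: " + tier);
                }
            }
        }
    }
}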

With regard to member registration within the Alfresco database, this PR includes a change that breaks compatibility with existing registration data in existing installations, due to restructuring the data stored via the AttributeService. As this module has not yet been released, such a change is considered unproblematic. The only effect is that any previously stored registration data is effectively ignored / not used, and new registration data is stored in a different attribute.

@AFaust AFaust requested a review from andreaskring May 28, 2020 23:02
@AFaust AFaust (Collaborator, Author) commented May 28, 2020

@andreaskring PR is not yet ready to be merged. Documentation of the new component is pending, and there is also some work to be done to correctly document / handle licenses of 3rd-party libraries for the keep-alive Uber JAR.

One thing I would like to request of you is to try and run the Bash test script. As I mostly work on Windows, I have used the companion bat file for my local test. If you find any issues and need to correct the script, simply push those commits to the branch.

@andreaskring andreaskring (Collaborator) commented

I have tested the script and am in the process of correcting a few minor syntactical bugs (\ -> / and removing a few backticks). I do get this error from the keepAlive01 container, though:

keepAlive01_1   | Exception in thread "main" org.springframework.beans.factory.CannotLoadBeanClassException: Error loading class [org.aldica.common.ignite.lifecycle.SpringIgniteLifecycleBean] for bean with name 'org.aldica.common.ignite.lifecycle.SpringIgniteLifecycleBean#0' defined in class path resource [application-context.xml]: problem with class file or dependent class; nested exception is java.lang.UnsupportedClassVersionError: org/aldica/common/ignite/lifecycle/SpringIgniteLifecycleBean has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
keepAlive01_1   | 	at org.springframework.beans.factory.support.AbstractBeanFactory.resolveBeanClass(AbstractBeanFactory.java:1395)
keepAlive01_1   | 	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.determineTargetType(AbstractAutowireCapableBeanFactory.java:663)
keepAlive01_1   | 	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.predictBeanType(AbstractAutowireCapableBeanFactory.java:630)

@AFaust AFaust (Collaborator, Author) commented May 29, 2020

As discussed in our meeting, you'll have to make sure to build the other sub-modules for Java 8 if you have previously built them for Java 11 using the Alfresco 6.2.0 parent POM. That is one of the reasons why I don't want to upgrade the parent POM in general, as it would break backwards compatibility with other Alfresco versions on account of the Java version.
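
For reference, class file version 52.0 corresponds to Java 8 and 55.0 to Java 11, which is exactly the mismatch reported in the UnsupportedClassVersionError above. Purely as an illustration (not something this PR implements), a Java-8-compiled entry class could fail fast with a clearer message before Spring attempts to load classes built for a newer release - the class and method names below are hypothetical:

public final class JavaVersionGuard
{
    private JavaVersionGuard()
    {
        // utility class - no instances
    }

    public static void requireClassVersion(final double requiredClassVersion)
    {
        // java.class.version reports the highest class file version this JVM can load,
        // e.g. 52.0 on Java 8 and 55.0 on Java 11
        final double supported = Double.parseDouble(System.getProperty("java.class.version"));
        if (supported < requiredClassVersion)
        {
            throw new IllegalStateException("JVM supports class file version " + supported
                    + " but at least " + requiredClassVersion
                    + " is required - run on a newer JDK or rebuild the modules for Java 8");
        }
    }
}

Such a guard would only help if it is called from a bootstrap class that is itself compiled for Java 8, before the Spring application context is created.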

@AFaust AFaust force-pushed the feature/repo-cache-extender branch from c27a21c to c7f9ada on May 31, 2020 21:15
@andreaskring andreaskring (Collaborator) commented

I have run mvn clean install (using Java 8) from the aldica root folder, and this got me a step further when running the startStopTests.sh script.

Another problem occurs though:

Creating network "docker_testNetwork" with the default driver
Creating volume "docker_alf_data" with default driver
Creating docker_postgres_1 ... 
Creating docker_postgres_1 ... done
Creating docker_repository01_1 ... 
Creating docker_repository01_1 ... done
Creating docker_keepAlive01_1 ... 
Creating docker_keepAlive01_1 ... done
Ignite instance repositoryGrid currently has 2 active nodes
Stopping docker_repository01_1 ... done
KeepAlive01 did not handle disconnect

and inspecting the container logs gives:

$ docker logs -f docker_keepAlive01_1
...
[08:21:14] Ignite node started OK (id=fed32d2c, instance name=repositoryGrid)
[08:21:14] Topology snapshot [ver=2, locNode=fed32d2c, servers=2, clients=0, state=ACTIVE, CPUs=16, offheap=17.0GB, heap=2.5GB]
08:21:14.564 [main] INFO  o.a.c.i.l.SpringIgniteLifecycleBean - Ignite instance repositoryGrid currently has 2 active nodes on addresses [172.19.0.3, 172.19.0.4]
08:21:14.586 [main] INFO  o.a.c.i.l.SpringIgniteLifecycleBean - Started Ignite instance repositoryGrid
[08:21:24] Topology snapshot [ver=3, locNode=fed32d2c, servers=1, clients=0, state=ACTIVE, CPUs=8, offheap=16.0GB, heap=0.5GB]
[08:21:24] Coordinator changed [prev=TcpDiscoveryNode [id=c5bec0ac-1f6a-4d76-b554-39218238f020, addrs=[172.19.0.3], sockAddrs=[repository01/172.19.0.3:47110], discPort=47110, order=1, intOrder=1, lastExchangeTime=1590999673967, loc=false, ver=2.7.6#20190911-sha1:21f7ca41, isClient=false], cur=TcpDiscoveryNode [id=fed32d2c-7ae8-4a8a-9d2d-4c1c17f89427, addrs=[172.19.0.4], sockAddrs=[keepAlive01/172.19.0.4:47110], discPort=47110, order=2, intOrder=2, lastExchangeTime=1590999673866, loc=true, ver=2.7.6#20190911-sha1:21f7ca41, isClient=false]]
08:21:24.607 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=authenticationSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.607 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=nodeOwnerSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.607 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=readersSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.608 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=solrFacetNodeRefSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.608 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=aclSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.609 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=siteNodeRefSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.609 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=tenantEntitySharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.609 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=contentUrlEncryptingMasterKeySharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.610 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=tagscopeSummarySharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.610 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=personSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.610 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=aclEntitySharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.610 [disco-event-worker-#37%repositoryGrid%] INFO  o.a.c.i.l.SpringIgniteLifecycleBean - Node c5bec0ac-1f6a-4d76-b554-39218238f020 (on [172.19.0.3]) left the Ignite instance repositoryGrid
08:21:24.610 [disco-event-worker-#37%repositoryGrid%] INFO  o.a.c.i.l.SpringIgniteLifecycleBean - Ignite instance repositoryGrid currently has 1 active nodes on addresses [172.19.0.4]
08:21:24.611 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=readersDeniedSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.611 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=contentDataSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.611 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=contentUrlSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.612 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=executingActionsCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.612 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=hbClusterUsageCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.612 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=permissionsAccessSharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.613 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=contentUrlMasterKeySharedCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:24.613 [exchange-worker-#38%repositoryGrid%] WARN  o.a.i.i.p.a.GridAffinityAssignmentCache - Logging at INFO level without checking if INFO level is enabled: Local node affinity assignment distribution is not ideal [cache=remoteAlfrescoTicketService.ticketsCache, expectedPrimary=32.00, actualPrimary=32, expectedBackups=32.00, actualBackups=0, warningThreshold=50.00%]
08:21:26.726 [tcp-disco-sock-reader-#4%repositoryGrid%] INFO  o.a.c.i.d.CredentialsAwareTcpDiscoverySpi - Finished serving remote node connection [rmtAddr=/172.19.0.3:49333, rmtPort=49333

@AFaust AFaust (Collaborator, Author) commented Jun 1, 2020

@andreaskring Yeah - so it looks like in this case the number of lines tailed from the log is not sufficient, or the timing of the exchange workers is simply different than in my tests. We might need to increase the tailing to 12-15 lines instead of 6. Alternatively, we/I might disable the extraneous WARN messages to get a more easily reproducible logging output for validation.

@andreaskring andreaskring (Collaborator) commented

Yes - right. I should have thought of that after what you explained to me at our meeting. I have made a few minor syntactical fixes to the bash script and will push these in a minute.

@AFaust AFaust force-pushed the feature/repo-cache-extender branch from 0443235 to b71f73a on July 7, 2020 00:23
@AFaust AFaust (Collaborator, Author) commented Jul 7, 2020

@andreaskring I consider this PR to be ready for merging. Documentation about the companion application has been added, and both the basic Maven-integrated integration test and the separate, script-based start-stop integration test run successfully after accounting for the recent serialisation optimisations. I have created a separate issue to update our Docker / Kubernetes documentation and samples to also include the companion app, which can be done without hurry after the initial aldica release.

@andreaskring andreaskring (Collaborator) left a comment

Very cool - let's merge it

@AFaust AFaust merged commit 3b4b1e2 into master Jul 8, 2020
@AFaust AFaust deleted the feature/repo-cache-extender branch July 8, 2020 10:21