Update to cBioPortal cache persistence using REDIS #46
Comments
Consult with Hongxin ... OncoKB is using external caching, and we can use that approach as a template (Helm chart and linkage). See notes here: https://docs.google.com/document/d/1bu7NVOavH_ekK1O_DKHZAp1HAFwPVlUkp1c0gE5tDYo |
A new cluster node was added in consideration of the expected high memory footprint. The helm and tiller executables were copied from dashi-dev to pipelines, and the needed environment variables were added to the cbioportal_importer shell startup script to allow use of helm on pipelines. Also, the file access modes on the relevant startup scripts were tightened. Adding new node: It was noticed that only the master pod was deployed to the specified instance group; the other pods were deployed to general compute nodes on the cluster. |
We had some trouble finding the redis server ip address and port to use in the configuration. You can find these with commands like this: Output looks like: |
We were having trouble because we had failed to comment out the beans for Ehcache in applicationContext-ehcache.xml. We also tried to redirect log4j away from the application log (/srv/www/schultz-tomcat/logs...) and stream it out to the console, which we hoped would appear in the catalina.out log (accessible by "kubectl logs"). Helpful commands (from knowledgesystems-k8s-configs?): kubectl get pods |
Latest failure report during application startup: 2020-07-08 20:57:29 [localhost-startStop-1] ERROR org.springframework.web.context.ContextLoader - Context initialization failed |
How to actually find stuff in the cache using redis-cli: https://scalegrid.io/blog/redis-iterating-over-keys/ E.g. |
Configuration information:
Decide on: ttl - time to live for a key/value entry in milliseconds. If 0, then time to live doesn't affect entry expiration.
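To make the ttl rule above concrete, here is a minimal Java sketch of the stated semantics. It models only the sentence quoted above and is not Redisson's implementation; the class name `TtlSketch`, the method `isExpired`, and the choice to treat an entry as expired exactly at the ttl boundary are assumptions made for illustration.

```java
// Sketch only: models the quoted ttl rule, not Redisson's actual code.
final class TtlSketch {
    private TtlSketch() {}

    /**
     * Returns true when an entry created at createdAtMillis should be treated
     * as expired at nowMillis. A ttlMillis of 0 disables time-based expiration.
     * Treating "age == ttl" as expired is an assumption of this sketch.
     */
    static boolean isExpired(long createdAtMillis, long nowMillis, long ttlMillis) {
        if (ttlMillis == 0) {
            return false; // 0 means time to live does not affect expiration
        }
        return nowMillis - createdAtMillis >= ttlMillis;
    }

    public static void main(String[] args) {
        System.out.println(isExpired(0L, 5_000L, 0L));       // ttl disabled
        System.out.println(isExpired(0L, 5_000L, 10_000L));  // younger than ttl
        System.out.println(isExpired(0L, 10_000L, 10_000L)); // reached ttl
    }
}
```

With ttl set to 0, entries leave the cache only through some other mechanism, such as an eviction policy on the server.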
|
The initial round of coding has reached a point where we could see persistence layer calls being cached in the external redis server deployment. Code has now been committed to a feature branch in the cbioportal code base here: redis-cache-dev is the branch name and for reference, here is the PR with the code changes: |
Status of "each portal needs to save to a different cache names": I wanted to do something like:
But I can't because all of our Plan for Monday:
It worked!!!
Except "public-portalStaticRepositoryCacheOneResolver" needs to be renamed "public-portalStaticRepositoryCacheOne" (doing that now). |
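One way to picture the per-portal cache names is a prefix joined directly to a base cache name, which is what the name "public-portalStaticRepositoryCacheOne" above suggests. The sketch below is illustrative only: the class `PortalCacheNames`, the method `cacheName`, and the "genie-portal" prefix are hypothetical, and the real naming code in the feature branch may differ.

```java
// Sketch only: hypothetical helper showing prefix + base name concatenation.
final class PortalCacheNames {
    private PortalCacheNames() {}

    /** Builds a portal-specific cache name, e.g. "public-portal" + "StaticRepositoryCacheOne". */
    static String cacheName(String portalPrefix, String baseName) {
        if (portalPrefix == null || portalPrefix.isEmpty()) {
            throw new IllegalArgumentException("portalPrefix must not be empty");
        }
        if (baseName == null || baseName.isEmpty()) {
            throw new IllegalArgumentException("baseName must not be empty");
        }
        return portalPrefix + baseName;
    }

    public static void main(String[] args) {
        // "genie-portal" is a made-up prefix used only to show that names differ per portal.
        System.out.println(cacheName("public-portal", "StaticRepositoryCacheOne"));
        System.out.println(cacheName("genie-portal", "StaticRepositoryCacheOne"));
    }
}
```

Because each portal uses a different prefix, two portals sharing one Redis deployment would not read each other's entries.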
To get information about the helm chart deployed:
Options are defined here: https://github.com/bitnami/charts/tree/master/bitnami/redis/#installing-the-chart and it looks like major version 10 is the most recent. Rob deployed a r5.xlarge node, which has 32 GiB (probably 64 bit, Redis says use 32 bit when possible).
Rob ran:
To configure the server add:
Currently set to use unlimited memory with no eviction policy:
Current memory usage around 76mb, given by https://redis.io/commands/memory-stats:
Set Maxmemory to 50mb and see if keys are removed from cache (before and after we add eviction policy):
Lots of details about pod given with Get pods running on a specific node with
New Redis Cluster has maxmemory and the Maxmemory-policy set correctly:
Looks like peak allocated memory can be over what maxmemory is set to but I haven't seen the
I checked and /data is created and it is size 8G (really 7.9G) which is the default size.
I think not enough memory for the cache caused a new instance of the portal to not be able to start up.
Seems like setting the ttl and maxIdleTime to 0 made the portal unable to start up, even though it is supposed to be OK with a max-memory policy of allkeys-lfu or allkeys-lru. Not sure why it is a problem.
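For intuition about why ttl and maxIdleTime of 0 should be workable with an allkeys-lru policy, here is a small Java sketch of a bounded cache whose entries never expire by time and are removed only by least-recently-used eviction. It is a simplified model: Redis bounds memory in bytes and uses an approximated LRU, whereas this sketch counts entries and evicts exactly. The class name `LruCacheSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: exact, count-based LRU as a stand-in for Redis's
// byte-based, approximated allkeys-lru eviction.
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: reads refresh recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the bound is exceeded.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // exceeds the bound, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

In this model nothing ever expires, yet the cache stays bounded, which is the behavior expected from a no-ttl configuration combined with an eviction policy.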
|
Tomorrow: Try to get ehcache + Redis working. Ideally one property has: ehcache-heap, ehcache-disk, ehcache-hybrid, redis, none options. Use that to figure out which beans to set up. |
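The single-property idea above could look roughly like the following Java sketch, which parses the property value into an enum and answers which family of beans would be needed. The enum name `CacheType` and its methods are hypothetical; only the five option strings come from the note above.

```java
import java.util.Locale;

// Sketch only: hypothetical enum for the proposed single cache property.
enum CacheType {
    EHCACHE_HEAP("ehcache-heap"),
    EHCACHE_DISK("ehcache-disk"),
    EHCACHE_HYBRID("ehcache-hybrid"),
    REDIS("redis"),
    NONE("none");

    private final String propertyValue;

    CacheType(String propertyValue) {
        this.propertyValue = propertyValue;
    }

    /** Parses the property value; unknown or missing values are rejected. */
    static CacheType fromProperty(String value) {
        if (value == null) {
            throw new IllegalArgumentException("cache type must be set");
        }
        String normalized = value.trim().toLowerCase(Locale.ROOT);
        for (CacheType type : values()) {
            if (type.propertyValue.equals(normalized)) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unknown cache type: " + value);
    }

    /** True for the three Ehcache variants. */
    boolean usesEhcache() {
        return this == EHCACHE_HEAP || this == EHCACHE_DISK || this == EHCACHE_HYBRID;
    }

    /** True only for the external Redis option. */
    boolean usesRedis() {
        return this == REDIS;
    }

    public static void main(String[] args) {
        CacheType type = fromProperty("redis");
        System.out.println(type + " usesRedis=" + type.usesRedis());
    }
}
```

Bean setup code could then branch on `usesEhcache()` and `usesRedis()` instead of comparing strings in several places.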
Really separate ehcache and Redis: cBioPortal/cbioportal#7696. Ideally we would not create the ehcache beans when we are using Redis, but right now we need to because EhCacheStatistics is a @Component with a constructor that expects a javax.cache.CacheManager, which we do not have when running Redis. Maybe we can change the Redis setup to also use javax.cache.CacheManager: Some of the code in EhCacheStatistics is specific to ehcache, but some would work with any javax.cache.CacheManager. Maybe split the code out? |
@mandawilson @sheridancbio I may have misheard or misunderstood something during the scrum today about having to add the Redis IP into portal.properties in order for the cbioportal backend to find the Redis host. Did I hear right? If so, why can't we use DNS?
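As an illustration of the DNS question, the server address could be built from a host name rather than an IP, assuming that name resolves from wherever the backend runs. The sketch below uses the `redis://host:port` address form that Redisson accepts; the host name `cbioportal-redis-master` and port 6379 are taken from elsewhere in this thread, while the class and method names are hypothetical.

```java
// Sketch only: builds a Redisson-style address from a host name and port.
final class RedisAddressSketch {
    private RedisAddressSketch() {}

    /** Returns an address such as "redis://cbioportal-redis-master:6379". */
    static String redisAddress(String host, int port) {
        if (host == null || host.trim().isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return "redis://" + host.trim() + ":" + port;
    }

    public static void main(String[] args) {
        // Assumes the service name is resolvable by DNS in the deployment environment.
        System.out.println(redisAddress("cbioportal-redis-master", 6379));
    }
}
```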
|
Commit b61e6b8de5e2f11beed285ab1e97a1114c2c24a3 works without any statistics for Redis (a poorly described exception is thrown if the API endpoints are called with the Redis profile). We are trying to get a javax.cache.Cache (JCache API) for Redis so that we can get two of the cache statistics endpoints working with Redis. We need to somehow get a Redisson CachingProvider/CacheManager in
Cannot resolve reference to bean 'redisCacheManager' while setting bean property 'cacheManager'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'redisCacheManager' defined in class path resource [applicationContext-rediscache.xml]: Cannot resolve reference to bean 'cacheManager' while setting bean property 'cacheManager'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cacheManager' defined in class path resource [applicationContext-rediscache.xml]: Invocation of init method failed; nested exception is javax.cache.CacheException: Multiple CachingProviders have been configured when only a single CachingProvider is expected
JCache API/Redisson: https://github.com/redisson/redisson/wiki/14.-Integration-with-frameworks#144-jcache-api-jsr-107-implementation
|
I was able to loop through the available CachingProviders and got this: cbioportal_importer@pipelines:/data/portal-cron/git-repos/knowledgesystems-k8s-deployment/cbioportal> kubectl logs cbioportal-redis-cache-6d9ccbd885-k5s5c| grep ERROR So I added code to check for Tomorrow maybe focus on commit b61e6b8de5e2f11beed285ab1e97a1114c2c24a3 which should be tested with both Redis and Ehcache. I had it working with redis. Change the cache_type in your deployment yaml file from redis to one of the ehcache ones. |
"So the cache names are different for each portal instance in Redis (as you know) and then in both Redis and EhCache the key stored in the cache is the class name + method name + parameters (e.g. "StudyMyBatisRepository_getAllStudies_null_SUMMARY_10000000_0_null_ASC"). Does that answer your question?" |
Config Map: kubectl delete configmap cbioportal-redis-cache kubectl create -f config_map_redis_cache.yaml |
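The cache key format quoted earlier (class name + method name + parameters, joined by underscores) can be sketched in Java as follows. This is an illustration that reproduces the quoted example key, not the project's actual key generator; the class `CacheKeySketch` and method `buildKey` are hypothetical, and rendering null parameters as the text "null" is inferred from the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: reproduces the shape of the quoted example key.
final class CacheKeySketch {
    private CacheKeySketch() {}

    /** Joins class name, method name, and each parameter with underscores. */
    static String buildKey(String className, String methodName, Object... params) {
        List<String> parts = new ArrayList<>();
        parts.add(className);
        parts.add(methodName);
        for (Object param : params) {
            parts.add(String.valueOf(param)); // null becomes the text "null"
        }
        return String.join("_", parts);
    }

    public static void main(String[] args) {
        String key = buildKey("StudyMyBatisRepository", "getAllStudies",
                null, "SUMMARY", 10000000, 0, null, "ASC");
        System.out.println(key);
    }
}
```

Since the key itself carries no portal identifier, separation between portals relies on the per-portal cache names.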
|
Resolving redisson version clashes with libraries installed in the tomcat server is now delegated to another card. |
Rob and I tried to make the Kubernetes redis services available to dashi-dev (and the world) with ingress, but ingress does not support port 6379 (or anything besides http and https) by default. You can modify ingress to do so, but we didn't feel confident enough to do that. See: https://stackoverflow.com/questions/62939846/exposing-redis-with-ingress-nginx-controller We see that Ben installed a redis for testing on dashi-dev in 2018 and my plan is to use that for testing. schultz-tomcat on dashi-dev is configured to use a redis running on pipelines, database 3, but that database doesn't have any keys stored in it. To test on dashi-dev:
|
Redis database usage:
|
dashi-dev redisson conflicts:
|
Both the app and tomcat are using the same version of redisson but we are getting this error:
Error creating bean with name 'cacheManager' defined in class path resource [applicationContext-rediscache.xml]: Invocation of init method failed; nested exception is java.lang.LinkageError: loader constraint violation: when resolving method "org.redisson.jcache.configuration.RedissonConfiguration.fromInstance(Lorg/redisson/api/RedissonClient;Ljavax/cache/configuration/Configuration;)Ljavax/cache/configuration/Configuration;" the class loader (instance of org/apache/catalina/loader/ParallelWebappClassLoader) of the current class, org/cbioportal/persistence/util/CustomRedisCachingProvider, and the class loader (instance of java/net/URLClassLoader) for the method's defining class, org/redisson/jcache/configuration/RedissonConfiguration, have different Class objects for the type javax/cache/configuration/Configuration used in the signature
Look at this: redisson/redisson#1668
So we added the redisson jar back to the war file and now are getting the same error we got when the redisson versions did not match:
25-Sep-2020 11:40:42.889 WARNING [localhost-startStop-5] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
Look at this file because it returns the RedissonClient: ./persistence/persistence-api/src/main/java/org/cbioportal/persistence/util/CustomRedisCachingProvider.java |
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cacheManager' defined in class path resource [applicationContext-rediscache.xml]: Invocation of init method failed; nested exception is java.util.ServiceConfigurationError: javax.cache.spi.CachingProvider: Provider org.redisson.jcache.JCachingProvider not a subtype |
Check that Tomcat sessions are being stored by my app in Kubernetes: kubectl exec --stdin --tty cbioportal-redis-cache-5b9b9bdb4-6gsv5 -- /bin/bash cbioportal-redis-master:6379> keys *F6B5E639F2DF5D85A43D7E02C9E5F98E
It is there! |
I renamed the JNDI resource in Tomcat's context.xml and server.xml but still have the same error: The local resource link [redisson-tomcat] that refers to global resource [session/redisson-tomcat] was expected to return an instance of [org.redisson.api.RedissonClient] but returned an instance of [org.redisson.Redisson] |
When I exclude the jar from the war, I get this again: Invocation of init method failed; nested exception is java.util.ServiceConfigurationError: javax.cache.spi.CachingProvider: Provider org.redisson.jcache.JCachingProvider not a subtype |
Passing this jcacheConfig, instead of just using the
Causes this: text-rediscache.xml]: Invocation of init method failed; nested exception is java.lang.LinkageError: loader constraint violation: when resolving method "org.redisson.jcache.configuration.RedissonConfiguration.fromInstance(Lorg/redisson/api/RedissonClient;Ljavax/cache/configuration/Configuration;)Ljavax/cache/configuration/Configuration;" the class loader (instance of org/apache/catalina/loader/ParallelWebappClassLoader) of the current class, org/cbioportal/persistence/util/CustomRedisCachingProvider, and the class loader (instance of java/net/URLClassLoader) for the method's defining class, org/redisson/jcache/configuration/RedissonConfiguration, have different Class objects for the type javax/cache/configuration/Configuration used in the signature But leaving it and removing the jar from the war give you this again: |
In this thread it says Perhaps it works in Kubernetes because there is only the web runner jar running our app, so one shared "space" or thread or whatever the class loader needs. Do we have to take our CustomRedisCachingProvider.java class and put it in a jar, and exclude it from the war and add it to Tomcat? In portal.pom: Also exclude redisson*.jar.
|
added loader option to context.xml
|
After contemplating additional efforts toward resolving the class-loading problem during tomcat deployment (choosing between the container "/lib" jar classes for redisson and the war file's packaged dependencies), the product owner has accepted that the unresolved conflict during tomcat deployments will be allowed to stand. Our local tomcat deployments will not use Redis caching for the persistence layer and will instead continue with Ehcache persistence caching. A note will be added to the build/deploy documentation that installers should expect difficulties when deploying .war files to a tomcat that has been configured to use Redis caching for user sessions. |
When switching to profiles - remember you'll have to configure this for unit tests!! or use @MockBean |
./docs/Caching.md:@Cacheable(cacheNames = "ClinicalDataCache", condition = "@cacheEnabledConfig.getEnabled()")
Re-configure so that REDIS is not built into the JVM process, but is instead run as an external process. The cBioPortal persistence layer annotations can be updated to refer to an external service.
Consult with Hongxin ... oncoKB is using external caching and we can use the approach as a template. (Helm chart and linkage) (OncoKb CacheConfiguration)
One or more Helm deployments of Redis are running in the Kubernetes cluster and being used by various websites to cache persistence layer return values.
Use a separate pool of Redis services for the distinct cohort databases
Start with Genie database .. but plan for having a separate pool of servers for public.
The code base should allow continued use of embedded Ehcache .. but allow reconfiguration for using external redis services.