Enhance the CachingLayer #904
Comments
IMHO, the CachingLayer in its current implementation does not solve the problem of recalculating indicators. Within one process, an indicator with a given set of parameters is calculated only once, which is good. In practice, however, optimization means searching over indicator parameters, which leads to constant recalculation. In addition, during research we run our strategies over and over again, and each run recalculates the indicators from scratch. On the other hand, when the strategy runs as a trading robot, caching is largely redundant because for most indicators only the last value matters. So, in my opinion, caching can be improved by saving the results of the indicator calculation to disk. Of course, it would be better to use memcached, Redis, MongoDB or Hazelcast for data storage, but even writing to disk would significantly increase caching efficiency. |
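To make the disk idea concrete, here is a minimal sketch. It is not part of ta4j: the class `DiskIndicatorCache`, the method `getOrCompute`, and the one-value-per-line file layout are all assumptions made for illustration. The caller is assumed to pass a file-system-safe key that encodes the series, the indicator and its parameters, and to delete the file when the underlying bars change.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntToDoubleFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical helper (not a ta4j class): persists indicator values per key. */
final class DiskIndicatorCache {
    private final Path directory;

    DiskIndicatorCache(Path directory) {
        this.directory = directory;
    }

    /**
     * Returns the values stored under the key if a file with exactly
     * barCount lines exists; otherwise computes them and writes the file.
     */
    double[] getOrCompute(String key, int barCount, IntToDoubleFunction calculate) {
        Path file = directory.resolve(key + ".csv");
        try {
            if (Files.exists(file)) {
                List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
                if (lines.size() == barCount) {
                    // Cache hit: no indicator calculation is performed.
                    return lines.stream().mapToDouble(Double::parseDouble).toArray();
                }
            }
            // Cache miss (or length mismatch): compute every value once and persist.
            double[] values = IntStream.range(0, barCount).mapToDouble(calculate).toArray();
            List<String> out = Arrays.stream(values)
                    .mapToObj(Double::toString)
                    .collect(Collectors.toList());
            Files.createDirectories(directory);
            Files.write(file, out, StandardCharsets.UTF_8);
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A second run with the same key reads the file instead of recalculating, which is the saving described above for repeated research runs. A changed parameter set needs a different key, so parameter optimization still pays for each new combination once.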
Would it make sense to just remove the cache from the project entirely? Why cache at all when most users never actually use the cache because:
Am I missing some other broad use cases that justify the "caching layer" in ta4j? If someone wants to save or cache the values somewhere, they can do that, but the project doesn't need it for any indicator calculation. I think we should do two steps:
Step 1 can be done immediately. |
As far as I can see, the caching layer is used in many indicators, and I cannot estimate what performance impact removing CachedIndicator would have. |
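One way to get a feeling for that impact is to count calculations in a toy recursive indicator. The sketch below is not the ta4j EMA; `ToyEma` and `ToyEmaDemo` are hypothetical classes that only mimic the pattern of an indicator whose value at an index depends on its own value at the previous index.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy recursive EMA (not the ta4j class) that counts how often calculate() runs. */
final class ToyEma {
    private final double[] prices;
    private final double multiplier;
    private final boolean cached;
    private final Map<Integer, Double> results = new HashMap<>();
    long calculations = 0;

    ToyEma(double[] prices, int barCount, boolean cached) {
        this.prices = prices;
        this.multiplier = 2.0 / (barCount + 1);
        this.cached = cached;
    }

    double getValue(int index) {
        if (!cached) {
            return calculate(index);
        }
        Double hit = results.get(index);
        if (hit == null) {
            hit = calculate(index);
            results.put(index, hit);
        }
        return hit;
    }

    private double calculate(int index) {
        calculations++;
        if (index == 0) {
            return prices[0];
        }
        // The recursion goes through getValue, so a cache can cut it short.
        double previous = getValue(index - 1);
        return previous + multiplier * (prices[index] - previous);
    }
}

final class ToyEmaDemo {
    /** Walks the series once, as a strategy run would, and returns the call count. */
    static long countCalculations(int bars, boolean cached) {
        double[] prices = new double[bars];
        for (int i = 0; i < bars; i++) {
            prices[i] = 100 + i;
        }
        ToyEma ema = new ToyEma(prices, 10, cached);
        for (int i = 0; i < bars; i++) {
            ema.getValue(i);
        }
        return ema.calculations;
    }
}
```

For 100 bars, `ToyEmaDemo.countCalculations(100, true)` returns 100, while `ToyEmaDemo.countCalculations(100, false)` returns 5050, because every uncached call walks the recursion back to index 0. Real numbers for ta4j would have to come from a benchmark of the actual indicators.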
Yes, we also need it for its child |
The cache is very important for a lot of use cases.
So far I have no metrics, but I think a caching layer that ensures a calculation only needs to be executed once has a very big impact on performance.
I would prefer to have the cache enabled by default and to add the possibility to disable it:
BarSeries barSeries = new BaseBarSeriesBuilder()
        .withName("AAPL live trading")
        .withCache(BaseCacheBuilder.NO_CACHE)
        .build();
If the cache is implemented this way, I would suggest using Caffeine as the third-party cache provider for the default "BaseCache" implementation. |
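A possible shape for such a pluggable cache, sketched with the standard library only. The names `IndicatorCache`, `MapCache` and `NoCache` are hypothetical and do not exist in ta4j; a Caffeine-backed implementation could sit behind the same interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical cache abstraction; illustrative names, not an existing ta4j API. */
interface IndicatorCache<T> {
    /** Returns the value for the index, using calculate when it is not stored. */
    T get(int index, IntFunction<T> calculate);
}

/** Default behaviour: every index is calculated at most once. */
final class MapCache<T> implements IndicatorCache<T> {
    private final Map<Integer, T> values = new HashMap<>();

    @Override
    public T get(int index, IntFunction<T> calculate) {
        T value = values.get(index);
        if (value == null) {
            value = calculate.apply(index);
            values.put(index, value);
        }
        return value;
    }
}

/** The "no cache" option: always delegates to the calculation. */
final class NoCache<T> implements IndicatorCache<T> {
    @Override
    public T get(int index, IntFunction<T> calculate) {
        return calculate.apply(index);
    }
}
```

An indicator would then call `cache.get(index, this::calculate)` and never know which implementation it was given, so disabling the cache for live trading becomes a configuration choice rather than a special case in the indicator.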
After considering the importance of the caching in ta4j, I have a few questions:
The missing requirements are to avoid errors in indicator calculations due to the things described in #902 (comment). We should also discuss the following steps to clarify whether much reworking of the caching layer is necessary at all:
|
The possibility of disabling the cache would be only a side effect of a "pluggable" cache.
There are fairly common issues and bug reports that are based on or related to our current caching logic. And I think the problems are not solved by adding more and more special cases to this:
public T getValue(int index) {
    BarSeries series = getBarSeries();
    if (series == null) {
        // Series is null; the indicator doesn't need cache.
        // (e.g. simple computation of the value)
        // --> Calculating the value
        T result = calculate(index);
        if (log.isTraceEnabled()) {
            log.trace("{}({}): {}", this, index, result);
        }
        return result;
    }
    // Series is not null
    final int removedBarsCount = series.getRemovedBarsCount();
    final int maximumResultCount = series.getMaximumBarCount();
    T result;
    if (index < removedBarsCount) {
        // Result already removed from cache
        if (log.isTraceEnabled()) {
            log.trace("{}: result from bar {} already removed from cache, use {}-th instead",
                    getClass().getSimpleName(), index, removedBarsCount);
        }
        increaseLengthTo(removedBarsCount, maximumResultCount);
        highestResultIndex = removedBarsCount;
        result = results.get(0);
        if (result == null) {
            // It should be "result = calculate(removedBarsCount);".
            // We use "result = calculate(0);" as a workaround
            // to fix issue #120 (https://github.com/mdeverdelhan/ta4j/issues/120).
            result = calculate(0);
            results.set(0, result);
        }
    } else {
        if (index == series.getEndIndex()) {
            // Don't cache result if last bar
            result = calculate(index);
        } else {
            increaseLengthTo(index, maximumResultCount);
            if (index > highestResultIndex) {
                // Result not calculated yet
                highestResultIndex = index;
                result = calculate(index);
                results.set(results.size() - 1, result);
            } else {
                // Result covered by current cache
                int resultInnerIndex = results.size() - 1 - (highestResultIndex - index);
                result = results.get(resultInnerIndex);
                if (result == null) {
                    result = calculate(index);
                    results.set(resultInnerIndex, result);
                }
            }
        }
    }
    if (log.isTraceEnabled()) {
        log.trace("{}({}): {}", this, index, result);
    }
    return result;
}
/**
 * Increases the size of cached results buffer.
 *
 * @param index     the index to increase length to
 * @param maxLength the maximum length of the results buffer
 */
private void increaseLengthTo(int index, int maxLength) {
    if (highestResultIndex > -1) {
        int newResultsCount = Math.min(index - highestResultIndex, maxLength);
        if (newResultsCount == maxLength) {
            results.clear();
            results.addAll(Collections.nCopies(maxLength, null));
        } else if (newResultsCount > 0) {
            results.addAll(Collections.nCopies(newResultsCount, null));
            removeExceedingResults(maxLength);
        }
    } else {
        // First use of cache
        assert results.isEmpty() : "Cache results list should be empty";
        results.addAll(Collections.nCopies(Math.min(index + 1, maxLength), null));
    }
}
/**
 * Removes the N first results which exceed the maximum bar count. (i.e. keeps
 * only the last maximumResultCount results)
 *
 * @param maximumResultCount the number of results to keep
 */
private void removeExceedingResults(int maximumResultCount) {
    int resultCount = results.size();
    if (resultCount > maximumResultCount) {
        // Removing old results
        final int nbResultsToRemove = resultCount - maximumResultCount;
        if (nbResultsToRemove == 1) {
            results.remove(0);
        } else {
            results.subList(0, nbResultsToRemove).clear();
        }
    }
}
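The index arithmetic in the "Result covered by current cache" branch can be checked in isolation. The helper below is a hypothetical extraction for illustration only; it repeats the `resultInnerIndex` expression from `getValue` and nothing else.

```java
/** Hypothetical helper that isolates the resultInnerIndex arithmetic from getValue. */
final class WindowIndex {
    /**
     * Maps a series index to a position in the results buffer, given the
     * buffer size and the highest series index stored so far.
     */
    static int innerIndex(int bufferSize, int highestResultIndex, int index) {
        return bufferSize - 1 - (highestResultIndex - index);
    }
}
```

Assuming a buffer of five results whose newest entry belongs to series index 12, `WindowIndex.innerIndex(5, 12, 12)` is 4 (the last slot), `WindowIndex.innerIndex(5, 12, 10)` is 2, and `WindowIndex.innerIndex(5, 12, 8)` is 0 (the oldest slot still held). The mapping only stays valid while `highestResultIndex` and the buffer size are updated together, which is what the surrounding special cases try to guarantee.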
Yes, independently of the cache fix or enhancement, we need to eliminate all usages of instance variables that store context information that depends on an index.
Using |
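One reading of the point about index-dependent instance variables, as a toy sketch. Both classes are hypothetical and not taken from ta4j: the first keeps a running maximum in a field, so its answer depends on which indices were requested before; the second derives the value from the index alone.

```java
/** Hypothetical indicator whose result depends on the order of earlier calls. */
final class StatefulHighest {
    private final double[] prices;
    // Context that silently depends on the indices seen so far.
    private double highestSoFar = Double.NEGATIVE_INFINITY;

    StatefulHighest(double[] prices) {
        this.prices = prices;
    }

    double calculate(int index) {
        highestSoFar = Math.max(highestSoFar, prices[index]);
        return highestSoFar;
    }
}

/** Same indicator without index-dependent state: the index alone determines the result. */
final class StatelessHighest {
    private final double[] prices;

    StatelessHighest(double[] prices) {
        this.prices = prices;
    }

    double calculate(int index) {
        double highest = prices[0];
        for (int i = 1; i <= index; i++) {
            highest = Math.max(highest, prices[i]);
        }
        return highest;
    }
}
```

With prices `{1, 5, 2, 3}`, calling `calculate(1)` and then `calculate(0)` on the stateful version yields 5 both times, although the correct value at index 0 is 1. The stateless version returns 5 and then 1. Any cache that skips, reorders or repeats calls will expose this kind of hidden state.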
I think before providing a cache fix or optional caching, we should do #906. |
Draft: #907 |
Hi Simon (@team172011). I was thinking of implementing a cache myself and came across this ticket. I've reviewed your great work and was wondering what the reason was for removing CacheProvider from BaseSeries, as it completely decouples the cache implementation from CachedIndicator. I have use cases where I need partial cache busting (rewind) as well as different cache implementations for different bar series instances, and going back to a cache factory via BarSeries would be perfect. |
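The "partial cache busting (rewind)" use case could look roughly like this. `RewindableCache` and `invalidateFrom` are hypothetical names chosen for the sketch; nothing here is taken from the draft in #907.

```java
import java.util.TreeMap;
import java.util.function.IntFunction;

/** Hypothetical cache that can forget everything from a given index onward. */
final class RewindableCache<T> {
    private final TreeMap<Integer, T> values = new TreeMap<>();

    T get(int index, IntFunction<T> calculate) {
        T value = values.get(index);
        if (value == null) {
            value = calculate.apply(index);
            values.put(index, value);
        }
        return value;
    }

    /** Drops every cached value at or after fromIndex; earlier values stay. */
    void invalidateFrom(int fromIndex) {
        // tailMap returns a view, so clearing it removes the entries from the cache itself.
        values.tailMap(fromIndex, true).clear();
    }

    int size() {
        return values.size();
    }
}
```

After filling indices 0 to 9, `invalidateFrom(6)` leaves the six entries for indices 0 to 5 in place, and the next request for index 7 is calculated again. A factory on the bar series could hand out one such cache per indicator, which would also allow different implementations per series.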
@sergproua I think one problem with the |
This issue is about improving/replacing/correcting the CachingLayer:
Some background information:
Ideas welcome.