Refactor `CCA` and `CCorA` classes to reduce repeated operations
#49
Labels:
- enhancement: New feature or request
- estimator: Related to one or more estimators
- refactor: Code cleanup without changing functionality
At present, `CCA` and `CCorA` are implemented as classes that mainly use `@property` to access attributes. As such, properties call other properties within the class, and properties are often used repeatedly, which forces recalculation. This is particularly expensive when evaluating methods from the `np.linalg` module such as singular value decomposition, QR decomposition, and solving systems of linear equations. We need to ensure that attributes are calculated only once when there is no possibility of change to attribute values.

This issue is closely tied to #47, but we are opening it in a separate issue for a few different reasons:
- The `transform` method on the enclosing estimators (`CCATransformer` and `CCorATransformer`) is the method that is called most often, and the fixes in #47 (Store repeatedly used variables as estimator attributes) and #48 address the issue of repeated calls to properties by setting estimator attributes during `fit`. Because the `fit` method on these transformers calls the classes of interest in this issue (`CCA` and `CCorA`), it still suffers from repeated calls to properties, but `fit` is typically called only once, so delaying the refactor shouldn't be too problematic.
- These classes were ported from `yaImpute`, and we wanted to verify that "checkpoints" in the porting process were lining up with output from `yaImpute`. As such, there are properties that correspond to multiple steps in each ordination process, very few of which we actually use. Now that the port seems to be behaving correctly, we can treat many of these properties as local variables and only retain the attributes/properties/methods needed for clients and enclosing classes. However, if there is value in retaining some of these "internal" attributes for diagnosing model goodness-of-fit (e.g. charting capabilities for ordinations), we should perhaps retain them as (at the very least) private properties/methods. This will require a bit of research into what attributes are retained for diagnostics, using `yaImpute` and `vegan` for inspiration.
- Beyond `CCA`, there are actually a number of other ordination techniques (redundancy analysis (RDA), distance-based RDA) that use common code already implemented in this class, from which we may choose to create additional estimators. Before optimizing the two existing classes, we'll want to think through the design of any new estimators and reuse as much of the existing code as possible.

Note that none of these reasons really stops us from refactoring out the duplicate calls at present, but changes to the existing code and potential expansion into a family of classes make it prudent to delay until after the Core Estimators milestone.
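To illustrate the recalculation problem described in this issue, here is a minimal sketch. The class names (`EagerOrdination`, `CachedOrdination`), the `_svd` property, and the `svd_calls` counter are hypothetical and are not taken from the actual `CCA`/`CCorA` code. It contrasts a plain `@property` that reruns `np.linalg.svd` on every access with `functools.cached_property`, which is one possible way (not necessarily the one we will choose) to compute a value once per instance.

```python
from functools import cached_property

import numpy as np


class EagerOrdination:
    """Hypothetical stand-in: every access to `_svd` reruns the decomposition."""

    def __init__(self, X):
        self.X = np.asarray(X, dtype=float)
        self.svd_calls = 0  # counter used only to make the cost visible

    @property
    def _svd(self):
        self.svd_calls += 1
        return np.linalg.svd(self.X, full_matrices=False)

    @property
    def singular_values(self):
        # Property calling another property: triggers a fresh SVD.
        return self._svd[1]

    @property
    def right_vectors(self):
        # Triggers yet another SVD of the same matrix.
        return self._svd[2].T


class CachedOrdination(EagerOrdination):
    """Same interface, but the SVD result is stored after the first access."""

    @cached_property
    def _svd(self):
        self.svd_calls += 1
        return np.linalg.svd(self.X, full_matrices=False)


X = np.arange(12, dtype=float).reshape(4, 3)

eager = EagerOrdination(X)
eager.singular_values
eager.right_vectors
print(eager.svd_calls)  # one SVD per property access

cached = CachedOrdination(X)
cached.singular_values
cached.right_vectors
print(cached.svd_calls)  # the SVD ran once and was reused
```

Caching like this is only safe when the inputs cannot change after construction, which matches the condition stated above. If `X` were reassigned after `_svd` has been computed, the cached value would be stale until removed with `del cached._svd`.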