Support use of reduced number of axes in CCorATransformer, CCATransformer #33

Closed
grovduck opened this issue May 31, 2023 · 1 comment · Fixed by #40
Labels
enhancement New feature or request estimator Related to one or more estimators testing Related to test code or data

Comments

@grovduck
Member

We've touched on this in a few previous issues (#20, #22), but for estimators that perform dimensionality reduction and create "axes" that are linear combinations of their input covariates (i.e. CCorATransformer, CCATransformer), we want users to be able to set the number of axes these estimators use. Typically, these estimators create up to as many axes as there are X variables provided, and some reduce that number themselves. For example, CCorATransformer uses an internal test to reduce the number of axes based on their statistical significance.

We need to solve the following problems with this issue:

  • Allow the user to specify the number of axes to keep. This would likely be done through the estimator's __init__ and would default to None.
  • After the model is fit, use the smaller of the user-specified value and the number of axes calculated during model fit.
  • If the number of axes is reduced, possibly modify and/or normalize the axis weightings to account for this. Each estimator handles this differently.
  • Use the updated number of axes and axis weightings in transform.
  • For these estimators' "wrappers" (i.e. MSNRegression and GNNRegression), test that predict and kneighbors return the expected values. Because yaImpute doesn't support a user-defined option for fewer axes, we will need to create new test files that implement the expected behavior and produce these results.
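
The first, second, and fourth points above could be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the library's implementation: `resolve_n_components` and `project` are hypothetical helper names, the projector is assumed to be a plain features-by-axes matrix, and rejecting negative values is an added assumption. Any reweighting or normalization of axes (the third point) is left out because each estimator handles it differently.

```python
from typing import List, Optional

Matrix = List[List[float]]


def resolve_n_components(requested: Optional[int], fitted_max: int) -> int:
    """Return the number of axes to keep (hypothetical helper).

    None means "use every axis found during fit"; otherwise the smaller of
    the requested value and the fitted maximum is used.
    """
    if requested is None:
        return fitted_max
    if requested < 0:
        # Assumption: negative requests are treated as an error.
        raise ValueError("n_components must be non-negative")
    return min(requested, fitted_max)


def project(X: Matrix, projector: Matrix, n_components: int) -> Matrix:
    """Project X (samples x features) onto the first n_components axes.

    projector is assumed to be a features x axes matrix.
    """
    return [
        [
            sum(x * row[j] for x, row in zip(sample, projector))
            for j in range(n_components)
        ]
        for sample in X
    ]


# Toy data: two samples, two features, two fitted axes.
X = [[1.0, 2.0], [3.0, 4.0]]
projector = [[1.0, 0.0], [0.0, 2.0]]

n_kept = resolve_n_components(requested=1, fitted_max=2)
X_reduced = project(X, projector, n_kept)  # one column per retained axis
```

Keeping the resolution step separate from the projection means `transform` only has to slice the fitted matrix, so the fitted state itself does not change when a user asks for fewer axes.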
@grovduck grovduck added enhancement New feature or request testing Related to test code or data estimator Related to one or more estimators labels May 31, 2023
@grovduck grovduck added this to the Core Estimators milestone Jun 7, 2023
@grovduck grovduck self-assigned this Jun 7, 2023
grovduck added a commit that referenced this issue Jun 28, 2023
This feature introduces a new mixin (ComponentReducerMixin) to handle
dimension reduction for transformers which create "axes" as linear
combinations of their X features (presently CCATransformer and
CCorATransformer).  For these transformers, there is an upper limit
on the number of axes that can be used, set either by matrix rank or
by tests that determine the significance of axes.  The user can
specify a hyperparameter (n_components) on these transformers
to use fewer than the maximum number of axes, which in turn changes
the n-dimensional space for NN finding.

A few relevant changes:
- CCA and CCorA ordination classes now have a "max_components" property
  to ensure consistency so that either can be used in conjunction
  with ComponentReducerMixin
- CCA has been refactored to combine the coefficients and axis_weights
  into a single matrix, which has been moved into the projector method
  (similar to CCorA).
- Both CCA and CCorA's projector methods now accommodate a parameter
  for n_components
- Tests have been added to ensure that transformed X data has the
  correct number of components and that errors are raised if the
  user-specified value is outside the range 0 to max_components.
grovduck added a commit that referenced this issue Jul 4, 2023
This feature introduces a new mixin (`ComponentReducerMixin`) to handle dimension reduction for transformers which create "axes" as linear combinations of their X features (presently `CCATransformer` and `CCorATransformer`). For these transformers, there is an upper limit on the number of axes that can be used, set either by matrix rank or by tests that determine the significance of axes. The user can specify a hyperparameter (`n_components`) on these transformers to use fewer than the maximum number of axes, which in turn changes the n-dimensional space for NN finding.

A few relevant changes:
- `CCA` and `CCorA` ordination classes now have a `max_components` property to ensure consistency so that either can be used in conjunction with `ComponentReducerMixin`
- `CCA` has been refactored to combine the coefficients and `axis_weights` into a single matrix, which has been moved into the `projector` method (similar to `CCorA`).
- Both `CCA` and `CCorA`'s `projector` methods now accommodate a parameter for `n_components`
- Tests have been added to ensure that transformed X data has the correct number of components and that errors are raised if the user-specified value is outside the range 0 to `max_components`.
- Tests have been added to ensure that `GNNRegressor` and `MSNRegressor` use the correct number of components in `kneighbors` and `predict`.


---------

Co-authored-by: Aaron Zuspan <50475791+aazuspan@users.noreply.github.com>
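
As a rough illustration of the behavior these commit messages describe (a `max_components` property, a `projector` method that accepts `n_components`, and an error for out-of-range values), here is a minimal sketch. `ToyOrdination` and `ComponentLimitSketch` are hypothetical names invented for this example and do not reproduce `ComponentReducerMixin` or the real ordination classes. Treating both ends of the range 0 to `max_components` as valid is also an assumption.

```python
from typing import List, Optional

Matrix = List[List[float]]


class ToyOrdination:
    """Hypothetical stand-in for an ordination class such as CCA or CCorA."""

    def __init__(self, coefficients: Matrix) -> None:
        # Rows are X features, columns are axes.
        self.coefficients = coefficients

    @property
    def max_components(self) -> int:
        # Upper limit on usable axes for this toy ordination.
        return len(self.coefficients[0])

    def projector(self, n_components: int) -> Matrix:
        # Keep only the first n_components columns (axes).
        return [row[:n_components] for row in self.coefficients]


class ComponentLimitSketch:
    """Hypothetical mixin-style helper; not the actual ComponentReducerMixin."""

    def __init__(self, n_components: Optional[int] = None) -> None:
        self.n_components = n_components

    def resolve_n_components(self, ordination: ToyOrdination) -> int:
        upper = ordination.max_components
        # None means "use every available axis".
        n = upper if self.n_components is None else self.n_components
        if not 0 <= n <= upper:
            raise ValueError(f"n_components={n} is outside the range 0 to {upper}")
        return n


ordination = ToyOrdination([[0.5, 0.1, 0.0], [0.2, 0.7, 0.3]])
n_kept = ComponentLimitSketch(n_components=2).resolve_n_components(ordination)
reduced = ordination.projector(n_kept)  # 2 features x 2 retained axes
```

Because both toy pieces rely only on `max_components` and `projector`, any ordination exposing those two members could be swapped in, which is the kind of consistency the first bullet in the commit message refers to.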
@grovduck
Member Author

grovduck commented Jul 4, 2023

Resolved via #40

@grovduck grovduck closed this as completed Jul 4, 2023