Commit

Update AAclust().filter_coverage() method documentation
breimanntools committed Jul 1, 2024
1 parent 2ca949c commit d2b0b9a
Showing 2 changed files with 43 additions and 30 deletions.
15 changes: 7 additions & 8 deletions aaanalysis/feature_engineering/_aaclust.py
@@ -621,14 +621,13 @@ def filter_coverage(self,
"""
Select a redundancy-reduced set of numerical scales with defined subcategory coverage.
-This method reduces the number of numerical scales in the feature matrix `X` by clustering them.
-It ensures that the selected clusters cover a minimum percentage (`min_coverage`) of unique subcategories
-in `names_ref`.
-The process involves clustering the scales in `X` and selecting one scale per cluster. The initial number of
-clusters is determined by the number of unique subcategories in `names_ref`. The number of clusters is increased
-step-wise until the overlap (coverage) between the unique elements in `names_ref` and the subcategories of
-the selected scales meets or exceeds the defined threshold (`min_coverage`).
+This method reduces the number of numerical scales in the feature matrix ``X``, while
+ensuring that the selected scales cover a minimum percentage (``min_coverage``) of subcategories.
+The process involves clustering the scales in ``X`` and selecting one scale per cluster. The initial number of
+clusters is determined by the number of unique subcategories in ``names_ref``. The number of clusters is
+increased step-wise until the overlap (coverage) between the unique elements in ``names_ref`` and the
+subcategories of the selected scales meets a defined threshold (``min_coverage``).
Parameters
----------
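
The selection loop described in the docstring can be sketched as follows. This is a minimal illustration under stated assumptions, not the AAclust implementation: the helper names (`coverage`, `filter_coverage_sketch`, `toy_cluster_and_pick`) and the toy data are hypothetical, and the toy clustering simply splits scales sorted by their mean into contiguous groups as a stand-in for a real clustering model.

```python
def coverage(selected_subcats, names_ref):
    """Percentage of unique reference subcategories found among the selected scales."""
    ref = set(names_ref)
    return 100.0 * len(ref & set(selected_subcats)) / len(ref)


def toy_cluster_and_pick(X, n_clusters):
    """Hypothetical stand-in for clustering: sort scales (rows of X) by mean,
    split them into n_clusters contiguous groups, return the first row index of each group."""
    order = sorted(range(len(X)), key=lambda i: sum(X[i]) / len(X[i]))
    size = len(order) / n_clusters
    return [order[int(k * size)] for k in range(n_clusters)]


def filter_coverage_sketch(X, scale_ids, subcats, names_ref,
                           cluster_and_pick=toy_cluster_and_pick, min_coverage=100):
    """Increase the number of clusters step-wise until the selected scales
    cover at least min_coverage percent of the unique entries in names_ref."""
    # Start with as many clusters as there are unique reference subcategories
    n_clusters = len(set(names_ref))
    while True:
        picked = cluster_and_pick(X, n_clusters)  # one scale per cluster
        cov = coverage([subcats[i] for i in picked], names_ref)
        # Stop once the threshold is met or every scale is its own cluster
        if cov >= min_coverage or n_clusters >= len(X):
            return [scale_ids[i] for i in picked], cov
        n_clusters += 1


# Toy data (assumed for illustration): six scales, one row each
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 5.0], [6.0, 6.0]]
scale_ids = ["s1", "s2", "s3", "s4", "s5", "s6"]
subcats = ["A", "C", "A", "B", "B", "A"]
names_ref = ["A", "B", "C"]

selected, cov = filter_coverage_sketch(X, scale_ids, subcats, names_ref, min_coverage=100)
print(selected, cov)
```

In this toy run, three clusters cover only two of the three reference subcategories, so the loop adds a fourth cluster before the threshold is reached. A lower `min_coverage` would stop earlier and return fewer scales.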
