-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matrix refactoring #183
Matrix refactoring #183
Conversation
This moving OrderedMatrix to the bottom of the three places them in the order they are depended on by the one above. This prepares the way for letting each take responsibility for their own dependencies rather than having that centralized in TransformedMatrix.
Let _MatrixWithHidden create its own dependencies.
For better clarity, but also to prepare for what comes next.
After prior refactorings, TransformedMatrix became a straight pass-through to _MatrixWithHidden and its methods could be replaced directly with those from _MatrixWithHidden and the latter class removed.
Since everything now has access to unordered_matrix, there's no reason to go to the start of the pipeline to get those values.
I'm on the fence about this one as the +/- is so close, but thought I'd leave it in as a worthwhile extraction.
As a precursor to reducing vector interleaving to a sort, provide interleaved vectors with an `.ordering` property. The (position, idx, vector) value returned can be sorted to produce vectors in interleaved order.
.iter_rows and ._iter_columns can now be easily subsumed into .rows and .columns respectively. Rename ._all_inserted_rows/columns to just ._inserted_rows/columns
Give _BaseMatrixInsertionVector what it needs on construction to avoid circular dependence on the matrix that constructed them.
Collapse few remaining distinct properties of _OrderedMatrix directly into _MatrixWithInsertions.
Merge remaining properties of _MatrixWithInsertions directly into TransformedMatrix. Merge subclass properties into TransformedMatrix since there's only the one subclass now. Rename ._rows_ind and ._cols_ind to the more descriptive ._visible_rows_mask and ._visible_cols_mask.
These two methods extract the interleaving boilerplate that will be reused for each _AssembledVector measure.
a5deaa6
to
d5040ce
Compare
It turns out the `fbase()` function is the same in all cases. Move that logic into `._apply_interleaved()` and simplify callers.
Hi @slobodan-ilic I think this one is ready to go. It gets the same seven test failures on exporter that master does, so I expect those are unresolved integrations for the new pval work Ernesto has done. Anyway, let me know what you think. I was able to trim out a lot of code and I think clarify things along the way, but I'll let you be the judge of that :) |
_AssembledVector(row, opposing_insertions, 0 if idx < 0 else idx) | ||
for _, idx, row in sorted( | ||
itertools.chain( | ||
(row.ordering for row in self._inserted_rows), | ||
(row.ordering for row in self._base_rows), | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slobodan-ilic @ernestoarbitrio It smells to me like a bug that an assembled row that is an insertion row relies on its idx value to be 0. Why would it be that the the "intersection" pval would always be the first value of .pvals
on the intersecting column? I suppose if that .pvals
was always a sequence of exactly one value that would make sense of it, but it sounds a little odd to me and at the very least is obscure without a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably indicative of needing two extractions (either as classes or as mechanisms within one class). The assembled row that is not an insertion row itself needs this index. The assembled row that is an insertion row, doesn't need it (but currently has it defaulting to 0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the interesting thing is that at least one test fails (on expectation doesn't match) if you replace 0 if idx < 0 else idx
with idx
. idx
always has a legitimate value, it's just end-based (like -2) and indexes into the insertions-collection if the vector is an insertion.
I took a quick look, and it appears this idx
value is directly involved in the pvals
calculation for an insertion, which just seems odd to me. I can't think of what calculation would need a col_idx but work correctly with the same value (0) for all insertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ernestoarbitrio can you weigh in on this one? ^^^
Some refactorings to trim-down and clarify transform-application matrix objects.