WIP: ENH,TST: partial fit implementation with tests #47

eiviani-lanl wants to merge 33 commits into main from
Conversation
Thanks, I've added the
Force-pushed 4e61c9b to d98778d
I just resolved the merge conflicts and force-pushed. Let me see if I can at least take a quick look over.
```python
ff_model.fit(X, y)

classes = np.unique(y)
batch = 25
```
I noticed that if I change the batch size to 17 locally, one of the Moore-Penrose cases ends up with a numerical issue that causes a failure.
That may be related to the requirement for well-conditioned matrices you note above. We'll need to be clear about the limitations/weaknesses of partial_fit, and double-check that they are things we cannot improve. One useful exercise might be to check this test with, e.g., MLPClassifier to see if its partial_fit is a bit more robust in this regard.
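The cross-check suggested above could look roughly like this (a sketch only: the dataset parameters and network size are illustrative, not the project's actual test settings; batch size 17 is the value that triggered the failure locally):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Illustrative data; the real test's make_classification arguments may differ.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
classes = np.unique(y)

clf = MLPClassifier(hidden_layer_sizes=(10,), random_state=0)

batch = 17  # the batch size that triggered the numerical issue locally
for start in range(0, len(X), batch):
    end = start + batch
    # classes must be passed so partial_fit knows the full label set up front
    clf.partial_fit(X[start:end], y[start:end], classes=classes)

print(clf.score(X, y))  # check that incremental training did not blow up numerically
```

If MLPClassifier survives the same batching without numerical trouble, that would suggest our partial_fit solve, not the data, is the fragile part.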
(and of course, to assess if using rtol + partial_fit actually allows us to overcome many of these issues, or not)
@eiviani-lanl this looks like it still needs a response
Hmm, I can't reproduce this locally. I tried reverting the solver to the original np.linalg.pinv, as well as reverting the n_redundant argument in make_classification to 2. Can you still reproduce this error with the most recent version of model.py and test_model.py?
```python
# NOTE: for Moore-Penrose, a large singular value
# cutoff (rcond) is required to achieve reasonable accuracy
# with the Wisconsin breast cancer dataset.
# Without rtol, accuracy ~= 0.6316
```
Excellent, that's good to know. Does this apply only to partial_fit() on that dataset, or to the full fit() as well? Based on this, I wonder if it might be beneficial to:
- Perhaps add a `Notes` docstring section to the `partial_fit()` function(s) as appropriate, to concisely express the potential importance of setting `rtol` appropriately? I can't remember if we already do that for `fit()` proper, which of course has the same issue with Moore-Penrose.
- Perhaps also move some of the math details you describe in a few places into the `Notes` section of the `partial_fit()` docstring. I believe you currently have that info as a detailed code comment that an end user wouldn't see without reading our source. Not sure we want quite that much detail, but it may be useful to describe at least the core algorithm used.
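A sketch of what such a `Notes` section might look like (the wording and structure here are illustrative only, not the project's final text):

```python
# Hypothetical docstring sketch -- wording is illustrative, not final.
def partial_fit(self, X, y):
    """
    Train the gradient-free neural network on the batched training set (X, y).

    Notes
    -----
    The Moore-Penrose (exact) solve can be sensitive to the singular value
    cutoff: on some datasets (e.g., the Wisconsin breast cancer dataset),
    leaving ``rtol`` at its default gives poor accuracy, so ``rtol`` may
    need to be tuned. The same caveat applies to ``fit()`` when it uses
    the Moore-Penrose solve.
    """
```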
@eiviani-lanl I think this review comment still needs a response
I'll add a "WIP" (work in progress) to the title here since some merges aren't done correctly yet (e.g., old dependencies were accidentally added back).
Force-pushed 83fe2f4 to e651a0d
Did not address structural comments; I think I'll handle those in a separate PR.
* Changed the solver from `pinv`/`solve` to `lstsq`, as it's more robust for ill-conditioned matrices.
* I don't think it's possible to use `Ridge` in `partial_fit()`, at least from my experimentation. This does reduce solver flexibility, but I can attempt to implement other solvers by hand in a future release.
* Combined the ensemble/classifier tests.
* Added a test for `partial_fit` regression performance on the Boston dataset.
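The motivation for preferring `lstsq` can be seen on a nearly rank-deficient system: forming the gram matrix squares the condition number, while `lstsq` handles the near-degeneracy directly (a toy sketch, not the project's code):

```python
import numpy as np

# Nearly rank-deficient system: the second column is (almost) a copy of the first.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-14],
              [2.0, 2.0]])
y = np.array([1.0, 1.0, 2.0])

# The normal-equations route squares the ill-conditioning of A in A.T @ A ...
gram = A.T @ A
print(np.linalg.cond(gram))  # enormous condition number

# ... while lstsq copes with the (near-)rank deficiency gracefully.
coef, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(coef)      # a well-behaved minimum-norm solution
print(A @ coef)  # reproduces y to high accuracy
```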
tylerjereddy left a comment
@eiviani-lanl Ok, I did another round of detailed review. Some notes:
- I added some reminders to respond to inline review comments -- a fair bit more detail would be helpful in some places. Some changes may make sense to make right away; others you may defer to issues, but in those cases please open the issues and refer to them in written self-review comments so it is clear to the reviewer that you're coordinating those matters.
- Moving forward, if you don't mind, leave reviewer comments unresolved -- it helps to let the reviewer decide what's resolved (in this case it's "ok", I just checked them all myself).
- Overall, it looks pretty good -- I'd say I'm a bit unclear on the reasoning for the adjusted solving approaches, and I did add some review comments about justifying/explaining those with crystal clarity to the user/reader in the docstrings.
```diff
-@pytest.mark.parametrize("activation", activations)
-@pytest.mark.parametrize("weight_scheme", weights)
+@pytest.mark.parametrize("activation", ACTIVATIONS.keys())
+@pytest.mark.parametrize("weight_scheme", list(WEIGHTS.keys()))
```
Here and elsewhere -- I can't quite work out why you need to perform the list() coercion for one dictionary and not the other. When I remove the coercions locally, the same number of tests run and pass, so unless there's a compelling reason for them, it's probably best to remove the spurious coercions.
This has been fixed.
```python
@pytest.mark.parametrize("hidden_layer_sizes", [(10,), (5, 5)])
@pytest.mark.parametrize("n_classes", [2, 5])
@pytest.mark.parametrize("activation", ACTIVATIONS.keys())
@pytest.mark.parametrize("weight_init", list(WEIGHTS.keys())[1:])
```
This particular coercion is "ok", since you can't slice the dict keys directly.
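For reference, the reason this one coercion is needed: `dict.keys()` returns a view that supports iteration (so parametrize can consume it directly) but not indexing or slicing. A minimal demonstration, with illustrative dictionary contents:

```python
# Illustrative contents only -- the real WEIGHTS dict differs.
WEIGHTS = {"default": None, "he": "he_init", "xavier": "xavier_init"}

# Views iterate fine, so parametrize can consume them directly ...
assert list(WEIGHTS.keys()) == ["default", "he", "xavier"]

# ... but they cannot be sliced, so [1:] requires a list() coercion first.
try:
    WEIGHTS.keys()[1:]
except TypeError as exc:
    print(exc)  # 'dict_keys' object is not subscriptable

print(list(WEIGHTS.keys())[1:])  # ['he', 'xavier']
```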
```python
# NOTE: for Moore-Penrose, a large singular value
# cutoff (rcond) is required to achieve reasonable accuracy
# with the Wisconsin breast cancer dataset.
# Without rtol, accuracy ~= 0.6316
```
@eiviani-lanl I think this review comment still needs a response
```python
model.partial_fit(X_train[start:end], y_train[start:end])

actual = model.score(X_test, y_test)
# RandomForestRegressor() with default params scores 0.958 here
```
Should this be RandomForestClassifier?
```python
"Classifier, attr",
[
    (GFDLClassifier, "coeff_"),
    (GFDLClassifier, "coeff_"),
```
Is this an accidental duplication, or am I missing something?
```
-----
The design matrix is incrementally updated by persisting the gram matrix
(D.T @ D) and the moment vector (D.T @ y) for each batch, then adding
the gram and moment contributions of each new batch.
```
As noted elsewhere, a bit of detail about what happens with partial fit vs. full fit for exact and non-exact solves, and why they differ, seems useful to explain clearly.
I've added a few sentences to Notes acknowledging the differences between the full fit and partial fit non-exact solves, and justifying our implementation. The Notes section also covers the differences in the exact solve and the possible need for a lower rtol.
```python
def partial_fit(self, X, y):
    """
    Train the gradient-free neural network on the batched training set (X, y).
```
```python
ff_model.fit(X, y)

classes = np.unique(y)
batch = 25
```
@eiviani-lanl this looks like it still needs a response
```python
    self.coeff_ = np.linalg.pinv(self.A, rcond=self.rtol) @ self.B
else:
    reg = np.identity(self.A.shape[0]) * self.reg_alpha
    self.coeff_ = np.linalg.solve(self.A + reg, self.B)
```
@eiviani-lanl this one probably still needs a written response
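For context on the two branches being discussed, here is a self-contained sketch of solving the normal equations either via a pseudoinverse with an `rcond` cutoff or via a ridge (Tikhonov) regularized `solve`. Variable names mirror the snippet above, but the data, `rtol`, and `reg_alpha` values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 5))  # illustrative design matrix
y = rng.standard_normal(50)

A = D.T @ D          # gram matrix
B = D.T @ y          # moment vector
rtol = 1e-10         # illustrative cutoff
reg_alpha = 1e-3     # illustrative ridge strength

# Exact branch: pseudoinverse with a singular value cutoff.
coeff_pinv = np.linalg.pinv(A, rcond=rtol) @ B

# Regularized branch: the ridge term shifts the spectrum by reg_alpha,
# so A + reg stays invertible even when A itself is singular.
reg = np.identity(A.shape[0]) * reg_alpha
coeff_ridge = np.linalg.solve(A + reg, B)

# With a well-conditioned A and small alpha, the two branches nearly agree.
print(np.max(np.abs(coeff_pinv - coeff_ridge)))
```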
```python
    coef_ = np.linalg.pinv(self.As[i], rcond=self.rtol) @ self.Bs[i]
else:
    reg = np.identity(self.As[i].shape[0]) * self.reg_alpha
    coef_ = np.linalg.solve(self.As[i] + reg, self.Bs[i])
```
@eiviani-lanl would be useful to clarify here with written response
```python
#
# Now suppose the full design matrix is formed by vertically stacking
# two batches, written as [B_1; B_2]:
# [B_1; B_2].T @ [B_1; B_2] = B_1.T @ B_1 + B_2.T @ B_2
```
I would suggest adding a description here to clarify how our partial fit is implemented:

Assuming

Also, add a statement about how you are collecting the batch sums into the gram matrix and moment vector as each batch is processed.
…nsively tests rtol and alpha reg
@eiviani-lanl let me know when this is ready for another round of review. @nray I see you've approved while review comments are not yet fully resolved -- can you reserve approvals for when you have full confidence that the entire PR is ready to go?
The tests are failing for reasons unrelated to the changes here. When you notice that, please proactively open an issue and help the team investigate. I've gotten that process started with gh-82.
…d rtol differences to user-facing partial_fit() notes
…ences from full `fit()`
…to eiviani_partial_fit
@tylerjereddy I think this is ready for review.

partial_fit() using the normal equations

Moore-Penrose pseudoinverse (assuming D has full column rank):

```
D+ = (D.T @ D)^-1 @ D.T
```

Least-squares solution:

```
coeff = D+ @ y = (D.T @ D)^-1 @ D.T @ y
```

We're persisting the gram matrix (D.T @ D) and the moment vector (D.T @ y) and updating them by adding the gram and moment contributions of each consecutive batch.

Update rule: consider the summation representation of matrix multiplication:

```
(B.T @ B)_ij = sum_k (B.T)_ik * B_kj = sum_k B_ki * B_kj
```

Now suppose the full design matrix is formed by vertically stacking two batches, written as [B_1; B_2]. Splitting the sum over k at the batch boundary gives:

```
[B_1; B_2].T @ [B_1; B_2] = B_1.T @ B_1 + B_2.T @ B_2
```
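The update rule above can be checked numerically: accumulating per-batch gram and moment contributions and solving once at the end reproduces the full-batch least-squares solution. A sketch under the assumption that the stacked design matrix has full column rank (data and batch size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
D = rng.standard_normal((100, 4))   # full design matrix (illustrative)
y = rng.standard_normal(100)

# Full-batch least-squares solution for reference.
coef_full, *_ = np.linalg.lstsq(D, y, rcond=None)

# Incremental version: persist the gram (D.T @ D) and moment (D.T @ y),
# adding each batch's contribution -- the update rule derived above.
n_features = D.shape[1]
gram = np.zeros((n_features, n_features))
moment = np.zeros(n_features)
batch = 25
for start in range(0, len(D), batch):
    B_i = D[start:start + batch]
    y_i = y[start:start + batch]
    gram += B_i.T @ B_i
    moment += B_i.T @ y_i

coef_incremental = np.linalg.solve(gram, moment)
print(np.allclose(coef_full, coef_incremental))  # True
```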