
Refactor learning module #1873

Merged
merged 30 commits into master from learning-refactor Jul 11, 2023

Conversation


@luisfpereira luisfpereira commented Jun 29, 2023

This PR refactors the learning module so that an estimator consistently uses space (instead of space and/or metric).


Additionally, some estimators or optimizers were "promoted" to their own class. In particular:

  • in FrechetMean

    • optimizers: BaseGradientDescent, GradientDescent, BatchGradientDescent, and AdaptiveGradientDescent

    • algorithms: FrechetMean, CircleMean, ElasticMean, LinearMean

  • in GeodesicRegression

    • optimizers: RiemannianGradientDescent

  • in ExponentialBarycenter

    • optimizers: GradientDescent

Notice these classes are not created from scratch, but carved out of already-existing code.

Regarding the optimizers, the main goal is to make it easier to control hyperparameters (following what has been done in geomstats.numerics.optimizers), as well as to look for common abstractions we may be missing (for now there's nothing very obvious, but it is nice to see adaptations of gradient descent appearing "everywhere" -> maybe later we can merge them). See the sketch of the pattern right after this paragraph.
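
For concreteness, here is a minimal sketch of the promoted-optimizer pattern, assuming the usual geomstats metric API (`exp`/`log`/`norm`); the class contents, default values, and the loop body are illustrative assumptions, not the exact code of this PR:

```python
class BaseGradientDescent:
    """Hyperparameters shared by the gradient-descent variants."""

    def __init__(self, max_iter=32, epsilon=1e-4, init_step_size=1.0):
        self.max_iter = max_iter
        self.epsilon = epsilon
        self.init_step_size = init_step_size


class GradientDescent(BaseGradientDescent):
    """Default variant: a schematic Karcher-flow loop."""

    def minimize(self, space, points):
        mean = points[0]  # start from the first sample
        for _ in range(self.max_iter):
            # Mean of the log maps of the samples at the current estimate.
            tangent_mean = space.metric.log(points, mean).mean(axis=0)
            if space.metric.norm(tangent_mean, mean) < self.epsilon:
                break
            mean = space.metric.exp(self.init_step_size * tangent_mean, mean)
        return mean
```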

Regarding the estimators, it makes it easier to identify which algorithm we are actually using. Notice the API is not impacted: e.g. to use CircleMean, it suffices to pass a circle to FrechetMean, as in the usage sketch below.
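
A usage sketch (the dispatch to CircleMean happens internally, so the import and call sites stay unchanged):

```python
from geomstats.geometry.hypersphere import Hypersphere
from geomstats.learning.frechet_mean import FrechetMean

circle = Hypersphere(dim=1)  # the circle S^1
data = circle.random_uniform(n_samples=10)

mean = FrechetMean(circle)  # expected to resolve to the CircleMean algorithm
mean.fit(data)
print(mean.estimate_)
```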

I've added the method set to set parameters of e.g. an optimizer used within an estimator. This serves two purposes:

  1. it simplifies the __init__ method, since such parameters can no longer be defined there
  2. it solves the issue of having parameters that are silently ignored: e.g. GeodesicRegression with method="extrinsic" requires different parameters than with method="intrinsic". set verifies that the parameter we are trying to set actually exists, so it can prevent bugs.

I've made set return self such that the following snippet of code is valid (this is very common in OOP, but not seen very often in Python -> let me know how natural it feels to you):

estimator = GeodesicRegression(space, method="riemannian").set(max_iter=20)
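
A minimal sketch of what set could look like, assuming the tunable parameters live on an optimizer attribute (the attribute name and error handling are assumptions, not the exact implementation):

```python
class ChainableEstimator:
    """Illustrative base class; `optimizer` is an assumed attribute name."""

    def __init__(self, optimizer):
        self.optimizer = optimizer

    def set(self, **kwargs):
        """Set optimizer parameters, failing on names that do not exist."""
        for name, value in kwargs.items():
            if not hasattr(self.optimizer, name):
                raise ValueError(f"Unknown parameter: {name}.")
            setattr(self.optimizer, name, value)
        return self  # returning self enables the chaining shown above
```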


I've added a note with "required metric methods" to the docstrings of most of the estimators, so that we can understand the minimum we need to define in order to run them.
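
For illustration, such a note might read as follows (the wording and the method list are a sketch, not an exact docstring from the PR):

```python
class SomeEstimator:
    """Some estimator (illustration only).

    Notes
    -----
    * Required metric methods: `exp`, `log`.
    """
```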


In TangentPCA and ToTangentPCA I use the notion of _geometry in order to also be able to use the space to compute the exp: if a space is equipped, geometry = space.metric; otherwise, geometry = space, and the relevant methods (e.g. exp) are called on geometry.
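
A hedged sketch of that selection logic (the helper name and the equipped-check are assumptions based on the description above):

```python
def _choose_geometry(space):
    """Pick the object whose exp/log the estimator will call."""
    if getattr(space, "metric", None) is not None:
        return space.metric  # equipped space: delegate to the metric
    return space  # unequipped space: use the space's own exp
```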

A lot of other little things were updated, but mostly for consistency among estimators.

There's still work that needs to be done in this part, especially related to inheritance.

EDIT: tests are successful now!

closes #715


@luisfpereira luisfpereira marked this pull request as ready for review July 10, 2023 16:22

codecov bot commented Jul 11, 2023

Codecov Report

Merging #1873 (f6b96ff) into master (4f828b6) will decrease coverage by 3.01%.
The diff coverage is 91.03%.

@@            Coverage Diff             @@
##           master    #1873      +/-   ##
==========================================
- Coverage   91.44%   88.43%   -3.01%     
==========================================
  Files         141      141              
  Lines       13450    13570     +120     
==========================================
- Hits        12298    11999     -299     
- Misses       1152     1571     +419     
| Flag | Coverage Δ |
| --- | --- |
| autograd | 87.66% <91.03%> (?) |
| numpy | ? |
| pytorch | 82.35% <69.03%> (+0.02%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| geomstats/learning/aac.py | 45.60% <ø> (-52.80%) ⬇️ |
| geomstats/learning/expectation_maximization.py | 81.12% <80.15%> (+1.23%) ⬆️ |
| geomstats/learning/preprocessing.py | 86.54% <81.49%> (+3.78%) ⬆️ |
| geomstats/learning/geodesic_regression.py | 83.61% <86.96%> (+3.74%) ⬆️ |
| geomstats/learning/mdm.py | 97.73% <92.31%> (-2.27%) ⬇️ |
| geomstats/learning/riemannian_mean_shift.py | 94.03% <92.60%> (+0.28%) ⬆️ |
| geomstats/learning/kalman_filter.py | 95.35% <93.88%> (-0.51%) ⬇️ |
| geomstats/learning/wrapped_gaussian_process.py | 97.78% <94.74%> (+1.49%) ⬆️ |
| geomstats/learning/frechet_mean.py | 95.74% <95.08%> (-0.28%) ⬇️ |
| geomstats/learning/kmeans.py | 89.42% <96.43%> (+3.70%) ⬆️ |
| ... and 10 more | |

... and 26 files with indirect coverage changes



@ninamiolane ninamiolane left a comment


So much cleaner and more interpretable, thank you so much!!

Two remarks:

  • the optimizers (e.g. BaseGradientDescent) currently live in the files where they're used (e.g. in frechet_mean): since the optimization logic is now abstracted, would it make sense to have a separate folder, in the spirit of what you did for the numerics folder?
  • pymanopt provides optimizers: could we make our abstraction of optimizers compatible with theirs, so that we can use their optimizers on demand?

These two remarks are not necessarily for this PR; we can talk about them in another one, if they make sense.

Thanks again!!

@luisfpereira

@ninamiolane, the optimizers are still particular to the cases where they're defined (that's why I keep them there). The goal in the medium term would be to see if we can find a common abstraction and then move them to e.g. numerics.

Having a seamless integration with pymanopt would be great. I think we can attempt this soon.

@luisfpereira luisfpereira merged commit 78025da into geomstats:master Jul 11, 2023
12 of 15 checks passed
@luisfpereira luisfpereira deleted the learning-refactor branch July 11, 2023 15:28
Labels
learning, refactoring
Development

Successfully merging this pull request may close these issues.

(nkoep): Move this into the OnlineKMeans class.