Move datasets from `mong_gap` and `gromov_wasserstein` notebooks to `datasets.py` file; add conditional distribution sampling to the `gromov_wasserstein` notebook. #467

theouscidda6 · 2023-11-21T16:32:09Z

This pull-request proposes to:

- Move the datasets from the notebooks `Monge_Gap.ipynb` and `gromov_wasserstein.ipynb` into `datasets.py`. To do so, we create two new dataset classes: `SklearnDistribution` and `SortedSpiral`.
- Update the end of `gromov_wasserstein.ipynb` to sample from the conditional distributions of the discrete Gromov-Wasserstein coupling, instead of recovering a one-to-one matching that associates each point with the one it is most strongly coupled to.
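
To illustrate the second point, here is a minimal sketch of the difference between the two strategies. It is written in plain Python and is not the notebook's code: the toy matrix `coupling` and the helpers `argmax_matching` and `sample_conditionals` are hypothetical names introduced only for this example. Row `i` of the coupling, once normalized, is treated as the conditional distribution over target points given source point `i`.

```python
import random
from typing import List, Sequence


def argmax_matching(coupling: Sequence[Sequence[float]]) -> List[int]:
  """Map each source point i to the target j it is most strongly coupled to."""
  return [max(range(len(row)), key=row.__getitem__) for row in coupling]


def sample_conditionals(
    coupling: Sequence[Sequence[float]], seed: int = 0
) -> List[int]:
  """Draw one target index j per source point i from P[i, :] / sum_j P[i, j]."""
  rng = random.Random(seed)
  samples = []
  for row in coupling:
    # random.choices normalizes the weights, so the row mass need not be 1.
    samples.append(rng.choices(range(len(row)), weights=row, k=1)[0])
  return samples


# Toy 3 x 3 coupling (hypothetical values); rows index source points.
coupling = [
    [0.30, 0.03, 0.00],
    [0.00, 0.20, 0.14],
    [0.00, 0.00, 0.33],
]
matching = argmax_matching(coupling)
sampled = sample_conditionals(coupling, seed=0)
```

The arg-max matching always returns the same target for a given row, whereas the sampled indices can land on any target with non-zero mass in that row, so the spread of the coupling stays visible.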

review-notebook-app · 2023-11-21T16:32:14Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

codecov · 2023-11-21T17:34:49Z

Codecov Report

Merging #467 (82b8fc3) into main (b50719b) will decrease coverage by 0.60%.
Report is 1 commits behind head on main.
The diff coverage is 40.00%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #467      +/-   ##
==========================================
- Coverage   90.45%   89.86%   -0.60%     
==========================================
  Files          58       58              
  Lines        6349     6423      +74     
  Branches      614      903     +289     
==========================================
+ Hits         5743     5772      +29     
- Misses        467      512      +45     
  Partials      139      139

| Files | Coverage Δ |
|---|---|
| src/ott/datasets.py | `58.40% <40.00%> (-36.47%)` ⬇️ |

marcocuturi · 2023-11-28T14:52:17Z

is this ready for review?

theouscidda6 · 2023-11-28T15:07:52Z

> is this ready for review?

I think so. The previous doc errors were due to the hackathon refactoring. It should be fine now.

michalk8 · 2023-11-28T15:14:03Z

> I think so. The previous doc errors were due to the hackathon refactoring. It should be fine now.

Should be fixed soon! once it's done, can you please rebase? Will ping you here.

theouscidda6 · 2023-11-28T15:16:02Z

> > I think so. The previous doc errors were due to the hackathon refactoring. It should be fine now.
>
> Should be fixed soon! once it's done, can you please rebase? Will ping you here.

Of course, I'll be waiting for your feedback.

marcocuturi · 2023-11-28T15:17:12Z

docs/tutorials/Monge_Gap.ipynb

@@ -1,5 +1,15 @@
 {


remove

Reply via ReviewNB

marcocuturi · 2023-11-28T15:17:12Z

docs/tutorials/Monge_Gap.ipynb

@@ -1,5 +1,15 @@
 {


update

Reply via ReviewNB

marcocuturi · 2023-11-28T15:17:12Z

docs/tutorials/Monge_Gap.ipynb

@@ -1,5 +1,15 @@
 {


Line #10: `model = models.models.MLP(`
Maybe import `models` to avoid `models.models`?

Reply via ReviewNB

Now it should be `from ott.neural import models; model = models.MLP`

review-notebook-app · 2023-11-28T15:17:14Z

View / edit / reply to this conversation on ReviewNB

marcocuturi commented on 2023-11-28T15:17:14Z
----------------------------------------------------------------

unfortunately this notation is not introduced elsewhere? either add equation to define it or remove

review-notebook-app · 2023-11-28T15:17:15Z

View / edit / reply to this conversation on ReviewNB

marcocuturi commented on 2023-11-28T15:17:15Z
----------------------------------------------------------------

maybe add a comment on the fact that we expect the optimal alignment matrix to be somewhat close to the identity.

review-notebook-app · 2023-11-28T15:17:16Z

View / edit / reply to this conversation on ReviewNB

marcocuturi commented on 2023-11-28T15:17:16Z
----------------------------------------------------------------

same, here $\pi^\star_\varepsilon$ needs to be defined somewhere.

review-notebook-app · 2023-11-28T15:17:17Z

View / edit / reply to this conversation on ReviewNB

marcocuturi commented on 2023-11-28T15:17:17Z
----------------------------------------------------------------

can we reduce the alpha of gray points?

marcocuturi · 2023-11-29T13:29:57Z

I think all is fixed, so this should be ready to merge once the modifications are taken into account!

michalk8

Thanks @theouscidda6 , I will review the changes in the notebook later!

michalk8 · 2023-12-04T11:18:37Z

pyproject.toml

@@ -87,6 +87,7 @@ docs = [
    "sphinxcontrib-bibtex>=2.5.0",
    "sphinxcontrib-spelling>=7.7.0",
    "myst-nb>=0.17.1",
+    "scikit-learn>=1.0",


This is not necessary here, as scikit-learn is not needed when building the docs.

This should be in the normal requirements instead.

michalk8 · 2023-12-04T11:22:00Z

src/ott/datasets.py

+    """Random sample generator from a Sklearn distribution.
+
+    Returns:
+    A generator of samples from the Sklearn distribution.


Dedent please.

michalk8 · 2023-12-04T11:22:08Z

src/ott/datasets.py

+    return self._create_sample_generators()
+
+  def __post_init__(self):
+


Remove empty space.

michalk8 · 2023-12-04T11:22:23Z

src/ott/datasets.py

+    """
+    return self._create_sample_generators()
+
+  def __post_init__(self):


Missing -> None:

michalk8 · 2023-12-04T11:23:37Z

src/ott/datasets.py

+  init_rng: jax.Array
+  dim_data: int = 2
+  theta_rotation: float = 0.0
+  offset: Optional[jnp.ndarray] = None


I guess this doesn't need to be `None`; it can just default to 0. I would use `offset: Union[float, jnp.ndarray] = 0.0`

michalk8 · 2023-12-04T11:40:53Z

src/ott/datasets.py

+    """
+    return self._create_sample_generators()
+
+  def __post_init__(self):


Add a check in this function that `self.name` is one of the valid values, e.g.:

assert self.name in ("moon", etc.), self.name
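
A minimal sketch of such a check, under stated assumptions: the class `NamedDistribution` and the tuple `_VALID_NAMES` are hypothetical stand-ins, and only `"moon"` is a name that appears in this review. The sketch raises `ValueError` rather than using `assert`, because assertions are skipped when Python runs with `-O`; an `assert` as suggested above behaves the same way otherwise. It also carries the `-> None` annotation requested in the earlier comment.

```python
import dataclasses

# Hypothetical set of accepted names; only "moon" appears in the review.
_VALID_NAMES = ("moon", "s_curve", "swiss")


@dataclasses.dataclass
class NamedDistribution:
  """Hypothetical stand-in for the dataset class under review."""
  name: str

  def __post_init__(self) -> None:
    # Fail early and show the offending value in the error message.
    if self.name not in _VALID_NAMES:
      raise ValueError(
          f"Unknown name {self.name!r}, expected one of {_VALID_NAMES}."
      )
```

With this in place, `NamedDistribution("moon")` constructs normally, while an unknown name fails at construction time instead of deep inside the sampling code.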

michalk8 · 2023-12-04T11:41:19Z

src/ott/datasets.py

+    Returns:
+    A generator of samples from the Sklearn distribution.
+    """
+    return self._create_sample_generators()


Also consider removing the private function and implementing everything directly in the __iter__ method.
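
For reference, a sketch of what this could look like, assuming a simplified sampler: `ToySampler` and its fields are hypothetical, and standard-library Gaussian draws stand in for the real sampling logic. Since `__iter__` contains a `yield`, it is itself a generator function and no private helper is needed.

```python
import dataclasses
import itertools
import random
from typing import Iterator, List


@dataclasses.dataclass
class ToySampler:
  """Hypothetical sampler; yields batches of 1-D Gaussian samples forever."""
  batch_size: int = 4
  seed: int = 0

  def __iter__(self) -> Iterator[List[float]]:
    # Because this method contains `yield`, calling iter(sampler) returns a
    # fresh generator each time; no private helper is required.
    rng = random.Random(self.seed)
    while True:
      yield [rng.gauss(0.0, 1.0) for _ in range(self.batch_size)]


# Take the first three batches from the infinite stream.
batches = list(itertools.islice(ToySampler(batch_size=4, seed=0), 3))
```

One consequence of this design is that every call to `iter()` restarts the stream from the same seed, which may or may not be the desired behavior for the real class.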

michalk8 · 2023-12-04T11:43:58Z

src/ott/datasets.py

@@ -105,6 +110,168 @@ def _create_sample_generators(self) -> Iterator[jnp.array]:
      yield samples


+@dataclasses.dataclass
+class SklearnDistribution:


Please add this and all the newly created functions to the docs in docs/datasets.rst; I think you will need to create this file. Please have a look at docs/utils.rst to see how it's done.

michalk8 · 2023-12-04T11:44:52Z

src/ott/datasets.py

+            random_state=seed,
+            noise=self.std_noise,
+        )
+        samples = x[:, [2, 0]] if self.dim_data == 2 else x


Any reason for swapping the axes? If yes, please add a comment.

michalk8 · 2023-12-04T11:45:42Z

src/ott/datasets.py

+          )
+      ),
+  )
+  dim_data = 2


Why is this hardcoded here? I would consider removing this and just returning the 2 datasets.

michalk8 · 2023-12-04T14:29:39Z

src/ott/datasets.py

+    valid_batch_size: int = 256,
+    rng: Optional[jax.Array] = None,
+) -> Tuple[Dataset, Dataset, int]:
+  """Sklearn samplers for :class:`~ott.solvers.nn.neuraldual.W2NeuralDual`.


This link is no longer correct.

michalk8 · 2023-12-04T14:52:20Z

docs/tutorials/Monge_Gap.ipynb

@@ -1,5 +1,15 @@
 {


Line #15: `from ott.neural.solvers import losses, map_estimator`
This needs to be adjusted to:
`from ott.neural import losses`
`from ott.neural.solvers import map_estimator`

Reply via ReviewNB

michalk8 · 2023-12-04T14:52:20Z

docs/tutorials/Monge_Gap.ipynb

@@ -1,5 +1,15 @@
 {


Links such as `ott.solvers.nn.losses.monge_gap`, `MapEstimator` and `W2NeuralDual` need to be adjusted.

Reply via ReviewNB

michalk8 · 2023-12-04T14:52:20Z

docs/tutorials/Monge_Gap.ipynb

@@ -1,5 +1,15 @@
 {


Remove this please

Reply via ReviewNB

review-notebook-app · 2023-12-04T14:52:22Z

View / edit / reply to this conversation on ReviewNB

michalk8 commented on 2023-12-04T14:52:21Z
----------------------------------------------------------------

mathcing -> matching

michalk8 · 2023-12-19T19:39:14Z

@theouscidda6 what's the status of this PR?

michalk8 · 2024-06-05T15:42:25Z

@theouscidda6 because of the heavy refactoring done in #466, I will be closing this issue.
I will later create a new issue to unify the example datasets present in ott.datasets and the dataset class/dataloaders in ott.neural.datasets.

theouscidda6 added 2 commits November 21, 2023 11:55

move_sklearn_datasets

1717f55

move_ordered_spiral_datasets

d4efcb4

add sklearn to docs in pyproject.py

82b8fc3

marcocuturi reviewed Nov 28, 2023

View reviewed changes

michalk8 assigned theouscidda6 Dec 4, 2023

michalk8 added the enhancement New feature or request label Dec 4, 2023

michalk8 self-requested a review December 4, 2023 11:17

michalk8 requested changes Dec 4, 2023

View reviewed changes

michalk8 reviewed Dec 4, 2023

View reviewed changes

michalk8 closed this Jun 5, 2024
