Conversation
Important: Review skipped due to path filters. ⛔ Files ignored due to path filters (1)

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters.

Walkthrough: The changes introduce new model layers (…)

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant CLI as CLI (CBOW/MNIST)
    participant Model
    participant DataLoader
    participant Trainer
    User->>CLI: Run CLI command (train/infer/sample)
    CLI->>DataLoader: Load and preprocess data
    CLI->>Model: Initialize or load model
    CLI->>Trainer: (If training) Start training loop
    Trainer->>Model: Forward/Backward propagation
    Trainer->>Model: Update parameters
    CLI->>User: Output results (metrics, visualization, REPL)
```
Actionable comments posted: 9
🔭 Outside diff range comments (1)
mo_net/train/batcher.py (1)
47-48: Critical: Inconsistent batch calculation in StopIteration handler.

The same integer division fix needs to be applied here. Currently, this code still uses the problematic float division that was fixed in the constructor.
```diff
-        self._internal_iterator = zip(
-            iter(np.array_split(self.X, self.train_set_size / self.batch_size)),
-            iter(np.array_split(self.Y, self.train_set_size / self.batch_size)),
-        )
+        num_batches = (self.train_set_size + self.batch_size - 1) // self.batch_size
+        self._internal_iterator = zip(
+            iter(np.array_split(self.X, num_batches)),
+            iter(np.array_split(self.Y, num_batches)),
+        )
```
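As a standalone sanity check (illustrative values, not code from this PR), the ceiling-division expression produces the integer section count `np.array_split` expects, and no samples are dropped:

```python
import numpy as np

train_set_size, batch_size = 10, 3

# Float division would hand np.array_split a non-integer section count;
# ceiling division gives the number of batches needed to cover every sample.
num_batches = (train_set_size + batch_size - 1) // batch_size  # 4

X = np.arange(train_set_size)
batches = np.array_split(X, num_batches)

# np.array_split tolerates uneven splits: sizes are [3, 3, 2, 2] here
assert sum(len(b) for b in batches) == train_set_size
```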
🧹 Nitpick comments (8)
mo_net/model/layer/batch_norm/batch_norm.py (1)
270-277: Minor style issue: Unnecessary parentheses around dictionary argument.

The extra parentheses around the dictionary argument are unnecessary. The dictionary can be passed directly to `update()`:

```diff
-        self._cache.update(
-            {
-                "input_activations": input_activations,
-                "mean": batch_mean,
-                "var": batch_variance,
-                "batch_size": input_activations.shape[0],
-            }
-        )
+        self._cache.update({
+            "input_activations": input_activations,
+            "mean": batch_mean,
+            "var": batch_variance,
+            "batch_size": input_activations.shape[0],
+        })
```

mo_net/model/layer/layer_norm.py (1)
235-249: Minor style issue: Unnecessary parentheses around dictionary argument.

The extra parentheses around the dictionary argument are unnecessary. The dictionary can be passed directly to `update()`:

```diff
-        self._cache.update(
-            {
-                "input_activations": input_activations,
-                "mean": np.mean(
-                    input_activations,
-                    axis=tuple(range(1, input_activations.ndim)),
-                    keepdims=True,
-                ),
-                "var": np.var(
-                    input_activations,
-                    axis=tuple(range(1, input_activations.ndim)),
-                    keepdims=True,
-                ),
-            }
-        )
+        self._cache.update({
+            "input_activations": input_activations,
+            "mean": np.mean(
+                input_activations,
+                axis=tuple(range(1, input_activations.ndim)),
+                keepdims=True,
+            ),
+            "var": np.var(
+                input_activations,
+                axis=tuple(range(1, input_activations.ndim)),
+                keepdims=True,
+            ),
+        })
```

mo_net/model/layer/batch_norm/batch_norm_2d.py (1)
242-249: Minor style issue: Unnecessary parentheses around dictionary argument.

The extra parentheses around the dictionary argument are unnecessary. The dictionary can be passed directly to `update()`:

```diff
-        self._cache.update(
-            {
-                "input_activations": input_activations,
-                "mean": batch_mean,
-                "var": batch_variance,
-                "batch_size": batch_size,
-            }
-        )
+        self._cache.update({
+            "input_activations": input_activations,
+            "mean": batch_mean,
+            "var": batch_variance,
+            "batch_size": batch_size,
+        })
```

mo_net/tests/model/layer/test_average.py (1)
83-83: Add newline at end of file.

```diff
-    Average(input_dimensions=(2, 3), axis=())
+    Average(input_dimensions=(2, 3), axis=())
+
```

mo_net/samples/mnist.py (1)
148-152: Remove duplicate matplotlib calls.

```diff
     plt.tight_layout()
     plt.show()
-    plt.tight_layout()
-    plt.show()
```

mo_net/model/layer/embedding.py (2)
105-105: Consider Python version compatibility for type alias syntax.

The `type` statement is Python 3.12+ syntax. For broader compatibility, consider using the older syntax:

```diff
-type ParametersType = Parameters
+ParametersType = Parameters
```

Or if you want to be more explicit:

```diff
-type ParametersType = Parameters
+from typing import TypeAlias
+ParametersType: TypeAlias = Parameters
```
113-113: Apply same type alias compatibility fix here.

```diff
-type CacheType = Cache
+CacheType = Cache
```

mo_net/samples/cbow.py (1)
34-38: Consider improving token cleaning robustness.

The regex pattern removes punctuation and non-printable characters, but the approach could be more explicit about what constitutes valid tokens.

```diff
-def clean_token(token: str) -> str:
-    """
-    Remove non-printable characters and punctuation
-    """
-    return re.sub(r"[^\w\s]|[^\x20-\x7E]", "", token).lower().strip()
+def clean_token(token: str) -> str:
+    """
+    Remove non-printable characters and punctuation, keeping only alphanumeric and spaces
+    """
+    # Remove non-ASCII and punctuation, keep only word characters and whitespace
+    cleaned = re.sub(r"[^\w\s]", "", token)
+    # Remove non-printable characters
+    cleaned = re.sub(r"[^\x20-\x7E]", "", cleaned)
+    return cleaned.lower().strip()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- mo_net/cli.py (4 hunks)
- mo_net/config.py (0 hunks)
- mo_net/model/layer/__init__.py (2 hunks)
- mo_net/model/layer/average.py (1 hunks)
- mo_net/model/layer/batch_norm/batch_norm.py (1 hunks)
- mo_net/model/layer/batch_norm/batch_norm_2d.py (1 hunks)
- mo_net/model/layer/embedding.py (1 hunks)
- mo_net/model/layer/layer_norm.py (1 hunks)
- mo_net/protos.py (1 hunks)
- mo_net/samples/cbow.py (1 hunks)
- mo_net/samples/mnist.py (1 hunks)
- mo_net/tests/model/layer/test_average.py (1 hunks)
- mo_net/train/batcher.py (1 hunks)
- mo_net/train/trainer/trainer.py (4 hunks)
- mypy_plugins/__init__.py (1 hunks)
- mypy_plugins/d.py (2 hunks)
💤 Files with no reviewable changes (1)
- mo_net/config.py
🧰 Additional context used
🧬 Code Graph Analysis (3)
mo_net/model/layer/__init__.py (1)
mo_net/model/layer/average.py (1)
Average(12-90)
mo_net/tests/model/layer/test_average.py (1)
mo_net/model/layer/average.py (4)
Average (12-90), axis (82-83), serialize (85-90), deserialize (19-28)
mo_net/model/layer/average.py (2)
mo_net/protos.py (2)
D (41-49), d (173-174)

mo_net/model/layer/base.py (1)
layer_id(87-89)
🪛 Ruff (0.11.9)
mo_net/protos.py
13-13: typing.Optional imported but unused
Remove unused import
(F401)
14-14: typing.Type imported but unused
Remove unused import
(F401)
15-15: typing.Union imported but unused
Remove unused import
(F401)
mo_net/samples/mnist.py
97-97: Use a context manager for opening files
(SIM115)
mo_net/samples/cbow.py
309-309: Use a context manager for opening files
(SIM115)
395-395: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
mo_net/model/layer/average.py
[error] 8-8: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 9-9: Cannot import 'mo_net.protos' due to 'invalid syntax (mo_net.protos, line 67)'
(E0001)
[error] 9-9: No name 'protos' in module 'mo_net'
(E0611)
mo_net/model/layer/embedding.py
[error] 105-105: Parsing failed: 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
mo_net/samples/mnist.py
[refactor] 70-70: Too many local variables (29/15)
(R0914)
[refactor] 97-97: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 70-70: Too many statements (52/50)
(R0915)
mo_net/samples/cbow.py
[error] 16-16: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 17-17: Cannot import 'mo_net.model.layer.embedding' due to 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
[error] 17-17: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 18-18: Cannot import 'mo_net.model.layer.linear' due to 'invalid syntax (mo_net.model.layer.linear, line 156)'
(E0001)
[error] 18-18: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 19-19: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 20-20: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 21-21: Cannot import 'mo_net.model.model' due to 'invalid syntax (mo_net.model.model, line 384)'
(E0001)
[error] 21-21: No name 'model' in module 'mo_net.model'
(E0611)
[error] 23-23: Cannot import 'mo_net.protos' due to 'invalid syntax (mo_net.protos, line 67)'
(E0001)
[error] 23-23: No name 'protos' in module 'mo_net'
(E0611)
[error] 28-28: Cannot import 'mo_net.train.trainer.trainer' due to 'invalid syntax (mo_net.train.trainer.trainer, line 42)'
(E0001)
[error] 28-28: No name 'trainer' in module 'mo_net.train.trainer'
(E0611)
[refactor] 196-196: Too many arguments (9/5)
(R0913)
[refactor] 196-196: Too many positional arguments (9/5)
(R0917)
[refactor] 196-196: Too many local variables (29/15)
(R0914)
[refactor] 214-214: Unnecessary use of a comprehension, use list(sentence.split()) instead.
(R1721)
[refactor] 296-296: Too many local variables (16/15)
(R0914)
[refactor] 309-309: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 382-382: Too many local variables (19/15)
(R0914)
[refactor] 395-395: Consider using 'with' for resource-allocating operations
(R1732)
🪛 Flake8 (7.2.0)
mo_net/model/layer/embedding.py
[error] 105-105: SyntaxError: invalid syntax
(E999)
mo_net/samples/cbow.py
[error] 60-60: whitespace before ':'
(E203)
[error] 60-60: whitespace before ':'
(E203)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: typecheck
🔇 Additional comments (18)
mypy_plugins/d.py (2)
29-31: LGTM: Improved method signature formatting.

The multi-line method signature improves readability without changing functionality.

61-61: LGTM: Added newline at end of file.

Following Python convention for newline at end of file.

mypy_plugins/__init__.py (1)

4-4: LGTM: Minor formatting improvement.

Adding a trailing newline is a good practice and often required by linters.

mo_net/model/layer/__init__.py (2)

2-2: LGTM: Proper integration of new Average layer.

The import follows the existing pattern and correctly integrates the new `Average` layer into the package.

13-13: LGTM: Correctly exposed in public API.

The `Average` layer is properly added to the `__all__` list, making it available when importing from this package.

mo_net/train/batcher.py (1)

23-27: Excellent fix for integer division issue.

The change correctly addresses the problem of passing float values to `np.array_split()` by using integer division with proper ceiling behavior: `(self.train_set_size + self.batch_size - 1) // self.batch_size`. This ensures all samples are included in batches.

mo_net/train/trainer/trainer.py (4)

82-82: LGTM: Clear type alias for transformation functions.

The `TransformFn` type alias improves code readability and type safety.

94-94: LGTM: Clean parameter design.

The optional `transform` parameter provides better flexibility than the previous boolean flag approach.

115-115: LGTM: Simplified transform handling.

Directly passing the transform to the Batcher is cleaner than the previous conditional logic with partial application.

280-280: LGTM: Improved error logging.

Adding error logging when a check fails will help with debugging training issues.
mo_net/tests/model/layer/test_average.py (1)
1-83: LGTM! Comprehensive test coverage for the Average layer.

The test suite thoroughly covers forward propagation, backward propagation, serialization/deserialization, and error handling for the `Average` layer. The tests correctly verify the mathematical operations and gradient flow.

mo_net/cli.py (2)
254-290: Good refactoring to make model creation more flexible.

The changes correctly:
- Add a `Y_train` parameter to dynamically determine output dimensions
- Remove hardcoded output class constants
- Maintain proper error handling for missing dims when creating new models

343-344: Clear and concise error message.
mo_net/model/layer/average.py (1)

1-91: Well-implemented Average layer with proper gradient handling.

The implementation correctly:
- Validates axis parameters in the constructor
- Adjusts axes by +1 to account for the batch dimension in forward/backward propagation
- Properly broadcasts and scales gradients in backward propagation
- Implements serialization/deserialization

The static analysis import errors appear to be false positives.
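To make the +1 axis adjustment concrete, here is a standalone NumPy sketch (illustrative shapes, not the layer's actual code): a per-sample `axis=(0,)` becomes axis 1 of the batched array, and the backward pass broadcasts the upstream gradient over the averaged axis and scales by its size:

```python
import numpy as np

batch = np.arange(24, dtype=float).reshape(4, 2, 3)  # (batch, d0, d1)
axis = (0,)                                          # per-sample axis to average over
adjusted = tuple(a + 1 for a in axis)                # +1 to skip the batch dimension

out = np.mean(batch, axis=adjusted)                  # shape (4, 3)

# Backward pass: broadcast dY across the averaged axis, scale by 1/size
dY = np.ones_like(out)
size = int(np.prod([batch.shape[a] for a in adjusted]))
dX = np.broadcast_to(np.expand_dims(dY, adjusted), batch.shape) / size

# The gradient summed back over the averaged axis recovers dY
assert np.allclose(dX.sum(axis=adjusted), dY)
```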
mo_net/model/layer/embedding.py (1)
116-273: Excellent implementation of the Embedding layer.

The implementation includes:
- Comprehensive arithmetic operations for parameters
- Proper gradient accumulation using `np.add.at` for sparse updates
- Optional gradient clipping for training stability
- Well-structured serialization/deserialization
- Proper validation of embedding matrix dimensions
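A brief standalone illustration of why `np.add.at` is the right tool for sparse embedding gradients (hypothetical shapes, not the layer's code): plain fancy-index addition is buffered and counts a repeated token only once, while `np.add.at` accumulates every occurrence:

```python
import numpy as np

grad = np.zeros((4, 2))          # (vocab_size, embedding_dim), illustrative sizes
token_ids = np.array([1, 1, 3])  # token 1 occurs twice in the context window
upstream = np.ones((3, 2))       # one gradient row per context position

buggy = grad.copy()
buggy[token_ids] += upstream     # buffered: token 1 receives only ONE row

np.add.at(grad, token_ids, upstream)  # unbuffered: token 1 accumulates BOTH rows

# buggy[1] == [1, 1] but grad[1] == [2, 2]
```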
mo_net/samples/cbow.py (3)
67-123: Model architecture looks sound, but verify layer dimensions.

The CBOW model architecture follows standard practices with embedding → averaging → linear layers. The dimension calculations appear correct.
The model correctly:
- Uses context_size * 2 for input dimensions (words before + after target)
- Applies embedding layer followed by averaging over context words
- Uses Xavier initialization for both embedding and linear layers
- Includes proper tracing support
52-64: Fix potential index error in context window creation.

The context window creation logic has an off-by-one error that could cause index out of bounds issues.

```diff
 def get_training_set(
     sentences: Collection[Sequence[int]], context_size: int, vocab_size: int
 ) -> tuple[np.ndarray, np.ndarray]:
     contexts = []
     targets = []
     for sentence in sentences:
         for i in range(context_size, len(sentence) - context_size):
             context = (
-                sentence[i - context_size : i] + sentence[i + 1 : i + context_size + 1]
+                sentence[i - context_size : i] + sentence[i + 1 : i + 1 + context_size]
             )
             targets.append(sentence[i])
             contexts.append(context)
     return np.array(contexts), np.eye(vocab_size)[targets]
```

The original slicing `sentence[i + 1 : i + context_size + 1]` is correct, but the logic could be clearer by ensuring the context has exactly `2 * context_size` elements.

Likely an incorrect or invalid review comment.
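A standalone check (illustrative sentence, not the sample's code) confirms the two right-context spellings are the same slice and every window has exactly `2 * context_size` elements:

```python
sentence = list(range(10))  # token ids for a hypothetical 10-token sentence
context_size = 2

for i in range(context_size, len(sentence) - context_size):
    left = sentence[i - context_size : i]
    right_a = sentence[i + 1 : i + context_size + 1]  # original spelling
    right_b = sentence[i + 1 : i + 1 + context_size]  # suggested spelling
    assert right_a == right_b                         # identical slices
    assert len(left + right_a) == 2 * context_size    # full window every time
```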
1-28: Imports verified – all referenced `mo_net.model.layer` modules exist and compile.

All modules imported by `mo_net/samples/cbow.py` are present and pass syntax checks:
- mo_net/model/layer/average.py
- mo_net/model/layer/embedding.py
- mo_net/model/layer/linear.py
- mo_net/model/layer/output.py
- mo_net/model/layer/reshape.py
No import errors remain; no changes required.
mo_net/protos.py

```python
from typing import (
    Generic,
    NewType,
    Protocol,
    Self,
    TypedDict,
    TypeVar,
    cast,
    Optional,
    Type as TypingType,
    Union,
)
```
🛠️ Refactor suggestion
Clean up unused imports.
The multi-line import formatting improves readability. However, static analysis indicates that Optional, Type, and Union are imported but unused and should be removed.
```diff
 from typing import (
     Generic,
     NewType,
     Protocol,
     Self,
     TypedDict,
     TypeVar,
     cast,
-    Optional,
-    Type as TypingType,
-    Union,
 )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
from typing import (
    Generic,
    NewType,
    Protocol,
    Self,
    TypedDict,
    TypeVar,
    cast,
)
```
🧰 Tools
🪛 Ruff (0.11.9)
13-13: typing.Optional imported but unused
Remove unused import
(F401)
14-14: typing.Type imported but unused
Remove unused import
(F401)
15-15: typing.Union imported but unused
Remove unused import
(F401)
🤖 Prompt for AI Agents
In mo_net/protos.py between lines 5 and 16, the imports Optional, Type, and
Union are not used anywhere in the file. Remove these three imports from the
multi-line import statement to clean up unused imports and improve code clarity.
mo_net/samples/mnist.py

```python
    precision = np.sum(Y_test_pred == Y_test_true) / len(Y_test_pred)
    recall = np.sum(Y_test_true == Y_test_pred) / len(Y_test_true)
    f1_score = 2 * precision * recall / (precision + recall)
    logger.info(f"F1 Score: {f1_score}")
```
Fix incorrect F1 score calculation.
The current implementation calculates accuracy instead of precision and recall. For multi-class classification, you need per-class or weighted F1 scores.
```diff
-    precision = np.sum(Y_test_pred == Y_test_true) / len(Y_test_pred)
-    recall = np.sum(Y_test_true == Y_test_pred) / len(Y_test_true)
-    f1_score = 2 * precision * recall / (precision + recall)
-    logger.info(f"F1 Score: {f1_score}")
+    from sklearn.metrics import f1_score
+    f1 = f1_score(Y_test_true, Y_test_pred, average='weighted')
+    logger.info(f"F1 Score (weighted): {f1}")
```

Alternatively, if you want to avoid the sklearn dependency:
```python
# Calculate per-class F1 scores manually
f1_scores = []
for label in range(N_DIGITS):
    true_positives = np.sum((Y_test_pred == label) & (Y_test_true == label))
    false_positives = np.sum((Y_test_pred == label) & (Y_test_true != label))
    false_negatives = np.sum((Y_test_pred != label) & (Y_test_true == label))
    precision = true_positives / (true_positives + false_positives) if (true_positives + false_positives) > 0 else 0
    recall = true_positives / (true_positives + false_negatives) if (true_positives + false_negatives) > 0 else 0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0
    f1_scores.append(f1)

# Weighted average F1 score
class_counts = [np.sum(Y_test_true == label) for label in range(N_DIGITS)]
weighted_f1 = np.average(f1_scores, weights=class_counts)
logger.info(f"F1 Score (weighted): {weighted_f1}")
```

🤖 Prompt for AI Agents
In mo_net/samples/mnist.py around lines 112 to 115, the current code incorrectly
calculates precision and recall as overall accuracy, leading to an incorrect F1
score. To fix this, implement per-class precision and recall by counting true
positives, false positives, and false negatives for each class label, then
compute the F1 score for each class. Finally, calculate a weighted average F1
score using the class counts as weights and log this weighted F1 score instead
of the current incorrect value.
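To see the failure concretely, here is a standalone sketch with made-up labels: the two original expressions compare the same arrays and both reduce to plain accuracy, whereas a per-class weighted F1 yields a different value:

```python
import numpy as np

# Hypothetical 3-class predictions, for illustration only
Y_true = np.array([0, 0, 1, 1, 2, 2])
Y_pred = np.array([0, 1, 1, 1, 2, 0])

# Original formulas: identical expressions, so "precision" == "recall" == accuracy
precision = np.sum(Y_pred == Y_true) / len(Y_pred)
recall = np.sum(Y_true == Y_pred) / len(Y_true)
assert precision == recall  # both are 4/6

# Per-class F1, weighted by class support
f1s, support = [], []
for label in np.unique(Y_true):
    tp = np.sum((Y_pred == label) & (Y_true == label))
    fp = np.sum((Y_pred == label) & (Y_true != label))
    fn = np.sum((Y_pred != label) & (Y_true == label))
    p = tp / (tp + fp) if tp + fp > 0 else 0.0
    r = tp / (tp + fn) if tp + fn > 0 else 0.0
    f1s.append(2 * p * r / (p + r) if p + r > 0 else 0.0)
    support.append(np.sum(Y_true == label))

weighted_f1 = np.average(f1s, weights=support)
# accuracy and weighted F1 are genuinely different metrics here
```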
mo_net/samples/mnist.py

```python
        logger.error(f"File not found: {model_path}")
        sys.exit(1)

    model = Model.load(open(model_path, "rb"))
```
Use a context manager for file operations.
```diff
-    model = Model.load(open(model_path, "rb"))
+    with open(model_path, "rb") as f:
+        model = Model.load(f)
```

📝 Committable suggestion
```python
with open(model_path, "rb") as f:
    model = Model.load(f)
```
🧰 Tools
🪛 Ruff (0.11.9)
97-97: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
[refactor] 97-97: Consider using 'with' for resource-allocating operations
(R1732)
🤖 Prompt for AI Agents
In mo_net/samples/mnist.py at line 97, the file is opened without using a
context manager, which can lead to resource leaks. Replace the open call with a
with statement to ensure the file is properly closed after loading the model.
Use "with open(model_path, 'rb') as f:" and then load the model from the file
object f.
mo_net/samples/cbow.py
Outdated
```python
    vocab = vocab_data["vocab"]
    token_to_id = vocab_data["token_to_id"]

    model = Model.load(open(model_path, "rb"), training=False)
```
Use context manager for file operations.
Same issue as above - unsafe file handling.
```diff
-    model = Model.load(open(model_path, "rb"), training=False)
+    with open(model_path, "rb") as f:
+        model = Model.load(f, training=False)
```

📝 Committable suggestion
```python
with open(model_path, "rb") as f:
    model = Model.load(f, training=False)
```
🧰 Tools
🪛 Ruff (0.11.9)
395-395: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
[refactor] 395-395: Consider using 'with' for resource-allocating operations
(R1732)
🤖 Prompt for AI Agents
In mo_net/samples/cbow.py at line 395, the file is opened without using a
context manager, which can lead to unsafe file handling and resource leaks.
Replace the open call with a with statement to ensure the file is properly
closed after loading the model. This means wrapping the open call inside a with
block and passing the file object to Model.load within that block.
mo_net/samples/cbow.py
Outdated
```python
    vocab = vocab_data["vocab"]
    token_to_id = vocab_data["token_to_id"]

    model = Model.load(open(model_path, "rb"), training=False)
```
Use context manager for file operations.
Static analysis correctly identifies unsafe file handling that could lead to resource leaks.
```diff
-    model = Model.load(open(model_path, "rb"), training=False)
+    with open(model_path, "rb") as f:
+        model = Model.load(f, training=False)
```

📝 Committable suggestion
```python
with open(model_path, "rb") as f:
    model = Model.load(f, training=False)
```
🧰 Tools
🪛 Ruff (0.11.9)
309-309: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
[refactor] 309-309: Consider using 'with' for resource-allocating operations
(R1732)
🤖 Prompt for AI Agents
In mo_net/samples/cbow.py at line 309, the file is opened without using a
context manager, which risks resource leaks. Replace the open call with a with
statement to ensure the file is properly closed after loading the model. This
involves wrapping the open call inside a with block and passing the file handle
to Model.load.
mo_net/samples/cbow.py
Outdated
```python
    similarity = np.dot(word1_embedding, word2_embedding) / (
        np.linalg.norm(word1_embedding) * np.linalg.norm(word2_embedding)
    )
```
🛠️ Refactor suggestion
Add numerical stability to cosine similarity calculation.
The cosine similarity calculation could fail with zero-magnitude vectors or cause numerical instability.
```diff
-    similarity = np.dot(word1_embedding, word2_embedding) / (
-        np.linalg.norm(word1_embedding) * np.linalg.norm(word2_embedding)
-    )
+    norm1 = np.linalg.norm(word1_embedding)
+    norm2 = np.linalg.norm(word2_embedding)
+
+    if norm1 == 0 or norm2 == 0:
+        similarity = 0.0
+    else:
+        similarity = np.dot(word1_embedding, word2_embedding) / (norm1 * norm2)
+        # Clamp to [-1, 1] for numerical stability
+        similarity = np.clip(similarity, -1.0, 1.0)
```

📝 Committable suggestion
```python
norm1 = np.linalg.norm(word1_embedding)
norm2 = np.linalg.norm(word2_embedding)

if norm1 == 0 or norm2 == 0:
    similarity = 0.0
else:
    similarity = np.dot(word1_embedding, word2_embedding) / (norm1 * norm2)
    # Clamp to [-1, 1] for numerical stability
    similarity = np.clip(similarity, -1.0, 1.0)
```
🤖 Prompt for AI Agents
In mo_net/samples/cbow.py around lines 349 to 351, the cosine similarity
calculation can be unstable or fail if either word embedding vector has zero
magnitude. To fix this, add a small epsilon value to the denominator when
computing the norms to prevent division by zero and improve numerical stability.
This ensures the calculation safely handles zero or near-zero magnitude vectors
without errors.
mo_net/samples/cbow.py

```python
def train(
    embedding_dim: int,
    context_size: int,
    batch_size: int,
    num_epochs: int,
    learning_rate: float,
    warmup_epochs: int,
    max_sentences: int,
    model_output_path: Path | None,
    log_level: str,
):
    """Train a CBOW model on Shakespeare text"""
    logger.remove()
    logger.add(lambda msg: click.echo(msg, err=True), level=log_level)

    shakespeare = get_resource("s3://mo-net-resources/shakespeare.txt").read_text()
    sentences = shakespeare.split("\n")[:max_sentences]
    token_ids = [
        [token for token in sentence.split()] for sentence in sentences if sentence
    ]
    vocab = sorted(set(token for sentence in sentences for token in sentence.split()))
    token_to_id = {token: i for i, token in enumerate(vocab)}
    token_ids = [[token_to_id[token] for token in sentence] for sentence in token_ids]
    vocab_size = len(vocab)
    X_train, Y_train = get_training_set(token_ids, context_size, vocab_size)

    logger.info(f"Vocabulary size: {vocab_size}")
    logger.info(f"Embedding dimension: {embedding_dim}")
    logger.info(f"Context size: {context_size}")
    logger.info(f"Training samples: {len(X_train)}")

    model = CBOWModel.create(
        vocab_size=vocab_size,
        embedding_dim=embedding_dim,
        context_size=context_size,
        tracing_enabled=False,
    )

    training_parameters = TrainingParameters(
        batch_size=batch_size,
        dropout_keep_probs=(),
        history_max_len=100,
        learning_rate_limits=(learning_rate, learning_rate),
        log_level=log_level,
        max_restarts=0,
        monotonic=False,
        no_monitoring=True,
        normalisation_type=NormalisationType.NONE,
        num_epochs=num_epochs,
        quiet=False,
        regulariser_lambda=0.0,
        trace_logging=False,
        train_set_size=len(X_train),
        warmup_epochs=warmup_epochs,
        workers=0,
    )

    train_size = int(0.8 * len(X_train))
    X_train_split = X_train[:train_size]
    Y_train_split = Y_train[:train_size]
    X_val = X_train[train_size:]
    Y_val = Y_train[train_size:]

    run = TrainingRun(seed=42, backend=CsvBackend(path=DATA_DIR / "cbow.csv"))
    optimizer = get_optimizer("adam", model, training_parameters)

    trainer = BasicTrainer(
        X_train=X_train_split,
        Y_train=Y_train_split,
        X_val=X_val,
        Y_val=Y_val,
        model=model,
        optimizer=optimizer,
        run=run,
        training_parameters=training_parameters,
    )

    logger.info(f"Starting CBOW training with {len(X_train_split)} training samples")
    result = trainer.train()

    if hasattr(result, "model_checkpoint_path"):
        if model_output_path is None:
            model_output_path = DATA_DIR / "output" / "cbow_model.pkl"
        result.model_checkpoint_path.rename(model_output_path)
        logger.info(f"Training completed. Model saved to: {model_output_path}")

        vocab_path = model_output_path.with_suffix(".vocab")
        np.save(vocab_path, {"vocab": vocab, "token_to_id": token_to_id})
        logger.info(f"Vocabulary saved to: {vocab_path}")
    else:
        logger.error(f"Training failed: {result}")
```
🛠️ Refactor suggestion
Reduce function complexity and improve resource management.
The train function has high complexity with many parameters and local variables. Consider extracting helper functions.
Key issues to address:
- Vocabulary processing: Extract to separate function
- Model setup: Extract configuration logic
- Data splitting: Use more robust validation split
```diff
 def train(
     embedding_dim: int,
     context_size: int,
     batch_size: int,
     num_epochs: int,
     learning_rate: float,
     warmup_epochs: int,
     max_sentences: int,
     model_output_path: Path | None,
     log_level: str,
 ):
     """Train a CBOW model on Shakespeare text"""
     logger.remove()
     logger.add(lambda msg: click.echo(msg, err=True), level=log_level)

+    # Extract vocabulary processing
+    vocab, token_to_id, token_ids = _prepare_vocabulary(max_sentences)
+    vocab_size = len(vocab)
-    shakespeare = get_resource("s3://mo-net-resources/shakespeare.txt").read_text()
-    sentences = shakespeare.split("\n")[:max_sentences]
-    token_ids = [
-        [token for token in sentence.split()] for sentence in sentences if sentence
-    ]
-    vocab = sorted(set(token for sentence in sentences for token in sentence.split()))
-    token_to_id = {token: i for i, token in enumerate(vocab)}
-    token_ids = [[token_to_id[token] for token in sentence] for sentence in token_ids]
-    vocab_size = len(vocab)
     X_train, Y_train = get_training_set(token_ids, context_size, vocab_size)
     # ... rest of function
```

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 196-196: Too many arguments (9/5)
(R0913)
[refactor] 196-196: Too many positional arguments (9/5)
(R0917)
[refactor] 196-196: Too many local variables (29/15)
(R0914)
[refactor] 214-214: Unnecessary use of a comprehension, use list(sentence.split()) instead.
(R1721)
🤖 Prompt for AI Agents
In mo_net/samples/cbow.py from lines 196 to 287, the train function is overly
complex with many parameters and local variables. To fix this, extract the
vocabulary processing logic (building vocab, token_to_id, and token_ids) into a
separate helper function. Similarly, move the model creation and training
parameter setup into another helper function to isolate configuration logic. For
data splitting, replace the manual slicing with a more robust method such as
using sklearn's train_test_split to ensure better validation splitting and
reproducibility.
mo_net/samples/cbow.py
Outdated
```python
    if hasattr(result, "model_checkpoint_path"):
        if model_output_path is None:
            model_output_path = DATA_DIR / "output" / "cbow_model.pkl"
        result.model_checkpoint_path.rename(model_output_path)
        logger.info(f"Training completed. Model saved to: {model_output_path}")

        vocab_path = model_output_path.with_suffix(".vocab")
        np.save(vocab_path, {"vocab": vocab, "token_to_id": token_to_id})
        logger.info(f"Vocabulary saved to: {vocab_path}")
    else:
        logger.error(f"Training failed: {result}")
```
🛠️ Refactor suggestion
Improve error handling and model checkpoint validation.
The model saving logic assumes the training result has a checkpoint path without proper validation.
```diff
-    if hasattr(result, "model_checkpoint_path"):
+    if hasattr(result, "model_checkpoint_path") and result.model_checkpoint_path is not None:
         if model_output_path is None:
             model_output_path = DATA_DIR / "output" / "cbow_model.pkl"
+
+        # Ensure output directory exists
+        model_output_path.parent.mkdir(parents=True, exist_ok=True)
+
         result.model_checkpoint_path.rename(model_output_path)
         logger.info(f"Training completed. Model saved to: {model_output_path}")

         vocab_path = model_output_path.with_suffix(".vocab")
         np.save(vocab_path, {"vocab": vocab, "token_to_id": token_to_id})
         logger.info(f"Vocabulary saved to: {vocab_path}")
     else:
-        logger.error(f"Training failed: {result}")
+        logger.error(f"Training failed or no model checkpoint available: {result}")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if hasattr(result, "model_checkpoint_path") and result.model_checkpoint_path is not None:
    if model_output_path is None:
        model_output_path = DATA_DIR / "output" / "cbow_model.pkl"

    # Ensure output directory exists
    model_output_path.parent.mkdir(parents=True, exist_ok=True)

    result.model_checkpoint_path.rename(model_output_path)
    logger.info(f"Training completed. Model saved to: {model_output_path}")
    vocab_path = model_output_path.with_suffix(".vocab")
    np.save(vocab_path, {"vocab": vocab, "token_to_id": token_to_id})
    logger.info(f"Vocabulary saved to: {vocab_path}")
else:
    logger.error(f"Training failed or no model checkpoint available: {result}")
```
🤖 Prompt for AI Agents
In mo_net/samples/cbow.py around lines 276 to 287, the code assumes result has a
valid model_checkpoint_path without verifying it. Improve error handling by
explicitly checking if result.model_checkpoint_path exists and is a valid path
before attempting to rename it. If the checkpoint path is missing or invalid,
log an appropriate error message and avoid proceeding with saving the model and
vocabulary files.
mo_net/samples/cbow.py
Outdated
```python
similarity = np.dot(word_embedding, other_embedding) / (
    np.linalg.norm(word_embedding) * np.linalg.norm(other_embedding)
)
```
🛠️ Refactor suggestion
Extract similarity calculation to avoid code duplication.
The same cosine similarity calculation is repeated in multiple places. Extract to a helper function.
Add this helper function at the top of the file:
```diff
+def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
+    """Calculate cosine similarity between two vectors with numerical stability."""
+    norm1 = np.linalg.norm(vec1)
+    norm2 = np.linalg.norm(vec2)
+
+    if norm1 == 0 or norm2 == 0:
+        return 0.0
+
+    similarity = np.dot(vec1, vec2) / (norm1 * norm2)
+    return np.clip(similarity, -1.0, 1.0)
```

Then replace the duplicated calculations:

```diff
-    similarity = np.dot(word_embedding, other_embedding) / (
-        np.linalg.norm(word_embedding) * np.linalg.norm(other_embedding)
-    )
+    similarity = cosine_similarity(word_embedding, other_embedding)
```

🤖 Prompt for AI Agents
In mo_net/samples/cbow.py around lines 415 to 417, the cosine similarity
calculation is duplicated. Extract this calculation into a helper function
placed near the top of the file that takes two embeddings as input and returns
their cosine similarity. Replace all instances of the duplicated code with calls
to this new helper function to improve code reuse and maintainability.
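As a quick sanity check, the helper proposed above (reproduced here with an explicit `float` cast added) behaves sensibly on the degenerate inputs the review calls out:

```python
import numpy as np


def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """Cosine similarity with guards against zero vectors and rounding drift."""
    norm1 = np.linalg.norm(vec1)
    norm2 = np.linalg.norm(vec2)
    if norm1 == 0 or norm2 == 0:
        return 0.0  # define similarity with the zero vector as 0 rather than NaN
    return float(np.clip(np.dot(vec1, vec2) / (norm1 * norm2), -1.0, 1.0))
```

Parallel vectors give 1.0, anti-parallel give -1.0, and a zero vector no longer produces a division-by-zero warning.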
Force-pushed from 8905d1a to 1d0b7e7 (Compare)
Currently, the library is overfocused on mnist. The affine_transform is a footgun when training on other data. Removing it to force specification of a transformation separately. - Move towards manually specifying a transformation. - Remove no_transform.
- Move cbow into samples. - Remove mnist code from cli. - Add mnist module sample.
- Streamline code - Make a Vocab class - Simplify/streamline code
Fix a bug in attach_weight_decay_regulariser
Force-pushed from 1d0b7e7 to ccdd9c4 (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (7)
mo_net/protos.py (1)
5-16: Address the unused imports as previously identified.

The multi-line import formatting improves readability. However, the unused imports (`Optional`, `Type`, `Union`) identified in the previous review remain unaddressed.

mo_net/samples/mnist.py (2)
97-97: Use a context manager for file operations.
112-115: Fix incorrect F1 score calculation.

mo_net/samples/cbow.py (4)
375-375: Use context manager for file operations.
461-461: Use context manager for file operations.
415-417: Add numerical stability to cosine similarity calculation.
480-482: Extract similarity calculation to avoid code duplication.
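For reference on the F1 item above (112-115): a macro-averaged F1 over integer class labels can be computed in plain NumPy as below. The function name and shapes are illustrative, not the sample's actual code:

```python
import numpy as np


def macro_f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Macro-averaged F1: mean over classes of 2*TP / (2*TP + FP + FN)."""
    scores = []
    for cls in np.unique(y_true):
        tp = np.sum((y_pred == cls) & (y_true == cls))
        fp = np.sum((y_pred == cls) & (y_true != cls))
        fn = np.sum((y_pred != cls) & (y_true == cls))
        denom = 2 * tp + fp + fn
        # A class with no predictions and no true positives contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```

Averaging per-class F1 (rather than, say, averaging precision and recall first) is the usual definition of macro F1 for multi-class problems like MNIST.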
🧹 Nitpick comments (1)
mo_net/samples/mnist.py (1)
150-151: Remove duplicate plotting calls.

The `plt.tight_layout()` and `plt.show()` calls are duplicated, which is unnecessary and could cause display issues.

Apply this diff:

```diff
     plt.tight_layout()
     plt.show()
-    plt.tight_layout()
-    plt.show()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (18)
- mo_net/cli.py (4 hunks)
- mo_net/config.py (0 hunks)
- mo_net/model/layer/__init__.py (2 hunks)
- mo_net/model/layer/average.py (1 hunks)
- mo_net/model/layer/batch_norm/batch_norm.py (1 hunks)
- mo_net/model/layer/batch_norm/batch_norm_2d.py (1 hunks)
- mo_net/model/layer/embedding.py (1 hunks)
- mo_net/model/layer/layer_norm.py (1 hunks)
- mo_net/protos.py (1 hunks)
- mo_net/regulariser/weight_decay.py (1 hunks)
- mo_net/samples/cbow.py (1 hunks)
- mo_net/samples/mnist.py (1 hunks)
- mo_net/tests/model/layer/test_average.py (1 hunks)
- mo_net/train/batcher.py (1 hunks)
- mo_net/train/trainer/trainer.py (4 hunks)
- mypy_plugins/__init__.py (1 hunks)
- mypy_plugins/d.py (2 hunks)
- pyproject.toml (1 hunks)
💤 Files with no reviewable changes (1)
- mo_net/config.py
✅ Files skipped from review due to trivial changes (2)
- pyproject.toml
- mo_net/regulariser/weight_decay.py
🚧 Files skipped from review as they are similar to previous changes (9)
- mypy_plugins/__init__.py
- mo_net/model/layer/__init__.py
- mo_net/model/layer/batch_norm/batch_norm.py
- mo_net/model/layer/layer_norm.py
- mo_net/model/layer/batch_norm/batch_norm_2d.py
- mypy_plugins/d.py
- mo_net/tests/model/layer/test_average.py
- mo_net/train/trainer/trainer.py
- mo_net/train/batcher.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
mo_net/model/layer/average.py (2)
- mo_net/protos.py (2): `D` (41-49), `d` (173-174)
- mo_net/model/layer/base.py (1): `layer_id` (87-89)

mo_net/model/layer/embedding.py (3)
- mo_net/model/layer/base.py (5): `BadLayerId` (48-48), `ParametrisedHidden` (111-144), `layer_id` (87-89), `_write_header` (142-144), `get_layer_id` (134-140)
- mo_net/protos.py (13): `D` (41-49), `GradCache` (127-128), `GradLayer` (135-144), `SupportsGradientOperations` (90-105), `d` (173-174), `input_dimensions` (190-190), `output_dimensions` (193-193), `parameters` (137-137), `empty_gradient` (144-144), `serialize` (165-165), `gradient_operation` (142-142), `cache` (140-140), `parameter_count` (116-116)
- mo_net/samples/cbow.py (2): `embeddings` (182-183), `serialize` (50-53)
🪛 Ruff (0.11.9)
mo_net/samples/cbow.py
101-115: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
375-375: Use a context manager for opening files
(SIM115)
461-461: Use a context manager for opening files
(SIM115)
mo_net/protos.py
13-13: typing.Optional imported but unused
Remove unused import
(F401)
14-14: typing.Type imported but unused
Remove unused import
(F401)
15-15: typing.Union imported but unused
Remove unused import
(F401)
mo_net/samples/mnist.py
97-97: Use a context manager for opening files
(SIM115)
🪛 Flake8 (7.2.0)
mo_net/samples/cbow.py
[error] 106-106: whitespace before ':'
(E203)
[error] 107-107: whitespace before ':'
(E203)
mo_net/model/layer/embedding.py
[error] 105-105: SyntaxError: invalid syntax
(E999)
🪛 Pylint (3.3.7)
mo_net/samples/cbow.py
[error] 20-20: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 21-21: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 22-22: Cannot import 'mo_net.model.layer.embedding' due to 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
[error] 22-22: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 23-23: Cannot import 'mo_net.model.layer.linear' due to 'invalid syntax (mo_net.model.layer.linear, line 156)'
(E0001)
[error] 23-23: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 24-24: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 25-25: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 26-26: Cannot import 'mo_net.model.model' due to 'invalid syntax (mo_net.model.model, line 384)'
(E0001)
[error] 26-26: No name 'model' in module 'mo_net.model'
(E0611)
[error] 28-28: Cannot import 'mo_net.protos' due to 'invalid syntax (mo_net.protos, line 67)'
(E0001)
[error] 28-28: No name 'protos' in module 'mo_net'
(E0611)
[error] 34-34: Cannot import 'mo_net.train.trainer.trainer' due to 'invalid syntax (mo_net.train.trainer.trainer, line 42)'
(E0001)
[error] 34-39: No name 'trainer' in module 'mo_net.train.trainer'
(E0611)
[refactor] 257-257: Too many arguments (9/5)
(R0913)
[refactor] 257-257: Too many positional arguments (9/5)
(R0917)
[refactor] 257-257: Too many local variables (28/15)
(R0914)
[refactor] 375-375: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 448-448: Too many local variables (18/15)
(R0914)
[refactor] 461-461: Consider using 'with' for resource-allocating operations
(R1732)
mo_net/model/layer/average.py
[error] 8-8: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 9-9: Cannot import 'mo_net.protos' due to 'invalid syntax (mo_net.protos, line 67)'
(E0001)
[error] 9-9: No name 'protos' in module 'mo_net'
(E0611)
mo_net/model/layer/embedding.py
[error] 105-105: Parsing failed: 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
mo_net/samples/mnist.py
[refactor] 70-70: Too many local variables (29/15)
(R0914)
[refactor] 97-97: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 70-70: Too many statements (52/50)
(R0915)
🔇 Additional comments (7)
mo_net/model/layer/average.py (1)
12-91: LGTM! Well-implemented averaging layer.

The `Average` layer implementation is solid with proper:
- Input validation for axes ranges
- Batch dimension handling in forward/backward propagation
- Gradient reconstruction through broadcasting
- Serialization support
The logic correctly accounts for the batch dimension by adding +1 to axes during computation and properly distributes gradients during backpropagation.
mo_net/model/layer/embedding.py (3)
105-105: Type alias syntax is correct for Python 3.12+.

The syntax error flagged by static analysis is a false positive. The `type ParametersType = Parameters` syntax is valid Python 3.12+ type alias syntax.
24-103: LGTM! Well-designed parameter class with gradient operations.

The `Parameters` dataclass properly implements `SupportsGradientOperations` with:
- Complete arithmetic operations for gradient computation
- Proper handling of division by zero with EPSILON
- Multiple initialization methods (random, xavier, he)
- Appropriate parameter validation
The implementation follows the gradient computation protocol correctly.
116-273: LGTM! Comprehensive embedding layer implementation.

The embedding layer is well-implemented with:
- Proper parameter shape validation during initialization
- Correct forward propagation mapping indices to embeddings
- Gradient accumulation with optional clipping in backward propagation
- Complete serialization/deserialization support
- Appropriate cache management
The gradient clipping logic and parameter update mechanisms are correctly implemented.
mo_net/cli.py (2)
254-296: LGTM! Clean refactoring removes quickstart complexity.

The changes successfully simplify the CLI by:
- Deriving output dimensions directly from `Y_train.shape[1]`
- Removing quickstart branching logic in model creation
- Requiring explicit dataset URLs instead of inference
This makes the interface more explicit and predictable.
343-344: LGTM! Explicit dataset URL requirement improves clarity.

Requiring an explicit dataset URL instead of inferring from quickstart templates makes the CLI more predictable and removes implicit behavior.
mo_net/samples/cbow.py (1)
453-456: Fix inconsistent vocabulary file extension.

The `sample` command expects the `.vocab.npy` extension but the `train` command saves with the `.vocab` extension. This will cause the sample command to fail.

Apply this diff to make the extensions consistent:

```diff
-    vocab_path = model_path.with_suffix(".vocab.npy")
+    vocab_path = model_path.with_suffix(".vocab")
     if not vocab_path.exists():
         raise click.ClickException(f"Vocabulary file not found: {vocab_path}")
-    vocab_data = np.load(vocab_path, allow_pickle=True).item()
-    vocab = vocab_data["vocab"]
-    token_to_id = vocab_data["token_to_id"]
+    vocab = Vocab.deserialize(vocab_path)
+    token_to_id = vocab.token_to_id
```

Likely an incorrect or invalid review comment.
```python
sample_indices = sample(np.where(Y_test_true != Y_test_pred)[0], 25)
for idx, i in enumerate(sample_indices):
    plt.subplot(8, 5, idx + 1)
    plt.imshow(X_test[i].reshape(MNIST_IMAGE_SIZE, MNIST_IMAGE_SIZE), cmap="gray")
    plt.title(f"Pred: {Y_test_pred[i]}, True: {Y_test_true[i]}")
    plt.axis("off")
```
💡 Verification agent
🧩 Analysis chain
Verify subplot indexing for 25 images in grid.
The subplot indexing uses (8, 5, idx + 1) but displays only 25 images. For a 5x5 grid layout, consider using (5, 5, idx + 1) instead to ensure proper arrangement.
🏁 Script executed:
```bash
#!/bin/bash
# Verify if the subplot configuration properly displays 25 images
# The current code uses (8, 5) which creates 40 subplot positions for only 25 images
echo "Current subplot configuration: (8, 5) = 40 positions for 25 images"
echo "Suggested configuration: (5, 5) = 25 positions for 25 images"
echo "Verify the intended layout matches the image count"
```

Length of output: 384
Use a 5×5 grid for 25 sample images
The current call allocates an 8×5 grid (40 slots) but only fills 25. To arrange the misclassified digits in a square layout, switch to 5 rows × 5 columns:
• File: mo_net/samples/mnist.py
• Lines: ≈120–125
Suggested change:
```diff
-        plt.subplot(8, 5, idx + 1)
+        plt.subplot(5, 5, idx + 1)
```

This ensures exactly 25 subplot slots and a clean square display.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
sample_indices = sample(np.where(Y_test_true != Y_test_pred)[0], 25)
for idx, i in enumerate(sample_indices):
    plt.subplot(5, 5, idx + 1)
    plt.imshow(X_test[i].reshape(MNIST_IMAGE_SIZE, MNIST_IMAGE_SIZE), cmap="gray")
    plt.title(f"Pred: {Y_test_pred[i]}, True: {Y_test_true[i]}")
    plt.axis("off")
```
🤖 Prompt for AI Agents
In mo_net/samples/mnist.py around lines 120 to 125, the subplot grid is
currently set to 8 rows by 5 columns, which creates 40 slots but only 25 images
are displayed. Change the subplot layout to 5 rows by 5 columns to create a
square grid that exactly fits the 25 sample images, ensuring a cleaner and more
balanced visualization.
- Fix bug in weight decay regulariser - Apply weight decay to embeddings not linear layer in CBOWModel
Force-pushed from 1d2710a to b14d06c (Compare)
It's not clear to me that dropping out projection is a good idea after all.
English sentences rather than shakespeare
Actionable comments posted: 1
♻️ Duplicate comments (3)
mo_net/samples/cbow.py (3)
458-458: Use context manager for file operations.

Static analysis correctly identifies unsafe file handling that could lead to resource leaks.

```diff
-    model = CBOWModel.load(open(model_path, "rb"), training=False)
+    with open(model_path, "rb") as f:
+        model = CBOWModel.load(f, training=False)
```
537-537: Use context manager for file operations.

Same issue as above - unsafe file handling.

```diff
-    model = CBOWModel.load(open(model_path, "rb"), training=False)
+    with open(model_path, "rb") as f:
+        model = CBOWModel.load(f, training=False)
```
555-558: Extract similarity calculation to avoid code duplication.

The same cosine similarity calculation is repeated in multiple places. Extract to a helper function.

Add this helper function at the top of the file:

```diff
+def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
+    """Calculate cosine similarity between two vectors with numerical stability."""
+    norm1 = np.linalg.norm(vec1)
+    norm2 = np.linalg.norm(vec2)
+
+    if norm1 == 0 or norm2 == 0:
+        return 0.0
+
+    similarity = np.dot(vec1, vec2) / (norm1 * norm2)
+    return np.clip(similarity, -1.0, 1.0)
```

Then replace the duplicated calculations:

```diff
-    similarity = np.dot(word_embedding, other_embedding) / (
-        np.linalg.norm(word_embedding) * np.linalg.norm(other_embedding)
-    )
+    similarity = cosine_similarity(word_embedding, other_embedding)
```
🧹 Nitpick comments (1)
mo_net/samples/cbow.py (1)
491-497: Add numerical stability to cosine similarity calculation.

The cosine similarity calculation properly handles zero-magnitude vectors, which is good. However, the result could still benefit from clamping for numerical stability.

```diff
+    # Clamp to [-1, 1] for numerical stability
+    similarity = np.clip(similarity, -1.0, 1.0)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- mo_net/cli.py (4 hunks)
- mo_net/model/layer/embedding.py (1 hunks)
- mo_net/regulariser/weight_decay.py (2 hunks)
- mo_net/samples/cbow.py (1 hunks)
- mo_net/train/trainer/parallel.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- mo_net/train/trainer/parallel.py
🚧 Files skipped from review as they are similar to previous changes (1)
- mo_net/regulariser/weight_decay.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
mo_net/cli.py (2)
mo_net/protos.py (3)
ActivationFn(29-35)NormalisationType(202-205)d(173-174)mo_net/model/model.py (5)
Model(48-381)mlp_of(77-88)mlp_of(92-101)mlp_of(104-197)module_dimensions(347-348)
🪛 Pylint (3.3.7)
mo_net/model/layer/embedding.py
[error] 105-105: Parsing failed: 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
mo_net/samples/cbow.py
[error] 21-21: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 22-22: Cannot import 'mo_net.model.layer.embedding' due to 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
[error] 22-22: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 23-23: Cannot import 'mo_net.model.layer.linear' due to 'invalid syntax (mo_net.model.layer.linear, line 156)'
(E0001)
[error] 23-23: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 24-24: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 25-25: Cannot import 'mo_net.model.model' due to 'invalid syntax (mo_net.model.model, line 384)'
(E0001)
[error] 25-25: No name 'model' in module 'mo_net.model'
(E0611)
[error] 28-28: Cannot import 'mo_net.protos' due to 'invalid syntax (mo_net.protos, line 67)'
(E0001)
[error] 28-28: No name 'protos' in module 'mo_net'
(E0611)
[error] 33-33: Cannot import 'mo_net.train.trainer.trainer' due to 'invalid syntax (mo_net.train.trainer.trainer, line 42)'
(E0001)
[error] 33-38: No name 'trainer' in module 'mo_net.train.trainer'
(E0611)
[refactor] 336-336: Too many arguments (10/5)
(R0913)
[refactor] 336-336: Too many positional arguments (10/5)
(R0917)
[refactor] 336-336: Too many local variables (28/15)
(R0914)
[refactor] 448-448: Too many local variables (16/15)
(R0914)
[refactor] 458-458: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 527-527: Too many local variables (16/15)
(R0914)
[refactor] 537-537: Consider using 'with' for resource-allocating operations
(R1732)
🪛 Flake8 (7.2.0)
mo_net/model/layer/embedding.py
[error] 105-105: SyntaxError: invalid syntax
(E999)
mo_net/samples/cbow.py
[error] 182-182: whitespace before ':'
(E203)
[error] 183-183: whitespace before ':'
(E203)
🪛 Ruff (0.11.9)
mo_net/samples/cbow.py
458-458: Use a context manager for opening files
(SIM115)
537-537: Use a context manager for opening files
(SIM115)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: typecheck
🔇 Additional comments (7)
mo_net/model/layer/embedding.py (2)
200-214: LGTM! Solid backward propagation implementation.

The gradient accumulation using `np.add.at()` correctly handles multiple indices pointing to the same embedding, and the gradient clipping implementation provides good numerical stability.
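The distinction matters because buffered fancy indexing silently drops repeated-index contributions, which is exactly the case when a token occurs more than once in a batch. A minimal illustration (not the layer's actual code):

```python
import numpy as np

grad = np.zeros(4)
indices = np.array([0, 1, 1, 1])  # token 1 appears three times in the batch
updates = np.ones(4)

# Buffered fancy indexing evaluates grad[indices] once, so the three writes
# to index 1 all assign the same value and only one contribution survives.
buffered = grad.copy()
buffered[indices] += updates

# np.add.at performs unbuffered in-place addition, accumulating every
# contribution, which is what embedding backprop requires.
accumulated = grad.copy()
np.add.at(accumulated, indices, updates)
```

After running this, `buffered` is `[1, 1, 0, 0]` while `accumulated` is `[1, 3, 0, 0]`.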
171-178: Good parameter validation.

The shape validation ensures embedding matrix dimensions match the specified vocabulary size and embedding dimension, preventing runtime errors.
mo_net/cli.py (2)
253-288: Excellent refactor to remove quickstart magic.

The dynamic output dimension calculation from `Y_train.shape[1]` is much cleaner than hardcoded constants. Requiring an explicit `dims` parameter when training new models improves clarity and prevents configuration mistakes.
342-343: Good explicit requirement for dataset URL.

Removing dataset URL inference and requiring explicit URLs improves predictability and prevents silent failures from incorrect assumptions.
mo_net/samples/cbow.py (3)
44-86: Well-designed custom regularizer implementation.

The `EmbeddingWeightDecayRegulariser` properly integrates with the training framework and follows the established pattern of attaching via static methods. The gradient accumulation and loss computation are correctly implemented.
196-257: Excellent CBOW model architecture design.

The model composition using embedding layer, averaging layer, and linear output layer correctly implements the CBOW architecture. The class methods for creation and property access to embeddings provide a clean interface.
427-438: Good pattern matching for training result handling.

The use of pattern matching with `TrainingSuccessful`/`TrainingFailed` and `assert_never` provides clear, exhaustive handling of training outcomes.
```python
        return self.__class__(embeddings=embeddings)


type ParametersType = Parameters
```
Fix Python version compatibility issue with type aliases.
The type statement syntax is only available in Python 3.12+. This will cause syntax errors in older Python versions and explains the static analysis failures.
Replace the new type alias syntax with traditional type aliases for broader compatibility:
```diff
-type ParametersType = Parameters
+ParametersType = Parameters
```

```diff
-type CacheType = Cache
+CacheType = Cache
```

Alternatively, if Python 3.12+ is required, ensure this is documented in your project requirements.
Also applies to: 113-113
🧰 Tools
🪛 Pylint (3.3.7)
[error] 105-105: Parsing failed: 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
🪛 Flake8 (7.2.0)
[error] 105-105: SyntaxError: invalid syntax
(E999)
🤖 Prompt for AI Agents
In mo_net/model/layer/embedding.py at lines 105 and 113, the use of the 'type'
keyword for type aliasing is not compatible with Python versions below 3.12.
Replace these 'type' alias declarations with the traditional assignment syntax
for type aliases to ensure compatibility with older Python versions. For
example, change 'type ParametersType = Parameters' to 'ParametersType =
Parameters'. Make sure to update both lines accordingly.
Actionable comments posted: 0
♻️ Duplicate comments (5)
mo_net/samples/cbow.py (5)
462-462: Use context manager for file operations.

This matches the existing review comment about unsafe file handling.

```diff
-    model = CBOWModel.load(open(model_path, "rb"), training=False)
+    with open(model_path, "rb") as f:
+        model = CBOWModel.load(f, training=False)
```
541-541: Use context manager for file operations.

Same file handling issue as in the `infer` command.

```diff
-    model = CBOWModel.load(open(model_path, "rb"), training=False)
+    with open(model_path, "rb") as f:
+        model = CBOWModel.load(f, training=False)
```
559-562: Use the extracted cosine similarity helper function.

This is the same cosine similarity calculation that should use the helper function mentioned in the previous comment.

```diff
-    similarity = np.dot(word_embedding, other_embedding) / (
-        np.linalg.norm(word_embedding) * np.linalg.norm(other_embedding)
-    )
+    similarity = cosine_similarity(word_embedding, other_embedding)
```
495-501: Add numerical stability to cosine similarity calculation.

The cosine similarity calculation should handle edge cases and potential numerical instability.

Extract to a helper function for reuse:

```diff
+def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
+    """Calculate cosine similarity with numerical stability."""
+    norm1 = np.linalg.norm(vec1)
+    norm2 = np.linalg.norm(vec2)
+
+    if norm1 == 0 or norm2 == 0:
+        return 0.0
+
+    similarity = np.dot(vec1, vec2) / (norm1 * norm2)
+    return np.clip(similarity, -1.0, 1.0)

-    if (n1 := np.linalg.norm(word1_embedding)) == 0 or (
-        n2 := np.linalg.norm(word2_embedding)
-    ) == 0:
-        similarity = 0.0
-    else:
-        similarity = np.dot(word1_embedding, word2_embedding) / (n1 * n2)
+    similarity = cosine_similarity(word1_embedding, word2_embedding)
```
339-443: Reduce function complexity by extracting helper functions.

The `train` function is overly complex with 10 parameters and 29 local variables, making it hard to maintain and test.

Extract vocabulary and data preparation logic:

```diff
+def _prepare_vocabulary_and_data(
+    vocab_size: int, context_size: int
+) -> tuple[Vocab, np.ndarray, np.ndarray]:
+    """Prepare vocabulary and training data from sentences."""
+    sentences = (
+        get_resource("s3://mo-net-resources/english-sentences.txt")
+        .read_text()
+        .split("\n")
+    )[:100000]
+    vocab = Vocab.from_sentences(sentences, max_size=vocab_size)
+
+    tokenized_sentences = [
+        [vocab[token] for token in sentence.split()]
+        for sentence in sentences
+        if sentence
+    ]
+
+    return vocab, *get_training_set(tokenized_sentences, context_size, len(vocab))


 def train(
     embedding_dim: int,
     context_size: int,
     # ... other params
 ):
     setup_logging(log_level)
-    sentences = (
-        get_resource("s3://mo-net-resources/english-sentences.txt")
-        .read_text()
-        .split("\n")
-    )[:100000]
-    vocab = Vocab.from_sentences(sentences, max_size=vocab_size)
-
-    tokenized_sentences = [
-        [vocab[token] for token in sentence.split()]
-        for sentence in sentences
-        if sentence
-    ]
-
-    X_train, Y_train = get_training_set(tokenized_sentences, context_size, len(vocab))
+    vocab, X_train, Y_train = _prepare_vocabulary_and_data(vocab_size, context_size)
```
🧹 Nitpick comments (1)
mo_net/samples/cbow.py (1)
89-154: Consider adding validation for vocabulary size limits.

The `Vocab` class is well-designed with proper serialization support. However, consider adding validation to prevent extremely large vocabularies that could cause memory issues.

Add this validation to `from_sentences`:

```diff
 @classmethod
 def from_sentences(cls, sentences: Collection[str], max_size: int) -> Vocab:
+    if max_size <= 0:
+        raise ValueError("max_size must be positive")
+    if max_size > 1_000_000:  # reasonable upper bound
+        raise ValueError("max_size too large, consider reducing vocabulary size")
+
     most_common_tokens = [
```
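A minimal standalone sketch of what `from_sentences` with the suggested guard might look like, using a plain dict in place of the sample's `Vocab` class (names and the whitespace tokenisation are illustrative):

```python
from collections import Counter


def build_vocab(sentences: list[str], max_size: int) -> dict[str, int]:
    """Map the max_size most frequent whitespace tokens to contiguous ids."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    counts = Counter(token for sentence in sentences for token in sentence.split())
    # most_common(n) returns tokens in descending frequency order, so the
    # resulting ids are stable for a given corpus.
    return {token: idx for idx, (token, _) in enumerate(counts.most_common(max_size))}
```

Truncating by frequency keeps the embedding matrix bounded at `max_size` rows regardless of corpus size.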
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- mo_net/samples/cbow.py (1 hunks)
- mo_net/server/app.py (3 hunks)
- mo_net/server/templates/dashboard.html (2 hunks)
- mo_net/train/backends/log.py (4 hunks)
- mo_net/train/backends/models.py (1 hunks)
- mo_net/train/run.py (1 hunks)
- pyproject.toml (2 hunks)
✅ Files skipped from review due to trivial changes (2)
- mo_net/server/app.py
- mo_net/server/templates/dashboard.html
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🧬 Code Graph Analysis (2)
mo_net/train/backends/log.py (2)
- mo_net/train/run.py (2): `start_run` (54-61), `seed` (30-31)
- mo_net/train/backends/models.py (2): `DbRun` (39-102), `create` (76-102)

mo_net/samples/cbow.py (14)
- mo_net/model/layer/embedding.py (6): `Embedding` (116-271), `cache` (243-244), `Parameters` (25-102), `parameters` (247-248), `serialize` (223-230), `deserialize` (130-138)
- mo_net/log.py (2): `LogLevel` (12-18), `setup_logging` (21-23)
- mo_net/model/layer/linear.py (1): `Linear` (167-364)
- mo_net/model/layer/output.py (1): `SoftmaxOutputLayer` (55-92)
- mo_net/model/module/base.py (1): `Output` (121-188)
- mo_net/protos.py (3): `NormalisationType` (202-205), `TrainingStepHandler` (70-81), `d` (173-174)
- mo_net/resources.py (1): `get_resource` (16-47)
- mo_net/cli.py (2): `train` (313-463), `cli` (298-298)
- mo_net/config.py (1): `TrainingParameters` (8-50)
- mo_net/train/backends/log.py (5): `SqliteBackend` (118-219), `create` (17-18), `create` (67-68), `create` (130-131), `create` (230-231)
- mo_net/train/run.py (2): `TrainingRun` (6-66), `seed` (30-31)
- mo_net/train/trainer/trainer.py (1): `BasicTrainer` (85-371)
- mo_net/regulariser/weight_decay.py (2): `after_compute_update` (19-31), `attach` (45-61)
- mo_net/optimizer/base.py (1): `register_after_compute_update_handler` (99-104)
🪛 Flake8 (7.2.0)
mo_net/samples/cbow.py
[error] 185-185: whitespace before ':'
(E203)
[error] 186-186: whitespace before ':'
(E203)
🪛 Ruff (0.11.9)
mo_net/samples/cbow.py
462-462: Use a context manager for opening files
(SIM115)
541-541: Use a context manager for opening files
(SIM115)
🪛 Pylint (3.3.7)
mo_net/samples/cbow.py
[error] 22-22: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 23-23: Cannot import 'mo_net.model.layer.embedding' due to 'invalid syntax (mo_net.model.layer.embedding, line 105)'
(E0001)
[error] 23-23: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 24-24: Cannot import 'mo_net.model.layer.linear' due to 'invalid syntax (mo_net.model.layer.linear, line 156)'
(E0001)
[error] 24-24: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 25-25: No name 'layer' in module 'mo_net.model'
(E0611)
[error] 26-26: Cannot import 'mo_net.model.model' due to 'invalid syntax (mo_net.model.model, line 384)'
(E0001)
[error] 26-26: No name 'model' in module 'mo_net.model'
(E0611)
[error] 29-29: Cannot import 'mo_net.protos' due to 'invalid syntax (mo_net.protos, line 67)'
(E0001)
[error] 29-29: No name 'protos' in module 'mo_net'
(E0611)
[error] 34-34: Cannot import 'mo_net.train.trainer.trainer' due to 'invalid syntax (mo_net.train.trainer.trainer, line 42)'
(E0001)
[error] 34-39: No name 'trainer' in module 'mo_net.train.trainer'
(E0611)
[refactor] 339-339: Too many arguments (10/5)
(R0913)
[refactor] 339-339: Too many positional arguments (10/5)
(R0917)
[refactor] 339-339: Too many local variables (29/15)
(R0914)
[refactor] 452-452: Too many local variables (16/15)
(R0914)
[refactor] 462-462: Consider using 'with' for resource-allocating operations
(R1732)
[refactor] 531-531: Too many local variables (16/15)
(R0914)
[refactor] 541-541: Consider using 'with' for resource-allocating operations
(R1732)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: typecheck
🔇 Additional comments (8)
mo_net/samples/cbow.py (2)
45-87: LGTM! Well-designed regularizer following established patterns.

The `EmbeddingWeightDecayRegulariser` correctly implements L2 weight decay for embeddings, following the same pattern as `WeightDecayRegulariser` for linear layers. The implementation properly modifies gradients and computes regularization loss.
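In essence, L2 weight decay adds a lambda * W term to the parameter gradient, corresponding to a (lambda / 2) * ||W||^2 contribution to the loss. A minimal standalone sketch of that arithmetic (not the library's actual regulariser API):

```python
import numpy as np


def apply_weight_decay(
    grad: np.ndarray, weights: np.ndarray, lambda_: float
) -> tuple[np.ndarray, float]:
    """L2 weight decay: d/dW of (lambda/2) * ||W||^2 is lambda * W."""
    decayed_grad = grad + lambda_ * weights
    loss_contribution = 0.5 * lambda_ * float(np.sum(weights**2))
    return decayed_grad, loss_contribution
```

Applying this to the embedding matrix rather than the output linear layer, as the commit message describes, shrinks the learned word vectors toward the origin.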
177-197: Good implementation with proper strict checking.

The `get_training_set` function correctly uses `strict=True` for the zip operation, addressing the static analysis concern from past reviews.

mo_net/train/run.py (1)
57-57: LGTM! Correctly implements the updated backend interface.

The addition of the `name=self._name` parameter properly aligns with the updated backend `start_run` method signatures and ensures run names are persisted in the logging backend.

mo_net/train/backends/models.py (1)
79-86: LGTM! Improved API design with explicit naming.

The change from implicitly deriving the name from the seed to requiring an explicit `name` parameter improves clarity and allows for more meaningful run identification. This aligns well with the broader changes to support explicit run naming throughout the system.

mo_net/train/backends/log.py (4)
20-26: LGTM! Consistent protocol update for explicit run naming.

The `LoggingBackend` protocol correctly adds the `name` parameter to the `start_run` method signature, establishing the contract for all backend implementations.
70-77: LGTM! Proper parameter handling in CsvBackend.

The `CsvBackend` correctly accepts the new `name` parameter and appropriately marks it as unused since CSV logging doesn't utilize run names.
133-149: LGTM! Correct integration with database model.

The `SqliteBackend` properly passes the `name` parameter to `DbRun.create()`, ensuring run names are persisted in the database for tracking and identification.
233-241: LGTM! Appropriate no-op implementation.

The `NullBackend` correctly accepts the new parameter and marks it as unused, maintaining interface consistency while providing no-op behavior as expected.
modiase
left a comment
Reviewed 13 of 20 files at r1, 4 of 5 files at r2, 7 of 7 files at r5, 2 of 2 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: complete! All files reviewed, all discussions resolved (waiting on @modiase)
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests
Chores