Feat: Updated docs #5
Pull request overview
This PR expands Neutro well beyond documentation updates by adding a Keras-style Functional API, symbolic graph execution, MIMO training/evaluation support, shared-layer handling, new merge layers, related tests/examples, and a broad documentation refresh.
Changes:
- Adds `KerasTensor`/`Node`, `Input`, Functional `Model(inputs, outputs)`, graph forward/backward, and MIMO fit/evaluate support.
- Adds new merge layers and shape inference updates across several layers.
- Adds Functional API examples/tests and rewrites/adds many documentation pages.
Reviewed changes
Copilot reviewed 50 out of 51 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| `pyproject.toml` | Bumps package/Python/dependency versions. |
| `neutro/models/base_model.py` | Adds Functional API graph execution, MIMO training/evaluation, nested model support, and summary updates. |
| `neutro/models/vision/unet.py` | Adds a minimal build override. |
| `neutro/layers/base.py` | Adds symbolic/eager dispatch and inbound node tracking. |
| `neutro/engine/node.py` | Adds symbolic tensor and graph node primitives. |
| `neutro/engine/__init__.py` | Engine package placeholder. |
| `neutro/layers/core/input_layer.py` | Adds `Input()` and `InputLayer`. |
| `neutro/layers/core/merging.py` | Adds/updates merge layers and shape inference. |
| `neutro/layers/core/dense.py` | Adds debug comment in build path. |
| `neutro/layers/core/moe.py` | Adds output shape inference. |
| `neutro/layers/core/reparameterization.py` | Adds output shape inference. |
| `neutro/layers/convolutional/conv1d.py` | Adds `**kwargs` passthrough. |
| `neutro/layers/normalization/layernorm.py` | Adds `**kwargs` passthrough. |
| `neutro/layers/recurrent/simple_rnn.py` | Adds `**kwargs` passthrough. |
| `neutro/layers/embedding/time_embedding.py` | Adds output shape inference. |
| `neutro/layers/attention/flash_attention.py` | Adds `**kwargs` passthrough and output shape inference. |
| `neutro/layers/attention/mla.py` | Adds output shape inference. |
| `neutro/layers/transformer/transformer_block.py` | Adds output shape inference. |
| `neutro/layers/__init__.py` | Exports `Input` and new merge layers. |
| `neutro/__init__.py` | Exposes `Input` at package top level. |
| `tests/test_functional_api.py` | Adds Functional API tests. |
| `tests/test_mimo_fit.py` | Adds MIMO fit/evaluate tests. |
| `tests/test_shared_transformer_block.py` | Adds shared `TransformerBlock` tests. |
| `examples/README.md` | Documents the new Functional MNIST example. |
| `examples/mnist_functional_residual.py` | Adds Functional API residual MNIST example. |
| `docs/activations/activations.md` | Adds activation documentation. |
| `docs/callbacks/callbacks.md` | Adds callback documentation. |
| `docs/data/data.md` | Adds data loader documentation. |
| `docs/engine/node.md` | Documents symbolic graph internals. |
| `docs/initializers/initializers.md` | Adds initializer documentation. |
| `docs/layers/attention/kv_cache.md` | Adds KV cache documentation. |
| `docs/layers/attention/mla.md` | Adds MLA documentation. |
| `docs/layers/base.md` | Documents base layer lifecycle and symbolic calls. |
| `docs/layers/core/core_utility_layers.md` | Adds core utility layer docs. |
| `docs/layers/core/dense.md` | Adds Dense layer docs. |
| `docs/layers/core/input_layer.md` | Documents `Input`/`InputLayer`. |
| `docs/layers/core/merging.md` | Documents merge layers. |
| `docs/layers/embedding/embedding.md` | Adds embedding layer docs. |
| `docs/layers/normalization/normalization.md` | Adds normalization docs. |
| `docs/layers/pooling/pooling.md` | Adds pooling docs. |
| `docs/layers/recurrent/recurrent.md` | Adds recurrent layer docs. |
| `docs/layers/transformer/transformer_block.md` | Adds `TransformerBlock` docs. |
| `docs/losses/losses.md` | Adds loss documentation. |
| `docs/metrics/metrics.md` | Adds metric documentation. |
| `docs/models/language_models.md` | Adds language model docs. |
| `docs/models/model.md` | Expands Model/Sequential/Functional API docs. |
| `docs/models/vision_models.md` | Adds vision model docs. |
| `docs/optimizers/optimizers.md` | Adds optimizer documentation. |
| `docs/preprocessing/preprocessing.md` | Adds preprocessing docs. |
| `docs/tokenizers/tokenizers.md` | Adds tokenizer docs. |
| `docs/utils/utils.md` | Adds utility docs. |

Used by `Conv2D` and `MaxPooling2D` for efficient forward/backward computation.

## rope_utils — `neutro/utils/rope_utils.py`
## Sequence Preprocessing — `neutro/preprocessing/sequence.py`

- Output range: (0, 1). Used for binary classification or as a gating mechanism (LSTM, GRU).
- **Vanishing gradient**: for inputs of large magnitude (very positive or very negative), the gradient approaches 0.
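
The vanishing-gradient bullet above can be checked numerically. The following is a minimal NumPy sketch, not Neutro's own implementation; the function names `sigmoid` and `sigmoid_grad` are illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid; the output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_grad(x):
    """Derivative s(x) * (1 - s(x)); it peaks at 0.25 for x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Evaluate the gradient at a very negative, a zero, and a very positive input.
xs = np.array([-10.0, 0.0, 10.0])
grads = sigmoid_grad(xs)
print(grads)  # the two outer entries are several orders of magnitude below 0.25
```

Because the derivative never exceeds 0.25, stacking many sigmoid layers multiplies small factors together, which is why saturation at either end matters for deep networks.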

### Tanh — `neutro/activations/tanh.py`

- Output: probability distribution over classes.
- **Jacobian-Vector Product** (`gradient_fast`, line 18): computes $y \odot \left(g - \sum_j y_j g_j\right)$, where $g$ is `grad_output`, without building the full $N \times N$ Jacobian.
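
The shortcut in the last bullet can be compared against the explicit Jacobian. This sketch assumes the activation being described is softmax (the formula matches the softmax Jacobian); the helper names below are hypothetical and are not taken from `gradient_fast` itself.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def backward_fast(y, grad_output):
    """Jacobian-vector product without materialising the N x N Jacobian."""
    return y * (grad_output - np.sum(y * grad_output))


def backward_full(y, grad_output):
    """Reference version: build J = diag(y) - y y^T explicitly, then multiply."""
    jac = np.diag(y) - np.outer(y, y)
    return jac @ grad_output


rng = np.random.default_rng(0)
y = softmax(rng.normal(size=5))
g = rng.normal(size=5)

fast = backward_fast(y, g)
full = backward_full(y, g)
print(np.allclose(fast, full))
```

The fast path costs O(N) time and memory per sample, whereas the explicit Jacobian costs O(N^2), which is the motivation the bullet gives for avoiding it.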

### SiLU — `neutro/activations/silu.py` (Sigmoid Linear Unit)

Used before the final Dense layer in CNNs to replace Flatten (fewer parameters, lower risk of overfitting).
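
To make the parameter saving concrete, here is a small NumPy sketch. It assumes the layer being described is a global average pooling layer operating on channels-last feature maps of shape `(batch, H, W, C)`; the shapes and the 10-unit Dense layer are made-up illustration values.

```python
import numpy as np

# Hypothetical feature maps from the last convolutional block: (batch, H, W, C).
features = np.random.default_rng(0).normal(size=(2, 7, 7, 64))

# Flatten keeps every spatial position as a separate input feature.
flat = features.reshape(features.shape[0], -1)

# Global average pooling keeps one value per channel.
pooled = features.mean(axis=(1, 2))

# Parameters (weights + biases) of a following Dense layer with 10 units.
units = 10
dense_params_flat = flat.shape[1] * units + units
dense_params_pooled = pooled.shape[1] * units + units

print(flat.shape, pooled.shape)
print(dense_params_flat, dense_params_pooled)
```

With these shapes the Dense layer after Flatten needs 31,370 parameters, while the one after pooling needs 650. Fewer parameters reduce the capacity to overfit but do not rule it out.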

### UpSampling2D — `neutro/layers/pooling/upsampling2d.py`

W[i, j] = orig_val

num_grad = (loss_plus - loss_minus) / (2 * eps)
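
The two fragments above look like parts of a central-difference gradient check. A self-contained sketch of such a check follows; the surrounding loop, the function name `numerical_gradient`, and the quadratic test loss are assumptions made for illustration, not code from the PR.

```python
import numpy as np


def numerical_gradient(loss_fn, W, eps=1e-5):
    """Central-difference estimate of d loss / d W, one entry at a time."""
    num_grad = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            orig_val = W[i, j]
            W[i, j] = orig_val + eps
            loss_plus = loss_fn(W)
            W[i, j] = orig_val - eps
            loss_minus = loss_fn(W)
            W[i, j] = orig_val  # restore the weight before moving on
            num_grad[i, j] = (loss_plus - loss_minus) / (2 * eps)
    return num_grad


rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
x = rng.normal(size=(4, 3))
W_before = W.copy()


def loss_fn(weights):
    """Toy loss 0.5 * ||x @ weights||^2 with a known analytic gradient."""
    return 0.5 * np.sum((x @ weights) ** 2)


analytic = x.T @ (x @ W)
numeric = numerical_gradient(loss_fn, W)
max_err = np.max(np.abs(analytic - numeric))
print(max_err)
```

Restoring `W[i, j]` after each perturbation matters: if it is skipped, every later entry is measured against a shifted weight matrix and the check reports spurious errors.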

class Minimum(Layer):
    """
    Layer that computes the minimum (element-wise) of a list of inputs.
    """

```python
import numpy as np


class DataLoader:
    def __init__(self, dataset, batch_size=32, shuffle=True):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        # Visit every sample index once per pass, optionally in random order.
        indices = np.arange(len(self.dataset))
        if self.shuffle:
            np.random.shuffle(indices)
        # Slice the index array into batches; the last batch may be smaller.
        for i in range(0, len(indices), self.batch_size):
            batch_idx = indices[i:i + self.batch_size]
            yield self.dataset[batch_idx]
```
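
A short usage sketch for the loader quoted above. The class is repeated so the example runs on its own, and the NumPy array standing in for a dataset is an assumption; any object that supports `len()` and indexing with an integer array would work the same way.

```python
import numpy as np


class DataLoader:
    def __init__(self, dataset, batch_size=32, shuffle=True):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        indices = np.arange(len(self.dataset))
        if self.shuffle:
            np.random.shuffle(indices)
        for i in range(0, len(indices), self.batch_size):
            batch_idx = indices[i:i + self.batch_size]
            yield self.dataset[batch_idx]


data = np.arange(10)

# Without shuffling, batches follow the original order.
ordered = [batch.tolist() for batch in DataLoader(data, batch_size=4, shuffle=False)]
print(ordered)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# With shuffling, every sample still appears exactly once per pass.
shuffled = np.concatenate(list(DataLoader(data, batch_size=4, shuffle=True)))
print(sorted(shuffled.tolist()) == data.tolist())
```

Note that `self.dataset[batch_idx]` indexes with an integer array, so a plain Python list would raise a `TypeError` here; converting the data to a NumPy array first avoids that.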

Unique layers are collected from the nodes (line 60): `if node.layer not in self.layers`.
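
The membership test quoted above is what keeps a shared layer from being registered twice. The toy sketch below shows the idea; `Layer`, `Node`, and `collect_unique_layers` are hypothetical stand-ins and do not reproduce Neutro's classes.

```python
class Layer:
    """Hypothetical stand-in for a layer object."""

    def __init__(self, name):
        self.name = name


class Node:
    """Hypothetical stand-in for one call of a layer in the graph."""

    def __init__(self, layer):
        self.layer = layer


def collect_unique_layers(nodes):
    """Walk the nodes in order and keep each layer object once."""
    layers = []
    for node in nodes:
        if node.layer not in layers:  # same membership test as in the quoted line
            layers.append(node.layer)
    return layers


shared = Layer("shared_dense")
head = Layer("head")

# The shared layer is called twice, so it owns two nodes.
nodes = [Node(shared), Node(shared), Node(head)]
layers = collect_unique_layers(nodes)
names = [layer.name for layer in layers]
print(names)  # ['shared_dense', 'head']
```

A list membership test is linear in the number of layers collected so far, which is fine for small graphs; a set of `id(layer)` values alongside the list would be one way to make it constant-time if that ever matters.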

### Forward Pass — line 203

- `node.state` is captured **after** `forward` runs, ensuring it stores the state from this specific call (not stale data from a previous call).
- The captured state uses `_capture_layer_state`, which recurses into sublayers.
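
The reason for capturing state per node shows up as soon as one layer is called twice. The sketch below is a toy illustration under stated assumptions: `ScaleLayer` and this `Node` class are hypothetical, and `copy.deepcopy` of the layer's attributes merely stands in for whatever `_capture_layer_state` actually does.

```python
import copy


class ScaleLayer:
    """Toy layer that caches its last input, as layers often do for backward."""

    def __init__(self, factor):
        self.factor = factor
        self.cache = None

    def forward(self, x):
        self.cache = x
        return x * self.factor


class Node:
    """Hypothetical node: one call of a layer plus a snapshot of its state."""

    def __init__(self, layer):
        self.layer = layer
        self.state = None

    def run(self, x):
        out = self.layer.forward(x)
        # Snapshot taken after forward, so it reflects this call's cache.
        self.state = copy.deepcopy(vars(self.layer))
        return out


shared = ScaleLayer(2.0)
first, second = Node(shared), Node(shared)

out_first = first.run(3.0)
out_second = second.run(5.0)

# The layer itself only remembers the latest call; each node keeps its own copy.
print(shared.cache, first.state["cache"], second.state["cache"])
```

Had the snapshot been taken before `forward`, `second.state` would hold the cache left over from the first call, which is the stale-data case the first bullet warns about.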

### Backward Pass — line 297

@copilot Fix the review comments.
Addressed in commit