diff --git a/doc/modules/ROOT/images/gds-session-algorithms/kge/delta-value.svg b/doc/modules/ROOT/images/gds-session-algorithms/kge/delta-value.svg
new file mode 100644
index 000000000..37e48d333
--- /dev/null
+++ b/doc/modules/ROOT/images/gds-session-algorithms/kge/delta-value.svg
@@ -0,0 +1,57 @@
+
+
+
\ No newline at end of file
diff --git a/doc/modules/ROOT/images/gds-session-algorithms/kge/distmult-formula.svg b/doc/modules/ROOT/images/gds-session-algorithms/kge/distmult-formula.svg
new file mode 100644
index 000000000..2d6dda3ae
--- /dev/null
+++ b/doc/modules/ROOT/images/gds-session-algorithms/kge/distmult-formula.svg
@@ -0,0 +1,56 @@
+
+
+
\ No newline at end of file
diff --git a/doc/modules/ROOT/images/gds-session-algorithms/kge/mrl.svg b/doc/modules/ROOT/images/gds-session-algorithms/kge/mrl.svg
new file mode 100644
index 000000000..74e63ba13
--- /dev/null
+++ b/doc/modules/ROOT/images/gds-session-algorithms/kge/mrl.svg
@@ -0,0 +1,35 @@
+
+
+
\ No newline at end of file
diff --git a/doc/modules/ROOT/images/gds-session-algorithms/kge/transe-formula.svg b/doc/modules/ROOT/images/gds-session-algorithms/kge/transe-formula.svg
new file mode 100644
index 000000000..6f641a048
--- /dev/null
+++ b/doc/modules/ROOT/images/gds-session-algorithms/kge/transe-formula.svg
@@ -0,0 +1,46 @@
+
+
+
\ No newline at end of file
diff --git a/doc/modules/ROOT/pages/algorithms.adoc b/doc/modules/ROOT/pages/algorithms.adoc
index 670bca25f..061ecbb85 100644
--- a/doc/modules/ROOT/pages/algorithms.adoc
+++ b/doc/modules/ROOT/pages/algorithms.adoc
@@ -72,6 +72,15 @@ assert fastrp_result["nodePropertiesWritten"] == G.node_count()
Some algorithms deviate from the standard syntactic structure.
We describe how to use them in the Python client in the sections below.
+// // To be uncommented when the algorithms are available.
+// [NOTE]
+// ====
+// Some algorithms are available for xref:gds-session.adoc[GDS Sessions] only.
+// See the list below:
+//
+// include::ROOT:partial$gds-session-algorithms-algorithms.adoc[]
+// ====
+
[[algorithms-execution-mode]]
== Execution modes
diff --git a/doc/modules/ROOT/pages/gds-session-algorithms/knowledge-graph-embeddings.adoc b/doc/modules/ROOT/pages/gds-session-algorithms/knowledge-graph-embeddings.adoc
new file mode 100644
index 000000000..476d2f486
--- /dev/null
+++ b/doc/modules/ROOT/pages/gds-session-algorithms/knowledge-graph-embeddings.adoc
@@ -0,0 +1,289 @@
+= Knowledge graph embeddings
+
+Knowledge Graph Embeddings (KGE) refer to a family of algorithms designed to learn low-dimensional representations of entities and relations within a knowledge graph.
+These embeddings are utilized for tasks such as link prediction, entity classification, and entity clustering.
+
+This chapter provides an overview of the shallow embedding algorithms available in GDS Sessions: `TransE` and `DistMult`.
+
+* `TransE`
+** Bordes, Antoine, et al. "Translating embeddings for modeling multi-relational data." Advances in neural information processing systems 26 (2013).
+* `DistMult`
+** Yang, Bishan, et al. "Embedding entities and relations for learning and inference in knowledge bases." arXiv preprint arXiv:1412.6575 (2014).
+
+A knowledge graph is a directed, multi-relational graph.
+It consists of triples `(head, relation, tail)` where `head` and `tail` are entities and `relation` is the relationship between them.
+
+The shallow KGE models presented here are _transductive_ algorithms that compute node and relationship-type embeddings (low-dimensional representations) for knowledge graphs.
+Shallow means that the entity and relation encoders are simple embedding-matrix lookups.
+
+A trained KGE model returns a score for a triple, which can be used, for example, to predict the missing tail entity for a given head entity and relationship type.
+
+
+[[algorithms-embeddings-kge-considerations]]
+== Scoring functions
+
+For KGE algorithms, the scoring function computes the score of a triple `(head, relation, tail)`.
+This score is used to compute the loss during training and, during prediction, to rank candidate tail entities for a given head and relationship type.
+
+Let the embeddings of the head, relation, and tail entities be denoted by `h`, `r`, and `t`, respectively.
+The scoring function takes `h`, `r`, and `t` as input and computes a scalar score using the formula of the chosen model.
+
+
+=== `TransE`
+
+`TransE` is a translational distance interaction model.
+It represents the relationship between the head and tail entities as a vector translation in the embedding space.
+This method is effective for modeling antisymmetry, inversion, and composition relation patterns.
+
+The score of a triple `(head, relation, tail)` is computed from the embeddings `h`, `r`, and `t` as follows:
+
+image::gds-session-algorithms/kge/transe-formula.svg[width=280]
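+
+As a reference, a minimal NumPy sketch of the score computation; the negative-distance sign convention, which makes higher scores more plausible, is an assumption here:
+
+[source, python, role=no-test]
+----
+import numpy as np
+
+def transe_score(h, r, t, p_norm=1.0):
+    # TransE treats the relation as a translation: h + r should land close to t.
+    # The score is the negative Lp distance, so higher scores mean more plausible triples.
+    return -np.linalg.norm(h + r - t, ord=p_norm)
+----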
+
+
+=== `DistMult`
+
+`DistMult` is a semantic matching interaction model.
+This method scores a triple by taking the trilinear dot product of the embeddings of the head entity, relationship type, and tail entity.
+Since this product is symmetric in the head and tail entities, the method is best suited to modeling symmetric relations.
+
+The score of a triple `(head, relation, tail)` is computed from the embeddings `h`, `r`, and `t` as follows:
+
+image::gds-session-algorithms/kge/distmult-formula.svg[width=400]
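+
+A matching NumPy sketch of the DistMult score (illustrative only):
+
+[source, python, role=no-test]
+----
+import numpy as np
+
+def distmult_score(h, r, t):
+    # DistMult scores a triple with a trilinear dot product: the element-wise
+    # product of h, r, and t, summed over all dimensions. Swapping h and t
+    # leaves the score unchanged, which is why the model suits symmetric relations.
+    return np.sum(h * r * t)
+----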
+
+
+== Considerations
+
+To train KGE models effectively, several aspects need to be taken into account, including the choice of loss function, negative sampling method, and optimization strategy.
+
+
+=== Stochastic Local Closed World Assumption (sLCWA)
+
+Observed triples in the knowledge graph are considered true.
+Unobserved triples can be treated differently, depending on the assumption made:
+
+* The Open World Assumption (OWA) assumes that all unobserved facts are unknown.
+
+* The Closed World Assumption (CWA) assumes that all unobserved facts are false.
+
+* The Local Closed World Assumption (LCWA) assumes that all observed facts are true.
+All corrupted triples, which are generated by replacing the head or tail entity of a positive triple, are false.
+
+* The Stochastic Local Closed World Assumption (sLCWA) assumes that all observed facts are true.
+A random sample of corrupted triples is treated as false, even though some of them may in fact be true.
+The number of corrupted triples generated per true triple is set by the `negative_sampling_size` parameter.
+
+In the GDS implementation, knowledge graph embedding models are trained under the sLCWA.
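+
+To make the sampling concrete, the following sketch shows how corrupted triples could be generated under the sLCWA; the helper is hypothetical and not part of the client API:
+
+[source, python, role=no-test]
+----
+import random
+
+def corrupt(triple, entities, negative_sampling_size=1):
+    # Each observed (head, relation, tail) triple is paired with
+    # negative_sampling_size corrupted triples, built by replacing either the
+    # head or the tail with a uniformly sampled entity. Some corruptions may
+    # accidentally be true facts; the sLCWA still treats them as negatives.
+    head, relation, tail = triple
+    negatives = []
+    for _ in range(negative_sampling_size):
+        entity = random.choice(entities)
+        if random.random() < 0.5:
+            negatives.append((entity, relation, tail))
+        else:
+            negatives.append((head, relation, entity))
+    return negatives
+----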
+
+
+=== Loss function
+
+The loss function is crucial for guiding the training process of KGE models.
+It determines how the difference between predicted and actual values is calculated and minimized.
+Several loss functions can be used for training KGE models; they are described in the sections below.
+
+
+==== Margin ranking loss
+
+Margin ranking loss is a pairwise loss function based on the difference between the scores of a positive triple and a negative triple.
+When the negative sampling size is greater than one, the loss is computed for the positive triple paired with each of its negative triples, and the per-pair losses are averaged.
+
+image::gds-session-algorithms/kge/mrl.svg[width=300]
+image::gds-session-algorithms/kge/delta-value.svg[width=400]
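+
+A plain-Python sketch of this pairwise loss, assuming higher scores indicate more plausible triples and `margin` matches the entry in `loss_function_kwargs`:
+
+[source, python, role=no-test]
+----
+def margin_ranking_loss(positive_score, negative_scores, margin=1.0):
+    # Each negative triple should be outscored by the positive triple by at
+    # least `margin`; any shortfall contributes to the loss. With several
+    # negatives per positive, the per-pair losses are averaged.
+    losses = [max(0.0, margin - (positive_score - neg)) for neg in negative_scores]
+    return sum(losses) / len(losses)
+----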
+
+
+==== Negative sampling self-adversarial loss
+
+Negative sampling self-adversarial lossfootnote:[Sun, Zhiqing, et al. "RotatE: Knowledge graph embedding by relational rotation in complex space." arXiv preprint arXiv:1902.10197 (2019).] is a setwise loss function based on the difference between the score of a positive triple and the scores of a set of negative triples, where higher-scoring negatives are weighted more strongly.
+The `adversarial_temperature` and `margin` parameters can be set via `loss_function_kwargs`.
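+
+A sketch of this setwise loss, again assuming higher scores are better; the formula follows the RotatE paper, and the parameter names mirror `loss_function_kwargs`:
+
+[source, python, role=no-test]
+----
+import math
+
+def self_adversarial_loss(positive_score, negative_scores,
+                          margin=1.0, adversarial_temperature=1.0):
+    def sigmoid(x):
+        return 1.0 / (1.0 + math.exp(-x))
+
+    # Negatives are weighted by a softmax over their own scores, scaled by the
+    # adversarial temperature, so harder (higher-scoring) negatives count more.
+    weights = [math.exp(adversarial_temperature * s) for s in negative_scores]
+    total = sum(weights)
+    positive_term = -math.log(sigmoid(margin + positive_score))
+    negative_term = -sum(
+        (w / total) * math.log(sigmoid(-s - margin))
+        for w, s in zip(weights, negative_scores)
+    )
+    return positive_term + negative_term
+----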
+
+
+=== Optimizer
+
+Several optimizers are available, such as `Adam`, `SGD`, and `Adagrad`.
+Their parameters are aligned with the corresponding PyTorch optimizer parameters.
+To use a non-default optimizer, specify the optimizer class name as a string in the `optimizer` parameter.
+All optimizer parameters except `params` can be passed via `optimizer_kwargs`.
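+
+For example, a call that switches to SGD might look like this; the lowercase name mirrors the `"adam"` default shown below, and the keyword values are illustrative keys of the matching PyTorch optimizer:
+
+[source, python, role=no-test]
+----
+gds.kge.model.train(
+    G,
+    num_epochs=10,
+    embedding_dimension=100,
+    optimizer="sgd",
+    optimizer_kwargs={"lr": 0.05, "momentum": 0.9},
+)
+----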
+
+
+=== Negative sampling
+
+The loss function requires negative samples to compute the loss.
+The number of negative samples per positive sample is controlled by the `negative_sampling_size` parameter.
+When `use_node_type_aware_sampler` is set to `True`, negative nodes are sampled only among nodes with the same label as the corresponding positive node.
+In either case, the negative nodes are drawn uniformly at random from the graph.
+
+
+=== Learning rate scheduler
+
+Any PyTorch learning rate scheduler can be used for training the model.
+To use a non-default learning rate scheduler, specify the scheduler class name as a string in the `lr_scheduler` parameter.
+All scheduler parameters except `optimizer` can be passed via `lr_scheduler_kwargs`.
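+
+For example, a sketch using PyTorch's `StepLR` scheduler; the keyword values are illustrative:
+
+[source, python, role=no-test]
+----
+gds.kge.model.train(
+    G,
+    num_epochs=10,
+    embedding_dimension=100,
+    lr_scheduler="StepLR",
+    lr_scheduler_kwargs={"step_size": 100, "gamma": 0.5},
+)
+----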
+
+
+=== Inner normalisation
+
+In the original `TransE` paperfootnote:[Bordes, Antoine, et al. "Translating embeddings for modeling multi-relational data." Advances in neural information processing systems 26 (2013).]
+in `Algorithm 1`, line 5, the entity embeddings are normalized to have an `Lp` norm of 1.
+The value of `p` is set by the `p_norm` parameter.
+For some datasets, this normalization might not be beneficial.
+To disable this normalization, set `inner_norm` to `False`.
+
+
+=== Filtered metrics
+
+When the model is evaluated on the test set, scores are computed for all possible triples that share a head (or tail) entity and relationship type with the test triple.
+The rank of the test triple among these candidates is used to compute metrics such as Mean Rank, Mean Reciprocal Rank, and Hits@k.
+
+When `filtered_metrics` is set to `False`, the test triple is ranked among all possible candidate triples.
+
+When `filtered_metrics` is set to `True`, the test triple is ranked only among candidate triples that are not present in the training set.
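+
+Conceptually, the filtered rank could be computed as in this sketch (hypothetical helper, not part of the client API):
+
+[source, python, role=no-test]
+----
+def filtered_rank(test_score, candidates, training_triples):
+    # candidates: iterable of (triple, score) pairs sharing a head (or tail)
+    # entity and relationship type with the test triple. Known training
+    # triples are skipped, so they cannot push the test triple down the ranking.
+    better = sum(
+        1
+        for triple, score in candidates
+        if triple not in training_triples and score > test_score
+    )
+    return better + 1
+----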
+
+[[algorithms-embeddings-kge-syntax]]
+== Syntax
+
+[source, python, role=no-test]
+----
+gds.kge.model.train(
+    G,
+    num_epochs=10,
+    embedding_dimension=100,
+)
+----
+
+.Parameters
+[cols="1m,1m,1m,1", options="header"]
+|====
+| Parameter | Type | Default value | Description
+
+| num_epochs
+| int
+| N/A
+| Number of epochs for training (must be greater than 0)
+
+| embedding_dimension
+| int
+| N/A
+| Dimensionality of the embeddings (must be greater than 0)
+
+| epochs_per_checkpoint
+| int
+| max(num_epochs / 10, 1)
+| Number of epochs between checkpoints (must be greater than or equal to 0)
+
+| load_from_checkpoint
+| Optional[tuple[str, int]]
+| None
+| Checkpoint to load from, specified as a tuple (path, epoch)
+
+| split_ratios
+| dict[str, float]
+| {TRAIN=0.8, TEST=0.2}
+| Ratios for splitting the dataset into training and test sets.
+When the sum of the ratios is less than 1.0, the remaining examples are used for validation.
+The validation set ratio can be set explicitly with the key `VALID`.
+When all three keys are present, the sum of values must be equal to 1.0.
+
+| scoring_function
+| str
+| "transe"
+| Function used to score embeddings of triples
+
+| p_norm
+| float
+| 1.0
+| Norm to use in TransE scoring function
+
+| batch_size
+| int
+| 512
+| Size of the training batch (must be greater than 0)
+
+| test_batch_size
+| int
+| 512
+| Size of the test batch (must be greater than 0)
+
+| optimizer
+| str
+| "adam"
+| Optimizer to use for training
+
+| optimizer_kwargs
+| dict[str, Any]
+| {lr=0.01, weight_decay=0.0005}
+| Arguments for the optimizer
+
+| lr_scheduler
+| str
+| ConstantLR
+| Learning rate scheduler
+
+| lr_scheduler_kwargs
+| dict[str, Any]
+| {factor=1, total_iters=1000}
+| Additional arguments for the learning rate scheduler
+
+| loss_function
+| str
+| MarginRanking
+| Loss function to use for training
+
+| loss_function_kwargs
+| dict[str, Any]
+| {margin=1.0, adversarial_temperature=1.0, gamma=20.0}
+| Additional arguments for the loss function
+
+| negative_sampling_size
+| int
+| 1
+| Number of negative samples per positive sample
+
+| use_node_type_aware_sampler
+| bool
+| False
+| Whether to sample negative nodes with the same label as the corresponding positive node
+
+| k_value
+| int
+| 10
+| Value of k used in Hits@k evaluation metric
+
+| do_validation
+| bool
+| True
+| Whether to perform validation
+
+| do_test
+| bool
+| True
+| Whether to perform testing
+
+| filtered_metrics
+| bool
+| False
+| Whether to use filtered metrics during evaluation, see <<_filtered_metrics, filtered metrics>>
+
+| epochs_per_val
+| int
+| 50
+| Number of epochs between validations (must be greater than or equal to 0)
+
+| inner_norm
+| bool
+| True
+| Whether to apply normalization to embeddings, see <<_inner_normalisation, inner normalization>>
+
+| init_bound
+| Optional[float]
+| None
+| Bound of the range `[-init_bound, init_bound]` of the uniform distribution used to initialize the embeddings.
+Xavier initializationfootnote:[Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks." Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:249-256 (2010).] is used if `None`.
+|====
+
+
+[[algorithms-embeddings-kge-examples]]
+== Examples
+The following example sketches a minimal training call, assuming a graph `G` has already been projected into the GDS Session; all parameter values are illustrative.
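+
+[source, python, role=no-test]
+----
+# Train TransE embeddings on the projected graph G; values are illustrative.
+result = gds.kge.model.train(
+    G,
+    num_epochs=100,
+    embedding_dimension=128,
+    scoring_function="transe",
+    loss_function="MarginRanking",
+    negative_sampling_size=10,
+)
+----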
\ No newline at end of file
diff --git a/doc/modules/ROOT/partials/python-runtime-algorithms.adoc b/doc/modules/ROOT/partials/python-runtime-algorithms.adoc
new file mode 100644
index 000000000..48f39c094
--- /dev/null
+++ b/doc/modules/ROOT/partials/python-runtime-algorithms.adoc
@@ -0,0 +1 @@
+* xref:gds-session-algorithms/knowledge-graph-embeddings.adoc[Knowledge Graph Embeddings]
\ No newline at end of file