31 changes: 30 additions & 1 deletion next.config.mjs
@@ -3405,7 +3405,36 @@ export default withNextra({
destination: '/ai-ecosystem#graphchat',
permanent: true
},

{
source: '/getting-started/build-memgraph-from-source#obtaining-the-source-code',
destination: '/getting-started/build-memgraph-from-source#obtain-the-source-code',
permanent: true
},
{
source: '/getting-started/build-memgraph-from-source#downloading-the-dependencies',
destination: '/getting-started/build-memgraph-from-source#download-dependencies-required-for-methods-1--2',
permanent: true
},
{
source: '/getting-started/build-memgraph-from-source#compiling',
destination: '/getting-started/build-memgraph-from-source#toolchain-installation-required-for-methods-1--2',
permanent: true
},
{
source: '/getting-started/build-memgraph-from-source#toolchain-installation-procedure',
destination: '/getting-started/build-memgraph-from-source#toolchain-installation-required-for-methods-1--2',
permanent: true
},
{
source: '/getting-started/build-memgraph-from-source#installing-memgraph-dependencies',
destination: '/getting-started/build-memgraph-from-source#download-dependencies-required-for-methods-1--2',
permanent: true
},
{
source: '/getting-started/build-memgraph-from-source#running-memgraph',
destination: '/getting-started/build-memgraph-from-source#run-memgraph',
permanent: true
},


// END: NEW MEMGRAPH LAB REDIRECTS
1 change: 1 addition & 0 deletions pages/advanced-algorithms/available-algorithms/_meta.ts
@@ -19,6 +19,7 @@ export default {
"degree_centrality": "degree_centrality",
"distance_calculator": "distance_calculator",
"elasticsearch_synchronization": "elasticsearch_synchronization",
"embeddings": "embeddings",
"export_util": "export_util",
"gnn_link_prediction": "gnn_link_prediction",
"gnn_node_classification": "gnn_node_classification",
135 changes: 135 additions & 0 deletions pages/advanced-algorithms/available-algorithms/embeddings.mdx
@@ -0,0 +1,135 @@
---
title: embeddings
description: Calculate sentence embeddings on node strings using PyTorch.
---

# embeddings

import { Cards } from 'nextra/components'
import GitHub from '/components/icons/GitHub'

The `embeddings` module provides tools for calculating sentence embeddings on node string properties using PyTorch.

<Cards>
<Cards.Card
icon={<GitHub />}
title="Source code"
href="https://github.com/memgraph/mage/blob/main/python/embeddings.py"
/>
</Cards>

| Trait | Value |
| ------------------- | ------------------- |
| **Module type** | algorithm |
| **Implementation** | Python |
| **Parallelism** | parallel |


## Procedures

### `compute()`

The procedure computes sentence embeddings from the string properties of nodes. The
embeddings are stored as a property on the nodes in the graph.

{<h4 className="custom-header"> Input: </h4>}

- `input_nodes: List[Vertex]` (**OPTIONAL**) ➡ The list of nodes to compute the embeddings for. If not provided, the embeddings are computed for all nodes in the graph.
- `embedding_property: string` ➡ The name of the property to store the embeddings in. This property is `embedding` by default.
- `excluded_properties: List[string]` ➡ The list of properties to exclude from the embeddings computation. This list is empty by default.
- `model_name: string` ➡ The name of the model to use for the embeddings computation. By default, this module uses the `all-MiniLM-L6-v2` model provided by the `sentence-transformers` library.
- `batch_size: int` ➡ The batch size to use for the embeddings computation. This is set to `2000` by default.
- `chunk_size: int` ➡ The number of batches per chunk. Chunks are used when computing embeddings across multiple GPUs, which requires spawning multiple processes; each spawned process computes the embeddings for a single chunk. This is set to `48` by default.
- `device: string|int|List[string|int]` ➡ The device to use for the embeddings computation. This can be any of the following:
- `"cpu"` - Use CPU for computation.
- `"cuda"` or `"all"` - Use all available CUDA devices for computation.
- `"cuda:id"` - Use a specific CUDA device for computation.
- `id` - Use a specific device for computation.
- `[id1, id2, ...]` - Use a list of device ids for computation.
- `["cuda:id1", "cuda:id2", ...]` - Use a list of CUDA devices for computation.
  By default, the first device (`0`) is used.
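For illustration only, here is a hypothetical sketch (not the module's actual code) of how a node's string properties might be assembled into the text that gets embedded, honoring `excluded_properties`:

```python
def build_text(properties: dict, excluded: set) -> str:
    """Concatenate a node's string-valued properties into one sentence,
    skipping any property named in `excluded`. Keys are sorted so the
    result is deterministic."""
    parts = [
        str(value)
        for key, value in sorted(properties.items())
        if key not in excluded and isinstance(value, str)
    ]
    return " ".join(parts)

# Non-string properties (like the integer `id`) are skipped either way.
node = {"id": 1, "Title": "Stilton", "Description": "A stinky cheese from the UK"}
print(build_text(node, excluded={"id"}))
```

The exact concatenation order and filtering logic live in the module source linked above; this sketch only conveys the general idea.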

{<h4 className="custom-header"> Output: </h4>}

- `success: bool` ➡ Whether the embeddings computation was successful.

{<h4 className="custom-header"> Usage: </h4>}

To compute the embeddings across the entire graph with the default parameters, use the following query:

```cypher
CALL embeddings.compute()
YIELD success;
```

To compute the embeddings for a specific list of nodes, use the following query:


```cypher
MATCH (n)
WITH n ORDER BY id(n)
LIMIT 5
WITH collect(n) AS subset
CALL embeddings.compute(subset)
YIELD success;
```

To run the computation on specific device(s), use the following query:

```cypher
CALL embeddings.compute(
NULL,
"embedding",
NULL,
"all-MiniLM-L6-v2",
2000,
48,
"cuda:1"
)
YIELD success;
```


## Example

Create the following graph:

```cypher
CREATE (:Node {id: 1, Title: "Stilton", Description: "A stinky cheese from the UK"}),
       (:Node {id: 2, Title: "Roquefort", Description: "A blue cheese from France"}),
       (:Node {id: 3, Title: "Cheddar", Description: "A yellow cheese from the UK"}),
       (:Node {id: 4, Title: "Gouda", Description: "A Dutch cheese"}),
       (:Node {id: 5, Title: "Parmesan", Description: "An Italian cheese"}),
       (:Node {id: 6, Title: "Red Leicester", Description: "The best cheese in the world"});
```

Run the following query to compute the embeddings:

```cypher
CALL embeddings.compute()
YIELD success;

MATCH (n)
WHERE n.embedding IS NOT NULL
RETURN n.Title, n.embedding;
```

Results:

```plaintext
+---------+
| success |
+---------+
| true |
+---------+
+----------------------------------------------------------------------+----------------------------------------------------------------------+
| n.Title | n.embedding |
+----------------------------------------------------------------------+----------------------------------------------------------------------+
| "Stilton" | [-0.0485366, -0.021823, 0.0159757, 0.0376443, 0.00594089, -0.0044... |
| "Roquefort" | [-0.0252884, 0.0250485, -0.0249728, 0.0571037, 0.0386177, 0.03863... |
| "Cheddar" | [-0.0129724, -0.00756301, -0.00379329, 0.0037531, -0.0134941, 0.0... |
| "Gouda" | [0.0128716, 0.025435, -0.0288951, 0.0177759, -0.0624398, 0.043577... |
| "Parmesan" | [-0.0755439, 0.00906182, -0.010977, 0.0208911, -0.0527448, 0.0085... |
| "Red Leicester" | [-0.0244318, -0.0280038, -0.0373183, 0.0284436, -0.0277753, 0.066... |
+----------------------------------------------------------------------+----------------------------------------------------------------------+
```
8 changes: 5 additions & 3 deletions pages/advanced-algorithms/available-algorithms/map.mdx
@@ -171,16 +171,18 @@ RETURN map.from_pairs([["b", 3], ["c", "c"]]) AS map;
### `merge()`

The procedure merges two maps into one. If the same key occurs twice, the later
value will overwrite the previous one.

If null is provided as an argument, it will resolve to an empty map.

<Callout type="info">
This function is equivalent to **apoc.map.merge**.
</Callout>
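For example, the null handling described above means a `NULL` argument behaves like `{}` (the result shown assumes the documented behavior):

```cypher
RETURN map.merge({name: "Stilton"}, NULL) AS merged;
```

This is expected to return `{name: "Stilton"}`.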

{<h4 className="custom-header"> Input: </h4>}

- `first: Map` ➡ A map containing key-value pairs that need to be merged with another map.
- `second: Map` ➡ The second map containing key-value pairs that need to be merged with the key-values from the first map.
- `first: mgp.Nullable[Map]` ➡ A map containing key-value pairs that need to be merged with another map.
- `second: mgp.Nullable[Map]` ➡ The second map containing key-value pairs that need to be merged with the key-values from the first map.

{<h4 className="custom-header"> Output: </h4>}

22 changes: 17 additions & 5 deletions pages/advanced-algorithms/install-mage.mdx
@@ -33,6 +33,9 @@ The following tags are available on Docker Hub:
- `x.y` - production MAGE image
- `x.y-relwithdebinfo` - contains debugging symbols and `gdb`
- `x.y-malloc` - Memgraph compiled with `malloc` instead of `jemalloc` (x86_64 only)
- `x.y-relwithdebinfo-cuda` - Memgraph built with CUDA support* - available since version `3.6.1`.

*To run GPU-accelerated algorithms, you need to launch the container with the `--gpus all` flag.
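For example, a container launch with GPU access might look like the following (the image tag and port mappings are illustrative):

```shell
# Expose Bolt (7687) and monitoring (7444) ports; pass all GPUs through to the container.
docker run --gpus all -p 7687:7687 -p 7444:7444 \
  memgraph/memgraph-mage:3.6.1-relwithdebinfo-cuda
```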

For versions prior to `3.2`, MAGE image tags included both MAGE and Memgraph versions, e.g.

Expand Down Expand Up @@ -90,7 +93,7 @@ sudo apt-get update && sudo apt-get install -y \
git \
pkg-config \
uuid-dev \
libxmlsec1-dev xmlsec1 \
xmlsec1 \
--no-install-recommends
```

@@ -106,7 +109,7 @@ git clone --recurse-submodules https://github.com/memgraph/mage.git && cd mage

Download and install the [Memgraph Toolchain](https://memgraph.com/docs/getting-started/build-memgraph-from-source#toolchain-installation-procedure):
```bash
curl -L https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v6/toolchain-v6-binaries-ubuntu-24.04-amd64.tar.gz -o toolchain.tar.gz
curl -L https://s3-eu-west-1.amazonaws.com/deps.memgraph.io/toolchain-v7/toolchain-v7-binaries-ubuntu-24.04-amd64.tar.gz -o toolchain.tar.gz
sudo tar xzvfm toolchain.tar.gz -C /opt
```

@@ -125,16 +128,25 @@ curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="/root/.cargo/bin:${PATH}"
python3 -m pip install -r python/requirements.txt
python3 -m pip install -r cpp/memgraph/src/auth/reference_modules/requirements.txt
python3 -m pip install torch-sparse torch-cluster torch-spline-conv torch-geometric torch-scatter -f https://data.pyg.org/whl/torch-2.3.0+cpu.html
python3 -m pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/repo.html
python3 -m pip install torch-sparse torch-cluster torch-spline-conv torch-geometric torch-scatter -f https://data.pyg.org/whl/torch-2.6.0+cpu.html
python3 -m pip install dgl -f https://data.dgl.ai/wheels/torch-2.6/repo.html
```

<Callout type="info">

To install the dependencies for GPU-accelerated algorithms, you need to use the GPU-specific requirements file:

```shell
python3 -m pip install -r python/requirements-gpu.txt
```
</Callout>

{<h3 className="custom-header">Run the `setup` script</h3>}

Run the following command:

```shell
source /opt/toolchain-v6/activate
source /opt/toolchain-v7/activate
python3 setup build
sudo cp -r dist/* /usr/lib/memgraph/query_modules
```
10 changes: 5 additions & 5 deletions pages/custom-query-modules/python.mdx
@@ -55,7 +55,7 @@ inside a Docker container, run the following command in the terminal:


```
docker exec -i -u root <container_id> bash -c "apt install -y python3-pip &&
docker exec -i -u memgraph <container_id> bash -c "apt install -y python3-pip &&
pip install pandas"
```

@@ -69,16 +69,16 @@ commands in the Dockerfile:
```
FROM memgraph/memgraph:latest

USER root
USER memgraph

RUN apt install -y python3-pip
RUN pip install pandas

USER memgraph
```

It is important that you install Python library as a `root` user, rather than
the default `memgraph` user.
Python libraries should now be installed as the `memgraph` user, which is the
default user inside the Memgraph container. You no longer need to switch to the
`root` user to perform installations.

## Example
