# GreenNode Text Embeddings

>[GreenNode](https://greennode.ai/) is a global AI solutions provider and an **NVIDIA Preferred Partner**, delivering full-stack AI capabilities, from infrastructure to application, for enterprises across the US, MENA, and APAC regions. Operating on **world-class infrastructure** (LEED Gold, TIA‑942, Uptime Tier III), GreenNode empowers enterprises, startups, and researchers with a comprehensive suite of AI services.

This notebook shows how to get started with `GreenNodeEmbeddings`. The class generates vector representations of text, which you can use for semantic document search over your own data.

## Overview

The `GreenNodeEmbeddings` class enables integration with GreenNode Serverless AI’s suite of embedding models via [LangChain](https://www.langchain.com/). These embeddings are well-suited for tasks such as semantic search, document similarity, and other natural language processing applications that require vectorized text representations.

### Integration details

- **Provider**: [GreenNode Serverless AI](https://aiplatform.console.greennode.ai/playground)
- **Model Types**: Text embedding models
- **Primary Use Case**: Generating vector embeddings for semantic similarity and information retrieval
- **Available Models**: Includes [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) and a variety of other state-of-the-art models
- **Dimensions**: Varies by model (typically 1024-4096 dimensions)

## Setup

### Installation

The GreenNode integration can be installed via pip:

In [None]:
%pip install --upgrade langchain-greennode

### Credentials

GreenNode requires an API key for authentication, which can be provided either as the `api_key` parameter during initialization or set as the environment variable `GREENNODE_API_KEY`. You can obtain an API key by registering for an account on [GreenNode Serverless AI](https://aiplatform.console.greennode.ai/playground).

In [None]:
import getpass
import os

# Prompt for the API key only if it is not already set as an environment variable
if "GREENNODE_API_KEY" not in os.environ:
    os.environ["GREENNODE_API_KEY"] = getpass.getpass("Enter your GreenNode API key: ")

## Instantiation

The `GreenNodeEmbeddings` class can be instantiated with optional parameters for the API key and model name:

In [None]:
from langchain_greennode import GreenNodeEmbeddings

# Initialize the embeddings model
embeddings = GreenNodeEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-m3"  # The default embedding model
)

### Available Models

The list of supported models is available at https://aiplatform.console.greennode.ai/models

## Indexing and Retrieval

Embedding models play a key role in retrieval-augmented generation (RAG) workflows by enabling both the indexing of content and its efficient retrieval. The example below illustrates how to integrate `GreenNodeEmbeddings` with a vector store to perform document retrieval.

In [5]:
# Requires extra packages: pip install langchain-community faiss-cpu
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

# Prepare documents (finance/economics domain)
docs = [
    Document(
        page_content="Inflation represents the rate at which the general level of prices for goods and services rises"
    ),
    Document(
        page_content="Central banks use interest rates to control inflation and stabilize the economy"
    ),
    Document(
        page_content="Cryptocurrencies like Bitcoin operate on decentralized blockchain networks"
    ),
    Document(
        page_content="Stock markets are influenced by corporate earnings, investor sentiment, and economic indicators"
    ),
]

# Build the index with the GreenNodeEmbeddings instance created in the Instantiation section
vector_store = FAISS.from_documents(docs, embeddings)

# Perform similarity search
query = "How do central banks fight rising prices?"
results = vector_store.similarity_search(query, k=2)

print("Search results for query:", query)
for i, doc in enumerate(results):
    print(f"Result {i+1}: {doc.page_content}")


Search results for query: How do central banks fight rising prices?
Result 1: Central banks use interest rates to control inflation and stabilize the economy
Result 2: Inflation represents the rate at which the general level of prices for goods and services rises
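
Conceptually, a similarity search embeds the query and then ranks the stored vectors by how close they are to it. The sketch below illustrates that idea with a brute-force cosine ranking. The 3-dimensional vectors and the `cosine_similarity` and `top_k` helpers are hypothetical stand-ins chosen for illustration; they are not GreenNode output, and FAISS uses its own index structures rather than this loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only)
toy_doc_vectors = [
    [0.9, 0.1, 0.0],  # stands in for the "inflation" document
    [0.8, 0.3, 0.1],  # stands in for the "central banks" document
    [0.0, 0.2, 0.9],  # stands in for the "cryptocurrency" document
]
toy_query_vector = [0.85, 0.25, 0.05]

print(top_k(toy_query_vector, toy_doc_vectors, k=2))
```

A real embedding has hundreds or thousands of dimensions, but the ranking step works the same way.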


### Using with InMemoryVectorStore

You can also use the `InMemoryVectorStore` for lightweight applications:

In [6]:
from langchain_core.vectorstores import InMemoryVectorStore

# Create a sample text
text = "LangChain is a framework for developing applications powered by language models"

# Create a vector store
vectorstore = InMemoryVectorStore.from_texts(
    [text],
    embedding=embeddings,
)

# Use as a retriever
retriever = vectorstore.as_retriever()

# Retrieve similar documents
docs = retriever.invoke("What is LangChain?")
print(f"Retrieved document: {docs[0].page_content}")

Retrieved document: LangChain is a framework for developing applications powered by language models


## Direct Usage

The `GreenNodeEmbeddings` class can be used independently to generate text embeddings without the need for a vector store. This is useful for tasks such as similarity scoring, clustering, or custom processing pipelines.

### Embedding a Single Text

You can use the `embed_query` method to embed a single piece of text:

In [7]:
query = "What is machine learning?"
query_embedding = embeddings.embed_query(query)

# Check the embedding dimension
print(f"Embedding dimension: {len(query_embedding)}")
print(f"First few values: {query_embedding[:5]}")

Embedding dimension: 1024
First few values: [-0.045654296875, -0.0218505859375, -0.03125, -0.0028533935546875, 0.00885009765625]


### Embedding Multiple Texts

You can embed multiple texts at once using the `embed_documents` method:

In [8]:
documents = [
    "Machine learning is a branch of artificial intelligence",
    "Deep learning is a subfield of machine learning",
    "Natural language processing deals with interactions between computers and human language",
]

document_embeddings = embeddings.embed_documents(documents)

# Check the results
print(f"Number of document embeddings: {len(document_embeddings)}")
print(f"Each embedding has {len(document_embeddings[0])} dimensions")

Number of document embeddings: 3
Each embedding has 1024 dimensions
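
For larger corpora it can help to send texts in smaller batches rather than in one big call. The following is a minimal sketch that assumes nothing about GreenNode's actual request limits: `embed_in_batches` and `batch_size` are hypothetical names, and `embed_fn` is any callable shaped like `embed_documents` (a list of strings in, a list of vectors out). A fake embedder is used so the snippet runs offline.

```python
from typing import Callable, List


def embed_in_batches(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,
) -> List[List[float]]:
    """Call embed_fn on consecutive slices of texts and concatenate the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_fn(texts[start : start + batch_size]))
    return vectors


# Offline stand-in for embeddings.embed_documents (hypothetical, for illustration)
def fake_embed(batch: List[str]) -> List[List[float]]:
    return [[float(len(text))] for text in batch]


vectors = embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2)
print(len(vectors))
```

With a real model you would pass `embeddings.embed_documents` as `embed_fn`; the output order matches the input order because batches are processed sequentially.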


### Async Support

`GreenNodeEmbeddings` also provides asynchronous counterparts of both methods, `aembed_query` and `aembed_documents`:

In [9]:
import asyncio


async def generate_embeddings_async():
    # Embed a single query
    query_result = await embeddings.aembed_query("What is the capital of France?")
    print(f"Async query embedding dimension: {len(query_result)}")

    # Embed multiple documents
    docs = [
        "Paris is the capital of France",
        "Berlin is the capital of Germany",
        "Rome is the capital of Italy",
    ]
    docs_result = await embeddings.aembed_documents(docs)
    print(f"Async document embeddings count: {len(docs_result)}")


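# Top-level `await` works in Jupyter; in a plain script, use
# asyncio.run(generate_embeddings_async()) instead
await generate_embeddings_async()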

Async query embedding dimension: 1024
Async document embeddings count: 3
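
Because the async methods are coroutines, several independent requests can be awaited together with `asyncio.gather`. The sketch below shows the pattern with a stand-in class: `FakeAsyncEmbeddings` and `embed_queries_concurrently` are hypothetical names, and the fake class only mimics the shape of `aembed_query` so the example runs without an API key.

```python
import asyncio
from typing import List


class FakeAsyncEmbeddings:
    """Hypothetical stand-in that mimics the shape of `aembed_query`."""

    async def aembed_query(self, text: str) -> List[float]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [float(len(text))]


async def embed_queries_concurrently(model, queries: List[str]) -> List[List[float]]:
    # gather runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(model.aembed_query(q) for q in queries))


queries = ["What is the capital of France?", "What is the capital of Italy?"]
results = asyncio.run(embed_queries_concurrently(FakeAsyncEmbeddings(), queries))
print(len(results))
```

Inside a notebook, where an event loop is already running, replace the `asyncio.run(...)` call with `await embed_queries_concurrently(...)`.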


### Document Similarity Example

The example below embeds four short documents and prints their pairwise cosine similarities:

In [11]:
# Requires SciPy: pip install scipy
from scipy.spatial.distance import cosine

# Create some documents
documents = [
    "Machine learning algorithms build mathematical models based on sample data",
    "Deep learning uses neural networks with many layers",
    "Climate change is a major global environmental challenge",
    "Neural networks are inspired by the human brain's structure",
]

# Embed the documents
embeddings_list = embeddings.embed_documents(documents)


# scipy's `cosine` returns the cosine distance, so similarity is 1 - distance
def calculate_similarity(embedding1, embedding2):
    return 1 - cosine(embedding1, embedding2)


# Print similarity matrix
print("Document Similarity Matrix:")
for i, emb_i in enumerate(embeddings_list):
    similarities = []
    for j, emb_j in enumerate(embeddings_list):
        similarity = calculate_similarity(emb_i, emb_j)
        similarities.append(f"{similarity:.4f}")
    print(f"Document {i+1}: {similarities}")

Document Similarity Matrix:
Document 1: ['1.0000', '0.6005', '0.3542', '0.5788']
Document 2: ['0.6005', '1.0000', '0.4154', '0.6170']
Document 3: ['0.3542', '0.4154', '1.0000', '0.3528']
Document 4: ['0.5788', '0.6170', '0.3528', '1.0000']
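
One way to read the matrix above is to pick, for each document, the other document with the highest score. The sketch below does that using the values printed above, copied in as literals so it runs without calling the API; `nearest_neighbours` is a hypothetical helper name.

```python
# Similarity scores copied from the matrix printed above
similarity_matrix = [
    [1.0000, 0.6005, 0.3542, 0.5788],
    [0.6005, 1.0000, 0.4154, 0.6170],
    [0.3542, 0.4154, 1.0000, 0.3528],
    [0.5788, 0.6170, 0.3528, 1.0000],
]


def nearest_neighbours(matrix):
    """For each row, return the index of the highest score, ignoring the diagonal."""
    result = []
    for i, row in enumerate(matrix):
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        result.append(max(candidates)[1])
    return result


for i, j in enumerate(nearest_neighbours(similarity_matrix)):
    print(f"Document {i+1} is closest to Document {j+1}")
```

In these results, the climate sentence (Document 3) has the lowest score in every other document's row, while the three machine-learning sentences score higher against each other.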


## API Reference

For more details about the GreenNode Serverless AI API, visit the [GreenNode Serverless AI Documentation](https://aiplatform.console.greennode.ai/api-docs/maas).