# Clova Embeddings

Clova offers two versions of embedding services: V1 and V2.  
For more details, please check the [official documentation](https://guide.ncloud-docs.com/docs/clovastudio-explorer03) and [API guide](https://api.ncloud-docs.com/docs/clovastudio-embedding).


<div style="margin-bottom: 30px;"> <!-- top spacing -->
    <p style="font-size: 16px; line-height: 1.6; margin-bottom: 20px;">
        The table below lists the Clova embedding models and their specifications: maximum input length in tokens, vector dimension, and recommended distance metric.
        The models are grouped by service version (V1 or V2).
    </p>
</div>

<table style="width:100%; border-collapse: collapse; font-size: 18px; margin-bottom: 30px;">
  <tr style="background-color: #f2f2f2;">
    <th style="border: 1px solid black; padding: 15px; text-align: center;">Tool Name</th>
    <th style="border: 1px solid black; padding: 15px; text-align: center;">Model Name</th>
    <th style="border: 1px solid black; padding: 15px; text-align: center;">Max Token Count</th>
    <th style="border: 1px solid black; padding: 15px; text-align: center;">Vector Dimension</th>
    <th style="border: 1px solid black; padding: 15px; text-align: center;">Recommended Distance Metric</th>
  </tr>
  <tr>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">EmbeddingV1</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">clir-emb-dolphin</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">500 tokens</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">1024</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">IP (Inner Product/Dot Product/Scalar Product)</td>
  </tr>
  <tr>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">EmbeddingV1</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">clir-sts-dolphin</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">500 tokens</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">1024</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">Cosine Similarity</td>
  </tr>
  <tr>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">EmbeddingV2</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">bge-m3</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">8,192 tokens</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">1024</td>
    <td style="border: 1px solid black; padding: 15px; text-align: center;">Cosine Similarity</td>
  </tr>
</table>

<div style="margin-top: 30px;"> <!-- bottom spacing -->
    <p style="font-size: 16px; line-height: 1.6; margin-top: 20px;">
        This notebook demonstrates how to generate text embeddings with Clova's inference service through LangChain.
    </p>
</div>



In [None]:
import os

# Fill in your own credentials before running the cells below.
os.environ["CLOVA_EMB_API_KEY"] = ""
os.environ["CLOVA_EMB_APIGW_API_KEY"] = ""
os.environ["CLOVA_EMB_APP_ID_V1"] = ""
os.environ["CLOVA_EMB_APP_ID_V2"] = ""
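
Writing keys directly into a notebook makes them easy to leak when the notebook is shared. As an alternative, the sketch below prompts for any variable that is unset or empty. The helper name `ensure_env` is hypothetical, the variable names are simply the ones used in the cell above, and `getpass` comes from the Python standard library.

```python
import getpass
import os


def ensure_env(name, prompt=getpass.getpass):
    """Return os.environ[name], prompting for a value only if it is unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        value = prompt(f"Enter {name}: ")
        os.environ[name] = value
    return value


# Variable names copied from the cell above.
CLOVA_ENV_VARS = [
    "CLOVA_EMB_API_KEY",
    "CLOVA_EMB_APIGW_API_KEY",
    "CLOVA_EMB_APP_ID_V1",
    "CLOVA_EMB_APP_ID_V2",
]

# In the notebook, uncomment to be prompted for each missing value:
# for name in CLOVA_ENV_VARS:
#     ensure_env(name)
```

The `prompt` parameter exists only so the helper can be exercised without interactive input.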

In [None]:
from langchain_community.embeddings import ClovaEmbeddingsV1, ClovaEmbeddingsV2

In [None]:
query_text = "This is a test query."
document_texts = ["This is a test doc1.", "This is a test doc2."]

Define a helper function that embeds one query and a list of documents

In [None]:
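def test_embeddings(embeddings_instance, query_text, document_texts):
    """Embed one query and a list of documents with the given embeddings object."""
    # embed_query returns a single vector; embed_documents returns one vector per document.
    query_result = embeddings_instance.embed_query(query_text)
    document_result = embeddings_instance.embed_documents(document_texts)
    return query_result, document_result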
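
The specification table lists a vector dimension of 1024 for every model, so a quick shape check on the returned vectors can catch configuration mistakes early. This is a minimal sketch, assuming only that the embeddings object exposes `embed_query` and `embed_documents` as in the helper above. `check_dimensions` and `FakeEmbeddings` are hypothetical names, and the fake class exists only so the example runs without API credentials.

```python
def check_dimensions(query_vector, document_vectors, expected_dim=1024):
    """Raise ValueError if any vector does not have expected_dim components."""
    vectors = [query_vector, *document_vectors]
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {index} has {len(vector)} components, expected {expected_dim}"
            )
    return True


class FakeEmbeddings:
    """Hypothetical stand-in that returns constant vectors; not a Clova client."""

    def __init__(self, dim=1024):
        self.dim = dim

    def embed_query(self, text):
        return [0.0] * self.dim

    def embed_documents(self, texts):
        return [[0.0] * self.dim for _ in texts]


fake = FakeEmbeddings()
query_vec = fake.embed_query("This is a test query.")
doc_vecs = fake.embed_documents(["This is a test doc1.", "This is a test doc2."])
dimensions_ok = check_dimensions(query_vec, doc_vecs)
```

With real credentials, the same check could be applied to the two values returned by `test_embeddings`.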

Test ClovaEmbeddingsV1 with the 'clir-emb-dolphin' model

In [None]:
embeddings_v1_emb = ClovaEmbeddingsV1(model="clir-emb-dolphin")
query_result_v1_emb, document_result_v1_emb = test_embeddings(embeddings_v1_emb, query_text, document_texts)

# Print only the first five components of each vector to keep the output short.
print("V1 (clir-emb-dolphin) Query Embedding Result:", query_result_v1_emb[:5])
print("V1 (clir-emb-dolphin) Document Embedding Results:", [doc[:5] for doc in document_result_v1_emb])

Test ClovaEmbeddingsV1 with the 'clir-sts-dolphin' model

In [None]:
embeddings_v1_sts = ClovaEmbeddingsV1(model="clir-sts-dolphin")
query_result_v1_sts, document_result_v1_sts = test_embeddings(embeddings_v1_sts, query_text, document_texts)

print("V1 (clir-sts-dolphin) Query Embedding Result:", query_result_v1_sts[:5])
print("V1 (clir-sts-dolphin) Document Embedding Results:", [doc[:5] for doc in document_result_v1_sts])

Test ClovaEmbeddingsV2

In [None]:
embeddings_v2 = ClovaEmbeddingsV2()
query_result_v2, document_result_v2 = test_embeddings(embeddings_v2, query_text, document_texts)

print("V2 Query Embedding Result:", query_result_v2[:5])
print("V2 Document Embedding Results:", [doc[:5] for doc in document_result_v2])
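
The table at the top recommends the inner product for `clir-emb-dolphin` and cosine similarity for the other two models. The sketch below implements both metrics and ranks documents against a query. It uses short made-up vectors rather than real API output, and the function names are illustrative, not part of LangChain or the Clova API.

```python
import math


def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector norms."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


def rank_documents(query_vector, document_vectors, metric):
    """Return document indices sorted from most to least similar to the query."""
    scores = [metric(query_vector, doc) for doc in document_vectors]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors chosen so that the two metrics disagree:
# the first document is long but points away from the query,
# the second is short but almost parallel to it.
toy_query = [1.0, 0.0]
toy_docs = [[10.0, 10.0], [1.0, 0.1]]

ranking_ip = rank_documents(toy_query, toy_docs, inner_product)
ranking_cos = rank_documents(toy_query, toy_docs, cosine_similarity)
```

Because the inner product rewards vector length while cosine similarity ignores it, the two rankings differ on these toy vectors. With real results, `query_result_v2` and `document_result_v2` from the cell above could be passed to `rank_documents` with `cosine_similarity`.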