# Aleph Alpha

Aleph Alpha's semantic embeddings can be used in two ways. If your texts have dissimilar structures, for example a document and a question (query) about that document, use asymmetric embeddings. For texts with comparable structures, symmetric embeddings are the suggested approach.

Aleph Alpha embeddings can be called synchronously, or asynchronously with a context manager. The async approach is especially useful when you have many documents and want to make simultaneous calls to the API.

In [14]:
from langchain.embeddings import AlephAlphaAsymmetricSemanticEmbedding, AlephAlphaSymmetricSemanticEmbedding

In [16]:
document = "This is a content of the document"
query = "What is the content of the document?"

NUM_DOCUMENTS = 50

## Synchronous call

Everything is processed in order, which is slow for a large corpus of documents.

In [19]:
embeddings = AlephAlphaAsymmetricSemanticEmbedding(normalize=True, compress_to_size=128)
doc_result = embeddings.embed_documents([document])
query_result = embeddings.embed_query(query)

In [22]:
embeddings = AlephAlphaSymmetricSemanticEmbedding(normalize=True, compress_to_size=128)
doc_result = embeddings.embed_documents([document])
query_result = embeddings.embed_query(document)
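
Once document and query embeddings are available, one common way to compare them is cosine similarity. The sketch below is not part of the Aleph Alpha or LangChain API: `cosine_similarity` and `rank_documents` are hypothetical helpers written in plain Python, and the toy vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def rank_documents(doc_vectors, query_vector):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(vec, query_vector) for vec in doc_vectors]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional vectors used only for illustration.
toy_docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
toy_query = [0.0, 1.0]
ranking = rank_documents(toy_docs, toy_query)
```

In the notebook, the same helpers could be called as `rank_documents(doc_result, query_result)`, since `embed_documents` returns a list of vectors and `embed_query` returns a single vector.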

## Asynchronous call

Here the Aleph Alpha API is called asynchronously. The number of concurrent calls is set by `concurrency_limit`.

In [23]:
async def symmetric_call(documents: list[str], query: str) -> tuple[list[list[float]], list[float]]:
    async with AlephAlphaSymmetricSemanticEmbedding(concurrency_limit=10, show_progress=True) as embeddings:
        doc_result = await embeddings.aembed_documents(documents)
        query_result = await embeddings.aembed_query(query)

    return doc_result, query_result

In [24]:
async def asymmetric_call(documents: list[str], query: str) -> tuple[list[list[float]], list[float]]:
    async with AlephAlphaAsymmetricSemanticEmbedding(concurrency_limit=10, show_progress=True) as embeddings:
        doc_result = await embeddings.aembed_documents(documents)
        query_result = await embeddings.aembed_query(query)

    return doc_result, query_result
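
To illustrate what a concurrency limit means in practice, here is a minimal sketch using `asyncio.Semaphore`. It is an assumption about the general technique, not a description of how the Aleph Alpha client implements `concurrency_limit`. `fake_embed` and `embed_all` are hypothetical names, and the network call is replaced by a short sleep.

```python
import asyncio


async def fake_embed(text, limiter, stats):
    """Stand-in for one API request; at most `limit` of these run at once."""
    async with limiter:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # placeholder for the real network round trip
        stats["active"] -= 1
    return [float(len(text))]  # dummy one-dimensional "embedding"


async def embed_all(texts, concurrency_limit=10):
    """Embed all texts concurrently, never exceeding the given limit."""
    limiter = asyncio.Semaphore(concurrency_limit)
    stats = {"active": 0, "peak": 0}
    vectors = await asyncio.gather(
        *(fake_embed(text, limiter, stats) for text in texts)
    )
    return vectors, stats["peak"]


vectors, peak = asyncio.run(embed_all(["doc"] * 50, concurrency_limit=10))
```

All 50 requests are scheduled at once, but the semaphore lets only ten of them hold a slot at any moment, so `peak` never exceeds the limit.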

In [26]:
import asyncio

async def main():
    await symmetric_call([document]*NUM_DOCUMENTS, query)
    await asymmetric_call([document]*NUM_DOCUMENTS, query)

# On Windows, uncomment the next line if asyncio.run raises event-loop errors.
# The policy has to be set before asyncio.run creates the event loop.
# asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
asyncio.run(main())

([[0.12695312, 1.0390625, 0.60546875, -0.34375, -0.31640625, -0.765625, 0.44140625, -0.390625, -0.69140625, -0.54296875, -0.609375, -0.36328125, -0.79296875, 0.060302734, 0.061035156, 0.33789062, -0.71875, 0.5859375, 1.0078125, 1.8046875, 0.004180908, 0.13867188, 0.5703125, 0.75390625, -0.68359375, -0.31835938, -0.6640625, -0.21777344, -0.84765625, 0.33007812, -0.58203125, 0.6640625, -0.75, -1.0546875, -0.5078125, -0.55859375, 0.5859375, -0.58984375, -0.890625, -0.31054688, 0.83984375, -0.78125, 0.796875, 1.1640625, -0.3046875, -0.9296875, 1.1171875, -0.41601562, 0.07080078, -1.8515625, 0.08935547, -0.60546875, -0.2578125, 0.2578125, -0.026489258, 0.234375, 0.578125, 0.061767578, 0.953125, -0.875, -0.28125, 0.7578125, 0.6875, 0.51953125, 0.66015625, -0.100097656, -0.4609375, -1.1328125, -0.5546875, 2.484375, -0.12792969, 0.20507812, -0.013000488, 0.57421875, -0.8046875, 0.23242188, -0.70703125, -0.58984375, 1.0234375, 0.7109375, -0.51953125, -1.171875, -0.18261719, -0.44140625, -0.1289