# Embedding Providers

This notebook shows how to create embeddings with RedisVL's built-in providers. RedisVL currently supports:
1. OpenAI
2. HuggingFace

Before running this notebook, make sure that you:
1. Have installed ``redisvl`` and activated that environment for this notebook.
2. Have a running Redis instance with RediSearch > 2.4.


In [22]:
# import necessary modules
import os
from redisvl.utils.utils import array_to_buffer

## Creating Embeddings

This example shows how to create embeddings for three simple sentences with a provider:

- "That is a happy dog"
- "That is a happy person"
- "Today is a sunny day"


### Huggingface

Huggingface hosts a large number of pre-trained NLP models, and RedisVL supports creating embeddings from them. To use the Huggingface provider, you will need to install the ``sentence-transformers`` library:

```bash
pip install sentence-transformers
```

In [32]:
os.environ["TOKENIZERS_PARALLELISM"] = "false"
from redisvl.providers import HuggingfaceProvider


# create a provider
hf = HuggingfaceProvider(model="sentence-transformers/all-mpnet-base-v2")

# embed a sentence
test = hf.embed("This is a test sentence.")
test[:10]

[0.00037813105154782534,
 -0.05080341547727585,
 -0.03514720872044563,
 -0.023251093924045563,
 -0.04415826499462128,
 0.020487893372774124,
 0.0014619074063375592,
 0.03126181662082672,
 0.056051574647426605,
 0.0188154224306345]

In [24]:
# You can also create many embeddings at once

sentences = [
    "That is a happy dog",
    "That is a happy person",
    "Today is a sunny day"
]

embeddings = hf.embed_many(sentences)


## Search with Provider Embeddings

Now that we've created our embeddings, we can use them to search for similar sentences. We will index the same three sentences from above and query them with a new sentence.

First, we need to create the schema for our index.

Here's what the schema for this example looks like in YAML for the HuggingFace provider:

```yaml
index:
    name: providers
    prefix: rvl
    storage_type: hash

fields:
    text:
        - name: text
    vector:
        - name: embedding
          dims: 768
          algorithm: flat
          distance_metric: cosine
```
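
The `dims` value in the schema has to match the length of the vectors the provider returns; `all-mpnet-base-v2` produces 768-dimensional embeddings. Below is a minimal sketch of a sanity check that could be run before loading data. `check_dims` is a hypothetical helper, not part of RedisVL, and the short vectors are stand-ins for real embeddings.

```python
def check_dims(embeddings, dims):
    """Raise ValueError if any embedding's length differs from dims."""
    for i, vector in enumerate(embeddings):
        if len(vector) != dims:
            raise ValueError(
                f"embedding {i} has {len(vector)} dimensions, expected {dims}"
            )
    return True


# stand-in vectors (assumption); in this notebook you would pass
# the `embeddings` list and dims=768 instead
fake_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
check_dims(fake_embeddings, dims=3)
```

Running a check like this before `index.load` reports a mismatch with a clear message before any data reaches Redis.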

In [11]:
from redisvl.index import SearchIndex

# construct a search index from the schema
index = SearchIndex.from_yaml("./schema.yaml")

# connect to local redis instance
index.connect("redis://localhost:6379")

# create the index (no data yet)
index.create(overwrite=True)

In [12]:
# use the CLI to see the created index
!rvl index listall

15:50:34 sam.partee-NW9MQX5Y74 redisvl.cli.index[33382] INFO Indices:
15:50:34 sam.partee-NW9MQX5Y74 redisvl.cli.index[33382] INFO 1. providers


In [21]:
# load expects an iterable of dictionaries where
# the vector is stored as a bytes buffer

data = [{"text": t,
         "embedding": array_to_buffer(v)}
        for t, v in zip(sentences, embeddings)]

index.load(data)
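
Redis stores vectors as raw bytes, which is why each embedding is converted before loading. As a rough illustration of what a helper like `array_to_buffer` produces, the sketch below packs a list of floats using only the standard library. It assumes 32-bit floats in little-endian order; `floats_to_buffer` and `buffer_to_floats` are hypothetical names, not RedisVL functions, and the real helper may differ in detail.

```python
import struct


def floats_to_buffer(vector):
    """Pack a sequence of floats into little-endian float32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def buffer_to_floats(buffer):
    """Unpack little-endian float32 bytes back into a list of floats."""
    count = len(buffer) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"<{count}f", buffer))


buf = floats_to_buffer([0.5, -1.0, 2.0])
print(len(buf))               # 3 values * 4 bytes each = 12
print(buffer_to_floats(buf))  # round-trips to [0.5, -1.0, 2.0]
```

Under this assumption, a 768-dimensional embedding would occupy 768 * 4 = 3072 bytes in the hash.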

In [31]:
from redisvl.query import VectorQuery

# use the HuggingFace Provider again to create a query embedding
query_embedding = hf.embed("That is a happy cat")

query = VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["text"],
    num_results=3
)

results = index.search(query.query, query_params=query.params)
for doc in results.docs:
    print(doc.text)
    print(doc.vector_distance)

That is a happy dog
0.160862445831
That is a happy person
0.273598074913
Today is a sunny day
0.744559526443
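
With `distance_metric: cosine`, the reported `vector_distance` is a cosine distance (1 minus the cosine similarity), so smaller values mean closer matches. That is why "That is a happy dog" ranks first for the query "That is a happy cat". A minimal pure-Python sketch of the calculation on toy vectors follows; the function name and vectors are illustrative and not taken from RedisVL.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # same direction: distance 0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal: distance 1
```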
