# Wordlift Vector Store

In this notebook, we show how to use Wordlift as a vector store with LlamaIndex. To get a Wordlift key and access our AI-powered SEO tools, visit [Wordlift](https://wordlift.io/).

### Setting up the environment

Install LlamaIndex and the Wordlift vector store using pip.

In [None]:
%pip install llama-index
%pip install llama-index-vector-stores-wordlift
%pip install nest_asyncio

In [None]:
# Only the names used in this notebook are imported
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.wordlift import WordliftVectorStore
import nest_asyncio

The Wordlift Vector Store is implemented with async event loops, so `nest_asyncio` is needed to run it inside a Jupyter Notebook, which already has its own event loop running.

In [None]:
nest_asyncio.apply()

Set up the OpenAI API key

In [None]:
import os
import openai

# Read the key from the OPENAI_API_KEY environment variable
openai.api_key = os.environ["OPENAI_API_KEY"]

Download and prepare the sample dataset

In [None]:
!mkdir -p 'data/paul_graham/'
!curl 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -o 'data/paul_graham/paul_graham_essay.txt'

In [None]:
documents = SimpleDirectoryReader("./data/paul_graham").load_data()

Wordlift Knowledge Graphs are built on the principles of Fully Linked Data, where each entity is assigned a permanent, dereferenceable URI.

When adding nodes to an existing Knowledge Graph, it's essential to include an "entity_id" in the metadata of each loaded document. 

For further insights into Fully Linked Data, explore these resources: 
[W3C Linked Data](https://www.w3.org/DesignIssues/LinkedData.html), 
[5 Star Data](https://5stardata.info/en/).


To retrieve the URI of your Knowledge Graph, run the following command and look for the "datasetURI" field in the response. Be sure to replace "your_key" with your Wordlift key. Keep in mind that each Knowledge Graph has its own unique key.

In [None]:
!curl https://api.wordlift.io/accounts/me -H 'Authorization: Key your_key' | jq .
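
If you prefer to stay in Python, the same lookup can be sketched with the standard library. This is a minimal sketch, not part of the Wordlift client: the helper names (`build_account_request`, `extract_dataset_uri`, `fetch_dataset_uri`) are hypothetical, and it assumes the response is a JSON object with "datasetURI" as a top-level field. The URL and the `Authorization: Key ...` header are taken from the curl command above.

```python
import json
import urllib.request

ACCOUNT_URL = "https://api.wordlift.io/accounts/me"


def build_account_request(key: str) -> urllib.request.Request:
    """Build the same request as the curl command above."""
    return urllib.request.Request(
        ACCOUNT_URL, headers={"Authorization": f"Key {key}"}
    )


def extract_dataset_uri(payload: str) -> str:
    """Pull the dataset URI out of the JSON response body.

    Assumption: "datasetURI" is a top-level field of the response.
    """
    return json.loads(payload)["datasetURI"]


def fetch_dataset_uri(key: str) -> str:
    """Call the account endpoint and return the dataset URI."""
    with urllib.request.urlopen(build_account_request(key)) as resp:
        return extract_dataset_uri(resp.read().decode("utf-8"))
```

With a valid key, `dataset_uri = fetch_dataset_uri("your_key")` would give you the value to use in the next step.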

To generate the entity_id, concatenate the datasetURI with the normalized filename. 

For instance, if your datasetURI is `https://data.wordlift.io/wl0000000/` and your text file is named `sample-file.txt`, the entity_id can be constructed as follows: 

`entity_id = datasetURI + normalize(filename)` 

which results in `https://data.wordlift.io/wl0000000/sample-file-txt`.

In [None]:
dataset_uri = "your_dataset_uri"

for document in documents:
    norm_filename = document.metadata["file_name"].replace(".", "-")
    entity_id = dataset_uri + norm_filename
    document.metadata["entity_id"] = entity_id
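
The loop above relies on `dataset_uri` already ending with a slash. As a minimal sketch, the same rule can be wrapped in a small helper that tolerates a missing or extra trailing slash. The name `build_entity_id` is hypothetical, and the normalization is assumed to be the same dot-to-hyphen replacement used in the loop.

```python
def build_entity_id(dataset_uri: str, filename: str) -> str:
    """Join a dataset URI and a normalized filename into an entity_id."""
    # Same normalization as the loop above: dots become hyphens.
    norm_filename = filename.replace(".", "-")
    # Guarantee exactly one slash between the dataset URI and the filename.
    return dataset_uri.rstrip("/") + "/" + norm_filename


# The trailing slash is deliberately omitted here to show it is restored.
print(build_entity_id("https://data.wordlift.io/wl0000000", "sample-file.txt"))
# → https://data.wordlift.io/wl0000000/sample-file-txt
```

Inside the loop, this would replace the two assignment lines with `document.metadata["entity_id"] = build_entity_id(dataset_uri, document.metadata["file_name"])`.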

### Create the Wordlift Vector Store

To create a Wordlift vector store instance, you only need the key of your Wordlift Knowledge Graph. Remember that each Knowledge Graph has its own key.

In [None]:
vector_store = WordliftVectorStore.create("your_key")

# set Wordlift vector store instance as the vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

In [None]:
# test the vector store query
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)