# Clarifai

>[Clarifai](https://www.clarifai.com/) is an AI platform that covers the full AI lifecycle: data exploration, data labeling, model building, and inference. Once inputs have been uploaded, a Clarifai application can be used as a vector database.

This notebook shows how to use functionality related to the `Clarifai` vector database.

To use Clarifai, you must have an account and a Personal Access Token (PAT) key.
You can get or create a PAT on the [security settings page](https://clarifai.com/settings/security).

# Dependencies

In [None]:
# Install required dependencies
!pip install clarifai

# Imports
Here we set the personal access token (PAT). You can find your PAT under settings/security on the platform.

In [3]:
# Log in and get your PAT from https://clarifai.com/settings/security
from getpass import getpass

CLARIFAI_PAT_KEY = getpass()
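
For non-interactive runs, prompting with `getpass` may not be practical. The sketch below reads the token from an environment variable first and only prompts when it is missing. The variable name `CLARIFAI_PAT` and the helper `load_pat` are assumptions made for this example, not something this notebook defines.

```python
import os
from getpass import getpass


def load_pat(env=os.environ, prompt=getpass):
    """Return the PAT from the environment, or prompt for it if unset."""
    # CLARIFAI_PAT is an assumed variable name for this sketch.
    pat = env.get("CLARIFAI_PAT")
    if pat:
        return pat
    # Fall back to an interactive prompt, as the cell above does.
    return prompt()
```

With this helper, the assignment above could become `CLARIFAI_PAT_KEY = load_pat()`.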

Next, import the text splitter, the document loader, and the `Clarifai` vector store.

In [4]:
# Import the required modules
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.vectorstores import Clarifai

# Setup
Set the user ID and app ID of the Clarifai application where the inputs will be uploaded.

`NUMBER_OF_DOCS` sets how many documents each similarity search returns (4 here, which matches the four results shown in the search outputs in this notebook).

In [5]:
USER_ID = 'minhajul'
APP_ID = 'test-lang'
NUMBER_OF_DOCS = 4

## From Texts
Create a Clarifai vectorstore from a list of texts. This section will upload each text with its respective metadata to a Clarifai Application. The Clarifai Application can then be used for semantic search to find relevant texts.

In [6]:
texts = [
    "I really enjoy spending time with you",
    "I hate spending time with my dog",
    "I want to go for a run",
    "I went to the movies yesterday",
    "I love playing soccer with my friends",
]

metadatas = [{"id": i, "text": text} for i, text in enumerate(texts)]
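
Each metadata dict is paired with the text at the same position, so the two lists need to stay aligned. The following is a minimal sketch of that pairing on a two-item sample; the names `sample_texts` and `sample_metadatas` are illustrative only, chosen so they do not overwrite the notebook's variables.

```python
sample_texts = [
    "I really enjoy spending time with you",
    "I hate spending time with my dog",
]

# One metadata dict per text, in the same order as the texts.
sample_metadatas = [{"id": i, "text": text} for i, text in enumerate(sample_texts)]

# Sanity checks before uploading: the two lists must line up.
assert len(sample_metadatas) == len(sample_texts)
assert all(m["text"] == t for m, t in zip(sample_metadatas, sample_texts))

print(sample_metadatas[1])
# → {'id': 1, 'text': 'I hate spending time with my dog'}
```

Note that the search output later in this notebook shows the ids coming back as floats (`'id': 0.0`), so code that reads the metadata back should not assume the integer type is preserved.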

In [7]:
clarifai_vector_db = Clarifai.from_texts(
    user_id=USER_ID,
    app_id=APP_ID,
    texts=texts,
    pat=CLARIFAI_PAT_KEY,
    number_of_docs=NUMBER_OF_DOCS,
    metadatas=metadatas,
)

Input 54419b64e7c0463fbeca7ba1903a7f10 posted successfully.
Input 796b335094384e73a8bf31a3a3736c84 posted successfully.
Input 9e1724f4b0274bf0ba170f4fda7ee9b1 posted successfully.
Input 2d1de0343853454dbfe03c6332ebfb06 posted successfully.
Input 83ca767f694940ae8a4071f945d53042 posted successfully.


In [9]:
docs = clarifai_vector_db.similarity_search("I would love to see you")
docs

	Score 0.91 for annotation: 588b6143722a4a8da9560b9686bc19aa off input: 54419b64e7c0463fbeca7ba1903a7f10, text: I really enjoy spending time with you
	Score 0.83 for annotation: d07ea05ca4ce4d81b353848cf6a332fb off input: 2d1de0343853454dbfe03c6332ebfb06, text: I went to the movies yesterday
	Score 0.81 for annotation: 96410b27d9974b98b49b1b55181fa638 off input: 853e101ffa5a4563a0d4301bb296ad1b, text: Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of th
	Score 0.81 for annotation: 64af34db090245fd830f0f8f5cf214bc off input: dba9e6d61cef4d72ba6a938f32196e5a, text: We see the unity among leaders of nations and a more unified Europe a more unified West. And we see unity among the people wh


[Document(page_content='I really enjoy spending time with you', metadata={'text': 'I really enjoy spending time with you', 'id': 0.0}),
 Document(page_content='I went to the movies yesterday', metadata={'text': 'I went to the movies yesterday', 'id': 3.0}),
 Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and 
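
The results above include State of the Union chunks alongside the short texts, which suggests the application already held inputs from another upload. One way to separate them after the search is to filter on metadata. The sketch below assumes the texts uploaded in this section are the only ones carrying an `id` metadata key; `Doc` is a hypothetical stand-in for LangChain's `Document` so the example runs on its own, and `keep_with_metadata_key` is an illustrative helper.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document, for illustration only."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def keep_with_metadata_key(docs, key):
    """Return only the documents whose metadata contains `key`."""
    return [d for d in docs if key in d.metadata]


# Sample results shaped like the search output in this notebook.
results = [
    Doc("I really enjoy spending time with you", {"id": 0.0}),
    Doc(
        "Madam Speaker, Madam Vice President, our First Lady and Second Gentleman.",
        {"source": "../../../state_of_the_union.txt"},
    ),
]

filtered = keep_with_metadata_key(results, "id")
print([d.page_content for d in filtered])
# → ['I really enjoy spending time with you']
```

The same function works on real `Document` objects, since it only reads the `metadata` attribute.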

## From Documents
Create a Clarifai vectorstore from a list of Documents. This section will upload each document with its respective metadata to a Clarifai Application. The Clarifai Application can then be used for semantic search to find relevant documents.

In [10]:
loader = TextLoader('../../../state_of_the_union.txt')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
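
To give an intuition for what the splitter does with `chunk_size=1000` and `chunk_overlap=0`, here is a simplified pure-Python sketch of separator-based chunking. It is not LangChain's implementation: `split_on_separator` is a hypothetical helper, it assumes paragraphs are separated by blank lines, and it ignores overlap entirely.

```python
def split_on_separator(text, chunk_size, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks.

    Chunks stay within chunk_size characters where possible; a single
    piece longer than chunk_size is kept whole rather than cut.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces from repeated separators
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            # The piece fits, or it starts a new chunk on its own.
            current = candidate
        else:
            # Adding the piece would exceed the limit: close the chunk.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
print(split_on_separator(sample, chunk_size=10))
# → ['aaaa\n\nbbbb', 'cccc']
```

Each chunk produced this way becomes one input in the Clarifai application, which is why the upload step below posts many inputs for a single source file.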

In [11]:
docs

[Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.', metadata={'source'

In [12]:
USER_ID = 'minhajul'
APP_ID = 'test-lang'
NUMBER_OF_DOCS = 4

In [13]:
clarifai_vector_db = Clarifai.from_documents(
    user_id=USER_ID,
    app_id=APP_ID,
    documents=docs,
    pat=CLARIFAI_PAT_KEY,
    number_of_docs=NUMBER_OF_DOCS,
)

Input 853e101ffa5a4563a0d4301bb296ad1b posted successfully.
Input f4dcc30dbd2947c2a636ecbe08ff0ee9 posted successfully.
Input 5639de0b2f3c404b92527f205dddb936 posted successfully.
Input 502a405ad2af4fe98a72499b224858f8 posted successfully.
Input 413fd00d897b49b7877b6dfa6abf0775 posted successfully.
Input ebf3700cd1f7419589bc9b58f925cb11 posted successfully.
Input 21240aca8c8d4712a92e0df7132976b5 posted successfully.
Input dba9e6d61cef4d72ba6a938f32196e5a posted successfully.
Input 0935b8d5abee49e1ad738e3f7767ae46 posted successfully.
Input e0606a5545354a7b84c824f304f0ad7e posted successfully.
Input d6bc9ea7198546938555b72c5fdb0d2a posted successfully.
Input 17566c0de60a4b67afb321166ce57190 posted successfully.
Input 0f681e687d5b4009ae2c6d25c7889d63 posted successfully.
Input 3e7bdc2b353e447a8f2be0d8dfd610a2 posted successfully.
Input 791496387f1c40b48a298d229f332c1e posted successfully.
Input c778efc1d6044ea98a3aa3f98e605330 posted successfully.
Input 20507ff55f9944a588032a217671e8f7 p

In [14]:
docs = clarifai_vector_db.similarity_search("Texts related to criminals and violence")
docs

	Score 0.90 for annotation: fb2bd77a2b7c48959f7a36992f627951 off input: 35e76a1f35ac44888b5f5f2bdf5eddd3, text: And I will keep doing everything in my power to crack down on gun trafficking and ghost guns you can buy online and make at h
	Score 0.89 for annotation: 0a37166079c34abbbd6e5130a24840b9 off input: babed8c7f1ab40e4bd540c546243de19, text: We can’t change how divided we’ve been. But we can change how we move forward—on COVID-19 and other issues we must face toget
	Score 0.89 for annotation: fb7a4bbe56624e2a99e748412b00f15c off input: f464033f5a8f4742bf19ce58578ee665, text: A former top litigator in private practice. A former federal public defender. And from a family of public school educators an
	Score 0.88 for annotation: c5dbc4050f7b4831819d87ad531b9d42 off input: 589a17c9dfdd4d06b3de96e9e2d4d613, text: So let’s not abandon our streets. Or choose between safety and equal justice. 

Let’s come together to protect our communitie


[Document(page_content='And I will keep doing everything in my power to crack down on gun trafficking and ghost guns you can buy online and make at home—they have no serial numbers and can’t be traced. \n\nAnd I ask Congress to pass proven measures to reduce gun violence. Pass universal background checks. Why should anyone on a terrorist list be able to purchase a weapon? \n\nBan assault weapons and high-capacity magazines. \n\nRepeal the liability shield that makes gun manufacturers the only industry in America that can’t be sued. \n\nThese laws don’t infringe on the Second Amendment. They save lives. \n\nThe most fundamental right in America is the right to vote – and to have it counted. And it’s under assault. \n\nIn state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \n\nWe cannot let this happen.', metadata={'source': '../../../state_of_the_union.txt'}),
 Document(page_content='We can’t change how divided we’ve been. But we