# Psychic
This notebook covers how to load documents from the [Psychic Python library](https://pypi.org/project/psychicapi/) and use them for question answering. See [here](../../../../ecosystem/psychic.md) for more details.

## Prerequisites
1. Follow the Quick Start section in [this document](../../../../ecosystem/psychic.md).
2. Log into the [Psychic dashboard](https://dashboard.psychic.dev/) and get your secret key.
3. Install the frontend React library into your web app and have a user authenticate a connection. The connection will be created using the connection id that you specify.

In [1]:
!pip install psychicapi
!pip install langchain
!pip install openai 
!pip install chromadb
!pip install tiktoken

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting psychicapi
  Downloading psychicapi-0.4-py3-none-any.whl (2.6 kB)
Installing collected packages: psychicapi
Successfully installed psychicapi-0.4
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting langchain
  Downloading langchain-0.0.174-py3-none-any.whl (869 kB)
Collecting aiohttp<4.0.0,>=3.8.3 (from langchain)
  Downloading aiohttp-3.8.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting async-timeout<5.0.0,>=4.0.0 (from langchain)
  Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting dataclasses-json<0.6.0,>=0.5.7 (from langchain)

In [11]:
import os

# Replace the placeholder with your own key; avoid committing a real key to a shared notebook.
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"
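
Rather than pasting a key into a cell, one option is to read it from the environment and prompt for it only when it is missing. The following is a minimal sketch of that idea; `ensure_api_key` is a hypothetical helper name, not part of Psychic, LangChain, or OpenAI.

```python
import os
from getpass import getpass


def ensure_api_key(name="OPENAI_API_KEY", prompt=getpass):
    """Return the value of os.environ[name], prompting once if it is unset.

    `prompt` is injectable so the helper can be exercised without a terminal.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        # Nothing usable in the environment: ask the user without echoing input.
        value = prompt(f"Enter {name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = value
    return value
```

Calling `ensure_api_key()` at the top of the notebook would leave `OPENAI_API_KEY` set for the cells that follow, without the key ever appearing in the saved file.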

## Loading documents

Use the `get_documents` function to load in documents from a connection. Each connection has a connector id (corresponding to the SaaS app that was connected) and a connection id (which you passed in to the frontend library).

In [6]:
from langchain.docstore.document import Document
from psychicapi import Psychic, ConnectorId

# Load documents from Notion. We can also load from other connectors by passing a different ConnectorId member.
# This example uses our test credentials
psychic = Psychic(secret_key="d9df0919-5b99-441c-ab8e-0f7ef1440c28")

raw_docs = psychic.get_documents(ConnectorId.notion, "test_connector")

docs = [
    Document(
        page_content=doc["content"],
        metadata={"title": doc["title"], "source": doc["uri"]},
    )
    for doc in raw_docs
]
docs


[Document(page_content='Test page. Test content', metadata={'title': 'Test page', 'source': 'https://www.notion.so/Test-page-aad23d3c9d7f46d6805110920dee99ac'})]


## Converting the docs to embeddings 

We can now convert these documents into embeddings and store them in a vector database such as Chroma.

In [12]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import RetrievalQAWithSourcesChain

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(docs)


embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)
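
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-width chunker in plain Python. It only illustrates what the two parameters mean and is not how `CharacterTextSplitter` works internally (the real splitter also takes a separator into account); `chunk_text` is a hypothetical name introduced for this sketch.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", 4))     # ['abcd', 'efgh', 'ij']
print(chunk_text("abcdefghij", 4, 1))  # ['abcd', 'defg', 'ghij']
```

With `chunk_overlap=0`, as in the cell above, no text is repeated between chunks. The test page here is far shorter than 1000 characters, so it ends up as a single chunk either way.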



## Question answering

Now we use OpenAI to ask questions over the documents.

In [14]:
chain = RetrievalQAWithSourcesChain.from_chain_type(OpenAI(temperature=0), chain_type="stuff", retriever=docsearch.as_retriever())
chain({"question": "what is this page called?"}, return_only_outputs=True)

ERROR:root:Chroma collection langchain contains fewer than 4 elements.
ERROR:root:Chroma collection langchain contains fewer than 3 elements.
ERROR:root:Chroma collection langchain contains fewer than 2 elements.


{'answer': ' This page is called "Test page".\n',
 'sources': 'https://www.notion.so/Test-page-aad23d3c9d7f46d6805110920dee99ac'}
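
The `ERROR:root:Chroma collection langchain contains fewer than N elements` lines are consistent with the retriever asking for more results than the single indexed chunk can supply. That reading is an assumption based on the messages, not something the notebook states. The toy top-k search below, written in plain Python with made-up vectors, sketches the retrieval step and shows `k` being clamped to the collection size. It is not Chroma's implementation, and `cosine_similarity` and `top_k` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=4):
    """Return the texts of the k entries most similar to query_vec.

    `indexed` is a list of (text, vector) pairs. k is clamped so that asking
    for more results than exist simply returns everything available.
    """
    k = min(k, len(indexed))
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# A one-element "collection", queried with k=4 as in the situation above.
collection = [("Test page. Test content", [0.2, 0.9])]
print(top_k([0.1, 0.8], collection, k=4))  # ['Test page. Test content']
```

In LangChain, the number of results requested can be set when creating the retriever, for example `docsearch.as_retriever(search_kwargs={"k": 1})`, which should avoid the fallback messages for a collection this small.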