# Psychic
This notebook covers how to load documents with the [Psychic Python library](https://pypi.org/project/psychicapi/) and use them for question answering. See [here](../../../../ecosystem/psychic.md) for more details.

## Prerequisites
1. Follow the Quick Start section in [this document](../../../../ecosystem/psychic.md)
2. Log into the [Psychic dashboard](https://dashboard.psychic.dev/) and get your secret key
3. Install the frontend React library in your web app and have a user authenticate a connection. The connection is created with the connection ID that you specify.

In [24]:
!pip install psychicapi
!pip install langchain
!pip install openai 
!pip install chromadb
!pip install python-dotenv beautifulsoup4
!pip install tiktoken

In [1]:
import os

## Loading documents

Use the `get_conversations` function to load messages from a connection. Each connection has a connector ID (corresponding to the SaaS app that was connected, Slack in this example) and a connection ID (which you passed to the frontend library; the code below supplies it as `account_id`).

In [28]:
from dotenv import load_dotenv
load_dotenv()
from langchain.docstore.document import Document
from psychicapi import Psychic, ConnectorId
from bs4 import BeautifulSoup
import datetime

# Create a Psychic client and load conversations from Slack.
# We can also load from other connectors by setting connector_id to the appropriate value, e.g. ConnectorId.notion.value
# This client uses our test credentials
psychic = Psychic(secret_key=os.getenv("PSYCHIC_SECRET_KEY"))

raw_messages = psychic.get_conversations(connector_id=ConnectorId.slack, account_id="slack_test")
crammed_messages = []
channel_id = None
channel_name = None
for message in raw_messages:
    # Message content is provided as HTML in order to preserve markup like links, tables, images, etc.
    channel_id = message["channel"]["id"]
    channel_name = message["channel"]["name"]
    # Name the parser explicitly so BeautifulSoup does not have to guess one
    content = BeautifulSoup(message["content"], "html.parser").get_text()
    time = datetime.datetime.fromtimestamp(float(message["timestamp"])).strftime("%m/%d/%y:%H:%M")
    crammed_messages.append("({}) ".format(time) + message["sender"]["name"] + ": " + content)
# print(crammed_messages)

docs = [Document(page_content="\n".join(crammed_messages), metadata={"source": channel_id, "source_name": channel_name})]
print(docs)


[Document(page_content='(05/31/23:12:17) jason: nvm adding it now\n(05/31/23:10:53) jason: i.e. add get_conversations\n(05/31/23:10:53) jason: did you already make the changes to the python package for slack? I want to make a post about it today', metadata={'source': 'C04DWRUG83T', 'source_name': 'engineering'})]
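The loop above puts every message into a single document and keeps only the last channel ID it saw. The sketch below shows one way the same formatting could be applied per channel, using only the standard library. It assumes the message fields used in the cell above (`channel`, `content`, `timestamp`, `sender`); the helper names `html_to_text`, `format_message`, and `group_by_channel` are hypothetical, the standard-library `HTMLParser` stands in for BeautifulSoup, and timestamps are rendered in UTC rather than local time so the result is reproducible.

```python
import datetime
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip tags from an HTML fragment (stand-in for BeautifulSoup.get_text)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def format_message(message):
    """Render one message as '(MM/DD/YY:HH:MM) sender: text', in UTC."""
    ts = datetime.datetime.fromtimestamp(
        float(message["timestamp"]), tz=datetime.timezone.utc
    )
    stamp = ts.strftime("%m/%d/%y:%H:%M")
    text = html_to_text(message["content"])
    return "({}) {}: {}".format(stamp, message["sender"]["name"], text)


def group_by_channel(messages):
    """Return one {'page_content', 'metadata'} dict per channel ID."""
    lines = defaultdict(list)
    names = {}
    for message in messages:
        channel_id = message["channel"]["id"]
        names[channel_id] = message["channel"]["name"]
        lines[channel_id].append(format_message(message))
    return {
        channel_id: {
            "page_content": "\n".join(channel_lines),
            "metadata": {"source": channel_id, "source_name": names[channel_id]},
        }
        for channel_id, channel_lines in lines.items()
    }
```

Each value returned by `group_by_channel` has the same two fields that the cell above passes to `Document`, so it could be unpacked as `Document(**value)` if one document per channel is preferred.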


## Converting the docs to embeddings

We can now convert these documents into embeddings and store them in a vector database such as Chroma.

In [29]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import RetrievalQAWithSourcesChain

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(docs)


embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)
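To make the idea of an embedding-backed store concrete, here is a toy sketch of similarity search. It is not how Chroma or `OpenAIEmbeddings` work internally: the word-count "embedding", the `ToyVectorStore` class, and its `search` method are all hypothetical stand-ins chosen so the example runs without any API key. It also caps `k` at the number of stored entries, which mirrors the `n_results` message that Chroma prints later in this notebook when the index holds a single element.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (text, vector) pairs in memory and ranks them by cosine similarity."""

    def __init__(self, texts):
        self.entries = [(text, embed(text)) for text in texts]

    def search(self, query, k=4):
        # Never ask for more results than there are stored entries.
        k = min(k, len(self.entries))
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]
```

A real embedding model captures meaning rather than exact word overlap, so this sketch only illustrates the store-then-rank flow, not retrieval quality.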

## Question answering

Now we use OpenAI to ask questions over the embedded Slack messages.

In [32]:
chain = RetrievalQAWithSourcesChain.from_chain_type(OpenAI(temperature=0), chain_type="stuff", retriever=docsearch.as_retriever())
chain({"question": "what did jason want to do on 05/31/2023?"}, return_only_outputs=True)

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


{'answer': ' Jason wanted to add the get_conversations feature to the Python package for Slack.\n',
 'sources': 'C04DWRUG83T'}
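The `sources` value in the result is the channel ID stored under `source` in the document metadata. The sketch below maps such IDs back to the readable `source_name`. The helper `resolve_sources` is hypothetical, and it assumes that multiple sources, when present, arrive as a comma-separated string; that assumption is not confirmed by the output above, which contains only one source.

```python
def resolve_sources(result, docs_metadata):
    """Pair each source ID in a chain result with its channel name.

    result: dict with a 'sources' string, as in the output above.
    docs_metadata: list of metadata dicts with 'source' and 'source_name' keys.
    """
    names = {meta["source"]: meta["source_name"] for meta in docs_metadata}
    # Assumption: several sources would be separated by commas.
    source_ids = [
        part.strip() for part in result.get("sources", "").split(",") if part.strip()
    ]
    return [(source_id, names.get(source_id, "unknown")) for source_id in source_ids]
```

With the result and metadata shown in this notebook, the helper would pair `C04DWRUG83T` with `engineering`.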