In [15]:
import pandas as pd
import os

# Placeholder: paste a real key here, or set OPENAI_API_KEY in the environment beforehand
OPENAI_API_KEY = "..."
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

# Use OpenAI Chat
Just testing out the chat to see some code work!

In [3]:
from langchain.llms import OpenAI
llm = OpenAI(temperature=0.9, openai_api_key=OPENAI_API_KEY)

In [4]:
query = "Have you heard of MyCase?"
print(llm(query))



Yes, MyCase is a cloud-based legal practice management platform used by lawyers and law firms. It offers features such as document management, electronic billing, client management, time tracking, and more.


# Creating a Document
Practice with documents, more info here: https://python.langchain.com/en/latest/modules/indexes/document_loaders.html

## Copy and Paste

In [5]:
text = """
Even after an invoice has been created, you can still go back and make changes to it. This is useful for fixing errors, adding additional line items, changing payment terms, etc.

 


To edit an invoice, simply find the invoice you want to edit and open its details page (pictured below). Then, in the actions toolbar, click the Edit button.

 


MyCase will then take you to the Invoice Editor where you will be able to make changes to the invoice.

If trust funds have been applied to the invoice, you can choose to change how the trust information is displayed. While editing the invoice, scroll down to the Apply Trust & Credit Funds section. You will see the available dropdown menus to select whether to have the invoice "Show Trust Summary" (balance only), "Show Trust History," or "Don't Show on Invoice" (no trust information).

If you need to change the contact or case/matter for the invoice, please note that upon changing this information, the invoice will then change to reflect uninvoiced flat fees, time and expenses for the newly selected contact or case/matter.

If you need to change an invoice that had no case selected to now include a case, you can do this by clicking the Edit button and then selecting the case you wish to link it to under the 'Matter' section. Uninvoiced time entries, expenses and flat fees related to this case will then appear on the invoice.

If you need to move a flat fee, time entry, or expense off of an invoice and to another case, click the red X to the left of the entry and choose 'Remove'. This will send the item back to the case file. Save the invoice before moving forward. Then, go to the case file where that removed item is housed and click the Edit button to change the case it should be associated with. The next time that you invoice for this case, the item you moved to this case will be able to be invoiced. 

 

 

User-added image
To edit invoice sharing:

If you need to share or un-share the invoice with contacts, click the Share via Portal or Email Invoice on the invoice toolbar. Read more about Invoice Sharing. You can also send payment reminders and record a payment from this invoice toolbar.

 


 

User-added image
To edit payment history of an invoice (issue refunds and delete transactions):

To edit payment history, click the Invoice History button on the invoice toolbar. Read more about Editing Payment History.

 
"""

In [9]:
from langchain.docstore.document import Document
metadata={"article-name": "How do I edit an existing invoice? [w/ Video]"}
doc = Document(page_content=text, metadata=metadata) 

In [10]:
doc

Document(page_content='\nEven after an invoice has been created, you can still go back and make changes to it. This is useful for fixing errors, adding additional line items, changing payment terms, etc.\n\n \n\n\nTo edit an invoice, simply find the invoice you want to edit and open its details page (pictured below). Then, in the actions toolbar, click the Edit button.\n\n \n\n\nMyCase will then take you to the Invoice Editor where you will be able to make changes to the invoice.\n\nIf trust funds have been applied to the invoice, you can choose to change how the trust information is displayed. While editing the invoice, scroll down to the Apply Trust & Credit Funds section. You will see the available dropdown menus to select whether to have the invoice "Show Trust Summary" (balance only), "Show Trust History," or "Don\'t Show on Invoice" (no trust information).\n\nIf you need to change the contact or case/matter for the invoice, please note that upon changing this information, the inv

## URLs
Loading URLs: here we load multiple help articles without having to copy and paste.
https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/url.html

In [3]:
from langchain.document_loaders import UnstructuredURLLoader
import nltk

In [8]:
urls = [
    'https://support.mycase.com/en/articles/6638008-how-do-i-edit-an-existing-invoice-w-video',
    'https://support.mycase.com/en/articles/6449590-can-i-create-time-expense-entries-for-other-firm-users',
    'https://support.mycase.com/en/articles/6442227-improved-invoice-details-layout'
]

In [9]:
loader = UnstructuredURLLoader(urls=urls)

In [10]:
data = loader.load()

In [11]:
data

[Document(page_content='Even after an invoice has been created, you can still go back and make changes to it. This is useful for fixing errors, adding additional line items, changing payment terms, etc.\n\nTo edit an invoice, simply find the invoice you want to edit and open its details page (pictured below). Then, in the actions toolbar, click the Edit button.\n\nTo edit invoice sharing:\n\nIf you need to share or un-share the invoice with contacts, click the Share via Portal or Email Invoice on the invoice toolbar. Read more about Invoice Sharing. You can also send payment reminders and record a payment from this invoice toolbar.\n\nTo edit payment history of an invoice (issue refunds and delete transactions):\n\nTo edit payment history, click the Invoice History button on the invoice toolbar. Read more about Editing Payment History.\n\nCreating an Invoice [w/ VIDEO]Enabling Online Payments on an Invoice [w/ VIDEO]Quick Invoicing Without a Case', metadata={'source': 'https://support.

# QA With Documents
https://python.langchain.com/en/latest/use_cases/question_answering.html
https://python.langchain.com/en/latest/modules/chains/index_examples/vector_db_qa.html

In [21]:
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings

In [24]:
# using the loader up above with the urls
index = VectorstoreIndexCreator(
    vectorstore_cls=Chroma, 
    embedding=OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY),
    text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
).from_loaders([loader]) 

Using embedded DuckDB without persistence: data will be transient


In [25]:
index

VectorStoreIndexWrapper(vectorstore=<langchain.vectorstores.chroma.Chroma object at 0x16ce47490>)

In [28]:
query = "How do I edit an invoice?"
index.query_with_sources(query)

{'question': 'How do I edit an invoice?',
 'answer': ' To edit an invoice, find the invoice you want to edit and open its details page. Then, in the actions toolbar, click the Edit button. You can also share or un-share the invoice with contacts, send payment reminders, record a payment, delete the invoice, print the invoice, download the invoice, and email the invoice with a credit card payment link.\nSource: \nhttps://support.mycase.com/en/articles/6638008-how-do-i-edit-an-existing-invoice-w-video\nhttps://support.mycase.com/en/articles/6442227-improved-invoice-details-layout',
 'sources': ''}

In [30]:
query = "How do I create a time entry?"
index.query_with_sources(query)

{'question': 'How do I create a time entry?',
 'answer': " To create a time entry, open the Invoices section under the Contact details and select the contact you'd like to add the entry for. Then, select 'None' under the 'Matter' section and add your notes and the amount of the flat fee.\n",
 'sources': 'https://support.mycase.com/en/articles/6449590-can-i-create-time-expense-entries-for-other-firm-users, https://support.mycase.com/en/articles/6442227-improved-invoice-details-layout'}

In [45]:
query = "Can I create a time entry for other firm users?"
index.query_with_sources(query)

{'question': 'Can I create a time entry for other firm users?',
 'answer': ' Yes, you can create time/expense entries on behalf of another firm user.\n',
 'sources': 'https://support.mycase.com/en/articles/6449590-can-i-create-time-expense-entries-for-other-firm-users'}

In [46]:
query = "Is MyCase a good company?"
index.query_with_sources(query)

{'question': 'Is MyCase a good company?',
 'answer': " I don't know if MyCase is a good company.\n",
 'sources': 'N/A'}

In [47]:
query = "Who is oliva rodrigo?"
index.query_with_sources(query)

{'question': 'Who is oliva rodrigo?',
 'answer': " I don't know.\n",
 'sources': 'N/A'}

It looks like the bot is slightly overconfident, but I think this has more to do with the limited information given to it. It is a good sign that it says it doesn't know when asked about things it doesn't know.
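One thing that stands out in the outputs above: for the "How do I edit an invoice?" query the `sources` field came back empty and the URLs ended up inside `answer` after a "Source:" marker, while unanswerable questions returned `'N/A'`. Below is a minimal sketch of a helper that tidies this up. `normalize_result` is a hypothetical name of my own, not a LangChain function, and it only assumes the dict shape shown in the outputs above.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")

def normalize_result(result):
    """Return (answer, sources) where sources is a list of URLs (possibly empty)."""
    answer = result.get("answer", "")
    sources_field = result.get("sources", "").strip()

    # Case seen above: 'sources' is empty and the URLs follow "Source:" in the answer.
    if not sources_field and "Source:" in answer:
        answer, _, sources_field = answer.partition("Source:")

    # 'N/A' contains no URL, so it naturally becomes an empty list.
    sources = [url.rstrip(",") for url in URL_PATTERN.findall(sources_field)]
    return answer.strip(), sources

# Example input mimicking the shape of the first result above (URLs are made up).
example = {
    "question": "How do I edit an invoice?",
    "answer": " Click the Edit button.\nSource: \nhttps://example.com/a\nhttps://example.com/b",
    "sources": "",
}
answer, sources = normalize_result(example)
print(answer)
print(sources)
```

An empty `sources` list is then a cheap signal that the answer is not backed by any article, which could be shown to the user alongside the response.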

# Retrieval Question/Answering 
https://python.langchain.com/en/latest/modules/chains/index_examples/vector_db_qa.html
https://python.langchain.com/en/latest/modules/chains/index_examples/question_answering.html

This is essentially what we are doing above, but the code above sits one level of abstraction higher. Here, we dive into the RetrievalQA chain itself.
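To make the idea concrete before touching LangChain, here is a rough, dependency-free sketch of what a retrieve-then-"stuff" flow does conceptually: embed the question, rank chunks by similarity, and paste the top ones into a prompt. Everything in it is an assumption for illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model, and `retrieve` and `build_stuff_prompt` are hypothetical helpers, not LangChain's implementation.

```python
import math
import re
from collections import Counter

PROMPT_TEMPLATE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n\n{context}\n\nQuestion: {question}\n"
)

def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]

def build_stuff_prompt(question, chunks, k=2):
    """'Stuff' the retrieved chunks into a single prompt as context."""
    context = "\n\n".join(retrieve(question, chunks, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)

chunks = [
    "To edit an invoice, open its details page and click the Edit button.",
    "Yes, you can create time/expense entries on behalf of another firm user.",
    "MyCase will then take you to the Invoice Editor where you will be able to make changes to the invoice.",
]
prompt = build_stuff_prompt("How do I edit an invoice?", chunks, k=1)
print(prompt)
```

The resulting string is what would be sent to the LLM. In the real chain, a vector store does the ranking and a proper embedding model replaces the word counts.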

In [34]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

In [36]:
urls = [
    'https://support.mycase.com/en/articles/6638008-how-do-i-edit-an-existing-invoice-w-video',
    'https://support.mycase.com/en/articles/6449590-can-i-create-time-expense-entries-for-other-firm-users',
    'https://support.mycase.com/en/articles/6442227-improved-invoice-details-layout'
]
loader = UnstructuredURLLoader(urls=urls)
documents = loader.load()

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)

Using embedded DuckDB without persistence: data will be transient
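For intuition on the `CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)` step above, here is a minimal sketch of splitting on a separator and greedily packing the pieces into chunks. This is my simplified understanding, not LangChain's actual code: `split_text` is a hypothetical function, the blank-line separator is an assumption, and overlap is left out since it is 0 here.

```python
def split_text(text, chunk_size=1000, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        piece = piece.strip()
        if not piece:
            continue  # skip empty paragraphs left over from extra blank lines
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

sample = "First paragraph.\n\nSecond paragraph.\n\n\n\nThird paragraph."
for chunk in split_text(sample, chunk_size=40):
    print(repr(chunk))
```

Smaller chunks mean more, cheaper-to-stuff pieces of context, but each one carries less of the surrounding article, so `chunk_size` is worth experimenting with.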


In [37]:
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

In [39]:
chain_type_kwargs = {"prompt": PROMPT}
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=docsearch.as_retriever(),
    chain_type_kwargs=chain_type_kwargs,
    return_source_documents=True
)


In [40]:
query = "How do I create a time entry?"
result = qa({"query": query})

In [42]:
result['result']

"\nTo create a time entry, select the User field and choose the name of the person you'd like to add the entry for. Then, add notes and the amount of the flat fee. You can also add more flat fees if you require them. Time entries and expenses will not be an option unless a case file were to be chosen."

In [44]:
result['source_documents']

[Document(page_content="Yes, you can create time/expense entries on behalf of another firm user.\n\nOVERVIEW\n\nThis comes in handy for situations such as a paralegal entering time for an attorney. Even though the paralegal is entering the entry into MyCase, it should still be tied to the attorney, not the paralegal.\n\nWhen creating a time/expense entry, there is a field called User (pictured below). You will see a dropdown menu that has the names of all the firm employees on your MyCase account. By default, your name will appear here. Simply select the name of the person you'd like to add the entry for.", metadata={'source': 'https://support.mycase.com/en/articles/6449590-can-i-create-time-expense-entries-for-other-firm-users'}),
 Document(page_content="MyCase makes it easy for you to create an invoice for contacts whom you need to bill for a consultation, or to bill clients who need a basic task done that isn't necessarily associated with a case or matter.When creating an invoice, a

The response here doesn't look much better than the version above, but it's still cool that we are getting answers. I've spent 15 cents at this point, too. For some reason there was a big jump from 35 hundredths of a cent to 15 cents when running the chain above. Could have been a delay, too.
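Since the spend surprised me, a back-of-the-envelope estimator could help sanity-check it. The sketch below is only a rough guess: it assumes about 4 characters per token for English text (a common rule of thumb, not an exact tokenizer), and the prices are hypothetical placeholders that must be replaced with the numbers from the current OpenAI pricing page.

```python
# Hypothetical placeholder prices in dollars per 1,000 tokens; NOT real pricing.
EMBEDDING_PRICE_PER_1K = 0.001
COMPLETION_PRICE_PER_1K = 0.01

def estimate_tokens(text, chars_per_token=4):
    """Very rough token count, assuming ~4 characters per token."""
    return max(1, len(text) // chars_per_token)

def estimate_cost(n_tokens, price_per_1k_tokens):
    """Cost in dollars for n_tokens at the given per-1K-token price."""
    return n_tokens / 1000 * price_per_1k_tokens

# Example: one query that stuffs four 1000-character chunks into the prompt.
context = "x" * 4000
question = "How do I create a time entry?"
prompt_tokens = estimate_tokens(context + question)

embed_cost = estimate_cost(estimate_tokens(question), EMBEDDING_PRICE_PER_1K)
llm_cost = estimate_cost(prompt_tokens, COMPLETION_PRICE_PER_1K)
print(f"prompt tokens ~{prompt_tokens}, estimated cost ~${embed_cost + llm_cost:.5f}")
```

Under these assumptions, the stuffed context accounts for almost all of the per-query cost, while embedding the question itself is negligible.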

# Takeaways
After messing around with this and doing some research this last weekend, I've concluded the following:
- This is a super powerful/interesting tool
- LangChain is awesome because you can build quickly, but it is very abstracted. It is definitely important to spend some time really going through the code and what's going on under the hood if we are to actually implement a solution with it
- Costs are still a bit unclear to me; however, I know the general cost/workflow goes like this:
    - We get our documents prepped to be embedded (free)
    - We embed our data (there are free options, but the OpenAI embeddings endpoint has costs)
    - We send our embeddings to a vectorDB (here Chroma, which is free but transient as configured; there are options with a free tier and paid plans like Pinecone, Weaviate, etc. Postgres also has the ability to store vectors)
    - We receive user input and embed that input (we have to pay to embed it, but it shouldn't be too expensive)
    - We query the vectorDB for documents based on our input (free)
    - We return the documents from our query and then "stuff" them into our prompt as context. There are other ways to do this too. The prompt used for this is actually written above. (This step calls the OpenAI LLM endpoint, which can cost some money)
    - We return the result to the user (we have the option to include the sources used; other metadata could probably be shown too, but I'm not sure)
- Data is not persistent
    - Need to look into a way to store these vector embeddings, otherwise we are wasting resources embedding the articles every time someone queries
    - As mentioned above, can look into Pinecone, Weaviate, and other options
- Need more data
    - Would be good to add a few more links to really show the world what this puppy can do; I just don't want to waste money doing this until we are saving it in a vectorDB
- Model can be slightly overconfident
    - As seen above, the model thinks it knows how to create a time entry, but it doesn't. I think we can get around this issue by supplying it with more info. The nice thing is that we can cite sources, so users at least know where the model's answer is coming from.
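On the persistence point, the core idea is simple enough to sketch without any vector database: cache each embedding on disk keyed by a hash of the text, so the paid embedding call only happens for text we have not seen before. This is an illustration only. `EmbeddingCache` and `fake_embed` are hypothetical names, `fake_embed` stands in for a real (paid) embeddings call, and a real setup would more likely use the vector store's own persistence.

```python
import hashlib
import json
import os
import tempfile

class EmbeddingCache:
    """Disk-backed cache so each distinct text is embedded at most once."""

    def __init__(self, path, embed_fn):
        self.path = path
        self.embed_fn = embed_fn
        self.calls = 0  # how many times the (paid) embed function actually ran
        if os.path.exists(path):
            with open(path) as f:
                self.store = json.load(f)
        else:
            self.store = {}

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            self.store[key] = self.embed_fn(text)
            self.calls += 1
            with open(self.path, "w") as f:
                json.dump(self.store, f)
        return self.store[key]

def fake_embed(text):
    """Stand-in for a real embeddings API call."""
    return [float(len(text)), float(text.count(" "))]

path = os.path.join(tempfile.mkdtemp(), "embeddings.json")

cache = EmbeddingCache(path, fake_embed)
cache.get("How do I edit an invoice?")
cache.get("How do I edit an invoice?")  # served from the cache
print(cache.calls)  # 1

reloaded = EmbeddingCache(path, fake_embed)  # simulates a new session
reloaded.get("How do I edit an invoice?")
print(reloaded.calls)  # 0
```

The second session pays nothing for text that was already embedded, which is the behavior we want from whichever persistent store we end up choosing.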