In [1]:
import os
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_VERSION"] = "2023-07-01-preview"
os.environ["OPENAI_API_BASE"] = ""
os.environ["OPENAI_API_KEY"] = ""

In [2]:
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chat_models import AzureChatOpenAI
from langchain.chains import ReduceDocumentsChain, MapReduceDocumentsChain
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter

In [3]:
llm = AzureChatOpenAI(
    deployment_name="gpt-4"
    #deployment_name="gpt-4-32k"
)

In [4]:
with open('conversation.txt', 'r') as file:
    text = file.read()

# Printing the first 285 characters as a preview
print (text[:285])

00:00.000 --> 00:13.000]  I can say we are at the moment in advance stage, I can say, because we've been adopting different automations.
[00:13.000 --> 00:20.000]  I have a different view from the previous panelists when we discussed about the book.
[00:20.000 --> 00:27.000]  I rememb


In [5]:
num_tokens = llm.get_num_tokens(text)

print (f"There are {num_tokens} tokens in your file")

There are 16520 tokens in your file


In [6]:
text_splitter = RecursiveCharacterTextSplitter(separators=["\n\n", "\n"], chunk_size=5000, chunk_overlap=350)
docs = text_splitter.create_documents([text])

print(f"You now have {len(docs)} docs instead of 1 piece of text")

You now have 12 docs instead of 1 piece of text
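
To make the roles of chunk_size and chunk_overlap concrete, here is a minimal sketch of a fixed-window splitter in plain Python. It is only an illustration: the helper name split_with_overlap is hypothetical, and RecursiveCharacterTextSplitter does not work this way exactly, since it also tries to break on the given separators.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into character windows of at most chunk_size,
    where each window repeats the last chunk_overlap characters
    of the previous one. Hypothetical helper, for illustration only."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of sending some text to the model twice.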


In [7]:
# Prompt template, reused below for both the map step and the reduce step
reduce_template = """The following is the conversation in the panel discussion:
{doc_summaries}

Summarise the conversation in the 10 bullet points in details.

Helpful Answer:"""
reduce_prompt = PromptTemplate.from_template(reduce_template)
# Map chain: applied to each chunk individually (it shares the reduce prompt)
map_chain = LLMChain(llm=llm, prompt=reduce_prompt)

In [8]:
# Reduce chain: summarises the combined outputs of the map step
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

# Takes a list of documents, combines them into a single string, and passes this to an LLMChain
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_chain, document_variable_name="doc_summaries"
)

# Combines and iteratively reduces the mapped documents
reduce_documents_chain = ReduceDocumentsChain(
    # This is the final chain that is called.
    combine_documents_chain=combine_documents_chain,
    # If documents exceed context for `StuffDocumentsChain`
    collapse_documents_chain=combine_documents_chain,
    # The maximum number of tokens to group documents into.
    token_max=4000,
)
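
The effect of token_max can be pictured as packing the mapped summaries into groups that each stay within a token budget, with each group then collapsed separately. The sketch below shows that idea in plain Python under simplifying assumptions: group_by_token_budget and count_words are hypothetical names, words stand in for tokens, and LangChain's own grouping logic may differ in detail.

```python
def group_by_token_budget(texts, token_max, count_tokens):
    """Greedily pack texts into consecutive groups whose total token
    count does not exceed token_max. Hypothetical helper, illustration only."""
    groups, current, current_tokens = [], [], 0
    for t in texts:
        n = count_tokens(t)
        # Start a new group when adding this text would exceed the budget
        if current and current_tokens + n > token_max:
            groups.append(current)
            current, current_tokens = [], 0
        current.append(t)
        current_tokens += n
    if current:
        groups.append(current)
    return groups


def count_words(text):
    # Crude stand-in for a real tokenizer (assumption)
    return len(text.split())


summaries = ["a b c", "d e", "f g h i", "j"]
groups = group_by_token_budget(summaries, token_max=5, count_tokens=count_words)
print(groups)
```

Note that a single text larger than the budget still ends up in a group of its own here; a real pipeline would need to handle that case explicitly.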

In [9]:
# Combining documents by mapping a chain over them, then combining results
map_reduce_chain = MapReduceDocumentsChain(
    # Map chain
    llm_chain=map_chain,
    # Reduce chain
    reduce_documents_chain=reduce_documents_chain,
    # The variable name in the llm_chain to put the documents in
    document_variable_name="doc_summaries",
    # Return the results of the map steps in the output
    return_intermediate_steps=False,
)

# Re-split the documents by token count before running the map step
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000, chunk_overlap=0
)
split_docs = text_splitter.split_documents(docs)

In [10]:
print(map_reduce_chain.run(split_docs))

1. The discussion began with panelists sharing their experiences of automation and integration in their businesses. They discussed the use of automated solutions in areas like procurement and the implementation of robotics in vendor management. They also expressed their future plans to incorporate AI in sourcing.

2. Panelists then discussed the challenges of digital adoption faced by smaller companies with limited resources. They highlighted the need to integrate standalone solutions and shared examples from HR processes. They introduced the concept of "dog fooding", the practice of testing solutions internally before introducing them to customers.

3. The role of CFOs in digital transformation was discussed next. Panelists emphasized the importance of CFOs championing change and fostering a strong partnership with CIOs. They also discussed potential technology disruptions, focusing on process mining, its role in identifying inefficiencies, and providing insights for potential automat