# Case study writer

In [17]:
# Imports
import os
import openai
from dotenv import load_dotenv
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

To call OpenAI models through the API we need an API key, which you can create on the OpenAI website. A good way to store the key is in a hidden `.env` file that is kept out of version control; `load_dotenv()` then loads it into the environment.

In [18]:
# Load the API key
load_dotenv()
openai.api_key = os.environ["OPENAI_API_KEY"]
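
If the key is missing, the lookup above fails with a bare `KeyError`. A minimal sketch of a friendlier check follows; `require_env` is a hypothetical helper, not part of `python-dotenv` or `openai`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell."
        )
    return value
```

With this helper, `os.environ["OPENAI_API_KEY"]` could be replaced by `require_env("OPENAI_API_KEY")`.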

The reference documents have already been converted into a FAISS vector store, which we load here. We also load a second vector store with examples from the Covolt case study (any other case study would work as well).

In [19]:
# Load Vereijken vector store
vector_store = FAISS.load_local('../vector_stores/vereijken.faiss', OpenAIEmbeddings(),
                                allow_dangerous_deserialization=True)

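# Create retriever
# Note: 'score_threshold' is normally used with search_type='similarity_score_threshold';
# with 'mmr' it may have no effect.
retriever = vector_store.as_retriever(search_type='mmr',
                                      search_kwargs={'k': 4, 'score_threshold': 0.25})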

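# Load Example vector store
# Note: this path is identical to the Vereijken store above; it should point to the
# vector store built from the example case study (Covolt).
example_vector_store = FAISS.load_local('../vector_stores/vereijken.faiss', OpenAIEmbeddings(),
                                        allow_dangerous_deserialization=True)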

# Create retriever
example_retriever = example_vector_store.as_retriever(search_type='mmr',
                                                      search_kwargs={'k': 1})
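
Both retrievers use `search_type='mmr'` (maximal marginal relevance), which trades relevance to the query against redundancy among the documents already picked. The sketch below is a simplified pure-Python illustration of that idea, not the code that LangChain or FAISS actually runs; `cosine`, `mmr_select` and the toy vectors are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.5):
    """Greedily pick up to k document indices, balancing relevance and diversity."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest document that was already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different.
query = [1.0, 1.0]
docs = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]

diverse = mmr_select(query, docs, k=2, lambda_mult=0.5)
relevance_only = mmr_select(query, docs, k=2, lambda_mult=1.0)
```

With `lambda_mult=1.0` the selection reduces to a plain similarity ranking; lower values push the second pick away from near-duplicates of the first.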

Now we can instantiate the model.

In [20]:
# Instantiate the model
model = ChatOpenAI(model="gpt-4o")

The prompt should contain enough context and detailed instructions so that the model can generate output that is as close to the user's expectation as possible. This is called prompt engineering and usually requires several iterations to get right. 

Several instructions in the prompt were only added in the second or third iteration:
- Do not write more than indicated in the instruction. Avoid repetition and keep the sections relatively concise and short.
- Do not explain what type of company Bright Cape is, only focus on the client company and the application.
- Only write things relevant to the current section. For example, when asked to write a client profile, only write a client profile; do not go into the challenges or model.

In [21]:
# Build a prompt template
template = """
You are an expert copy writer that writes sections of a case study for a professional website of a data
consultancy firm (Bright Cape). The case studies highlight specific data science / engineering / visualization 
applications that have been developed for a client. You will receive an exact instruction, an example and information 
about the client company and application to guide you. Do as the instruction states and make extensive use of the 
example and additional information. Do not write more than indicated in the instruction. Avoid repetition and keep the 
sections relatively concise and short.

Instruction:
{instruction}

Keep the tone professional but approachable and not too dry. Give enough detail to show that Bright Cape employs 
competent specialists, but not so much that a PhD is needed to understand. Stay very close to the format shown in the 
example. Do not explain what type of company Bright Cape is, only focus on the client company and the application. 
Only write things relevant to the current section. For example, when asked to write a client profile, only write a 
client profile; do not go into the challenges or model.

Let's think step by step.

Here is an example to guide you:
{example}

Use the following information about the company and application:
{context}
"""

Here we define a `RunnableParallel`, which queries the retriever and the example retriever at the same time. The instruction serves as the query for both retrievers and is also passed on unchanged (via `RunnablePassthrough`) to the next step, where it fills the `{instruction}` placeholder of the prompt.

In [22]:
prompt = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()

setup_and_retrieval = RunnableParallel(
    {"context": retriever, "example": example_retriever, "instruction": RunnablePassthrough()}
)
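
To make the fan-out concrete, here is a toy sketch of the same pattern using only the standard library: one input is sent to several callables at once and the results are collected under their keys. It is not how LangChain implements `RunnableParallel`; `run_parallel` and the stand-in functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(steps, value):
    """Call every step with the same input concurrently and collect results by key."""
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = {key: pool.submit(step, value) for key, step in steps.items()}
        return {key: future.result() for key, future in futures.items()}


# Hypothetical stand-ins for the two retrievers and the passthrough.
def fake_retriever(query):
    return [f"document about: {query}"]


def fake_example_retriever(query):
    return [f"example for: {query}"]


def passthrough(value):
    return value


result = run_parallel(
    {"context": fake_retriever, "example": fake_example_retriever, "instruction": passthrough},
    "Write a client profile",
)
```

The resulting dict has exactly the keys `context`, `example` and `instruction`, which is what the prompt template expects as input.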

We write a separate instruction for each section of the case study, including the desired length.

In [23]:
# Define company
company_name = "Vereijken Kwekerijen"

# Instructions
instructions = [
    f"Write a client profile for {company_name} of no more than a few sentences, "
    f"explaining what industry they are in, what they do, and what their vision is",
    f"Write a section explaining the unique challenges and problems that {company_name} was facing of 1-2 paragraphs",
    f"Write a section explaining Bright Cape's approach to solve the problems {company_name} was facing of 2-3 paragraphs, "
    f"explain what type of model was used and why",
    f"Write a section explaining the results and the impact that the solution has had for {company_name} of 1-2 paragraphs. "
    f"Use numbers to quantify the impact, such as a percentage reduction in cost"
]
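
Several of the instructions above are split over two adjacent f-strings, which Python joins into a single list item. The short sketch below, using a made-up company name, shows why a missing or misplaced comma changes the number of instructions without raising an error.

```python
company = "Example Company"  # hypothetical name, for illustration only

# Two adjacent f-strings WITHOUT a comma are merged into one list item.
merged = [
    f"Write a client profile for {company} "
    f"of no more than a few sentences"
]

# WITH a comma they become two separate list items.
separate = [
    f"Write a client profile for {company}",
    f"Write a section about the challenges of {company}",
]

print(len(merged), len(separate))
```

A quick `assert len(instructions) == 4` before the generation loop would catch such a slip early.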


Now we can set up the full chain and loop over the instructions, each time calling the model with a new instruction and appending the output to our list.

In [24]:
# Set up section writing chain with LCEL
section_writing_chain = setup_and_retrieval | prompt | model | output_parser

# Create an empty list for written sections
written_sections = []

# Iteratively call the LLM to write the sections
for instruction in instructions:
    response = section_writing_chain.invoke(instruction)
    written_sections.append(response)
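
The `|` operator in the chain above feeds the output of each step into the next one. The toy class below mimics that composition in plain Python to show the data flow; it is a simplified illustration, not LangChain's implementation, and `Step` and all stand-in steps are hypothetical.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Run this step first, then feed its output into the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for retrieval, prompt, model and output parser.
retrieve = Step(lambda instruction: {"instruction": instruction, "context": "facts"})
fill_prompt = Step(lambda d: f"{d['instruction']} using {d['context']}")
fake_model = Step(lambda prompt_text: {"content": prompt_text.upper()})
parse = Step(lambda message: message["content"])

toy_chain = retrieve | fill_prompt | fake_model | parse
sections = [toy_chain.invoke(text) for text in ["write a profile", "write results"]]
```

As in the real chain, each call to `invoke` runs all four steps in order for one instruction.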

In [25]:
written_sections[0]

'Vereijken Kwekerijen operates in the greenhouse horticulture industry, specializing in the cultivation of vine tomatoes. With over 50 hectares of growing space across multiple locations in Noord-Brabant and the Westland area, they ensure year-round supply through 40 hectares of illuminated cultivation. Their vision is to create a progressive, result-oriented environment that emphasizes collaboration and personal development, ensuring high-quality products and efficient processes.'

In [26]:
written_sections[1]

'Vereijken Kwekerijen, a modern vine tomato cultivation company, was facing significant challenges in optimizing the use of their combined heat and power (CHP) systems and lighting installations spread across multiple greenhouses. The company needed to balance the production of CO2, heat, and electricity from the CHPs with the fluctuating energy demands of the greenhouses and the variable prices on the energy market. This complexity made it difficult to determine the optimal operation schedule for the CHPs and lighting, leading to potential inefficiencies and higher operational costs.\n\nThe existing manual approach to managing these systems was time-consuming and labor-intensive, involving the use of an Excel-based tool that could only analyze costs on a daily basis. This tool required the energy manager to manually test various operational scenarios, which did not guarantee finding the optimal solution. Moreover, this method was impractical for forecasting future needs due to the sig

In [27]:
written_sections[2]

"To address the challenges faced by Vereijken Kwekerijen, Bright Cape adopted a methodical approach centered on developing an automated optimization model. This model was designed to streamline the decision-making process regarding the operation of gas-powered combined heat and power (CHP) units. Vereijken's existing process involved manually testing and comparing various combinations to determine the optimal use of these units, which was not only time-consuming but also lacked assurance of achieving the most cost-effective solution. \n\nBright Cape's solution harnessed the power of linear programming to create a robust model that could minimize total energy costs by optimizing the use of CHP units. This model considered multiple variables such as gas and electricity consumption costs, potential revenue from returning excess electricity to the grid, and the operational constraints of the CHP units. By automating this process, the model could efficiently explore countless combinations a

In [28]:
written_sections[3]

"The implementation of the automated optimization model has yielded remarkable results for Vereijken Kwekerijen. By fine-tuning the use of their gas-powered cogeneration units, Vereijken has achieved a significant reduction in energy costs. Specifically, the optimization model led to a 12% decrease in energy consumption, translating into substantial cost savings. This reduction is a testament to the model's efficiency in balancing the internal energy demands with the fluctuating market prices for energy supply and return.\n\nFurthermore, the model has provided valuable insights into past energy management decisions, helping Vereijken refine their strategies going forward. The collaboration between Bright Cape's data scientists and Vereijken's product experts has culminated in a robust solution that not only optimizes current resource usage but also lays the groundwork for future improvements. This enhanced energy management process positions Vereijken Kwekerijen to maintain its competi

We can now set up a final template to do some post-processing. The purpose of this is to give us a smoother transition from one section to the next. We also ask the LLM to come up with a title based on some good examples.

In [29]:
# Define the final template
final_template = """
You are an expert copy writer that writes a full case study from separate sections for a data consultancy firm.
There are four sections in total, which are provided below in the correct order. Do not alter the content of 
the sections too much, as they are already correct. You are only allowed to delete or rewrite a sentence here and there.
 Your aim is to create a cohesive whole out of the separate sections, with a focus in particular on a natural,
 smooth transition from one section to the next. However, the sections should remain separated: keep the headers in 
 between the sections. Keep the tone professional but approachable and not too dry. Avoid repetition and keep the 
 sections relatively concise.

Please find the four sections below:

### Section 1: Client Profile
{section1}

### Section 2: Unique Challenges and Problems
{section2}

### Section 3: Bright Cape's Approach
{section3}

### Section 4: Results and Impact
{section4}

Also write a good consulting title for the piece. Here are some examples of good titles:
- An AI power play: Fueling the next wave of innovation in the energy sector
- From farm to tablet: Building a new business to solve an old challenge
- Building a next-generation carbon platform to accelerate the path to net zero
- Banking on innovation: How ING uses generative AI to put people first
"""


We can now build the final prompt from this template and set up the final chain.

In [30]:
final_prompt = ChatPromptTemplate.from_template(final_template)

# LangChain Expression Language (LCEL) chain syntax
chain = final_prompt | model | output_parser

This time we do not need to pass an explicit instruction when invoking the chain, since it is already part of the prompt. Instead, we pass the sections from the list into the matching placeholders.
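
The mapping from list positions to placeholders can also be built programmatically and checked against the template before any model call is made. The sketch below uses a shortened stand-in template and placeholder texts; `template_fields` is a hypothetical helper built on the standard library.

```python
from string import Formatter


def template_fields(template: str) -> set:
    """Collect the {placeholder} names used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Shortened stand-in for the final template and placeholder section texts.
demo_template = "### Section 1\n{section1}\n\n### Section 2\n{section2}\n"
demo_sections = ["profile text", "challenges text"]

# Build the input dict programmatically: section1, section2, ...
section_inputs = {f"section{i}": text for i, text in enumerate(demo_sections, start=1)}

# Every placeholder needs a matching key before the chain is invoked.
missing = template_fields(demo_template) - set(section_inputs)
print(missing or "All placeholders are covered")
```

An empty `missing` set means every placeholder in the template will receive a value.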

In [31]:
# Generate the full case study from the four sections
full_case_study = chain.invoke({'section1': written_sections[0],
                                'section2': written_sections[1],
                                'section3': written_sections[2],
                                'section4': written_sections[3]})

full_case_study

"### Illuminating Efficiency: How Bright Cape Optimized Energy Management for Vereijken Kwekerijen\n\n### Section 1: Client Profile\nVereijken Kwekerijen operates in the greenhouse horticulture industry, specializing in the cultivation of vine tomatoes. With over 50 hectares of growing space across multiple locations in Noord-Brabant and the Westland area, they ensure year-round supply through 40 hectares of illuminated cultivation. Their vision is to create a progressive, result-oriented environment that emphasizes collaboration and personal development, ensuring high-quality products and efficient processes.\n\n### Section 2: Unique Challenges and Problems\nVereijken Kwekerijen, a modern vine tomato cultivation company, was facing significant challenges in optimizing the use of their combined heat and power (CHP) systems and lighting installations spread across multiple greenhouses. The company needed to balance the production of CO2, heat, and electricity from the CHPs with the fluc

Finally, we can save the case study as a text file.

In [16]:
# Save the string 'full_case_study' to a .txt file
with open('../results/full_case_study_4o_3.txt', 'w', encoding='utf-8') as f:
    f.write(full_case_study)

print("\nThe full case study was successfully saved")


The full case study was successfully saved
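
The save step above assumes that the `../results` folder already exists. A minimal sketch of a variant that creates missing folders first follows; `save_text` is a hypothetical helper.

```python
from pathlib import Path


def save_text(path, text):
    """Write text to path as UTF-8, creating parent folders when needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

It could be called as `save_text('../results/full_case_study_4o_3.txt', full_case_study)`.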
