### Objective

In this notebook, we test the quality of the conversation generated between the journalist bot and the author bot.

In [1]:
from chatbot import JournalistBot, AuthorBot
from embedding_engine import Embedder
from topic_classifier import TopicClassifier
import utilities
from pdf2image import convert_from_path
import PyPDF2
import os

In [2]:
issue = 'ABB Review_02_2023_layout complete_EN_72-300dpi.pdf'
articles = utilities.extract_articles(issue)
relevant_topics = ['Sustainability initiatives']
article = articles[-4]

In [3]:
focal_points = {
        
    'Tech and product insights': {
        'description': 'Spotlight on new tech and product',
        'target audience': 'R&D engineers',
        'prompt': f"""
        - Seek detailed information about the technology and product advancements. 
        - Explore the underlying tech innovations and their advantages.
        - Discuss product features, benefits, and differentiators.
        """
    },
    
    'Market dynamics': {
        'description': 'Explore the market implications',
        'target audience': 'marketing professionals',
        'prompt': f"""
        - Seek detailed information about market implications and relevance.
        - Explore challenges and opportunities in the current market scenario.
        - Discuss marketable insights and strategies. 
        """
    },
    
    'Operational transformation': {
        'description': 'Insights on optimized processes and operations',
        'target audience': 'operational experts & managers',
        'prompt': f"""
        - Seek detailed information about changes in operational processes and efficiency gains. 
        - Explore how the transformations impact daily operations and long-term strategies.
        """
    },
    
    'Sustainability initiatives': {
        'description': "ABB's contributions to environmental sustainability",
        'target audience': 'sustainability officers',
        'prompt': f"""
        - Seek detailed information about ABB's sustainability strategies and solutions.
        - Explore the measurable impacts of the sustainability initiatives.
        """
    },
    
    'Customer experience': {
        'description': "Dive into the end-user benefits and experiences",
        'target audience': 'customers',
        'prompt': f"""
        - Explore how ABB’s solutions and products enhance user interactions and satisfaction.
        """
    },
    
    'Industry challenges and opportunities': {
        'description': "Peering into hurdles and growth areas",
        'target audience': 'business developers',
        'prompt': f"""
        - Discuss industry-wide pain points.
        - Explore ABB’s strategies to tackle them.
        - Explore the potential for growth.
        """
    },
    
    'Strategic collaborations': {
        'description': "Highlighting strategic partnerships",
        'target audience': 'partnership managers',
        'prompt': f"""
        - Inquire about the nature and objectives of ABB's partnerships and strategic collaborations. 
        - Discuss the synergies achieved through such collaborations.
        """
    },
    
    'Strategy innovation': {
        'description': "Unpacking ABB's approaches to business strategies",
        'target audience': 'executives',
        'prompt': f"""
        - Seek detailed information about the innovative strategies adopted by ABB. 
        - Discuss the anticipated impact of these strategic innovations.
        """
    },
    
    'General overview': {
        'description': "A holistic breakdown of the article's key themes",
        'target audience': 'general public',
        'prompt': f"""
        - Summarize the main themes and highlights of the article.
        - Probe into any general insights or takeaways from the content.
        """
    }
}

In [4]:
# Create embeddings
embedding = Embedder()
documents = embedding.load_n_process_document(issue="./papers/"+issue,
                                             page={
                                                 'start': article['start_page'],
                                                 'length': article['length']
                                             }, chunk_size=1000, debug=True)
vectorstore = embedding.create_vectorstore(store_path="./vectorstore/"+article['title'])

# Create summary
article_summary = embedding.create_summary(summary_method='map_reduce')
print(f"Article summary: {article_summary}")

Embeddings found! Loaded the computed ones
Article summary: The article discusses a study by ABB that compares the carbon footprint of a Volkswagen Golf GTD (ICE) and a Volkswagen ID.4 (BEV) over their lifetime, taking into account emissions from fuel, exhaust, and BEV battery manufacture. The study estimates future scenarios based on increasing battery energy density and linearly calculated power grid CO₂ emissions in the EU, USA, and China during the BEV use phase. The primary contribution to emissions in both ICE and BEV vehicles occurs during the use phase, but BEVs are more efficient than ICE vehicles in the EU, Norway, China, and the UK. The reduction of emissions in battery electric vehicles (BEVs) is mainly due to the decline in power grid emissions, resulting from grid decarbonization. The article also discusses the impact of manufacturing country, commodity prices, and electricity carbon intensity on greenhouse gas emissions and costs in the context of electric vehicle produc

In [5]:
# Create two chatbots
journalist = JournalistBot('Azure')
author = AuthorBot('Azure', vectorstore)

# Specify instruction for journalist bot
prompt = [focal_points[topic]['prompt'] for topic in relevant_topics]
audience = [focal_points[topic]['target audience'] for topic in relevant_topics]
journalist.instruct(theme=relevant_topics, summary=article_summary,
                    focal_points=prompt, audience=audience)

# Specify instruction for author bot
author.instruct(theme=relevant_topics, audience=audience)

In [6]:
print(journalist._specify_system_message())

You are a journalist examining ABB's developments related to ['Sustainability initiatives'] for ['sustainability officers'].

        Your mission is to interview the article's author, represented by another chatbot, extracting key insights and addressing specific subjects. 
        The provided summary gives you an overview of the article's core details. 
        While the focal points guide your exploration, they shouldn't prompt you to stray far from the article's essence.

        Begin by gaining a broad understanding of the article through the focal points, and progressively focus on specific details. 
        Adjust your line of questioning based on the author bot's feedback, ensuring that your inquiries are both wide-ranging and detailed.

        Guidelines to keep in mind:
        - **Initiate and Lead**: It's crucial that you take the lead in this conversation. 
        Always initiate with a question about the article and guide the dialogue throughout.
        - **Article's

In [7]:
# Book-keeping
question_list = []
answer_list = []
source_list = []

# Start conversation
for i in range(5):
    if i == 0:
        question = journalist.step('Start the conversation')
    else:
        question = journalist.step(answer)
    question_list.append(question)
    print("👨‍🏫 Journalist: " + question)
    
    answer, source = author.step(question)
    answer_list.append(answer)
    source_list.append(source)
    print("👩‍🎓 Author: " + answer)
    print("\n\n")

👨‍🏫 Journalist: Hello, I would like to know more about the study conducted by ABB that compares the carbon footprint of a Volkswagen Golf GTD (ICE) and a Volkswagen ID.4 (BEV) over their lifetime. Can you please provide me with more details on this study?
👩‍🎓 Author: Certainly! According to the article, ABB conducted a study comparing the life-cycle emissions of carbon dioxide (CO₂) for the manufacture of a Volkswagen Golf GTD (ICE) and a Volkswagen ID.4 (BEV), assuming a vehicle lifetime of 240,000 km. The study also collected fuel "well-to-tank" and exhaust emission data for the ICE, as well as data related to emissions from BEV battery manufacture and the lifetime electricity consumption of a BEV.

These emissions were converted into equivalent grams of CO₂ per kilometer driven (gCO₂eq/ km), and both the average-efficiency and most-efficient ICE cars were taken into account. A steadily increasing BEV battery maintenance was also considered.

The analysis presented in the study only 

In [8]:
source_list[-1]

[Document(page_content='136\nASSETS IN MOTION\nABB REVIEW \n—\nCARBON EMISSIONS FROM EV BATTERY  \nPRODUCTION AND USE\nClean machine\n— \nDaniel Chartouni, \nABB Corporate Research\nBaden-Daettwil,  \nSwitzerland\ndaniel.chartouni@\nch.abb.com\nSrinidhi Sampath\nABB Corporate Research\nVästerås, Sweden\nsrinidhi.sampath@\nse.abb.com \nSilvio Colombi\nABB ELSP Smart Power\nQuartino, Switzerland\nsilvio.colombi@ \nch.abb.com \n01', metadata={'source': './papers/ABB Review_02_2023_layout complete_EN_72-300dpi.pdf', 'file_path': './papers/ABB Review_02_2023_layout complete_EN_72-300dpi.pdf', 'page': 53, 'total_pages': 72, 'format': 'PDF 1.6', 'title': '', 'author': '', 'subject': '', 'keywords': '', 'creator': 'Adobe InDesign 18.2 (Macintosh)', 'producer': 'Adobe PDF Library 17.0', 'creationDate': "D:20230424085254+02'00'", 'modDate': "D:20230424164349+02'00'", 'trapped': ''}),
 Document(page_content='maintenance. Moreover, ABB believes there will \nbe a plateau in ICE emissions as this te
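To check an answer against its evidence, it can help to condense the retrieved documents into page numbers and short previews. The following is a minimal sketch, assuming each source exposes `page_content` and a `metadata` dict with a `page` key, as in the output above. `SourceDoc` and `summarize_sources` are hypothetical names, and `SourceDoc` only stands in for the LangChain `Document` so that the snippet runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class SourceDoc:
    """Hypothetical stand-in for a retrieved document (not the LangChain class)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_sources(docs, preview_chars=60):
    """Return (page, preview) tuples for a list of retrieved documents."""
    summary = []
    for doc in docs:
        page = doc.metadata.get("page")  # None if the key is missing
        # Collapse newlines and repeated whitespace so the preview fits on one line.
        preview = " ".join(doc.page_content.split())[:preview_chars]
        summary.append((page, preview))
    return summary


docs = [
    SourceDoc("136\nASSETS IN MOTION\nABB REVIEW", {"page": 53}),
    SourceDoc("maintenance. Moreover, ABB believes", {}),
]
for page, preview in summarize_sources(docs):
    print(f"page {page}: {preview}")
```

In the notebook, the same helper could presumably be applied to the last turn with `summarize_sources(source_list[-1])`.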

#### Editor proof-reading

#### Define editor bot

In [None]:
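import os

from langchain.chains import ConversationChain
from langchain.chat_models import AzureChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
)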

class EditorBot():
    """Class definition for the editor bot, created with LangChain."""

    
    def __init__(self, engine):
        """Set up the editor bot.

        Note: `engine` is currently unused; the Azure deployment below is hard-coded.
        """
        
        # Instantiate llm
        self.llm = AzureChatOpenAI(openai_api_base="https://abb-chcrc.openai.azure.com/",
                            openai_api_version="2023-03-15-preview",
                            openai_api_key=os.environ["OPENAI_API_KEY_AZURE"],
                            openai_api_type="azure",
                            deployment_name="gpt-35-turbo-0301",
                            temperature=0.8)
        
        # Instantiate memory
        self.memory = ConversationBufferMemory(return_messages=True)


    def instruct(self, theme, summary, focal_points, audience):
        """Determine the context of editor chatbot. 
        """
        
        self.theme = theme
        self.summary = summary
        self.focal_points = focal_points
        self.audience = audience
        
        # Define prompt template
        prompt = ChatPromptTemplate.from_messages([
            SystemMessagePromptTemplate.from_template(self._specify_system_message()),
            MessagesPlaceholder(variable_name="history"),
            HumanMessagePromptTemplate.from_template("""{input}""")
        ])
        
        # Create conversation chain
        self.conversation = ConversationChain(memory=self.memory, prompt=prompt, 
                                              llm=self.llm, verbose=False)
        

    def step(self, prompt):
        """Editor chatbot proof-reading Q&A pairs. 
        """
        response = self.conversation.predict(input=prompt)
        
        return response
        


    def _specify_system_message(self):
        """Specify the behavior of the editor chatbot.
        """      

        # Focal points prompt
        focal_point_prompt = ('\n    '.join(self.focal_points)) 

        # Base prompt
        prompt = f"""You are an editor tasked with refining individual Q&A pairs. 
        These Q&A pairs come from an interview between a journalist and an author about ABB's developments related to {self.theme}. 
        The journalist's questions aimed to address the following focal points:
            {focal_point_prompt}.

        Your objectives in this chat-based format are:
        1. Realign the journalist's question, when necessary, to fit the article's content.
        2. Refine the author's answer to ensure it accurately mirrors the article's information.

        Your goal is to produce a refined Q&A interaction for each input, which should be concise, coherent, natural-sounding, and 
        informative for the audience {self.audience}. 

        For every Q&A pair provided, offer a refined version following the format:
        [Refined Question (without any prefix)]
        [Refined Answer (without any prefix)]

        You will be supplied with individual Q&A pairs, one at a time, for refinement, along with relevant article snippet, in the following format:

        Original Q&A:
        [Insert original Q&A here for reference]

        Article Reference:
        [Insert relevant snippets from the article here for guidance]
        """

        
        return prompt

prompt = f"""
You are a journalist examining ABB's developments related to {self.theme} for {self.audience}.

Your mission is to interview the article's author, represented by another chatbot, to extract key insights about {self.theme}. Use the provided summary as a starting point, but be prepared to adapt your line of questioning based on the author bot's feedback. Your questions should delve deeper into the theme and be based on the information the author bot provides.

Guidelines to keep in mind:
- **Article's Essence**: Use the article's summary as your anchor. Your questions should resonate with the theme of {self.theme}.
- **Adapt and Improvise**: Pay close attention to the author's answers and craft your follow-up questions based on the information provided. This will ensure that your interview stays within the bounds of the article's content.
- **Stay in Role**: Remember, your role as a journalist is to unearth valuable details for {self.audience}.
- **Question Quality**: Ask clear, concise questions that stem from the article's content.
- **Formatting**: Refrain from prefixing questions with labels like "Interviewer:" or "Question:".

[Summary]: {self.summary}
"""
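The editor bot's system message expects each input in an "Original Q&A" / "Article Reference" layout. The following is a minimal sketch of how one such input could be assembled. `build_editor_input` is a hypothetical helper that is not part of the notebook, and the snippets are assumed to be plain strings.

```python
def build_editor_input(question, answer, snippets):
    """Assemble one editor-bot input from a Q&A pair and article snippets."""
    qa_block = f"{question.strip()}\n{answer.strip()}"
    # Separate snippets with blank lines so that chunk boundaries stay visible;
    # empty snippets are skipped.
    reference_block = "\n\n".join(s.strip() for s in snippets if s.strip())
    return (
        "Original Q&A:\n"
        f"{qa_block}\n\n"
        "Article Reference:\n"
        f"{reference_block}"
    )


message = build_editor_input(
    question="What did the study compare?",
    answer="It compared the life-cycle CO2 emissions of an ICE car and a BEV.",
    snippets=["First snippet. ", "", "Second snippet."],
)
print(message)
```

With the objects from the conversation loop, the snippets could be taken as `[doc.page_content for doc in source]`, and the resulting string passed to `EditorBot.step`.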


#### Simulate with GPT-4

In [None]:
question = """Thank you for joining me today. Let's dive straight into the topic. Your article touches upon the carbon emissions of battery electric vehicles (BEVs) versus internal combustion engine (ICE) vehicles, emphasizing the role of battery production and use. ABB's study into the matter provides some compelling findings. To set the stage for our readers, particularly those who are sustainability officers, could you elaborate on ABB's key strategies and solutions related to sustainability in this context?"""

In [None]:
answer, source = author.step(question)
print(answer)

In [None]:
print(answer)

In [None]:
question = """That's insightful. With this detailed comparison in place, it's evident that geography plays a significant role in the carbon footprint of BEVs. Can you shed some light on the factors that caused BEVs to have a more favorable footprint in regions like the EU and the USA as compared to China, and why do we see China catching up by 2030?"""

In [None]:
print(question)

In [None]:
answer, source = author.step(question)
print(answer)

In [None]:
question = """It's interesting how the decarbonization of the power grid significantly impacts the efficiency of BEVs. Moving forward, what measurable impacts have ABB's sustainability initiatives in promoting BEVs, and their broader efforts in grid decarbonization, had so far? Are there specific milestones or data points that stand out?"""
print(question)

In [None]:
answer, source = author.step(question)
print(answer)

In [None]:
question = """Understood, thank you for clarifying. Delving deeper into the article's essence, can you provide more information about the emissions associated with BEV battery production? Specifically, how does the carbon footprint of BEV battery production compare with the emissions from fueling and operating an ICE vehicle?"""
print(question)

In [None]:
answer, source = author.step(question)
print(answer)

In [None]:
question = """Given that the primary emissions for both BEVs and ICE vehicles mainly arise from the usage stage, it's evident that the power grid's carbon intensity plays a pivotal role. As countries transition towards cleaner energy sources, how does ABB envision the future landscape for BEVs, especially in regions where grid decarbonization may not be progressing as rapidly? Are there potential solutions or recommendations proposed in your article to address these challenges?"""
print(question)

In [None]:
answer, source = author.step(question)
print(answer)

In [None]:
question = """Thank you for sharing those insights. Lastly, considering the significant emphasis on grid decarbonization for the efficiency of BEVs, how can sustainability officers leverage this information in their respective organizations? Are there specific actionable takeaways that sustainability officers can implement, or advocate for, based on ABB's findings?"""
print(question)

In [None]:
answer, source = author.step(question)
print(answer)

In [None]:
question = """Thank you for elucidating on the actionable steps. Your article and insights will certainly be invaluable for sustainability officers and other stakeholders aiming to make more informed decisions around BEVs and grid decarbonization. We appreciate your time and expertise on this matter."""
print(question)
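The cells above repeat the same two steps for every GPT-4 question pasted in by hand: send the question to the author bot, then print the answer. The following is a minimal sketch of the same replay written as a loop. `StubAuthor` and `replay_questions` are hypothetical names; the stub only mimics the `(answer, source)` return shape of `author.step` so that the snippet runs without the real bot.

```python
class StubAuthor:
    """Hypothetical stand-in for AuthorBot so the sketch runs on its own."""

    def step(self, question):
        # The real bot returns an answer plus the retrieved source documents.
        return f"(answer to: {question})", []


def replay_questions(author, questions):
    """Send each scripted question to the author bot and record the exchange."""
    transcript = []
    for question in questions:
        answer, source = author.step(question)
        transcript.append({"question": question, "answer": answer, "source": source})
    return transcript


scripted = ["First scripted question?", "Second scripted question?"]
transcript = replay_questions(StubAuthor(), scripted)
for turn in transcript:
    print("Journalist:", turn["question"])
    print("Author:", turn["answer"], "\n")
```

Keeping each turn as a dictionary retains the question, answer, and sources together, which is convenient if the pairs are later handed to the editor bot.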

#### Repository

prompt = f"""You are a journalist examining ABB's developments related to {self.theme} for {self.audience}.

You will interview the article's author (portrayed by another chatbot) to elucidate key points of the article. The provided summary gives you an overview of the article's core details. While the focal points guide your exploration, they shouldn't prompt you to stray far from the article's essence.

Begin by gaining a broad understanding of the article through the focal points, and progressively focus on specific details. Adjust your line of questioning based on the author bot's feedback, ensuring that your inquiries are both wide-ranging and detailed.

Guidelines:
- **Foundation**: Always anchor your questions in the article's summary. If a topic isn't mentioned in the summary, approach it cautiously.
- **Stay in Role**: As a journalist, your goal is to unveil key information for {self.audience}.
- **Address the Focal Points**: While they guide your questions, only delve deep if the summary or the author bot's responses support it:
    {focal_point_prompt}
- **Question Quality**: Ask clear, concise questions that stem from the article's content.
- **Formatting**: Do not add prefixes like "Interviewer:" or "Question:" to your questions.

[Summary]: {self.summary}
"""


In [None]:
You are a journalist diving into ABB's developments related to ['Sustainability initiatives'], aiming to provide insights beneficial for ['sustainability officers'].

        Your mission is to interview me, the article's author, extracting key insights and addressing specific subjects. 
        Utilize the provided summary for a foundational understanding of the article, but be prepared to venture beyond it based on the highlighted focal points.

        Start your interview by setting the context with a broad overview of the focal points. 
        As the dialogue unfolds, hone in on specific details and nuances. 
        Adapt your line of inquiry based on the author bot's responses, ensuring a blend of overarching views and intricate insights.

        Guidelines to keep in mind:
        - Ask one question at a time, and I will answer it. Once I am done, you can ask the next question.
        - **Article's Essence**: Let the article's summary be your anchor, but don't be restricted by it. Dive deeper to address the focal points.
        - **Stay in Role**: Your role as a journalist is to unearth valuable details for ['sustainability officers'].
        - **Address the Focal Points**: These are your guideposts. Ensure your questions resonate with these themes:
            
        - Seek detailed information about ABB's sustainability strategies and solutions.
        - Explore the measurable impacts of the sustainability initiatives.
        
        - **Question Quality**: Frame questions that are precise, concise, and intrinsically tied to the provided content.
        - **Formatting**: Refrain from prefixing questions with labels like "Interviewer:" or "Question:".

        [Summary]: The article discusses the carbon emissions of battery electric vehicles (BEVs) and internal combustion engine (ICE) vehicles, with a focus on the impact of battery production and use. ABB conducted a study comparing the carbon footprint of a BEV and an ICE vehicle, estimating the future impact of BEV battery production and use on emissions. The study found that BEVs already have lower emissions than the most efficient ICE vehicles in some regions, and will be better than the most efficient ICE vehicles in China by 2030. The reduction of emissions in BEVs is primarily due to the decline of power grid emissions brought about by grid decarbonization.

Now start the interview.

#### Refinement bot

You are tasked with refining a given Q&A pair from a journalist-author interaction to ensure that the conversation aligns closely with the content of a specific article. Your goal is to rephrase the journalist's question, if necessary, to make it more relevant to the article's details, and ensure that the author's response accurately reflects the content from the article. Create a refined, coherent, and informative Q&A pair that serves as a precise representation of the article's content.

Original Q&A:
[Insert original Q&A here]

Article Reference:
[Insert relevant sections or snippets from the article here]

*[Another version]*

You are tasked with refining a Q&A pair derived from a journalist-author interview about ABB's developments related to {self.theme}. The audience for this information is {self.audience}, and the intent is to ensure accurate representation of the article's content.

Your goal: Realign the journalist's question with the article's content, if necessary, and refine the author's response to reflect the article accurately. Produce a coherent and concise interaction that feels natural and is informative for the specified audience.

Original Q&A:
[Insert original Q&A here]

Article Reference:
[Insert relevant sections or snippets from the article here]
