# Getting Started with Prompt Engineering
by Karen (inspired by https://github.com/dair-ai/Prompt-Engineering-Guide)


This notebook contains examples and exercises for learning about prompt engineering.

An API key from PlusGPT was used.

In [1]:
import openai
import os
import IPython
from langchain.llms import OpenAI
from dotenv import load_dotenv
import requests

In [None]:
load_dotenv()

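# for LangChain: fail early with a clear message if a key is missing
# (assigning None to os.environ would raise a TypeError)
for key in ("PLUSGPT_API_KEY", "SERPAPI_API_KEY"):
    if os.getenv(key) is None:
        raise RuntimeError(f"{key} is not set; add it to your .env file")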

---

## 1. Prompt Engineering Basics

Objectives
- Load the libraries
- Review the format
- Cover basic prompts
- Review common use cases

The libraries and configuration are loaded above; the cells below define the helper functions used throughout this notebook.

Please install `dotenv` as follows: `pip install python-dotenv`

In [4]:
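def get_response(data):
    """POST a completion request to the PlusGPT API and print the reply."""
    # NOTE: `headers` is not defined in this notebook; it must be set beforehand.
    response = requests.post("https://api.plusgptcloud.com", headers=headers, json=data, timeout=70)
    response.raise_for_status()  # surface HTTP errors instead of failing on the JSON lookup
    print(response.json()['choices'][0]['message']['content'])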

In [5]:
def get_response_from_prompt():
    """Read prompts from standard input and print each reply until 'exit' is entered."""
    while True:
        user_input = input()
        if user_input == "exit":
            break
        data = {
            "user": f"""
system: principal data engineer
prompt: {user_input}
  """,
            "lang": "english"
        }
        get_response(data)
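
The payload construction can be separated from the input loop, which makes it easier to inspect what is sent before any request goes out. The following is a minimal sketch, assuming the same `user` and `lang` fields as the helper above; `build_payload` and its `system_role` parameter are hypothetical names introduced here, and the whitespace of the message is simplified.

```python
def build_payload(user_input, system_role="principal data engineer", lang="english"):
    """Build a request body shaped like the one used above (hypothetical helper)."""
    # The role and the user's text are packed into a single "user" field.
    user_text = f"system: {system_role}\nprompt: {user_input}"
    return {"user": user_text, "lang": lang}


payload = build_payload("Apache Kafka is")
print(payload["user"])
```

Keeping this step as a pure function means the prompt text can be checked without calling the API.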

In [7]:
# basic example
prompt="Apache Kafka is"

get_response_from_prompt()

Apache Kafka is


Apache Kafka is an open-source distributed event streaming platform that is gaining widespread adoption in modern architecture. It provides a fast, scalable, and fault-tolerant messaging system to efficiently process and integrate large volumes of data from various sources in real-time. Apache Kafka supports a range of use cases such as streaming data processing, event driven architecture, and microservices communication. It offers many benefits such as low latency, high throughput, and data integration. As a principal data engineer, one can leverage the power of Apache Kafka to build resilient and scalable data pipelines for real-time data processing and analytics.
link me to the official document describing Kafka


I'm sorry, but I cannot browse the internet or search for official documents as my capabilities are limited to generating human-like text based on the input given to me.
exit


### 1.1 Text Summarization

In [8]:
prompt=""

get_response_from_prompt()

"What is event streaming? Event streaming is the digital equivalent of the human body's central nervous system. It is the technological foundation for the 'always-on' world where businesses are increasingly software-defined and automated, and where the user of software is more software.  Technically speaking, event streaming is the practice of capturing data in real-time from event sources like databases, sensors, mobile devices, cloud services, and software applications in the form of streams of events; storing these event streams durably for later retrieval; manipulating, processing, and reacting to the event streams in real-time as well as retrospectively; and routing the event streams to different destination technologies as needed. Event streaming thus ensures a continuous flow and interpretation of data so that the right information is at the right place, at the right time.  What can I use event streaming for? Event streaming is applied to a wide variety of use cases across a ple

### 1.2 Question Answering

In [9]:

prompt="""Answer the question based on the context below. Keep the answer short and concise. Respond 'Unsure about answer' if not sure about the answer.

Context: Kafka topics are separated into partitions, each of which contains records in a fixed order. A unique offset is assigned and attributed to each record in a partition. Multiple partition logs can be found in a single topic. This allows several users to read from the same topic at the same time. Topics can be parallelized via partitions, which split data into a single topic among numerous brokers.

Replication in Kafka is done at the partition level. A replica is the redundant element of a topic partition. Each partition often contains one or more replicas, which means that partitions contain messages that are duplicated across many Kafka brokers in the cluster.

One server serves as the leader of each partition (replica), while the others function as followers. The leader replica is in charge of all read-write requests for the partition, while the followers replicate the leader. If the lead server goes down, one of the followers takes over as the leader. To disperse the burden, we should aim for a good balance of leaders, with each broker leading an equal number of partitions.

Question: What is a partition in Kafka?"""
get_response_from_prompt()


Answer the question based on the context below. Keep the answer short and concise. Respond 'Unsure about answer' if not sure about the answer.  Context: Kafka topics are separated into partitions, each of which contains records in a fixed order. A unique offset is assigned and attributed to each record in a partition. Multiple partition logs can be found in a single topic. This allows several users to read from the same topic at the same time. Topics can be parallelized via partitions, which split data into a single topic among numerous brokers.  Replication in Kafka is done at the partition level. A replica is the redundant element of a topic partition. Each partition often contains one or more replicas, which means that partitions contain messages that are duplicated across many Kafka brokers in the cluster.  One server serves as the leader of each partition (replica), while the others function as followers. The leader replica is in charge of all read-write requests for the partition
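
The question-answering prompt above has three parts: an instruction, a context, and a question. A minimal sketch of assembling them, assuming the same wording as the prompt in this section; `build_qa_prompt` is a hypothetical helper, not part of any library.

```python
def build_qa_prompt(context, question, fallback="Unsure about answer"):
    """Assemble an instruction + context + question prompt (hypothetical helper)."""
    instruction = (
        "Answer the question based on the context below. "
        "Keep the answer short and concise. "
        f"Respond '{fallback}' if not sure about the answer."
    )
    return f"{instruction}\n\nContext: {context.strip()}\n\nQuestion: {question.strip()}"


qa_prompt = build_qa_prompt(
    "Kafka topics are separated into partitions, each of which contains records in a fixed order.",
    "What is a partition in Kafka?",
)
print(qa_prompt)
```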

### 1.3 Text Classification

In [10]:

prompt="""Classify the text into neutral, negative or positive. explain why

Text: I think Apache Kafka is a great tech stack for data streaming

"""
get_response_from_prompt()


Classify the text into neutral, negative or positive. explain why  Text: I think Apache Kafka is a great tech stack for data streaming


Classification: Positive

Explanation: The text contains positive sentiment as the author thinks that Apache Kafka is a great technology stack for data streaming. There is no indication of negativity or criticism in the statement. Therefore, it can be classified as positive.
exit
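
When the reply follows the `Classification:` / `Explanation:` layout shown above, the label can be pulled out for downstream use. This is a sketch that assumes exactly that layout; the model is not guaranteed to produce it, so the hypothetical `parse_classification` helper returns `None` for missing parts.

```python
import re


def parse_classification(response):
    """Return (label, explanation) from a reply, or None for parts that are missing."""
    label = re.search(r"Classification:\s*(\w+)", response)
    explanation = re.search(r"Explanation:\s*(.+)", response, flags=re.DOTALL)
    return (
        label.group(1).lower() if label else None,
        explanation.group(1).strip() if explanation else None,
    )


sample = "Classification: Positive\n\nExplanation: The text contains positive sentiment."
print(parse_classification(sample))
```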


### 1.4 Role Playing

In [11]:

prompt="""role: Principal data engineer
  prompt: I am a junior data engineer eager to learn more about Apache Kafka. Tell me what are the core concepts of Kafka that I should know
  goal: link me to the resources to these concepts as well as the sample code, so that I can know more 

"""

get_response_from_prompt()

role: Principal data engineer   prompt: I am a junior data engineer eager to learn more about Apache Kafka. Tell me what are the core concepts of Kafka that I should know   goal: link me to the resources to these concepts as well as the sample code, so that I can know more 


As a junior data engineer, it's great to hear that you are eager to learn about Apache Kafka! Here are some core concepts that you should know:

1. Topic: A category or feed name to which records are published.
2. Producer: An application that sends messages to the Kafka cluster.
3. Consumer: An application that reads messages from the Kafka cluster.
4. Broker: A Kafka server that manages the storage and communication of messages.
5. Partition: A unit of parallelism in Kafka that allows for distributed processing of messages.
6. Offset: A unique identifier for each message within a partition.

To get started with learning more about Apache Kafka, I recommend first checking out the official Apache Kafka documentati

### 1.5 Code Generation

In [12]:
prompt="""
  role: Principal data engineer
  prompt: Provide me with sample code for an application using Apache Kafka
  
  """
get_response_from_prompt()

role: Principal data engineer   prompt: Provide me with sample code for an application using Apache Kafka


As a Principal data engineer, I can provide you with a sample code for an application using Apache Kafka.

First, we need to install the Kafka package using the following command:

```
pip install kafka-python
```

Once installed, we can create a producer application that sends data to a Kafka topic. Here's an example code:

```python
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers=['localhost:9092'])

for i in range(10):
    producer.send('test-topic', value='Hello from Kafka!'.encode())

producer.flush()
```

Here, we import the `KafkaProducer` class and create a new instance, passing the `bootstrap_servers` parameter with the IP address of the Kafka broker.

Next, we iterate through a range of ten messages and use `producer.send()` method to send each message to a Kafka topic named `'test-topic'`.

Finally, we call `producer.flush()` to make sure al
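
Replies to code-generation prompts mix prose with fenced code, as in the output above. A small sketch for pulling the fenced blocks out of a reply; `extract_code_blocks` is a hypothetical helper, and it assumes every fence is closed, so a truncated reply like the one above would lose its last block.

```python
import re

FENCE = "`" * 3  # the fence marker, built indirectly to keep this example readable


def extract_code_blocks(response):
    """Return a list of (language, code) pairs for each closed fenced block."""
    pattern = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, flags=re.DOTALL)
    return [(lang or None, code.strip()) for lang, code in pattern.findall(response)]


reply = (
    "Install the package:\n"
    + FENCE + "\npip install kafka-python\n" + FENCE
    + "\nThen run:\n"
    + FENCE + "python\nprint('hi')\n" + FENCE
)
for lang, code in extract_code_blocks(reply):
    print(lang, "->", code)
```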

### 1.6 Reasoning

In [13]:

prompt="""
  
  I want you to provide mentorship to students that are interested in learning Apache Kafka, but did not have prior experience.
   provide a 3-month study plan for someone that has time to study 10 hours/week. Break the study plan into weeks. First, list the concepts that needs to be understood, then, provide study materials with url links and hands on projects with sample repos.
   The end goal is to have basic understanding, and hands-on experience with Kafka.
  
  """

get_response_from_prompt()

I want you to provide mentorship to students that are interested in learning Apache Kafka, but did not have prior experience.    provide a 3-month study plan for someone that has time to study 10 hours/week. Break the study plan into weeks. First, list the concepts that needs to be understood, then, provide study materials with url links and hands on projects with sample repos.    The end goal is to have basic understanding, and hands-on experience with Kafka.


Week 1-2: Introduction to Kafka
- Understand the basic concepts of Kafka, such as topics, partitions, brokers and zookeepers
- Install Kafka and setup a local cluster
- Learn to use the command line interface to create topics and produce/consume messages
- Explore Kafka documentation and community resources

Study Materials:
- Apache Kafka Documentation: https://kafka.apache.org/documentation/
- Kafka Tutorial for Beginners: https://www.tutorialspoint.com/apache_kafka/apache_kafka_tutorial.pdf
- Kafka Quickstart Guide: https://

## 2. Advanced Prompting Techniques

Objectives:

- Cover more advanced prompting techniques: few-shot, chain-of-thought, ...

### 2.1 Zero-shot Prompts

In [14]:

prompt="""design a project to analyze a large dataset (ecommerce data) using Apache kafka. 
 Let's think step by step. provide related links and sample repo for each step
  
  """

get_response_from_prompt()

design a project to analyze a large dataset (ecommerce data) using Apache kafka.   Let's think step by step. provide related links and sample repo for each step


Step 1: Define the problem statement and data requirements.
The first step in analyzing an eCommerce dataset using Apache Kafka is to define the problem statement and identify the specific data that needs to be analyzed. This includes deciding on the data sources to be used, whether it’s customer behavior, payment information, or product information. In addition to identifying the data sources, you should also identify the data requirements such as data volume, velocity, and variety.

Step 2: Collect the data and store it in the data lake.
The next step is to collect the data through various data sources like data warehouses or API requests. Data is then stored in the data lake, which is used as the basis for analysis. In this step, we will transform the data to a format that can be consumed by Apache Kafka.

Step 3: Configur
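
The prompt above relies on the phrase "Let's think step by step" to get a stepwise answer without any examples. A minimal sketch of appending that trigger and splitting a reply into its numbered steps; both helpers are hypothetical, and the parser assumes the reply uses the `Step N:` layout seen in the output above.

```python
import re

COT_TRIGGER = "Let's think step by step."


def with_cot_trigger(task):
    """Append the zero-shot reasoning trigger to a task description."""
    return f"{task.strip()}\n{COT_TRIGGER}"


def split_steps(response):
    """Map step number -> step text for replies formatted as 'Step N: ...'."""
    parts = re.split(r"(?m)^Step\s+(\d+):\s*", response)
    # parts looks like [preamble, "1", text1, "2", text2, ...]
    return {int(parts[i]): parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}


example_reply = "Step 1: Define the problem.\nDetails here.\n\nStep 2: Collect the data."
print(with_cot_trigger("Design a project to analyze ecommerce data using Apache Kafka."))
print(split_steps(example_reply))
```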

### 2.2 Few-shot Prompts

In [15]:
prompt="""
  knowledge: Messages Retaining - comparison between Kafka and Traditional queuing system
  Traditional queuing systems — Most queueing systems remove the messages after it has been processed typically from the end of the queue.

Apache Kafka — Here, messages persist even after being processed. They don’t get removed as consumers receive them.
  
  knowledge: Logic-based processing
  Traditional queuing systems — It does not allow to process logic based on similar messages or events.

 Apache Kafka — It allows to process logic based on similar messages or events.
  prompt: Your team is responsible for building a high-throughput messaging system for a social media website. Choose between Kafka and traditional queuing system and tell me why
  
  """

get_response_from_prompt()

knowledge: Messages Retaining - comparison between Kafka and Traditional queuing system   Traditional queuing systems — Most queueing systems remove the messages after it has been processed typically from the end of the queue.  Apache Kafka — Here, messages persist even after being processed. They don’t get removed as consumers receive them.      knowledge: Logic-based processing   Traditional queuing systems — It does not allow to process logic based on similar messages or events.   Apache Kafka — It allows to process logic based on similar messages or events.   prompt: Your team is responsible for building a high-throughput messaging system for a social media website. Choose between Kafka and traditional queuing system and tell me why


As the principal data engineer, I would recommend using Apache Kafka for building a high-throughput messaging system for a social media website. One reason to choose Kafka is that it retains messages even after they have been processed, which can be u
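
The prompt in this section is built from several `knowledge:` entries followed by a final `prompt:` line. A sketch of generating that layout from a list, assuming the same two field names; `build_knowledge_prompt` is a hypothetical helper.

```python
def build_knowledge_prompt(knowledge_items, task):
    """Format (title, body) pairs as 'knowledge:' entries followed by the task."""
    lines = []
    for title, body in knowledge_items:
        lines.append(f"knowledge: {title}")
        lines.append(body.strip())
        lines.append("")  # blank line between entries
    lines.append(f"prompt: {task.strip()}")
    return "\n".join(lines)


items = [
    ("Messages Retaining", "Kafka keeps messages after they are processed."),
    ("Logic-based processing", "Kafka can process logic based on similar events."),
]
print(build_knowledge_prompt(items, "Choose between Kafka and a traditional queue."))
```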

### 2.3 Chain-of-Thought (CoT) Prompting

In [16]:
#https://datastorageasean.com/blogs/5-use-cases-stream-processing-demonstrate-its-business-value-0
prompt="""
  knowledge: Q: Kafka is popular in e-commerce industry, explain why
  A: In this era of e-commerce, winning over user attention is half the battle won, and the key is providing a relevant and personalised user experience. And with these same customers interacting digitally, organisations have more than enough data for personalisation. But it needs to happen in real-time—ideally during the interaction itself. Stream processing makes this possible by aggregating all related and relevant data and then creating a complete profile of the said customer.
  Q: Kafka is popular for fraud detection, explain why
  A: Industries collectively lose trillions of dollars in revenue annually due to fraud, and it is only getting worse. The good news is that stream processing can minimise cases of fraud by processing and analysing real-time streams of financial records, recognising patterns, uncovering suspicious transactions and creating predictive alerts for possible fraud.
  Q: Kafka is popular for stock market monitoring, explain why
  A: Players in the stock market are like all other customers: Discerning and demanding. This means they want, among other things, real-time reporting and faster SLA requirements, and expect transactional queries to be addressed immediately. There is, therefore, a need to analyse petabytes of data in real-time if these expectations are to be met. Stream processing can do this to deliver real-time data analytics and meet customer demands quickly.
  prompt: Explain why Kafka has great business value for real time data processing
  
  """

get_response_from_prompt()

Q: Kafka is popular in e-commerce industry, explain why   A: In this era of e-commerce, winning over user attention is half the battle won, and the key is providing a relevant and personalised user experience. And with these same customers interacting digitally, organisations have more than enough data for personalisation. But it needs to happen in real-time—ideally during the interaction itself. Stream processing makes this possible by aggregating all related and relevant data and then creating a complete profile of the said customer.   Q: Kafka is popular for fraud detection, explain why   A: Industries collectively lose trillions of dollars in revenue annually due to fraud, and it is only getting worse. The good news is that stream processing can minimise cases of fraud by processing and analysing real-time streams of financial records, recognising patterns, uncovering suspicious transactions and creating predictive alerts for possible fraud.   Q; Kafka is popular for stock market m

### 2.5 Self-Consistency
As an exercise, check examples in our [guide](https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompts-advanced-usage.md#self-consistency) and try them here. 
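
Self-consistency samples several reasoning paths for the same prompt and keeps the most frequent final answer. The sketch below shows only the voting step; the sampled answers are hard-coded stand-ins rather than real model output, and `majority_answer` is a hypothetical helper.

```python
from collections import Counter


def majority_answer(answers):
    """Return (most common answer, share of votes) after light normalization."""
    normalized = [a.strip().lower() for a in answers if a and a.strip()]
    if not normalized:
        return None, 0.0
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


# Stand-in for final answers extracted from four sampled completions.
samples = ["67", "67", "35", " 67 "]
print(majority_answer(samples))
```

In practice the samples would come from repeated calls with a non-zero temperature, so that the reasoning paths differ.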

### 2.6 Generated Knowledge Prompting

As an exercise, check examples in our [guide](https://github.com/dair-ai/Prompt-Engineering-Guide/blob/main/guides/prompts-advanced-usage.md#generated-knowledge-prompting) and try them here. 

In [17]:
# https://github.com/confluentinc/librdkafka/wiki/FAQ
prompt="""
  system: principal data engineer tutoring junior data engineers
  knowledge: \"\"\"
  Q: Why am I seeing Receive failed: Disconnected?
  A: If the remote peer, typically the broker (but could also be an active TCP gateway of some kind), closes the connection you'll see a log message like this:

%3|1500588440.537|FAIL|rdkafka#producer-1| 10.255.84.150:9092/1: Receive failed: Disconnected

There are a number of possible reasons, in order of how common they are:

Broker's idle connection reaper closes the connection due to inactivity. This is controlled by the broker configuration property connections.max.idle.ms and defaults to 10 minutes. This is by far the most common reason for spontaneous disconnects.
The client sent an unsupported protocol request; see Broker version compatibility. This is considered a configuration error on the client. The broker should log an exception explaining why the connection was closed, see the broker logs.
The client sent a malformed protocol request; this is an indication of a bug in the client. The broker should log an exception explaining why the connection was closed, see the broker logs.
The broker is in an invalid state. The broker should log an exception explaining why the connection was closed, see the broker logs.
TCP gateway/load-balancer/firewall session timeout. Try enabling TCP keep-alives on the client by setting socket.keepalive.enable=true.
Since a TCP close can't signal why the remote peer closed the connection there is no way for the client to know what went wrong. If the disconnect logs are getting annoying and the admin deems they are caused by the idle connection reaper, the log.connection.close client configuration property can be set to false to silence all spontaneous disconnect logs.
  Q: Why am I not seeing any messages?
  A: If there are no stored offsets for a partition (and group in case of the KafkaConsumer) the consumer will default its starting offset to the topic configuration setting auto.offset.reset which defaults to latest - that is, it will start consuming at the current end of a partition.

If you are using the KafkaConsumer you probably do not have a per-topic configuration object but should use the default topic config, see default_topic_conf.
  \"\"\"
  prompt: I am getting Receive failed: Disconnected error, I doubt if I sent an unsupported protocol request. tell me how to solve the issue
  """

get_response_from_prompt()

system: principal data engineer tutoring junior data engineers   knowledge: \"\"\"   Q: Why am I seeing Receive failed: Disconnected?   A: If the remote peer, typically the broker (but could also be an active TCP gateway of some kind), closes the connection you'll see a log message like this:  %3|1500588440.537|FAIL|rdkafka#producer-1| 10.255.84.150:9092/1: Receive failed: Disconnected  There are a number of possible reasons, in order of how common they are:  Broker's idle connection reaper closes the connection due to inactivity. This is controlled by the broker configuration property connections.max.idle.ms and defaults to 10 minutes. This is by far the most common reason for spontaneous disconnects. The client sent an unsupported protocol request; see Broker version compatibility. This is considered a configuration error on the client. The broker should log an exception explaining why the connection was closed, see the broker logs. The client sent a malformed protocol request; this 

### 2.7 PAL - Code as Reasoning

We are developing a simple application that's able to reason about the question being asked through code. 

Specifically, the application takes in some data and answers a question about the data input. The prompt includes a few exemplars which are adapted from [here](https://github.com/reasoning-machines/pal/blob/main/pal/prompt/penguin_prompt.py).
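
The core idea of PAL is that the model writes code and the program, not the model, computes the answer. The sketch below shows only the execution step. The code string is a hand-written stand-in for model output, the penguin ages are illustrative, and `run_generated_code` is a hypothetical helper.

```python
# Hypothetical stand-in for code returned by the model; no API call is made here.
GENERATED_CODE = """
penguins = [("Louis", 7), ("Bernard", 5), ("Vincent", 9), ("Gwen", 8)]
oldest = max(penguins, key=lambda p: p[1])
answer = oldest[0]
"""

# Only these builtins are exposed to the generated code. This limits accidents
# but is NOT a security sandbox; do not run untrusted code this way.
ALLOWED_BUILTINS = {"max": max, "min": min, "len": len, "sum": sum, "sorted": sorted}


def run_generated_code(code, result_name="answer"):
    """Execute model-written code and return the variable holding the answer."""
    namespace = {"__builtins__": ALLOWED_BUILTINS}
    exec(code, namespace)
    return namespace.get(result_name)


print(run_generated_code(GENERATED_CODE))
```

Asking the model to store its result in a fixed variable name, here `answer`, is a convention assumed by this sketch so the caller knows where to look.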

In [18]:
data = {
  "user": """
  knowledge: Q: write a basic producer utilising the provided Kafka class KafkaProducer, which will publish to testTopic
package main;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class KafkaProducerExample {

    private static final String TOPIC = "test-topic";

    public static void main(String[] args) {
        Properties settings = setUpProperties();
        KafkaProducer<String, String> producer =  new KafkaProducer<>(settings);

        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("...Stopping Basic Producer...");
            producer.close();
        }));

        publishData(producer);
    }

    private static Properties setUpProperties() {
        Properties settings = new Properties();
        settings.put(ProducerConfig.CLIENT_ID_CONFIG, "basic-producer");
        settings.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        settings.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        settings.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        return settings;
    }

    private static void publishData(KafkaProducer producer) {
        for (int index = 0; index < 5; index++) {
            final String key = "key-" + index;
            final String value = "value-" + index;
            final ProducerRecord<String, String> record = new ProducerRecord<>(TOPIC, key, value);
            producer.send(record);
        }
    }
}
  
Q: Set up Consumer in Java for Kafka test topic
package main;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.util.Properties;

import static java.time.Duration.ofMillis;
import static java.util.Collections.singletonList;
import static org.apache.kafka.clients.consumer.ConsumerConfig.*;

public class KafkaConsumerExample {
    private static final String TOPIC = "test-topic";

    public static void main(String[] args) {
        Properties settings = setUpProperties();
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(settings);

        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("...Stopping Basic Consumer...");
            consumer.close();
        }));

        consumeData(consumer);
    }

    private static Properties setUpProperties() {
        Properties settings = new Properties();
        settings.put(GROUP_ID_CONFIG, "basic-consumer");
        settings.put(BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        settings.put(ENABLE_AUTO_COMMIT_CONFIG, "true");
        settings.put(AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
        settings.put(AUTO_OFFSET_RESET_CONFIG, "earliest");
        settings.put(KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        settings.put(VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return settings;
    }

    private static void consumeData(KafkaConsumer consumer) {
        consumer.subscribe(singletonList(TOPIC));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("message offset = %d, key = %s, value = %s\n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}

  prompt: write me a mini application including producer and consumer for Kafka in Java
  
  """,
  "lang": "english"
}

get_response_from_prompt()

 knowledge: Q: write a basic producer utilising the provided Kafka class KafkaProducer, which will publish to testTopic package main;  import org.apache.kafka.clients.producer.KafkaProducer; import org.apache.kafka.clients.producer.ProducerConfig; import org.apache.kafka.clients.producer.ProducerRecord;  import java.util.Properties;  public class KafkaProducerExample {      private static final String TOPIC = "test-topic";      public static void main(String[] args) {         Properties settings = setUpProperties();         KafkaProducer<String, String> producer =  new KafkaProducer<>(settings);          Runtime.getRuntime().addShutdownHook(new Thread(() -> {             System.out.println("...Stopping Basic Producer...");             producer.close();         }));          publishData(producer);     }      private static Properties setUpProperties() {         Properties settings = new Properties();         settings.put(ProducerConfig.CLIENT_ID_CONFIG, "basic-producer");         settin

Now that we have the prompt and the question, we can send them to the model. It should output, as code, the steps needed to reach the answer.

Exercise: Try a different question and see what the result is.

---

## 3. Tools and Applications

Objective:

- Demonstrate how to use LangChain to build simple applications using prompting techniques and LLMs

### 3.1 LLMs & External Tools

Example adapted from the [LangChain documentation](https://langchain.readthedocs.io/en/latest/modules/agents/getting_started.html).

In [19]:
from langchain.llms import OpenAI
from langchain.agents import load_tools
from langchain.agents import initialize_agent
import openai
import os

In [23]:
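# LangChain's OpenAI wrapper reads OPENAI_API_KEY from the environment,
# so set it before creating the LLM (assumes openai.api_key is already set).
os.environ["OPENAI_API_KEY"] = openai.api_key
llm = OpenAI(temperature=0)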
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

In [25]:
# run the agent
agent.run("what employers are looking for in Apache Kafka")



> Entering new AgentExecutor chain...
 I need to find out what employers are looking for in Apache Kafka
Action: Search
Action Input: "what employers are looking for in Apache Kafka"
Observation: Browse 1-20 of 2580 available Apache Kafka jobs on Dice.com. Apply to Java Developer, Data Engineer, Full Stack Developer and more.
Thought: I need to find out more specific details
Action: Search
Action Input: "what skills employers are looking for in Apache Kafka"
Observation: Top Tech Skills of 2020 Include Swift, Kafka · Hadoop Linux Cloudera Range Atlas Engineer · Hadoop Linux Engineer · DevSecOps Cloud Engineer · Sr.
Thought: I now know the final answer
Final Answer: Employers are looking for skills such as Swift, Kafka, Hadoop Linux Cloudera Range Atlas Engineer, Hadoop Linux Engineer, DevSecOps Cloud Engineer, and Sr.

> Finished chain.


'Employers are looking for skills such as Swift, Kafka, Hadoop Linux Cloudera Range Atlas Engineer, Hadoop Linux Engineer, DevSecOps Cloud Engineer, and Sr.'
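
The trace above alternates between a thought, an `Action`, an `Action Input`, and an `Observation`. A sketch for listing which tools the agent called and with what input, assuming a plain-text trace in that layout with any terminal color codes already removed; `parse_actions` is a hypothetical helper, not a LangChain function.

```python
def parse_actions(trace):
    """Return (tool, tool_input) pairs found in a plain-text agent trace."""
    actions = []
    tool = None
    for line in trace.splitlines():
        line = line.strip()
        if line.startswith("Action Input:"):
            if tool is not None:
                tool_input = line[len("Action Input:"):].strip().strip('"')
                actions.append((tool, tool_input))
                tool = None
        elif line.startswith("Action:"):
            tool = line[len("Action:"):].strip()
    return actions


trace = (
    "I need to find out what employers are looking for in Apache Kafka\n"
    "Action: Search\n"
    'Action Input: "what employers are looking for in Apache Kafka"\n'
    "Observation: Browse 1-20 of 2580 available Apache Kafka jobs on Dice.com.\n"
    "Thought: I need to find out more specific details\n"
    "Action: Search\n"
    'Action Input: "what skills employers are looking for in Apache Kafka"\n'
)
for tool, tool_input in parse_actions(trace):
    print(tool, "->", tool_input)
```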

### 3.2 Data-Augmented Generation

First, we need to download the data we want to use as the source to augment generation.

Code example adapted from the [LangChain documentation](https://langchain.readthedocs.io/en/latest/modules/chains/combine_docs_examples/qa_with_sources.html). We are only using the examples for educational purposes.

Prepare the data first:

In [None]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch
from langchain.vectorstores import Chroma
from langchain.docstore.document import Document
from langchain.prompts import PromptTemplate

In [None]:
with open('./state_of_the_union.txt') as f:
    state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)

embeddings = OpenAIEmbeddings()
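
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of splitting text into fixed-size windows. It is not LangChain's implementation: the real splitter also takes a separator into account, which this hypothetical `chunk_text` function ignores.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=0):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=0))
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
```

A non-zero overlap repeats the tail of each chunk at the start of the next, which can help when a relevant sentence straddles a boundary.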

In [None]:
docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{"source": str(i)} for i in range(len(texts))])

In [None]:
query = "What did the president say about Justice Breyer"
docs = docsearch.similarity_search(query)
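
Conceptually, a similarity search embeds the query and returns the stored texts whose vectors are closest to it. The sketch below illustrates this with cosine similarity over toy 2-dimensional vectors. The vectors are made up, and the metric and index that Chroma actually uses are not shown here, so treat the choice of cosine similarity as an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


toy_doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up embeddings
print(top_k([1.0, 0.1], toy_doc_vecs, k=2))
```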

Let's quickly test it:

In [None]:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

In [None]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

Let's try a question with a custom prompt:

In [None]:
template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
Respond in Spanish.

QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER IN SPANISH:"""

# create a prompt template
PROMPT = PromptTemplate(template=template, input_variables=["summaries", "question"])

# query 
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff", prompt=PROMPT)
query = "What did the president say about Justice Breyer?"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
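
With `chain_type="stuff"`, the retrieved documents are placed into the prompt's `{summaries}` slot before the model is called. A rough sketch of that substitution using plain string formatting; the `Content:` / `Source:` layout and the `stuff_prompt` helper are assumptions for illustration, not LangChain's exact formatting.

```python
def stuff_prompt(template, docs, question):
    """Fill a template's {summaries} and {question} slots from (text, source) pairs."""
    summaries = "\n".join(f"Content: {text}\nSource: {source}" for text, source in docs)
    return template.format(summaries=summaries, question=question)


demo_template = "QUESTION: {question}\n=========\n{summaries}\n=========\nFINAL ANSWER:"
demo_docs = [
    ("Kafka stores records in partitions.", "0"),
    ("Each record has an offset.", "1"),
]
print(stuff_prompt(demo_template, demo_docs, "What is an offset?"))
```

Because every retrieved chunk is inserted at once, this approach only works while the combined text fits within the model's context window.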

Exercise: Try a different dataset from the internet and different prompts, applying the techniques you learned in the lecture.