# Extracting text
EDSL comes with a variety of question types that can be selected based on the form of response that you want.
This notebook demonstrates how to use the `QuestionExtract` question type to return information extracted (or extrapolated) from a given text in the form of a Python dictionary. The required parameters are <i>question_name</i>, <i>question_text</i> and <i>answer_template</i>. The <i>answer_template</i> is a dictionary of example responses that the agent is prompted to use for reference (we inspect the prompts at the end of this notebook).

Please see the [Questions page](https://docs.expectedparrot.com/en/latest/questions.html) of the docs for details on other question types.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/expectedparrot/edsl/blob/main/docs/notebooks/question_extract_example.ipynb)

## Question template
We start by importing the question type, and then use the `.example()` method to inspect the format of an example object:

In [1]:
# ! pip install edsl

In [2]:
from edsl.questions import QuestionExtract

In [3]:
QuestionExtract.example()

We can then run the example question and check that the agent's response mirrors the <i>answer_template</i> that it was given:

In [4]:
results = QuestionExtract.example().run()
results.select("extract_name").print(format="rich")

## Creating a question
Here we create a new question of this type that prompts the agent to review a (longer) text and return information about it. Note that we use a <b>{{placeholder}}</b> in the question text so that we can parameterize it with different texts. This is useful for data labeling tasks, where we want to ask the same questions about many different pieces of data at once. We do this by creating `Scenario` objects for the inputs to the question.

Note also that our instructions to the agent are quite short; we could substitute a more detailed question text with context about the actual task that we want performed.

Learn more about using [Scenario](https://docs.expectedparrot.com/en/latest/scenarios.html) objects in the docs.
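
To make the placeholder idea concrete, here is a minimal sketch in plain Python of how one question text can be filled in with several different inputs. It does not use EDSL and is not EDSL's own rendering code; the helper `fill_placeholders` and the two sample texts are hypothetical, introduced only for illustration.

```python
import re

def fill_placeholders(template: str, values: dict) -> str:
    """Replace each {{ key }} in `template` with values[key].

    Placeholders whose key is missing from `values` are left unchanged.
    """
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

# Same question text as in the cell below; the input texts are hypothetical.
question_text = "Review the following text: {{ content }}"
texts = ["First text to label.", "Second text to label."]

# One rendered question per input text
rendered = [fill_placeholders(question_text, {"content": t}) for t in texts]
for prompt in rendered:
    print(prompt)
```

Each input plays the role of one scenario: the question stays fixed and only the value bound to `content` changes.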

In [5]:
simpsons = """
"The Simpsons" is an iconic American animated sitcom created by Matt Groening that debuted in 1989 on the Fox network. 
The show is set in the fictional town of Springfield and centers on the Simpsons family, consisting of the bumbling but well-intentioned father Homer, the caring and patient mother Marge, and their three children: mischievous Bart, intelligent Lisa, and baby Maggie. 
Renowned for its satirical take on the typical American family and society, the series delves into themes of politics, religion, and pop culture with a distinct blend of humor and wit. 
Its longevity, marked by over thirty seasons, makes it one of the longest-running television series in history, influencing many other sitcoms and becoming deeply ingrained in popular culture.
"""

In [6]:
from edsl.questions import QuestionExtract
from edsl import Scenario

q = QuestionExtract(
    question_name="example",
    question_text="Review the following text: {{ content }}",
    answer_template={
        "main_characters_list": ["name", "name"],
        "location": "location",
        "genre": "genre",
    },
)

scenario = Scenario({"content": simpsons})
results = q.by(scenario).run()

In [7]:
results.select("example").print(format="rich")
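
If we want to check programmatically that a response mirrors the <i>answer_template</i>, one option is to compare keys and value types. The sketch below is plain Python and does not rely on any EDSL API; the function `mirrors_template` is a hypothetical helper, and the `answer` dictionary is written by hand for illustration rather than taken from a model response.

```python
def mirrors_template(template: dict, answer: dict) -> bool:
    """Return True if `answer` has exactly the template's keys
    and each value has the same type as the template's example value."""
    if set(template) != set(answer):
        return False
    return all(isinstance(answer[key], type(template[key])) for key in template)

# The answer_template used in the question above
template = {
    "main_characters_list": ["name", "name"],
    "location": "location",
    "genre": "genre",
}

# Hand-written example answer (an assumption, not actual model output)
answer = {
    "main_characters_list": ["Homer", "Marge", "Bart", "Lisa", "Maggie"],
    "location": "Springfield",
    "genre": "animated sitcom",
}

print(mirrors_template(template, answer))
```

This check is deliberately shallow: it looks only at top-level keys and types, so a stricter validation would also need to inspect the contents of lists.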

## Show prompts
We can inspect the prompts that were used to generate the response:

In [8]:
results.select("prompt.*").print(format="rich")