# Importing PDFs
This notebook provides sample [EDSL](https://docs.expectedparrot.com/) code demonstrating how to import text from PDFs to use as parameters of survey questions. This can be helpful when using EDSL to extract information from a large text efficiently. 

EDSL is an open-source library for simulating surveys and experiments with AI agents and large language models. Please see our [documentation page](https://docs.expectedparrot.com/) for tips and tutorials on getting started.

## How it works
EDSL comes with a [variety of question types](https://docs.expectedparrot.com/en/latest/questions.html) that we can select from based on the desired form of the response (multiple choice, free text, etc.). We can also parameterize questions with textual content in order to ask questions about that content. We do this by creating a `{{ placeholder }}` in a question text, e.g., *What are the key themes of this text: {{ text }}*, and then creating corresponding `Scenario` objects whose content is inserted in the placeholder when we run the survey. This allows us to efficiently administer multiple versions of a question with different inputs all at once. A common use case is performing [data labeling tasks](https://docs.expectedparrot.com/en/latest/notebooks/data_labeling_example.html), designed as questions about one or more pieces of textual data that are inserted into the survey question texts. [Learn more about using scenarios](https://docs.expectedparrot.com/en/latest/scenarios.html).
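
To make the placeholder idea concrete, here is a minimal plain-Python sketch of how one question text can be expanded into several prompts, one per scenario. It does not use EDSL and is not how EDSL implements templating; the `render` helper and the sample passages are hypothetical and exist only for illustration.

```python
import re


def render(template: str, scenario: dict) -> str:
    """Replace each {{ key }} placeholder in the template with scenario[key]."""

    def substitute(match: re.Match) -> str:
        # group(1) is the placeholder name without braces or padding
        return str(scenario[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# One question text with a placeholder (taken from the example above)
question_text = "What are the key themes of this text: {{ text }}"

# Hypothetical scenarios: each dict supplies a value for the placeholder
scenarios = [
    {"text": "First sample passage."},
    {"text": "Second sample passage."},
]

# One rendered prompt per scenario
prompts = [render(question_text, s) for s in scenarios]
for prompt in prompts:
    print(prompt)
```

In EDSL itself this expansion happens when the survey is run with scenarios attached, so we never need to build the prompts by hand.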

## Example
For demonstration purposes, we'll use a PDF copy of the recent paper [Automated Social Science: Language Models as Scientist and Subjects](https://arxiv.org/pdf/2404.11794) and conduct a survey consisting of several questions about its contents:

<img src="automated_social_science_paper.png" width="500px">

Importing the tools:

In [1]:
# pip install edsl

In [2]:
from edsl.questions import QuestionFreeText, QuestionList
from edsl import ScenarioList, Survey

Creating a survey of questions about a text:

In [3]:
# Free-text question: the response is an unstructured string
q_summary = QuestionFreeText(
    question_name="summary",
    question_text="Briefly summarize the abstract of this paper: {{ text }}"
)

# List questions: the response is returned as a list of items
q_authors = QuestionList(
    question_name="authors",
    question_text="List the names of all the authors of the following paper: {{ text }}"
)

q_thanks = QuestionList(
    question_name="thanks",
    question_text="List the names of the people thanked in the following paper: {{ text }}"
)

# Combine the questions into a single survey
survey = Survey([q_summary, q_authors, q_thanks])

Creating a `ScenarioList` for the PDF copy of the paper, the contents of which will be inserted in our questions:

In [4]:
automated_social_scientist = ScenarioList.from_pdf("automated_social_scientist.pdf")

Adding the scenario list to the survey and running it:

In [5]:
results = survey.by(automated_social_scientist).run()

We can see a list of all the components of results that are directly accessible:

In [6]:
results.columns

['agent.agent_instruction',
 'agent.agent_name',
 'answer.authors',
 'answer.summary',
 'answer.thanks',
 'comment.authors_comment',
 'comment.thanks_comment',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.authors_system_prompt',
 'prompt.authors_user_prompt',
 'prompt.summary_system_prompt',
 'prompt.summary_user_prompt',
 'prompt.thanks_system_prompt',
 'prompt.thanks_user_prompt',
 'question_options.authors_question_options',
 'question_options.summary_question_options',
 'question_options.thanks_question_options',
 'question_text.authors_question_text',
 'question_text.summary_question_text',
 'question_text.thanks_question_text',
 'question_type.authors_question_type',
 'question_type.summary_question_type',
 'question_type.thanks_question_type',
 'raw_model_response.authors_raw_model_response',
 'raw_model_response.summary_

We can select components of the results to inspect and print:

In [7]:
results.select("summary", "authors", "thanks").print(format="rich")
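
The column names listed above follow a `type.name` pattern (for example, `answer.summary`), which is why the short names `"summary"`, `"authors"`, and `"thanks"` identify the answer columns. As a rough illustration, the plain-Python sketch below groups a hand-copied subset of those column names by prefix. It makes no EDSL calls, and `columns_of_type` is a hypothetical helper, not part of the library.

```python
# A hand-copied subset of the column names shown in the output above
columns = [
    "agent.agent_name",
    "answer.authors",
    "answer.summary",
    "answer.thanks",
    "model.model",
    "prompt.summary_user_prompt",
]


def columns_of_type(columns: list, data_type: str) -> list:
    """Return the short names of all columns that start with '<data_type>.'."""
    prefix = data_type + "."
    return [c[len(prefix):] for c in columns if c.startswith(prefix)]


# Short names of the answer columns, in the order they appear in the list
answer_names = columns_of_type(columns, "answer")
print(answer_names)
```

The same approach works for the other prefixes in the list, such as `prompt` or `model`.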

Please see our [documentation page](https://docs.expectedparrot.com/) for examples of other survey methods and use cases!