# Adding metadata to survey results
This notebook provides sample [EDSL](https://docs.expectedparrot.com/) code for adding metadata to survey [results](https://docs.expectedparrot.com/en/latest/results.html). This is useful when you run a survey with [scenarios](https://docs.expectedparrot.com/en/latest/scenarios.html) of data as inputs to question texts (e.g., in [data labeling](https://docs.expectedparrot.com/en/latest/notebooks/data_labeling_example.html) tasks) and want to preserve information about the inputs, such as the source, date or comments. With this approach you do not have to pass that information to the language model, and you do not have to match the results back to the input data after the survey is run.

The solution is to include the metadata as extra fields in the scenarios that you create for the data inputs, without adding parameters for those fields to the question texts. When the scenarios are added to the survey and the survey is run, the results include a column for each metadata field, so the responses can be analyzed together with the metadata right away.

EDSL is an open-source library for simulating surveys and experiments with AI. Please see our [documentation page](https://docs.expectedparrot.com/) for tips and tutorials on getting started.

In [1]:
# pip install edsl

Importing tools:

In [1]:
from edsl import QuestionFreeText, QuestionYesNo, Survey, ScenarioList

Creating a survey of questions:

In [2]:
q_reference = QuestionFreeText(
    question_name="reference",
    question_text="What is this headline referring to: {{ headline }}",
)

q_frontpage = QuestionYesNo(
    question_name="frontpage",
    question_text="Is this story likely to be on the front page of the newspaper: {{ headline }}",
)

survey = Survey([q_reference, q_frontpage])

Here is some mock data. It includes one field that is an input to the question texts (`{{ headline }}`) and other fields (`date`, `author`, `section`) that we preserve as metadata so that we can access them in the survey results:

In [3]:
data = {
    "headline": [
        "Armistice Signed, War Over: Celebrations Erupt Across City",
        "Spanish Flu Pandemic: Hospitals Overwhelmed as Cases Surge",
        "Women Gain Right to Vote: Historic Amendment Passed",
        "Broadway Theaters Reopen After Flu Shutdown",
        "City Welcomes Returning Soldiers with Parade",
        "Prohibition Debate Heats Up: Public Opinion Divided",
        "New York Yankees Win First Pennant in Franchise History",
        "Subway Expansion Project Approved by City Council",
        "Harlem Renaissance: New Wave of Cultural Expression",
        "Mayor Announces New Housing Initiative for Veterans",
    ],
    "date": [
        "1918-11-11",
        "1918-10-15",
        "1918-06-05",
        "1918-12-01",
        "1918-11-12",
        "1918-07-20",
        "1918-09-30",
        "1918-08-18",
        "1918-04-25",
        "1918-11-20",
    ],
    "author": [
        "John Doe",
        "Jane Smith",
        "Robert Johnson",
        "Mary Lee",
        "James Brown",
        "Patricia Green",
        "William Davis",
        "Barbara Wilson",
        "Charles Miller",
        "Elizabeth Taylor",
    ],
    "section": [
        "Front Page",
        "Health",
        "Politics",
        "Entertainment",
        "Local",
        "Opinion",
        "Sports",
        "City News",
        "Culture",
        "Housing",
    ],
}
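
To make the split between inputs and metadata concrete, the sketch below checks which fields of the mock data are referenced by a `{{ ... }}` placeholder in the question texts. It uses only the Python standard library. The regular expression is an illustrative simplification and is not how EDSL itself parses question texts, and the names `question_texts`, `fields`, `inputs` and `metadata` are introduced here for illustration only.

```python
import re

# Question texts copied from the survey defined above.
question_texts = [
    "What is this headline referring to: {{ headline }}",
    "Is this story likely to be on the front page of the newspaper: {{ headline }}",
]

# Field names of the mock data.
fields = ["headline", "date", "author", "section"]

# Collect every {{ name }} placeholder that appears in a question text.
# Assumption: placeholders are simple names inside double braces.
placeholder = re.compile(r"\{\{\s*(\w+)\s*\}\}")
referenced = {name for text in question_texts for name in placeholder.findall(text)}

# Fields referenced by a question are inputs; the rest are metadata only.
inputs = [f for f in fields if f in referenced]
metadata = [f for f in fields if f not in referenced]

print("inputs:", inputs)
print("metadata:", metadata)
```

Only `headline` appears in a question text, so the remaining three fields are the ones this notebook treats as metadata.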

Creating scenarios for the data:

In [4]:
scenarios = ScenarioList.from_nested_dict(data)
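
As a rough mental model of this step, the plain-Python sketch below turns a dictionary of equal-length lists into one row per position. It assumes, as the usage in this notebook implies, that the i-th entries of each list belong to the same scenario. Plain dictionaries stand in for scenario objects, and `columns` and `rows` are hypothetical names, not part of EDSL.

```python
# A shortened, two-row copy of the mock data (same keys as above).
columns = {
    "headline": [
        "Armistice Signed, War Over: Celebrations Erupt Across City",
        "Spanish Flu Pandemic: Hospitals Overwhelmed as Cases Surge",
    ],
    "date": ["1918-11-11", "1918-10-15"],
    "author": ["John Doe", "Jane Smith"],
    "section": ["Front Page", "Health"],
}

# Every column must have the same length, or rows would be misaligned.
lengths = {len(values) for values in columns.values()}
assert len(lengths) == 1, "all fields need the same number of entries"

# Pair up the i-th entry of each column into one row dictionary.
rows = [dict(zip(columns, row)) for row in zip(*columns.values())]

for row in rows:
    print(row)
```

Each row carries the headline together with its date, author and section, which is why the metadata stays attached to the right headline without a separate matching step.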

Running the survey with the scenarios:

In [5]:
results = survey.by(scenarios).run()

Accessing the metadata together with the responses:

In [6]:
(results
 .select("headline", "date", "author", "section", "reference", "frontpage")
 .print(format="rich")
)