# Extracting Action Items from Meeting Transcripts

In this notebook, we'll walk through how to extract action items from meeting transcripts using OpenAI's API, Instructor, and Pydantic. Automated extraction supports project management tasks such as task assignment and priority setting.

## Motivation

A significant amount of time is spent in meetings, and action items are the concrete outcomes of those discussions. Automating their extraction saves time and reduces the risk that critical tasks are overlooked.

## Defining the Structures

We'll model a meeting transcript as a collection of Ticket objects, each representing an action item. Every Ticket can have multiple Subtask objects, representing smaller, manageable pieces of the main task.

In [31]:
import instructor
from openai import OpenAI
from typing import Iterable, List, Optional
from enum import Enum
from pydantic import BaseModel
import graphviz
import json

In [32]:
from rich import pretty, print
pretty.install()

In [33]:
from dotenv import load_dotenv
load_dotenv("../api_keys.env")

True

In [34]:
class PriorityEnum(str, Enum):
    high = "High"
    medium = "Medium"
    low = "Low"


class Subtask(BaseModel):
    """Correctly resolved subtask from the given transcript"""

    id: int
    name: str


class Ticket(BaseModel):
    """Correctly resolved ticket from the given transcript"""

    id: int
    name: str
    description: str
    priority: PriorityEnum
    assignees: List[str]
    subtasks: Optional[List[Subtask]]
    dependencies: Optional[List[int]]
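
To make the schema concrete, here is a minimal sketch that validates a hand-written payload against these models. It assumes Pydantic v2 (for `model_validate`), and the payload values are invented for illustration rather than taken from model output. The class definitions are repeated so the block runs on its own.

```python
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class PriorityEnum(str, Enum):
    high = "High"
    medium = "Medium"
    low = "Low"


class Subtask(BaseModel):
    id: int
    name: str


class Ticket(BaseModel):
    id: int
    name: str
    description: str
    priority: PriorityEnum
    assignees: List[str]
    subtasks: Optional[List[Subtask]]
    dependencies: Optional[List[int]]


# Hypothetical payload, shaped like what the model is expected to return.
payload = {
    "id": 1,
    "name": "Improve authentication system",
    "description": "Front-end revamp and back-end optimization",
    "priority": "High",
    "assignees": ["Bob", "Carol"],
    "subtasks": [{"id": 1, "name": "Front-end revamp"}],
    "dependencies": None,
}

ticket = Ticket.model_validate(payload)

# An unknown priority string is rejected rather than silently accepted.
try:
    Ticket.model_validate({**payload, "priority": "Urgent"})
    rejected = False
except ValidationError:
    rejected = True
```

Note that `subtasks` and `dependencies` accept `None` but have no default, so under Pydantic v2 both keys must still be present in the payload.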

## Extracting Action Items

To extract action items from a meeting transcript, we use the `generate_action_items` function. It sends the transcript to OpenAI's API and returns the extracted action items as an iterable of `Ticket` objects.

In [35]:
# Patch the OpenAI client with instructor,
# which enables the response_model keyword
client = instructor.from_openai(OpenAI())


def generate_action_items(data: str) -> Iterable[Ticket]:
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Iterable[Ticket],
        messages=[
            {
                "role": "system",
                "content": "You are given the transcript of a meeting. Your task is to create detailed action items"
            },
            {
                "role": "user",
                "content": f"Create Ticket graph for the following transcript:\n{data}"
            }
        ]
    )

## Evaluation and Testing

To test the `generate_action_items` function, we pass it a sample transcript and print the JSON representation of the extracted action items.

In [36]:
transcript = """Alice: Hey team, we have several critical tasks we need to tackle for the upcoming release. First, we need to work on improving the authentication system. It's a top priority.

Bob: Got it, Alice. I can take the lead on the authentication improvements. Are there any specific areas you want me to focus on?

Alice: Good question, Bob. We need both a front-end revamp and back-end optimization. So basically, two sub-tasks.

Carol: I can help with the front-end part of the authentication system.

Bob: Great, Carol. I'll handle the back-end optimization then.

Alice: Perfect. Now, after the authentication system is improved, we have to integrate it with our new billing system. That's a medium priority task.

Carol: Is the new billing system already in place?

Alice: No, it's actually another task. So it's a dependency for the integration task. Bob, can you also handle the billing system?

Bob: Sure, but I'll need to complete the back-end optimization of the authentication system first, so it's dependent on that.

Alice: Understood. Lastly, we also need to update our user documentation to reflect all these changes. It's a low-priority task but still important.

Carol: I can take that on once the front-end changes for the authentication system are done. So, it would be dependent on that.

Alice: Sounds like a plan. Let's get these tasks modeled out and get started."""

In [37]:
prediction = generate_action_items(transcript)

In [38]:
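# model_dump(mode="json") is the Pydantic v2 replacement for the deprecated .dict()
print(json.dumps([ticket.model_dump(mode="json") for ticket in prediction], indent=2))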
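
Because each ticket lists the ids it depends on, one quick sanity check is to confirm that every dependency points at a known ticket and that the graph contains no cycles. The sketch below uses the standard-library `graphlib` module (Python 3.9+) on a hand-written mapping that mirrors the transcript. The ids and the `work_order` helper are hypothetical and are not actual model output.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def work_order(dependencies: Dict[int, List[int]]) -> List[int]:
    """Return ticket ids in an order that respects dependencies.

    `dependencies` maps a ticket id to the ids it depends on.
    Raises ValueError for unknown ids or dependency cycles.
    """
    known = set(dependencies)
    for ticket_id, deps in dependencies.items():
        missing = set(deps) - known
        if missing:
            raise ValueError(
                f"Ticket {ticket_id} depends on unknown ids: {sorted(missing)}"
            )
    try:
        # Values are treated as predecessors, so dependencies come out first.
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        raise ValueError(f"Dependency cycle detected: {exc.args[1]}") from exc


# Hypothetical ids: 1 = authentication, 2 = billing system,
# 3 = billing integration, 4 = documentation.
example = {1: [], 2: [1], 3: [1, 2], 4: [1]}
order = work_order(example)
```

With the extracted tickets, the mapping could be built as `{t.id: t.dependencies or [] for t in prediction}`.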

## Visualizing the Tasks

To visualize the data quickly, we used Code Interpreter to create a Graphviz export from the JSON version of the extracted tickets.  
Example:

!['Action Items graph'](action_items_example.png)

In [39]:

dot = graphviz.Digraph(comment='Action Items')

for ticket in prediction:
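    # Use .value so the label reads "High" rather than "PriorityEnum.high" on newer Python versions
    dot.node(str(ticket.id), f"{ticket.name}\n{ticket.priority.value}\n{', '.join(ticket.assignees)}")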
    
    if ticket.subtasks:
        for subtask in ticket.subtasks:
            dot.node(f"{ticket.id}.{subtask.id}", subtask.name)
            dot.edge(str(ticket.id), f"{ticket.id}.{subtask.id}")
    
    if ticket.dependencies:
        for dep in ticket.dependencies:
            dot.edge(str(dep), str(ticket.id))

dot.render('action_items', view=True)

'action_items.pdf'
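
Rendering with `dot.render` relies on the Graphviz executables being installed. Where they are not available, the same graph structure can be written out as DOT text by hand and rendered elsewhere. The following is a rough sketch of that idea using plain dictionaries. The `quote` and `to_dot` helpers and the sample data are hypothetical, and only double quotes are escaped, so it is not a complete DOT serializer.

```python
from typing import Dict, List


def quote(text: str) -> str:
    """Wrap text in double quotes for DOT, escaping embedded double quotes."""
    return '"' + text.replace('"', '\\"') + '"'


def to_dot(tickets: List[Dict]) -> str:
    """Build DOT text with ticket nodes, subtask edges, and dependency edges."""
    lines = ["digraph ActionItems {"]
    for ticket in tickets:
        tid = str(ticket["id"])
        lines.append(f"  {quote(tid)} [label={quote(ticket['name'])}];")
        # Subtasks hang off their parent ticket, as in the cell above.
        for subtask in ticket.get("subtasks") or []:
            sid = f"{tid}.{subtask['id']}"
            lines.append(f"  {quote(sid)} [label={quote(subtask['name'])}];")
            lines.append(f"  {quote(tid)} -> {quote(sid)};")
        # Dependency edges point from the prerequisite to the dependent ticket.
        for dep in ticket.get("dependencies") or []:
            lines.append(f"  {quote(str(dep))} -> {quote(tid)};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical data, not actual model output.
tickets = [
    {
        "id": 1,
        "name": "Improve authentication",
        "subtasks": [{"id": 1, "name": "Front-end revamp"}],
        "dependencies": None,
    },
    {"id": 2, "name": "Build billing system", "subtasks": None, "dependencies": [1]},
]
dot_text = to_dot(tickets)
```

The resulting `dot_text` can be saved to a `.gv` file and rendered on any machine that has Graphviz installed.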

In this example, the `generate_action_items` function successfully identifies and segments the action items, assigning them priorities, assignees, subtasks, and dependencies as discussed in the meeting.

By automating this process, you can ensure that important tasks and details are not lost in the sea of meeting minutes, making project management more efficient and effective.