## 🧱 1. Core Idea: What Are You Building?

You're building a **tool that helps you search through footage using the script's content**: not just by keyword, but by **context**, such as:

* **Who** is in the scene (characters)
* **What** is happening (actions)
* **Where** it's set (locations)
* **How** it feels (mood or tone)
* **When** it occurs (scene, day/night)

This is like a **smart search keyboard**, powered by the script.
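To make the five dimensions above concrete, here is a minimal sketch of how one scene's context might be stored and filtered. The `SceneTags` class, its field names, and the `matches` helper are illustrative assumptions, not part of any library, and the sample values are taken from the kitchen scene used later in this walkthrough.

```python
from dataclasses import dataclass, field


@dataclass
class SceneTags:
    """Searchable context for one scene (all names here are hypothetical)."""
    characters: set = field(default_factory=set)  # who
    actions: set = field(default_factory=set)     # what
    location: str = ""                            # where
    mood: str = ""                                # how it feels
    time_of_day: str = ""                         # when


def matches(scene, *, character=None, action=None, location=None, time_of_day=None):
    """Return True when the scene satisfies every filter that was supplied."""
    if character is not None and character not in scene.characters:
        return False
    if action is not None and action not in scene.actions:
        return False
    if location is not None and location.lower() != scene.location.lower():
        return False
    if time_of_day is not None and time_of_day.lower() != scene.time_of_day.lower():
        return False
    return True


kitchen = SceneTags(
    characters={"Sarah", "John"},
    actions={"enter", "look"},
    location="KITCHEN",
    time_of_day="NIGHT",
)

print(matches(kitchen, character="Sarah", time_of_day="night"))  # True
print(matches(kitchen, character="Sarah", location="garden"))    # False
```

Filters that are left out are simply ignored, so the same helper covers both narrow and broad searches.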


## 🛠 3. Step-by-Step Build Overview

Here’s a full breakdown:


### ✅ Step 1: Load & Parse Your Script

Your script might look like this:

```
INT. KITCHEN - NIGHT

Sarah enters with a knife in her hand.
John looks up from the table, startled.
```
You can parse it with spaCy:

In [None]:
script_text = "Alex had always dreamed of reaching the top of Eagle's Peak, a mountain so tall many said it can't be climbed without years of training. But Alex wasn't an expert. Just a person with a dream and a backpack full of hope. The first steps were easy. The path was clear. But soon the trail grew steeper. Rocks blocked the way. The wind howled. Doubt crept in."


In [3]:
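import spacy

# Load spaCy's small English pipeline
# (install it once with: python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")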

script = """
INT. KITCHEN - NIGHT

Sarah enters with a knife in her hand.
John looks up from the table, startled.
"""

doc = nlp(script)
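
Scene headings such as `INT. KITCHEN - NIGHT` follow a fairly fixed pattern, so the "where" and "when" can often be read from them directly with a regular expression rather than with NLP. The sketch below assumes the common `INT./EXT. LOCATION - TIME` layout; real scripts vary, and `SLUGLINE` and `parse_slugline` are hypothetical names introduced only for illustration.

```python
import re

# Assumed heading layout: "<INT.|EXT.|INT./EXT.> <LOCATION> - <TIME>".
SLUGLINE = re.compile(
    r"^(?P<setting>INT\./EXT\.|INT\.|EXT\.)\s+"
    r"(?P<location>.+?)\s+-\s+(?P<time>[A-Z ]+)$"
)


def parse_slugline(line):
    """Return setting, location, and time for a scene heading, or None."""
    match = SLUGLINE.match(line.strip())
    if match is None:
        return None  # not a scene heading under the assumed layout
    return {
        "setting": match["setting"],
        "location": match["location"],
        "time": match["time"].strip(),
    }


print(parse_slugline("INT. KITCHEN - NIGHT"))
# {'setting': 'INT.', 'location': 'KITCHEN', 'time': 'NIGHT'}
print(parse_slugline("Sarah enters with a knife in her hand."))
# None
```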

### ✅ Step 2: Extract Keywords, Characters, and Actions

In [4]:
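structured = []

# Build one record per sentence detected by spaCy.
for sent in doc.sents:
    structured.append({
        "sentence": sent.text,
        # Named entities tagged PERSON stand in for characters.
        "characters": [ent.text for ent in sent.ents if ent.label_ == "PERSON"],
        # Verb lemmas ("enters" -> "enter") stand in for actions.
        "actions": [token.lemma_ for token in sent if token.pos_ == "VERB"],
        # Common-noun lemmas are a rough proxy for objects in the scene.
        "objects": [token.lemma_ for token in sent if token.pos_ == "NOUN"]
    })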

In [5]:
structured

[{'sentence': '\nINT.', 'characters': [], 'actions': [], 'objects': []},
 {'sentence': 'KITCHEN - NIGHT\n\nSarah enters with a knife in her hand.\n',
  'characters': ['Sarah'],
  'actions': ['enter'],
  'objects': ['knife', 'hand']},
 {'sentence': 'John looks up from the table, startled.\n',
  'characters': ['John'],
  'actions': ['look', 'startle'],
  'objects': ['table']}]
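
In the output above, the sentence splitter treats `INT.` as a sentence of its own and merges the rest of the heading with Sarah's line. One possible way around this is to separate headings from action lines before any text reaches spaCy. The sketch below uses only the standard library; `split_scenes` is a hypothetical helper, and it assumes every scene heading starts with `INT.` or `EXT.`.

```python
def split_scenes(script):
    """Group action lines under the scene heading that precedes them."""
    scenes = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(("INT.", "EXT.")):
            # A new heading opens a new scene.
            scenes.append({"heading": line, "lines": []})
        elif scenes:
            scenes[-1]["lines"].append(line)
        # Lines before the first heading are ignored in this sketch.
    return scenes


script = """
INT. KITCHEN - NIGHT

Sarah enters with a knife in her hand.
John looks up from the table, startled.
"""

for scene in split_scenes(script):
    print(scene["heading"])
    for line in scene["lines"]:
        print("  ", line)
```

Each action line could then be passed to `nlp(...)` on its own, with the heading stored alongside the resulting record so that every sentence keeps its scene context.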

In [6]:
# Collect unique tags
all_characters = sorted(set(c for line in structured for c in line['characters']))
all_actions = sorted(set(a for line in structured for a in line['actions']))
all_objects = sorted(set(o for line in structured for o in line['objects']))


In [9]:
all_characters

['John', 'Sarah']

In [10]:
all_actions

['enter', 'look', 'startle']

In [11]:
all_objects

['hand', 'knife', 'table']
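
These unique tags can double as the search vocabulary. As a rough sketch of the lookup side, the example below builds an inverted index from each tag to the sentences that carry it. The records are copied from the `structured` output above, with the heading text trimmed from the sentences; `build_index` and `search` are hypothetical helper names.

```python
from collections import defaultdict

# Records copied from the `structured` output (heading text trimmed).
structured = [
    {"sentence": "Sarah enters with a knife in her hand.",
     "characters": ["Sarah"], "actions": ["enter"], "objects": ["knife", "hand"]},
    {"sentence": "John looks up from the table, startled.",
     "characters": ["John"], "actions": ["look", "startle"], "objects": ["table"]},
]


def build_index(records):
    """Map (tag kind, lowercased tag) to the positions of matching records."""
    index = defaultdict(set)
    for i, record in enumerate(records):
        for kind in ("characters", "actions", "objects"):
            for tag in record[kind]:
                index[(kind, tag.lower())].add(i)
    return index


def search(index, records, **filters):
    """Return sentences matching every filter, e.g. characters="Sarah"."""
    hits = set(range(len(records)))
    for kind, tag in filters.items():
        hits &= index.get((kind, tag.lower()), set())
    return [records[i]["sentence"] for i in sorted(hits)]


index = build_index(structured)
print(search(index, structured, characters="Sarah", objects="knife"))
# ['Sarah enters with a knife in her hand.']
print(search(index, structured, actions="look"))
# ['John looks up from the table, startled.']
```

Combining filters narrows the result by set intersection, which is what lets a query mix "who" with "what" instead of matching a single keyword.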