# When Data Processing Needs Knowledge
Chrissy LeMaire, Microsoft MVP & GitHub Star

## Is Beverly Hills part of Los Angeles?

A traditional regex approach fails here because no textual pattern connects the two names. The relationship is a matter of knowledge, not of how the strings look.

In [None]:
# Both comparisons return False: -match only sees characters, not geography
"Beverly Hills" -match "Los Angeles"
"Los Angeles" -match "90210"

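To make the limitation concrete, here is a minimal sketch of what pattern matching can and cannot do. The `$knownAreas` table is hypothetical, hand-maintained data added only for illustration; it is not part of the original notebook.

```powershell
# Regex can confirm the *shape* of a value...
$zip = "90210"
$zip -match '^\d{5}$'                      # five digits, so the shape is ZIP-like

# ...but it cannot tell which place the value belongs to.
# The classic workaround is a hand-maintained lookup table (hypothetical data):
$knownAreas = @{
    "Beverly Hills" = "Los Angeles County"
}
$knownAreas.ContainsKey("Beverly Hills")   # covered only because someone typed it in
$knownAreas.ContainsKey("Santa Monica")    # not in the table, so the lookup cannot help
```

Every new place means another manual entry, which is the gap a model with world knowledge is meant to fill.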
### AI Integration with JSON/Structured Output

* Run Ollama: **ollama serve**
* Pull the model: **ollama pull llama3.1**
* Ask the local model a simple true/false question:

In [None]:
$question = "Is Beverly Hills part of Los Angeles?"

# Define schema for true/false response
$schema = @{
    type = "object"
    properties = @{
        answer = @{
            type = "boolean"
        }
    }
    required = @("answer")
}

# Make the API call
$body = @{
    model = "llama3.1"
    messages = @(
        @{
            role = "user"
            content = $question
        }
    )
    stream = $false
    format = $schema
} | ConvertTo-Json -Depth 3

$response = Invoke-RestMethod -Uri http://localhost:11434/api/chat -Method Post -Body $body -ContentType "application/json"

# Clean, structured data we can work with 🎉
$answerData = $response.message.content | ConvertFrom-Json

"Question: $question"
"Answer:   $($answerData.answer)"

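The request/parse pattern above can be wrapped in two small helpers so that later cells do not repeat it. This is a sketch, not part of the original notebook: the function names `New-OllamaChatBody` and `ConvertFrom-OllamaChatResponse` are hypothetical, and the response object is simulated so the cell runs without a server. The simulated `true` is placeholder data, not a real model answer.

```powershell
# Hypothetical helper: builds the JSON body for the /api/chat call shown above.
function New-OllamaChatBody {
    param(
        [Parameter(Mandatory)] [string]    $Prompt,
        [Parameter(Mandatory)] [hashtable] $Schema,
        [string] $Model = "llama3.1"
    )

    @{
        model    = $Model
        messages = @(
            @{ role = "user"; content = $Prompt }
        )
        stream   = $false
        format   = $Schema
    } | ConvertTo-Json -Depth 10   # generous depth so nested schemas are not truncated
}

# Hypothetical helper: parses the reply and checks the schema's required keys.
function ConvertFrom-OllamaChatResponse {
    param(
        [Parameter(Mandatory)] $Response,
        [string[]] $Required = @()
    )

    $data = $Response.message.content | ConvertFrom-Json
    foreach ($name in $Required) {
        if ($data.PSObject.Properties.Name -notcontains $name) {
            throw "Model reply is missing required property '$name'."
        }
    }
    $data
}

# Offline demonstration with a simulated response (no server needed)
$schema = @{
    type       = "object"
    properties = @{ answer = @{ type = "boolean" } }
    required   = @("answer")
}
$body = New-OllamaChatBody -Prompt "Is Beverly Hills part of Los Angeles?" -Schema $schema
$fakeResponse = [pscustomobject]@{
    message = [pscustomobject]@{ content = '{"answer": true}' }
}
$parsed = ConvertFrom-OllamaChatResponse -Response $fakeResponse -Required $schema.required
$parsed.answer
```

Against a running server, `$body` would be passed to `Invoke-RestMethod` exactly as in the cell above, and the real response would replace `$fakeResponse`.
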
### Another Example: Filename Cleaning

Let's see structured output tackle a practical problem: cleaning messy MP3 filenames using **world knowledge** or **context**.

In [None]:
# Sample messy MP3 filenames
$messyMp3Files = @(
    "01_bohrhap_queen.mp3",
    "material_girl-madonna85.mp3",
    "hotel_cali_eagles1976.mp3",
    "IMAGINE-J-LENNON-track2.mp3",
    "hey_jude_(beetles)_1968_.mp3",
    "billiejean_MJ_thriller.mp3",
    "sweet_child_of_mine_gnr87.mp3",
    "shake_it_off-taylorswift.mp3",
    "purple-haze-jimmy_hendrix_1967.mp3",
    "bohemian(queen)rhaps.mp3",
    "smells_like_teen_spirit_nirvana91.mp3",
    "halo_beyonce_2008.mp3"
)

# Define the schema for clean artist/song extraction
$schema = @{
    type       = "object"
    properties = @{
        artist = @{ type = "string" }
        song   = @{ type = "string" }
    }
    required   = @("artist", "song")
}

# The prompt that makes all the difference - includes examples!
$prompt = @"
You are an AI that extracts artist and song names from messy MP3 filenames.

Examples:
1. "hotel_cali_eagles1976.mp3" → {"artist": "Eagles", "song": "Hotel California"}
2. "rolling_in_the_deep-adele_2011.mp3" → {"artist": "Adele", "song": "Rolling in the Deep"}
3. "californication-RHCP.mp3" → {"artist": "Red Hot Chili Peppers", "song": "Californication"}

Now, extract from this filename:
"@

# Process each file individually ("asking tiny questions")
foreach ($file in $messyMp3Files) {
    # Create the message for LLM
    $msg = "$prompt $file"

    # Create the payload with the schema object
    $body = @{
        model = "llama3.1"
        messages = @(
            @{
                role    = "user"
                content = $msg
            }
        )
        stream = $false
        format = $schema
    } | ConvertTo-Json -Depth 6 -Compress

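    # Call the local LLM API; declare UTF-8 JSON because the prompt contains a non-ASCII character (→)
    $response = Invoke-RestMethod -Uri http://localhost:11434/api/chat -Method Post -Body $body -ContentType "application/json; charset=utf-8"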
    $info = $response.message.content | ConvertFrom-Json

    # Emit the result as a PowerShell object (old name and proposed new name)
    [pscustomobject]@{
        Old = $file
        New = "$($info.artist) - $($info.song).mp3"
    }
}
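
The loop above only proposes new names. A possible next step, sketched here under assumptions, is to sanitize each proposed name and preview the rename before touching any file. `ConvertTo-SafeFileName` is a hypothetical helper, and the single `$results` entry reuses the pair from the prompt's first example rather than real model output.

```powershell
# Hypothetical helper: replaces characters that are not allowed in file names.
function ConvertTo-SafeFileName {
    param([Parameter(Mandatory)] [string] $Name)

    # Split on every invalid character, then rejoin the pieces with underscores
    $invalid = [System.IO.Path]::GetInvalidFileNameChars()
    $Name.Split($invalid) -join '_'
}

# A model could return a name containing a path separator, such as "AC/DC"
ConvertTo-SafeFileName -Name "AC/DC - Back in Black.mp3"

# Illustrative result, taken from the prompt's own first example
$results = @(
    [pscustomobject]@{
        Old = "hotel_cali_eagles1976.mp3"
        New = "Eagles - Hotel California.mp3"
    }
)

foreach ($r in $results) {
    $safeName = ConvertTo-SafeFileName -Name $r.New
    if (Test-Path -LiteralPath $r.Old) {
        # -WhatIf prints what would happen without renaming anything
        Rename-Item -LiteralPath $r.Old -NewName $safeName -WhatIf
    }
}
```

Dropping `-WhatIf` performs the rename. Keeping the preview step is a cautious default, since model output can be wrong even when it matches the schema.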