# **Machine Learning: Project Part 1**

---

**Author: Damien Farrell**

---

In [1]:
# imports

import os
import torch
import librosa
import numpy as np
from dotenv import load_dotenv
from pydub import AudioSegment

from pyannote.audio import Pipeline
from pyannote.audio.pipelines.utils.hook import ProgressHook
from transformers import pipeline as hf_pipeline

INFO:speechbrain.utils.quirks:Applied quirks (see `speechbrain.utils.quirks`): [disable_jit_profiling, allow_tf32]
INFO:speechbrain.utils.quirks:Excluded quirks specified by the `SB_DISABLE_QUIRKS` environment (comma-separated list): []


In [2]:
# Environment Variables
# load_dotenv() reads the .env file; os.environ.get expects the variable name, not a path
load_dotenv()
secret_key = os.environ.get("HF_API_KEY")

# Audio Files
audio = "./audio/TrumpHarrisDebate.wav"

## **Project Part 1: Interview/Debate Audio Analysis**

> 1. **Performs Speaker Diarisation Analysis**  
>    - Uses pre-built models to identify who spoke and when.  
>    - Outputs time segments for each speaker and calculates total speaking time.
> <br><br>
> 2. **Performs Speech to Text Analysis**  
>    - Transcribes the audio for each speaker.  
>    - Combines speaker labels with the transcript (e.g., “[Speaker 1] …”).  
>    - Allows for further analysis, such as word counts or word frequency per speaker.
> <br><br>
> 3. **Leverages a Large Language Model**  
>    - Once the transcript is annotated, the notebook can query a large language model for sentiment or ideological analysis.  
>    - Could identify speaker names or approximate political leanings based on transcript content.
> <br><br>
> 4. **Testing & Evaluation**  
>    - An audio file of the “Harris vs. Trump 2024 US Presidential Debate” will be provided for initial testing.  
>    - A more complex file (with additional speakers or speakers of similar gender) should be used for further evaluation.  
>    - The performance of each component should be documented and assessed.
> <br><br>
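
Steps 1 and 2 above can be sketched in plain Python once both models have produced output. This is a minimal sketch, not the notebook's implementation: it assumes the diarisation turns are available as `(start, end, speaker)` triples and the transcript as `(start, end, text)` chunks (for example from an ASR pipeline that returns timestamps). The helper names `overlap`, `label_chunks` and `word_counts` are hypothetical, and the sample data is made up for illustration.

```python
from collections import Counter, defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_chunks(turns, chunks):
    """Give each transcript chunk the speaker whose turn overlaps it most."""
    labelled = []
    for c_start, c_end, text in chunks:
        best = max(turns, key=lambda t: overlap(t[0], t[1], c_start, c_end))
        # Fall back to a placeholder label if no turn overlaps the chunk at all.
        has_overlap = overlap(best[0], best[1], c_start, c_end) > 0
        speaker = best[2] if has_overlap else "UNKNOWN"
        labelled.append((speaker, text.strip()))
    return labelled


def word_counts(labelled):
    """Count word frequencies per speaker (lower-cased, whitespace-split)."""
    counts = defaultdict(Counter)
    for speaker, text in labelled:
        counts[speaker].update(text.lower().split())
    return counts


# Made-up example data (assumption): two speaker turns and two transcript chunks.
turns = [(0.0, 4.0, "SPEAKER_00"), (4.0, 9.0, "SPEAKER_01")]
chunks = [(0.5, 3.5, "Good evening and welcome"), (4.2, 8.0, "Thank you thank you")]

labelled = label_chunks(turns, chunks)
for speaker, text in labelled:
    print(f"[{speaker}] {text}")

counts = word_counts(labelled)
print(counts["SPEAKER_01"].most_common(2))
```

Assigning by maximum overlap is one simple choice; chunks that straddle a speaker change would need to be split at word level for a more faithful transcript.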


---

### **Performs Speaker Diarisation Analysis** 

---

In [3]:
# instantiate the pipeline, passing the token loaded from the environment
# (not the literal string "HF_API_KEY")
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=secret_key)

In [4]:
# run the pipeline on an audio file
with ProgressHook() as hook:
    diarization = pipeline(audio, hook=hook)


In [10]:
print(type(diarization))

<class 'pyannote.core.annotation.Annotation'>


In [12]:
diarization.labels()

['SPEAKER_00', 'SPEAKER_01', 'SPEAKER_02']

In [16]:
# total speaking time per speaker in seconds, longest first
diarization.chart()

[('SPEAKER_02', 87.19312499999997),
 ('SPEAKER_01', 81.84375),
 ('SPEAKER_00', 25.835624999999993)]
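
The totals above can be turned into each speaker's share of the talking time. The following is a minimal sketch under stated assumptions: turns are `(start, end, speaker)` triples (with pyannote these could be collected from `diarization.itertracks(yield_label=True)`), the helper names `speaking_time` and `shares` are hypothetical, and the sample turns are made up. It simply sums turn lengths, so it may differ from `chart()` if a speaker's turns overlap each other.

```python
from collections import defaultdict


def speaking_time(turns):
    """Sum turn durations per speaker, sorted longest first."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def shares(chart):
    """Convert (speaker, seconds) pairs into percentages of total speech."""
    total = sum(duration for _, duration in chart)
    return {speaker: 100.0 * duration / total for speaker, duration in chart}


# Made-up example turns (assumption), in seconds.
turns = [
    (0.0, 5.0, "SPEAKER_00"),
    (5.0, 20.0, "SPEAKER_01"),
    (20.0, 25.0, "SPEAKER_00"),
    (25.0, 30.0, "SPEAKER_02"),
]

chart = speaking_time(turns)
percentages = shares(chart)
for speaker, duration in chart:
    print(f"{speaker}: {duration:.1f} s ({percentages[speaker]:.1f}%)")
```

The same `shares` function accepts the list returned by `diarization.chart()` directly, since both are lists of `(label, duration)` pairs.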

---

### **References**

1. https://huggingface.co/pyannote/segmentation
2. http://pyannote.github.io/pyannote-core/reference.html#annotation

---

# END