## Audio Classification-based Speech Emotion Recognition (SER)

### ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition

In [None]:
# Use a pipeline as a high-level helper
from transformers import pipeline
import os # Import the os module to check for file existence

# 1. Define the path to your audio file
# The r"..." syntax (raw string) is great for Windows paths to avoid issues with backslashes.
audio_file_path = r"C:\...\.wav"

# --- Optional but Recommended: Check if the file exists before proceeding ---
if not os.path.exists(audio_file_path):
    # Raise instead of calling exit(), which can also stop the kernel when run in a notebook
    raise FileNotFoundError(f"The file was not found at the path: {audio_file_path}")

# 2. Initialize the pipeline
# This will download the model the first time you run it, which may take a few minutes.
print("Loading the audio classification pipeline... (This might take a while on the first run)")
pipe = pipeline("audio-classification", model="ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition")
print("Pipeline loaded successfully.")

# 3. Perform the inference by passing the file path to the pipeline
print(f"Analyzing audio file: {audio_file_path}")
results = pipe(audio_file_path)

# 4. Print the results
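# The output is a list of dictionaries, each with a label (the emotion) and a score (the confidence).
# Without top_k, the pipeline returns its default number of highest-scoring labels, not only the best one.
print("\n--- Inference Result (Top Emotions) ---")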
print(results)

# To get all possible emotion scores, you can use the top_k parameter
print("\n--- All Emotion Scores ---")
all_scores = pipe(audio_file_path, top_k=8) # This model has 8 labels
for emotion in all_scores:
    # Round the score to 4 decimal places for cleaner output
    print(f"Emotion: {emotion['label']:<10} | Score: {emotion['score']:.4f}")

Analyzing audio file: `C:\...\.wav`

---

#### All Emotion Scores

* **Emotion: happy**      | **Score:** `0.1377`
* **Emotion: fearful**    | **Score:** `0.1322`
* **Emotion: surprised**  | **Score:** `0.1286`
* **Emotion: sad**        | **Score:** `0.1271`
* **Emotion: neutral**    | **Score:** `0.1238`
* **Emotion: disgust**    | **Score:** `0.1216`
* **Emotion: calm**       | **Score:** `0.1159`
* **Emotion: angry**      | **Score:** `0.1130`
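
With eight labels, a uniform guess would assign 1/8 = 0.125 to each emotion, and every score above lies close to that value, so this ranking carries little information about the clip. The following is a minimal sketch of that check. The helper name `max_deviation_from_uniform` is hypothetical, and the scores are copied from the output above.

```python
# Scores copied from the output above (label -> score).
scores = {
    "happy": 0.1377,
    "fearful": 0.1322,
    "surprised": 0.1286,
    "sad": 0.1271,
    "neutral": 0.1238,
    "disgust": 0.1216,
    "calm": 0.1159,
    "angry": 0.1130,
}


def max_deviation_from_uniform(scores):
    """Return the largest absolute gap between any score and 1 / number_of_labels."""
    uniform = 1.0 / len(scores)
    return max(abs(score - uniform) for score in scores.values())


uniform = 1.0 / len(scores)
deviation = max_deviation_from_uniform(scores)
print(f"Uniform baseline:               {uniform:.4f}")    # 0.1250
print(f"Largest deviation from uniform: {deviation:.4f}")  # 0.0127
```

A deviation this small suggests the model is barely separating the classes for this clip. One possible cause, stated here as an assumption rather than a diagnosis, is that some classifier weights were not loaded from the checkpoint, so any warnings printed while the pipeline loads are worth reading.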

### superb/wav2vec2-large-superb-er (1.26G)

In [None]:
# Use a pipeline as a high-level helper
from transformers import pipeline
import os

# 1. Define the path to your audio file
audio_file_path = r"C:\...\.wav"

# --- Optional but Recommended: Check if the file exists ---
if not os.path.exists(audio_file_path):
    # Raise instead of calling exit(), which can also stop the kernel when run in a notebook
    raise FileNotFoundError(f"The file was not found at the path: {audio_file_path}")

# 2. Initialize the pipeline with the NEW model
# Note: The only change is the model name here.
print("Loading the audio classification pipeline (superb/wav2vec2-large-superb-er)...")
pipe = pipeline("audio-classification", model="superb/wav2vec2-large-superb-er")
print("Pipeline loaded successfully.")

# 3. Perform the inference by passing the file path to the pipeline
print(f"Analyzing audio file: {audio_file_path}")
results = pipe(audio_file_path)

# 4. Print the results (the default call returns the highest-scoring labels, not only the best one)
print("\n--- Inference Result (Top Emotions) ---")
print(results)

# To get all possible emotion scores for THIS model, we use top_k=4
print("\n--- All Emotion Scores ---")
all_scores = pipe(audio_file_path, top_k=4) # This model has 4 labels
for emotion in all_scores:
    print(f"Emotion: {emotion['label']:<5} | Score: {emotion['score']:.4f}")

Analyzing audio file: `C:\...\.wav`

---

#### All Emotion Scores

* **Emotion: hap**   | **Score:** `0.8158`
* **Emotion: neu**   | **Score:** `0.1796`
* **Emotion: sad**   | **Score:** `0.0043`
* **Emotion: ang**   | **Score:** `0.0004`
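
This model reports abbreviated labels. A small sketch for printing them with full names follows. The `LABEL_NAMES` mapping and the `expand_labels` helper are hypothetical, and the expansions (`hap` to happy, `neu` to neutral, `ang` to angry) are assumed from the abbreviations rather than taken from the model card.

```python
# Assumed expansions of the abbreviated labels shown in the output above.
LABEL_NAMES = {"hap": "happy", "neu": "neutral", "sad": "sad", "ang": "angry"}


def expand_labels(results):
    """Return a copy of pipeline-style results with abbreviated labels spelled out."""
    return [
        # Unknown labels are passed through unchanged.
        {"label": LABEL_NAMES.get(item["label"], item["label"]), "score": item["score"]}
        for item in results
    ]


# Scores copied from the output above, in the list-of-dicts shape the pipeline returns.
results = [
    {"label": "hap", "score": 0.8158},
    {"label": "neu", "score": 0.1796},
    {"label": "sad", "score": 0.0043},
    {"label": "ang", "score": 0.0004},
]

for item in expand_labels(results):
    print(f"Emotion: {item['label']:<8} | Score: {item['score']:.4f}")
```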

### superb/hubert-base-superb-er (378M)

In [None]:
# Use a pipeline as a high-level helper
from transformers import pipeline
import os

# 1. Define the path to your audio file
audio_file_path = r"C:\...\.wav"

# --- Optional but Recommended: Check if the file exists ---
if not os.path.exists(audio_file_path):
    # Raise instead of calling exit(), which can also stop the kernel when run in a notebook
    raise FileNotFoundError(f"The file was not found at the path: {audio_file_path}")

# 2. Initialize the pipeline with the NEW model
# Note: The only change is the model name here.
print("Loading the audio classification pipeline (superb/hubert-base-superb-er)...")
pipe = pipeline("audio-classification", model="superb/hubert-base-superb-er")
print("Pipeline loaded successfully.")

# 3. Perform the inference by passing the file path to the pipeline
print(f"Analyzing audio file: {audio_file_path}")
results = pipe(audio_file_path)

# 4. Print the results (the default call returns the highest-scoring labels, not only the best one)
print("\n--- Inference Result (Top Emotions) ---")
print(results)

# To get all possible emotion scores for THIS model, we use top_k=4
print("\n--- All Emotion Scores ---")
all_scores = pipe(audio_file_path, top_k=4) # This model has 4 labels
for emotion in all_scores:
    print(f"Emotion: {emotion['label']:<5} | Score: {emotion['score']:.4f}")

Analyzing audio file: `C:\...\.wav`

---

#### All Emotion Scores

* **Emotion: hap**   | **Score:** `0.6047`
* **Emotion: sad**   | **Score:** `0.1896`
* **Emotion: neu**   | **Score:** `0.1819`
* **Emotion: ang**   | **Score:** `0.0238`
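
Both SUPERB models rank `hap` first on this clip, but with different margins over the runner-up. A rough sketch of that comparison follows. `top1_margin` is a hypothetical helper, and the scores are copied from the two outputs above. The margin measures how decisive a prediction is, not whether it is correct, since no ground-truth label is given for the clip.

```python
def top1_margin(results):
    """Return (top_label, top_score - second_score) for pipeline-style results."""
    ranked = sorted(results, key=lambda item: item["score"], reverse=True)
    return ranked[0]["label"], ranked[0]["score"] - ranked[1]["score"]


# Scores copied from the outputs above.
wav2vec2_large = [
    {"label": "hap", "score": 0.8158},
    {"label": "neu", "score": 0.1796},
    {"label": "sad", "score": 0.0043},
    {"label": "ang", "score": 0.0004},
]
hubert_base = [
    {"label": "hap", "score": 0.6047},
    {"label": "sad", "score": 0.1896},
    {"label": "neu", "score": 0.1819},
    {"label": "ang", "score": 0.0238},
]

for name, results in [
    ("superb/wav2vec2-large-superb-er", wav2vec2_large),
    ("superb/hubert-base-superb-er", hubert_base),
]:
    label, margin = top1_margin(results)
    print(f"{name:<32} top={label} margin={margin:.4f}")
```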

**Inference performance is poorer than `superb/wav2vec2-large-superb-er` on this clip: the top label `hap` scores 0.6047 versus 0.8158.**
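
The three cells above differ only in the model name and the number of labels, so they could be folded into one loop. The following is a sketch under assumptions, not the notebook's original code. `classify_with_models` and its `make_pipe` parameter are hypothetical names, and `transformers` is imported lazily so the helper can be tried with a stand-in pipeline before any model is downloaded.

```python
def classify_with_models(audio_path, model_ids, make_pipe=None):
    """Run each model on the same file and return {model_id: list of label/score dicts}."""
    if make_pipe is None:
        # Imported here so the function can be exercised without loading transformers.
        from transformers import pipeline

        def make_pipe(model_id):
            return pipeline("audio-classification", model=model_id)

    all_results = {}
    for model_id in model_ids:
        pipe = make_pipe(model_id)
        # Ask for every label the model defines instead of hard-coding top_k per model.
        num_labels = pipe.model.config.num_labels
        all_results[model_id] = pipe(audio_path, top_k=num_labels)
    return all_results


# Example usage, left as a comment because it downloads all three models:
# all_results = classify_with_models(
#     audio_file_path,
#     [
#         "ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition",
#         "superb/wav2vec2-large-superb-er",
#         "superb/hubert-base-superb-er",
#     ],
# )
# for model_id, results in all_results.items():
#     print(model_id, results)
```

Reading the label count from `pipe.model.config.num_labels` avoids having to remember that one model has 8 labels and the others 4.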