In [1]:
from transformers import pipeline, T5Tokenizer, TFT5ForConditionalGeneration
import tensorflow as tf

print("Libraries imported successfully!")
print(f"TensorFlow version: {tf.__version__}")


Libraries imported successfully!
TensorFlow version: 2.20.0


In [2]:
# Cell 2: Sample Data (Input)
# Re-run this cell before the later cells so they all use this transcript.

MEETING_TRANSCRIPT = """
Tom: Okay everyone, let's kick off. The main goal today is to finalize the new marketing slogan for the Q4 launch. Sarah, what does your team have?

Sarah: Thanks, Tom. We've narrowed it down to three options. "Innovation for Tomorrow," "Your Future, Our Passion," and "Simply Better." The data suggests "Simply Better" is resonating most with our test groups.

Alex: I agree. It's clean and direct. "Innovation for Tomorrow" is too generic.

Tom: Good point, Alex. Let's go with "Simply Better." Sarah, can you please get the final design assets to the web team?

Sarah: Will do. I'll have them sent over by end-of-day Friday.

Alex: I also have an action item. I will coordinate with the legal team to get the trademark paperwork started for "Simply Better." I should have an update on that by our next meeting.

Tom: Perfect. That's all for today. Great work, team.
"""

print("Sample transcript loaded.")

Sample transcript loaded.
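
The transcript above follows a simple "Speaker: text" layout, so it can be split into speaker turns before any model sees it. The following is a minimal sketch under that assumption; `split_turns` and `TURN_PATTERN` are hypothetical names that do not appear in the original notebook, and the sample is an abridged excerpt of the transcript.

```python
import re

# Hypothetical helper (not part of the original notebook): split a
# "Speaker: text" transcript into (speaker, utterance) pairs.
TURN_PATTERN = re.compile(r"^(?P<speaker>[A-Z][\w .'-]*?):\s*(?P<text>.+)$")


def split_turns(transcript):
    turns = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the blank lines between turns
        match = TURN_PATTERN.match(line)
        if match:
            turns.append((match.group("speaker"), match.group("text")))
        elif turns:
            # Line without a speaker label: treat it as a continuation
            # of the previous speaker's utterance.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}")
    return turns


# Abridged excerpt of the meeting transcript, used only for illustration.
sample = """
Tom: Let's go with "Simply Better."
Sarah: Will do. I'll have them sent over by end-of-day Friday.
"""

for speaker, text in split_turns(sample):
    print(f"{speaker}: {len(text.split())} words")
```

Passing `MEETING_TRANSCRIPT` to the same function should work the same way, provided every turn starts with a capitalized name followed by a colon.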


In [3]:
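# Cell 3: Task 1 - Summarize the Transcript

print("Loading summarization model... (This may take a moment on first run)")

# BART checkpoint fine-tuned for news summarization (CNN/DailyMail).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# max_length and min_length bound the summary length in tokens;
# do_sample=False disables sampling, so the output is deterministic.
summary_output = summarizer(
    MEETING_TRANSCRIPT,
    max_length=90,
    min_length=30,
    do_sample=False,
)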

print("\n--- ✅ MEETING SUMMARY ---")
print(summary_output[0]['summary_text'])

Loading summarization model... (This may take a moment on first run)


Falling back to torch.float32 because loading with the original dtype failed on the target device.
Device set to use cpu



--- ✅ MEETING SUMMARY ---
"Simply Better" is resonating most with our test groups," says Tom. "Innovation for Tomorrow" is too generic. Sarah, can you please get the final design assets to the web team?

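The summary above attributes the "resonating most with our test groups" remark to Tom, while in the transcript it is Sarah who says it. A quick way to spot-check such attributions is to look up which speaker's line contains a phrase. This is only a sketch: `speakers_of` is a hypothetical helper, it relies on plain substring matching, and the excerpt is abridged from the transcript.

```python
def speakers_of(phrase, transcript):
    """Return the speakers whose lines contain `phrase` (case-insensitive)."""
    needle = phrase.lower()
    found = []
    for line in transcript.splitlines():
        # Everything before the first colon is treated as the speaker label.
        speaker, sep, text = line.partition(":")
        if sep and needle in text.lower() and speaker.strip() not in found:
            found.append(speaker.strip())
    return found


# Abridged excerpt of the meeting transcript, used only for illustration.
excerpt = """
Tom: Sarah, what does your team have?
Sarah: The data suggests "Simply Better" is resonating most with our test groups.
Alex: I agree. "Innovation for Tomorrow" is too generic.
"""

print(speakers_of("resonating most with our test groups", excerpt))
print(speakers_of("too generic", excerpt))
```

A check like this does not fix the summary, but it flags quotes whose speaker should be verified by hand.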

In [4]:
# Cell 4: Task 2 - Extract Action Items (Using a Simple Prompt)

print("\nLoading FLAN-T5 model with a new prompt...")

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
# from_pt=True converts the checkpoint's PyTorch weights to TensorFlow on load.
model = TFT5ForConditionalGeneration.from_pretrained("google/flan-t5-base", from_pt=True)

# Simplified prompt: no "Format as..." output instructions,
# just a direct question followed by the transcript.
prompt = f"""
What are the assigned tasks for this transcript.

Transcript:
{MEETING_TRANSCRIPT}

Assigned Tasks:
"""

# Tokenize the prompt (truncated to at most 1024 tokens), then generate with beam search.
inputs = tokenizer.encode(prompt, return_tensors="tf", max_length=1024, truncation=True)

outputs = model.generate(
    inputs, 
    max_length=200, 
    num_beams=4,
    early_stopping=True
)

# Decode the output
action_items_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("\n--- ✅ ACTION ITEMS ---")
print(action_items_text)


Loading FLAN-T5 model with a new prompt...


You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565





TensorFlow and JAX classes are deprecated and will be removed in Transformers v5. We recommend migrating to PyTorch classes or pinning your version of Transformers.
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFT5ForConditionalGeneration: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight']
- This IS expected if you are initializing TFT5ForConditionalGeneration from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFT5ForConditionalGeneration from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFT5ForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can al


--- ✅ ACTION ITEMS ---
Tom, Sarah, Alex, Alex and Tom will finalize the new slogan for the Q4 launch. Sarah will have the final design assets sent over by end-of-day Friday. Alex will coordinate with the legal team to get the trademark paperwork started for "Simply Better."
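
The model returns the action items as one free-text string. To turn that string into a list with one owner per item, it can be split into sentences and matched against a "Name will ..." pattern. The sketch below assumes that phrasing, which holds for this output but is not guaranteed in general; `extract_action_items` is a hypothetical helper, and sentences that do not start with a single name are kept with `owner` set to `None`.

```python
import re

# Split after sentence-ending punctuation, optionally followed by a closing quote.
SENTENCE_SPLIT = re.compile(r'(?<=[.!?])\s+|(?<=[.!?]")\s+')
# Assumed phrasing: "<Name> will <task>".
OWNER_PATTERN = re.compile(r"([A-Z]\w+) will (.+)")


def extract_action_items(text):
    """Hypothetical helper: return a list of {'owner', 'task'} dicts."""
    items = []
    for sentence in SENTENCE_SPLIT.split(text.strip()):
        sentence = sentence.strip()
        if not sentence:
            continue
        match = OWNER_PATTERN.match(sentence)
        if match:
            items.append({"owner": match.group(1), "task": match.group(2)})
        else:
            # No single leading owner found: keep the sentence for manual review.
            items.append({"owner": None, "task": sentence})
    return items


# The action-item text printed by the cell above.
sample_output = (
    "Tom, Sarah, Alex, Alex and Tom will finalize the new slogan for the Q4 launch. "
    "Sarah will have the final design assets sent over by end-of-day Friday. "
    "Alex will coordinate with the legal team to get the trademark paperwork "
    'started for "Simply Better."'
)

for item in extract_action_items(sample_output):
    print(item["owner"], "->", item["task"])
```

The first sentence lists several names, so it falls into the unassigned bucket rather than being credited to one person.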


In [2]:
# Cell 6: Full Pipeline (Audio -> Text -> Summary) - Corrected for Long Audio

# --- 1. IMPORTS ---
from transformers import pipeline
import librosa

# --- 2. LOAD AUDIO ---
AUDIO_FILE_PATH = "A1-044-LYRA-WHERE-DO-YOU-GO-IN-THE-MORNING.mp3"

print(f"Loading audio file: {AUDIO_FILE_PATH}...")

try:
    input_audio_array, sample_rate = librosa.load(AUDIO_FILE_PATH, sr=16000)
    print("Audio loaded and resampled to 16kHz successfully.")
except Exception as e:
    print(f"Error loading audio file. Make sure '{AUDIO_FILE_PATH}' is in the same directory.")
    print(f"Error: {e}")
    raise

# --- 3. TASK 1: AUDIO-TO-TEXT (ASR) ---
print("\nLoading Whisper ASR model...")

asr_pipeline = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base"
)

print("Transcribing audio... (This may take a moment)")

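# chunk_length_s=30 makes the pipeline split long audio into 30-second
# chunks, since Whisper processes at most 30 seconds of audio at a time.
# A raw array is assumed to already be at the model's 16 kHz sampling rate,
# which is why the audio was resampled on load.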
transcribed_output = asr_pipeline(input_audio_array, chunk_length_s=30)
transcribed_text = transcribed_output["text"]

print("\n--- ✅ TRANSCRIBED TEXT ---")
print(transcribed_text)


# --- 4. TASK 2: TEXT-TO-SUMMARY ---
print("\nLoading summarization model...")

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

summary_output = summarizer(
    transcribed_text, 
    max_length=150,
    min_length=30, 
    do_sample=False
)

print("\n--- ✅ FINAL SUMMARY (FROM AUDIO) ---")
print(summary_output[0]['summary_text'])

Loading audio file: A1-044-LYRA-WHERE-DO-YOU-GO-IN-THE-MORNING.mp3...
Audio loaded and resampled to 16kHz successfully.

Loading Whisper ASR model...


Device set to use cpu


Transcribing audio... (This may take a moment)


Using custom `forced_decoder_ids` from the (generation) config. This is deprecated in favor of the `task` and `language` flags/config options.
Transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English. This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`. See https://github.com/huggingface/transformers/pull/28687 for more details.



--- ✅ TRANSCRIBED TEXT ---
 Hello, my name is Lura. I am from Kosovo. My question is, where do you go in the morning? I don't go anywhere in the morning. I like to stay home. In the morning, I always sleep. I am not a morning person. I am a night person. I wake up very late. I love to stay home in the morning. I like my mornings, peaceful and quiet. What about you? Where do you go in the morning?

Loading summarization model...


Device set to use cpu
Your max_length is set to 150, but your input_length is only 100. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)



--- ✅ FINAL SUMMARY (FROM AUDIO) ---
Lura, from Kosovo, says she is not a morning person. "I like my mornings, peaceful and quiet," she says. Where do you go in the morning?
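
The log above warns that `max_length=150` exceeds the 100-token input. One way to avoid that is to derive the length bounds from the input size instead of hard-coding them. The sketch below is an assumption rather than part of the original notebook: `summary_length_bounds` is a hypothetical helper, and the 0.5 ratio simply mirrors the `max_length=50` suggestion in the warning.

```python
def summary_length_bounds(input_tokens, ratio=0.5, max_cap=150, min_floor=30):
    """Hypothetical helper: pick (min_length, max_length) from the input size.

    max_length is a fraction of the input length, capped at max_cap;
    min_length never exceeds half of max_length.
    """
    if input_tokens < 1:
        raise ValueError("input_tokens must be at least 1")
    max_length = max(2, min(max_cap, int(input_tokens * ratio)))
    min_length = max(1, min(min_floor, max_length // 2))
    return min_length, max_length


# 100 is the input length reported in the warning above.
min_len, max_len = summary_length_bounds(100)
print(f"min_length={min_len}, max_length={max_len}")
```

In the notebook, the token count could be taken from the pipeline's own tokenizer, for example `len(summarizer.tokenizer(transcribed_text)["input_ids"])`, and the two returned values passed as `min_length` and `max_length`.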
