# Team Project: Concierge, the AI-powered meeting assistant

Forget AI code generation, autocomplete and automated testing: what developers and SMEs really need is fewer hours spent in meetings. _Concierge_ is an AI-enabled meeting summarizer and action-item tracker that ensures key takeaways are recorded, summarized and communicated effectively, without the overhead of a dedicated note-taker. Our service will leverage large language models and video analysis tools to generate meeting minutes, notify stakeholders and communicate achievements, roadblocks and deliverables to the right people at the right time.

## Extracting audio from meeting video

In [None]:
# For demonstration purposes, we're using an actual meeting from the team at GitLab, a good example of a typical corporate conference call.
video_url = "https://www.youtube.com/watch?v=qGFoZ8yodc4"

In [None]:
from pytubefix import YouTube
from pytubefix.cli import on_progress
 
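video = YouTube(video_url, on_progress_callback=on_progress)
# Since we only need the audio, the lowest-resolution stream saves time and space.
# Passing filename explicitly avoids overwriting video.title just to control the output name.
video.streams.get_lowest_resolution().download(filename="meeting_video.mp4")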

 ↳ |████████████████████████████████████████████| 100.0%

'/Users/avenugopal/Desktop/UB Work/MGS 636 Applied AI for Managers/Team Project/meeting_video.mp4'

In [None]:
from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_audio

audio_path = "meeting_audio.mp3"
ffmpeg_extract_audio("meeting_video.mp4", audio_path)
print("audio saved at:", audio_path)

MoviePy - Running:
>>> /Users/avenugopal/Desktop/UB Work/MGS 636 Applied AI for Managers/.venv/lib/python3.11/site-packages/imageio_ffmpeg/binaries/ffmpeg-macos-aarch64-v7.1 -y -i meeting_video.mp4 -ab 3000k -ar 44100 meeting_audio.mp3
MoviePy - Command successful
audio saved at: meeting_audio.mp3


## Audio Transcription using Whisper

In [8]:
from pyannote.audio import Pipeline
import whisper
from pydub import AudioSegment

In [9]:
model = whisper.load_model("turbo")

### Direct Transcription

In [10]:
result = model.transcribe("meeting_audio.mp3")



In [33]:
import textwrap

for line in textwrap.wrap(result["text"], width=140):
    print(line)

 Hi, this is Eric Johnson. It's February 18th, 2021, and this is the engineering key review at GitLab. So I've got number four in the
agenda, which is a proposal to break up this meeting into four department key reviews. So currently this is engineering, development,
quality, security, and UX. Infrastructure and support do their own key reviews already. I have the reasons why increased visibility, able to
go deeper, increase the objectivity with which my reports can manage their groups, allow me more time to focus on new markets, and allow me
to shift into more of a question asker mode than generating content and answering questions in these meetings. And, but to avoid adding
three net new meetings to stakeholders calendars, I propose we do a sort of two month rotation. So month one, development quality go, month
two, security and UX would go. How do people feel about that proposal? I think in the group conversations, it's working really well. So I'm,
I'm supportive. And this is the sm
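Besides the full `text`, the dictionary returned by `model.transcribe` also carries a `segments` list with `start`, `end` and `text` for each segment, which is handy when minutes need to point back to a moment in the recording. The following is a minimal sketch of printing a timestamped transcript. The helper names and the `sample_result` timings are hypothetical stand-ins so the snippet runs without the model; in the notebook you would pass `result` instead.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def timestamped_lines(transcription):
    """Yield one '[start - end] text' line per Whisper segment."""
    for seg in transcription.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        yield f"[{start} - {end}] {seg['text'].strip()}"


# Hypothetical stand-in for `result`; the timings below are made up for illustration.
sample_result = {
    "text": " Hi, this is Eric Johnson. It's February 18th, 2021.",
    "segments": [
        {"start": 0.0, "end": 2.4, "text": " Hi, this is Eric Johnson."},
        {"start": 2.4, "end": 5.1, "text": " It's February 18th, 2021."},
    ],
}

for line in timestamped_lines(sample_result):
    print(line)
```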

### NLP using spaCy: Keyword/Keyphrase extraction for contextual clues

In [24]:
import spacy
from spacy.matcher import PhraseMatcher
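from pprint import pprint

nlp = spacy.load("en_core_web_md")

def initialize_matcher(phrase_dict):
    """Build a PhraseMatcher with one named pattern group per domain."""
    phrase_matcher = PhraseMatcher(nlp.vocab)
    for name, phrase_list in phrase_dict.items():
        # make_doc only tokenizes, which is all exact-text patterns need
        patterns = [nlp.make_doc(text) for text in phrase_list]
        # spaCy v3 signature: add(key, docs); the v2 form add(key, None, *docs) is legacy
        phrase_matcher.add(name, patterns)
        print(f"Added {len(phrase_list)} phrases to PhraseMatcher '{name}'.")
    return phrase_matcher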

In [None]:
phrases = {
    'engineering': ['engineering', 'engineer', 'code', 'release', 'developer', 'infra'],
    'leadership': ['senior leadership', 'c-suite', 'cxo', 'ceo', 'cto', 'ciso', 'manager','upper management', 'stakeholders'],
    'sales': ['sales target', 'client-facing']
}

corpus = str(result["text"])
phrase_matcher = initialize_matcher(phrases)

doc = nlp(corpus)
output = {domain: [] for domain in phrases.keys()}
for sent in doc.sents:
    for match_id, start, end in phrase_matcher(nlp(sent.text)):
        output[nlp.vocab.strings[match_id]].append(sent.text)

Added 6 phrases to PhraseMatcher 'engineering'.
Added 9 phrases to PhraseMatcher 'leadership'.
Added 2 phrases to PhraseMatcher 'sales'.
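The matching loop appends a sentence once per match, so a sentence containing several listed phrases can land in `output` more than once. Here is a small sketch of cleaning that up before the matches are used as tags. The helper names and `sample_output` are hypothetical; in the notebook you would pass `output` instead.

```python
def unique_in_order(items):
    """Drop repeated items while keeping first-seen order."""
    return list(dict.fromkeys(items))


def summarize_matches(matches_by_domain):
    """Return (deduplicated matches, tags) for a {domain: [sentence, ...]} dict.

    A domain becomes a tag only if at least one sentence matched it.
    """
    deduped = {
        domain: unique_in_order(sentences)
        for domain, sentences in matches_by_domain.items()
    }
    tags = [domain for domain, sentences in deduped.items() if sentences]
    return deduped, tags


# Hypothetical stand-in for `output`, with one sentence matched twice.
sample_output = {
    "engineering": [
        "So currently this is engineering, development, quality, security, and UX.",
        "So currently this is engineering, development, quality, security, and UX.",
    ],
    "leadership": [],
    "sales": [],
}

deduped, tags = summarize_matches(sample_output)
print(tags)
print(len(deduped["engineering"]))
```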


In [40]:
pprint(output)

{'engineering': ["It's February 18th, 2021, and this is the engineering key "
                 'review at GitLab.',
                 'So currently this is engineering, development, quality, '
                 'security, and UX.',
                 "And then I've got number five, which is we've got R and D "
                 'overall MR rate, and we also have R and D wider MR rate, '
                 'both as top level KPIs for engineering.',
                 'Um, and so I think the DRI needs to be your kind of data '
                 "engineering team, but of course there's a dependency on "
                 'infrastructure.',
                 "Um, but one of the most specific actions, uh, we're going to "
                 'take though is separating out and having a dedicated host so '
                 "that we're just dealing with the profile of the data "
                 'engineering traffic on there.',
                 "And I'll tell you what I'll put into the infra key review "
   

In [44]:
import requests
import json

url = "http://localhost:8002/rag/add_documents/"
headers = {"Content-Type": "application/json"}
meeting_information  = [
    {
        "text": result['text'],
        "source": "Zoom Meetings (Recorded)",
        "date": "2024-05-10T14:23:45.123Z",
        "meeting_id": "8a7fd89912e",
        "participants": ["Sid", "Eric Johnson"],
        "tags": [domain for domain in output.keys() if output[domain]]
    }
]
response = requests.post(url, headers=headers, data=json.dumps(meeting_information))

if response.status_code == 200:
    print(response.json())
else:
    print(f"Request failed with status code: {response.status_code}")

{'message': '1 new documents added successfully'}


In [52]:
!curl -X POST "http://localhost:8002/rag/query" -H "Content-Type: application/json" -d '{"query": "What are action items for Sid?", "source": "Zoom Meetings (Recorded)"}'

{"answer":"Based on the transcript, Sid’s action items are to focus on improving key indicators related to productivity and quality while holding the line at a target of 10 for MR (Monthly Rate). He also needs to address security, quality, availability, and other factors to prevent the narrow MR rate from dipping. Essentially, Sid needs to prioritize maintaining current productivity levels while proactively addressing potential issues to ensure long-term success and avoid a downward trend in MR."}
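The same query can be issued from Python, which makes it easier to loop over participants or post-process answers. This is a rough sketch using only the standard library. It assumes the endpoint accepts the `query` and `source` fields exactly as in the curl call and returns JSON with an `answer` key; the function names are hypothetical, and only the request-building part is exercised here because sending it needs the local service to be running.

```python
import json
import urllib.request

RAG_QUERY_URL = "http://localhost:8002/rag/query"  # endpoint taken from the curl call


def build_query_request(question, source, url=RAG_QUERY_URL):
    """Build (but do not send) a JSON POST request for the RAG query endpoint."""
    body = json.dumps({"query": question, "source": source}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_rag(question, source, timeout=30):
    """Send the query and return the 'answer' field (assumed response shape)."""
    request = build_query_request(question, source)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["answer"]


request = build_query_request(
    "What are action items for Sid?", "Zoom Meetings (Recorded)"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With the service up, `query_rag("What are action items for Sid?", "Zoom Meetings (Recorded)")` would return the answer string on its own.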

### Transcription with Speaker Diarization (requires compute)

In [None]:
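import numpy as np
from pydub import AudioSegment

def extract_audio_chunk(audio_path, start_time, end_time):
    """
    Extract a segment of an audio file.

    Args:
        audio_path (str): Path to the input audio file (.wav, .mp3, etc.)
        start_time (float): Start time in seconds.
        end_time (float): End time in seconds.

    Returns:
        AudioSegment: Extracted audio chunk.
    """
    audio = AudioSegment.from_file(audio_path)
    start_ms = int(start_time * 1000)
    end_ms = int(end_time * 1000)
    return audio[start_ms:end_ms]

def to_whisper_input(chunk):
    """Convert an AudioSegment to the 16 kHz mono float32 array that Whisper accepts."""
    chunk = chunk.set_frame_rate(16000).set_channels(1).set_sample_width(2)
    samples = np.array(chunk.get_array_of_samples(), dtype=np.float32)
    return samples / 32768.0  # scale 16-bit PCM to [-1.0, 1.0)

HF_TOKEN = "hf_YOUR_TOKEN_GOES_HERE"
# The version warnings printed for this checkpoint suggest it predates pyannote.audio 3.x;
# a newer checkpoint such as "pyannote/speaker-diarization-3.1" may be a better fit.
diarization_pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", use_auth_token=HF_TOKEN)

# Step 1: Diarize
diarization = diarization_pipeline("meeting_audio.mp3")

# Step 2: Cut and transcribe each speaker segment (first 10 segments only)
for i, (segment, track, speaker) in enumerate(diarization.itertracks(yield_label=True), start=1):
    audio_chunk = extract_audio_chunk("meeting_audio.mp3", segment.start, segment.end)
    # model.transcribe() takes a file path or a waveform array, not an AudioSegment
    transcription = model.transcribe(to_whisper_input(audio_chunk))
    print(f"Segment {i}:: [{speaker}]: {transcription['text'].strip()}")
    if i >= 10:
        break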

Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.5.1.post0. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint ../../../../.cache/torch/pyannote/models--pyannote--segmentation/snapshots/c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b/pytorch_model.bin`


Model was trained with pyannote.audio 0.0.1, yours is 3.3.2. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.7.0. Bad things might happen unless you revert torch to 1.x.




KeyboardInterrupt: 
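Transcribing every diarized chunk separately is expensive. One lighter alternative, sketched here under stated assumptions, is to transcribe the whole file once and then label each Whisper segment with the speaker whose diarization turn overlaps it the most. This is a swapped-in technique, not the approach used in the cell above. The function names, sample timings and speaker assignments are hypothetical; in practice `whisper_segments` would be `result["segments"]` and `speaker_turns` would be built from `diarization.itertracks(yield_label=True)` as `(segment.start, segment.end, speaker)` tuples.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(whisper_segments, speaker_turns):
    """Label each transcript segment with the speaker that overlaps it the most.

    whisper_segments: list of dicts with 'start', 'end', 'text'
    speaker_turns: list of (start, end, speaker) tuples
    Returns a list of (speaker, text) pairs; 'UNKNOWN' if no turn overlaps.
    """
    labelled = []
    for seg in whisper_segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn_start, turn_end, speaker in speaker_turns:
            shared = overlap(seg["start"], seg["end"], turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labelled.append((best_speaker, seg["text"].strip()))
    return labelled


# Hypothetical sample data; timings and speaker labels are made up for illustration.
sample_segments = [
    {"start": 36.0, "end": 39.5, "text": " How do people feel about that proposal?"},
    {"start": 39.5, "end": 44.0, "text": " I think in the group conversations, it's working really well."},
    {"start": 50.0, "end": 52.0, "text": " So I'm supportive."},
]
sample_turns = [(0.0, 40.0, "SPEAKER_00"), (40.0, 45.0, "SPEAKER_01")]

for speaker, text in assign_speakers(sample_segments, sample_turns):
    print(f"[{speaker}]: {text}")
```

Segments that fall outside every diarization turn are marked `UNKNOWN` rather than guessed, so gaps in the diarization stay visible in the minutes.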