Add audio + video recording #834

Merged
merged 17 commits into from
Mar 11, 2024

Conversation

jlowin
Member

@jlowin jlowin commented Feb 7, 2024

This PR starts adding tools for working with audio (and video!) as easily as text. The marvin.audio module can record and play audio, and is fully compatible with the marvin.ai.audio functions such as transcribe and speak. marvin.video supports intermittent frame capture that is fully compatible with the marvin.beta.vision functions such as caption, cast, extract, and others.

To record and play audio, the optional audio dependencies must be installed:
pip install "marvin[audio]"
Note that transcription and speaking work without these dependencies, but Marvin cannot record or play back audio directly without them.

For video:
pip install "marvin[video]"

Example of spoken conversation with an AI assistant:

import marvin
import marvin.audio

ai = marvin.beta.Application()

while True:
    # record the user and transcribe the audio
    user_audio = marvin.audio.record_phrase()
    user_text = marvin.transcribe(user_audio)
    print(f"User: {user_text}")
    
    # send text to AI assistant
    ai_response = ai.say(user_text)
    ai_text = ai_response[0].content[0].text.value
    print(f"AI: {ai_text}")
    
    # speak the AI response out loud
    ai_audio = marvin.speak(ai_text)
    ai_audio.play()
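
The loop above runs until it is interrupted. One way to end it by voice is to check each transcript for a stop phrase before sending it to the assistant. This is only a minimal sketch and not part of Marvin: `should_exit` and `EXIT_PHRASES` are hypothetical names chosen for illustration.

```python
import string

# Hypothetical stop phrases; adjust to taste.
EXIT_PHRASES = ("goodbye", "stop listening")


def should_exit(user_text: str, phrases=EXIT_PHRASES) -> bool:
    """Return True if the transcript contains one of the stop phrases."""
    # Lowercase and strip punctuation so "Goodbye!" matches "goodbye".
    cleaned = user_text.lower().translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace so multi-word phrases match reliably.
    normalized = " ".join(cleaned.split())
    return any(phrase in normalized for phrase in phrases)
```

In the loop, adding `if should_exit(user_text): break` right after the transcription step would end the conversation.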

Live transcription:

import marvin.audio

recorder = marvin.audio.record_background()

# iterate over the audio clips
for clip in recorder.stream():
    print(marvin.transcribe(clip))

    # some_condition() is a placeholder for your own stop check
    if some_condition():
        recorder.stop()
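
`some_condition()` in the example above is a placeholder. As one possible stand-in, the sketch below builds a time-based stop check using only the standard library. `stop_after` is a hypothetical helper, not a Marvin function; the `clock` parameter exists only so the check can be tested without waiting.

```python
import time
from typing import Callable


def stop_after(
    seconds: float, clock: Callable[[], float] = time.monotonic
) -> Callable[[], bool]:
    """Return a zero-argument check that becomes True once `seconds` have elapsed."""
    deadline = clock() + seconds

    def some_condition() -> bool:
        # Compare the current clock reading against the fixed deadline.
        return clock() >= deadline

    return some_condition
```

Calling `some_condition = stop_after(30)` before the loop would make the example stop the recorder roughly 30 seconds after it starts.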

Live captioning from webcam:

import marvin.video

recorder = marvin.video.record_background()

# iterate over frames
for frame in recorder.stream():
    print(marvin.caption(frame))

    # some_condition() is a placeholder for your own stop check
    if some_condition():
        recorder.stop()

@jlowin jlowin changed the title Audio utilities + experimental live transcription support Add audio + video recording Feb 16, 2024
@jlowin jlowin marked this pull request as ready for review March 10, 2024 17:09
Collaborator

@zzstoatzz zzstoatzz left a comment

super cool!

this closes #861

@zzstoatzz zzstoatzz merged commit 9601033 into main Mar 11, 2024
15 checks passed
@zzstoatzz zzstoatzz deleted the audio branch March 11, 2024 15:45

darinkishore commented Apr 2, 2024

Hey, if anyone runs into an error (like I did) while installing marvin[audio], run pip install --upgrade setuptools wheel first.

Also, if you don't have portaudio installed, install it so that Python can build the PyAudio wheels! On macOS, brew install portaudio works.
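
To confirm that the build succeeded, a quick check of whether PyAudio is importable can help. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper, and the assumption that `pyaudio` (PyAudio's import name) is the module to look for comes from the PyAudio mention above rather than from Marvin's dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["pyaudio"])` returning an empty list suggests the PyAudio wheel was built and installed in the current environment.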
