TranscripterAI is a Python pipeline that converts audio recordings of meetings or conversations into text and analyzes the result with AI. The script performs two stages:
- Transcription: the audio file is converted into text using OpenAI Whisper (locally).
- Analysis: the transcript is analyzed with the Gemini API to identify key topics, technical skills, and the candidate's strengths and weaknesses, and to evaluate answers to technical questions.
- Local audio transcription using Whisper.
- Interview analysis highlighting key aspects.
- Flexible configuration of analysis prompts.
- Support for various audio formats via ffmpeg.
All installation steps are done via the command line (press Win+S, type cmd).
- Download the latest version of Python from the official website: https://www.python.org/downloads/.
- During installation, make sure to check Add Python to PATH.
- Verify the installation:
python --version
pip --version
- Download the ffmpeg build: https://ffmpeg.org/download.html
- Extract the archive, for example to C:\ffmpeg.
- Add the path C:\ffmpeg\bin to the system PATH variable:
  - Open Control Panel: press Win + S, type Control Panel, and open it.
  - Go to System and Security → System → Advanced system settings (on the right).
  - In the System Properties window, click Environment Variables…
  - In System variables, find the variable Path and click Edit…
  - Click New and add the path to the bin folder of ffmpeg: C:\ffmpeg\bin
  - Click OK in all windows to save the changes.
- Verify the installation:
ffmpeg -version
- Install Whisper and PyTorch via pip:
pip install openai-whisper
pip install torch
- Verify the installation:
whisper --help
pip show torch
pip install can be very slow, so be patient.
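With Whisper installed, the transcription stage can also be tried from Python directly. The sketch below is an illustration, not the project's own code: `whisper.load_model` and `model.transcribe` come from the openai-whisper package, while the `transcribe_audio` helper and the `WHISPER_MODELS` tuple are names assumed for this example.

```python
from pathlib import Path

# Common Whisper model size names; extend the tuple if you use other variants.
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def transcribe_audio(audio_path: str, model_name: str = "tiny") -> str:
    """Return the transcript of audio_path using a local Whisper model."""
    if model_name not in WHISPER_MODELS:
        raise ValueError(f"Unknown Whisper model: {model_name!r}")
    if not Path(audio_path).is_file():
        raise FileNotFoundError(audio_path)

    # Imported lazily because loading PyTorch takes a noticeable moment.
    import whisper

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(audio_path)   # audio is decoded through ffmpeg
    return result["text"].strip()
```

Calling `transcribe_audio` with the path to a recording and the model name "tiny" returns the transcript as a string. Larger models are slower but usually more accurate.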
- Obtain an API key via Google AI Studio:
  - Go to the API key creation page: https://aistudio.google.com/apikey
  - Log in with your Google account.
  - Create a new API key.
  - Select an existing project or create a new one.
  - Confirm the key creation.
  - Copy and save the key securely, as it will be shown only once.
- Install the Python SDK for Gemini API:
pip install -q -U google-genai
- Verify the installation:
pip show google-genai
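The individual verification commands above can be combined into one check. This is a minimal sketch that uses only the Python standard library. The `check_environment` and `has_module` names are made up for this example, and the check only confirms that ffmpeg is on PATH and that the packages can be found, not that they work.

```python
import importlib.util
import shutil


def has_module(name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example "google") is missing.
        return False


def check_environment() -> dict:
    """Report which prerequisites from the installation steps are available."""
    return {
        "ffmpeg": shutil.which("ffmpeg") is not None,  # found on PATH?
        "whisper": has_module("whisper"),              # pip install openai-whisper
        "torch": has_module("torch"),                  # pip install torch
        "google-genai": has_module("google.genai"),    # pip install google-genai
    }


for name, ok in check_environment().items():
    print(f"{name:<14}{'OK' if ok else 'MISSING'}")
```

If ffmpeg is reported as missing right after you edit PATH, open a new command prompt so the updated variable is picked up.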
- Clone this repository (or download it as a ZIP).
- Open the project using IntelliJ IDEA or another IDE. Notepad++ can also be used.
- Edit the config.py file with the following parameters:
Your Gemini API key:
GEMINI_API_KEY = "YOUR_KEY"
Feature flags:
transcription_flag = True # activates local transcription via WhisperAI
analysis_flag = True # activates analysis via Gemini API
Choose the transcription model. Model names can be found in the table inside config.py:
MODEL_NAME = "tiny"
Set the prompt for AI analysis:
PROMPT_ANALYSIS = """YOUR PROMPT"""
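To show how the key and prompt configured above might be used, here is a hedged sketch of the analysis call. `genai.Client` and `client.models.generate_content` are part of the google-genai SDK. The `build_request` and `analyze_transcript` helpers, the "Transcript:" separator, and the default model name are assumptions for illustration, and the project's actual code may differ.

```python
def build_request(prompt: str, transcript: str) -> str:
    """Join the analysis prompt and the transcript into one request text."""
    return f"{prompt.strip()}\n\nTranscript:\n{transcript.strip()}"


def analyze_transcript(
    transcript: str,
    api_key: str,
    prompt: str,
    model: str = "gemini-2.5-flash",  # assumed; use any model your key can access
) -> str:
    """Send the prompt and transcript to the Gemini API and return the reply."""
    # Imported lazily so build_request can be used without the SDK installed.
    from google import genai

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(
        model=model,
        contents=build_request(prompt, transcript),
    )
    return response.text
```

With the values from config.py, the call would look like `analyze_transcript(text, GEMINI_API_KEY, PROMPT_ANALYSIS)`.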
You can run it through the IDE or directly by executing the file.
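The two feature flags suggest a run that dispatches to each stage in turn. The following is only a sketch of that idea: `run_pipeline` and its parameters are hypothetical, the stages are passed in as plain functions so that the example runs without Whisper or an API key, and the handling of an analysis-only run is an assumption.

```python
from typing import Callable, Optional


def run_pipeline(
    audio_path: str,
    transcription_flag: bool,
    analysis_flag: bool,
    transcribe: Callable[[str], str],
    analyze: Callable[[str], str],
    existing_transcript: Optional[str] = None,
) -> dict:
    """Run the enabled stages and return their outputs keyed by stage."""
    results = {}
    transcript = existing_transcript
    if transcription_flag:
        transcript = transcribe(audio_path)
        results["transcript"] = transcript
    if analysis_flag:
        if transcript is None:
            # Assumed behavior: analysis cannot run without some transcript.
            raise ValueError("Enable transcription or supply a transcript.")
        results["analysis"] = analyze(transcript)
    return results


# Stand-in stage functions so the sketch runs on its own.
demo = run_pipeline(
    "meeting.mp3",
    transcription_flag=True,
    analysis_flag=True,
    transcribe=lambda path: f"text of {path}",
    analyze=lambda text: text.upper(),
)
print(demo)
```

Passing the stages in as functions keeps the flag logic easy to test. In a real run they would wrap the Whisper and Gemini calls.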
Any audio format that ffmpeg can decode is supported, for example MP3, WAV, or M4A.
The output files will be saved in the same folder as your audio file. After running the program, you will get three files:
- Original audio file
- Transcript file
- Transcript analysis file (labeled as Gemini)
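One way to place the output files next to the audio file is to derive their paths with pathlib. This sketch is illustrative only: the `output_paths` helper and the `_transcript.txt` and `_Gemini.txt` suffixes are assumptions, and the names the script actually writes may differ.

```python
from pathlib import Path


def output_paths(audio_path: str) -> dict:
    """Build output file paths in the same folder as the audio file."""
    audio = Path(audio_path)
    return {
        # Hypothetical naming scheme: <audio name> plus a suffix per stage.
        "transcript": audio.with_name(f"{audio.stem}_transcript.txt"),
        "analysis": audio.with_name(f"{audio.stem}_Gemini.txt"),
    }


paths = output_paths("recordings/meeting.mp3")
print(paths["transcript"].name)
print(paths["analysis"].name)
```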