anton418788/TranscripterAI

TranscripterAI

README in Russian

TranscripterAI is a Python pipeline that converts audio recordings of meetings or interviews into text and then analyzes the text with AI. The script runs in two stages:

  1. Transcription: the audio file is converted into text locally using OpenAI Whisper.
  2. Analysis: the transcript is sent to the Gemini API, which identifies key topics, technical skills, and the candidate's strengths and weaknesses, and evaluates the answers to technical questions.
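
The two stages can be sketched roughly as follows. This is an illustrative outline, not the code in main.py: the function names are hypothetical, and the Gemini model name "gemini-2.5-flash" is an assumption, since this README does not say which model the project calls. The third-party imports are done lazily so the sketch loads even before Whisper and the Gemini SDK are installed.

```python
from typing import Callable, Optional, Tuple


def transcribe_with_whisper(audio_path: str, model_name: str = "tiny") -> str:
    """Stage 1: local transcription with the openai-whisper package."""
    import whisper  # lazy import: only needed when this stage actually runs

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"]


def analyze_with_gemini(transcript: str, prompt: str, api_key: str,
                        model: str = "gemini-2.5-flash") -> str:
    """Stage 2: send the prompt and transcript to the Gemini API."""
    from google import genai  # lazy import from the google-genai SDK

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(
        model=model,  # assumed model name, not taken from the project
        contents=f"{prompt}\n\n{transcript}",
    )
    return response.text


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],
    analyze: Optional[Callable[[str], str]] = None,
) -> Tuple[str, Optional[str]]:
    """Run transcription, then analysis if an analyzer is supplied."""
    transcript = transcribe(audio_path)
    analysis = analyze(transcript) if analyze is not None else None
    return transcript, analysis
```

Passing the stages in as callables keeps the control flow easy to try out with stand-in functions before any model or API key is available.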

Features

  • Local audio transcription using Whisper.
  • Interview analysis highlighting key aspects.
  • Flexible configuration of analysis prompts.
  • Support for various audio formats via ffmpeg.

Installation

All installation steps are run from the Windows command line (press Win+S, type cmd, and press Enter).

1. Install Python

  1. Download the latest version of Python from the official website: https://www.python.org/downloads/.
  2. During installation, make sure to check Add Python to PATH.
  3. Verify the installation:
python --version
pip --version

2. Install ffmpeg

  1. Download the ffmpeg build: https://ffmpeg.org/download.html

  2. Extract the archive, for example to C:\ffmpeg.

  3. Add the path C:\ffmpeg\bin to the system PATH variable:

    1. Open Control Panel: press Win + S, type Control Panel, and open it.
    2. Go to System and Security → System → Advanced system settings (on the right).
    3. In the System Properties window, click Environment Variables…
    4. In System variables, find the variable Path and click Edit…
    5. Click New and add the path to the bin folder of ffmpeg:
      C:\ffmpeg\bin
      
    6. Click OK in all windows to save the changes.
  4. Verify the installation:

ffmpeg -version
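
The same check can be done from Python, which is handy if you want a script to fail early when ffmpeg is missing from PATH. This is a minimal sketch using only the standard library; the helper name tool_version is hypothetical and not part of TranscripterAI.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(name: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<name> -version`, or None if not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    completed = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = completed.stdout.splitlines()
    return lines[0] if lines else ""
```

Calling tool_version() returns None when ffmpeg cannot be found, which usually means the PATH change has not taken effect yet; opening a new command prompt picks up the updated variable.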

3. Install OpenAI Whisper

  1. Install Whisper and PyTorch via pip:
pip install openai-whisper
pip install torch
  2. Verify the installation:
whisper --help
pip show torch

⚠️ Note: the packages are large (PyTorch in particular), so pip install can take a long time. Be patient.
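
As an alternative to pip show, installed versions can be read from Python with the standard importlib.metadata module. The helper below is a hypothetical convenience, not part of the project; it reports None for any package that is not installed.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

For example, installed_versions(["openai-whisper", "torch"]) shows at a glance whether both packages from this step are present.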

4. Set up Gemini API

  1. Obtain an API key via Google AI Studio:

    1. Go to the API key creation page: https://aistudio.google.com/apikey
    2. Log in with your Google account.
    3. Create a new API key.
    4. Select an existing project or create a new one.
    5. Confirm the key creation.
    6. Copy and save the key securely, as it will be shown only once.
  2. Install the Python SDK for Gemini API:

pip install -q -U google-genai
  3. Verify the installation:
pip show google-genai

5. Install and Configure TranscripterAI

  1. Clone this repository (or download it as a ZIP).

  2. Open the project in an IDE such as IntelliJ IDEA, or in a plain text editor such as Notepad++.

  3. Edit the config.py file with the following parameters:

    Your Gemini API key:

    GEMINI_API_KEY = "YOUR_KEY"

    Feature flags:

    transcription_flag = True  # activates local transcription via Whisper
    analysis_flag = True       # activates analysis via Gemini API

    Choose the transcription model. Model names can be found in the table inside config.py:

    MODEL_NAME = "tiny"

    Set the prompt for AI analysis:

    PROMPT_ANALYSIS = """YOUR PROMPT"""
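
Before a long run it can help to sanity-check these values. The sketch below is not part of the project: check_config is a hypothetical helper, and KNOWN_MODELS lists only some common Whisper model names (the table inside config.py remains the reference). It flags an unrecognized model name and a placeholder API key when analysis is enabled.

```python
from typing import List

# A subset of Whisper model names; config.py's table is the authoritative list.
KNOWN_MODELS = {"tiny", "base", "small", "medium", "large", "turbo"}


def check_config(api_key: str, model_name: str, analysis_flag: bool) -> List[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems: List[str] = []
    if model_name not in KNOWN_MODELS:
        problems.append(f"Unrecognized Whisper model name: {model_name!r}")
    if analysis_flag and api_key in ("", "YOUR_KEY"):
        problems.append("analysis_flag is True but GEMINI_API_KEY is not set")
    return problems
```

A call such as check_config(GEMINI_API_KEY, MODEL_NAME, analysis_flag) at the start of a run would surface both mistakes before any audio is processed.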

Running and Output

1. Run main.py

Run it from your IDE, or from the command line in the project folder with python main.py.

2. Select an audio file

Any audio format that ffmpeg can decode is supported, for example MP3, WAV, or M4A.

3. Wait for the process to finish

The output files are saved in the same folder as your audio file. After the program finishes, that folder contains three files:

  • Original audio file
  • Transcript file
  • Transcript analysis file (labeled as Gemini)
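
Placing the outputs next to the input can be sketched with pathlib as below. The file name suffixes "_transcript.txt" and "_Gemini.txt" are assumptions for illustration; this README only says that the analysis file is labeled as Gemini, so the names the project actually writes may differ.

```python
from pathlib import Path
from typing import Tuple, Union


def output_paths(audio_file: Union[str, Path]) -> Tuple[Path, Path]:
    """Build transcript and analysis paths in the audio file's own folder."""
    audio = Path(audio_file)
    transcript = audio.with_name(f"{audio.stem}_transcript.txt")  # assumed suffix
    analysis = audio.with_name(f"{audio.stem}_Gemini.txt")        # assumed suffix
    return transcript, analysis
```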
