AI Communication Coach

This project aims to develop a platform that utilizes Artificial Intelligence (AI) to assist users in improving their communication skills for public speaking and interviews. The coach can leverage various techniques, including:

  • Speech Recognition: Capture user input through speech.
  • Text Analysis: Analyze the text content to understand sentiment, keywords, and named entities.
  • Pre-trained LLM Integration: Utilize a pre-trained Large Language Model (LLM) such as GPT-J to generate informative responses.

Index

  1. Features (In Progress)
  2. Tech Stack
  3. Getting Started
  4. Current Functionality
  5. Future Development
  6. Transformers and LLMs
  7. License

Features (In Progress)

  • Speech Recognition: Capture and analyze user speech for content and delivery.
  • Text Analysis: Process user-provided text (e.g., job descriptions, presentation outlines) to understand context and generate relevant questions.
  • Mock Interview Simulation: Simulate interview scenarios with AI-powered questions based on user input.
  • Performance Analysis: Provide feedback on speech patterns, body language (future implementation), and content organization.
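
As a rough illustration of the speech capture and delivery analysis listed above, the sketch below records one utterance with the SpeechRecognition library and computes two simple delivery metrics. The helper names (transcribe_from_microphone, words_per_minute, count_fillers), the filler-word list, and the choice of the Google Web Speech recognizer are assumptions made for illustration, not the project's actual implementation.

```python
import re
import time

# Hypothetical filler-word list; adjust it to the habits you want to track.
FILLER_WORDS = {"um", "uh", "like", "basically", "actually"}


def transcribe_from_microphone():
    """Record one utterance and return (transcript, duration_in_seconds).

    Requires the SpeechRecognition and PyAudio packages plus a microphone.
    """
    import speech_recognition as sr  # imported lazily: optional dependency

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        start = time.monotonic()
        audio = recognizer.listen(source)
        # Includes any silence before the speaker starts, so it is approximate.
        duration = time.monotonic() - start
    # recognize_google sends the audio to Google's Web Speech API (needs internet).
    return recognizer.recognize_google(audio), duration


def words_per_minute(transcript, duration_seconds):
    """Speaking rate in words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds


def count_fillers(transcript):
    """Count occurrences of filler words, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return sum(1 for word in words if word in FILLER_WORDS)
```

The two metric functions are plain Python, so they can be tried on any transcript string without a microphone.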

Tech Stack

  • Python: Programming Language
  • Jupyter Notebook: Development environment
  • SpeechRecognition: Speech Recognition
  • NLP Libraries: NLTK, spaCy, TextBlob
  • Transformers: Hugging Face Transformers for text generation
  • (Optional) OpenCV: Computer Vision for future features

Getting Started

Prerequisites

  • Python 3.x (a recent release is recommended)
  • Jupyter Notebook (Consider using Anaconda for a bundled environment)
  • Additional libraries (install using pip in your terminal or the Jupyter Notebook environment):
    • transformers
    • SpeechRecognition (imported as speech_recognition)
    • nltk
    • textblob
    • spacy
    • torch (for PyTorch backend)

Installation

  1. Clone this repository:

    git clone https://github.com/JermaineV/AI_Coach
    cd AI_Coach
  2. Install the required libraries:

    pip install -r requirements.txt
    # or install the packages individually
    pip install transformers SpeechRecognition nltk textblob spacy torch
  3. Download the necessary NLTK resources:

    import nltk
    nltk.download('punkt')
  4. (Optional) Install PyAudio, which SpeechRecognition needs for microphone input:

    pip install PyAudio
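
To confirm that the installation steps above worked, a small check like the following can be run in a notebook cell or as a script. It is a minimal sketch: the function name missing_modules is hypothetical, and the module list simply mirrors the import names of the libraries listed under Prerequisites.

```python
import importlib.util

# Import names of the libraries listed under Prerequisites.
REQUIRED_MODULES = [
    "transformers",
    "speech_recognition",
    "nltk",
    "textblob",
    "spacy",
    "torch",
]


def missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```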

Usage

  1. Open the Jupyter Notebook:

    jupyter notebook AI_coach.ipynb
  2. Ensure you have a working internet connection for initial model downloads (if applicable).

  3. Run the notebook cells (usually by pressing Shift + Enter) to execute the code.

Current Functionality

  • Speech Recognition: Capture and analyze user speech.
  • Text Analysis: Perform sentiment analysis, keyword extraction, and named entity recognition.
  • LLM Integration: Use pre-trained LLMs to generate responses based on user input.
  • Demo video showcasing current progress: Video link
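
The text analysis step listed above could look roughly like the sketch below, which combines TextBlob for sentiment and spaCy for named entities. The function names, the tiny stop-word list, and the en_core_web_sm model (installed with python -m spacy download en_core_web_sm) are assumptions for illustration; the notebook may do this differently.

```python
from collections import Counter

# Small illustrative stop-word list (an assumption; NLTK and spaCy ship fuller ones).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "i", "is", "for", "with", "my"}


def top_keywords(tokens, k=5):
    """Return the k most frequent lowercase tokens that are not stop words."""
    counts = Counter(
        token.lower()
        for token in tokens
        if token.isalpha() and token.lower() not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(k)]


def analyze_text(text, spacy_model="en_core_web_sm"):
    """Return sentiment, keywords, and named entities for text.

    Requires textblob and spacy, plus the spaCy model named in spacy_model.
    """
    import spacy  # imported lazily: optional heavy dependencies
    from textblob import TextBlob

    doc = spacy.load(spacy_model)(text)
    sentiment = TextBlob(text).sentiment
    return {
        "polarity": sentiment.polarity,          # -1.0 (negative) to 1.0 (positive)
        "subjectivity": sentiment.subjectivity,  # 0.0 (objective) to 1.0 (subjective)
        "keywords": top_keywords([token.text for token in doc]),
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    }
```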

Future Development

  • Enhance response generation using the LLM with context awareness and conversation history.
  • Integrate feedback mechanisms to improve the coach's responses over time.
  • Explore additional features like voice synthesis for coach responses or sentiment visualization.
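
One possible way to approach the conversation-history item above is to keep a rolling window of recent turns and prepend it to each prompt. The class below is a hypothetical sketch under that assumption (the name ConversationHistory, the speaker labels, and the default window size are all illustrative), not a planned design from the project.

```python
from collections import deque


class ConversationHistory:
    """Keep the most recent turns and render them as a prompt prefix."""

    def __init__(self, max_turns=6):
        # A deque with maxlen silently drops the oldest turn once it is full.
        self._turns = deque(maxlen=max_turns)

    def add(self, speaker, text):
        """Store one turn, trimming surrounding whitespace."""
        self._turns.append((speaker, text.strip()))

    def build_prompt(self, next_speaker="Coach"):
        """Render the stored turns and end with a cue for the next speaker."""
        lines = [f"{speaker}: {text}" for speaker, text in self._turns]
        lines.append(f"{next_speaker}:")
        return "\n".join(lines)
```

Bounding the window keeps the prompt within the model's context length at the cost of forgetting older turns.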

Transformers and LLMs

The project utilizes the transformers library to interact with pre-trained LLMs.

Transformers Library

Pre-trained LLM

The code snippet utilizes the EleutherAI/gpt-j-6B model. Explore the Hugging Face Model Hub for various LLMs: https://huggingface.co/models

Downloading the LLM

There are two main approaches to using pre-trained LLMs with the transformers library:

A. Using Transformers Pipeline

  • Pros:
    • Simpler setup, no need to manage model files.
  • Cons:
    • Requires an internet connection the first time a model is used, because the pipeline downloads and caches the model files on first use.
    • Might have limitations on model customization or fine-tuning.
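
A minimal sketch of option A, assuming the text-generation task and the EleutherAI/gpt-j-6B model mentioned earlier. The helper names load_generator and generate_reply are hypothetical. generate_reply accepts any callable that behaves like the pipeline, which makes the prompt-stripping logic easy to try without downloading a model.

```python
def load_generator(model_name="EleutherAI/gpt-j-6B"):
    """Create a text-generation pipeline; the model is downloaded on first use."""
    from transformers import pipeline  # imported lazily: large dependency

    return pipeline("text-generation", model=model_name)


def generate_reply(generator, prompt, max_new_tokens=60):
    """Return only the newly generated text for prompt.

    generator is any callable with the text-generation pipeline's interface:
    it returns a list of dicts whose "generated_text" starts with the prompt.
    """
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    full_text = outputs[0]["generated_text"]
    # Text-generation pipelines echo the prompt by default; strip it off.
    if full_text.startswith(prompt):
        full_text = full_text[len(prompt):]
    return full_text.strip()
```

For a quick local trial, a much smaller model such as gpt2 can be passed to load_generator instead of GPT-J.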

B. Downloading Model Weights

  • Pros:
    • Offline functionality after initial download.
    • More flexibility for fine-tuning or advanced usage.
  • Cons:
    • Requires additional storage space for the model files.
    • Downloading large models can take time.
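
A minimal sketch of option B, assuming a causal language model that loads through the transformers Auto classes. The helper names and the config.json check are illustrative assumptions; individual models may need different steps, as noted in the downloading instructions.

```python
from pathlib import Path


def is_downloaded(local_dir):
    """Heuristic check: save_pretrained writes config.json into the target folder."""
    return (Path(local_dir) / "config.json").is_file()


def download_model(model_name, local_dir):
    """Download the tokenizer and model once, then save them under local_dir."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import

    if is_downloaded(local_dir):
        return  # already saved; nothing to do
    AutoTokenizer.from_pretrained(model_name).save_pretrained(local_dir)
    AutoModelForCausalLM.from_pretrained(model_name).save_pretrained(local_dir)


def load_local_model(local_dir):
    """Load the tokenizer and model from disk without contacting the Hub."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import

    tokenizer = AutoTokenizer.from_pretrained(local_dir, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(local_dir, local_files_only=True)
    return tokenizer, model
```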

Downloading Instructions

For option A (using the pipeline), no manual download is needed: the pipeline fetches the model automatically the first time it runs.

For option B (downloading weights), refer to the specific LLM's documentation on the Hugging Face Model Hub. Some models provide weight files that load directly with transformers, while others may require specific download steps.

Important Note: Downloading large LLMs can require significant storage space and processing power. Consider your computational resources and the specific needs of your project before choosing a model.
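
To make the storage note concrete, the weight size of a model can be estimated as the number of parameters times the bytes per parameter. The sketch below treats GPT-J-6B as roughly 6 billion parameters, which is an approximation, and counts only the weights, not the extra memory needed while running the model. The function name estimate_weights_gb is hypothetical.

```python
# Bytes used per parameter for common numeric precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def estimate_weights_gb(num_parameters, dtype="float32"):
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Treats GPT-J-6B as roughly 6 billion parameters (an approximation).
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: about {estimate_weights_gb(6e9, dtype):.0f} GB")
```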

License

This project is licensed under the MIT License. See the LICENSE file for more details.
