core

An AI-powered personal assistant built from interchangeable components: a wakeword listener, a speech recognizer, a speech synthesis module, and natural language generation backed by a language model. It acts as middleware for integrating AI into automation, management, research, and other tasks. Each component can be swapped for an alternative; for example, text-to-speech can be switched from Mimic to ElevenLabs.

Components

  • Wakeword Listener (Precise/Porcupine)
  • Automatic Speech Recognition/STT (Precise/Whisper)
  • NLU/Intent parser (Adapt/Padatious)
  • Skill (Main processing unit)
  • TTS/Speech Synthesis (Mimic 3/ElevenLabs)
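
The swappable-component idea can be sketched as a small interface plus a registry. This is a minimal illustration, not the project's actual code: `TTSEngine`, `MimicTTS`, `ElevenLabsTTS`, `TTS_BACKENDS`, `make_tts`, and `Assistant` are all hypothetical names, and the two backends are placeholders that return tagged bytes instead of calling Mimic 3 or ElevenLabs.

```python
from typing import Callable, Dict, Protocol


class TTSEngine(Protocol):
    """Anything that can turn text into audio bytes (hypothetical interface)."""

    def synthesize(self, text: str) -> bytes: ...


class MimicTTS:
    """Placeholder for a Mimic 3 backend; no real Mimic call is made."""

    def synthesize(self, text: str) -> bytes:
        return f"[mimic] {text}".encode("utf-8")


class ElevenLabsTTS:
    """Placeholder for an ElevenLabs backend; no network request is made."""

    def synthesize(self, text: str) -> bytes:
        return f"[elevenlabs] {text}".encode("utf-8")


# Registry mapping a config name to a backend factory (assumed layout).
TTS_BACKENDS: Dict[str, Callable[[], TTSEngine]] = {
    "mimic": MimicTTS,
    "elevenlabs": ElevenLabsTTS,
}


def make_tts(name: str) -> TTSEngine:
    """Build the TTS backend selected by name, or fail with a clear error."""
    try:
        return TTS_BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown TTS backend: {name!r}") from None


class Assistant:
    """Depends only on the TTSEngine interface, so backends are interchangeable."""

    def __init__(self, tts: TTSEngine) -> None:
        self.tts = tts

    def reply(self, text: str) -> bytes:
        return self.tts.synthesize(text)


if __name__ == "__main__":
    # Switching backends is a one-word change in configuration.
    for backend in ("mimic", "elevenlabs"):
        print(Assistant(make_tts(backend)).reply("Hello"))
```

The same pattern would apply to the wakeword, STT, and intent-parser slots: each gets its own small interface, and the assistant never imports a concrete backend directly.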

Features

  • Answers user questions (requires an internet connection)
  • AI-powered personal assistant that emulates a predefined persona
  • Adds relevant history to the conversation
  • Partially accurate offline speech recognition (using Whisper)
  • iCloud event reminder demo
  • Real-time weather updates
  • Time and date functions

Roadmap

  • Fluid conversation
    • Interjections while the user is speaking
  • Offline execution
  • Voice-to-iOS actions using osascript
  • GUIs that elaborate on spoken responses
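
As a rough sketch of the osascript roadmap item: `osascript` is the macOS command-line tool for running AppleScript, and `osascript -e '<script>'` executes an inline script. The helper names and the `INTENT_SCRIPTS` mapping below are hypothetical and only illustrate how a recognized intent might be turned into a command; they are not part of the project.

```python
import shutil
import subprocess
from typing import Dict, List


def build_osascript_command(script: str) -> List[str]:
    """Return the argument list that runs an inline AppleScript snippet."""
    return ["osascript", "-e", script]


def run_applescript(script: str) -> str:
    """Run the snippet and return its trimmed stdout (macOS only)."""
    if shutil.which("osascript") is None:
        raise RuntimeError("osascript not found; it ships with macOS")
    result = subprocess.run(
        build_osascript_command(script),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script fails
    )
    return result.stdout.strip()


# Hypothetical mapping from an intent name to an AppleScript snippet.
INTENT_SCRIPTS: Dict[str, str] = {
    "notify": 'display notification "Hello from the assistant"',
}


if __name__ == "__main__":
    # Only prints the command, so this is safe to run on any platform.
    print(build_osascript_command(INTENT_SCRIPTS["notify"]))
```

Passing the command as a list rather than a shell string avoids quoting problems when the spoken text ends up inside the script.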

Aim

  • Interact with the real-world environment
  • Build knowledge from, and deeply analyze, user-provided data together with the environment
  • Remember interactions involving user-provided data
  • Modularity