Gemini Text-To-Speech

This project converts written material into speech by using Google AI (Gemini) for text creation or internet searches.


📜 Table of Contents

  1. How It Works
  2. Project Note
  3. Project Installation
  4. Project Examples
  5. Contributors

How It Works

This project is based on the example in test/app.ts, and the flow is simple. It first fetches your voice input, then calls a function that sends a request to the Google Gemini API to get an answer from the AI. If needed, it can also automatically play the generated text as speech (TTS).
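
The flow described above can be sketched roughly as follows. This is only an illustration: the `Pipeline` type and the `fetchVoice`, `askGemini`, and `speak` callbacks are hypothetical stand-ins, not the library's actual API. See test/app.ts for the real calls.

```typescript
// Hypothetical stand-ins for the three stages. These names are assumptions
// made for illustration and are not part of the library's API.
interface Pipeline {
  fetchVoice: () => Promise<string>; // resolves to the recognized prompt text
  askGemini: (prompt: string) => Promise<string>; // resolves to the AI answer
  speak?: (text: string) => Promise<void>; // optional TTS playback
}

// Runs the stages in order: voice input, Gemini request, optional speech.
async function runPipeline(p: Pipeline): Promise<string> {
  const prompt = await p.fetchVoice();
  const answer = await p.askGemini(prompt);
  if (p.speak) {
    await p.speak(answer);
  }
  return answer;
}
```

Passing the stages in as callbacks keeps the sketch independent of any particular speech or AI provider; in a real script each callback would wrap the corresponding class from this project.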

📌 Project Note

This project is tested on Linux (Ubuntu 24.04 LTS, x86_64). Windows users can install SoX from SourceForge. I don't use macOS, so I have no instructions for it, but you can use any available method to get SoX running.

| Task                               | Priority | Complete Status |
|------------------------------------|----------|-----------------|
| Implement Gemini Chat              | High     | Completed       |
| Develop Voice Recognition          | High     | Completed       |
| Implement Audio Language Detection | High     | Completed       |
| Implement Text Language Detection  | Medium   | Completed       |
| Implement an Audio Player          | Low      | Completed       |
| Define Enums                       | Low      | Completed       |
| Integrate Debugging                | Low      | Completed       |

📦 Project Installation

Before you use this repository, verify that you have the following tools installed:

  1. SoX
  2. libsox-fmt-all (optional on Windows)
    • Linux: sudo apt-get install libsox-fmt-all
  3. FFmpeg
    • Windows: choco install ffmpeg
    • Linux: sudo apt install ffmpeg
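
To confirm that the tools above are reachable from your PATH before running the project, a small check along these lines may help. It is a sketch only: the `isToolAvailable` helper is hypothetical and not part of this repository, and it assumes a Node.js-compatible runtime that provides `node:child_process`.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: returns true if the command could be started at all.
// spawnSync sets `error` (for example ENOENT) when the binary is not found.
function isToolAvailable(command: string, versionArgs: string[]): boolean {
  const result = spawnSync(command, versionArgs, { stdio: "ignore" });
  return result.error === undefined;
}

// SoX reports its version with --version, while FFmpeg uses -version.
const tools: Array<[string, string[]]> = [
  ["sox", ["--version"]],
  ["ffmpeg", ["-version"]],
];

for (const [name, args] of tools) {
  console.log(`${name}: ${isToolAvailable(name, args) ? "found" : "missing"}`);
}
```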

After installing the necessary libraries, proceed to install the repository by using the following commands:

# npm
$ npm install git+https://github.com/Stawa/GTTS.git
# Bun
$ bun install git+https://github.com/Stawa/GTTS.git

📄 Project Examples

Each class needs its own credentials before it can run successfully:

  1. Google Gemini API Key (lib.GoogleGemini)
  2. TikTok SessionID (lib.TextToSpeech)
    • This SessionID can be obtained from TikTok cookies.
  3. Google Speech API Key (lib.VoiceRecognition.fetchTranscriptGoogle)
  4. Deepgram API Key (lib.VoiceRecognition.fetchTrascriptDeepgram)
    • This key can be obtained from Deepgram
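
One way to keep these credentials out of source code is to read them from environment variables and check them at startup. The sketch below is an assumption-laden example: the variable names and the `missingCredentials` helper are hypothetical and are not defined by this library.

```typescript
// Hypothetical environment variable names; choose whatever fits your setup.
const REQUIRED_KEYS = [
  "GEMINI_API_KEY",
  "TIKTOK_SESSION_ID",
  "GOOGLE_SPEECH_API_KEY",
  "DEEPGRAM_API_KEY",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required credentials that are unset or blank.
function missingCredentials(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}
```

In a Node.js or Bun script you could call `missingCredentials(process.env)` at startup and stop with a clear message if the list is not empty. Since each class needs only its own credential, trim the list to the classes you actually use.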

The following example gets a generated response from the Google Gemini API with a single function call:

import { GoogleGemini } from "@stawa/gtts";

// Create a client with your Gemini API key.
const google = new GoogleGemini({
   apiKey: "XXXXX",
   debugLog: true,
});

async function app() {
   // Send a prompt and print the generated response.
   const res = await google.chat("When was Facebook launched?");
   console.log(res);
}

app();

👥 Contributors
