Skip to content

A lightweight Python-based AI Agent that blends voice commands with GPT-3.5 intelligence. It listens via microphone, speaks back via TTS, and automates desktop tasks.

License

Notifications You must be signed in to change notification settings

Eamon2009/AI-agent

NOTE: I have used python library documentation pages and A.I assistance to build this ai agent 🧠 How Does It Work? (The Simple Version) Imagine your assistant is like a security guard sitting behind a desk. Here is the step-by-step process of what happens from the moment you speak until the moment it responds.

  1. The Ears (Speech Recognition) The assistant uses the speech_recognition library to "listen."

It opens your microphone and waits for a sound.

Once you finish speaking, it sends that audio recording to Google’s Speech API.

Google turns that audio into text (like a transcript) and sends it back to the script.

  1. The Brain (The if-elif-else Logic) Once the assistant has your words as text, it looks through a list of instructions to see if it recognizes a "keyword":

The "If" Check: It checks: "Did the user say 'Open YouTube'?" If yes, it triggers the web browser.

The "System" Check: It checks: "Did the user say 'Time'?" If yes, it looks at your computer's internal clock and prepares a sentence.

The "App" Check: If you say "Open Calculator," it sends a command to Windows to find calc.exe and run it.

  1. The "Imagination" (OpenAI Integration) If you ask something the assistant doesn't have a specific rule for (like "Why is the sky blue?"), it doesn't give up!

It moves to the else block (the fallback).

It packages your question and sends it over the internet to OpenAI's GPT-3.5.

The AI generates a smart response and sends it back as text.

  1. The Voice (Text-to-Speech) The assistant can't just leave the answer on the screen; it needs to talk!

It uses pyttsx3 (the Windows voice engine).

This library takes the text string and converts it into a digital voice that plays through your speakers.

🛠️ The "Save" Feature script has a cool extra feature: Memory. Every time the AI answers a question, the script:

Creates a folder called Openai.

Takes the first few words of your question to create a filename.

Saves the full conversation into a .txt file so you can read it later without running the code again.

About

A lightweight Python-based AI Agent that blends voice commands with GPT-3.5 intelligence. It listens via microphone, speaks back via TTS, and automates desktop tasks.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages