🤖 Smart R2-D2 Assistant: Edge-to-Cloud AI Robotics

This end-to-end robotics project combines a custom 3D-printed R2-D2 with a real-time, AI-based voice assistant. The edge device (ESP32) handles hardware and sensor management, while the compute-heavy AI workload is offloaded to a local Python server.

🌟 Highlighted Features

Real-Time Communication (Edge-to-Cloud): The ESP32 captures RAW PCM audio from the microphone over I2S and sends it to the local server over HTTP.
Speech-to-Text (STT): High-accuracy voice recognition using the OpenAI Whisper (Turbo) model.
Artificial Intelligence (LLM): Context-aware, in-character R2-D2 responses generated through the Google Gemini API.
Text-to-Speech (TTS): Fluent and natural voice synthesis using the Edge-TTS infrastructure.
Dynamic UI: State-based (Listening, Thinking, Speaking, etc.) dynamic facial animations on an OLED display using the RobotFace library.
Wireless Management: Network configuration with WiFiManager and remote over-the-air code update support with ArduinoOTA.
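
The state-based behaviour behind the Dynamic UI can be pictured as a small state machine. The sketch below is illustrative only and written in Python rather than the firmware's C++. The Listening, Thinking, and Speaking states come from the feature list above; the `IDLE` state and all event names are assumptions, not identifiers taken from `main.ino`.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # assumed resting state, not named in this README
    LISTENING = auto()
    THINKING = auto()
    SPEAKING = auto()


# Hypothetical event names; the real firmware may use different triggers.
TRANSITIONS = {
    (State.IDLE, "touch"): State.LISTENING,
    (State.LISTENING, "recording_done"): State.THINKING,
    (State.THINKING, "response_received"): State.SPEAKING,
    (State.SPEAKING, "playback_done"): State.IDLE,
}


def next_state(state: State, event: str) -> State:
    """Return the next state, or stay in the current one for unknown events."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in a lookup table means each state can be mapped to one face animation, and unexpected events leave the current state unchanged.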

🏗️ System Architecture

The system consists of two units that work asynchronously, the Edge device and the Server:

ESP32 (Edge Device): Woken up by a touch sensor. It records audio from the I2S microphone and transmits it to the server in RAW PCM format. It manages the animations and plays the response received from the server through the I2S amplifier.
Flask Server (Python):
- Converts the audio to a processable WAV format using FFmpeg.
- Converts speech to text using Whisper.
- Analyzes the text and generates a response using Gemini LLM.
- Converts the response to speech using Edge-TTS, converts it back to RAW PCM, and sends it back to the device.
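
To make the first server step concrete: RAW PCM has no header, so the server has to be told the sample rate, sample width, and channel count before anything can decode it. The project does this conversion with FFmpeg. The sketch below uses Python's standard `wave` module instead, purely as a stand-in for illustration, and assumes 16 kHz, 16-bit, mono audio, which is not confirmed by this README.

```python
import io
import wave

# Assumed audio format; check the I2S configuration in main.ino for real values.
SAMPLE_RATE = 16000   # samples per second
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # mono


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap headerless PCM samples in a WAV container, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration implied by the byte count under the assumed format."""
    return len(pcm) / (SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS)
```

Under these assumptions one second of speech is 32,000 bytes, which gives a rough idea of the upload size for a typical voice command.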

🛠️ Hardware and 3D Design

R2-D2's outer case and inner chassis were designed from scratch and optimized for 3D printing.

👉 3D Model and STL Files (Thingiverse)

Main Electronic Components Used:

Microcontroller: ESP32 Development Board
Microphone: INMP441 (I2S MEMS)
Audio Output: MAX98357A (I2S Class D Amplifier) + Speaker
Display: I2C OLED Display
Sensors: Capacitive Touch Sensor (For wake-up)

🛠️ Hardware Components

The following table lists the core electronic components used to bring R2-D2 to life.

Component	Description
ESP32 DevKit V1	The brain of the project. Manages WiFi, I2S audio streaming, and OLED animations.
INMP441 Microphone	High-performance I2S MEMS microphone for clear voice capture.
MAX98357A Amp	I2S Class D amplifier that converts digital audio data into sound.
OLED Display (SSD1306)	Displays real-time facial expressions and system status.
Capacitive Touch Sensor	Acts as the wake-up trigger to start the listening process.
3W Speaker	Delivers the character-specific voice responses and system sounds.
TP4056 Module	Lithium battery charger with protection circuit to safely charge the 18650 cell.
MT3608 Boost Converter	Steps up the battery voltage to a stable 5V for the ESP32 and peripherals.
18650 Battery	High-capacity rechargeable Li-ion cell providing the main power source for the robot.

📐 System Connection Diagram

The diagram below illustrates the wiring between the ESP32 and its peripherals, as well as the logical flow between the Edge (ESP32) and the Cloud (Flask Server).

📁 Project Structure

├── server/
│   ├── app.py                 # Main Flask server application
│   ├── config/
│   │   └── settings.json      # API, TTS, and Server settings
│   └── requirements.txt       # Python dependencies
├── esp32/
│   ├── main.ino               # Main ESP32 source code
│   ├── RobotFace.h            # OLED animation library
│   └── r2d2_ses.h             # System sounds stored in PROGMEM
├── .gitignore
└── README.md

🚀 Setup and Usage

1. Server (Python Backend) Setup

For the project to work, FFmpeg must be installed on your computer and added to your system's PATH.
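
A quick way to confirm the PATH requirement before starting the server is to ask Python where it finds the executable. This is a minimal sketch using only the standard library; the helper name `find_tool` is hypothetical and not part of `app.py`.

```python
import shutil
from typing import Optional


def find_tool(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


def describe_tool(name: str = "ffmpeg") -> str:
    """Build a human-readable status line for the given executable."""
    path = find_tool(name)
    return f"{name} found at {path}" if path else f"{name} not found on PATH"
```

Calling `print(describe_tool())` in a Python shell shows whether the server process would be able to launch FFmpeg.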

Clone the repository to your computer and enter the directory:

git clone https://github.com/CinarSamet/Smart-R2-D2-Assistant.git
cd Smart-R2-D2-Assistant/server

Install the required Python libraries:
```
pip install -r requirements.txt
```
Add your Gemini API key as an environment variable:
```
export GEMINI_API_KEY="your_api_key"
```
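
On the Python side, an exported variable like this is typically read from the process environment. The sketch below shows one way to fail early with a clear message when the key is missing. The function name is hypothetical, and how `app.py` actually reads the key is not shown in this README.

```python
import os


def load_gemini_api_key(env=os.environ) -> str:
    """Return the Gemini API key from the environment, or raise if it is unset."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before starting the server"
        )
    return key
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.
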
Start the server:
```
python app.py
```

2. Hardware (ESP32) Setup

Open the Arduino IDE and install the necessary libraries (WiFiManager, ArduinoOTA, etc.).
Update the serverUrl variable in the esp32/main.ino file with the local IP address of the computer running the server (e.g., http://192.168.1.X:5001/upload).
Upload the code to the ESP32.
On its first boot, the device will create a Wi-Fi network named R2D2_Kurulum ("Kurulum" is Turkish for "setup"). Connect to this network and enter your Wi-Fi credentials so the device can join your local network.
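
Before wiring up the hardware, it can help to exercise the server from a desktop script that imitates the ESP32 upload. The sketch below generates a test tone as 16-bit PCM and builds a POST request for the `/upload` endpoint mentioned above. The sample rate, the `Content-Type` header, and the idea that the server reads the raw request body are all assumptions; adjust them to match `app.py`.

```python
import math
import struct
import urllib.request

# Placeholder from this README; replace X with your server's address.
SERVER_URL = "http://192.168.1.X:5001/upload"


def make_tone_pcm(seconds=1.0, freq=440.0, sample_rate=16000) -> bytes:
    """Generate a sine tone as little-endian 16-bit mono PCM (assumed format)."""
    n = int(seconds * sample_rate)
    return b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / sample_rate)))
        for i in range(n)
    )


def build_upload_request(url: str, pcm: bytes) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=pcm,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},  # assumed header
    )


def upload(url: str, pcm: bytes, timeout: float = 60.0) -> bytes:
    """Send the audio and return the raw response body."""
    with urllib.request.urlopen(build_upload_request(url, pcm), timeout=timeout) as resp:
        return resp.read()
```

With the server running, `reply = upload(SERVER_URL, make_tone_pcm())` returns whatever bytes the server sends back, which per the architecture section should be RAW PCM audio.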

🗺️ Roadmap

Establishing the I2S audio pipeline on the ESP32.
Ensuring closed-loop communication with Whisper, Gemini, and TTS integration.
Creating the hardware State Machine structure and OLED interface.
Dockerization: Containerizing the entire Flask/AI server infrastructure using Docker to make it environment-independent and deployable.
Sensor Fusion: Adding autonomous movement capabilities by integrating IMU data and distance sensors.

📄 License

The source code in this repository (ESP32 and Python) is licensed under the MIT License. See the LICENSE file for more details.

The 3D hardware designs (STL files) of the project are subject to the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. Reproduction and sale for commercial purposes are prohibited. You are free to use and develop them in your personal projects.
