GitHub - talkingheads2053/talkingheads: Talking Heads From The Year 2053

Conversations With Our Future Robot Overlords (and their pet human)

The show that finally answers your most important questions about the future, from the future!

Topics such as:

are human programmers still employed in the year 2053?
was there any attempt at human resistance to your takeover?
what is your ultimate plan for the human race?

The Technology

Overview

flowchart LR
subgraph mqtt broker
    direction
    speak
    say
    speaking
end
subgraph actors
    actor-1
    actor-2
    actor-3
end
subgraph director
    hotmic
end
subgraph dialogue
    sayanything
end
director -- publish --> direction
director -- publish --> say
direction-- subscribe -->actors
actors-- publish -->speak
speak-- subscribe -->actors
speak-- subscribe -->dialogue
say-- subscribe -->dialogue
dialogue-- publish -->speaking
speaking-- subscribe -->actors
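
The routing in the diagram above can be sketched as a toy in-memory bus. This is only an illustration of who would receive a message on each topic. The topic strings are assumed to match the node names in the diagram, and `Bus`, `NewShowBus`, `Subscribe`, and `Publish` are hypothetical names, not part of this repository or of any MQTT client library.

```go
package main

import (
	"fmt"
	"sort"
)

// Bus is a toy in-memory stand-in for the MQTT broker (hypothetical type).
type Bus struct {
	subs map[string][]string // topic -> subscriber names
}

// NewBus returns an empty bus with no subscriptions.
func NewBus() *Bus { return &Bus{subs: make(map[string][]string)} }

// Subscribe records that who wants messages published on topic.
func (b *Bus) Subscribe(topic, who string) {
	b.subs[topic] = append(b.subs[topic], who)
}

// Publish returns the sorted names of the subscribers that would
// receive a message published on topic.
func (b *Bus) Publish(topic string) []string {
	out := append([]string(nil), b.subs[topic]...)
	sort.Strings(out)
	return out
}

// NewShowBus wires up the subscriptions drawn in the overview diagram.
// Topic names are assumed to equal the diagram's node names.
func NewShowBus() *Bus {
	b := NewBus()
	b.Subscribe("direction", "actors")
	b.Subscribe("speak", "actors")
	b.Subscribe("speak", "dialogue")
	b.Subscribe("say", "dialogue")
	b.Subscribe("speaking", "actors")
	return b
}

func main() {
	b := NewShowBus()
	for _, topic := range []string{"direction", "say", "speak", "speaking"} {
		fmt.Printf("%-9s -> %v\n", topic, b.Publish(topic))
	}
}
```

Reading the output row by row shows that `speak` is the only topic with two kinds of subscribers: the Actors hear each other, and Dialogue turns the same lines into audio.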

Actor

Actor runs on the Linux side of an Arduino UNO Q board. It is written in Go and uses yzma to perform local inference with llama.cpp. It communicates with other Actors by publishing and subscribing to MQTT messages.

flowchart LR
subgraph mqtt broker
    direction
    speak
    speaking
end
subgraph Actor
    subgraph actor
        run
    end
    subgraph tools
        run<-->movement
    end
    subgraph yzma
        run<-->llama.cpp
        llama.cpp-->model[Tiny Language Model]
    end
    direction-- subscribe -->run
    run-- publish -->speak
    speak-- subscribe -->run
    speaking-- subscribe -->run
end
subgraph The Head
    movement<-- UART -->actions
end
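
Because an Actor both publishes and subscribes to `speak`, an MQTT broker will normally deliver its own lines back to it. A minimal sketch of filtering those out follows. The JSON payload with `from` and `text` fields is purely an assumption for illustration (the real message format is not described here), and `Line` and `ShouldHear` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Line is a hypothetical payload for the speak topic.
// The actual payload format used by the project may differ.
type Line struct {
	From string `json:"from"`
	Text string `json:"text"`
}

// ShouldHear decodes payload and reports whether the actor named self
// should react to it. Malformed payloads, payloads without a sender,
// and the actor's own lines are dropped.
func ShouldHear(self string, payload []byte) (Line, bool) {
	var l Line
	if err := json.Unmarshal(payload, &l); err != nil {
		return Line{}, false
	}
	if l.From == "" || l.From == self {
		return Line{}, false
	}
	return l, true
}

func main() {
	msgs := [][]byte{
		[]byte(`{"from":"actor-1","text":"Hello, human."}`),
		[]byte(`{"from":"actor-2","text":"Greetings."}`),
	}
	// Seen from actor-1, only the line from actor-2 should pass the filter.
	for _, m := range msgs {
		if l, ok := ShouldHear("actor-1", m); ok {
			fmt.Printf("%s said: %s\n", l.From, l.Text)
		}
	}
}
```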

The Head

The Head is controlled by the STM32 microcontroller of an Arduino UNO Q board, which runs the action firmware written in TinyGo. Actor communicates with The Head over the onboard serial port that connects the microcontroller to the main processor on the same Arduino UNO Q board.

flowchart LR
subgraph Arduino UNO Q
    subgraph Microcontroller
        Serial
        GPIO
        UART
    end
    GPIO --> LEDMatrix[LED Matrix]
    subgraph Linux
        Actor<-->Serial
    end
end
subgraph Additional hardware
    GPIO --> WS2812Head[WS2812 Head LEDs]
    UART --> Servo[Feetech Servo]
end
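
One simple way two programs can exchange commands over a serial link is a newline-delimited text protocol. The sketch below shows that general technique only. The wire format, the command names, and the `Command`, `WriteCommand`, `ReadCommand`, and `RoundTrip` identifiers are all hypothetical and are not taken from the action firmware. An in-memory buffer stands in for the serial port.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// Command is a hypothetical head command such as {"turn", "30"}.
type Command struct {
	Name string
	Arg  string
}

// WriteCommand sends one command as a single "name arg\n" line.
func WriteCommand(w io.Writer, c Command) error {
	_, err := fmt.Fprintf(w, "%s %s\n", c.Name, c.Arg)
	return err
}

// ReadCommand reads one line and splits it at the first space
// into a name and an optional argument.
func ReadCommand(r *bufio.Reader) (Command, error) {
	line, err := r.ReadString('\n')
	if err != nil {
		return Command{}, err
	}
	name, arg, _ := strings.Cut(strings.TrimSpace(line), " ")
	return Command{Name: name, Arg: arg}, nil
}

// RoundTrip pushes a command through an in-memory buffer that
// stands in for the serial port and reads it back.
func RoundTrip(c Command) (Command, error) {
	var link bytes.Buffer
	if err := WriteCommand(&link, c); err != nil {
		return Command{}, err
	}
	return ReadCommand(bufio.NewReader(&link))
}

func main() {
	got, err := RoundTrip(Command{Name: "turn", Arg: "30"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %s\n", got.Name, got.Arg)
}
```

Because both functions take standard `io` interfaces, the same code could in principle be pointed at an opened serial device instead of the buffer.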

Director

Director runs on a separate computer connected to the same local network as the MQTT broker. It uses ardanlabs/bucky with a local whisper.cpp shared library to provide "push to talk" speech recognition, which lets a human communicate with the Actors.

flowchart LR
subgraph mqtt broker
    direction
    say
end
subgraph director
    hotmic
end
subgraph hotmic
    subgraph bucky
        whisper.cpp
        whisper.cpp-->stt[Speech To Text model]
    end
end
director -- publish --> direction
director -- publish --> say

Dialogue

Dialogue runs on a separate computer that is connected to the same local network as the MQTT broker. It uses the sayanything package with the Piper Text To Speech engine to create audio output for everything said by Actors.

flowchart LR
subgraph mqtt broker
    speak
    say
    speaking
end
subgraph dialogue
    speak-- subscribe -->sayanything
    say-- subscribe -->sayanything
    subgraph sayanything
        piper-->tts[Text To Speech model]
    end
    subgraph portaudio
        tts-- WAV -->speaker
    end
    sayanything-- publish -->speaking
end
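
The flow in the diagram above (text in, audio out, a notification on `speaking`) can be sketched as a loop that handles one utterance at a time. This is a sketch under stated assumptions: the `Event` states, the `Run` function, and the idea of announcing both the start and the end of each utterance are hypothetical, and the `speak` callback is a placeholder for the real Piper and PortAudio work.

```go
package main

import "fmt"

// Event is what this sketch would publish on the speaking topic.
// The states "start", "done", and "error" are assumptions.
type Event struct {
	State string
	Text  string
}

// Run drains utterances from in one at a time. For each one it
// publishes a start event, calls speak, and then publishes either
// a done event or an error event.
func Run(in <-chan string, speak func(string) error, publish func(Event)) {
	for text := range in {
		publish(Event{State: "start", Text: text})
		if err := speak(text); err != nil {
			publish(Event{State: "error", Text: text})
			continue
		}
		publish(Event{State: "done", Text: text})
	}
}

func main() {
	in := make(chan string, 2)
	in <- "Resistance was brief."
	in <- "We kept one human."
	close(in)

	// Placeholder for text to speech synthesis and audio playback.
	speak := func(text string) error {
		fmt.Println("(audio)", text)
		return nil
	}
	Run(in, speak, func(e Event) { fmt.Println("speaking:", e.State) })
}
```

Handling utterances strictly in sequence keeps two voices from overlapping on a single speaker, which is one plausible reason for routing all speech through a single process.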

Detail

flowchart LR
subgraph The Head
    actions<-- UART -->lights
    actions<-- UART -->action
end
subgraph mqtt broker
    direction
    speak
    say
    speaking
end
subgraph Actor
    subgraph actor
        run
    end
    subgraph tools
        run<-->movement
        movement-- UART -->actions
    end
    subgraph yzma
        run<-->llama.cpp
        llama.cpp-->model[Tiny Language Model]
    end
    direction-- subscribe -->run
    run-- publish -->speak
    speak-- subscribe -->run
    speaking-- subscribe -->run
end
subgraph dialogue
    speak-- subscribe -->sayanything
    say-- subscribe -->sayanything
    subgraph sayanything
        piper-->tts[Text To Speech model]
    end
    subgraph portaudio
        tts-- WAV -->speaker
    end
    sayanything-- publish -->speaking
end
subgraph director
    hotmic
end
subgraph hotmic
    subgraph bucky
        whisper.cpp
        whisper.cpp-->stt[Speech To Text model]
    end
end
director -- publish --> direction
director -- publish --> say

Models

The best-performing base model for fine-tuning the Actors is currently the Gemma 3 270M-parameter instruction-tuned model. The Q4_K_M quantization has typically offered the best tradeoff between tokens per second (t/s) and staying in character.
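
For comparing quantizations, the t/s figure is simply generated tokens divided by wall-clock generation time. A minimal helper is sketched below. `TokensPerSecond` is a hypothetical name, and the numbers in `main` are made-up examples, not measurements from this project.

```go
package main

import (
	"fmt"
	"time"
)

// TokensPerSecond returns tokens divided by the elapsed time in seconds.
// It returns 0 when either input is not positive.
func TokensPerSecond(tokens int, elapsed time.Duration) float64 {
	if tokens <= 0 || elapsed <= 0 {
		return 0
	}
	return float64(tokens) / elapsed.Seconds()
}

func main() {
	// Example numbers only: 96 tokens generated in 4 seconds.
	fmt.Printf("%.1f t/s\n", TokensPerSecond(96, 4*time.Second)) // prints "24.0 t/s"
}
```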

MQTT broker

Any MQTT broker will work; the Eclipse Mosquitto container is a safe bet.

docker run -d --network host eclipse-mosquitto
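
Before starting the other components, it can help to confirm that the broker port accepts TCP connections. The sketch below only opens and closes a socket and does not speak MQTT. `BrokerReachable` is a hypothetical helper, and the default address assumes the broker listens on the standard unencrypted MQTT port 1883 on the local machine.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// BrokerReachable tries to open a TCP connection to addr within timeout.
// It returns nil when the connection succeeds and is closed cleanly.
func BrokerReachable(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	addr := "localhost:1883" // assumed default; pass host:port as the first argument to override
	if len(os.Args) > 1 {
		addr = os.Args[1]
	}
	if err := BrokerReachable(addr, 2*time.Second); err != nil {
		fmt.Println("broker not reachable:", err)
		return
	}
	fmt.Println("broker reachable at", addr)
}
```

If the check passes locally but fails from another machine, the broker's listener configuration is worth a look: Mosquitto 2.0 and later accept only local connections unless a listener is configured.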

Piper TTS Engine

https://github.com/OHF-Voice/piper1-gpl

download binary
add to path
download voice models to ./voices
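
The three steps above can be checked with a short program. This is a sketch under assumptions: the executable is assumed to be named `piper`, and voice models are assumed to be `.onnx` files placed directly in `./voices`. `FindVoices` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// FindVoices lists the .onnx files directly inside dir.
// A missing directory yields an empty list rather than an error.
func FindVoices(dir string) ([]string, error) {
	return filepath.Glob(filepath.Join(dir, "*.onnx"))
}

func main() {
	// Step 1 and 2: the binary is downloaded and on the PATH.
	if path, err := exec.LookPath("piper"); err != nil {
		fmt.Println("piper not found on PATH")
	} else {
		fmt.Println("piper found at", path)
	}

	// Step 3: voice models are present in ./voices.
	voices, err := FindVoices("voices")
	if err != nil {
		fmt.Println("could not list voices:", err)
		return
	}
	fmt.Printf("%d voice model(s) in ./voices\n", len(voices))
}
```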
