FENRlR/DesktopARONA

title.png

DesktopARONA

Operation CWAL of "...is there any chance that Arona will also be developed and released as an assistant?".

Model files not included

v1

v1demo.mp4

Demonstration of v1 with polyglot module.

Overall structure

structure.png

- Shell

v1test.gif

Base : Java - libgdx spine runtime

Targeted screen size : 1080p

Get a copy of the spine model from elsewhere, then place it in ./aronares.

The lipsync method implemented here is quite different from the in-game example (which was added much later).

- LLM

Base : Python - kobart or polyglot

  • kobart

    Finetuned KoBART from KoBART-chatbot.

    Fast, but generates a fixed answer for a given sentence.

    To make it work on Windows, you need to resolve an error (as far as I remember, it was related to type conversion in torch).

  • polyglot

    EleutherAI's polyglot-ko-1.3b finetuned with datasets of KoAlpaca.

    Slow, even with time limits, and takes a lot of VRAM. It can be faster if you have a modern GPU (probably anything better than my old 970m).

    To make it work on Windows, you need a build of bitsandbytes modified for Windows, plus some extra struggle with manual dtype allocation (opening up the library and fixing it by hand).
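The kobart note above ("generates a fixed answer for a given sentence") is what you would expect from deterministic decoding. The sketch below is a toy illustration only, not code from this repository: it assumes the fixed answers come from always picking the highest-scoring token, and it contrasts that with temperature sampling. The function names and the example logits are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Deterministic: the same scores always give the same token index."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_sampled(logits, temperature, rng):
    """Stochastic: draw a token index in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores for three candidate tokens.
    logits = [0.1, 2.0, 1.9]
    rng = random.Random()
    print("greedy :", [pick_greedy(logits) for _ in range(5)])
    print("sampled:", [pick_sampled(logits, 1.0, rng) for _ in range(5)])
```

With greedy picking, every run returns the same index, which mirrors a fixed reply per input sentence. Sampling trades that repeatability for variety.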

- TTS

Base : Python - mb-istft-vits-multilingual

- Translator

Base : Python - ezTransWeb-omni

There are some clumsy requirements for this. Please refer to https://github.com/FENRlR/ezTransWeb-omni.

Limitations

  • Bug in PMA
  • Without PMA, the sprite shows white borders and slight transparency
  • Limited options for the window
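For readers unfamiliar with PMA (premultiplied alpha), the following sketch shows the usual mechanism behind white fringes on sprite edges. It is a general illustration, not this project's rendering code, and it assumes the borders come from the common case where texture filtering mixes an opaque texel with a fully transparent texel whose stored RGB is white. All names and values are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation, as done per channel by bilinear texture filtering."""
    return a + (b - a) * t


def filter_texels(p, q, t=0.5):
    """Interpolate two RGBA texels channel by channel."""
    return tuple(lerp(pc, qc, t) for pc, qc in zip(p, q))


def premultiply(rgba):
    """Convert straight alpha to premultiplied alpha: RGB is scaled by A."""
    r, g, b, a = rgba
    return (r * a, g * a, b * a, a)


def blend_straight(src, dst):
    """Straight-alpha blend: out = src.rgb * a + dst * (1 - a)."""
    a = src[3]
    return tuple(s * a + d * (1 - a) for s, d in zip(src[:3], dst))


def blend_premultiplied(src, dst):
    """Premultiplied blend: out = src.rgb + dst * (1 - a)."""
    a = src[3]
    return tuple(s + d * (1 - a) for s, d in zip(src[:3], dst))


# An opaque red texel next to a transparent texel that stores white RGB.
red = (1.0, 0.0, 0.0, 1.0)
clear_white = (1.0, 1.0, 1.0, 0.0)
background = (0.0, 0.0, 0.0)

# Straight alpha: the hidden white leaks into the filtered edge texel.
edge_straight = blend_straight(filter_texels(red, clear_white), background)

# Premultiplied alpha: the transparent texel contributes no color at all.
edge_pma = blend_premultiplied(
    filter_texels(premultiply(red), premultiply(clear_white)), background
)

print(edge_straight)  # (0.5, 0.25, 0.25): a washed-out, whitish edge
print(edge_pma)       # (0.5, 0.0, 0.0): pure red at half coverage
```

The mismatch matters in both directions: an asset exported with premultiplied alpha but blended as straight alpha (or the reverse) also renders edges with the wrong brightness or opacity.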

v2

- Shell

v2prototype.gif

Base : C# - SpineViewerWPF

Targeted screen size : 1080p

Get a copy of the spine model from elsewhere, then place it in ./aronares/arona.

WIP - now with correct representation of PMA.

Plans

  • Customized language model with emotion embeddings
  • Lipsync options (in-game style amplitude-based vs. frequency-based)
  • Functional interaction (simple search, opening something, etc. - ideas needed)
  • STT (not my style but if needed)
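As a rough idea of what the amplitude-based lipsync option in the plans could look like, here is a minimal sketch. It is not the project's implementation: it assumes mono audio samples normalized to [-1, 1], one mouth value per animation frame, and a hypothetical `gain` factor for scaling loudness into a 0..1 mouth-openness value.

```python
import math


def mouth_openness(samples, sample_rate, fps=30, gain=2.0):
    """Map audio loudness to a per-frame mouth-openness value in [0, 1].

    samples     : mono audio as floats in [-1, 1] (assumed format)
    sample_rate : samples per second
    fps         : animation frames per second
    gain        : hypothetical scale factor applied to the RMS loudness
    """
    frame_len = max(1, sample_rate // fps)
    curve = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square amplitude is a simple loudness estimate.
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        curve.append(min(1.0, rms * gain))
    return curve


if __name__ == "__main__":
    rate = 16000
    silence = [0.0] * 1600
    tone = [0.5 * math.sin(2 * math.pi * 220 * n / rate) for n in range(1600)]
    # Two frames at 10 fps: a closed mouth, then a partly open one.
    print(mouth_openness(silence + tone, rate, fps=10))
```

A frequency-based variant would instead inspect the spectrum of each frame (for example to pick different mouth shapes), at the cost of more computation per frame.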

Limitations