Real-time lip sync #2

Closed
Paraworks opened this issue Feb 16, 2023 · 6 comments

Comments

Paraworks commented Feb 16, 2023

Is real-time lip sync possible? Combined with an API and TTS, that would make voice chat possible.

Owner

Arkueid commented Feb 17, 2023

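Lip sync is implemented by reading the amplitude of the audio, so it works as long as the audio file can be read. At the moment the audio file path can only be read from XXX.model3.json; clicking "Update model" or "Save" on the model settings page re-reads the audio path. Real-time lip sync would require reworking the chat module. Roughly: the app sends a request to a server; the server handles the chat messages (for example from ChatGPT) on the app's behalf, generates an audio file with TTS, and returns the text and audio to the app. The current idea is that users host the server themselves and the app only provides the interface; the server's reply carries the text and audio in a specific format. I'll put out an update for this later.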
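For readers unfamiliar with amplitude-based lip sync, here is a minimal sketch of the general idea, not the project's actual code. It assumes 16-bit PCM WAV input; the function name `mouth_openness`, the 50 ms window, and the `gain` factor are illustrative assumptions. Each window's RMS amplitude is mapped to a mouth-open value between 0 and 1.

```python
import math
import struct
import wave


def mouth_openness(wav_file, frame_ms=50, gain=4.0):
    """Return one mouth-open value in [0, 1] per frame_ms window.

    wav_file may be a path or a binary file-like object.
    Assumption: the audio is 16-bit little-endian PCM.
    """
    with wave.open(wav_file, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        # Number of sample frames covered by one analysis window.
        frames_per_window = max(1, wf.getframerate() * frame_ms // 1000)
        values = []
        while True:
            raw = wf.readframes(frames_per_window)
            if not raw:
                break
            samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
            # RMS amplitude of the window, normalized by full scale (32768).
            rms = math.sqrt(sum(s * s for s in samples) / len(samples))
            values.append(min(1.0, gain * rms / 32768.0))
        return values
```

A renderer could then feed each value to the model's mouth-open parameter once per window while the audio plays; how that parameter is named depends on the model.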

Author

Paraworks commented Feb 17, 2023

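I've actually already written this server side, and the approach is the same as yours: https://github.com/Paraworks/vits_with_chatgpt-gpt3. It should plug straight into your project; the only change needed should be switching the output from ogg back to wav at a 44100 Hz sample rate.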
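A small sketch of how a client might check that a received file really is WAV at 44100 Hz before playing it. The helper names `wav_format` and `is_expected_format` are hypothetical; only the standard-library `wave` module is used.

```python
import wave


def wav_format(wav_file):
    """Return (sample_rate, channels, sample_width_bytes) of a WAV file."""
    with wave.open(wav_file, "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getsampwidth()


def is_expected_format(wav_file, sample_rate=44100):
    """True if the file parses as WAV and has the expected sample rate."""
    try:
        rate, _, _ = wav_format(wav_file)
    except (wave.Error, EOFError):
        # Not a readable WAV, e.g. an OGG file that was never converted.
        return False
    return rate == sample_rate
```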

Owner

Arkueid commented Feb 17, 2023

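To convert ogg to wav you can use ffmpeg: ffmpeg -i "input audio" -ac 1 "output audio"
I also need to change the client so that it receives the audio file and text and then plays the audio, but I can't say when that will be done ><.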
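If the conversion is scripted, it could look like the sketch below. It wraps the ffmpeg command above and adds two flags that are not in the original command: `-ar` to force the 44100 Hz sample rate mentioned earlier in the thread, and `-y` to overwrite an existing output file. The function names are hypothetical, and `ffmpeg` is assumed to be on the PATH.

```python
import subprocess


def ffmpeg_to_wav_cmd(src, dst, sample_rate=44100):
    """Build the ffmpeg argument list for a mono WAV conversion."""
    return [
        "ffmpeg",
        "-y",                    # overwrite dst if it exists (added assumption)
        "-i", src,               # input file, e.g. the OGG from the TTS server
        "-ac", "1",              # one audio channel (mono), as in the thread
        "-ar", str(sample_rate), # output sample rate (added assumption)
        dst,
    ]


def convert_to_wav(src, dst, sample_rate=44100):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_to_wav_cmd(src, dst, sample_rate), check=True)
```

Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.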

Author

Paraworks commented Feb 17, 2023

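I changed the API to modify the audio file directly and ran a real-time chat; it really didn't break the way I had expected. It's still a bit ugly without a GUI, though. Once the client is updated, the desktop pet will truly become a cyberwaifu.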

Owner

Arkueid commented Feb 17, 2023

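Done, but I've only tested it lightly, so let's see whether any problems come up. For now the only way to configure it is by editing config.json; I haven't written the GUI yet. I'll overhaul the settings page later, so this will do for now. CyberWaifu is a matter of faith, haha 😋. Thank you, technology.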

Author

Paraworks commented Feb 17, 2023

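It works. I've made the changes on my side; the only issue is that the server response time is very long, but running it locally on a GPU is fine.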

Arkueid closed this as completed Feb 24, 2023