[Rust] Implement voicevox_tts and voicevox_wav_free #186

PickledChair · 2022-07-16T15:02:30Z

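Description

This PR implements voicevox_tts and voicevox_wav_free.
I have confirmed that the built core can be called from the cpp example. However, although a wav file is produced (and its wav header looks fine), I have not yet been able to output a wav that actually plays sound, so I am keeping this as a draft.

The built core can be called from the cpp example. Speech synthesis appears to work as expected.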

Related Issue

ref #128

qwerty2501 · 2022-07-16T15:43:35Z

crates/voicevox_core/src/engine/synthesis_engine.rs

+            (output_sampling_rate / Self::DEFAULT_SAMPLING_RATE) * num_channels as u32;
+        let block_size: u16 = bit_depth * num_channels / 8;
+
+        let buf: Vec<u8> = Vec::new();


I think it would be better to use with_capacity here.

Agreed! I changed it to pre-allocate the required size with with_capacity.
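
As a rough illustration of the with_capacity suggestion, the sketch below builds a 16-bit PCM wav in memory and reserves the final size up front. The function name `pcm16_to_wav` and its signature are assumptions made for this example and are not taken from the PR; samples are assumed to be already interleaved across channels.

```rust
/// Builds a 16-bit PCM wav file in memory.
/// Hypothetical helper for illustration; not the PR's actual implementation.
fn pcm16_to_wav(samples: &[i16], sampling_rate: u32, num_channels: u16) -> Vec<u8> {
    // RIFF header (12 bytes) + "fmt " chunk (24 bytes) + "data" chunk header (8 bytes)
    const HEADER_SIZE: usize = 44;
    let bit_depth: u16 = 16;
    let block_size: u16 = bit_depth * num_channels / 8; // bytes per sample frame
    let bytes_size = (samples.len() * 2) as u32; // payload size of the "data" chunk

    // Reserve the final size once so the writes below never reallocate.
    let mut buf: Vec<u8> = Vec::with_capacity(HEADER_SIZE + bytes_size as usize);

    buf.extend_from_slice(b"RIFF");
    buf.extend_from_slice(&(bytes_size + 36).to_le_bytes()); // file size minus the first 8 bytes
    buf.extend_from_slice(b"WAVEfmt ");
    buf.extend_from_slice(&16u32.to_le_bytes()); // size of the "fmt " chunk body
    buf.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = linear PCM
    buf.extend_from_slice(&num_channels.to_le_bytes());
    buf.extend_from_slice(&sampling_rate.to_le_bytes());
    buf.extend_from_slice(&(sampling_rate * block_size as u32).to_le_bytes()); // byte rate
    buf.extend_from_slice(&block_size.to_le_bytes());
    buf.extend_from_slice(&bit_depth.to_le_bytes());
    buf.extend_from_slice(b"data");
    buf.extend_from_slice(&bytes_size.to_le_bytes());
    for sample in samples {
        buf.extend_from_slice(&sample.to_le_bytes());
    }
    buf
}
```

Because the total length is known before any byte is written, a single allocation is enough, whereas `Vec::new()` would grow the buffer repeatedly as the samples are appended.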

qwerty2501 · 2022-07-16T15:47:25Z

To begin with, has it been confirmed that yukarin_s_forward, yukarin_sa_forward, and decode_forward actually work properly?

qwerty2501 · 2022-07-16T15:49:26Z

crates/voicevox_core/src/internal.rs

-        output_wav: *const *mut u8,
-    ) -> Result<()> {
-        unimplemented!()
+    pub fn voicevox_tts(&mut self, text: &CStr, speaker_id: usize) -> Result<Vec<u8>> {


Suggested change

pub fn voicevox_tts(&mut self, text: &CStr, speaker_id: usize) -> Result<Vec<u8>> {

pub fn voicevox_tts(&mut self, text: &str, speaker_id: usize) -> Result<Vec<u8>> {

It looks like this doesn't need to be CStr.

Changed it to &str! (While I was at it, I also changed the &CStr argument of Internal::voicevox_load_openjtalk_dict to &str.)

PickledChair · 2022-07-16T15:51:27Z

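To begin with, has it been confirmed that yukarin_s_forward, yukarin_sa_forward, and decode_forward actually work properly?

I have confirmed locally that they can be called correctly from voicevox_engine (playable audio is returned), so they should be fine!

I have just found a bug and am working on a fix (the full_context_label::string_feature_by_regex function was not returning the expected result. There are other bugs as well, and I am working through them).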

qwerty2501 · 2022-07-16T15:54:51Z

the full_context_label::string_feature_by_regex function was not returning the expected result

Ah, sorry, that is a part I rushed to implement without writing tests.

PickledChair · 2022-07-16T16:31:27Z

Speech synthesis now works through the voicevox_tts function.

crates/voicevox_core/src/engine/model.rs

crates/voicevox_core/src/c_export.rs

PickledChair · 2022-07-16T16:56:10Z

crates/voicevox_core/src/c_export.rs

-        output_binary_size,
-        output_wav,
+    let (output_opt, result_code) = convert_result(lock_internal().voicevox_tts(
+        unsafe { CStr::from_ptr(text) }.to_str().unwrap(),


Should we handle the error here as well, rather than letting to_str panic?
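
One way the question above could be answered is sketched below: the UTF-8 decoding failure is mapped to an error value instead of calling `unwrap()`. The names `TextError`, `decode_text`, and `decode_text_ptr` are hypothetical and are not part of voicevox_core; the real crate's error type and result codes may look different.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical error type used only in this sketch.
#[derive(Debug, PartialEq)]
enum TextError {
    InvalidUtf8,
}

/// Borrows a C string as `&str`, returning an error instead of panicking
/// when the bytes are not valid UTF-8.
fn decode_text(text: &CStr) -> Result<&str, TextError> {
    text.to_str().map_err(|_| TextError::InvalidUtf8)
}

/// Pointer-based variant for use at the C boundary.
///
/// # Safety
/// `text` must be non-null and point to a NUL-terminated string that stays
/// valid and unmodified for the lifetime `'a`.
unsafe fn decode_text_ptr<'a>(text: *const c_char) -> Result<&'a str, TextError> {
    decode_text(unsafe { CStr::from_ptr(text) })
}
```

With this shape, the exported C function can translate `Err(TextError::InvalidUtf8)` into whatever result code the API defines, so malformed input from a caller does not abort the process.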

qwerty2501

LGTM

Hiroshiba

LGTM!!

I read through the processing in detail. It is probably fine... I think...!
Just to be safe, I would like to check that the output matches ENGINE's!

Hiroshiba · 2022-07-16T21:59:14Z

crates/voicevox_core/src/engine/synthesis_engine.rs

+                                    .map(|phoneme| phoneme.phoneme().to_string())
+                                    .collect::<Vec<_>>()
+                                    .join("");
+                                mora_text = mora_text.to_lowercase();


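This is fine as it is, but if we want to match ENGINE, katakana would probably be the better fit.
(It likely needs a conversion map and the change would be fairly large, so a separate PR seems better!)

if we want to match ENGINE, katakana would probably be the better fit.

Sorry, I simply forgot to implement that!

A function that converts to text already exists (linked below), so I think I can open a PR soon!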

voicevox_core/crates/voicevox_core/src/engine/mora_list.rs

Lines 189 to 200 in ce9d36b

#[allow(dead_code)] // TODO: remove this feature
fn mora2text(mora: &str) -> &str {
    for &[text, consonant, vowel] in MORA_LIST_MINIMUM {
        if mora.len() >= consonant.len()
            && &mora[..consonant.len()] == consonant
            && &mora[consonant.len()..] == vowel
        {
            return text;
        }
    }
    mora
}
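
To show how a lookup like mora2text maps a joined phoneme string to katakana, here is a small self-contained sketch. The three table rows are an illustrative subset chosen for this example, not the contents of the real MORA_LIST_MINIMUM, and the sketch uses `strip_prefix` in place of the byte slicing in the quoted code, which is a substitution made here to avoid slicing at a non-character boundary.

```rust
/// Illustrative subset only; each row is [katakana text, consonant, vowel].
const MORA_LIST_MINIMUM: &[[&str; 3]] = &[
    ["カ", "k", "a"],
    ["キャ", "ky", "a"],
    ["ア", "", "a"],
];

/// Returns the katakana for a mora written as consonant + vowel,
/// or the input unchanged when no row matches.
fn mora2text(mora: &str) -> &str {
    for &[text, consonant, vowel] in MORA_LIST_MINIMUM {
        // The mora matches when removing the consonant leaves exactly the vowel.
        if mora.strip_prefix(consonant) == Some(vowel) {
            return text;
        }
    }
    mora
}
```

Under these assumptions, `mora2text("ka")` yields `"カ"`, `mora2text("kya")` yields `"キャ"`, and an unknown string such as `"xyz"` is returned as is.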

Hiroshiba · 2022-07-16T22:03:27Z

crates/voicevox_core/src/engine/synthesis_engine.rs

+        let volume_scale = *query.volume_scale();
+        let output_stereo = *query.output_stereo();
+        // TODO: support 44.1kHz and other sampling rates
+        let output_sampling_rate = *query.output_sampling_rate();


Just a note, but I feel that resampling (which is a surprisingly deep topic) might be better left outside the scope of this library.

* implements create_accent_phrases
* implements synthesis
* implements synthesis_wave_format
* implements voicevox_tts and voicevox_wav_free
* resolve clippy warning
* Fix so that speech synthesis works
* Allocate the Vec buffer for the wav with with_capacity
* Use str instead of CStr for the arguments of Internal methods
* Do not use Dissolve
* Handle the error when the text cannot be decoded as a UTF-8 string

PickledChair added 6 commits July 16, 2022 17:36

implements create_accent_phrases

0b46eeb

implements synthesis

f8b1313

implements synthesis_wave_format

62dafef

implements voicevox_tts and voicevox_wav_free

1c71cb5

Merge branch 'rust' into feature/implements-voicevox_tts

3558230

resolve clippy warning

a38db73

qwerty2501 reviewed Jul 16, 2022

Fix so that speech synthesis works

ae9eeed

PickledChair added 2 commits July 17, 2022 01:37

Allocate the Vec buffer for the wav with with_capacity

5fb33d6

Use str instead of CStr for the arguments of Internal methods

24e08c6

qwerty2501 reviewed Jul 16, 2022

crates/voicevox_core/src/engine/model.rs (outdated)

Do not use Dissolve

04bc976

PickledChair marked this pull request as ready for review July 16, 2022 16:53

PickledChair commented Jul 16, 2022

crates/voicevox_core/src/c_export.rs (outdated)

PickledChair commented Jul 16, 2022

Handle the error when the text cannot be decoded as a UTF-8 string

7d5507c

qwerty2501 approved these changes Jul 16, 2022

Hiroshiba approved these changes Jul 16, 2022

Hiroshiba merged commit ce9d36b into VOICEVOX:rust Jul 16, 2022

PickledChair mentioned this pull request Jul 17, 2022

Migrate the core's implementation language from C++ to Rust #128

Closed

43 tasks
