Implement kana_parser #155

nebocco · 2022-06-10T07:37:01Z

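Description

I implemented the contents of core/src/engine/mora_list.cpp and kana_parser.cpp in Rust.

Other

I implemented every function that exists in the C++ sources and wrote some simple tests. Because I first aimed for something that works on its own, the following points still need fixing or adjusting:

The error is a custom type (separate from the core's Error)
How to organize the module hierarchy
How widely the functions should be public
The code could probably be written more cleanly in idiomatic Rust

Also, the C++ implementation had some places with poor computational complexity, and I have left them as they are.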

qwerty2501 · 2022-06-10T07:39:14Z

Since the original code puts this under an engine directory, I think it would be better to follow that and create an engine module.

qwerty2501 · 2022-06-10T07:41:53Z

Was there a particular reason for making the error a custom type?

nebocco · 2022-06-10T07:46:04Z

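Was there a particular reason for making the error a custom type?

I wanted a closed, self-contained module while porting the code over for now. Implementing the TTS functionality still looks to be some way off, so I thought I could align it with the core Error at that point.

I can also imagine that in the future we might, for example, define a separate type that groups TTS-related errors apart from core.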
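
One way the "closed module now, unify later" idea could look is sketched below. The names `KanaParseError`, `CoreError`, `parse_phrase`, and `core_entry` are hypothetical and are not taken from this PR; the sketch only assumes that the parser keeps its own error type and that a `From` impl is added once the core error is ready to absorb it.

```rust
use std::fmt;

// Hypothetical module-local error for the kana parser (not the PR's actual type).
#[derive(Debug, PartialEq)]
pub enum KanaParseError {
    EmptyPhrase { position: usize },
}

impl fmt::Display for KanaParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::EmptyPhrase { position } => {
                write!(f, "accent phrase at position {} is empty", position)
            }
        }
    }
}

impl std::error::Error for KanaParseError {}

// Hypothetical crate-wide error standing in for the core `Error`.
#[derive(Debug, PartialEq)]
pub enum CoreError {
    ParseKana(KanaParseError),
}

// Added later, when the parser is wired into the core: no parser code has to change.
impl From<KanaParseError> for CoreError {
    fn from(e: KanaParseError) -> Self {
        CoreError::ParseKana(e)
    }
}

// Parser-side function: only knows about its own error type.
pub fn parse_phrase(phrase: &str, position: usize) -> Result<usize, KanaParseError> {
    if phrase.is_empty() {
        return Err(KanaParseError::EmptyPhrase { position });
    }
    Ok(phrase.chars().count())
}

// Core-side caller: `?` converts the local error through the `From` impl.
pub fn core_entry(phrase: &str) -> Result<usize, CoreError> {
    let len = parse_phrase(phrase, 1)?;
    Ok(len)
}
```

With this shape the parser module compiles and tests on its own, and the conversion is a single `impl` that can live wherever the core error is defined.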

PickledChair · 2022-06-10T07:51:26Z

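@nebocco Thank you for the PR! I'll look at the details more closely later.

Regarding the project structure, I agree with @qwerty2501's comment:

Since the original code puts this under an engine directory, I think it would be better to follow that and create an engine module.

For example: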

voicevox_core
├── Cargo.toml
├── build.rs
└── src
    ├── c_export.rs
    ├── engine
    │   ├── kana_parser.rs
    │   └── mod.rs
    ├── error.rs
    ├── internal.rs
    ├── lib.rs
    ├── result.rs
    └── status.rs

I think a structure like this would work well!
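
As a rough illustration of how such a layout interacts with the open question of function visibility, the sketch below uses inline modules in place of `src/engine/mod.rs` and `src/engine/kana_parser.rs`. The functions `split_phrases` and `count_phrases`, and the use of `/` as a delimiter, are made up for the example and are not the PR's API.

```rust
// Inline modules stand in for src/engine/mod.rs and src/engine/kana_parser.rs.
mod engine {
    mod kana_parser {
        // Hypothetical helper: visible to the parent `engine` module only.
        pub(super) fn split_phrases(text: &str) -> Vec<&str> {
            text.split('/').collect()
        }
    }

    // Exposed to the rest of the crate, but not part of the public API.
    pub(crate) fn count_phrases(text: &str) -> usize {
        kana_parser::split_phrases(text).len()
    }
}
```

Code outside `engine` can call `engine::count_phrases`, while `split_phrases` stays an implementation detail of the module.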

Hiroshiba · 2022-06-10T14:29:10Z

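Ah!!
This is just a passing thought, but while an engine directory is fine, making it a separate crate (for example, voicevox_tts) might eventually let us split the DLLs or choose whether to include openjtalk.
(That said, it is outside the scope of this pull request, so an engine directory is fine too.)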

Hiroshiba · 2022-06-10T14:29:25Z

Sorry, that was a mistake 🙇‍♂️

PickledChair

For now, I've reviewed the code other than the tests (sorry, I'll review the test code later...!). Please take a look.

crates/voicevox_core/src/engine/kana_parser.rs

nebocco · 2022-06-12T01:04:45Z

@PickledChair
Thank you for the review. I'll make the fixes.

nebocco · 2022-06-12T02:06:31Z

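Ah!! This is just a passing thought, but while an engine directory is fine, making it a separate crate (for example, voicevox_tts) might eventually let us split the DLLs or choose whether to include openjtalk. (That said, it is outside the scope of this pull request, so an engine directory is fine too.)

@Hiroshiba
Thank you for the suggestion. I agree that the TTS-side functionality should eventually be gathered into a separate crate. However, splitting out a crate should be fairly easy if we simply copy over the whole engine module, so I'd like to defer the crate split until TTS functionality is implemented in earnest. What do you think?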

nebocco · 2022-06-12T02:38:12Z

For now, I've addressed all of the review comments except the following two points.

Hiroshiba · 2022-06-12T09:09:12Z

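However, splitting out a crate should be fairly easy if we simply copy over the whole engine module, so I'd like to defer the crate split until TTS functionality is implemented in earnest. What do you think?

I think splitting out a TTS crate at a later point is fine!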

PickledChair

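I'm sorry, I couldn't find time to review and kept you waiting quite a while...! I've now reviewed the test code as well. Please take a look.

Several other Rust implementation PRs have been merged and there are probably conflicts, so please resolve them when you take this out of Draft...!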

crates/voicevox_core/src/engine/kana_parser.rs

crates/voicevox_core/src/engine/mora_list.rs

nebocco · 2022-06-29T02:58:52Z

Thank you for the review. I've fixed the points you raised.

Hiroshiba

LGTM!!!

It may not actually run as speech synthesis yet, but I think it's a good idea to merge it for now...!!

PickledChair

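Sorry, my previous review was worded in a half-finished way and part of my intent apparently didn't come across, so I've submitted an additional change request. (This concerns my comment asking you to use rstest. Besides being functionally convenient, essentially all tests in the voicevox_core crate are written with rstest, and my intent was that every test should use rstest in a similar style for readability. I apologize for being unclear...!)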

crates/voicevox_core/src/engine/kana_parser.rs

qwerty2501 · 2022-06-29T18:09:55Z

crates/voicevox_core/src/engine/model.rs

+#[derive(Clone, Debug)]
+pub(super) struct MoraModel {
+    pub text: String,
+    pub consonant: Option<String>,
+    pub consonant_length: Option<f32>,
+    pub vowel: String,
+    pub vowel_length: f32,
+    pub pitch: f32,
+}
+
+#[allow(dead_code)] // TODO: remove this feature
+#[derive(Debug)]
+pub(super) struct AccentPhraseModel {
+    pub moras: Vec<MoraModel>,
+    pub accent: usize,
+    pub pause_mora: Option<MoraModel>,
+    pub is_interrogative: bool,
+}
+
+#[allow(dead_code)] // TODO: remove this feature
+pub(super) struct AudioQueryModel {
+    accent_phrases: Vec<AccentPhraseModel>,
+    speed_scale: f32,
+    pitch_scale: f32,
+    intonation_scale: f32,
+    volume_scale: f32,
+    pre_phoneme_length: f32,
+    post_phoneme_length: f32,
+    output_sampling_rate: u32,
+    output_stereo: bool,
+    kana: String,
+}


I believe the plan was to try derive_new and derive_getters for now, so I think it would be better to add #[derive(new, Getters)] to each struct and remove pub from the fields, making the structs more immutable.

Suggested change

#[derive(Clone, Debug, new, Getters)]
pub(super) struct MoraModel {
    text: String,
    consonant: Option<String>,
    consonant_length: Option<f32>,
    vowel: String,
    vowel_length: f32,
    pitch: f32,
}

#[allow(dead_code)] // TODO: remove this feature
#[derive(Debug, new, Getters)]
pub(super) struct AccentPhraseModel {
    moras: Vec<MoraModel>,
    accent: usize,
    pause_mora: Option<MoraModel>,
    is_interrogative: bool,
}

#[allow(dead_code)] // TODO: remove this feature
#[derive(new, Getters)]
pub(super) struct AudioQueryModel {
    accent_phrases: Vec<AccentPhraseModel>,
    speed_scale: f32,
    pitch_scale: f32,
    intonation_scale: f32,
    volume_scale: f32,
    pre_phoneme_length: f32,
    post_phoneme_length: f32,
    output_sampling_rate: u32,
    output_stereo: bool,
    kana: String,
}
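
To make the effect of the suggestion concrete, here is a hand-written approximation of what a constructor plus read-only getters give you, using only the standard library. `Mora` is a trimmed-down, hypothetical stand-in for `MoraModel`, and the exact signatures generated by derive_new and derive_getters may differ (for example in getter return types); this is a sketch of the pattern, not of the crates' output.

```rust
// Trimmed-down, hypothetical stand-in for `MoraModel` (fewer fields than the PR's struct).
#[derive(Clone, Debug, PartialEq)]
pub struct Mora {
    // Fields are private: code outside this module cannot assign to them.
    text: String,
    consonant: Option<String>,
    vowel: String,
    pitch: f32,
}

impl Mora {
    // Constructor taking one argument per field, in declaration order.
    pub fn new(text: String, consonant: Option<String>, vowel: String, pitch: f32) -> Self {
        Self { text, consonant, vowel, pitch }
    }

    // Read-only accessors replace direct `pub` field access.
    pub fn text(&self) -> &str {
        &self.text
    }

    pub fn consonant(&self) -> Option<&str> {
        self.consonant.as_deref()
    }

    pub fn vowel(&self) -> &str {
        &self.vowel
    }

    pub fn pitch(&self) -> f32 {
        self.pitch
    }
}
```

Once a value is built with `Mora::new(...)`, callers outside the defining module can read it through the getters but have no way to change it, which is the immutability the review comment is after.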

nebocco · 2022-06-30

I've added new and Getters for now, but since a small number of attributes are handled mutably, I made those public with the narrowest visibility possible.

voicevox_core/crates/voicevox_core/src/engine/kana_parser.rs

Lines 151 to 162 in 42e6cfd

let accent_phrase = {
    let mut accent_phrase = text_to_accent_phrase(&phrase)?;
    if letter == PAUSE_DELIMITER {
        accent_phrase.pause_mora = Some(MoraModel::new(
            PAUSE_DELIMITER.to_string(),
            None,
            None,
            "pau".to_string(),
        ));
    }
    accent_phrase.is_interrogative = is_interrogative;
    accent_phrase

voicevox_core/crates/voicevox_core/src/engine/model.rs

Lines 17 to 24 in 42e6cfd

#[allow(dead_code)] // TODO: remove this feature
#[derive(Debug, new, Getters)]
pub(super) struct AccentPhraseModel {
    moras: Vec<MoraModel>,
    accent: usize,
    pub(super) pause_mora: Option<MoraModel>,
    pub(super) is_interrogative: bool,
}

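qwerty2501 · 2022-06-30

Rather than changing the accessibility, could you add setters to AccentPhraseModel only for the parts that need them?
If, functionally, every field might be changed after initialization, it would be better to make them public from the start without using Getters. When only some fields change, as in this case, I think setters are the better choice.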
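
The request above can be sketched roughly as follows: all fields stay private, every field gets a getter, and only the two fields that the parser mutates get setters. `Mora` and `AccentPhrase` are simplified, hypothetical stand-ins for `MoraModel` and `AccentPhraseModel`, and the method names `set_pause_mora` and `set_is_interrogative` are assumptions, not necessarily the names used in the final commit.

```rust
// Simplified, hypothetical stand-in for `MoraModel`.
#[derive(Clone, Debug, PartialEq)]
pub struct Mora {
    text: String,
    vowel: String,
}

impl Mora {
    pub fn new(text: &str, vowel: &str) -> Self {
        Self { text: text.to_string(), vowel: vowel.to_string() }
    }

    pub fn text(&self) -> &str {
        &self.text
    }

    pub fn vowel(&self) -> &str {
        &self.vowel
    }
}

// Simplified, hypothetical stand-in for `AccentPhraseModel`: every field is private.
#[derive(Debug)]
pub struct AccentPhrase {
    moras: Vec<Mora>,
    accent: usize,
    pause_mora: Option<Mora>,
    is_interrogative: bool,
}

impl AccentPhrase {
    pub fn new(
        moras: Vec<Mora>,
        accent: usize,
        pause_mora: Option<Mora>,
        is_interrogative: bool,
    ) -> Self {
        Self { moras, accent, pause_mora, is_interrogative }
    }

    // Getters for all fields.
    pub fn moras(&self) -> &[Mora] {
        &self.moras
    }

    pub fn accent(&self) -> usize {
        self.accent
    }

    pub fn pause_mora(&self) -> Option<&Mora> {
        self.pause_mora.as_ref()
    }

    pub fn is_interrogative(&self) -> bool {
        self.is_interrogative
    }

    // Setters only for the fields that are changed after construction.
    pub fn set_pause_mora(&mut self, pause_mora: Option<Mora>) {
        self.pause_mora = pause_mora;
    }

    pub fn set_is_interrogative(&mut self, is_interrogative: bool) {
        self.is_interrogative = is_interrogative;
    }
}
```

Compared with widening field visibility, this keeps `moras` and `accent` fixed after construction while still allowing the two assignments seen in the kana_parser.rs excerpt, which would become calls to the setters.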

PickledChair

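LGTM! Thank you for the PR. The review period ended up being long, and I'm sorry to have kept you waiting. Great work!

As for the response to #155 (comment), if @qwerty2501 has no particular concerns, I think it's fine to merge!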

qwerty2501

So the fix was already in. LGTM

Hiroshiba · 2022-07-01T04:34:41Z

Thank you for the PR!!
There are still quite a few features we'd like in the core, so it would be very encouraging if you could send another PR sometime!!!

* WIP: kana parser
* add: parse_kana/create_kana
* add: error handling, tests
* fix: removed unused #[derive()]
* fix: cargo clippy
* modify: modified directory layout
* fmt: cargo fmt
* fixx: mistake
* modify: module layout
* fix: reflect reviews
* fix: use hashMap instead of BTreeMap
* change: use rstest for mod tests
* Update crates/voicevox_core/src/engine/kana_parser.rs fix: use rstest Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>
* update: use #[derive(new, Getter)] Co-authored-by: qwerty2501 <939468+qwerty2501@users.noreply.github.com>
* fix: use rstest Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>
* fix: use rstest Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>
* fix: use new() and getters
* fmt: clippy
* update: define setters for requiring attrs

Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>
Co-authored-by: qwerty2501 <939468+qwerty2501@users.noreply.github.com>

nebocco added 4 commits June 9, 2022 19:02

WIP: kana parser

9ae884b

add: parse_kana/create_kana

35e222c

add: error handling, tests

e3bdff5

fix: removed unused #[derive()]

56bc502

fix: cargo clippy

a708820

nebocco added 3 commits June 10, 2022 16:55

modify: modified directory layout

242307f

fmt: cargo fmt

37638cc

fixx: mistake

ec0d623

Hiroshiba closed this Jun 10, 2022

Hiroshiba reopened this Jun 10, 2022

PickledChair requested changes Jun 10, 2022

PickledChair mentioned this pull request Jun 11, 2022

Migrate the core's implementation language from C++ to Rust #128

Closed

43 tasks

Merge remote-tracking branch 'origin/rust' into rust

9c1979a

modify: module layout

6419682

fix: reflect reviews

cac6cea

fix: use hashMap instead of BTreeMap

d726714

PickledChair requested changes Jun 26, 2022

nebocco added 2 commits June 29, 2022 11:54

change: use rstest for mod tests

3eca4ee

Merge remote-tracking branch 'upstream/rust' into rust

787212a

nebocco marked this pull request as ready for review June 29, 2022 02:58

Hiroshiba approved these changes Jun 29, 2022

nebocco requested a review from PickledChair June 29, 2022 08:42

PickledChair requested changes Jun 29, 2022

qwerty2501 reviewed Jun 29, 2022

nebocco and others added 7 commits June 30, 2022 17:07

Update crates/voicevox_core/src/engine/kana_parser.rs

34add9c

fix: use rstest Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>

update: use #[derive(new, Getter)]

1e7eb17

Co-authored-by: qwerty2501 <939468+qwerty2501@users.noreply.github.com>

fix: use rstest

d6790cb

Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>

fix: use rstest

d2e1333

Co-authored-by: Gray Suitcase <41382894+PickledChair@users.noreply.github.com>

fix: use new() and getters

f29de84

fmt: clippy

42e6cfd

update: define setters for requiring attrs

b17d23f

PickledChair approved these changes Jul 1, 2022

qwerty2501 approved these changes Jul 1, 2022

PickledChair merged commit ad69533 into VOICEVOX:rust Jul 1, 2022

nebocco deleted the rust branch July 1, 2022 07:06
