feat(perl) initial implementation #1099
Current progress
https://github.com/isucon/isucon11-qualify/pull/1099/checks?check_run_id=3348738825#step:7:143 Hmm.

After wiping the environment clean (docker system prune -a), the benchmark started passing. 1st run, 2nd run, 3rd run

It still fails sometimes. No idea why.

# Compute the condition level from the ISU condition string
sub calculate_condition_level($condition) {
    my $warn_count = () = $condition =~ m!=true!g;
This also came up for the Node.js implementation: is using a regex for this step a performance concern? How does it look in Perl? My guess is that the index function would be faster...
#452 (comment)
I'll benchmark it!
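I ran the benchmark.
With a difference this small, I think we can keep the regex as it is. What do you think?
Benchmark results
When the condition contains mostly false values, regex wins.
Conversely, when it contains mostly true values, index wins.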
CASE1: is_dirty=false,is_overweight=false,is_broken=false
           Rate index regex
index 3308308/s    --  -25%
regex 4411077/s   33%    --
----------------
CASE2: is_dirty=false,is_overweight=true,is_broken=false
           Rate index regex
index 2477508/s    --   -5%
regex 2621439/s    6%    --
----------------
CASE3: is_dirty=true,is_overweight=false,is_broken=true
           Rate regex index
regex 1816839/s    --   -7%
index 1946613/s    7%    --
----------------
CASE4: is_dirty=true,is_overweight=true,is_broken=true
           Rate regex index
regex 1336170/s    --  -15%
index 1571331/s   18%    --
----------------
MIX: $case1,$case2,$case3,$case4
          Rate regex index
regex 481881/s    --   -4%
index 504123/s    5%    --
Benchmark script
use strict;
use warnings;
use v5.34.0;
use Test2::V0;
use Benchmark qw(cmpthese);

my $case1 = 'is_dirty=false,is_overweight=false,is_broken=false';
my $case2 = 'is_dirty=false,is_overweight=true,is_broken=false';
my $case3 = 'is_dirty=true,is_overweight=false,is_broken=true';
my $case4 = 'is_dirty=true,is_overweight=true,is_broken=true';

sub warn_count_by_regex {
    my $condition = shift;
    my $warn_count = () = $condition =~ m!=true!g;
}

sub warn_count_by_index {
    my $condition = shift;

    my $count = 0;
    my $pos   = 0;
    while ($pos != -1) {
        $pos = index($condition, "=true", $pos);
        if ($pos >= 0) {
            $count += 1;
            $pos   += 5;    # length "=true"
        }
    }
    return $count;
}

is warn_count_by_regex($case1), 0;
is warn_count_by_regex($case2), 1;
is warn_count_by_regex($case3), 2;
is warn_count_by_regex($case4), 3;

is warn_count_by_index($case1), 0;
is warn_count_by_index($case2), 1;
is warn_count_by_index($case3), 2;
is warn_count_by_index($case4), 3;

done_testing;

say "CASE1: $case1";
cmpthese(-1, {
    regex => sub {
        warn_count_by_regex($case1);
    },
    index => sub {
        warn_count_by_index($case1);
    },
});

say '----------------';
say "CASE2: $case2";
cmpthese(-1, {
    regex => sub {
        warn_count_by_regex($case2);
    },
    index => sub {
        warn_count_by_index($case2);
    },
});

say '----------------';
say "CASE3: $case3";
cmpthese(-1, {
    regex => sub {
        warn_count_by_regex($case3);
    },
    index => sub {
        warn_count_by_index($case3);
    },
});

say '----------------';
say "CASE4: $case4";
cmpthese(-1, {
    regex => sub {
        warn_count_by_regex($case4);
    },
    index => sub {
        warn_count_by_index($case4);
    },
});

say '----------------';
say 'MIX: $case1,$case2,$case3,$case4';
cmpthese(-1, {
    regex => sub {
        warn_count_by_regex($case1);
        warn_count_by_regex($case2);
        warn_count_by_regex($case3);
        warn_count_by_regex($case4);
    },
    index => sub {
        warn_count_by_index($case1);
        warn_count_by_index($case2);
        warn_count_by_index($case3);
        warn_count_by_index($case4);
    },
});
That's a little surprising, but if the difference really is this small, keeping it as is seems fine...
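For context, the line under review only counts the `=true` flags; the rest of `calculate_condition_level` is not shown in this excerpt. The following is a minimal sketch of how that count might be turned into a level. The level names (`info`, `warning`, `critical`) and the 0 / 1-2 / 3 thresholds are assumptions made for illustration, not code taken from the PR.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical level names; the PR excerpt does not show them.
use constant {
    LEVEL_INFO     => 'info',
    LEVEL_WARNING  => 'warning',
    LEVEL_CRITICAL => 'critical',
};

# Count the "=true" occurrences, then map the count to a level.
# The thresholds below are an assumption for illustration.
sub calculate_condition_level($condition) {
    # Assigning to an empty list forces list context; the scalar
    # assignment then yields the number of matches.
    my $warn_count = () = $condition =~ m!=true!g;

    return LEVEL_INFO     if $warn_count == 0;
    return LEVEL_WARNING  if $warn_count <= 2;
    return LEVEL_CRITICAL if $warn_count == 3;
    return;    # unexpected count: undef in scalar context
}
```

Whichever counting strategy is used (regex or `index`), only the first line of the body would change; the mapping step stays the same.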
webapp/perl/lib/IsuCondition/Web.pm
Outdated
for (my $idx_keys = 0; $idx_keys < $keys->@*; $idx_keys++) {
    my $key = $keys->[$idx_keys];

    if (substr($condition_str, $idx_cond_str) !~ m!^$key!) {
Here too, I think the regex version is cleaner and more idiomatic, but I'm worried it might be far slower than index.
For this one, an index-only implementation is clearly faster, so I changed it!
fd1365e
Benchmark results
          Rate  regex substr  index
regex  15459/s     --   -84%   -84%
substr 96376/s   523%     --    -1%
index  97745/s   532%     1%     --
Benchmark script
use strict;
use warnings;
use v5.34;
use Test2::V0;
use Benchmark qw(cmpthese);

# Verify that the ISU condition string is in the expected CSV format
sub is_valid_condition_format_by_index {
    my $condition_str = shift;

    my $keys        = ["is_dirty=", "is_overweight=", "is_broken="];
    my $value_true  = "true";
    my $value_false = "false";

    my $idx_cond_str = 0;

    for (my $idx_keys = 0; $idx_keys < $keys->@*; $idx_keys++) {
        my $key = $keys->[$idx_keys];

        # The key must start exactly at the current offset
        if (index($condition_str, $key, $idx_cond_str) != $idx_cond_str) {
            return !!0;
        }
        $idx_cond_str += length $key;

        if (index($condition_str, $value_true, $idx_cond_str) == $idx_cond_str) {
            $idx_cond_str += length $value_true;
        }
        elsif (index($condition_str, $value_false, $idx_cond_str) == $idx_cond_str) {
            $idx_cond_str += length $value_false;
        }
        else {
            return !!0;
        }

        # Every pair except the last is followed by a comma
        if ($idx_keys < $keys->@* - 1) {
            if (index($condition_str, ",", $idx_cond_str) != $idx_cond_str) {
                return !!0;
            }
            $idx_cond_str++;
        }
    }

    # Reject trailing garbage
    return $idx_cond_str == length $condition_str;
}

sub is_valid_condition_format_by_substr {
    my $condition_str = shift;

    my $keys        = ["is_dirty=", "is_overweight=", "is_broken="];
    my $value_true  = "true";
    my $value_false = "false";

    my $idx_cond_str = 0;

    for (my $idx_keys = 0; $idx_keys < $keys->@*; $idx_keys++) {
        my $key = $keys->[$idx_keys];

        if (substr($condition_str, $idx_cond_str, length $key) ne $key) {
            return !!0;
        }
        $idx_cond_str += length $key;

        if (substr($condition_str, $idx_cond_str, length $value_true) eq $value_true) {
            $idx_cond_str += length $value_true;
        }
        elsif (substr($condition_str, $idx_cond_str, length $value_false) eq $value_false) {
            $idx_cond_str += length $value_false;
        }
        else {
            return !!0;
        }

        if ($idx_keys < $keys->@* - 1) {
            if (substr($condition_str, $idx_cond_str, 1) ne ",") {
                return !!0;
            }
            $idx_cond_str++;
        }
    }

    return $idx_cond_str == length $condition_str;
}

sub is_valid_condition_format_by_regex {
    my $condition_str = shift;

    my $keys        = ["is_dirty=", "is_overweight=", "is_broken="];
    my $value_true  = "true";
    my $value_false = "false";

    my $idx_cond_str = 0;

    for (my $idx_keys = 0; $idx_keys < $keys->@*; $idx_keys++) {
        my $key = $keys->[$idx_keys];

        if (substr($condition_str, $idx_cond_str) !~ m!^$key!) {
            return !!0;
        }
        $idx_cond_str += length $key;

        if (substr($condition_str, $idx_cond_str) =~ m!^$value_true!) {
            $idx_cond_str += length $value_true;
        }
        elsif (substr($condition_str, $idx_cond_str) =~ m!^$value_false!) {
            $idx_cond_str += length $value_false;
        }
        else {
            return !!0;
        }

        if ($idx_keys < $keys->@* - 1) {
            if (substr($condition_str, $idx_cond_str, 1) ne ",") {
                return !!0;
            }
            $idx_cond_str++;
        }
    }

    return $idx_cond_str == length $condition_str;
}

# Valid inputs
my @case = (
    'is_dirty=false,is_overweight=false,is_broken=false',
    'is_dirty=true,is_overweight=false,is_broken=false',
    'is_dirty=true,is_overweight=false,is_broken=true',
    'is_dirty=true,is_overweight=true,is_broken=true',
);

# Invalid inputs
my @ng = (
    'is_dirty=1,is_overweight=0,is_broken=0',
    'is_dirty=1,is_overweight=0,is_broken=0,aaa',
    'is_overweight=true,is_broken=true,is_dirty=true',
);

for (@case) {
    note $_;
    ok is_valid_condition_format_by_index($_);
    ok is_valid_condition_format_by_substr($_);
    ok is_valid_condition_format_by_regex($_);
}

for (@ng) {
    note $_;
    ok !is_valid_condition_format_by_index($_);
    ok !is_valid_condition_format_by_substr($_);
    ok !is_valid_condition_format_by_regex($_);
}

done_testing;

cmpthese(-1, {
    index => sub {
        is_valid_condition_format_by_index($_) for @case;
        is_valid_condition_format_by_index($_) for @ng;
    },
    substr => sub {
        is_valid_condition_format_by_substr($_) for @case;
        is_valid_condition_format_by_substr($_) for @ng;
    },
    regex => sub {
        is_valid_condition_format_by_regex($_) for @case;
        is_valid_condition_format_by_regex($_) for @ng;
    },
});
This one shows such a large gap... so why is the difference for warn_count so small?
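One likely factor, offered as a guess and not measured in this thread: the regex variant of the validator matches `m!^$key!` with a pattern built from a variable that changes on every loop iteration, so Perl has to recompile it repeatedly, and each `substr($condition_str, $idx_cond_str)` copies the remaining tail of the string. The `m!=true!g` pattern in warn_count is a constant fixed string that is compiled once. The sketch below shows a validator built on a single precompiled pattern; the function name `is_valid_condition_format_by_single_regex` is hypothetical and this variant is not part of the benchmark above.

```perl
use strict;
use warnings;

# A constant pattern, compiled once with qr//.
# \A and \z anchor the whole string, so trailing data is rejected.
# With /x, whitespace inside the pattern is ignored.
my $CONDITION_RE = qr/\A
    is_dirty=(?:true|false),
    is_overweight=(?:true|false),
    is_broken=(?:true|false)
\z/x;

# Hypothetical helper name, not taken from the PR.
sub is_valid_condition_format_by_single_regex {
    my $condition_str = shift;
    return $condition_str =~ $CONDITION_RE ? !!1 : !!0;
}
```

It accepts and rejects the same `@case` and `@ng` inputs as the three functions in the benchmark script, so it could be added to the `cmpthese` call as a fourth entry to see where it lands.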
> It looked broken when viewed in version 14.1.2 (15611.3.10.1.5, 15611) :memo: Kossy contains this code too, so it would probably be better to remove it
sugyan
left a comment
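When I put load on it locally, connections drop now and then (timeout?), but validation does not appear to be failing. Maybe running it behind something like Starman or Starlet would make it stable...?
As far as the application code goes, LGTM.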
Thank you! I'll try load testing it myself as well!
What I did
Ported the initial implementation to Perl.
Related issue
Self-check
Static analysis (skipped)
Notes