download: add workaround for broken markup of AtCoder s8pc_4_d #615

Closed
kmyk opened this issue Nov 19, 2019 · 3 comments · Fixed by #620

@kmyk
Member

kmyk commented Nov 19, 2019

oj failed to download samples from https://atcoder.jp/contests/s8pc-4/tasks/s8pc_4_d, and also failed to recognize that oj itself failed.

  • AtCoder's own JavaScript that adds the Copy buttons does not work on this page either. The HTML tags are not even balanced.
  • They work properly for other problems in the same contest.

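The page does not even look like valid HTML (there are more </section> tags than <section> tags), so my feeling is that properly supporting it is hopeless. Even so, it is bad that oj exits normally instead of reporting an error.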

$ oj d -n https://atcoder.jp/contests/s8pc-4/tasks/s8pc_4_d
[x] problem recognized: AtCoderProblem.from_url('https://atcoder.jp/contests/s8pc-4/tasks/s8pc_4_d'): https://atcoder.jp/contests/s8pc-4/tasks/s8pc_4_d
[x] load cookie from: /home/ubuntu/.local/share/online-judge-tools/cookie.jar
[x] GET: https://atcoder.jp/contests/s8pc-4/tasks/s8pc_4_d
[x] 200 OK
[!] strange name for input string: Sample Output 3
[!] strange name for input string: Sample Output 5
[x] save cookie to: /home/ubuntu/.local/share/online-judge-tools/cookie.jar

[*] sample 0
[x] input: sample-1
4
1 2
2 3
2 4

[x] output: sample-1
2.0
1.0
2.0
2.0


[*] sample 1
[x] input: sample-2
4
1 2
2 4
4 3

[x] output: sample-2
3.0
1.5
3.0
1.5


[*] sample 2
[x] input: sample-3
4.0
2.0
2.0
2.0
4.0

[x] output: sample-3
2.000000000000
1.666666666667
1.666666666667
3.000000000000
3.000000000000
3.000000000000
3.000000000000


[*] sample 3
[x] input: sample-4
3.666666666667
2.250000000000
3.666666666667
2.833333333333
2.555555555556
2.666666666667
4.333333333333
2.666666666667
5.333333333333
2.500000000000
2.500000000000
5.000000000000

[x] output: sample-4
1.0
1.0

kmyk added the bug label Nov 19, 2019
@kawacchu
Contributor

kawacchu commented Nov 19, 2019

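I checked the HTML of problem D, and the <section> opening tag that should appear around the first two lines of the following snippet is indeed missing.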

<div class="part">
        <h3>Sample Input 2</h3>
<pre>
4
1 2
2 4
4 3
</pre>
    </section>
</div>

This is the cause of the bug.
I also checked the other problems in the same contest and found the same critical missing opening tag at the same place in problems A and F.

@kmyk
Member Author

kmyk commented Nov 19, 2019

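For the sake of generality, I would like to solve this along the lines of "raise an error when broken HTML is detected". However, as far as I could find, neither BeautifulSoup nor the libraries around it offer a "is this HTML broken?" check that fits this use case. That is a problem.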
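
For reference, a minimal sketch of one way to detect the specific breakage seen here (more </section> than <section>), using only Python's standard `html.parser`. This is not part of oj or BeautifulSoup; the names `TagCounter` and `unbalanced_tags` are hypothetical, and comparing counts is only a heuristic that does not verify nesting order.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Hypothetical helper: count opening and closing tags separately."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def unbalanced_tags(html, tags=("section",)):
    """Return {tag: (open_count, close_count)} for tags whose counts differ."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return {
        tag: (counter.opened[tag], counter.closed[tag])
        for tag in tags
        if counter.opened[tag] != counter.closed[tag]
    }
```

A caller could treat a non-empty result as a reason to abort the download with an error rather than silently saving samples.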

@kmyk
Member Author

kmyk commented Nov 21, 2019

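We are already able to read strings such as Sample Output 3 from inside the h3 elements, so the best approach seems to be to check strictly that they match what is expected and actively raise an error when they do not.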
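
A rough sketch of what such a check might look like. It assumes the h3 texts have already been collected in document order and that they follow the English pattern "Sample Input N" / "Sample Output N"; the function `check_sample_names` and the exception `SampleParseError` are hypothetical names for illustration, not existing oj APIs.

```python
import re


class SampleParseError(Exception):
    """Hypothetical error type used only in this sketch."""


def check_sample_names(names):
    """Verify that headings alternate as 'Sample Input N', 'Sample Output N'.

    `names` is the list of h3 texts in document order. Raises
    SampleParseError as soon as a pair does not match.
    """
    if len(names) % 2 != 0:
        raise SampleParseError(f"odd number of sample headings: {len(names)}")
    for in_name, out_name in zip(names[0::2], names[1::2]):
        m_in = re.fullmatch(r"Sample Input (\d+)", in_name)
        m_out = re.fullmatch(r"Sample Output (\d+)", out_name)
        if m_in is None:
            raise SampleParseError(f"strange name for input: {in_name!r}")
        if m_out is None:
            raise SampleParseError(f"strange name for output: {out_name!r}")
        if m_in.group(1) != m_out.group(1):
            raise SampleParseError(
                f"sample numbers differ: {in_name!r} vs {out_name!r}"
            )
```

With a check like this, a heading such as Sample Output 3 landing in an input slot would stop the download with an error instead of only printing a warning.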

2 participants