This repository has been archived by the owner on Jan 10, 2020. It is now read-only.

Modify the regular expression for the next-page URL #1

Merged
merged 1 commit into from
Dec 2, 2018


Llf0703
Contributor

@Llf0703 Llf0703 commented Dec 2, 2018

OJStatusCrawler/BZOJCrawler.cs:34

Regex findNextPage = new Regex(@"\]&nbsp;&nbsp;\[<a href=(.+?)>Next");

The regular expression is clearly wrong, because the BZOJ page source contains:

[<a href=status.php?user_id=wxh010910>Top</a>]&nbsp;&nbsp;[<a href=status.php?user_id=wxh010910&top=3037010>Previous Page</a>]&nbsp;&nbsp;[<a href=status.php?user_id=wxh010910&top=2813041&prevtop=3036990>Next Page</a>]

so your regular expression will clearly also match the previous-page link.
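
To make the over-match concrete, here is a minimal sketch that runs the pattern from BZOJCrawler.cs:34 against the pagination line quoted above. The class name `NextPageDemo`, the helper names, and the `[^>]+` alternative are assumptions made for illustration only; the diff that was actually merged is not shown in this thread, so this is not necessarily the fix that was applied.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NextPageDemo
{
    // Pagination line quoted in the pull request description.
    public const string Html =
        "[<a href=status.php?user_id=wxh010910>Top</a>]&nbsp;&nbsp;" +
        "[<a href=status.php?user_id=wxh010910&top=3037010>Previous Page</a>]&nbsp;&nbsp;" +
        "[<a href=status.php?user_id=wxh010910&top=2813041&prevtop=3036990>Next Page</a>]";

    // Original pattern: the first "]&nbsp;&nbsp;[<a href=" sits before the
    // Previous Page link, and the lazy (.+?) keeps growing until ">Next",
    // so group 1 swallows the previous-page URL as well.
    public static string CaptureOriginal(string html) =>
        Regex.Match(html, @"\]&nbsp;&nbsp;\[<a href=(.+?)>Next").Groups[1].Value;

    // Hypothetical alternative (not taken from the merged commit):
    // [^>]+ cannot cross the end of the opening <a ...> tag.
    public static string CaptureRestricted(string html) =>
        Regex.Match(html, @"\[<a href=([^>]+)>Next").Groups[1].Value;

    public static void Main()
    {
        Console.WriteLine(CaptureOriginal(Html));
        Console.WriteLine(CaptureRestricted(Html));
    }
}
```

Run as written, the first line printed contains both the Previous Page and the Next Page URLs in a single capture, while the second contains only `status.php?user_id=wxh010910&top=2813041&prevtop=3036990`.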

@Duanyll
Owner

Duanyll commented Dec 2, 2018

How did you find this problem?
I bet you don't have the .NET Core SDK.

@Duanyll Duanyll merged commit 44c16d7 into Duanyll:master Dec 2, 2018
@Duanyll
Owner

Duanyll commented Dec 2, 2018

Actually, the regular expression isn't to blame; I was using the Regex class incorrectly.

@Llf0703
Contributor Author

Llf0703 commented Dec 3, 2018

Because I originally wrote it almost the same way, and then found that the match contained two URLs.
You're too good!

2 participants