Closed
Description
RequestHelper.QueryWithContinuation() doesn't correctly implement pagination logic. It retains the continuation parameters from the previous request. E.g. if the API returns these continuation objects in response to two consecutive requests:

```json
"continue": {
    "clcontinue": "aaa",
    "continue": "||"
}
```

```json
"continue": {
    "gcmcontinue": "bbb",
    "continue": "gcmcontinue||"
}
```

The next request sent by the library will have

```json
"clcontinue": "aaa",
"gcmcontinue": "bbb",
"continue": "gcmcontinue||"
```

I.e. "clcontinue": "aaa" will still be sent, even though it shouldn't be there. This results in incomplete data.
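
For illustration, here is a minimal sketch of the behaviour I would expect. It is not the library's actual code: `ContinuationSketch` and `BuildNextRequest` are hypothetical names, and the base parameters are just an example. The idea is that each follow-up request is built from the original request parameters plus only the most recent `continue` object, so keys from earlier continuation objects are dropped.

```csharp
using System;
using System.Collections.Generic;

static class ContinuationSketch
{
    // Hypothetical helper, not part of WikiClientLibrary.
    // Builds the next request from the ORIGINAL parameters plus the LATEST
    // continuation object only; older continuation keys are not carried over.
    public static Dictionary<string, string> BuildNextRequest(
        IReadOnlyDictionary<string, string> baseParameters,
        IReadOnlyDictionary<string, string> latestContinuation)
    {
        var next = new Dictionary<string, string>();
        foreach (var p in baseParameters) next[p.Key] = p.Value;
        foreach (var p in latestContinuation) next[p.Key] = p.Value;
        return next;
    }

    static void Main()
    {
        // Example base parameters (assumed for this sketch).
        var baseParameters = new Dictionary<string, string>
        {
            ["action"] = "query",
            ["generator"] = "categorymembers",
            ["prop"] = "categories",
        };

        // The two continuation objects from the description above.
        var first = new Dictionary<string, string>
        {
            ["clcontinue"] = "aaa",
            ["continue"] = "||",
        };
        var second = new Dictionary<string, string>
        {
            ["gcmcontinue"] = "bbb",
            ["continue"] = "gcmcontinue||",
        };

        // Request sent after the first response: carries clcontinue.
        var afterFirst = BuildNextRequest(baseParameters, first);
        Console.WriteLine(afterFirst.ContainsKey("clcontinue"));  // True

        // Request sent after the second response: clcontinue must be gone.
        var afterSecond = BuildNextRequest(baseParameters, second);
        Console.WriteLine(afterSecond.ContainsKey("clcontinue")); // False
        Console.WriteLine(afterSecond["gcmcontinue"]);            // bbb
    }
}
```

The bug described here corresponds to merging `second` into `afterFirst` instead of into `baseParameters`, which is how the stale `clcontinue` survives.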
I couldn't easily reproduce this issue with the standard providers, but it is reproducible with this CategoryPropertyProvider, which I wrote to retrieve the categories of a page (I don't know why, but the original CategoryPropertyProvider is marked as internal):

```csharp
class CategoryPropertyProvider : WikiPagePropertyProvider<CategoryPropertyGroup>
{
    public override string PropertyName => "categories";

    public int PaginationSize { get; set; }

    public override IEnumerable<KeyValuePair<string, object>> EnumParameters(MediaWikiVersion version)
    {
        yield return KeyValuePair.Create("clshow", "!hidden" as object);
        yield return KeyValuePair.Create("cllimit", PaginationSize as object);
    }

    public override CategoryPropertyGroup ParsePropertyGroup(JObject json)
    {
        return new CategoryPropertyGroup(json[PropertyName]?.Select(x => x.Value<string>("title")).ToArray());
    }
}

class CategoryPropertyGroup : WikiPagePropertyGroup
{
    public CategoryPropertyGroup(IReadOnlyList<string> categories)
    {
        Categories = categories ?? Array.Empty<string>();
    }

    public IReadOnlyList<string> Categories { get; }
}
```

```csharp
static async Task Main()
{
    using var client = new WikiClient();
    var site = new WikiSite(client, "https://en.wikipedia.org/w/api.php");
    await site.Initialization;
    // set both limits to 1 so that the bug manifests more easily
    var result = await new CategoryMembersGenerator(site)
    {
        CategoryTitle = "Category:Oceanian_cuisine",
        MemberTypes = CategoryMemberTypes.Page,
        PaginationSize = 1,
    }.EnumPagesAsync(new WikiPageQueryProvider
    {
        Properties =
        {
            new CategoryPropertyProvider() { PaginationSize = 1 }
        }
    }).Select(item => Tuple.Create(item.Title, item.GetPropertyGroup<CategoryPropertyGroup>())).ToArrayAsync();
    var test = result.Where(x => x.Item1 == "Australian cuisine").SelectMany(x => x.Item2.Categories).Count();
}
```

Here `test` will be 0 even though the real article has 2 categories.