Add GetLibraryItemAsyncEnumerable for unlimited library books #27
Conversation
The reason I haven't spent much time on tokens is that I have one specific use case I'm almost positive they won't address. My understanding is that fetch-by-token will show you added and removed books. However, I personally make heavy use of the user ratings -- how the community rates a book, but especially how I rate it. I can't make use of any get-library implementation unless it includes changes to these. That said, I haven't actually tested it yet, so maybe I'm wrong and it does solve this problem.
I tested it and it imports all fields, including personal rating. It appears to be identical to the call with page numbers, as @mkb79 said, just presumably without an upper limit. I say presumably because I don't have nearly enough books to test it, and I'm not interested in adding the whole AudiblePlus catalogue in an attempt to find an upper limit!
You can fetch the entire library with the continuation token, but this has a disadvantage if you want to fetch the whole library: a new request can only be made once the previous one has finished. If you use the
That is a good idea. Unfortunately, @rmcrackan noted that if the requested num_results is larger than 300, the query will sometimes not return the provided_review and reviews groups. That limitation caps num_results at around 250, which in turn caps the total library size at 10,000 when using pages. To exceed 10,000 books and still receive the reviews groups, sequential calls with the continuation token appear to be the only option.
Why not use them together? Make your first library request. If
How would that work? Does the response from page 40 return a continuation token?
Oh, I think you misunderstood me. You request the first library page like before, then check the FYI: Edit:
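The combined approach @mkb79 describes (request page 1 first, then fan out the remaining pages concurrently) might look like this in Python. This is a sketch under the assumption that the first page's response carries a `total_results` count; the endpoint shape and `PAGE_SIZE` default are invented for illustration:

```python
import asyncio
import math

PAGE_SIZE = 50  # assumed default items-per-page


async def fetch_page(page: int) -> dict:
    """Stand-in for the real paged library call; pretends the library has 120 books."""
    all_items = [f"book{i}" for i in range(1, 121)]
    start = (page - 1) * PAGE_SIZE
    return {"total_results": len(all_items),
            "items": all_items[start:start + PAGE_SIZE]}


async def fetch_library() -> list[str]:
    # Request page 1 first; its response reveals the total library size,
    # so the remaining pages can be fetched concurrently rather than
    # one-at-a-time as with the continuation token.
    first = await fetch_page(1)
    total_pages = math.ceil(first["total_results"] / PAGE_SIZE)
    rest = await asyncio.gather(*(fetch_page(p) for p in range(2, total_pages + 1)))
    items = first["items"]
    for page in rest:
        items.extend(page["items"])
    return items


print(len(asyncio.run(fetch_library())))  # 120
```

The one serial request up front buys full parallelism for everything after it.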
Thanks!
OK, this is ready for review and testing. I implemented @mkb79's suggestion, and the flow is as follows: When GetEnumerator() is called on ItemAsyncEnumerable, a producer thread is created to get all library Items. The producer:
No matter how the Items are retrieved (by page number or by continuation token), the Task&lt;Item[]&gt; is added to the GetItemsTasks BlockingCollection. When the iterator moves next, it checks whether it needs a new batch of Item[]. If it does, it takes a Task&lt;Item[]&gt; from the GetItemsTasks BlockingCollection, awaits that task, and collects the Item[].
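The producer/consumer flow described above can be sketched in Python with asyncio, where the BlockingCollection of Task&lt;Item[]&gt; becomes a queue of pending tasks (a minimal analogue, not the actual C# implementation; all names are illustrative):

```python
import asyncio
from typing import AsyncIterator


async def fetch_batch(page: int) -> list[str]:
    """Stand-in for one page/continuation request returning a batch of Items."""
    await asyncio.sleep(0)  # pretend network latency
    return [f"item{page}-{i}" for i in range(2)]


async def items(num_pages: int = 3) -> AsyncIterator[str]:
    # Queue of pending batch-fetch tasks, mirroring the Task<Item[]>
    # BlockingCollection; None is the "no more batches" sentinel.
    queue: asyncio.Queue = asyncio.Queue()

    async def producer() -> None:
        for page in range(1, num_pages + 1):
            queue.put_nowait(asyncio.create_task(fetch_batch(page)))
        queue.put_nowait(None)

    prod = asyncio.create_task(producer())

    # Consumer: take the next pending task, await it, then yield each Item.
    while (task := await queue.get()) is not None:
        for item in await task:
            yield item
    await prod


async def main() -> list[str]:
    return [item async for item in items()]


print(asyncio.run(main()))
# ['item1-0', 'item1-1', 'item2-0', 'item2-1', 'item3-0', 'item3-1']
```

Because the fetch tasks start as soon as they are enqueued, downloads overlap while the consumer still yields Items in order.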
@mkb79 I'm still not certain about something. When requesting by page=pageNumber, is the limitation 10,000 Items, or is the limitation 40 pages?
My memory is rusty, but I think it was 40 pages. I'd like to hear what he says, though.
@rmcrackan I just tested it, and I could exceed 60 pages at 2 items per page.
@Mbucari
The problem with the catalog endpoint is that you can request more than 40 pages, but page 41 and above return the same results as page 40.
For my test that exceeded 40 pages on the library endpoint, all returned items were unique. I checked the entire JSON structure, not just the titles. It seems the catalog limitation doesn't exist for the library endpoint.
Is this ready for testing, or does the 40-pages discussion mean that another round of revisions is on its way? Did you benchmark the old way vs. the new way? If so, how do they compare?
I'm making changes under the assumption that the library endpoint can support an unlimited number of pages. As for benchmarks, I haven't measured, but what I have committed is definitely faster.
I can verify that there is no page limit (anymore?)! I'll try this on the catalog/products endpoint next. Maybe the limit was removed?!
OK, this is super cool. Using an IAsyncEnumerable yield return pattern, I trimmed all the new concurrent library functions down to 4 new methods. I did 4 benchmark runs of ApiExtended.getItemsAsync using the existing library scan and another 4 using the new library scan. My account has 107 book Items, and a total of 281 Items including episodes. Here are the results, in ms:

Existing library scan: 6491
Parallel async library scan: 2870
Let me know if you have any questions, but this is ready for review and testing. I think you'll be pleasantly surprised.
Wow, those results are really promising. I look forward to trying it on my library, which is 1,200+ titles. I'll test either tonight or tomorrow.
Umm, wow. I'll spare you the stats porn. The bottom line is that, on average, for my library of 1,209 titles, your new way takes 17.1% as long.
Between your reworking of the API scan, background scanning, and my recent refactor of the Entity Framework inserts/updates, even casual users should perceive a major performance boost. Oh yeah, and the UI is no longer blocking during scanning, and auto scan, and ... The past month or two has been friggin' massive.
@@ -118,7 +118,7 @@ public partial class Api
{
    const string CATALOG_PATH = "/1.0/catalog";

    #region GetCatalogProductAsyncAsync
What kinda weird find + replace must I have done to cause this??
Dayum! Did you run it at the default 50 items per page, or did you up that number? I'm still a little uncertain about this new revelation that apparently there's no longer a page limit. My instinct says to up the default to maybe 100 per page. A lower number will show a greater speed-up for users with fewer books, because there will be more concurrent gets. Another concern I have right now is that inside Libation, there is no limit on getChildEpisodesAsync. A new task is spun up for each episode, and they all run concurrently. Presumably Audible has some per-IP concurrent connection limit, and I'm afraid that without throttling we might find it.
Yep, I tested your code as-is. I'm curious about performance with bigger pages, but I haven't tested that yet. I wonder how these perform: 50, 100, 200, 250.
Yes, please throttle this. The implications could range from 'meh' to 'lifetime IP block'. Fortunately (fortunately?!?) something's going weird in either GitHub or NuGet, and it's not letting me publish the updated package.
Ok, I'll add episode throttling. That code is inside Libation though, so no API changes are necessary. Unfortunately. Lol
I implemented @mkb79's approach outlined here.
I wrapped getAllLibraryItemsAsync inside an IAsyncEnumerable. Libation now calls Api.GetLibraryItemAsyncEnumerable() and iterates through all library items in an async foreach.
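The consumer side of that pattern, sketched as a Python async generator analogue of C#'s `await foreach` (the generator body and names are stand-ins, not the actual API):

```python
import asyncio
from typing import AsyncIterator


async def get_library_items() -> AsyncIterator[str]:
    """Stand-in for Api.GetLibraryItemAsyncEnumerable(): yields each Item
    as its background fetch completes, instead of one big list at the end."""
    for i in range(3):
        await asyncio.sleep(0)
        yield f"book{i}"


async def main() -> list[str]:
    titles = []
    # Python's `async for` plays the role of C#'s
    # `await foreach (var item in api.GetLibraryItemAsyncEnumerable())`
    async for item in get_library_items():
        titles.append(item)
    return titles


print(asyncio.run(main()))  # ['book0', 'book1', 'book2']
```

The caller never waits for the whole scan to finish before processing begins, which is what lets the UI stay responsive during scanning.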