Enhanced Reading Log Exports #6621

Merged
merged 2 commits into from
Jul 14, 2022

Conversation

cclauss
Collaborator

@cclauss cclauss commented Jun 2, 2022

Closes #6519
Blocked by #6734 -- Rebase this PR after 6734 lands.
Related to #6645 and #6653 which also use csv_string().

Export your Reading Log with more fields and without building large lists in memory, which makes the export unwieldy for users with large reading lists.

  • Create iterate_users_logged_books() to get around the bug discussed at the bottom of get_users_logged_books()
    • Repeatedly get blocks of books and yield one book at a time. (lower memory footprint)
  • Create new generate_reading_log() to build the CSV by yielding one line at a time. (lower memory footprint)
    • Need to use database (NOT requests!) to access works, editions, authors, redirects
    • Need to know how to deal with long entries caused by long lists (ex. Tom Sawyer's Subjects, People, Places)
    • Need to know how to access other data fields such as my_rating, date_read, read_count
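
The two generators described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `iterate_logged_books`, `generate_csv_lines`, `fake_fetch_page`, the page size, and the column names are hypothetical stand-ins for the real database calls and export fields.

```python
import csv
import io
from typing import Callable, Iterable, Iterator

# Hypothetical shape: a book is a plain dict here, not a database row.
Book = dict[str, str]


def iterate_logged_books(
    fetch_page: Callable[[int, int], list[Book]], page_size: int = 100
) -> Iterator[Book]:
    """Fetch books one block at a time and yield them one by one."""
    page = 1
    while True:
        block = fetch_page(page, page_size)
        if not block:
            return
        yield from block
        if len(block) < page_size:
            return  # A short block means there are no further pages.
        page += 1


def generate_csv_lines(books: Iterable[Book], fields: list[str]) -> Iterator[str]:
    """Yield a header row, then one CSV-encoded row per book."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")

    def encode(row: list[str]) -> str:
        # Reuse one small buffer so only a single row is held at a time.
        writer.writerow(row)
        line = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return line

    yield encode(fields)
    for book in books:
        yield encode([book.get(field, "") for field in fields])


def fake_fetch_page(page: int, limit: int) -> list[Book]:
    """Stand-in for a paged database query (assumption, for the demo only)."""
    data = [{"work_id": f"OL{i}W", "shelf": "Want to Read"} for i in range(1, 6)]
    start = (page - 1) * limit
    return data[start : start + limit]


if __name__ == "__main__":
    books = iterate_logged_books(fake_fetch_page, page_size=2)
    for line in generate_csv_lines(books, ["work_id", "shelf"]):
        print(line, end="")
```

Because both functions are generators, at most one block of books and one encoded row are held in memory at any time, regardless of how long the reading log is.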

Current output: OpenLibrary_ReadingLog(65).csv or see Screenshots below.

What code returns the results for:

The /api/books endpoint does not return enough information...

from openlibrary.plugins.books.dynlinks import ol_get_many_as_dict
ol_get_many_as_dict(['/books/OL2058361M', '/works/OL54120W'])

The ol_get_many_as_dict() results are missing the keys covers and description, but we don't need them for the CSV.

Being able to instantiate an openlibrary.core.models.Work would substantially reduce code complexity, but this class has no .__init__() method. If I know the key (e.g. /works/OL8193488W), how can I get an openlibrary.core.models.Work instance for that key?

Answer: web.ctx.site.get(key)

For example, work: Work = web.ctx.site.get("/works/OL8193488W") returns an openlibrary.plugins.upstream.models.Work.

Technical

Testing

In one terminal tab: docker compose up
In another terminal tab: open http://localhost:8080
Search on Twain and set one book each to: Want to Read, Currently Reading, Already Read and add star ratings.
On http://localhost:8080/account/import click Export your Reading Log --> Download (.csv format)

Screenshot

Screenshot 2022-06-06 at 14 20 19
Screenshot 2022-06-06 at 14 20 43
Screenshot 2022-06-06 at 14 20 54

Stakeholders @mheiman, @Fl0rent

@cclauss cclauss added Theme: Reading Log Related to workflows for creating, modifying, displaying a user's reading log. [managed] export Module: My Account Page labels Jun 2, 2022
@cclauss cclauss force-pushed the enhanced_reading_log branch 2 times, most recently from 2098c9a to 8abf80e Compare June 5, 2022 16:46
@cclauss cclauss changed the title DRAFT: Enhanced reading log Enhanced reading log Jun 5, 2022
@cclauss cclauss requested review from mekarpeles and cdrini June 5, 2022 16:47
@cclauss cclauss marked this pull request as ready for review June 6, 2022 10:41
@cclauss cclauss requested a review from seabelis June 6, 2022 12:37
@mekarpeles mekarpeles self-assigned this Jun 6, 2022
@mekarpeles mekarpeles added the Priority: 1 Do this week, receiving emails, time sensitive, . [managed] label Jun 6, 2022
@cclauss cclauss force-pushed the enhanced_reading_log branch 2 times, most recently from 84372f6 to a1ad646 Compare June 9, 2022 13:18
@mekarpeles mekarpeles changed the title Enhanced reading log Enhanced Reading Log Exports Jun 9, 2022
@mekarpeles mekarpeles added this to the Active Sprint milestone Jun 13, 2022
@cdrini cdrini modified the milestones: Sprint 2022-05, Active Sprint Jun 13, 2022
@mekarpeles mekarpeles added the On testing.openlibrary.org This PR has been deployed to testing.openlibrary.org for testing label Jun 16, 2022
@mekarpeles mekarpeles removed the On testing.openlibrary.org This PR has been deployed to testing.openlibrary.org for testing label Jun 20, 2022
@cdrini cdrini modified the milestones: Sprint 2022-06, Active Sprint Jul 6, 2022
cclauss added a commit that referenced this pull request Jul 12, 2022
Canonical formatting will make future diffs easier to read thus making pull requests easier to review.
`openlibrary/core/bookshelves.py` will be modified in both #6621
`openlibrary/plugins/upstream/account.py` will be modified in both #6621 and #6653
@cclauss cclauss added the State: Blocked Work has stopped, waiting for something (Info, Dependent fix, etc. See comments). [managed] label Jul 12, 2022
mekarpeles pushed a commit that referenced this pull request Jul 14, 2022
* psf/black: core/bookshelves and upstream/account

Canonical formatting will make future diffs easier to read thus making pull requests easier to review.
`openlibrary/core/bookshelves.py` will be modified in both #6621
`openlibrary/plugins/upstream/account.py` will be modified in both #6621 and #6653

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Member

@mekarpeles mekarpeles left a comment

LGTM. My only concern is Reading Logs with e.g. 10k+ items (if we're building the CSV in memory), or multiple patrons running exports at the same time. A more tenable solution could be to use a temp file instead of memory.
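
A minimal sketch of the temp-file idea, assuming the export is available as an iterable of rows. `write_export_to_tempfile` and the 1 MB threshold are hypothetical and not part of this PR. The standard library's `tempfile.SpooledTemporaryFile` keeps small exports in memory and rolls over to a file on disk once the content exceeds `max_size`.

```python
import csv
import tempfile
from typing import Iterable, Sequence


def write_export_to_tempfile(
    rows: Iterable[Sequence[str]], max_memory_bytes: int = 1_000_000
) -> tempfile.SpooledTemporaryFile:
    """Write rows as CSV to a spooled temp file and rewind it for reading."""
    spooled = tempfile.SpooledTemporaryFile(
        max_size=max_memory_bytes,  # Hypothetical threshold before spilling to disk.
        mode="w+",
        newline="",
        encoding="utf-8",
    )
    writer = csv.writer(spooled, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    spooled.seek(0)
    return spooled


if __name__ == "__main__":
    demo_rows = [["work_id", "shelf"], ["OL1W", "Already Read"]]
    # The file is deleted automatically when the context manager closes it.
    with write_export_to_tempfile(demo_rows) as export:
        print(export.read(), end="")
```

With this approach a large export costs disk space rather than process memory, which matters most when several exports run concurrently.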

@mekarpeles mekarpeles merged commit 1bd02e2 into internetarchive:master Jul 14, 2022
@cclauss cclauss deleted the enhanced_reading_log branch July 14, 2022 16:49
Labels
export Priority: 1 Do this week, receiving emails, time sensitive, . [managed] State: Blocked Work has stopped, waiting for something (Info, Dependent fix, etc. See comments). [managed] Theme: Reading Log Related to workflows for creating, modifying, displaying a user's reading log. [managed]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enriching Reading Log data exports (more fields)
3 participants