
Implement index as list of page numbers where words appear #253

Merged: AdventurousGui merged 3 commits into stjiris:main from AdventurousGui:index-page on Jul 13, 2025


@AdventurousGui
Collaborator

Currently, the word indexes list each word alongside its total number of occurrences.

This PR replaces the PDF index with a traditional index, listing the numbers of the pages in which each word appears. The CSV index keeps the column indicating the number of occurrences, but includes an additional column with the lists of page numbers.

The index page added to the PDF contains a title, followed by a list of entries in the format `{word in bold}: {comma-separated page numbers where the word occurs}`.
The CSV index contains 3 columns: the word, its total number of occurrences, and the list of page numbers where the word occurs.
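
For illustration only, the two output formats described above could be produced along these lines. This is a minimal sketch, not the code from this PR: the function names (`build_index`, `index_to_csv`, `pdf_index_lines`), the CSV header labels, the `\w+` tokenization, and the lowercasing are all assumptions, and the bold styling of words in the PDF is left out.

```python
import csv
import io
import re
from collections import Counter, defaultdict


def build_index(pages):
    """Map each word to (total occurrences, sorted page numbers).

    `pages` is a list of page texts, first page first (hypothetical input).
    """
    counts = Counter()
    page_sets = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        # Assumed tokenization: lowercase runs of word characters.
        for word in re.findall(r"\w+", text.lower()):
            counts[word] += 1
            page_sets[word].add(page_number)
    return {w: (counts[w], sorted(page_sets[w])) for w in sorted(counts)}


def index_to_csv(index):
    """Render the 3-column CSV: word, total occurrences, page numbers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word", "occurrences", "pages"])  # assumed header labels
    for word, (total, page_numbers) in index.items():
        writer.writerow([word, total, ", ".join(map(str, page_numbers))])
    return buffer.getvalue()


def pdf_index_lines(index):
    """Render the `{word}: {comma-separated page numbers}` entries as plain text."""
    return [
        f"{word}: {', '.join(map(str, page_numbers))}"
        for word, (_, page_numbers) in index.items()
    ]
```

Collecting page numbers in a set means a word that occurs several times on one page is listed once for that page, while the occurrence count still reflects every hit.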

AdventurousGui/updates-OCR#7
AdventurousGui self-assigned this on Jul 13, 2025
AdventurousGui merged commit bda8ca4 into stjiris:main on Jul 13, 2025
AdventurousGui deleted the index-page branch on August 4, 2025 22:12
