Extracting and showing PDF metadata (title and author) #447

Merged
merged 1 commit into from
May 9, 2023

dinhani
Contributor

@dinhani dinhani commented May 5, 2023

I have some PDF files whose filenames are just MD5 hashes.
For these files, I need to show the title stored in the PDF metadata instead of the filename.
So this PR extracts the author and title from the metadata and shows them when they are available.

Personally, I would prefer the title to be displayed alongside the other tags, such as Author/Owner, but I don't know whether tags are appropriate for that.

If you think this is useful, I will add some tests for PDF parsing because they are missing.

@a5huynh
Collaborator

a5huynh commented May 5, 2023

Hi @dinhani, thanks for opening this pull request! The "Author" tag is definitely appropriate here if you don't mind adding that in. Let us know when you're ready for this to go through review.

@dinhani dinhani marked this pull request as ready for review May 6, 2023 01:25
@dinhani
Contributor Author

dinhani commented May 6, 2023

@a5huynh It is ready for review.

I decided to show the document title under the filename because I think it is important to show both pieces of information, so I created a subtitle section.
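As a rough illustration of that layout (not the code from this PR), the sketch below keeps the filename as the main title, uses the metadata title as an optional subtitle, and turns the author into an "Author" tag. The names `DocMetadata`, `ResultHeading`, and `build_heading` are hypothetical and chosen only for this example.

```rust
/// Hypothetical container for extracted metadata (names are assumptions,
/// not the types used in the PR).
#[derive(Debug, Default, Clone, PartialEq)]
struct DocMetadata {
    title: Option<String>,
    author: Option<String>,
}

/// Hypothetical view model for one search result row.
#[derive(Debug, PartialEq)]
struct ResultHeading {
    title: String,               // always the filename
    subtitle: Option<String>,    // metadata title, when present
    tags: Vec<(String, String)>, // e.g. ("Author", "Jane Doe")
}

/// Treat missing, empty, or whitespace-only values as absent.
fn non_empty(value: &Option<String>) -> Option<String> {
    value
        .as_deref()
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
}

/// Build the heading: filename on top, metadata title underneath,
/// author exposed as a tag when available.
fn build_heading(filename: &str, meta: &DocMetadata) -> ResultHeading {
    let mut tags = Vec::new();
    if let Some(author) = non_empty(&meta.author) {
        tags.push(("Author".to_owned(), author));
    }
    ResultHeading {
        title: filename.to_owned(),
        subtitle: non_empty(&meta.title),
        tags,
    }
}
```

With this shape, a file that has no usable metadata simply renders as before, with the filename alone and no subtitle or extra tag.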

The general idea is to have something reusable for extracting metadata from other file types like .docx, .epub and .mobi, for example.
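One way such a reusable design could look is a small trait with one implementation per file type, selected by file extension. This is only a sketch under assumptions: `MetadataExtractor`, `DocMetadata`, and `extract_metadata` are hypothetical names, and `HeaderLineExtractor` is a toy stand-in that reads `Title:` / `Author:` lines from plain text rather than a real PDF, .docx, or .epub parser.

```rust
use std::path::Path;

/// Hypothetical container for extracted metadata.
#[derive(Debug, Default, Clone, PartialEq)]
struct DocMetadata {
    title: Option<String>,
    author: Option<String>,
}

/// Hypothetical trait: one implementation per supported file type.
trait MetadataExtractor {
    /// Lowercase file extensions this extractor handles.
    fn extensions(&self) -> &[&str];
    /// Extract whatever metadata is available from the raw file bytes.
    fn extract(&self, bytes: &[u8]) -> DocMetadata;
}

/// Toy extractor used only to show the shape of an implementation.
struct HeaderLineExtractor;

impl MetadataExtractor for HeaderLineExtractor {
    fn extensions(&self) -> &[&str] {
        &["txt"]
    }

    fn extract(&self, bytes: &[u8]) -> DocMetadata {
        let text = String::from_utf8_lossy(bytes);
        let mut meta = DocMetadata::default();
        for line in text.lines() {
            if let Some(value) = line.strip_prefix("Title:") {
                meta.title = Some(value.trim().to_owned());
            } else if let Some(value) = line.strip_prefix("Author:") {
                meta.author = Some(value.trim().to_owned());
            }
        }
        meta
    }
}

/// Pick the first extractor that claims the file's extension;
/// fall back to empty metadata when none matches.
fn extract_metadata(
    path: &Path,
    bytes: &[u8],
    extractors: &[Box<dyn MetadataExtractor>],
) -> DocMetadata {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext {
        Some(ext) => extractors
            .iter()
            .find(|x| x.extensions().contains(&ext.as_str()))
            .map(|x| x.extract(bytes))
            .unwrap_or_default(),
        None => DocMetadata::default(),
    }
}
```

Supporting another format would then mean adding one more `MetadataExtractor` implementation to the list, without touching the display code.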

@a5huynh
Collaborator

a5huynh commented May 9, 2023

Thanks @dinhani! I'll take a look tomorrow and merge this in if all looks good 😄. We'll try to get a release out by the end of the week with the updates!

@a5huynh
Collaborator

a5huynh commented May 9, 2023

Everything looks good, @dinhani. Merging this in!

@a5huynh a5huynh merged commit 867936e into spyglass-search:main May 9, 2023