This repository has been archived by the owner on Jun 15, 2023. It is now read-only.

Detect metadata from Arxiv Documents #52

Open
dufferzafar opened this issue Nov 16, 2021 · 0 comments

Comments

@dufferzafar

PDFs from arXiv don't carry title, author, etc. in their embedded metadata; pdfx only reports the fields written by pdfTeX:

➜ pdfx https://arxiv.org/pdf/1911.02782.pdf
Document infos:
- CreationDate = D:20200708010812Z
- Creator = LaTeX with hyperref package
- ModDate = D:20200708010812Z
- PTEX.Fullbanner = This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) kpathsea version 6.2.2
- Pages = 15
- Producer = pdfTeX-1.40.17
- Trapped = False

References: 77
- URL: 71
- ARXIV: 4
- PDF: 2

PDF References:
- http://www.lrec-conf.org/proceedings/lrec2008/pdf/445_paper.pdf
- http://ceur-ws.org/Vol-2345/paper2.pdf

Perhaps we could use arxiv.py to query arXiv and fill in the missing metadata?
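
A rough sketch of what that lookup might look like. To stay dependency-free it talks to arXiv's public Atom API directly with the standard library instead of going through arxiv.py (which wraps the same API). The helper names (`arxiv_id_from_url`, `parse_atom`, `fetch_arxiv_metadata`) and the returned keys are hypothetical and not part of pdfx, and the regex only covers new-style IDs such as `1911.02782`.

```python
import re
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
API_URL = "http://export.arxiv.org/api/query"

# New-style arXiv identifiers only, e.g. 1911.02782 or 1911.02782v2.
# Old-style IDs (e.g. hep-th/9901001) are not handled in this sketch.
_ID_RE = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def arxiv_id_from_url(url):
    """Return the first arXiv identifier found in a URL or filename, or None."""
    match = _ID_RE.search(url)
    return match.group(0) if match else None


def parse_atom(xml_data):
    """Extract title, authors and publication date from an Atom feed (str or bytes)."""
    root = ET.fromstring(xml_data)
    entry = root.find(ATOM + "entry")
    if entry is None:
        return None
    title = entry.findtext(ATOM + "title", default="")
    authors = [
        author.findtext(ATOM + "name", default="")
        for author in entry.findall(ATOM + "author")
    ]
    return {
        # Titles in the feed may be wrapped across lines; collapse whitespace.
        "Title": " ".join(title.split()),
        "Author": ", ".join(authors),
        "Published": entry.findtext(ATOM + "published", default=""),
    }


def fetch_arxiv_metadata(arxiv_id):
    """Query the arXiv API for one identifier (needs network access)."""
    query = urllib.parse.urlencode({"id_list": arxiv_id})
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=10) as resp:
        return parse_atom(resp.read())
```

One possible use: when the PDF's own info dict has no Title or Author, run `arxiv_id_from_url` on the input URL and, if it finds an ID, merge the result of `fetch_arxiv_metadata` into the "Document infos" output.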
