Add document hashes. #221

Merged: 1 commit into main from feat/track-doc-hashes, May 30, 2022

Conversation

@J08nY (Member) commented May 30, 2022

Fixes #220.

@adamjanovsky (Collaborator) commented:

Not sure if it's wise to also add txt document hashes, since I don't know how deterministic the conversion tools are. But we can try it and see, I guess.
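
One way to probe the determinism concern above is to convert the same input several times and compare the digests of the output. This is only a minimal sketch: `convert` is a hypothetical callable standing in for whatever PDF-to-text step is used, and `is_deterministic` and `fake_convert` are illustrative names that do not come from this PR.

```python
import hashlib
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_deterministic(
    convert: Callable[[bytes], str], pdf_bytes: bytes, runs: int = 3
) -> bool:
    """Run the (hypothetical) converter several times on the same input
    and report whether every run produced byte-identical text."""
    digests = {
        sha256_hex(convert(pdf_bytes).encode("utf-8")) for _ in range(runs)
    }
    return len(digests) == 1


def fake_convert(pdf_bytes: bytes) -> str:
    """Stand-in converter for illustration only; a real check would
    call the actual conversion tool here."""
    return pdf_bytes.decode("latin-1")


if __name__ == "__main__":
    print(is_deterministic(fake_convert, b"%PDF-1.7 example bytes"))
```

A check like this only covers repeated runs on one machine with one tool version; differences across tool versions or platforms would need to be tested separately.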

@adamjanovsky merged commit 191997b into main on May 30, 2022
@adamjanovsky deleted the feat/track-doc-hashes branch on May 30, 2022 12:00
@J08nY (Member, Author) commented May 30, 2022

So with the txt hashes, that is exactly what I wanted to capture. We already have several heuristics in the tool that could change without the underlying data changing, and pdftotext is one of them, so this way we at least know when we change stuff.
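
The reasoning in this comment can be sketched as follows: storing one hash for the source PDF and one for the converted text makes it possible to tell a change in the underlying data apart from a change in the conversion step. This is a sketch under assumptions, not the code merged in this PR; `DocHashes`, `hash_document`, and `classify_change` are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DocHashes:
    """Hashes recorded for one document (hypothetical structure)."""
    pdf_hash: str  # digest of the original PDF bytes
    txt_hash: str  # digest of the text produced by the conversion step


def hash_document(pdf_bytes: bytes, txt: str) -> DocHashes:
    """Hash both the source PDF and its converted text."""
    return DocHashes(
        pdf_hash=hashlib.sha256(pdf_bytes).hexdigest(),
        txt_hash=hashlib.sha256(txt.encode("utf-8")).hexdigest(),
    )


def classify_change(old: DocHashes, new: DocHashes) -> str:
    """Compare two snapshots of the same document."""
    if old.pdf_hash != new.pdf_hash:
        # The underlying data itself changed.
        return "source changed"
    if old.txt_hash != new.txt_hash:
        # Same PDF, different text: the conversion step behaved differently.
        return "conversion changed"
    return "unchanged"


if __name__ == "__main__":
    before = hash_document(b"same pdf", "text from an older converter")
    after = hash_document(b"same pdf", "text from a newer converter")
    print(classify_change(before, after))
```

With only the PDF hash recorded, the "conversion changed" case would be invisible, which is the case the comment wants to surface.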

Development

Successfully merging this pull request may close these issues.

Track document hashes