Pdffulltext #1103
From those logs
Presumably it needs the full path to the binary (we seem to do similar things for pdf2htmlex and pdftk)
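A minimal sketch of what passing a full binary path could look like. The helper name `buildPdfToTextCommand` and its return shape are hypothetical and not taken from this PR; the only thing assumed about poppler's `pdftotext` is its `-f` / `-l` (first page / last page) options.

```javascript
const path = require('path');

// Hypothetical helper: builds the command and arguments needed to extract
// the text of a single page with poppler's pdftotext.
function buildPdfToTextCommand(binary, pdfPath, outputDir, page) {
  // Refuse a bare command name so the caller has to supply the full path.
  if (!path.isAbsolute(binary)) {
    throw new Error('Expected an absolute path to pdftotext, got: ' + binary);
  }
  // Output file follows the page.x.txt naming used in this PR.
  const output = path.join(outputDir, 'page.' + page + '.txt');
  return {
    command: binary,
    args: ['-f', String(page), '-l', String(page), pdfPath, output],
    output: output
  };
}
```

The returned `command` and `args` could then be handed to something like `child_process.execFile`, which avoids building a shell string by hand.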
Changes Unknown when pulling 9048ba4 on pdffulltext into master.
Changes Unknown when pulling e23fa44 on pdffulltext into master.
Install poppler-utils on travis ci
We might need to play with scoring/boosting of documents. The cutoff point is currently set at
Discussing search analysis requirements sounds remarkably similar to a weather forecast :)
@@ -525,6 +525,13 @@ describe('Preview processor', function() {
assert.ok(_.find(previews.files, function(file) { return file.filename === 'page.1.html'; }));
assert.ok(!_.find(previews.files, function(file) { return file.filename === 'page.2.html'; }));

// The PDF has 1 page, there should only be 1 corresponding txt file
Shouldn't we have a test for a multi-page PDF where we verify that multiple page.x.txt files are generated?
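A multi-page check along these lines could be sketched as follows. `missingPageTextFiles` is a hypothetical helper, and the shape of `previews.files` (objects with a `filename` property) is assumed from the assertions in the diff above.

```javascript
// Hypothetical helper: returns the expected page.x.txt filenames that are
// absent from a list of preview files for a PDF with `pageCount` pages.
function missingPageTextFiles(files, pageCount) {
  const present = new Set(files.map(function(file) { return file.filename; }));
  const missing = [];
  for (let page = 1; page <= pageCount; page++) {
    const expected = 'page.' + page + '.txt';
    if (!present.has(expected)) {
      missing.push(expected);
    }
  }
  return missing;
}
```

In a test for a 3-page PDF, `assert.deepStrictEqual(missingPageTextFiles(previews.files, 3), [])` would then verify that there is one txt file per page, and a failure would list exactly which pages are missing.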
Requesting a pull of this from the oaeproject repo so that Travis will upload the logs when it fails