Fix support for TIFF documents in Reader/eFolder #14193
Comments
The Caseflow efolder app has a TIFF-to-PDF converter already. Maybe we just need to expose that as an API? Earlier thought by @enriquemanuel and I was making that into a lambda so it was not married to either app.
That's good news, although it does raise the question why Reader was getting TIFFs when Perhaps this ticket should be reincarnated as a straight-up bug report in the efolder repo?
I believe that's because Reader does not read from eFolder Express. It reads directly from VBMS efolder.
For the document that I was investigating in the Bat Team thread above, Reader was hitting eFolder Express's
Researching related efolder INC we discovered imagemagick error example:
We are wondering whether the TIFF-to-PDF service is broken. Investigation in https://dsva.slack.com/archives/CAM9FJ85P/p1589485314027300
Update: yes, the TIFF service in EE was broken. Still not the case that Caseflow Reader uses it, afaik.
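For anyone trying to reproduce the ImageMagick failure outside the app, here is a minimal sketch of a TIFF-to-PDF conversion that shells out to ImageMagick and surfaces its stderr. It is not the eFolder `ImageConverterService` implementation, whose internals are not shown in this thread. The method name `convert_tiff_to_pdf`, the `ConversionError` class, and the injectable `runner` argument are all hypothetical, and the sketch assumes ImageMagick's `convert` binary is on the PATH (ImageMagick 7 also ships it as `magick`).

```ruby
require "open3"

# Hypothetical error type that carries ImageMagick's stderr output.
class ConversionError < StandardError; end

# Sketch: convert a TIFF on disk to a PDF by shelling out to ImageMagick.
# `runner` is injectable so the logic can be exercised without ImageMagick
# installed; by default it is Open3.capture3, which returns
# [stdout, stderr, process_status].
def convert_tiff_to_pdf(input_path, output_path, runner: Open3.method(:capture3))
  _stdout, stderr, status = runner.call("convert", input_path, output_path)
  raise ConversionError, "ImageMagick failed: #{stderr.strip}" unless status.success?

  output_path
end
```

Raising with the captured stderr keeps the ImageMagick message visible in logs, which is the piece of information that was needed when the service broke.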
Support ticket came up about this issue again: https://dsva.slack.com/archives/CHX8FMP28/p1590085212373400
I have a sequence of actions that enable Reader to display TIFF as PDF (almost all the time) based on this code:
# Refresh Reader at https://appeals.cf.ds.va.gov/reader/appeal/4025589/documents/13466417
# Get "Unable to load document" error
# In Certification console
doc=Document.find(13466417)
vbms_doc_id=doc.vbms_document_id
RequestStore.store[:application]="reader"
doc.content_url
=> "https://efolder.cf.ds.va.gov/api/v2/records/7F45E2D6-6060-46F3-AFAA-041D666694AF"
# Go to that doc.content_url in the browser, and it downloads the file as a TIFF
# doc.content_url is used by Reader's PDF.js
# In eFolder Express console
vbms_doc_id="{7F45E2D6-6060-46F3-AFAA-041D666694AF}"
record=Record.find_by(version_id: vbms_doc_id)
# check if conversion worked in the past
record.conversion_status
content=record.service.v2_fetch_document_file(record)
content=ImageConverterService.new(image: content, record: record).process
# If "conversion_success", then store file in S3 for Reader to retrieve.
S3Service.store_file(record.s3_filename, content) if record.conversion_status=="conversion_success"
# I don't know why this is necessary:
# Refresh the browser at doc.content_url; browser downloads file as a PDF
# Refresh Reader and it shows the PDF
Note: the download button (near the top-right corner) within Reader may still download the file as TIFF.
To do a mass conversion, may want to query for:
record.manifest_source.records.count
record.manifest_source.records.where(mime_type: "image/tiff", conversion_status: "not_converted").count
retry_records = record.manifest_source.records.where(mime_type: "image/tiff", conversion_status: "not_converted")
# Use each rather than map: the loop is run for its side effects, not its return value
retry_records.each do |rec|
  content = rec.service.v2_fetch_document_file(rec)
  content = ImageConverterService.new(image: content, record: rec).process
  S3Service.store_file(rec.s3_filename, content) if rec.conversion_status == "conversion_success"
end
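For a larger batch, one failing record should probably not abort the whole run. A minimal sketch of that idea follows, assuming the per-record work is passed in as a callable that returns the record's conversion status string. The `convert_in_batch` helper and its `convert:` interface are hypothetical and not part of eFolder Express.

```ruby
# Sketch: run a conversion step over many records, collecting failures
# instead of aborting on the first one.
# `convert` is any callable that takes a record and returns a status string
# such as "conversion_success" (hypothetical interface).
def convert_in_batch(records, convert:)
  summary = { converted: [], failed: [] }
  records.each do |record|
    begin
      status = convert.call(record)
      bucket = status == "conversion_success" ? :converted : :failed
      summary[bucket] << record
    rescue StandardError => e
      # Log and keep going so one bad TIFF does not stop the batch.
      warn "conversion raised for #{record.inspect}: #{e.message}"
      summary[:failed] << record
    end
  end
  summary
end
```

In the eFolder Express console, the callable could wrap the fetch, convert, and store steps from the loop above and return `record.conversion_status`. The returned summary then shows which records still need attention.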
@yoomlam that's good stuff. I would suggest working it into this https://github.com/department-of-veterans-affairs/appeals-deployment/issues/2718
Some more info as I'm digging into a related ticket #14298.
So if the document is not in S3 and comes from VVA, then Reader won't be able to show it. A RetrieveDocumentsForReaderJob caches documents in S3.
When developing a solution, we should also consider that these S3 documents are auto-deleted after 5 days -- Slack convo.
Description
TIFF documents are sometimes included in the Reader documents for a case, and cannot currently be displayed by Reader's PDF viewer, PDF.js. See technical notes for some thoughts on implementation.
Background/context/resources
Bat Team thread
Related ticket on "corrupted" files in eFolder: #10504
Technical notes
There is some internet literature on getting PDF.js to display TIFF images embedded within the PDF, but displaying plain TIFF images (mimetype image/tiff) is out of scope for PDF.js and will likely never be implemented.
Rather than detecting the file type in Reader and using yet another frontend library to display TIFFs, a more straightforward approach may be to implement a TIFF-to-PDF converter in eFolder.
Because TIFF-to-PDF conversion is likely to be fraught and full of exotic edge cases, here is the output of /usr/bin/file on the header of one such problematic TIFF file seen in Reader:
TIFF image data, little-endian, direntries=19, height=3367, bps=1, compression=bi-level group 4, PhotometricIntepretation=WhiteIsZero, orientation=upper-left, width=2541
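If the served content ever needs to be checked rather than trusting the stored mime type, the leading bytes are enough to tell TIFF from PDF. The sketch below relies on the standard signatures: a TIFF starts with `II*\0` (little-endian, as in the `file` output above) or `MM\0*` (big-endian), and a PDF starts with `%PDF-`. The `sniff_document_type` helper is hypothetical, not an existing Caseflow or eFolder method.

```ruby
# Sketch: classify a document by its leading "magic" bytes.
TIFF_LITTLE_ENDIAN = "II*\x00".b # bytes 49 49 2A 00
TIFF_BIG_ENDIAN    = "MM\x00*".b # bytes 4D 4D 00 2A
PDF_MAGIC          = "%PDF-".b

# Returns :pdf, :tiff, or :unknown for the given byte string.
def sniff_document_type(bytes)
  header = bytes.to_s.b # treat the input as raw bytes; nil becomes ""
  return :pdf if header.start_with?(PDF_MAGIC)
  return :tiff if header.start_with?(TIFF_LITTLE_ENDIAN, TIFF_BIG_ENDIAN)

  :unknown
end
```

For a file on disk, `sniff_document_type(File.binread(path, 8))` reads only the first eight bytes, which is enough for both signatures.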