jpeg vs progressive jpeg #4
Comments
24bit total = 8 bit per channel. Is the LZW compression available for it? (it's not super clear in your comment) For remarkable artefacts, I still think we could use 48bit = 16 bit per channel, but that can be considered optional. 24 bit per channel is quite excessive. If LZW and Zip are not available, then none. I suppose lossless jpeg2000 is not unreasonable if there's no other option... |
8 bits and 24 bits are two different options, so that must be 24 per channel. 16 isn't on the menu. LZW is not available for 24 bits. |
ok, I suppose 8bit = one channel and 24 bit = three channels then. What are the options available outside of tiff? (also, to answer one of your questions, progressive is better for large images but it's really a detail) |
@jeehuajian will post some screenshots of the settings menus tomorrow. The progressive JPEG is nearly half the size of the regular JPEG and it looks visually clearer, so if there isn't any issue with progressive we might want to go for it. |
ok thanks! half the size raises eyebrows... it's supposed to be only slightly smaller... what's the problem with producing uncompressed tiffs that can then be zip-compressed with xnview?
Before you set a standard, run the XnView results through audittool. It
can’t read everything, and I’ve seen NT create files audittool can’t read.
If you can figure out how to duplicate that problem, please let me know.
|
This is a scoping question which may be too late: If you are using scanners, does this mean you are scanning printed material? If so, why not retain the old standard of binary TIFF with LZW compression for the vast majority of the pages, and keep archival quality TIFF for the front and back material, and any other color illustrations? In a 100 page book, it really doesn’t matter if 5 or 10 pages are uncompressed. |
I agree, for generally black and white stuff, gray tiff in lzw is the best option. I think we're missing too much information to give a reasonable answer... what problem are we trying to solve? what's the context? what are the limitations of the users, of their machines, of the software, etc.? |
Scanners are used for both modern prints and pechas. They are used for everything as long as the paper isn't cardboard style. The staff on the ground has cameras but prefers Fujitsu scanners by far. For black and white material they used to scan straight to G4, which we then replaced by j2k as a single lossless color format for all archive images. Web images were then derived into G4 or JPEG depending on the content. The problem we're trying to solve now is deciding what we replace j2k with. We could go back to 3 formats/compression for color, grayscale and BW. This matters since the scanner and software tutorials will cover the scanning-time settings. The constraints are simplicity, file size, and processing time. |
So they discriminate between three cases (color, gray and bitonal), that's interesting... is lzw available in tiff for bitonal or just G4? Would something like j2k for color and lzw for gray and bitonal be simple enough? |
|
just to be sure, can you send me a j2k that the scanner produces? I want to check if it's lossless or if they encode it in a lossy way... thanks! |
After a few days of intense testing, the final decision for images produced with Fujitsu scanners is:
Resizing is based on ། size (for OCR, the minimum char height is 20 pixels and the optimal is 40 pixels):
། heights are measured on the archive images and they inform the resizing % ratio:
|
Elie's intuition for uncompressed tifs, and everyone's lack of enthusiasm
for j2k wins!
Please refer to:
#4 (comment)
Using a unique scan-time format avoids a lot of issues so it should
definitely be kept.
Jim you might be happy to learn that we now only have two variables for the
web image derivation:
- a list of images that need to be kept in color (all; a subset like cover
and illustrations; none)
- the ། height expressed in pixels
This should allow the audit tool to generate the derivatives from the
scanner output. The ། height represents the mean char height and can be
replaced by any other frequent character for other languages.
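
For what it's worth, here is a minimal sketch of how those two variables could drive the derivation. It is only an illustration, not the audit tool's actual code: the names `resize_ratio`, `plan_derivatives` and `DerivationPlan` are made up, and the 40 pixel default target is an assumption taken from the "optimal" OCR height mentioned earlier in the thread.

```python
from dataclasses import dataclass
from typing import Collection, List, Sequence


@dataclass(frozen=True)
class DerivationPlan:
    filename: str
    keep_color: bool  # True: derive a color web image; False: derive a non-color one
    scale: float      # resize ratio to apply to the archive image


def resize_ratio(measured_height_px: float, target_height_px: float) -> float:
    """Ratio that brings the measured char height to the target height."""
    if measured_height_px <= 0 or target_height_px <= 0:
        raise ValueError("heights must be positive")
    return target_height_px / measured_height_px


def plan_derivatives(
    filenames: Sequence[str],
    color_filenames: Collection[str],  # all names, a subset, or empty for none
    measured_height_px: float,         # mean char height measured on the archive images
    target_height_px: float = 40.0,    # assumption: aim for the "optimal" OCR height
) -> List[DerivationPlan]:
    ratio = resize_ratio(measured_height_px, target_height_px)
    color = set(color_filenames)
    return [DerivationPlan(name, name in color, ratio) for name in filenames]
```

For example, with a measured height of 80 pixels every image gets a ratio of 0.5, and only the names passed in `color_filenames` are flagged to stay in color.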
Thanks for your input!
NT
…On Wed, Mar 4, 2020 at 9:30 PM Elie Roux ***@***.***> wrote:
After exploring many possibilities, it seems the only one that is:
- usable with the Fujitsu scanner
- usable by people in the field
- best quality
is color jpeg2000 for archives (then usual stuff for web)
|
An easy way to make a derivation script would be to ask field staff to add a suffix to images that need to be converted to color, something like (I know, we should have figured this out 10 years ago) |
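
A rough sketch of what such a suffix check could look like in a derivation script. The `_color` suffix and the filenames are purely hypothetical placeholders (the thread does not settle on a convention), and `needs_color` is an illustrative name, not an existing function.

```python
from pathlib import Path

# Hypothetical suffix: the actual convention is not decided in this thread.
COLOR_SUFFIX = "_color"


def needs_color(filename: str, suffix: str = COLOR_SUFFIX) -> bool:
    """True when the name (without extension) ends with the color suffix."""
    return Path(filename).stem.endswith(suffix)


def split_by_suffix(filenames, suffix: str = COLOR_SUFFIX):
    """Split filenames into (color, other) lists, preserving order."""
    color = [f for f in filenames if needs_color(f, suffix)]
    other = [f for f in filenames if not needs_color(f, suffix)]
    return color, other
```

The color list produced this way could then feed the "images that need to be kept in color" variable directly.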
@eroux and @TBRC-JimK, the scanners we use in China don't have Zip compression for Tiff, while LZW is only available for 8 bits.
I believe we want to scan in Tiff 24 bits rather than 8 bits (16 is a no-no), or do we?
TIFF 24 bits presents the following compression options, which one do we prefer:
We did analyse the pros and cons of various options and J2K seemed to be the best option at the time, check the documentation here.