JBIG2Decoder implementation #67
Merged
Changes from 37 commits
46 commits
1e673d4 (kucjac) Prepared skeleton and basic component implementations for the jbig2 e…
0a410f2 (kucjac) Added Bitset. Implemented Bitmap.
fe58978 (kucjac) Decoder with old Arithmetic Decoder
8d1d930 (kucjac) Partly working arithmetic
db0167b (kucjac) Working arithmetic decoder.
f9c1e12 (kucjac) MMR patched.
25b6aa5 (kucjac) rebuild to apache.
0a60b21 (kucjac) Working generic
ea22544 (kucjac) Working generic
ec90da4 (kucjac) Decoded full document
70bad1e (gunnsth) Update Jenkinsfile go version [master] (#398)
32d5dd0 (kucjac) Decoded AnnexH document
45b2403 (kucjac) Minor issues fixed.
b9c451c (kucjac) Merge branch 'master' into jbig2
2658add (gunnsth) Update README.md
f893ad5 (kucjac) Fixed generic region errors. Added benchmark. Added bitmap unpadder. …
33af779 (kucjac) Fixed endofpage error
73695ba (kucjac) Added integration test.
17cb288 (kucjac) Decoded all test files without errors. Implemented JBIG2Global.
427789b (kucjac) Merge branch 'v3' of https://github.com/unidoc/unidoc into v3
211f1b9 (kucjac) Merged V3 with JBIG2 fork
7f7d869 (kucjac) Merged with v3 version
01c8132 (kucjac) Fixed the EOF in the globals issue
3991842 (kucjac) Fixed the JBIG2 ChocolateData Decode
da99936 (kucjac) JBIG2 Added license information
2365cfc (kucjac) Minor fix in jbig2 encoding.
bc67c01 (kucjac) Applied the logging convention
d58c0ab (kucjac) Cleaned unnecessary imports
b1abde7 (kucjac) Go modules clear unused imports
59ba5d1 (kucjac) checked out the README.md
60ae357 (kucjac) Moved trace to Debug. Fixed the build integrate tag in the document_d…
0e9620c (kucjac) Merged with the unipdf/v3
190a1fd (kucjac) Applied UniPDF Developer Guide. Fixed lint issues.
39a38e3 (kucjac) Cleared documentation, fixed style issues.
dcc309b (kucjac) Added jbig2 doc.go files. Applied unipdf guide style.
df74d0c (kucjac) Minor code style changes.
ffffa40 (kucjac) Minor naming and style issues fixes.
a40a01f (kucjac) Minor naming changes. Style issues fixed.
64dcdef (kucjac) Review r11 fixes.
3e1d818 (gunnsth) Integrate jbig2 tests with build system
16d0846 (gunnsth) Merge remote-tracking branch 'upstream/development' into kucjac-v3-jbig2
cc2e03a (kucjac) Added jbig2 integration test golden files.
50f39aa (kucjac) Minor jbig2 integration test fix
c006fea (kucjac) Removed jbig2 integration image assertions
5a56d12 (kucjac) Fixed jbig2 rowstride issue. Implemented jbig2 bit writer
91e78a1 (kucjac) Changed golden files logic. Fixes r13 issues.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,6 +35,7 @@ import ( | |
|
||
"github.com/unidoc/unipdf/v3/common" | ||
"github.com/unidoc/unipdf/v3/internal/ccittfax" | ||
"github.com/unidoc/unipdf/v3/internal/jbig2" | ||
) | ||
|
||
// Stream encoding filter names. | ||
|
@@ -1976,49 +1977,197 @@ func (enc *CCITTFaxEncoder) EncodeBytes(data []byte) ([]byte, error) { | |
return encoder.Encode(pixels), nil | ||
} | ||
|
||
// JBIG2Encoder implements JBIG2 encoder/decoder (dummy, for now) | ||
// FIXME: implement | ||
type JBIG2Encoder struct{} | ||
const ( | ||
jbig2Globals = "JBIG2Globals" | ||
) | ||
|
||
// JBIG2Encoder is the jbig2 image encoder (WIP)/decoder. | ||
type JBIG2Encoder struct { | ||
// Globals are the JBIG2 global segments. | ||
Globals jbig2.Globals | ||
|
||
// IsChocolateData reports whether the data is encoded such that | ||
// a binary '1' means black and a '0' means white. | ||
// Otherwise the data is called vanilla. | ||
// Naming convention taken from: 'https://en.wikipedia.org/wiki/Binary_image#Interpretation' | ||
IsChocolateData bool | ||
} | ||
|
||
// NewJBIG2Encoder returns a new instance of JBIG2Encoder. | ||
func NewJBIG2Encoder() *JBIG2Encoder { | ||
return &JBIG2Encoder{} | ||
} | ||
|
||
// setChocolateData sets the chocolate data flag when the pdf stream object contains the 'Decode' object. | ||
// Decode object ( PDF32000:2008 7.10.2 Type 0 (Sampled) Functions). | ||
// NOTE: this function is a temporary helper until the samples handle Decode function. | ||
func (enc *JBIG2Encoder) setChocolateData(decode PdfObject) { | ||
arr, ok := decode.(*PdfObjectArray) | ||
if !ok { | ||
common.Log.Debug("JBIG2Encoder - Decode is not an array. %T", decode) | ||
return | ||
} | ||
|
||
// (PDF32000:2008 Table 39) The array should be of 2 x n size. | ||
// For binary images n stands for 1bit, thus the array should contain 2 numbers. | ||
floatSlice, err := arr.GetAsFloat64Slice() | ||
if err != nil { | ||
// check if the arr is an array of integers. | ||
iArr, err := arr.ToIntegerArray() | ||
if err != nil { | ||
common.Log.Debug("JBIG2Encoder unsupported Decode value. %s", arr.String()) | ||
return | ||
} | ||
if len(iArr) != 2 { | ||
return | ||
} | ||
if iArr[0] == 1 && iArr[1] == 0 { | ||
enc.IsChocolateData = true | ||
} else if iArr[0] == 0 && iArr[1] == 1 { | ||
enc.IsChocolateData = false | ||
} else { | ||
common.Log.Debug("JBIG2Encoder unsupported Decode value: %s", arr.String()) | ||
} | ||
return | ||
} | ||
|
||
if len(floatSlice) != 2 { | ||
return | ||
} | ||
|
||
if floatSlice[0] == 1.0 && floatSlice[1] == 0.0 { | ||
enc.IsChocolateData = true | ||
} else if floatSlice[0] == 0.0 && floatSlice[1] == 1.0 { | ||
enc.IsChocolateData = false | ||
} else { | ||
common.Log.Debug("JBIG2Encoder unsupported DecodeParams->Decode value: %s", arr.String()) | ||
} | ||
|
||
} | ||
|
||
func newJBIG2EncoderFromStream(streamObj *PdfObjectStream, decodeParams *PdfObjectDictionary) (*JBIG2Encoder, error) { | ||
encoder := NewJBIG2Encoder() | ||
encDict := streamObj.PdfObjectDictionary | ||
if encDict == nil { | ||
// No encoding dictionary. | ||
return encoder, nil | ||
} | ||
|
||
// If decodeParams not provided, see if we can get from the stream. | ||
if decodeParams == nil { | ||
obj := encDict.Get("DecodeParms") | ||
if obj != nil { | ||
switch t := obj.(type) { | ||
case *PdfObjectDictionary: | ||
decodeParams = t | ||
break | ||
case *PdfObjectArray: | ||
if t.Len() == 1 { | ||
if dp, ok := GetDict(t.Get(0)); ok { | ||
decodeParams = dp | ||
} | ||
} | ||
default: | ||
common.Log.Error("DecodeParams not a dictionary %#v", obj) | ||
return nil, errors.New("invalid DecodeParms") | ||
} | ||
} | ||
} | ||
|
||
if decodeParams != nil { | ||
if globals := decodeParams.Get("JBIG2Globals"); globals != nil { | ||
Should the |
||
globalsStream, ok := globals.(*PdfObjectStream) | ||
if !ok { | ||
err := errors.New("the Globals stream should be an Object Stream") | ||
common.Log.Debug("ERROR: %s", err.Error()) | ||
return nil, err | ||
} | ||
|
||
gdoc, err := jbig2.NewDocument(globalsStream.Stream) | ||
if err != nil { | ||
err = fmt.Errorf("decoding global stream failed. %s", err.Error()) | ||
common.Log.Debug("ERROR: %s", err) | ||
return nil, err | ||
} | ||
|
||
encoder.Globals = gdoc.GlobalSegments | ||
} | ||
} | ||
|
||
// Inverse the bits on the 'Decode [1.0 0.0]' function (PDF32000:2008 7.10.2) | ||
if decode := streamObj.Get("Decode"); decode != nil { | ||
encoder.setChocolateData(decode) | ||
} | ||
|
||
return encoder, nil | ||
} | ||
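The `DecodeParms` lookup above accepts either a dictionary or a one-element array wrapping a dictionary, and rejects any other type. A minimal sketch of that resolution step in isolation follows. The `dict` and `array` types and the `resolveDecodeParms` helper are hypothetical stand-ins for illustration, not unipdf's `PdfObjectDictionary`, `PdfObjectArray`, or any function in this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// dict and array are simplified stand-ins (assumptions) for the PDF
// dictionary and array object types used in the diff.
type dict map[string]interface{}
type array []interface{}

// resolveDecodeParms is a hypothetical helper. It mirrors the switch in
// the diff: a dictionary is used directly, a one-element array is
// unwrapped if it holds a dictionary, and any other type is an error.
// A missing entry, or an array of another shape, yields (nil, nil).
func resolveDecodeParms(obj interface{}) (dict, error) {
	switch t := obj.(type) {
	case nil:
		return nil, nil
	case dict:
		return t, nil
	case array:
		if len(t) == 1 {
			if d, ok := t[0].(dict); ok {
				return d, nil
			}
		}
		return nil, nil
	default:
		return nil, errors.New("invalid DecodeParms")
	}
}

func main() {
	inputs := []interface{}{
		dict{"JBIG2Globals": "globals stream"},
		array{dict{"JBIG2Globals": "globals stream"}},
		"not a dictionary",
	}
	for _, in := range inputs {
		params, err := resolveDecodeParms(in)
		fmt.Println(params, err)
	}
}
```

Like the diff, the sketch silently ignores arrays with more than one element; whether that case should be an error instead is a design question the diff itself does not settle.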
|
||
// GetFilterName returns the name of the encoding filter. | ||
func (enc *JBIG2Encoder) GetFilterName() string { | ||
return StreamEncodingFilterNameJBIG2 | ||
} | ||
|
||
// MakeDecodeParams makes a new instance of an encoding dictionary based on | ||
// the current encoder settings. | ||
// MakeDecodeParams makes a new instance of an encoding dictionary based on the current encoder settings. | ||
func (enc *JBIG2Encoder) MakeDecodeParams() PdfObject { | ||
return nil | ||
return MakeDict() | ||
} | ||
|
||
// MakeStreamDict makes a new instance of an encoding dictionary for a stream object. | ||
func (enc *JBIG2Encoder) MakeStreamDict() *PdfObjectDictionary { | ||
return MakeDict() | ||
dict := MakeDict() | ||
if enc.IsChocolateData { | ||
// /Decode[1.0 0.0] - see note in the 'setChocolateData' method. | ||
dict.Set("Decode", MakeArray(MakeFloat(1.0), MakeFloat(0.0))) | ||
} | ||
dict.Set("Filter", MakeName(enc.GetFilterName())) | ||
|
||
return dict | ||
} | ||
|
||
// UpdateParams updates the parameter values of the encoder. | ||
func (enc *JBIG2Encoder) UpdateParams(params *PdfObjectDictionary) { | ||
if decode := params.Get("Decode"); decode != nil { | ||
enc.setChocolateData(decode) | ||
} | ||
} | ||
|
||
// DecodeBytes decodes a slice of JBIG2 encoded bytes and returns the result. | ||
// DecodeBytes decodes a slice of JBIG2 encoded bytes and returns the results. | ||
func (enc *JBIG2Encoder) DecodeBytes(encoded []byte) ([]byte, error) { | ||
common.Log.Debug("Error: Attempting to use unsupported encoding %s", enc.GetFilterName()) | ||
return encoded, ErrNoJBIG2Decode | ||
// create new JBIG2 document | ||
doc, err := jbig2.NewDocumentWithGlobals(encoded, enc.Globals) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
// the jbig2 PDF document should have only one page, where page numbering | |
// starts from '1'. | |
page, err := doc.GetPage(1) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
if page == nil { | ||
err = errors.New("jbig2 corrupted data. No page#1 found") | ||
common.Log.Debug("ERROR: %s", err.Error()) | ||
return nil, err | ||
} | ||
|
||
// get the page data | ||
bm, err := page.GetBitmap() | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
// check if data IsChocolate | ||
if enc.IsChocolateData { | ||
return bm.GetChocolateData(), nil | ||
} | ||
return bm.GetVanillaData(), nil | ||
} | ||
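`DecodeBytes` returns either the chocolate or the vanilla form of the decoded bitmap. The implementations of `GetChocolateData` and `GetVanillaData` are not shown in this diff. As a rough illustration only, and assuming the two forms are bitwise complements of each other over packed 1-bit-per-pixel rows, the relationship can be sketched as below. The `invertBits` helper is hypothetical.

```go
package main

import "fmt"

// invertBits is a hypothetical helper, not part of unipdf. It returns a
// copy of packed 1-bit-per-pixel data with every bit flipped, turning a
// "1 means black" buffer into a "0 means black" buffer or vice versa.
// Padding bits at the end of each row are flipped too; a real
// implementation may need to mask them.
func invertBits(data []byte) []byte {
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = ^b
	}
	return out
}

func main() {
	row := []byte{0xF0, 0x01}
	for _, b := range invertBits(row) {
		fmt.Printf("%08b ", b)
	}
	fmt.Println()
}
```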
|
||
// DecodeStream decodes a JBIG2 encoded stream and returns the result as a | ||
// slice of bytes. | ||
// DecodeStream decodes a JBIG2 encoded stream and returns the result as a slice of bytes. | ||
func (enc *JBIG2Encoder) DecodeStream(streamObj *PdfObjectStream) ([]byte, error) { | ||
common.Log.Debug("Error: Attempting to use unsupported encoding %s", enc.GetFilterName()) | ||
return streamObj.Stream, ErrNoJBIG2Decode | ||
return enc.DecodeBytes(streamObj.Stream) | ||
} | ||
|
||
// EncodeBytes JBIG2 encodes the passed in slice of bytes. | ||
// EncodeBytes encodes the passed slice of bytes into JBIG2. | |
func (enc *JBIG2Encoder) EncodeBytes(data []byte) ([]byte, error) { | ||
common.Log.Debug("Error: Attempting to use unsupported encoding %s", enc.GetFilterName()) | ||
return data, ErrNoJBIG2Decode | ||
|
`GetAsFloat64Slice` converts integer values to floats so both are covered. It looks like the method could be simplified to something like:
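The snippet that followed this comment is not part of the captured page. As a rough sketch of the idea only, assuming the `Decode` array has already been converted to a `[]float64` (which, per the comment, `GetAsFloat64Slice` does for integers and reals alike), the check reduces to one length test and one comparison. The helper name `chocolateFromDecode` is hypothetical and is not the reviewer's code or part of unipdf.

```go
package main

import "fmt"

// chocolateFromDecode is a hypothetical helper for illustration.
// It interprets a two-element Decode array for a 1-bit image:
// [1 0] means chocolate (true), [0 1] means vanilla (false).
// ok is false for any other input, which the caller would log as
// unsupported.
func chocolateFromDecode(decode []float64) (isChocolate, ok bool) {
	if len(decode) != 2 {
		return false, false
	}
	switch {
	case decode[0] == 1 && decode[1] == 0:
		return true, true
	case decode[0] == 0 && decode[1] == 1:
		return false, true
	default:
		return false, false
	}
}

func main() {
	for _, d := range [][]float64{{1, 0}, {0, 1}, {0.5, 1}, {1}} {
		c, ok := chocolateFromDecode(d)
		fmt.Printf("%v -> chocolate=%v ok=%v\n", d, c, ok)
	}
}
```

With a single conversion path there is no separate integer branch to keep in sync with the float branch.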