Avoid tesseract writing Pix out/reading them back. #2965

Merged
merged 1 commit into from
May 16, 2020
18 changes: 18 additions & 0 deletions src/ccstruct/imagedata.cpp
@@ -207,12 +207,28 @@ bool ImageData::SkipDeSerialize(TFile* fp) {
// In case of missing PNG support in Leptonica use PNM format,
// which requires more memory.
void ImageData::SetPix(Pix* pix) {
#ifdef TESSERACT_IMAGEDATA_AS_PIX
  internal_pix_ = pix;
#else
  SetPixInternal(pix, &image_data_);
#endif
}

// Returns the Pix image for *this. Must be pixDestroyed after use.
Pix* ImageData::GetPix() const {
#ifdef TESSERACT_IMAGEDATA_AS_PIX
#ifdef GRAPHICS_DISABLED
  /* The only callers of this are the scaling functions, which prescale
   * the source. Thus we can just return a new pointer to the same data. */
  return pixClone(internal_pix_);
#else
  /* pixCopy always does an actual copy, so the caller can modify the
   * returned data. */
  return pixCopy(NULL, internal_pix_);
#endif
#else
  return GetPixInternal(image_data_);
#endif
}

// Gets anything and everything with a non-nullptr pointer, prescaled to a
@@ -320,6 +336,7 @@ void ImageData::AddBoxes(const GenericVector<TBOX>& boxes,
  }
}

#ifndef TESSERACT_IMAGEDATA_AS_PIX
// Saves the given Pix as a PNG-encoded string and destroys it.
// In case of missing PNG support in Leptonica use PNM format,
// which requires more memory.
@@ -348,6 +365,7 @@ Pix* ImageData::GetPixInternal(const GenericVector<char>& image_data) {
  }
  return pix;
}
#endif

// Parses the text string as a box file and adds any discovered boxes that
// match the page number. Returns false on error.
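The two branches of GetPix() above differ in ownership semantics: pixClone hands back another handle to the same reference-counted image, while pixCopy(NULL, ...) allocates a separate image. The sketch below illustrates that difference without depending on Leptonica. FakePix, FakeClone, FakeCopy, FakeDestroy and the two check functions are hypothetical stand-ins written for this note, not Leptonica or Tesseract APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reference-counted image such as Pix.
struct FakePix {
  explicit FakePix(std::vector<uint8_t> d) : data(std::move(d)) {}
  std::vector<uint8_t> data;
  int refcount = 1;
};

// Analogous to pixClone: no new pixels, only another reference.
FakePix* FakeClone(FakePix* pix) {
  ++pix->refcount;
  return pix;
}

// Analogous to pixCopy(NULL, pix): a new object with its own pixels.
FakePix* FakeCopy(const FakePix* pix) {
  return new FakePix(pix->data);
}

// Analogous to pixDestroy: drops one reference, frees the object when the
// last reference goes away, and nulls the caller's handle.
void FakeDestroy(FakePix** pix) {
  if (pix == nullptr || *pix == nullptr) return;
  if (--(*pix)->refcount == 0) delete *pix;
  *pix = nullptr;
}

// Returns true if a write through a clone is visible through the original.
bool CloneSharesPixels() {
  std::vector<uint8_t> pixels = {1, 2, 3};
  FakePix* original = new FakePix(pixels);
  FakePix* clone = FakeClone(original);
  clone->data[0] = 99;
  const bool shared = (original->data[0] == 99);
  FakeDestroy(&clone);
  FakeDestroy(&original);
  return shared;
}

// Returns true if a write through a copy leaves the original untouched.
bool CopyIsIndependent() {
  std::vector<uint8_t> pixels = {1, 2, 3};
  FakePix* original = new FakePix(pixels);
  FakePix* copy = FakeCopy(original);
  copy->data[0] = 99;
  const bool independent = (original->data[0] == 1);
  FakeDestroy(&copy);
  FakeDestroy(&original);
  return independent;
}
```

Under these assumptions, the clone branch is only safe when callers treat the returned image as read-only, which is what the comment in the GRAPHICS_DISABLED branch relies on. In both branches the caller still releases its handle after use.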
3 changes: 3 additions & 0 deletions src/ccstruct/imagedata.h
@@ -196,6 +196,9 @@ class ImageData {
 private:
  STRING imagefilename_; // File to read image from.
  int32_t page_number_; // Page number if multi-page tif or -1.
#ifdef TESSERACT_IMAGEDATA_AS_PIX
  Pix *internal_pix_;
#endif
  GenericVector<char> image_data_; // PNG/PNM file data.
  STRING language_; // Language code for image.
  STRING transcription_; // UTF-8 ground truth of image.
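The overall pattern in this change is a compile-time switch between keeping the image in memory and storing it as encoded bytes that are decoded on every read. A minimal sketch of that pattern follows. It assumes a toy encoding in place of PNG/PNM, and every name in it (SKETCH_KEEP_IMAGE_IN_MEMORY, ImageHolder, ToyEncode, ToyDecode, HolderRoundTrips) is hypothetical and not part of Tesseract.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical macro, analogous in spirit to TESSERACT_IMAGEDATA_AS_PIX.
// Remove this line to select the encode/decode path instead.
#define SKETCH_KEEP_IMAGE_IN_MEMORY

using Pixels = std::vector<uint8_t>;

// Toy stand-in for PNG encoding: a one-byte header followed by the pixels.
std::vector<char> ToyEncode(const Pixels& pixels) {
  std::vector<char> out;
  out.push_back('P');
  for (uint8_t p : pixels) out.push_back(static_cast<char>(p));
  return out;
}

// Toy stand-in for PNG decoding: skips the header and restores the pixels.
Pixels ToyDecode(const std::vector<char>& bytes) {
  Pixels pixels;
  for (std::size_t i = 1; i < bytes.size(); ++i) {
    pixels.push_back(static_cast<uint8_t>(bytes[i]));
  }
  return pixels;
}

class ImageHolder {
 public:
  void Set(Pixels pixels) {
#ifdef SKETCH_KEEP_IMAGE_IN_MEMORY
    pixels_ = std::move(pixels);    // No encoding work at all.
#else
    encoded_ = ToyEncode(pixels);   // Pay the encoding cost on every Set.
#endif
  }

  Pixels Get() const {
#ifdef SKETCH_KEEP_IMAGE_IN_MEMORY
    return pixels_;                 // Returns an independent copy.
#else
    return ToyDecode(encoded_);     // Pay the decoding cost on every Get.
#endif
  }

 private:
#ifdef SKETCH_KEEP_IMAGE_IN_MEMORY
  Pixels pixels_;
#endif
  std::vector<char> encoded_;       // Stays declared in both builds.
};

// Returns true if the holder gives back exactly what was stored.
bool HolderRoundTrips() {
  ImageHolder holder;
  holder.Set({10, 20, 30});
  return holder.Get() == Pixels({10, 20, 30});
}
```

Both configurations give callers the same interface, so the switch trades memory held per image against the repeated encode and decode work. As in the header change above, the encoded buffer member stays declared even when the in-memory path is selected.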