Changes as proposed in #14 #15

Merged: 2 commits, Jul 12, 2018

@kba (Collaborator) commented Jul 3, 2018

Makefile Outdated
@@ -3,15 +3,22 @@ export
 SHELL := /bin/bash
 LOCAL := $(PWD)/usr
 PATH := $(LOCAL)/bin:$(PATH)
-TESSDATA = $(LOCAL)/share/tessdata
-LANGDATA = $(PWD)/langdata-$(LANGDATA_VERSION)
+HOME := /home/ubuntu

$HOME is set by default (i.e. HOME := $(HOME)), if you want to override it you can, but I would not set a different default. As you use it, it seems more like you meant PWD or REPO_ROOT?
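
One way to read this suggestion, as a minimal sketch: leave `HOME` alone so make keeps the value it inherits from the environment, and derive the data paths from a separate root variable. `REPO_ROOT` is only the name floated in the comment above; it is not defined anywhere in the PR, and defaulting it to `$(CURDIR)` is an assumption.

```make
# REPO_ROOT is a hypothetical variable, not part of the PR.
# ?= assigns only if the variable is not already defined,
# so a value from the environment or the command line wins.
REPO_ROOT ?= $(CURDIR)

# HOME is not assigned here; make imports it from the environment.
TESSDATA = $(REPO_ROOT)/tessdata_best
```

With this layout, `make REPO_ROOT=/some/other/dir` would relocate the data directory without touching `HOME`.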

Makefile Outdated
-LANGDATA = $(PWD)/langdata-$(LANGDATA_VERSION)
+HOME := /home/ubuntu
+TESSDATA = $(HOME)/tessdata_best
+LANGDATA = $(HOME)/langdata

I find it preferable to have the explicit version of langdata for reproducible results.

Makefile Outdated

# Name of the model to be built
MODEL_NAME = foo

# Name of the model to continue from
CONTINUE_FROM = frk

Default to $(MODEL_NAME)?
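
A sketch of what that default could look like, assuming the conditional assignment operator is acceptable here. Only `MODEL_NAME` and `CONTINUE_FROM` come from the diff; the use of `?=` is an assumption about how the fallback might be written.

```make
# Name of the model to be built
MODEL_NAME = foo

# Name of the model to continue from.
# Assumed form of the suggestion: fall back to MODEL_NAME unless
# the caller supplies another value.
CONTINUE_FROM ?= $(MODEL_NAME)
```

Someone continuing from a different model could then run `make CONTINUE_FROM=frk`, while a plain `make` would reuse `MODEL_NAME`.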

Makefile Outdated

 $(ALL_LSTMF): $(sort $(patsubst %.tif,%.lstmf,$(wildcard $(TRAIN)/*.tif)))
 	find $(TRAIN) -name '*.lstmf' -exec echo {} \; | sort -R -o "$@"

 $(TRAIN)/%.lstmf: $(TRAIN)/%.box
-	tesseract $(TRAIN)/$*.tif $(TRAIN)/$* lstm.train
+	tesseract $(TRAIN)/$*.tif $(TRAIN)/$* --psm 6 lstm.train

Make page segmentation mode configurable?
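
As a rough illustration of the idea, the hard-coded `6` could move into a variable with the same default. The variable name `PSM` is an assumption made for this sketch; the rule itself is taken from the diff above.

```make
# Page segmentation mode passed to tesseract.
# PSM is a hypothetical variable name; 6 matches the value in the diff.
PSM ?= 6

$(TRAIN)/%.lstmf: $(TRAIN)/%.box
	tesseract $(TRAIN)/$*.tif $(TRAIN)/$* --psm $(PSM) lstm.train
```

Running `make PSM=7` would then switch the mode for a single invocation without editing the Makefile.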

@kba mentioned this pull request Jul 3, 2018

@Shreeshrii (Collaborator) commented:

I had changed it for my own testing. Please make the required changes for the generic script.

Some related discussion is at https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/tesseract-ocr/be4-rjvY2tQ/vIF2a0XbCgAJ

@wrznr (Collaborator) left a comment:

Nice. I am excited to test CONTINUE_FROM. Many thanks to @kba and @Shreeshrii.

@wrznr merged commit f1baf17 into master on Jul 12, 2018

@kba (Collaborator, Author) commented Jul 23, 2018

> Hi, this commit changed the gt.txt extension from ".gt.txt" to "-gt.txt". I think it should be set as before.

Thanks for spotting that. It was a mistake and is now fixed in master.
