Method for adding config settings #141
Comments
This would be great, I wish I was able to do something like this: Mebus |
Hi, thank you very much for the proposal. As for integrating this into pytesseract: if I have some free time, I will try to implement the logic for it. The only "problematic" part is where to store this temp config. And btw, we could have this nice Python approach:
config = pytesseract.temp_config(path='<custom_filepath>')
config.set_variables({'key': 'value'})
pytesseract.image_to_string('<image_filepath>', config=config) |
@int3l the above method is giving an error AttributeError: module 'pytesseract' has no attribute 'temp_config'. Any solutions? |
@Debjoy10 it is not implemented yet. This is a feature request. |
@int3l do we have any workaround for doing this for the time being? |
@Raghwendra-Dey please take a look at the README documentation and the example configurations. |
@int3l we were looking around for modifying the variables like editor_image_word_bb_color , editor_word_height , editor_word_width , etc. but no way to do it in pytesseract, though tesseract has its work around... |
I am not very familiar with Tesseract custom config files, but you can add these options to one and then pass the custom config file to pytesseract via the config argument. Maybe you should ask in the Tesseract GitHub issue tracker. Pytesseract is just a thin wrapper around the tesseract executable. |
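For anyone landing here, a minimal sketch of the config-file route. It assumes that pytesseract appends the `config` string to the tesseract command line and that tesseract expects options such as `--psm` before any config file name. The helper names `config_with_file` and `read_text` and the file name `my_tess.config` are made up for illustration; the OCR call itself only works with tesseract and pytesseract installed.

```python
def config_with_file(config_file, options=""):
    """Build a config string: command-line options first, config file last."""
    return f"{options} {config_file}".strip()


def read_text(image_path, config_file, options=""):
    """Run OCR on an image file, loading settings from a Tesseract config file."""
    import pytesseract  # imported lazily; only needed when OCR is actually run

    return pytesseract.image_to_string(
        image_path, config=config_with_file(config_file, options)
    )


# 'my_tess.config' is a hypothetical file; paths containing spaces may need quoting.
example = config_with_file("my_tess.config", "--psm 6")
print(example)
```

pytesseract.image_to_data takes a `config` argument as well, so the same string should be usable there.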
Can you tell me more about the config argument? I have made a config file but finding it difficult to use it in pytesseract. |
@int3l ?? |
Take a look at the Tesseract OCR documentation and example config files. |
But my doubt is, I want to use the config file in pytesseract, does pytesseract provide a way to do that conveniently (inside the code)? |
Apologies for the previous comment. It was mistakenly pasted here. |
@Debjoy10 Sorry, I see what you mean - In that case try to specify the name of the config file as second argument (string) to This function allows a lot more control, but it is not "public", although you can use it. |
So it will be possible to omit it in cases where it's not needed. In general, it will help with #141
Soon it will be possible to import run_and_get_output directly from |
Thanks for the reply. However, what I was wanting to use was pytesseract.image_to_data. Is there a workaround for that too? |
I was able to supply my own config file by using the following: |
Where do you save "words" custom config file ? |
Look mama, no config files!
I was wrestling with config files for some of the settings when I ran across this google group discussion about tesseract using java and it made my mouth water. Here's a code snippet from their discussion:
At first you may think, well that's cool I guess but you can really do the same thing by just defining a long string of configs and calling it whenever you need it. For example,
'--psm 10 --oem 3 -c load_system_dawg=0 -c load_freq_dawg=0 -c load_punc_dawg=0 . . .'
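A small sketch of how such a string could be built from a dict, assuming the tesseract command line wants its own `-c` in front of every `name=value` pair. `build_config_string` is a hypothetical helper, not part of pytesseract.

```python
def build_config_string(psm, oem, variables):
    """Return a config string with one -c flag per Tesseract variable."""
    parts = ["--psm", str(psm), "--oem", str(oem)]
    for name, value in variables.items():
        # Each variable gets its own -c flag in front of name=value.
        parts += ["-c", f"{name}={value}"]
    return " ".join(parts)


config = build_config_string(
    10, 3, {"load_system_dawg": 0, "load_freq_dawg": 0, "load_punc_dawg": 0}
)
print(config)
```

The result can be passed as the `config` argument, subject to the 'init only' limitation discussed here.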
In the tesseract documentation, it mentions that you can't change 'init only' parameters with the tesseract executable option -c. And those 'init only' parameters would include some of the ones I've been messing with. I think that most people would say that it would be nice to be able to set the variables for your config file directly in Python using a set_config_variable method instead of having to go make a config file. Since some of the variables that are being set in the code above are in fact 'init only', the Java guys must be creating a config file from Java code (I did not sniff through their code to verify this, however).
I haven't done it yet because I'm not too familiar with the code inside pytesseract, but right now making a temporary config file and letting it be loadable via a set_config_variable method doesn't seem very hard from my perspective. Here's the high level logic I'm thinking about:
tsr.set_config_variable method, just write the variable, a space, and the value on a new line in the temp.txt file.
Why this would be a good feature:
But maybe it's actually not very easy to implement. Is this actually possible?
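To make the idea concrete, here is a rough sketch of the logic described above. It is not an existing pytesseract API: the `TempConfig` class and its methods are hypothetical names, and it assumes tesseract reads config files as one `name value` pair per line and accepts the file path through pytesseract's `config` argument.

```python
import os
import tempfile


class TempConfig:
    """Collects Tesseract variables and writes them to a temporary config file."""

    def __init__(self):
        self._variables = {}
        self._path = None

    def set_config_variable(self, name, value):
        """Remember a variable; nothing is written until write() is called."""
        self._variables[name] = value

    def write(self):
        """Write one 'name value' line per variable and return the file path."""
        self.cleanup()  # drop any file left over from an earlier write()
        fd, self._path = tempfile.mkstemp(prefix="tess_", suffix=".txt", text=True)
        with os.fdopen(fd, "w") as handle:
            for name, value in self._variables.items():
                handle.write(f"{name} {value}\n")
        return self._path

    def cleanup(self):
        """Delete the temporary file if one exists."""
        if self._path is not None and os.path.exists(self._path):
            os.remove(self._path)
        self._path = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.cleanup()


with TempConfig() as tsr:
    tsr.set_config_variable("load_system_dawg", 0)
    tsr.set_config_variable("load_freq_dawg", 0)
    path = tsr.write()
    with open(path) as handle:
        written = handle.read()
    # With tesseract and pytesseract installed, the OCR call would go here, e.g.
    # pytesseract.image_to_string('<image_filepath>', config=path)
```

Using the system temp directory and deleting the file on exit is one possible answer to the question of where the temp config should live.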