Korean training (tensorflow serving included) #126

Closed
kspook opened this issue Mar 18, 2019 · 16 comments

Comments

@kspook

kspook commented Mar 18, 2019

I got the message below when I tried 'test'.

I changed several things:
1. 'iso-8859-1' to 'utf-8'.
2. Added two Korean characters in data_gen.py:

       CHARMAP = ['', '', ''] + list('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ신한')

3. In bucketdata.py, removed the raise NotImplementedError and added three lines (otherwise the program exits with 'NotImplementedError'):

       else:
           #raise NotImplementedError
           self.label_list[l_idx] = self.label_list[l_idx][:decoder_input_len]
           target_weights.append([1]*decoder_input_len)

(py36) D:\attention-ocr_b2>python ./aocr/main.py test ./dataset/testing.tfrecords
2019-03-18 12:33:22.445747: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-03-18 12:33:22,458 root INFO phase: test
2019-03-18 12:33:22,458 root INFO model_dir: checkpoints
2019-03-18 12:33:22,458 root INFO load_model: True
2019-03-18 12:33:22,459 root INFO output_dir: results
2019-03-18 12:33:22,459 root INFO steps_per_checkpoint: 0
2019-03-18 12:33:22,459 root INFO batch_size: 1
2019-03-18 12:33:22,459 root INFO learning_rate: 1.000000
2019-03-18 12:33:22,459 root INFO reg_val: 0
2019-03-18 12:33:22,460 root INFO max_gradient_norm: 5.000000
2019-03-18 12:33:22,460 root INFO clip_gradients: True
2019-03-18 12:33:22,460 root INFO max_image_width 160.000000
2019-03-18 12:33:22,460 root INFO max_prediction_length 8.000000
2019-03-18 12:33:22,460 root INFO channels: 1
2019-03-18 12:33:22,460 root INFO target_embedding_size: 10.000000
2019-03-18 12:33:22,461 root INFO attn_num_hidden: 128
2019-03-18 12:33:22,461 root INFO attn_num_layers: 2
2019-03-18 12:33:22,461 root INFO visualize: False
2019-03-18 12:33:24,005 root INFO data_gen.gen()
2019-03-18 12:33:24,225 root INFO Step 1 (0.136s). Accuracy: 0.00%, loss: 4.895189, perplexity: 133.645, probability: 1.03% 0% (85 vs 4)
2019-03-18 12:33:24,243 root INFO Step 2 (0.017s). Accuracy: 0.00%, loss: 12.590834, perplexity: 2.93853e+05, probability: 39.16% 0% (53 vs 2)
2019-03-18 12:33:24,260 root INFO Step 3 (0.016s). Accuracy: 0.00%, loss: 15.508214, perplexity: 5.43415e+06, probability: 98.23% 0% (51 vs 3)
2019-03-18 12:33:24,278 root INFO Step 4 (0.017s). Accuracy: 0.00%, loss: 16.600834, perplexity: 1.62051e+07, probability: 71.18% 0% (49 vs 1)

@kspook
Author

kspook commented Mar 18, 2019

It's related to #11, but in my case it is not the prediction length.

I also checked #52, but I had the same problem.

@emedvedev
Owner

Do you have the output from training the model as well? Does it converge on training at all? The error is a bit strange, since testing should (almost) always work if training worked, unless you've modified something between training and testing.

@kspook
Author

kspook commented Mar 19, 2019

Regarding raise NotImplementedError (#11): I fixed it. The script treats output data with long digit strings as corrupted, so I increased the prediction length beyond 8.

Regarding UTF-8 (#52): I followed your suggestions at every step, so it looks fine except for one thing:
- for digits, it produces Unicode code points such as 51 and 52 (as from ord()); according to the output above, the digits were produced correctly, but now I only get the code-point numbers;
- for a Korean character, it produces a nine-digit number.

I am in the middle of re-checking.
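For reference, a quick check in Python of the numbers involved (these are standard ord() values and UTF-8 bytes, not specific to this project):

    >>> ord('3'), ord('4')
    (51, 52)
    >>> ord('신')                # one Hangul syllable -> a five-digit code point
    49888
    >>> '신'.encode('utf-8')     # three UTF-8 bytes: 236, 139, 160 -> nine digits if concatenated
    b'\xec\x8b\xa0'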

Thanks a lot.

@kspook
Author

kspook commented Mar 20, 2019

I think I have the Unicode problem from #52.

In dataset.py, I changed it as below:

    label = ''.join(map(str, label.encode('utf-8')))
    feature = {}
    feature['image'] = _bytes_feature(img)
    #feature['label'] = _bytes_feature(b(label))
    feature['label'] = _bytes_feature(b(label))
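With this change, the label for a single Hangul character becomes the concatenated decimal UTF-8 bytes, for example (plain Python, consistent with the test output below):

    >>> ''.join(map(str, '신'.encode('utf-8')))
    '236139160'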

Otherwise, I get the error below. T.T
Can you help me fix it?

(1) Error message:

    Traceback (most recent call last):
      File "./aocr2/main.py", line 285, in <module>
        main()
      File "./aocr2/main.py", line 225, in main
        parameters.save_filename
      File "D:\attention-ocr_b2\aocr2\util\dataset.py", line 54, in generate
        feature['label'] = _bytes_feature(b(label))
      File "C:\Users\60067527\Anaconda3\envs\py36\lib\site-packages\six.py", line 626, in b
        return s.encode("latin-1")
    UnicodeEncodeError: 'latin-1' codec can't encode character '\uc2e0' in position 0: ordinal not in range(256)

(2) Test messages:

2019-03-20 14:39:19,999 root INFO Step 1 (0.206s). Accuracy: 100.00%, loss: 0.000109, perplexity: 1.00011, probability: 54.81% 100% (51)

2019-03-20 14:39:20,024 root INFO Step 2 (0.019s). Accuracy: 100.00%, loss: 0.000018, perplexity: 1.00002, probability: 91.36% 100% (49)

2019-03-20 14:39:21,647 root INFO Step 3 (0.020s). Accuracy: 70.37%, loss: 4.728099, perplexity: 113.080, probability: 41.93% 11% (51 vs 237149156)

2019-03-20 14:39:21,675 root INFO Step 4 (0.020s). Accuracy: 77.78%, loss: 0.067931, perplexity: 1.07029, probability: 17.18% 100% (50)

2019-03-20 14:39:21,702 root INFO Step 5 (0.020s). Accuracy: 64.44%, loss: 2.101013, perplexity: 8.17445, probability: 1.35% 11% (49 vs 236139160)

@emedvedev
Owner

Ah, that actually makes sense. Please submit a PR once you get the model working—I'm sure quite a lot of people will appreciate it! I'd be happy to merge and help with any modifications, if needed.

@emedvedev
Owner

> ord('신') = 49888, but I have 236139160
> label.decode('utf-8') = b'236139160'
> So I can't get the final data = '신'
>
> I am stuck here T.T

236/139/160 are the UTF-8 bytes of 신 in decimal. :)

>>> '신'.encode('utf-8')
b'\xec\x8b\xa0'
>>> int('ec', 16)
236
>>> int('8b', 16)
139
>>> int('a0', 16)
160

The fact that they're all glued together might be an issue, though; I'm not sure how to address that with minimal changes off the top of my head.
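To illustrate (plain Python, not project code): the three bytes decode back to '신' only if the boundaries are known, which is exactly what is lost once the digits are concatenated into '236139160':

    >>> bytes([236, 139, 160]).decode('utf-8')
    '신'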

@kspook
Author

kspook commented Mar 25, 2019

In addition, regarding this request:

    curl -X POST http://localhost:9001/v1/models/yourmodelname:predict \
      -H 'cache-control: no-cache' \
      -H 'content-type: application/json' \
      -d '{ "signature_name": "serving_default", "inputs": { "input": { "b64": "/9j/4AAQ==" } } }'

what should I type for "yourmodelname"?

@kspook
Author

kspook commented Mar 27, 2019

No response?

Regarding Korean recognition, #126 (comment): I put the result below.

  • In dataset.py, I put two lines (the example after this block shows what they produce):

        label = [ord(str(c)) for c in label]
        label = ''.join(map(str, label))

        feature = {}
        feature['image'] = _bytes_feature(img)
        feature['label'] = _bytes_feature(b(label))
    
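With this mapping, each character becomes its decimal Unicode code point, for example (plain Python, matching the ground-truth strings in the log below):

    >>> ''.join(str(ord(c)) for c in '신')
    '49888'
    >>> ''.join(str(ord(c)) for c in '한')
    '54620'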

As in my comment at da03/Attention-OCR#48 (comment), there is output beyond the training character set.

To solve this, I made an index and trained it on the original repository, da03/Attention-OCR#48.
So, in your source, I succeeded in character-based recognition.
But it doesn't support words. (Maybe it would be possible if a few words got their own indices.)

**** detailed results
(py36) D:\attention-ocr_b34ui>python aocr34/main.py test dataset/testingk.tfrecords
2019-03-27 16:16:11.812149: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-03-27 16:16:11,826 root INFO phase: test
2019-03-27 16:16:11,827 root INFO model_dir: checkpoints
2019-03-27 16:16:11,827 root INFO load_model: True
2019-03-27 16:16:11,828 root INFO output_dir: results
2019-03-27 16:16:11,829 root INFO steps_per_checkpoint: 0
2019-03-27 16:16:11,830 root INFO batch_size: 1
2019-03-27 16:16:11,833 root INFO learning_rate: 1.000000
2019-03-27 16:16:11,834 root INFO reg_val: 0
2019-03-27 16:16:11,834 root INFO max_gradient_norm: 5.000000
2019-03-27 16:16:11,835 root INFO clip_gradients: True
2019-03-27 16:16:11,836 root INFO max_image_width 160.000000
2019-03-27 16:16:11,836 root INFO max_prediction_length 18.000000
2019-03-27 16:16:11,837 root INFO channels: 1
2019-03-27 16:16:11,838 root INFO target_embedding_size: 10.000000
2019-03-27 16:16:11,839 root INFO attn_num_hidden: 128
2019-03-27 16:16:11,841 root INFO attn_num_layers: 2
2019-03-27 16:16:11,842 root INFO visualize: False
2019-03-27 16:16:13,842 root INFO data_gen.gen()
word [1 8 5 2] b'52'
(
2019-03-27 16:16:14,132 root INFO Step 1 (0.187s). Accuracy: 0.00%, loss: 3.016383, perplexity: 20.4173, probability: 31.50% 0% (( vs 4)
word [ 1 7 12 2] b'49'
1
2019-03-27 16:16:14,156 root INFO Step 2 (0.021s). Accuracy: 50.00%, loss: 0.000025, perplexity: 1.00003, probability: 99.97% 100% (1)
word [1 8 3 2] b'50'
2
2019-03-27 16:16:14,180 root INFO Step 3 (0.021s). Accuracy: 66.67%, loss: 0.000059, perplexity: 1.00006, probability: 99.86% 100% (2)
word [1 8 4 2] b'51'
3
2019-03-27 16:16:14,204 root INFO Step 4 (0.021s). Accuracy: 75.00%, loss: 0.021416, perplexity: 1.02165, probability: 93.58% 100% (3)
word [1 8 7 9 5 3 2] b'54620'
2
2019-03-27 16:16:14,230 root INFO Step 5 (0.021s). Accuracy: 68.00%, loss: 6.107238, perplexity: 449.097, probability: 53.86% 40% (2 vs 한)
word [ 1 7 12 11 11 11 2] b'49888'
(
2019-03-27 16:16:14,253 root INFO Step 6 (0.020s). Accuracy: 60.00%, loss: 6.769095, perplexity: 870.524, probability: 72.55% 20% (( vs 신)

@emedvedev
Owner

> But it doesn't support words. (Maybe it would be possible if a few words got their own indices.)

It's true that this model is mostly optimized for character-based recognition, not word-based. You can, as you said, modify it to give words their own indices instead of characters, although I'm not sure if the performance will be acceptable in that case, and you'll need a massive dataset, too. Doesn't hurt to experiment though. :)
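For illustration only, a word-level map could be sketched by reusing the CHARMAP from data_gen.py; the single word token '신한' here is a hypothetical example, and the performance is untested:

    # each whole word gets one index, instead of one index per character
    CHARMAP = ['', '', ''] + list('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') + ['신한']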

@kspook
Author

kspook commented Mar 28, 2019

Providing an index for every word is almost impossible.
Is there any chance you could change the original code, da03/Attention-OCR#48 (comment)?
In the original, I succeeded at word recognition.

Anyway, what do I put for "yourmodelname" when running your code? #126 (comment)

@kspook
Author

kspook commented Mar 29, 2019

Can you answer me, please?

In addition, regarding this request:

    curl -X POST http://localhost:9001/v1/models/yourmodelname:predict \
      -H 'cache-control: no-cache' \
      -H 'content-type: application/json' \
      -d '{ "signature_name": "serving_default", "inputs": { "input": { "b64": "/9j/4AAQ==" } } }'

what should I type for "yourmodelname"?

@emedvedev
Owner

You just put aocr.

Please don't post repeated requests: all support in this (and other) open-source projects is done on a volunteer basis, whenever people have time and capacity to respond. Kindly be prepared to do your own research: in this case, there are other issues in this repository that concern POST requests to the API, and they have correct URLs which you could just look at.

@kspook
Author

kspook commented Apr 3, 2019

Thank you for your comments and your other efforts.
It took me a long time to succeed with TensorFlow Serving, using the link below.
Why don't you update the README.md file?

#94 (comment)

curl -X POST --output - \
  http://localhost:9001/v1/models/aocr:predict \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
  "signature_name": "serving_default",
  "inputs":
  {
"input": { "b64": "/9j/4AAQSkZJRgABAQAASABIAAD/4QBYRXhpZgAATU0AKgAAAAgAAgESAAMAAAABAAEAAIdpAAQAAAABAAAAJgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAOaADAAQAAAABAAAAHAAAAAD/7QA4UGhvdG9zaG9wIDMuMAA4QklNBAQAAAAAAAA4QklNBCUAAAAAABDUHYzZjwCyBOmACZjs+EJ+/8AAEQgAHAA5AwEiAAIRAQMRAf/EAB8AAAEFAQEBAQEBAAAAAAAAAAABAgMEBQYHCAkKC//EALUQAAIBAwMCBAMFBQQEAAABfQECAwAEEQUSITFBBhNRYQcicRQygZGhCCNCscEVUtHwJDNicoIJChYXGBkaJSYnKCkqNDU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6g4SFhoeIiYqSk5SVlpeYmZqio6Slpqeoqaqys7S1tre4ubrCw8TFxsfIycrS09TV1tfY2drh4uPk5ebn6Onq8fLz9PX29/j5+v/EAB8BAAMBAQEBAQEBAQEAAAAAAAABAgMEBQYHCAkKC//EALURAAIBAgQEAwQHBQQEAAECdwABAgMRBAUhMQYSQVEHYXETIjKBCBRCkaGxwQkjM1LwFWJy0QoWJDThJfEXGBkaJicoKSo1Njc4OTpDREVGR0hJSlNUVVZXWFlaY2RlZmdoaWpzdHV2d3h5eoKDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uLj5OXm5+jp6vLz9PX29/j5+v/bAEMACAYGBwYFCAcHBwkJCAoMFA0MCwsMGRITDxQdGh8eHRocHCAkLicgIiwjHBwoNyksMDE0NDQfJzk9ODI8LjM0Mv/bAEMBCQkJDAsMGA0NGDIhHCEyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMv/dAAQABP/aAAwDAQACEQMRAD8AyYYwgG6rIIAHBqCLHFQeVLHP57SHPmhdmeApOK5bHKaKnjjjtzT1ljjLhnXKj5gD0rMaVxdtCsch2zA7ieADVqK1X7VMxX5y2Dk9QQD/AFq0hWLkc0fmhFbJKhvwNTxMHGQehway9LVUnl9D9xic/KKuW88e+ZVJyGLdO2KZJahZZYhIDgH9Kk2j+8KpWcu7chRhhmIJHByc1fwP8igR/9DKhIwNtI1oHuPM3tsyG8vtkd6bExAFW0Y4rmRx3AwD5gR94hjz3FXLqMJcOP4gFB+oAH9KhjG48+lWLg7rxye8hqhNkSRhQqqoAAwKsKVHtxjIHvUP3nGalQYFMBwOCD/+qpd/1pgUcU/NAH//2Q==" }
}
}'
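For anyone scripting this call, a minimal Python sketch of the same request (the file name image.jpg and port 9001 are assumptions from this thread; base64 is in the standard library and requests is the usual third-party HTTP client):

    import base64
    import requests

    # read the image and base64-encode it for the TensorFlow Serving REST API
    with open('image.jpg', 'rb') as f:
        b64_image = base64.b64encode(f.read()).decode('ascii')

    payload = {
        "signature_name": "serving_default",
        "inputs": {"input": {"b64": b64_image}},
    }
    response = requests.post('http://localhost:9001/v1/models/aocr:predict', json=payload)
    print(response.json())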

@kspook kspook changed the title Korean training Korean training (include tensorflow serving) Apr 3, 2019
@kspook kspook changed the title Korean training (include tensorflow serving) Korean training (tensorflow serving included) Apr 3, 2019
@emedvedev
Owner

emedvedev commented Apr 3, 2019 via email

@kspook
Author

kspook commented Apr 16, 2019

I tried to train Korean words again.
For example, '신한' was converted to '853863',
that is, 85 ('신') + 3 + 86 ('한') + 3.
FYI, I omitted 1, 2, and 3 as indices for all input.
Later, I put a serial index (0-40) on the characters in the word for training.
If I put 36+3 as the index of '신' and 37+3 as the index of '한' --> [1, 39, 40, 2],
I could train well (see the sketch below).
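A quick sketch of that mapping (a hypothetical helper; 1 and 2 are assumed to be the GO/EOS tokens, which matches the [1, 39, 40, 2] sequence above):

    # indices 0-2 are reserved (the three empty CHARMAP entries), so real characters start at 3
    charset = list('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') + ['신', '한']
    char_to_idx = {c: i + 3 for i, c in enumerate(charset)}  # '신' -> 39, '한' -> 40

    GO, EOS = 1, 2

    def encode(word):
        return [GO] + [char_to_idx[c] for c in word] + [EOS]

    print(encode('신한'))  # prints [1, 39, 40, 2]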

Finally I succeeded in word recognition.
Thank you for sharing your knowledge. ^^

*** result ***

1. One small example: 172 training images
(py36) D:\attention-ocr_b36uwi>python aocr36 test dataset/testingwk.tfrecords
2019-04-16 09:06:56.586067: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-04-16 09:06:56,598 root INFO phase: test
2019-04-16 09:06:56,598 root INFO model_dir: checkpoints
2019-04-16 09:06:56,598 root INFO load_model: True
2019-04-16 09:06:56,598 root INFO output_dir: results
2019-04-16 09:06:56,599 root INFO steps_per_checkpoint: 0
2019-04-16 09:06:56,599 root INFO batch_size: 1
2019-04-16 09:06:56,599 root INFO learning_rate: 1.000000
2019-04-16 09:06:56,599 root INFO reg_val: 0
2019-04-16 09:06:56,599 root INFO max_gradient_norm: 5.000000
2019-04-16 09:06:56,600 root INFO clip_gradients: True
2019-04-16 09:06:56,600 root INFO max_image_width 160.000000
2019-04-16 09:06:56,600 root INFO max_prediction_length 18.000000
2019-04-16 09:06:56,600 root INFO channels: 1
2019-04-16 09:06:56,600 root INFO target_embedding_size: 10.000000
2019-04-16 09:06:56,600 root INFO attn_num_hidden: 128
2019-04-16 09:06:56,600 root INFO attn_num_layers: 2
2019-04-16 09:06:56,601 root INFO visualize: False
2019-04-16 09:06:58,598 root INFO data_gen.gen()
2019-04-16 09:06:58,914 root INFO Step 1 (0.186s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:58,934 root INFO Step 2 (0.019s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:58,955 root INFO Step 3 (0.020s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:58,975 root INFO Step 4 (0.019s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:58,996 root INFO Step 5 (0.019s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:59,017 root INFO Step 6 (0.020s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:59,040 root INFO Step 7 (0.022s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:59,063 root INFO Step 8 (0.021s). Accuracy: 100.00%, loss: 0.000003, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:59,085 root INFO Step 9 (0.021s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2019-04-16 09:06:59,108 root INFO Step 10 (0.020s). Accuracy: 100.00%, loss: 0.000002, perplexity: 1.00000, probability: 100.00% 100% (신한)
2. FYI, one bad result: 10 training images
2019-04-15 16:16:31.221433: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-04-15 16:16:31,234 root INFO phase: test
2019-04-15 16:16:31,234 root INFO model_dir: checkpoints
2019-04-15 16:16:31,235 root INFO load_model: True
2019-04-15 16:16:31,235 root INFO output_dir: results
2019-04-15 16:16:31,235 root INFO steps_per_checkpoint: 0
2019-04-15 16:16:31,235 root INFO batch_size: 1
2019-04-15 16:16:31,236 root INFO learning_rate: 1.000000
2019-04-15 16:16:31,236 root INFO reg_val: 0
2019-04-15 16:16:31,236 root INFO max_gradient_norm: 5.000000
2019-04-15 16:16:31,236 root INFO clip_gradients: True
2019-04-15 16:16:31,236 root INFO max_image_width 160.000000
2019-04-15 16:16:31,236 root INFO max_prediction_length 18.000000
2019-04-15 16:16:31,237 root INFO channels: 1
2019-04-15 16:16:31,237 root INFO target_embedding_size: 10.000000
2019-04-15 16:16:31,237 root INFO attn_num_hidden: 128
2019-04-15 16:16:31,237 root INFO attn_num_layers: 2
2019-04-15 16:16:31,237 root INFO visualize: False
2019-04-15 16:16:33,224 root INFO data_gen.gen()
step , [1.223667, b'\xed\x95\x9c', 0.568962602180392]
test output ground 한 853863

label_list
[(4, '0'), (5, '1'), (6, '2'), (7, '3'), (8, '4'), (9, '5'), (40, '6'), (44, '7'), (45, '8'), (46, '9'), (47, 'A'), (48, 'B'), (49, 'C'), (50, 'D'), (54, 'E'), (55, 'F'), (56, 'G'), (57, 'H'), (58, 'I'), (59, 'J'), (60, 'K'), (64, 'L'), (65, 'M'), (66, 'N'), (67, 'O'), (68, 'P'), (69, 'Q'), (70, 'R'), (74, 'S'), (75, 'T'), (76, 'U'), (77, 'V'), (78, 'W'), (79, 'X'), (80, 'Y'), (84, 'Z'), (85, '신'), (86, '한')]

c lex, 8 853863
revert n=n+c 8
c lex, 5 853863
revert n=n+c 85
c lex, 3 853863
c lex, 8 853863
revert n=n+c 8
c lex, 6 853863
revert n=n+c 86
c lex, 3 853863

revert() for ground
n l_new label[0], label, 신한 86 (86, '한')

output : 한

2019-04-15 16:16:33,517 root INFO Step 1 (0.188s). Accuracy: 50.00%, loss: 1.223667, perplexity: 3.39963, probability: 56.90% 50% (한 vs 신한) ./dataset/word-img/hangul-images/hangul_521.jpeg

@emedvedev
Owner

Awesome, glad it's working for you! I'm going to close the issue since you've succeeded, but please feel free to open another one (or write right here) if you encounter other problems. Happy to help!
