chore: Added CI jobs to check classification training #457

Merged: 6 commits, Sep 3, 2021

fg-mindee
Contributor

As per #429, this PR introduces the following modifications:

  • added an option to change the font family in the character classification script
  • added an option to change the number of samples used during training & validation
  • added CI jobs that run the classification training for one short epoch
  • increased the sleep in the CI job before checking whether the demo is up

Any feedback is welcome!

@fg-mindee added the labels topic: ci, ext: references, and topic: character classification on Sep 3, 2021
@fg-mindee added this to the 0.4.0 milestone on Sep 3, 2021
@fg-mindee self-assigned this on Sep 3, 2021

codecov bot commented Sep 3, 2021

Codecov Report

Merging #457 (2949965) into main (64c7864) will increase coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #457      +/-   ##
==========================================
+ Coverage   95.83%   95.86%   +0.02%     
==========================================
  Files          96       96              
  Lines        4013     4013              
==========================================
+ Hits         3846     3847       +1     
+ Misses        167      166       -1     
Flag Coverage Δ
unittests 95.86% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
doctr/models/core.py 95.08% <0.00%> (+0.81%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 64c7864...2949965.

Collaborator

@charlesmindee left a comment

Thanks!

@fg-mindee
Contributor Author

I've checked on my end: there is a rather obscure issue with TF backprop.
The script works perfectly on GPU, but on CPU the gradient computation fails with a channel depth mismatch in the convolution backward pass:

Traceback (most recent call last):
  File "references/classification/train_tensorflow.py", line 261, in <module>
    main(args)
  File "references/classification/train_tensorflow.py", line 198, in main
    fit_one_epoch(model, train_loader, batch_transforms, optimizer, mb)
  File "references/classification/train_tensorflow.py", line 41, in fit_one_epoch
    grads = tape.gradient(train_loss, model.trainable_weights)
  File "/home/fg/miniconda3/lib/python3.8/site-packages/tensorflow/python/eager/backprop.py", line 1074, in gradient
    flat_grad = imperative_grad.imperative_grad(
  File "/home/fg/miniconda3/lib/python3.8/site-packages/tensorflow/python/eager/imperative_grad.py", line 71, in imperative_grad
    return pywrap_tfe.TFE_Py_TapeGradient(
  File "/home/fg/miniconda3/lib/python3.8/site-packages/tensorflow/python/eager/backprop.py", line 159, in _gradient_function
    return grad_fn(mock_op, *out_grads)
  File "/home/fg/miniconda3/lib/python3.8/site-packages/tensorflow/python/ops/nn_grad.py", line 581, in _Conv2DGrad
    gen_nn_ops.conv2d_backprop_input(
  File "/home/fg/miniconda3/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1247, in conv2d_backprop_input
    _ops.raise_from_not_ok_status(e, name)
  File "/home/fg/miniconda3/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6897, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Computed input depth 960 doesn't match filter input depth 1 [Op:Conv2DBackpropInput]

@fg-mindee
Contributor Author

fg-mindee commented Sep 3, 2021

I can confirm this happens at the TF level and is caused by grouped convolutions. We cannot do much about it right now.

For reference: tensorflow/tensorflow#51825
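
To make the numbers in the error message easier to read, here is a small shape-arithmetic sketch in plain Python. It is not TensorFlow's code: the helper names are hypothetical, and the assumption that the failing layer is a depthwise-style convolution with 960 input channels and 960 groups is inferred from the message only. A grouped convolution kernel stores `in_channels // groups` input channels per filter, so a depth check that ignores groups would compare 960 against 1.

```python
def grouped_conv_kernel_shape(kernel_size, in_channels, out_channels, groups=1):
    """Kernel shape (kh, kw, in_channels // groups, out_channels) of a grouped conv."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    kh, kw = kernel_size
    return (kh, kw, in_channels // groups, out_channels)


def naive_depth_check(input_depth, kernel_shape):
    """Hypothetical check that is unaware of groups: depths must match exactly."""
    filter_in_depth = kernel_shape[2]
    if input_depth != filter_in_depth:
        raise ValueError(
            f"Computed input depth {input_depth} doesn't match "
            f"filter input depth {filter_in_depth}"
        )


def group_aware_depth_check(input_depth, kernel_shape, groups):
    """Hypothetical check that accounts for groups: depth == filter depth * groups."""
    return input_depth == kernel_shape[2] * groups


# Assumed configuration: 3x3 kernel, 960 channels, one group per channel
shape = grouped_conv_kernel_shape((3, 3), 960, 960, groups=960)
```

With this configuration, `naive_depth_check(960, shape)` raises with the same wording as the traceback, while `group_aware_depth_check(960, shape, 960)` passes. That is consistent with the failure coming from a kernel without grouped convolution support, though the linked TensorFlow issue remains the authoritative description.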

@fg-mindee merged commit 3e7e9de into main Sep 3, 2021
@fg-mindee deleted the ci-ref branch September 3, 2021 15:25