
Add pytorch demo #1008

Merged — 17 commits, Aug 26, 2022

Conversation

@odulcy-mindee (Collaborator) commented Aug 3, 2022

This PR aims to add a PyTorch demo equivalent to the TensorFlow one. See #658.

Question

Do we also need to update something for the live demo? https://github.com/mindee/doctr#live-demo

@odulcy-mindee added labels: type: enhancement, ext: demo, framework: pytorch (Aug 3, 2022)
@odulcy-mindee odulcy-mindee self-assigned this Aug 3, 2022
codecov bot commented Aug 3, 2022

Codecov Report

Merging #1008 (b93328e) into main (7c73cdf) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #1008      +/-   ##
==========================================
- Coverage   94.93%   94.91%   -0.02%     
==========================================
  Files         135      135              
  Lines        5590     5590              
==========================================
- Hits         5307     5306       -1     
- Misses        283      284       +1     
Flag        Coverage Δ
unittests   94.91% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                           Coverage Δ
doctr/transforms/functional/base.py      95.65% <0.00%> (-1.45%) ⬇️


@odulcy-mindee odulcy-mindee changed the title Convert TF demo into Torch demo Add pytorch demo Aug 4, 2022
@odulcy-mindee odulcy-mindee marked this pull request as ready for review August 5, 2022 14:45
@frgfm frgfm requested review from frgfm and removed request for fg-mindee August 5, 2022 17:03
@felixdittrich92 (Contributor) left a comment

Thanks a lot for taking care of this @odulcy-mindee 😃

But I think we have 3 options to make it work:

  1. We integrate a landing page where the user can choose which framework to use (maybe consider a switch to Gradio, which would make it much easier and cleaner), then install the dependencies afterwards, with a progress bar
  2. We install the dependencies for both frameworks by default and let the user decide which framework to use (maybe a dropdown, same as the closed PR)
  3. We tackle this issue after we have something valuable done for the ONNX support / 'ONNX OCR predictor' (which would be the best option, I think)

I'm a bit reluctant to have two 'different' apps (or rather, to split it outside of the app) 🤔 @charlesmindee @frgfm wdyt?

@felixdittrich92 felixdittrich92 added this to the 0.6.0 milestone Aug 5, 2022
@felixdittrich92 felixdittrich92 linked an issue Aug 5, 2022 that may be closed by this pull request
@frgfm (Collaborator) commented Aug 6, 2022

Hello everyone 👋

I agree that having two different apps isn't necessarily the best option. Apart from training scripts (and actually we could also do it for this, but considering the user group, it's not a big deal), everything in docTR is handled with env variables to select the backend.

I would argue it's one of the core strengths of the project: the user doesn't see the difference but under the hood, a lot is done differently. I thought about it a month ago and I figured that the model loading and preprocessing+inference+postprocessing should be conditional on the backend. My suggestion is either:

  • keep a single script, with conditional statements where it's relevant to perform the operations
  • move the DL backend dependent ops to another module, and call this from the app.py

The second might sound better, but having a single file for a Streamlit/Gradio app is a real advantage. However, this is only my opinion 🤷‍♂️ (the good news is that we can reuse everything you've done here @odulcy-mindee if we go for either of my suggestions)
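The "single script with conditional statements" option could be outlined as below. This is an illustrative, stdlib-only stand-in: docTR exposes similar availability helpers in `doctr.file_utils`, but the function names and bodies here are assumptions, not the project's actual code.

```python
from importlib.util import find_spec

# Sketch of backend detection for a single demo script: check which
# DL framework is importable and branch accordingly. Hypothetical
# helpers modeled on (but not identical to) doctr.file_utils.

def is_torch_available() -> bool:
    return find_spec("torch") is not None

def is_tf_available() -> bool:
    return find_spec("tensorflow") is not None

if is_torch_available():
    backend = "pytorch"
elif is_tf_available():
    backend = "tensorflow"
else:
    backend = "none"

print(f"Selected backend: {backend}")
```

The app would then guard model loading and pre/post-processing behind this single `backend` value instead of duplicating the whole script per framework.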

@felixdittrich92 (Contributor):
@odulcy-mindee @frgfm I think the main problem to solve is finding a good way to handle the different framework-specific dependencies (if we don't want to wait until the ONNX part is done); then we could go a similar way to what I had in mind for #663.

@odulcy-mindee (Collaborator, author):
@frgfm @felixdittrich92 thanks for your review! Also thanks for linking the closed PR @felixdittrich92, I hadn't seen it before.

I agree with you and with the philosophy of the repo: it would be better to put everything in one script named app.py. I'll do a small refactoring of this demo app.

I'll try to add a drop-down menu to select the backend, but I'm not sure how it'll behave with Streamlit (an increase in RAM, for instance). If it doesn't work, the user will have to set the env variables on the command line, as suggested.

@odulcy-mindee (Collaborator, author):
I tried to add a drop-down menu to select TensorFlow or PyTorch, but it turned out to be a bit ugly to re-import functions from the selected backend at runtime.

Without any specification, it will use the TensorFlow backend. You can also select the backend using an environment variable:

USE_TF=1 streamlit run demo/app.py

Or

USE_TORCH=1 streamlit run demo/app.py
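The switch driven by these commands can be sketched with the standard library alone. This is a simplified stand-in for docTR's actual backend detection; the helper name is hypothetical:

```python
import os

def resolve_backend() -> str:
    # Hypothetical sketch of the USE_TF / USE_TORCH switch described
    # above (not docTR's actual implementation): PyTorch is used only
    # if explicitly requested, TensorFlow is the default.
    if os.environ.get("USE_TORCH", "").upper() in ("1", "ON", "YES", "TRUE"):
        return "pytorch"
    return "tensorflow"

os.environ.pop("USE_TORCH", None)
print(resolve_backend())  # tensorflow
os.environ["USE_TORCH"] = "1"
print(resolve_backend())  # pytorch
```

Because the variable is read once at startup, changing backends means restarting the Streamlit process, which is why a runtime drop-down was awkward.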

I also added a quick mention of which backend is in use in the sidebar:

[screenshot: sidebar showing the active backend]

@felixdittrich92 (Contributor) left a comment

Hi @odulcy-mindee 👋 ,
thanks a lot for your changes.

Some points:

  • please merge main into your branch :)
  • your current changes lead to the same result: we now have one app and the user can decide which backend to use (locally), but for the HF space it's the same situation, we would still need to deploy 2 apps (TF/PT). I'm not sure we want to do this 😅 @frgfm @charlesmindee wdyt? I would personally prefer to have one space where you can choose the backend (python-doctr[all]) and keep only the HF Spaces deployment (removing the guidance to start it locally)
  • I would suggest completing the list of archs (det / reco) for which a pretrained checkpoint is available (check the main branch; sar + master are broken in the current pip release and can be removed; adding pretrained versions is pinned to 0.6.0)

We should definitely keep in mind to switch to ONNX Runtime once we have a valuable solution done 👍

Let's wait on a second opinion before we go further in any direction 😃

@frgfm (Collaborator) commented Aug 11, 2022

Hi everyone 👋

I agree that perhaps we should wait a bit before switching the demo to ONNX. I've done it for other projects, and for inference it makes way more sense and removes a LOT of dependencies, making the minimal environment considerably lighter!

Otherwise, going for the double backend:

  • on dev side, changing the env variable makes sense
  • on HF spaces, that won't be possible though.

So either we:

  • go for separate apps (that complicates maintenance, but it can be temporary)
  • or plan for a full ONNX + no PT/TF env (which means docTR itself won't be loaded)

Since the demo is a good way to quickly try out a model, I'd say that if we cannot afford to do ONNX right away, we need a way to use PyTorch. So should we go for env vars that select the backend (my favorite option), or separate apps?

@felixdittrich92 (Contributor):
My favorite way would definitely be ONNX, but I think that's still a bit too far away at the moment.
I'm not sure that changing the environment variable at runtime is really that problematic (HF Space), maybe we should try it!?
(If it works, then I think it would be the best option for the moment.)

@felixdittrich92 (Contributor) commented Aug 24, 2022

Hi @odulcy-mindee 👋 any updates? :) Otherwise, let's convert this to a draft.

@odulcy-mindee (Collaborator, author) commented Aug 24, 2022

Hello @felixdittrich92 @frgfm,

I think we can keep this implementation as it is: it has the benefit of showing how to run inference with both TensorFlow and PyTorch, thanks to the load_predictor and forward_image functions.

For HuggingFace, we won't have the ability to change the backend, but I think that's not its purpose: it's more a toy to play with, to see a concrete application of our demo app.

I'll rebase this branch on main to check that everything is up-to-date.

Let's do another PR for ONNX when it's ready!
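The single-app dispatch described in this thread (one predictor-loading path per backend, chosen once at startup) can be sketched roughly as follows. The function bodies are string stand-ins, not the PR's actual code; in the real app they would call docTR's predictor constructors for the selected framework:

```python
import os

# Hypothetical sketch of the backend dispatch pattern: one loader per
# backend, bound to a single public name at import time so the rest of
# the app is backend-agnostic. Stub bodies keep the shape visible.

USE_TORCH = os.environ.get("USE_TORCH", "") == "1"

def _load_predictor_tf(det_arch: str, reco_arch: str) -> str:
    # Stand-in for building a TensorFlow OCR predictor.
    return f"tf-predictor({det_arch}, {reco_arch})"

def _load_predictor_pt(det_arch: str, reco_arch: str) -> str:
    # Stand-in for building a PyTorch OCR predictor.
    return f"pt-predictor({det_arch}, {reco_arch})"

load_predictor = _load_predictor_pt if USE_TORCH else _load_predictor_tf

print(load_predictor("db_resnet50", "crnn_vgg16_bn"))
```

A forward_image counterpart would follow the same pattern, so the Streamlit UI only ever sees the two backend-neutral names.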

@felixdittrich92 (Contributor) commented Aug 26, 2022

Hi @odulcy-mindee 👋,

So I think the deployment as an HF Space has a much greater benefit / reach than being able to test the demo locally. But yes, for the moment I would agree, until we are ONNX-ready (should be possible in 0.6.0). @frgfm wdyt?

@odulcy-mindee can you resolve the conflicts please :)

@odulcy-mindee (Collaborator, author):
@felixdittrich92 I rebased on main, it should be good.
Concerning the HuggingFace Space, I don't know how to update the demo 😕 We can swap the script on HF with the TF demo so we keep the same behavior as before.

@felixdittrich92 (Contributor):
> @felixdittrich92 I rebased on main, it should be good. Concerning the HuggingFace Space, I don't know how to update the demo 😕 We can swap the script on HF with the TF demo so we keep the same behavior as before.

I cannot help without access, so you should ask @charlesmindee or @mehdimindee (https://huggingface.co/mindee) 👍

[Review thread on demo/tensorflow/app.py — outdated, resolved]
@felixdittrich92 (Contributor) left a comment

Thanks @odulcy-mindee :)

@felixdittrich92 felixdittrich92 merged commit 61d0f1c into mindee:main Aug 26, 2022
Labels: ext: demo, framework: pytorch, type: enhancement

Successfully merging this pull request may close these issues.

Make the demo torch-compatible
3 participants