[GSoC 2019] Growing the model garden 🌳 #1684

Closed
sdll opened this issue Jun 22, 2019 · 9 comments

@sdll sdll commented Jun 22, 2019

This issue tracked the progress of the GSoC'19 project "Reasonable Effectiveness of Mobile Inference: Adaptive Growth of the TensorFlow.js Model Garden".

The final report is here.

Active

  • DeepLab

  • EfficientNet

  • Text Detection

    • Status: The JS code is under review. The optimized model requires more training to pass the quality tests required for the merge.
    • Requires compute: yes
    • Reference implementations: (tf.slim) (PyTorch)
    • PR
  • Text Recognition

Scratchpad

@sdll sdll commented Jun 22, 2019

@nsthorat (as I promised here), @dsmilkov, @manrajgrover, let me know if you have any opinions on which models you would like to see first. I would also love to hear any ideas and high-level views on the plan and the implementation details.

@kavikode kavikode commented Jul 1, 2019

Mask-RCNN and YOLO would be very helpful to see first, @sdll @nsthorat @dsmilkov @manrajgrover, because they are very useful and commonly used in computer vision. For my own research, I'm trying to use them for robotics control to provide accessibility for people with disabilities. Among the candidate models, these would increase accessible control and independence for people with disabilities. In fact, I could really use some help running Mask-RCNN and YOLO in the browser for robotics control as soon as possible. I hope you can help.

@sdll sdll commented Jul 3, 2019

@kavikode, TF.js implementations of YOLO already exist. To convert Mask-RCNN as described here, you might try the following:

  • switch the source to tf.keras
  • load the pre-trained weights
  • export the model to tf.SavedModel
  • convert tf.SavedModel using tfjs-converter

However, Mask-RCNN is quite a heavyweight model and may require further optimizations to run smoothly.
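
The four steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a tested Mask-RCNN port: `build_model` is a hypothetical callable that returns the model already rewritten against `tf.keras`, the paths are placeholders, and `tf.saved_model.save` assumes a TensorFlow 2.x environment. The converter invocation uses the `tensorflowjs_converter` CLI with its `tf_saved_model` input format; the conversion may still fail if the graph contains ops that TF.js does not support.

```python
import shlex
from typing import Callable, List


def export_saved_model(build_model: Callable[[], object],
                       weights_path: str,
                       export_dir: str) -> None:
    """Steps 1-3: build the tf.keras model, load weights, export a SavedModel.

    `build_model` is a hypothetical factory for the tf.keras version of the
    network; it is not part of any library.
    """
    # Imported lazily so the command builder below works without TensorFlow.
    import tensorflow as tf

    model = build_model()                    # step 1: tf.keras model
    model.load_weights(weights_path)         # step 2: pre-trained weights
    tf.saved_model.save(model, export_dir)   # step 3: SavedModel export


def build_converter_command(saved_model_dir: str, output_dir: str) -> List[str]:
    """Step 4: argument list for the tensorflowjs_converter CLI."""
    return [
        "tensorflowjs_converter",
        "--input_format=tf_saved_model",
        "--output_format=tfjs_graph_model",
        saved_model_dir,
        output_dir,
    ]


if __name__ == "__main__":
    # Placeholder directory names, chosen only for illustration.
    command = build_converter_command("mask_rcnn_saved_model", "web_model")
    print(shlex.join(command))
```

To actually run the conversion, the list returned by `build_converter_command` can be passed to `subprocess.run(command, check=True)` once the `tensorflowjs` pip package is installed.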

@kavikode kavikode commented Jul 8, 2019

Thank you so much, @sdll.

@beriberikix beriberikix commented Jul 15, 2019

What is the process for pitching to move items from the scratchpad to the active list? I reported #1725 and would love to make a case to whoever will listen :)

@sdll sdll commented Jul 15, 2019

@beriberikix, thank you for your interest! I have not dived into the details, but DeepSpeech seems to be resource-intensive, given the discussion in your other issue. Have you had any success converting the model using tfjs-converter? If so, how does it fare?

@beriberikix beriberikix commented Jul 18, 2019

I haven't had any success with the converter tool. I've asked for some clarification on how the saved model is generated. I'll post an update if and when I get the converter working.

@jessetrana jessetrana commented Jul 20, 2019

I've worked some with DeepSpeech in the past. As I recall, it has both significant compute work for generating the input audio features (a la MFCC) as well as decoding via the language model. I'm not sure how much of that functionality also has corresponding JS libraries, so it's possible that porting it may be an involved effort. By any chance, have you looked into the supporting work apart from the TF model yet to see how easy or hard it might be?

@sdll sdll commented Aug 27, 2019

Thanks, everyone, for the suggestions. Since the GSoC 2019 run has come to an end, I have added the final report to the issue and updated the status of each port. Let me know if you have any other ideas on porting text detection/text recognition or would like to work on other models.

@sdll sdll closed this Aug 27, 2019