How to use your own models #475

Closed
DomEscobar opened this issue Dec 24, 2023 · 2 comments
Labels
question Further information is requested

Comments

@DomEscobar

Question

Hey, I really appreciate your work here!

I'm very interested in setting up a solid RAG pipeline, and for that I need good document extraction with table detection (Table Transformer) and layout detection.

Example:
https://github.com/deepdoctection/deepdoctection

Where I'd use
https://huggingface.co/microsoft/layoutlmv3-base

https://huggingface.co/microsoft/table-transformer-detection

I could ask you to add one of these, but I want to try it myself.
As I understand it, I can use your conversion script on the model and host the result under my own huggingface.co account so that I can consume it. Is that right?

@DomEscobar DomEscobar added the question Further information is requested label Dec 24, 2023
@xenova
Owner

xenova commented Dec 25, 2023

Hi there 👋 I've opened a PR to add support for Table Transformer models in transformers.js. Fortunately, layoutlmv3 is already supported (by optimum), so it shouldn't be too difficult to add support for it too.

Do you have example Python code for a workflow that you would like to work in JS?

@DomEscobar
Author

Oh, okay, that was amazingly fast! I don't have a Python flow yet. I'll mimic deepdoctection by using:

  1. Layout segmentation
  2. Table extraction
  3. Tesseract (OCR)
  4. (maybe baklava, if the input is pure images)

and then store the results in LanceDB. That's the idea for a good average extraction 🤔
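
For illustration, the flow above could be sketched in Python roughly as follows. This is only a skeleton under stated assumptions: `Pipeline`, `Region`, and the stage callables are hypothetical names, not part of deepdoctection, transformers.js, or LanceDB. Each stage is a plain callable that would need to be backed by a real model or tool (a layout model, Table Transformer, Tesseract, an image-captioning model, a LanceDB writer).

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# (left, top, right, bottom) in pixels; an assumed convention for this sketch.
Box = Tuple[int, int, int, int]


@dataclass
class Region:
    """One detected block on a page (hypothetical type for this sketch)."""
    kind: str          # assumed labels: "text", "table", "figure"
    box: Box
    content: str = ""  # filled in by the stage that handles this kind


@dataclass
class Pipeline:
    """Wires the stages together; every stage is injected, none is implemented here."""
    detect_layout: Callable[[Any], List[Region]]    # step 1: layout segmentation
    extract_table: Callable[[Any, Region], str]     # step 2: table extraction
    run_ocr: Callable[[Any, Region], str]           # step 3: OCR, e.g. Tesseract
    caption_image: Callable[[Any, Region], str]     # step 4: optional, for pure images
    store: Callable[[List[Region]], None]           # final step: persist, e.g. to LanceDB

    def run(self, page: Any) -> List[Region]:
        regions = self.detect_layout(page)
        for region in regions:
            if region.kind == "table":
                region.content = self.extract_table(page, region)
            elif region.kind == "figure":
                region.content = self.caption_image(page, region)
            else:
                # Anything that is not a table or figure is treated as text.
                region.content = self.run_ocr(page, region)
        self.store(regions)
        return regions
```

Keeping the stages as injected callables means each one can be swapped independently, for example replacing a stub with a real table model once it is available, without touching the routing logic.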


3 participants
@xenova @DomEscobar and others