Pre-trained models #43

Open
caop-kie opened this issue Oct 7, 2022 · 5 comments

caop-kie commented Oct 7, 2022

Thanks for the great work! Do you have any plans to release the pre-trained DocFormer model?

Collaborator

uakarsh commented Oct 8, 2022

Hi @Aysp, thanks for your appreciation. We have the scripts to pre-train DocFormer ready for now, but we are not sure they would reproduce the exact results of the paper, since the authors did not describe the exact collection of data they used for pre-training (although it was RVL-CDIP). Besides that, we have resource constraints, which also makes pre-training a bit difficult.

Regards,
Akarsh


jmandivarapu1 commented Nov 11, 2022

@uakarsh Can you release the existing pre-training code? Even though it doesn't produce good results, it would be good as a starting point.

Collaborator

uakarsh commented Nov 11, 2022

Hi @jmandivarapu1,

Although I didn't write the entire code, I did write it up to the point where the PyTorch dataset object can be built and a forward and backward pass can be run for one iteration/batch.

Here is the code: https://github.com/shabie/docformer/blob/master/examples/DocFormer_for_MLM.ipynb

Hope it helps.
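
For readers who have not opened the notebook: the input-preparation step that an MLM dataset object typically performs can be sketched roughly as below. This is not taken from the linked notebook. It assumes BERT-style masking (select about 15% of tokens, then replace 80% of those with a mask token, 10% with a random token, and leave 10% unchanged), which may differ from what DocFormer's pre-training actually does. The names `mask_tokens`, `MASK_TOKEN`, and `IGNORE_LABEL` are hypothetical and used only for illustration.

```python
import random

MASK_TOKEN = "[MASK]"   # hypothetical mask symbol (BERT-style assumption)
IGNORE_LABEL = None     # hypothetical marker: position is not scored by the loss


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (masked_tokens, labels) for one MLM training example.

    labels[i] holds the original token at every selected position and
    IGNORE_LABEL everywhere else.
    """
    if rng is None:
        rng = random.Random()
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            # Selected position: the model must predict the original token.
            labels.append(token)
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK_TOKEN)          # 80%: replace with the mask token
            elif roll < 0.9:
                masked.append(rng.choice(vocab))   # 10%: replace with a random token
            else:
                masked.append(token)               # 10%: keep the token unchanged
        else:
            # Unselected position: input is untouched and the loss skips it.
            masked.append(token)
            labels.append(IGNORE_LABEL)
    return masked, labels


tokens = ["invoice", "number", "total", "amount", "due", "date"]
masked, labels = mask_tokens(tokens, vocab=tokens, mask_prob=0.5, rng=random.Random(0))
```

In a PyTorch pipeline the tokens would be integer ids and the ignored positions would usually carry the label `-100`, which is the default `ignore_index` of `torch.nn.CrossEntropyLoss`, so that only the selected positions contribute to the loss in the forward and backward pass.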

Collaborator

uakarsh commented Nov 11, 2022

I will work on MLM from my side (although there are three pre-training tasks) and will post an update shortly.

Thanks,

Collaborator

uakarsh commented Feb 13, 2023

Hi @jmandivarapu1 @Aysp, could you try the fine-tuning again using the pre-trained weights? (I have attached them in the README.)

3 participants