
Differences between original implementation and HuggingFace implementation #9228

Closed · osabnis opened this issue Dec 21, 2020 · 2 comments

osabnis commented Dec 21, 2020

Environment info

  • transformers version: 4.0.0
  • Platform: Windows
  • Python version: 3.6.5
  • PyTorch version (GPU?): 1.6.0+cu101
  • Tensorflow version (GPU?): -
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help

@stefan-it

Information

The model I am using (Bert, XLNet ...): LayoutLMForTokenClassification

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: my own modified script

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: my own dataset

To reproduce

Steps to reproduce the behavior:

Expected behavior

This is more of a question than an issue.
When I trained the LayoutLM model on my data using the token classification model from HuggingFace, I saw a small drop in performance compared to the original implementation. I wanted to ask whether there are any differences between the two models; I have kept the hyper-parameters exactly the same in both cases.
Two key points where I found differences:
(1) When reading in the dataset, the Microsoft version has a concept called "segment_ids", which is not a parameter in the HuggingFace LayoutLM documentation.
(2) When I loaded both models and printed their layers, I saw one extra entry called layoutlm.embeddings.position_ids in the HuggingFace implementation (see the inspection sketch right after this list).
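Roughly the check I ran on the HuggingFace side, as a minimal sketch (the checkpoint name and the num_labels value are just illustrative):

```python
from transformers import LayoutLMForTokenClassification

# Load the HuggingFace implementation (checkpoint name and num_labels are illustrative)
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=13
)

# Keys in the state dict that mention position_ids
print([k for k in model.state_dict() if "position_ids" in k])
# -> ['layoutlm.embeddings.position_ids']

# Check whether that entry is actually a trainable parameter
print([n for n, _ in model.named_parameters() if "position_ids" in n])
# -> [] (it only appears in the state dict, not among the parameters)
```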

I am trying to find the reason for the drop in performance, so I wanted to check whether there is any difference between the model implementations themselves. It would be a great help if you could explain the two differences I found!

Thanks!

NielsRogge commented Jan 4, 2021

Hi there,

I made some integration tests for both the base model (LayoutLM) and the model with a token classification head on top (LayoutLMForTokenClassification). These integration tests do not reveal any difference in output between the original implementation and the one in HuggingFace Transformers on the same input data, so the implementation seems to be OK. By the way, the segment_ids you are referring to are called token_type_ids in the Transformers library.
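For completeness, here is a minimal sketch of a forward pass (the checkpoint name, num_labels, and the dummy inputs are purely illustrative); whatever the original code builds as segment_ids is simply passed as token_type_ids:

```python
import torch
from transformers import LayoutLMTokenizer, LayoutLMForTokenClassification

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=2
)

encoding = tokenizer("a tiny example", return_tensors="pt")
seq_len = encoding["input_ids"].shape[1]
# LayoutLM also expects one normalized (0-1000) bounding box per token;
# all-zero boxes are used here purely for illustration
bbox = torch.zeros((1, seq_len, 4), dtype=torch.long)

outputs = model(
    input_ids=encoding["input_ids"],
    bbox=bbox,
    attention_mask=encoding["attention_mask"],
    token_type_ids=encoding["token_type_ids"],  # the "segment_ids" of the original repo
)
print(outputs.logits.shape)  # (1, seq_len, num_labels)
```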

I also made a demo notebook that showcases how to fine-tune LayoutLMForTokenClassification on the FUNSD dataset; I'm getting quite good performance even though I'm not using Mask R-CNN features. Let me know if this helps you.

NielsRogge mentioned this issue Jan 8, 2021

github-actions bot commented Mar 6, 2021

This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.
