The model I am using (Bert, XLNet ...): LayoutLMForTokenClassification

The problem arises when using:
- [ ] the official example scripts: (give details below)
- [x] my own modified scripts: My own modified script

The task I am working on is:
- [ ] an official GLUE/SQUaD task: (give the name)
- [x] my own task or dataset: My own dataset
Expected behavior
This is more of a question than an issue.
When I trained LayoutLM on my own data using the token classification model from HuggingFace, I saw a small drop in performance compared to the original Microsoft implementation. Are there any differences between the two models? I kept the hyper-parameters exactly the same in both cases.
I found two key differences:
(1) When preparing the dataset, the Microsoft version uses a field called "segment_ids", which is not a parameter in the HuggingFace LayoutLM documentation.
(2) When I loaded both models and printed their layers, the HuggingFace implementation has one extra entry, layoutlm.embeddings.position_ids.
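To make difference (2) concrete, here is the kind of check involved: diffing the key names of the two checkpoints. The helper and the abbreviated key lists below are illustrative stand-ins, not the full state dicts:

```python
# Sketch: diff the parameter/buffer names of two checkpoints to locate
# entries present in one but not the other. The key lists are abbreviated,
# illustrative stand-ins for the full model.state_dict().keys().

def state_dict_key_diff(keys_a, keys_b):
    """Return (keys only in a, keys only in b), sorted for readability."""
    return sorted(set(keys_a) - set(keys_b)), sorted(set(keys_b) - set(keys_a))

hf_keys = [
    "layoutlm.embeddings.position_ids",  # the extra entry in HuggingFace
    "layoutlm.embeddings.word_embeddings.weight",
    "classifier.weight",
]
ms_keys = [
    "layoutlm.embeddings.word_embeddings.weight",
    "classifier.weight",
]

only_hf, only_ms = state_dict_key_diff(hf_keys, ms_keys)
print(only_hf)  # ['layoutlm.embeddings.position_ids']
print(only_ms)  # []
```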
I am trying to track down the reason for the drop in performance, so I want to know whether the model implementations themselves differ. It would be a great help if you could explain the two differences I found!
Thanks!
I made some integration tests for both the base model (LayoutLM) and the model with a token classification head on top (LayoutLMForTokenClassification). Given the same input data, these tests do not reveal any differences in output between the original implementation and the one in HuggingFace Transformers, so the implementation seems to be OK. Btw, the segment_ids you are referring to are called token_type_ids in the Transformers library. As for the extra layoutlm.embeddings.position_ids entry: that is a non-trainable buffer holding the position indices (0, 1, 2, ...), registered for convenience in Transformers. It contains no learned weights, so it cannot cause a difference in outputs.
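If it helps, adapting a batch prepared for the original Microsoft code is just a key rename. A minimal sketch (the field names are illustrative and to_transformers_inputs is my own hypothetical helper, not part of either library):

```python
# Sketch: the original LayoutLM repo calls the token-type input "segment_ids",
# while HuggingFace Transformers calls it "token_type_ids". Renaming the key
# is enough; the tensors themselves are unchanged.

RENAMES = {"segment_ids": "token_type_ids"}

def to_transformers_inputs(batch):
    """Rename Microsoft-style keys to their Transformers equivalents."""
    return {RENAMES.get(name, name): value for name, value in batch.items()}

# Illustrative batch (plain lists standing in for tensors):
batch = {
    "input_ids": [[101, 2054, 102]],
    "bbox": [[[0, 0, 0, 0], [10, 10, 20, 20], [0, 0, 0, 0]]],
    "segment_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
}

inputs = to_transformers_inputs(batch)
print(sorted(inputs))  # ['attention_mask', 'bbox', 'input_ids', 'token_type_ids']
```

The renamed dict can then be passed as keyword arguments, e.g. model(**inputs).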
I also made a demo notebook that showcases how to fine-tune LayoutLMForTokenClassification on the FUNSD dataset; I'm getting quite good performance even though I'm not using Mask R-CNN visual features. Let me know if this helps you.
Environment info

transformers version: 4.0.0

Who can help
@stefan-it