
Candidate Selection #23

Open
bradfox2 opened this issue May 12, 2021 · 6 comments

Comments

@bradfox2

Have you experimented with altering the candidate selection process?

I am interested in what occurs when the candidate selection process is simplified or removed entirely so that every possible candidate is evaluated.

@Praneet9
Owner

I didn't try that, as removing it would mean a lot of meaningless negative candidates. For example, for a date field, a floating-point number or random text doesn't make sense as a candidate.
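The idea above can be sketched in a few lines: each field type only generates candidates of its own type, so dropping the selection step would flood training with spans that could never be the right answer. The regexes and field names below are illustrative assumptions, not the ones used in this repo.

```python
import re

# Hypothetical per-field-type candidate generators (patterns are
# illustrative, not the repo's actual candidate generators).
CANDIDATE_PATTERNS = {
    "invoice_date": r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b",
    "total_amount": r"\b\d{1,3}(?:,\d{3})*\.\d{2}\b",
}

def generate_candidates(text, field_type):
    """Return only the spans whose type matches the field being extracted."""
    return re.findall(CANDIDATE_PATTERNS[field_type], text)

text = "Invoice dated 12/05/2021, total due 1,250.00 by 01/06/2021"
print(generate_candidates(text, "invoice_date"))   # -> ['12/05/2021', '01/06/2021']
print(generate_candidates(text, "total_amount"))   # -> ['1,250.00']
```

Without this filter, every token in the document would become a negative candidate for every field, which is what the comment above means by "meaningless negative candidates".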

@Neelesh1121

@Praneet9 I trained this model on a structured-documents dataset. During training, the validation loss and accuracy were 0.0006 and 0.98421 respectively, but when I test it on new documents the results are very poor. The trained model is not able to predict keys for which there are multiple candidates.
I am attaching a snapshot of a document with amount information, in which many keys share the same candidate, so the model is not able to detect the keys.
[screenshot: amounts]
We have to extract all the key values present in the snapshot.
Just to confirm one thing: is there any condition that a single candidate text can't be part of multiple keys?
Can you please suggest how we can solve this issue?

@Praneet9
Owner

Praneet9 commented Nov 2, 2021

@Neelesh1121 Can you please explain what you mean by

we have to extract all the key values present in the snapshot.

There's no rule like that. In the above case, what are you trying to extract for the amount key?

@panwar2001

panwar2001 commented Nov 21, 2023

@Praneet9 Hi, I am really confused about positional embeddings. Do we have to collect the relative positions of neighbours across every invoice in the training set, and then train the model to generate an embedding for a 2D coordinate, e.g. embedding([3, 4]) = [2.34, 3.43, 2.34, ...]?
Or is it done by training one invoice at a time, in which case the embedding generated for a particular relative position would be different each time we train on a new invoice?
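For what it's worth, the usual answer in models like this is the first reading: the positional embedding comes from one set of learned parameters shared across the entire training set, so a given relative position maps to the same vector regardless of which invoice it appeared in (the vector only changes as the shared weights are updated). A minimal sketch, with a fixed random linear layer standing in for the learned projection (shapes are assumptions, not the repo's actual sizes):

```python
import numpy as np

rng = np.random.default_rng(0)

# One weight matrix and bias for the whole model -- the learned parameters.
# Gradient descent updates them across ALL invoices; they are never
# re-initialised per invoice.
W = rng.normal(size=(2, 8))
b = rng.normal(size=(8,))

def position_embedding(rel_xy):
    """Embed a 2-D relative position (dx, dy) with the shared projection."""
    return np.asarray(rel_xy, dtype=float) @ W + b

# The same relative position yields the same embedding no matter which
# invoice it came from, because the parameters are shared.
e1 = position_embedding([3, 4])   # neighbour offset seen in invoice A
e2 = position_embedding([3, 4])   # same offset seen in invoice B
assert np.allclose(e1, e2)
```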

@Praneet9
Owner

@panwar2001 Can you elaborate with an example of what issue you are facing?

@Praneet9
Owner

@panwar2001 I think you are confusing that projection with the neighbour encoding. The neighbour encodings are projected to 4 * 2d and then max-pooled, as you can see here. These tensors are then concatenated with the candidate embeddings, which are then projected back down. Hope that clears things up.
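The tensor flow described in that comment can be sketched as follows. The dimensions and random matrices below are placeholders standing in for the repo's learned layers and actual sizes; only the shape bookkeeping (project up to 4 * 2d, max-pool over neighbours, concatenate with the candidate embedding, project back down) is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                 # illustrative half-dimension, not the repo's value
n_neighbours = 5

# Hypothetical inputs: one candidate embedding and its neighbour encodings.
candidate = rng.normal(size=(2 * d,))
neighbours = rng.normal(size=(n_neighbours, 2 * d))

# 1. Project each neighbour encoding up to 4 * 2d.
W_up = rng.normal(size=(2 * d, 4 * 2 * d))
projected = neighbours @ W_up              # (n_neighbours, 4 * 2d)

# 2. Max-pool across the neighbours: one order-invariant summary vector.
pooled = projected.max(axis=0)             # (4 * 2d,)

# 3. Concatenate the summary with the candidate embedding ...
combined = np.concatenate([candidate, pooled])   # (2d + 4 * 2d,)

# 4. ... and project back down to the candidate dimension.
W_down = rng.normal(size=(combined.shape[0], 2 * d))
fused = combined @ W_down                  # (2d,)
print(fused.shape)                         # (16,)
```

Max-pooling over the neighbour axis is what makes the result independent of how many neighbours a candidate has and of their ordering.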
