Duplicate key error while adding two mentions which are same #501

Closed
saikalyan9981 opened this issue Aug 29, 2020 · 9 comments
Comments

@saikalyan9981

Suppose I have two mentions (for example, zip code and tax code) whose matchers both return true for the same span in a document (both check for a five-digit regex match). In that case, I think Fonduer throws the error below. Please help me resolve this.


sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "context_stable_id_key"
DETAIL:  Key (stable_id)=(1443208965_10_subset::span_mention:23313:23321) already exists.

[SQL: INSERT INTO context (type, stable_id) VALUES (%(type)s, %(stable_id)s) RETURNING context.id]
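
The constraint violation above can be reproduced in miniature. This is a minimal sketch, assuming a simplified `context` table with a unique `stable_id` column; it uses `sqlite3` instead of PostgreSQL, and the helper names, the sample text, and the id format are hypothetical imitations, not Fonduer's actual code or schema.

```python
import re
import sqlite3

# Both hypothetical matchers use the same five-digit pattern, as in the report.
FIVE_DIGITS = re.compile(r"\b\d{5}\b")


def stable_ids(doc_name, text):
    """Return one id per matched span, loosely imitating the id in the error."""
    return [
        f"{doc_name}::span_mention:{m.start()}:{m.end()}"
        for m in FIVE_DIGITS.finditer(text)
    ]


def insert_contexts(conn, ids):
    """Insert one context row per id; a repeated stable_id raises IntegrityError."""
    with conn:  # commit on success, roll back on error
        for sid in ids:
            conn.execute(
                "INSERT INTO context (type, stable_id) VALUES (?, ?)",
                ("span_mention", sid),
            )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE context ("
    "id INTEGER PRIMARY KEY, type TEXT, stable_id TEXT UNIQUE)"
)

text = "Ship to 94305"
zip_ids = stable_ids("doc1", text)  # first pass: "zip code" matcher
tax_ids = stable_ids("doc1", text)  # second pass: "tax code" matcher, same span

insert_contexts(conn, zip_ids)
try:
    insert_contexts(conn, tax_ids)
    duplicate_error = None
except sqlite3.IntegrityError as exc:
    duplicate_error = str(exc)

print(duplicate_error)
```

Because both passes derive the id from the same document and the same character offsets, the second insert collides with the row written by the first.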
@saikalyan9981
Author

saikalyan9981 commented Aug 29, 2020

Note that this was happening because I was using two MentionExtractors for the two mentions.

@HiromuHota
Contributor

What would happen if you use only one MentionExtractor for the two mentions?

@saikalyan9981
Author

I need two MentionExtractors in my program: in the first round I extract some candidates with certain matchers and labeling functions, and in the next round I need to extract other candidates based on those.
Do you think it's bad to have such a pipeline?

@HiromuHota
Contributor

I'm not saying whether this is a good or bad idea. I'm just trying to understand the issue.
From your response, it seems that you have never tried to extract these two mentions with a single MentionExtractor.

Which version of Fonduer are you using?

@saikalyan9981
Author

saikalyan9981 commented Sep 1, 2020

I'm using version 0.8.2. In the first run I use a single MentionExtractor for four mentions. I need those mentions extracted before I can extract other mentions in my second run, so I use a second MentionExtractor. If I use a single extractor for all mentions, there are no errors for the same context match. If I use two extractors for two mentions, it throws the error above for the same context match.
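
A generic way to let a second pass tolerate a span that an earlier pass already stored is to look the row up before inserting it. The sketch below shows that get-or-create pattern with `sqlite3`; the helper `get_or_create_context`, the simplified table, and the sample id are assumptions for illustration, and this is not Fonduer's code, nor necessarily how the library handles the case.

```python
import sqlite3


def get_or_create_context(conn, stable_id, type_="span_mention"):
    """Return the id of the row with this stable_id, inserting it only if absent."""
    row = conn.execute(
        "SELECT id FROM context WHERE stable_id = ?", (stable_id,)
    ).fetchone()
    if row is not None:
        return row[0]  # reuse the row written by an earlier pass
    with conn:  # commit the new row
        cur = conn.execute(
            "INSERT INTO context (type, stable_id) VALUES (?, ?)",
            (type_, stable_id),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE context ("
    "id INTEGER PRIMARY KEY, type TEXT, stable_id TEXT UNIQUE)"
)

sid = "doc1::span_mention:8:13"  # hypothetical id shared by both passes
first_id = get_or_create_context(conn, sid)   # first extractor run
second_id = get_or_create_context(conn, sid)  # second extractor run, same span
row_count = conn.execute("SELECT COUNT(*) FROM context").fetchone()[0]
```

Both calls return the same row id and the table holds a single row, so the unique constraint is never hit.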

@HiromuHota
Contributor

> If I use a single extractor for all mentions, it's giving no errors for the same context match.

@saikalyan9981 Thanks for this valuable information! I'll look into this.
Meanwhile, you could use a single extractor as a workaround if the second extraction does not depend on the first one.

@HiromuHota
Contributor

HiromuHota commented Sep 4, 2020

I found that this issue is a regression caused by #368. A workaround would be to set `clear=False` like `mention_extractor.apply(clear=False, ...)`. @saikalyan9981 can you try this workaround? Please let me know how it works.

@HiromuHota
Contributor

HiromuHota commented Sep 5, 2020

I think this issue is similar (not identical) to #424.
You will still get the error even when you set `clear=False`.
A workaround would be to downgrade to v0.7.1 or to upgrade to v0.8.3, which is not released yet.
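
To check whether an environment already matches one of the two versions named above, a small helper can compare the installed release. This is a minimal sketch: the function names are hypothetical, and the rule encodes only what this thread states (v0.7.1, or v0.8.3 and later); it makes no claim about other releases.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(v):
    """Parse 'X.Y.Z' (optionally prefixed with 'v') into a tuple of ints.

    Any suffix such as 'rc1' or '.dev0' is ignored.
    """
    m = re.match(r"v?(\d+)\.(\d+)\.(\d+)", v)
    if m is None:
        raise ValueError(f"unrecognized version string: {v!r}")
    return tuple(int(g) for g in m.groups())


def matches_thread_workaround(v):
    """True if v is 0.7.1 or at least 0.8.3, the versions suggested in this thread."""
    release = parse_release(v)
    return release == (0, 7, 1) or release >= (0, 8, 3)


try:
    installed = version("fonduer")
except PackageNotFoundError:
    installed = None  # Fonduer is not installed in this environment

if installed is not None:
    print(installed, matches_thread_workaround(installed))
```

The version reported in this issue, 0.8.2, does not pass the check.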

@HiromuHota
Contributor

Closing as v0.8.3 has been released.
