overwrite repeated entries #101

Closed
diemort opened this issue May 28, 2018 · 17 comments

@diemort

diemort commented May 28, 2018

I have a publication list that will be updated throughout the year. The easiest way to do that is to regularly upload a new BibTeX list that includes all entries of a given year.

Let's say I have 10 publications from 2018 already listed on my site. I have noticed that if I upload a new BibTeX list with one new entry (10 + 1 entries), I get a database containing the new entry plus 10 duplicated entries.

It would be nice if the plugin checked for duplicated entries (in this case, 10) and added only the new ones to the database.
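
As an illustration of the request above, here is a minimal Python sketch that compares two BibTeX exports and reports which citation keys are new. It is not teachPress code: the function names `bibtex_keys` and `new_keys` and the sample keys are made up for this example, and the regular expression only recognises the start of an entry rather than parsing BibTeX fully.

```python
import re

# Matches the start of a BibTeX entry such as "@article{some:key,"
# and captures the citation key. This is a rough pattern, not a full parser.
ENTRY_START = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def bibtex_keys(bib_text):
    """Return the citation keys found in a BibTeX string, in file order."""
    return ENTRY_START.findall(bib_text)


def new_keys(existing_bib, updated_bib):
    """Return keys that appear in updated_bib but not in existing_bib."""
    known = set(bibtex_keys(existing_bib))
    return [key for key in bibtex_keys(updated_bib) if key not in known]


# Hypothetical sample data: the second export adds one entry.
old_export = "@article{smith2018a,\n  title = {A}\n}\n"
new_export = old_export + "@inproceedings{smith2018b,\n  title = {B}\n}\n"

print(new_keys(old_export, new_export))  # ['smith2018b']
```

Only the entries whose keys are returned by `new_keys` would need to be added; everything else is already in the list.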

@winkm89
Owner

winkm89 commented May 29, 2018

Hi,

This function is already available but hidden by default. You can activate it:
[Screenshot: overwrite]

[Screenshot: overwriting2]

@diemort
Author

diemort commented May 29, 2018

Dear Michael,

Thank you for the confirmation. I wasn't sure that option meant to overwrite duplicated entries.

I've tried to upload the same .bib twice, but I still get duplicated entries. So I assume it is really an experimental feature.

Are there plans to improve it? It'd be very convenient for many users.

@winkm89
Owner

winkm89 commented Jul 18, 2018

Hi,

Sorry for the late answer. If you could post part of your .bib file (one or two entries that were duplicated by the import), I could try to find the error in the function.

@diemort
Author

diemort commented Jul 23, 2018 via email

@shahab-ab

shahab-ab commented Apr 18, 2021

Hello,
Thank you for your great plugin.
There is still an issue with the overwrite option. I tried importing the same file twice, and some entries are still duplicated with the same title and authors. I think that instead of the matching record being overwritten, a record above or below it may be getting replaced. This would be a very useful feature if it worked without bugs.

To be more precise: in my case the first import brings in 311 entries, and 2 entries are added with every further import of the exact same file!
Thanks for your time and huge effort.

@winkm89
Owner

winkm89 commented Apr 21, 2021

Hi,
The overwrite option compares only the bibtex key and not a title/author combination. So I think that the duplicates have different bibtex keys?
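
A small Python sketch of what key-only matching implies. This is not the plugin's code: the `import_entries` function and the dictionary used as a database are assumptions made for illustration. The two keys and the shared title are taken from entries posted later in this thread.

```python
def import_entries(database, entries, overwrite=True):
    """Insert entries into `database`, a dict keyed by BibTeX key.

    An entry whose key already exists is replaced when overwrite is True
    and skipped otherwise. Title and authors are never compared.
    Returns the number of keys that were newly added.
    """
    added = 0
    for entry in entries:
        key = entry["key"]
        if key in database:
            if overwrite:
                database[key] = entry  # same key: replace in place
            continue
        database[key] = entry
        added += 1
    return added


entries = [
    {"key": "DBLP:conf/coling/XuNCL20", "type": "inproceedings",
     "title": "Knowledge Graph Embeddings in Geometric Algebras"},
    {"key": "DBLP:journals/corr/abs-2010-00989", "type": "article",
     "title": "Knowledge Graph Embeddings in Geometric Algebras"},
]

db = {}
print(import_entries(db, entries))  # 2: same title, but two distinct keys
print(import_entries(db, entries))  # 0: re-importing the same file adds nothing
```

With a correct key comparison, a conference paper and its preprint are both kept because their keys differ, while a second import of the same file should leave the entry count unchanged.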

@shahab-ab

Hi, Thank you Michael for answering this issue.
Correct, but I imported the same file twice, and the number of publications increased by 2 with every import.

@winkm89
Owner

winkm89 commented Apr 21, 2021

Are these always the same entries that are duplicated?

@shahab-ab

shahab-ab commented Apr 21, 2021

> Are these always the same entries that are duplicated?

I am sure only about the titles. I will check and give you feedback soon.

@shahab-ab

UPDATE: Here are two examples. The first import brought in 312 entries, which is correct. A second run with the same BibTeX file brought the total to 314 entries, an increase of 2. It lists the same entry under a different Pub-Type:

[Two screenshots: the same entry listed under a different Pub-Type]

@winkm89
Owner

winkm89 commented Apr 25, 2021

Could you post the original BibTeX code of these two entries?

@shahab-ab

shahab-ab commented Apr 25, 2021

Although there are 4 entries with the same title, on the first try TP inserts only one of the similar entries. On the second run it inserts the two others into the publications. (The Tags table is also left empty, without any entries, for all publications.)

@inproceedings{DBLP:conf/cikm/Mulang0PNH020,
author = {Isaiah Onando Mulang and
Kuldeep Singh and
Chaitali Prabhu and
Abhishek Nadgeri and
Johannes Hoffart and
Jens Lehmann},
editor = {Mathieu d'Aquin and
Stefan Dietze and
Claudia Hauff and
Edward Curry and
Philippe Cudr{\'{e}}{-}Mauroux},
title = {Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation
Models},
booktitle = {{CIKM} '20: The 29th {ACM} International Conference on Information
and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020},
pages = {2157--2160},
publisher = {{ACM}},
year = {2020},
url = {https://doi.org/10.1145/3340531.3412159},
doi = {10.1145/3340531.3412159},
timestamp = {Fri, 25 Dec 2020 01:15:14 +0100},
biburl = {https://dblp.org/rec/conf/cikm/Mulang0PNH020.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

@article{DBLP:journals/corr/abs-2008-05190,
author = {Isaiah Onando Mulang and
Kuldeep Singh and
Chaitali Prabhu and
Abhishek Nadgeri and
Johannes Hoffart and
Jens Lehmann},
title = {Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation
Models},
journal = {CoRR},
volume = {abs/2008.05190},
year = {2020},
url = {https://arxiv.org/abs/2008.05190},
archivePrefix = {arXiv},
eprint = {2008.05190},
timestamp = {Sun, 16 Aug 2020 01:00:00 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2008-05190.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}


@inproceedings{DBLP:conf/coling/XuNCL20,
author = {Chengjin Xu and
Mojtaba Nayyeri and
Yung{-}Yu Chen and
Jens Lehmann},
editor = {Donia Scott and
N{\'{u}}ria Bel and
Chengqing Zong},
title = {Knowledge Graph Embeddings in Geometric Algebras},
booktitle = {Proceedings of the 28th International Conference on Computational
Linguistics, {COLING} 2020, Barcelona, Spain (Online), December 8-13,
2020},
pages = {530--544},
publisher = {International Committee on Computational Linguistics},
year = {2020},
url = {https://doi.org/10.18653/v1/2020.coling-main.46},
doi = {10.18653/v1/2020.coling-main.46},
timestamp = {Fri, 08 Jan 2021 00:00:00 +0100},
biburl = {https://dblp.org/rec/conf/coling/XuNCL20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

@article{DBLP:journals/corr/abs-2010-00989,
author = {Chengjin Xu and
Mojtaba Nayyeri and
Yung{-}Yu Chen and
Jens Lehmann},
title = {Knowledge Graph Embeddings in Geometric Algebras},
journal = {CoRR},
volume = {abs/2010.00989},
year = {2020},
url = {https://arxiv.org/abs/2010.00989},
archivePrefix = {arXiv},
eprint = {2010.00989},
timestamp = {Mon, 12 Oct 2020 01:00:00 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2010-00989.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

@shahab-ab

The complete file:
https://dblp.org/pid/71/4882.bib?param=1

winkm89 added a commit that referenced this issue Apr 26, 2021
@winkm89
Owner

winkm89 commented Apr 26, 2021

Thank you for the examples. I think I have it. It's a small, nasty bug in the method TP_Publications::generate_unique_bibtex_key(). Before a publication is imported, this method tests whether its bibtex key already exists.

The problem was that the check was too weak. If you have, for example, one publication with the bibtex key "123" and another with "123a", the method checks for a key like "%123%", which also matches "123a". And that was the problem.
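
The difference between the two kinds of check can be reproduced with an in-memory SQLite database in Python. This is only a sketch of the matching behaviour described above, not the plugin's actual query: the table name `pubs`, the column name `bibtex`, and both helper functions are hypothetical. How the false positive then leads to duplicate rows is not spelled out in this thread and is not modelled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pubs (bibtex TEXT)")  # hypothetical schema
conn.execute("INSERT INTO pubs VALUES (?)", ("123a",))  # only "123a" is stored


def exists_loose(key):
    """Substring check: any stored key that contains `key` counts as a match."""
    row = conn.execute(
        "SELECT COUNT(*) FROM pubs WHERE bibtex LIKE ?", (f"%{key}%",)
    ).fetchone()
    return row[0] > 0


def exists_exact(key):
    """Exact check: only an identical stored key counts as a match."""
    row = conn.execute(
        "SELECT COUNT(*) FROM pubs WHERE bibtex = ?", (key,)
    ).fetchone()
    return row[0] > 0


print(exists_loose("123"))   # True: "%123%" also matches "123a"
print(exists_exact("123"))   # False: no row has exactly the key "123"
print(exists_exact("123a"))  # True
```

The loose check reports that "123" exists although only "123a" is stored, which is the false positive described above; the exact comparison does not.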

I've released teachPress 7.1.5 on wordpress.org, which contains the bugfix.

@shahab-ab

Thank you Michael for your availability and very helpful guides on every issue.

Awesome! Then the issue is now solved. To the UPDATE...

Cheers,

@shahab-ab

shahab-ab commented May 29, 2021

Hi Michael,

Problem Description:
Since none of the entries in the DBLP BibTeX file contain tags (keywords), I have to add them manually for over 300 items.
The problem is that with every update, any customizations of an entry, such as a custom PDF link or added tags, are reset, and I have to reapply them all. How can this be prevented?

Regards

@winkm89
Owner

winkm89 commented Jul 11, 2021

Currently you can only comment this function out in the source code. I think I will add an import option for that.
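
One way such an import option could behave is sketched below in Python. This is purely illustrative and not how teachPress implements it: the function `merge_on_import`, the `protected` parameter, and the field names `tags` and `pdf_url` are all assumptions.

```python
def merge_on_import(stored, incoming, protected=("tags", "pdf_url")):
    """Merge an imported record into a stored one.

    Incoming values win, except for protected fields that already hold a
    non-empty value in the stored record. Fields missing from the import
    are kept as they are.
    """
    merged = dict(stored)
    for field, value in incoming.items():
        if field in protected and stored.get(field):
            continue  # keep the manually edited value
        merged[field] = value
    return merged


# Hypothetical records: the stored one carries manual tags and a PDF link.
stored = {
    "title": "Old title",
    "tags": ["knowledge graphs"],
    "pdf_url": "https://example.org/paper.pdf",
}
incoming = {"title": "New title", "tags": [], "year": "2020"}

result = merge_on_import(stored, incoming)
print(result["title"])    # New title
print(result["tags"])     # ['knowledge graphs']
print(result["pdf_url"])  # https://example.org/paper.pdf
```

The design choice here is that bibliographic fields follow the imported file while manually maintained fields survive an update.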

@winkm89 winkm89 added this to the teachPress 8.0 milestone Jul 30, 2021
@winkm89 winkm89 self-assigned this Jul 30, 2021
winkm89 added a commit that referenced this issue Aug 1, 2021
@winkm89 winkm89 closed this as completed Aug 15, 2021
SimonPhumin pushed a commit to SimonPhumin/teachpress-apa that referenced this issue Feb 26, 2022
SimonPhumin pushed a commit to SimonPhumin/teachpress-apa that referenced this issue Feb 26, 2022