Journal abbreviations in UTF-8 not recognized #5850

Closed
jorgman1 opened this issue Jan 20, 2020 · 13 comments · Fixed by #7639
Labels
bug Confirmed bugs or reports that are very likely to be bugs
Comments

@jorgman1

When running an integrity check, JabRef complains about non-ASCII characters in the journal title. However, changing UTF-8 to ASCII generates the error: "Journal not found in abbreviation list".

Since the journal abbreviation lists are encoded in UTF-8, it would be nice if JabRef recognized both formats and treated them equally.

@bernhard-kleine

This is most probably related to #5562.

@Siedlerchr added the "bug" label (Confirmed bugs or reports that are very likely to be bugs) Jan 20, 2020
@tobiasdiez added this to Needs triage in Bugs via automation Feb 1, 2020
@tobiasdiez moved this from Needs triage to Normal priority in Bugs Feb 1, 2020
@Hollyqqqqq
Contributor

Hi, I am interested in this issue, but I cannot reproduce this bug.

@jorgman1 Can you give an example of a journal name containing UTF-8 characters and briefly describe how you change UTF-8 to ASCII?

Thanks!

@jorgman1
Author

jorgman1 commented May 5, 2020

E.g. many German journals, such as Zeitschrift für Meteorologie. If I use the Convert Unicode to LaTeX tool, "für" is converted to f{\"{u}}r. I've noticed that many such journals are in the built-in abbreviation list with "fur", but that is wrong: the German convention is to write "fuer".
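
The three spellings mentioned in this comment (the original umlaut, the stripped form "fur", and the German transliteration "fuer") can be generated mechanically, so a lookup could try each of them against the abbreviation list. The following is only a minimal sketch of that idea. The class `JournalNameVariants` and its methods are hypothetical names for illustration and are not part of JabRef; only `java.text.Normalizer` and the standard collections are real APIs.

```java
import java.text.Normalizer;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not JabRef code: builds spelling variants of a journal name.
public class JournalNameVariants {

    /** Removes combining marks after canonical decomposition, e.g. "f\u00FCr" becomes "fur". */
    public static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Applies the German convention, e.g. "f\u00FCr" becomes "fuer". */
    public static String germanTransliteration(String s) {
        // Compose first so that "u" + combining diaeresis is also matched.
        String composed = Normalizer.normalize(s, Normalizer.Form.NFC);
        return composed
                .replace("\u00E4", "ae").replace("\u00F6", "oe").replace("\u00FC", "ue")
                .replace("\u00C4", "Ae").replace("\u00D6", "Oe").replace("\u00DC", "Ue")
                .replace("\u00DF", "ss");
    }

    /** Returns the original name plus both ASCII-only variants, without duplicates. */
    public static Set<String> variants(String journal) {
        Set<String> result = new LinkedHashSet<>();
        result.add(journal);
        result.add(germanTransliteration(journal));
        result.add(stripDiacritics(journal));
        return result;
    }
}
```

For "Zeitschrift für Meteorologie" this yields the original name, "Zeitschrift fuer Meteorologie", and "Zeitschrift fur Meteorologie". Whether the abbreviation list should contain one or all of these forms is a separate question that the sketch does not answer.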

@Siedlerchr
Member

@jorgman1 Have you tested the latest development version?
There have recently been some major changes to the storage and import of journals and their abbreviations:
https://builds.jabref.org/master/

@Siedlerchr added the "status: waiting-for-feedback" label (The submitter or other users need to provide more information about the issue) May 5, 2020
@jorgman1
Author

jorgman1 commented May 5, 2020

In the newest version (JabRef 5.1--2020-05-04--b5599c9; Windows 7 6.1 amd64; Java 14.0.1), I still see either a "non-ASCII" error or a "journal not in abbreviation list" error in the quality check.

How should I put the journal name?

@github-actions
Contributor

github-actions bot commented Jun 5, 2020

This issue will be closed in 7 days due to inactivity 💤 Please provide the requested information if the problem persists.

@jorgman1
Author

jorgman1 commented Jun 5, 2020

This behaviour still persists in the latest version (JabRef 5.1--2020-05-04--b5599c9; Windows 7 6.1 amd64; Java 14.0.1).

How should I put the journal name?

@github-actions
Contributor

github-actions bot commented Jul 6, 2020

This issue will be closed in 7 days due to inactivity 💤 Please provide the requested information if the problem persists.

@bernhard-kleine

This has not been solved. Why do we get these "status: stale" messages? It is very annoying.

@Siedlerchr removed the "status: waiting-for-feedback" label (The submitter or other users need to provide more information about the issue) Jul 6, 2020
@MrGhabi
Contributor

MrGhabi commented Apr 15, 2021

Hi, I'm interested in this issue, and I think I have reproduced this bug.

The main cause of this bug is that the Check integrity tool accepts only ASCII characters. That works well for English citations, but JabRef has users worldwide who work with other character sets. Steps to reproduce:

  1. New library

  2. New article

  3. Add the following to the BibTeX source:

    @article{杨芙清2005软件工程技术发展思索,
      title={软件工程技术发展思索},
      author={杨芙清},
      journal={软件学报},
      volume={16},
      number={1},
      year={2005},
      publisher={Citeseer}
    }
    
  4. Click "Check integrity".

I want to fix this issue. The goal is for the integrity check to warn only about text that is not valid UTF-8. To meet this need, I may need an external JAR to detect which charset a string uses. I am trying to fix the issue with this approach right now.
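
As an illustration of the check proposed above, the JDK can already report whether a byte sequence is well-formed UTF-8, without an external library, by using a strict `CharsetDecoder`. This is a minimal sketch under the assumption that the raw bytes of the field are available; it is not the code from the eventual fix, and the class name `Utf8Validation` is hypothetical. Note that it validates UTF-8 only and does not detect which other charset the bytes might be in.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not JabRef code: strict UTF-8 validation of raw bytes.
public class Utf8Validation {

    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

With this helper, the UTF-8 bytes of the journal name from the reproduction steps pass the check, while the ISO-8859-1 bytes of "für" (where "ü" is the single byte 0xFC) are rejected.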

@Siedlerchr
Member

@MrGhabi Thanks for your interest. JabRef fetches the journal lists from the https://github.com/JabRef/abbrv.jabref.org repo; they are then assembled into an MV database:
https://github.com/JabRef/jabref/blob/main/buildSrc/src/main/groovy/org/jabref/build/JournalAbbreviationConverter.groovy

@MrGhabi
Contributor

MrGhabi commented Apr 15, 2021

> @MrGhabi Thanks for your interest. JabRef fetches the journals from the https://github.com/JabRef/abbrv.jabref.org repo which are then assembled to a MV database
> https://github.com/JabRef/jabref/blob/main/buildSrc/src/main/groovy/org/jabref/build/JournalAbbreviationConverter.groovy

Thank you~

Bugs automation moved this from Normal priority to Closed Apr 23, 2021
Siedlerchr added a commit that referenced this issue Apr 23, 2021

* fix issue #5850 for encoding problem

* add a blank line for build.gradle

* initial as main branch for build.gradle

* initial as main branch for build.gradle

* add the change of fix information of issue 5850

* Fix check style

* Update CHANGELOG.md

Co-authored-by: Christoph <siedlerkiller@gmail.com>

* Add the utf8 check for biblatex and ascii check for bibtex

* add the new localization string the l10 files

* fix error

* add the statement only in en.properties

* revert changes

* Update JabRef_da.properties

* Update JabRef_ru.properties

* Update build.gradle

* Update JabRef_fa.properties

* Update JabRef_no.properties

* Update JabRef_pl.properties

* Update JabRef_pt.properties

* Update JabRef_vi.properties

* Update JabRef_zh_TW.properties

* reset the default charset

* reset the default charset

* add the javaDoc of UTF8Checker

* add the javaDoc of UTF8CheckerTest and IntegrityCheckTest

add 2 Junit Test for UTF8Checker.UTF8EncodingChecker in UTF8CheckerTest

add 2 Junit Test for IntegrityCheck in IntegrityCheckTest

* Remove the unwieldy Junit tests

Co-authored-by: Christoph <siedlerkiller@gmail.com>
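
The commit message above mentions a UTF-8 check for biblatex and an ASCII check for BibTeX. A rough sketch of what such a mode-dependent check could look like is shown below. It is an assumption-laden illustration only: the class `FieldCharsetCheck`, the `Mode` enum, and the message strings are hypothetical and do not reproduce the `UTF8Checker` added by the fix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch, not the JabRef implementation.
public class FieldCharsetCheck {

    public enum Mode { BIBTEX, BIBLATEX }

    /** Returns a warning message, or an empty Optional if the value is acceptable in the given mode. */
    public static Optional<String> check(String value, Mode mode) {
        if (mode == Mode.BIBTEX) {
            // Classic BibTeX: only plain ASCII is accepted.
            boolean ascii = StandardCharsets.US_ASCII.newEncoder().canEncode(value);
            return ascii ? Optional.empty() : Optional.of("Non-ASCII character found");
        }
        // biblatex: any text that can be encoded as UTF-8 is accepted;
        // unpaired surrogates are the main case that fails here.
        boolean utf8 = StandardCharsets.UTF_8.newEncoder().canEncode(value);
        return utf8 ? Optional.empty() : Optional.of("Non-UTF-8 character found");
    }
}
```

Under these assumptions, the journal name 软件学报 from the reproduction steps produces a warning in BibTeX mode and none in biblatex mode.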
@jorgman1
Author

Thanks for the fix!
