Is it possible to delete characters/words when editing the parsed citation? #220

Open
aulline13 opened this issue Feb 11, 2024 · 3 comments

Comments

@aulline13

First of all, thanks for a great website!

After a citation is parsed, one can reassign the labels of parts that were not recognized correctly. Is it also possible to delete some of these parts? In my case, the parser often assigns punctuation or special characters to labels. For example, in the citation style that I often deal with, "//" separates the article's title from the journal's name. The parser then treats "//" as part of the article's title, although it does not belong to any label. I can't find a button to delete such characters from the parsed result; it seems one can only reassign them to other labels.

@inukshuk
Owner

For the labeling, it's important that every part of the input is consistently assigned a label. In this case I'd mark the // as part of the journal's name (though you could also pick the article's title instead). For JSON or BibTeX output, the segments are then further 'normalized'. If something like // ends up at the start or end of a title, our normalizers should probably remove it. If that's not happening yet, we can add a rule for it. Could you paste one or two of your references here? Then we'll add a test case for them.
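To illustrate the kind of rule meant here, below is a minimal sketch of a normalization step that strips a `//` separator from the start or end of a labeled segment. It is not AnyStyle's actual normalizer code; the function name `strip_separators` and the regular expression are assumptions made up for this example.

```python
import re

# Hypothetical rule, not taken from AnyStyle: a run of two or more slashes
# (plus surrounding whitespace) at the very start or very end of a segment.
_EDGE_SLASHES = re.compile(r"^\s*/{2,}\s*|\s*/{2,}\s*$")


def strip_separators(segment: str) -> str:
    """Remove '//' separators from the edges of a segment.

    Slashes inside the segment are left alone, so only separator
    debris at the boundaries is dropped.
    """
    return _EDGE_SLASHES.sub("", segment)


journal = strip_separators("// Правоведение.")
title = strip_separators("Санкции и их влияние // ")
```

With a rule like this, the labeler can still assign `//` to either neighboring segment, and the separator disappears only in the normalized output.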

@aulline13
Author

Oh, now I see. I used BibTeX for importing into Zotero, and the '//' appeared in the journal name.

Below are the citations. They are in Russian, not English; sorry, I hope you don't mind. I've put an explanation in the details below each one for your convenience:

  1. Петрова Д.С. Санкции и их влияние на договорные обязательства в сравнительно-правовой перспективе // Правоведение. 2023. № 4 (67). C. 489–500.
Details

| Eng | Rus |
| --- | --- |
| Surname | Петрова |
| Given name | Д.С. |
| Title | Санкции и их влияние на договорные обязательства в сравнительно-правовой перспективе |
| Journal | // Правоведение. |
| Year | 2023. |
| Issue (and Volume in brackets) | № 4 (67). |
| Pages | C. 489–500. |

  1. Сирота А.Н., Иванова Е.Ю. Девальвация рубля и валютная аренда. Аргументы за и против применения статьи 451 ГК РФ в условиях кризиса // Арбитражная практика для юристов. 2016. № 4. C. 26–37.
Details

| Eng | Rus |
| --- | --- |
| Surname | Сирота |
| Given name | А.Н. |
| Surname | Иванова |
| Given name | Е.Ю. |
| Title | Девальвация рубля и валютная аренда. Аргументы за и против применения статьи 451 ГК РФ в условиях кризиса |
| Journal | // Арбитражная практика для юристов. |
| Year | 2016. |
| Issue | № 4. |
| Pages | C. 26–37. |

To pick up on that, one more question: is it possible to swap first name and last name when editing parsed citations? Currently, the parser identifies the abbreviated first name and patronymic as the surname in the above citations (e.g., "Д.С." in the first citation becomes the surname). It should actually be the other way around: what comes first (not abbreviated) is the last name, and everything else is the given name.

@inukshuk
Owner

inukshuk commented Mar 1, 2024

Thanks, I'll try to add these to the training set.

I'll have to check about the names. Parsing names in that order (surname given) without a separator between them is hard, because it's difficult to distinguish from the reverse order (given surname). Of course, if the given name consists of initials only, it's much easier to guess the intention. I'm under the impression that this should actually work, but it's possible that the Cyrillic throws it off. I'll try to look into it!
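As a rough illustration of the initials heuristic described above (not the parser's actual logic), the sketch below treats any token made up solely of single letters followed by periods as part of the given name, and everything else as the family name. The function `split_name` and its return shape are hypothetical. The character class `[^\W\d_]` matches any Unicode letter in Python 3, so Cyrillic initials are handled the same way as Latin ones.

```python
import re
from typing import Optional

# One or more "letter + period" pairs, e.g. "Д.С." or "A."
_INITIALS = re.compile(r"(?:[^\W\d_]\.)+")


def split_name(name: str) -> Optional[dict]:
    """Guess family/given parts when one side consists of initials only.

    Returns None when there are no initials to go by, because the
    order (surname given vs. given surname) is then ambiguous.
    """
    tokens = name.split()
    initials = [t for t in tokens if _INITIALS.fullmatch(t)]
    words = [t for t in tokens if not _INITIALS.fullmatch(t)]
    if initials and words:
        return {"family": " ".join(words), "given": " ".join(initials)}
    return None


first = split_name("Петрова Д.С.")
reverse = split_name("Д.С. Петрова")
```

Both orders yield the same result here, which is why initials make the intention easier to guess; a name such as "Anna Ivanova" returns `None` because nothing marks which part is the surname.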
