Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parsebib-clean-TeX-markup #21

Closed
wants to merge 1 commit into from

Conversation

bdarcus
Copy link
Contributor

@bdarcus bdarcus commented Jun 12, 2022

... along with the associated parsebib-TeX-markup-replace-alist, though
converted to a defvar, since this isn't a user-facing package.

Close #19


I didn't touch ebib; figure you can remove that independently, as you wish.

... along with the associated parsebib-TeX-markup-replace-alist, though
converted to a defvar, since this isn't a user-facing package.

Close joostkremers#19
@Hugo-Heagren
Copy link

JSYK, The ebib version of this function and associated variables changed a bit recently -- you might prefer to copy/alter that version (or you might not 🤷).

@bdarcus
Copy link
Contributor Author

bdarcus commented Jun 12, 2022

OMG; that was a total oversight on my part; I wasn't paying attention to getting the current version.

Seems it changed more than "a bit" though!!

Edit: I guess a lot of that was test code though.

I'll let @joostkremers weigh on on what he wants to do.

@joostkremers
Copy link
Owner

I'll let @joostkremers weigh on on what he wants to do.

Heh, good question. Ideally, this stuff should only be implemented once, of course. So if we want parsebib to be able to make use of it, it should be in parsebib. I don't think there's any reason not to have it there, so in principle I'm not against it.

AFAICT it would be possible to simply move ebib-clean-TeX-markup and related code to parsebib, so that in Ebib we'd only need to change the function calls and not much else.

The question is just what to do with ebib-TeX-markup-replace-alist, which is a user option.

Normally, I'd want to keep it a user option, which means duplicating it in Ebib and parsebib and having Ebib pass its value to parsebib. This is what I did with ebib-biblatex-inheritances when I moved that functionality into parsebib.

However, in this particular case, perhaps we don't need to do so. I mean, even though ebib-TeX-markup-replace-alist is technically a user option, it's such a long and unwieldy list that I doubt any normal user will try and customise it.

So I would move ebib-TeX-markup-replace-alist to parsebib as well and turn it into a defvar. A user that really wants to customise it can always redefine the variable in their init file.

I would like to have some customisability, though, so perhaps we can introduce an option ebib-clean-TeX-markup (or some similar name), which for now is just a toggle. The (new) function parsebib-clean-TeX-markup would then have to take an additional optional argument indicating whether it should actually do its magic or not. (Eventually, I'd like to turn this option into a list of replacement types, which would roughly correspond to the different groups indicated in the definition of ebib-TeX-markup-replacement-alist, but that can wait.)

Another question is the integration of this new parsebib-clean-TeX-markup into parsebib itself. It seems to me that parsebib-parse-bib-buffer should have an additional keyword argument to indicate whether it should clean TeX markup in field values and that parsebib-parse should set this argument to the value of its own display argument.

The actual clean-up should be done in parsebib--parse-bib-value, if I'm not mistaken, which means that it should also get an additional optional argument, and so should parsebib-read-entry and parsebib--parse-bibtex-field, because they need to pass the toggle on to parsebib--parse-bib-value.

@Hugo-Heagren
Copy link

Another question is the integration of this new parsebib-clean-TeX-markup into parsebib itself. It seems to me that parsebib-parse-bib-buffer should have an additional keyword argument to indicate whether it should clean TeX markup in field values and that parsebib-parse should set this argument to the value of its own display argument.

The actual clean-up should be done in parsebib--parse-bib-value, if I'm not mistaken, which means that it should also get an additional optional argument, and so should parsebib-read-entry and parsebib--parse-bibtex-field, because they need to pass the toggle on to parsebib--parse-bib-value.

I support @bdarcus original motivation for including this function in parsebib (as opposed to ebib). I'm less sure about the stuff quoted above. Parsebib is a parsing library -- it's principle function is to get information written in a structured file and return it in a structured lisp object. And the information in the file has the markup in it, not the cleaned up text. joostkremers/ebib#238 and joostkremers/ebib#248 were intended to make presenting that data prettier. I'm worried that including cosmetic functionality in the parsing functions might might make it harder to access downstream.

This is a long winded way of saying that if there's a boolean option to clean up when parsing, it should be off by default.

@joostkremers
Copy link
Owner

I support @bdarcus original motivation for including this function in parsebib (as opposed to ebib). I'm less sure about the stuff quoted above. Parsebib is a parsing library -- it's principle function is to get information written in a structured file and return it in a structured lisp object. And the information in the file has the markup in it, not the cleaned up text. joostkremers/ebib#238 and joostkremers/ebib#248 were intended to make presenting that data prettier. I'm worried that including cosmetic functionality in the parsing functions might might make it harder to access downstream.

But parsebib already includes functionality to make the data more presentable: Returning entries for display. This is basically just an extension of that.

Ebib doesn't use any of this cleanup functionality, because it needs the exact contents of the .bib file to work with. Other packages that use parsebib don't, though, they just present the contents of a .bib file to the user. For such packages, it's more convenient to be able to pass an argument to parsebib-parse-bib-buffer and get nice display-ready results.

This is a long winded way of saying that if there's a boolean option to clean up when parsing, it should be off by default.

Well, it's up to the caller to decide which kind of cleanup parsebib should do, if any. Obviously, when Ebib opens a .bib file, it shouldn't use this option.

@bdarcus
Copy link
Contributor Author

bdarcus commented Jun 13, 2022

But parsebib already includes functionality to make the data more presentable: Returning entries for display. This is basically just an extension of that.

Exactly.

For context, here's the original citar issue that led to this emacs-citar/citar#535.

The sequence is:

  1. I start bibtex-actions as a bibtex-completion front-end, which implements a simple version of this.
  2. I drop that dependency (and changed the project name) and use parsebib directly, and borrow that function
  3. but it's limited, per the above issue, and then @joostkremers noted he has a pretty good one in ebib.

So it naturally seems the best place to put this code here, so that each project should rely on the same function, rather than reimplement.

But as you can tell, I didn't realize the additional complexity :-)

PS - I should also note emacs-citar/citar#532, on name-parsing and formatting. Not sure where that would have to happen to work correctly, but probably here.

BTW, I find parsebib really useful, since I don't have to worry about low-level formatting details, and just call fields to display.

@bdarcus
Copy link
Contributor Author

bdarcus commented Jun 13, 2022

Also, we currently have a citar-display-transform-functions defcustom, which allows one to easily configure which string manipulation functions are run against which fields. I guess something like that could be here, or we could just use this function as one of them.

@joostkremers
Copy link
Owner

joostkremers commented Jun 13, 2022

But as you can tell, I didn't realize the additional complexity :-)

I don't think it's that complex. I started implementing it, should have something testable in a few days.

PS - I should also note emacs-citar/citar#532, on name-parsing and formatting. Not sure where that would have to happen to work correctly, but probably here.

It would at least be in line with the "prettify for display" functionality that parsebib already has. For JSON name fields there is already something similar: https://github.com/joostkremers/parsebib#parsebib-json-name-field-template.

@bdarcus
Copy link
Contributor Author

bdarcus commented Jun 13, 2022

I don't think it's that complex.

I just mean for someone not familiar with the codebase :-)

I started implementing it, should have something testable tomorrow.

Great! In that case, I'll close this.

It would at least be in line with the "prettify for display" functionality that parsebib already has. For JSON name fields there is already something similar: https://github.com/joostkremers/parsebib#parsebib-json-name-field-template.

A difference, however, is CSL JSON name representation is a structured object, while bibtex is a string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move ebib-clean-TeX-markup to parsebib
3 participants