PDF Scrapper: Save/Export the "Text mode" and "BibTeX mode" buffers #140

j-steinbach · 2020-12-06T20:32:03Z

At the moment, the PDF Scrapper extracts a list of references from the source PDF.
I manually edit and clean that list in the "Text mode" buffer, then it gets turned into a list of bib-entries in the "BibTeX mode" buffer. There I again manually do some cleaning, and then that list gets turned into a list of cite-keys in the "Org mode" buffer.
Finally, the list of cite-keys is inserted into the original .org file (where I started the PDF Scrapper extraction), but all the manual work I did in the "Text mode" and "BibTeX mode" buffer gets thrown away.

I would like to have the content of those buffers also get inserted into my original .org file.

There are multiple reasons:

- If I make a mistake somewhere but the PDF extraction process is already closed, I have to do all the manual work again. As scientific papers can have upwards of a hundred references, this is a heavy chunk of work to repeat.
- I usually import the BibTeX entries from the "BibTeX mode" buffer into Zotero (to update my global .bib library file). As they live in a temporary buffer, I have to interrupt my workflow of cleaning the extracted references to deal with Zotero. If they were instead saved into the original .org file, I could deal with them whenever I want.
- If the "in-between steps" are saved, they can be used to restart the PDF Scrapper process from there, maybe even without having to use Anystyle again.

As mentioned in #134

myshevchuk · 2020-12-07T09:13:30Z

There are technically three features here.

1. An export command to save the buffer. In the current implementation, using Emacs functions such as write-file (C-x C-w) to save a buffer results in a broken scrapper session. Either fix this or make a separate command. I'd prefer the former, but the latter is easier to implement.
2. Provide an option to include the extracted text and BibTeX references as subtrees in the buffer of origin.
3. Restart ORB PDF Scrapper from a text or BibTeX source.

Features 1 and 2 will be quite easy, while feature 3 will require more work and should probably be filed as a separate feature request.
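For illustration, here is a minimal Emacs Lisp sketch of what the first two features could look like. It is not code from Org Roam BibTeX: the function names `my/scrapper-export-buffer` and `my/scrapper-insert-as-subtree` are hypothetical, and wrapping the copied text in an example block is an assumption about how the subtree might be laid out. The sketch relies on the fact that `write-region`, unlike `write-file`, does not make the buffer visit the target file, so the buffer keeps its name and state.

```emacs-lisp
;; Hypothetical helpers, not part of Org Roam BibTeX.

(defun my/scrapper-export-buffer (file)
  "Write the whole current buffer to FILE without visiting FILE.
Unlike `write-file', `write-region' leaves the buffer's name and
visited file untouched."
  (interactive "FExport buffer to file: ")
  ;; A nil START means the entire buffer contents are written.
  (write-region nil nil file))

(defun my/scrapper-insert-as-subtree (source-buffer target-buffer heading)
  "Append the text of SOURCE-BUFFER to TARGET-BUFFER under HEADING.
The text is wrapped in an Org example block (an assumption of
this sketch) so that it is not parsed as Org markup."
  (let ((text (with-current-buffer source-buffer
                (buffer-substring-no-properties (point-min) (point-max)))))
    (with-current-buffer target-buffer
      (save-excursion
        (goto-char (point-max))
        ;; Make sure the new heading starts on its own line.
        (unless (bolp) (insert "\n"))
        (insert "* " heading "\n#+begin_example\n" text)
        ;; Make sure the closing delimiter starts on its own line.
        (unless (bolp) (insert "\n"))
        (insert "#+end_example\n")))))
```

With these definitions, calling `M-x my/scrapper-export-buffer` in the "BibTeX mode" buffer would save a copy of the cleaned entries to a file, which could then be imported into Zotero at a later time.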

j-steinbach · 2020-12-07T16:06:39Z

I agree. Originally I only had the first two; the last one was an afterthought, something to consider for the future. Should I put it in a separate feature request? I don't find it that important personally, but it makes sense from a logical perspective.

myshevchuk · 2020-12-07T17:20:20Z

> but it makes sense from a logical perspective.

Absolutely. Please make a separate feature request.

I actually had an idea of making ORB PDF Scrapper into a separate, stand-alone package. It would, of course, be integrated into Org Roam (BibTeX), yet could be used independently of them anywhere else in Emacs. Such a feature would be a step toward this. Thank you for the idea!

j-steinbach · 2020-12-08T16:39:31Z

Here you go :)

I also added an issue to discuss the splitting of the package.

🧹

myshevchuk · 2021-03-15T16:59:31Z

Now that #146 has been merged, this issue can be closed.

This was referenced Dec 8, 2020

PDF Scrapper: Restart ORB PDF Scrapper from a text or BibTeX source. #142

Open

Split ORB PDF Scrapper into its own package #143

Open

myshevchuk added the pdf scrapper - enhancement label Dec 9, 2020

myshevchuk closed this as completed Mar 15, 2021
