PDF Scrapper: Save/Export the "Text mode" and "BibTeX mode" buffers #140

Closed
j-steinbach opened this issue Dec 6, 2020 · 5 comments

@j-steinbach

At the moment, the PDF Scrapper extracts a list of references from the source PDF.
I manually edit and clean that list in the "Text mode" buffer, and it is then turned into a list of BibTeX entries in the "BibTeX mode" buffer. There I do some more manual cleaning, and that list is then turned into a list of cite-keys in the "Org mode" buffer.
Finally, the list of cite-keys is inserted into the original .org file (where I started the PDF Scrapper extraction), but all the manual work I did in the "Text mode" and "BibTeX mode" buffer gets thrown away.

I would like to have the content of those buffers also get inserted into my original .org file.

There are multiple reasons:

  • If I make a mistake somewhere but the PDF extraction process is already closed, I have to do all the manual work again. As scientific papers can have upwards of a hundred references, this is a heavy chunk of work to do again.
  • I usually import the BibTeX entries from the "BibTeX mode" buffer into Zotero (to update my global .bib library file). As they live in a temporary buffer, I have to interrupt my workflow of cleaning the extracted references and deal with Zotero. If they are instead saved into the original .org file, I can deal with them when I want.
  • If the "in-between steps" are saved, they can be used to restart the PDF Scrapper process from there, maybe even without having to use Anystyle again.

As mentioned in #134

@myshevchuk
Member

myshevchuk commented Dec 7, 2020

There are technically three features here.

  1. An export command to save the buffer. In the current implementation, using Emacs functions such as write-file (C-x C-w) to save a buffer results in a broken Scrapper session. Either fix this or make a separate command. I'd prefer the former, but the latter is easier to implement.
  2. Provide an option to allow including the extracted text and BibTeX references as subtrees in the buffer of origin.
  3. Restart ORB PDF Scrapper from a text or BibTeX source.

Features 1. and 2. will be quite easy, while 3. will require more work and should probably be filed as a separate feature request.
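
Features 1. and 2. could be sketched roughly as follows. This is a minimal illustration, not ORB PDF Scrapper's actual code: both function names are hypothetical, and it assumes that write-file causes trouble because it makes the buffer visit the new file, which the thread does not confirm. The sketch relies only on the standard `write-region` function, which can write a buffer's contents to disk without changing which file the buffer visits.

```emacs-lisp
;;; Hypothetical sketch; these names are not part of ORB PDF Scrapper.

(defun my/scrapper-export-buffer (file)
  "Write the whole current buffer to FILE without visiting FILE.
Unlike `write-file', `write-region' with a nil VISIT argument leaves
the buffer's file association unchanged."
  (interactive "FExport buffer to file: ")
  ;; A nil START means the entire buffer, regardless of narrowing.
  (write-region nil nil file))

(defun my/scrapper-insert-as-subtree (origin heading text lang)
  "Append TEXT to buffer ORIGIN as a source block under HEADING.
LANG names the block language, e.g. \"text\" or \"bibtex\"."
  (with-current-buffer origin
    (save-excursion
      (goto-char (point-max))
      ;; Make sure the new heading starts on its own line.
      (unless (bolp) (insert "\n"))
      (insert "* " heading "\n"
              "#+begin_src " lang "\n"
              text
              ;; Terminate TEXT with a newline if it lacks one.
              (if (string-suffix-p "\n" text) "" "\n")
              "#+end_src\n"))))
```

For example, calling `(my/scrapper-insert-as-subtree origin "Extracted BibTeX" (buffer-string) "bibtex")` from the "BibTeX mode" buffer would append its contents to the buffer `origin`. The sketch always inserts a top-level heading at the end of the buffer and does not escape lines in TEXT that begin with `*` or `#+`, which a real implementation would need to handle.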

@j-steinbach
Author

I agree. Originally I only had the first two; the last one was an afterthought, something to consider for the future. Should I put it in a separate feature request? I don't find it that important personally, but it makes sense from a logical perspective.

@myshevchuk
Member

> but it makes sense from a logical perspective.

Absolutely. Please make a separate feature request.

I actually had an idea of making ORB PDF Scrapper into a separate, stand-alone package. It would, of course, be integrated into Org Roam (BibTeX), yet could be used independently of them anywhere else in Emacs. Such a feature would be a step toward this. Thank you for the idea!

@j-steinbach
Author

j-steinbach commented Dec 8, 2020

Here you go :)

I also added an issue to discuss the splitting of the package.

@myshevchuk
Member

myshevchuk commented Mar 15, 2021

Now that #146 has been merged, this issue can be closed.
