org-web-tools
This file contains library functions and commands useful for retrieving web page content and processing it into Org-mode content.
For example, you can copy a URL to the clipboard or kill-ring, then run a command that downloads the page, isolates the “readable” content with eww-readable, converts it to Org-mode content with Pandoc, and displays it in an Org-mode buffer. Another command does all of that but inserts it as an Org entry instead of displaying it in a new buffer.
Installation
This package requires Emacs 25.1 or later.
MELPA
If you installed from MELPA, just run one of the commands below. If you want to use any of the functions in your own code, you should (require 'org-web-tools).
Manual
Install dash.el and s.el. Then require this package in your init file:
(require 'org-web-tools)
Usage
Commands
- org-web-tools-insert-link-for-url: Insert an Org-mode link to the URL in the clipboard or kill-ring. Downloads the page to get the HTML title.
- org-web-tools-insert-web-page-as-entry: Insert the web page for the URL in the clipboard or kill-ring as an Org-mode entry, as a sibling heading of the current entry.
- org-web-tools-read-url-as-org: Display the web page for the URL in the clipboard or kill-ring as Org-mode text in a new buffer, processed with eww-readable.
- org-web-tools-convert-links-to-page-entries: Convert all URLs and Org links in the current Org entry to Org headings, each containing the web page content of that URL, converted to Org-mode text and processed with eww-readable. This should be called on an entry that contains only a list of URLs or links.
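The commands above can be run with M-x, or bound to keys. The following is a minimal init-file sketch; the key sequences are arbitrary choices made for illustration, not bindings provided by this package.

```elisp
;; Hypothetical key choices: pick any sequences that are free in your setup.
(require 'org-web-tools)

;; These two commands insert into the current Org buffer, so bind them in Org only.
(with-eval-after-load 'org
  (define-key org-mode-map (kbd "C-c w l") #'org-web-tools-insert-link-for-url)
  (define-key org-mode-map (kbd "C-c w e") #'org-web-tools-insert-web-page-as-entry))

;; This command opens a new buffer, so a global binding is reasonable.
(global-set-key (kbd "C-c w r") #'org-web-tools-read-url-as-org)
```

With this in place, copying a URL and pressing C-c w l in an Org buffer would insert a titled link at point.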
Functions
These are used in the commands above and may be useful in building your own commands.
- org-web-tools--eww-readable: Return “readable” part of HTML with title.
- org-web-tools--get-url: Return content for URL as string.
- org-web-tools--html-title: Return title of HTML page.
- org-web-tools--html-to-org-with-pandoc: Return string of HTML converted to Org with Pandoc.
- org-web-tools--url-as-readable-org: Return string containing Org entry of URL's web page content. Content is processed with eww-readable and Pandoc. Entry will be a top-level heading, with article contents below a second-level “Article” heading, and a timestamp in the first-level entry for writing comments.
- org-web-tools--demote-headings-below: Demote all headings in buffer so the highest level is below LEVEL.
- org-web-tools--get-first-url: Return URL in clipboard, or first URL in the kill-ring, or nil if none.
- org-web-tools--read-org-bracket-link: Return (TARGET . DESCRIPTION) for Org bracket LINK or next link on current line.
- org-web-tools--remove-dos-crlf: Remove all DOS CRLF (^M) in buffer.
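As a rough illustration of building your own command from these functions, the sketch below shows the title of the page whose URL is in the clipboard or kill-ring. The command name my/show-title-of-copied-url is hypothetical. The sketch also assumes, based on the descriptions above, that org-web-tools--get-url takes a URL string and that org-web-tools--html-title takes an HTML string; check the docstrings in your installed version before relying on it.

```elisp
(require 'org-web-tools)

(defun my/show-title-of-copied-url ()
  "Show the HTML title of the URL in the clipboard or kill-ring.
Hypothetical example command; not part of org-web-tools."
  (interactive)
  (let ((url (org-web-tools--get-first-url)))
    ;; `org-web-tools--get-first-url' returns nil when no URL is found.
    (unless url
      (user-error "No URL found in the clipboard or kill-ring"))
    ;; Assumed signatures: (org-web-tools--get-url URL) and
    ;; (org-web-tools--html-title HTML).
    (let* ((html (org-web-tools--get-url url))
           (title (org-web-tools--html-title html)))
      (message "%s: %s" url (or title "(no title)")))))
```

Because the double-hyphen names mark these functions as internal, their arguments and return values may change between releases, so a command like this may need adjusting after an upgrade.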
Changelog
1.0-pre
- Initial release.
Development
Contributions and suggestions are welcome.
License
GPLv3