Commands and functions for retrieving web page content, converting it to Org-mode content, and displaying it.
Latest commit e958324 Oct 14, 2017 @alphapapa alphapapa Make Pandoc testing sleep time customizable
Fixes #9.  Thanks to @Priyadarshan for reporting.

README.org

org-web-tools

https://melpa.org/packages/org-web-tools-badge.svg https://stable.melpa.org/packages/org-web-tools-badge.svg

This file contains library functions and commands useful for retrieving web page content and processing it into Org-mode content.

For example, you can copy a URL to the clipboard or kill-ring, then run a command that downloads the page, isolates the “readable” content with eww-readable, converts it to Org-mode content with Pandoc, and displays it in an Org-mode buffer. Another command does all of that but inserts it as an Org entry instead of displaying it in a new buffer.

Installation

This package requires Emacs 25.1 or later.

MELPA

If you installed from MELPA, just run one of the commands below. If you want to use any of the functions in your own code, you should (require 'org-web-tools).

Manual

Install dash.el and s.el, then require this package in your init file:

(require 'org-web-tools)
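
If the package file is not in a directory Emacs already searches, the require form above will probably fail until that directory is added to load-path. A minimal sketch, assuming the file was placed in a hypothetical ~/src/org-web-tools directory (the path is only an illustration, not something this README specifies):

```elisp
;; Hypothetical checkout location; change it to wherever
;; org-web-tools.el actually lives on your machine.
(add-to-list 'load-path (expand-file-name "~/src/org-web-tools"))

;; With the directory on `load-path', the library can be loaded.
(require 'org-web-tools)
```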

Usage

Commands

  • org-web-tools-insert-link-for-url: Insert an Org-mode link to the URL in the clipboard or kill-ring. Downloads the page to get the HTML title.
  • org-web-tools-insert-web-page-as-entry: Insert the web page for the URL in the clipboard or kill-ring as an Org-mode entry, as a sibling heading of the current entry.
  • org-web-tools-read-url-as-org: Display the web page for the URL in the clipboard or kill-ring as Org-mode text in a new buffer, processed with eww-readable.
  • org-web-tools-convert-links-to-page-entries: Convert all URLs and Org links in current Org entry to Org headings, each containing the web page content of that URL, converted to Org-mode text and processed with eww-readable. This should be called on an entry that solely contains a list of URLs or links.
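
The commands above can be run with M-x, but binding them to keys may be more convenient. The following is only a sketch: the key sequences are arbitrary choices under the user-reserved C-c prefix, not bindings provided by the package, and it assumes the commands are either autoloaded (as with a MELPA install) or that org-web-tools has already been required.

```elisp
;; Hypothetical bindings; pick whatever keys suit your setup.
(with-eval-after-load 'org
  ;; These two insert into the current Org buffer, so they are
  ;; bound only in `org-mode-map'.
  (define-key org-mode-map (kbd "C-c w l")
    #'org-web-tools-insert-link-for-url)
  (define-key org-mode-map (kbd "C-c w e")
    #'org-web-tools-insert-web-page-as-entry))

;; This one opens a new buffer, so a global binding seems reasonable.
(global-set-key (kbd "C-c w r") #'org-web-tools-read-url-as-org)
```

With bindings like these, the workflow would be: copy a URL, switch to an Org buffer, and press the chosen key.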

Functions

These are used in the commands above and may be useful in building your own commands.

  • org-web-tools--eww-readable: Return “readable” part of HTML with title.
  • org-web-tools--get-url: Return content for URL as string.
  • org-web-tools--html-title: Return title of HTML page.
  • org-web-tools--html-to-org-with-pandoc: Return string of HTML converted to Org with Pandoc.
  • org-web-tools--url-as-readable-org: Return string containing Org entry of URL's web page content. Content is processed with eww-readable and Pandoc. Entry will be a top-level heading, with article contents below a second-level “Article” heading, and a timestamp in the first-level entry for writing comments.
  • org-web-tools--demote-headings-below: Demote all headings in buffer so the highest level is below LEVEL.
  • org-web-tools--get-first-url: Return URL in clipboard, or first URL in the kill-ring, or nil if none.
  • org-web-tools--read-org-bracket-link: Return (TARGET . DESCRIPTION) for Org bracket LINK or next link on current line.
  • org-web-tools--remove-dos-crlf: Remove all DOS CRLF (^M) in buffer.
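
As an illustration of building your own command from these functions, here is a small sketch that reports the title of the page whose URL is in the clipboard or kill-ring. The command name my/org-web-tools-show-title is hypothetical, and the argument lists are assumptions inferred from the descriptions above (org-web-tools--get-url taking a URL string, org-web-tools--html-title taking the HTML string); check the docstrings in your installed version before relying on them. Note also that double-dash names conventionally mark internal functions, which may change between releases.

```elisp
(require 'org-web-tools)

(defun my/org-web-tools-show-title ()
  "Show the HTML title of the page whose URL is in the clipboard or kill-ring.
Hypothetical example command; not part of org-web-tools."
  (interactive)
  (let ((url (org-web-tools--get-first-url)))
    (if (null url)
        ;; `org-web-tools--get-first-url' is documented to return nil
        ;; when no URL is found.
        (user-error "No URL found in clipboard or kill-ring")
      ;; Assumed signatures: the first call takes a URL string and
      ;; returns the page content; the second takes that HTML string.
      (let* ((html (org-web-tools--get-url url))
             (title (org-web-tools--html-title html)))
        ;; Fall back to the URL itself if no title was found.
        (message "%s" (or title url))))))
```

After evaluating the definition, copy a URL and run M-x my/org-web-tools-show-title.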

Changelog

1.0-pre

  • Initial release.

Development

Contributions and suggestions are welcome.

License

GPLv3