Skip to content

Org Mode framework for managing data from web pages and web APIs.

License

Notifications You must be signed in to change notification settings

p-snow/org-web-track

Repository files navigation

org-web-track.el - Web data tracking framework in Org Mode

MELPA MELPA Stable

org-web-track provides an org-mode framework that helps users track, manage, and leverage data such as scores, prices, or states from web pages and web APIs.

Overview

With org-web-track users can:

  • Define a tracking item as an org-mode entry with a URL and selector, which indicate where to access and from which location to obtain the desired data, respectively.
  • Update tracking items at any time and check their value.
  • Manage a group of tracking items using the org-mode column view capability.
  • Take advantage of data logs, which are provided as an org-mode table, to create a value transitive graph, for example.

Retrieving data

When users specify which snippet of data to track on web pages or web APIs, the following methods are available:

  • CSS selectors, which are applied against the source code of the web page
  • Elisp functions, which receive a DOM object or a JSON object from a web page or a web API and are expected to return a desired value from the argument
  • Shell commands, which take a page source or an API response as STDIN and are expected to print a desired value to STDOUT

Installation

org-web-track can be installed directly from MELPA using package-install. Please note that the following packages are required.

Requirements:

Basic Usage

In this section, the basic org-web-track mechanism and fundamental command usage are described.

Defining Tracking Items

Before setting up each tracking item, users need to define org-web-track-selectors-alist, which is a list of lists consisting of URL-MATCH and SELECTOR. Each element is responsible for selecting data for the tracking item that has a URL matching against URL-MATCH and applying SELECTOR to the response. An example of org-web-track-selectors-alist is as follows:

(setq org-web-track-selectors-alist ‘((“example\.com/product” [.final-price])))

The above code means that the target value for all product items, which have a URL containing “example.com/product”, can be obtained from the HTML response by searching in the tag with a class attribute of “final-price”. Refer to the description of org-web-track-selectors-alist for detailed guidance on how to define the SELECTOR.

Users can also use org-web-track-test-selector function to create and/or verify a selector for the specific URL before configuring org-web-track-selectors-alist.

After appropriately defining org-web-track-selectors-alist, users can set up each specific tracking item by calling org-web-track-setup-entry on the desired org entry or before the first heading to create a new one in the org buffer. Users will be prompted to input the tracking URL, and then the updated value will be retrieved, stored, and displayed in the echo area according to the aforementioned settings.

* Book ABC
:PROPERTIES:
:TRACK_URL: https://example.com/products/book-abc.html
:TRACK_CURRENT_VALUE: $30
:TRACK_LAST_UPDATED_TIME: [2024-07-18 Thu 16:57]
:END:
:LOGBOOK:
- Update "$30"       [2024-07-18 Thu 16:57]
:END:

Accessing Unix Domain Socket Server

While org-web-track primarily focuses on the WWW server as the access target, users also have the option to connect to a Unix Domain Socket server that provides HTTP services on a local machine. A simple example of a Unix Socket server implementation complying with the org-web-track framework can be found at https://github.com/p-snow/socket-http-server.

Updating Values

The simplest way to update the value is to call org-web-track-update-entry on the desired org entry. If the retrieved value is updated compared to the last value, the updated value will be stored as the TRACK_CURRENT_VALUE org property; otherwise, the entry will remain unchanged.

Alternatively, bulk updating is supported. To enable bulk updating, users must first define org-web-track-files. This variable should be a list of files in which all tracking items, identified by having the TRACK_URL property, are selected for bulk updating. To perform bulk updating, call org-web-track-update-files.

Displaying Column View

Column view in org-mode is a feature that displays properties in a table, providing a clear and comprehensive perspective. org-web-track offers a specialized column view where updated values are juxtaposed with their previous values for tracking items. To display the column view, call org-web-track-columns in org buffer.

If tracking items are scattered across many files, org-web-track-agenda-columns is useful as all tracking items in the aforementioned org-web-track-files are gathered in the agenda column view. Users can also update any item in the agenda column view by calling org-web-track-agenda-update.

Creating Report

All updated values from the past are logged in the entry using the existing org log note feature. Log notes have a fixed format and are placed in a drawer only if org-log-into-drawer is non-nil.

org-web-track-report creates a table where all log note values are listed in ascending order of time, showing the transition of values over time. Users can utilize the table to create a graph using Gnuplot or analyze trends with Pandas, for example.

Extended Examples

In this section, examples of how to utilize org-web-track extensively are showcased.

Automatic Bulk Updating and Email Notifications

While automatic updating may be ideal in certain situations, org-web-track refrains from providing this feature directly to prevent potential data violations. However, users can enable automatic updating by calling org-web-track-update-entry or org-web-track-update-files from Elisp code. Below is an example implementation of automatic updates with email notifications scheduled for midnight.

(defun exp/email-updated ()
  "Check for updates on all tracking items in `org-web-track-files'
and email me the updated list of items formatted as requested."
  (let* ((message-kill-buffer-on-exit t)
         (mail-msg (mapconcat
                    (lambda (chg)
                      (org-with-point-at chg
                        (let ((org-trust-scanner-tags t))
                          (format "%s\n\t%s\n"
                                  (substring-no-properties
                                   (org-get-heading t t t t))
                                  (org-web-track-current-changes nil "%p => %c" " | ")))))
                    (org-web-track-update-files))))
    (unless (string-blank-p mail-msg)
      ;; SMTP settings are required in advance (see smtpmail-xxx vaiables)
      (message-mail user-mail-address "Web Tracking Notification")
      (message-goto-body)
      (insert mail-msg)
      (message-send-and-exit))))

(require 'midnight)
(add-hook 'midnight-hook #'exp/email-updated)
(midnight-mode 1)

Q&A

Network Certificate Issue

Non-interactive invocation for org-web-track-update-entry may fail due to an unverified network certificate. This issue can occur when accessing a website that offers an unverified certificate, and the variable network-security-level is set to ‘medium’ or higher. To address the issue, accept the certificate by calling the org-web-track-update-entry command interactively up-front.

License

GPLv3