clixmod

Status: Basic functionality is still being implemented. The author has used it for real work.

Modify XML and HTML documents using XPATHs from the command line. Analogous to s// in the command-line tool sed. A companion utility to clixpath.

Requires Python 3, but can coexist with a Python 2 installation.

See also and flagrant self-promotion

  • clixpath is a tool by the author to extract information from XML and HTML documents.

The author also maintains a list of potentially interesting tools they have written here.

Motivation

  • Hopefully easier to use than XSLT
  • Delete elements from HTML
  • Change element types
  • Remove attributes
  • Empty the contents of elements into their parents

Examples

# Replace all h2 elements with h1 elements
clixmod //h2 '<h1>{content}</h1>'

# Delete h2 elements
clixmod //h2 ''

# Turn all level-2 headers to level-1 headers
# clixmod //h2 children=descendant::* '<h1>{children}</h1>' # Not implemented

# Show all the class attributes
# clixmod -f //@class  # Not implemented

# Delete all the class attributes
# clixmod //@class '' # Not implemented
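For comparison, the first two examples can be approximated by hand in Python. This is only a sketch of the equivalent tree edits, not how clixmod is implemented: it uses the standard library's xml.etree.ElementTree rather than lxml, it needs well-formed XML input, and the helper names `rename_elements` and `delete_elements` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def rename_elements(root, tag, new_tag):
    """Rename every descendant element called `tag`, keeping its content."""
    # Materialise the iterator first so that changing tags cannot disturb it.
    for element in list(root.iter(tag)):
        element.tag = new_tag


def delete_elements(root, tag):
    """Remove every descendant element called `tag`, keeping the text after it."""
    for parent in list(root.iter()):
        previous = None  # last sibling that was kept
        for child in list(parent):
            if child.tag != tag:
                previous = child
                continue
            if child.tail:
                # ElementTree stores the text that follows an element on the
                # element itself, so move it before removing the element.
                if previous is None:
                    parent.text = (parent.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            parent.remove(child)


if __name__ == "__main__":
    root = ET.fromstring("<body><h2>Title</h2><p>Text</p></body>")
    rename_elements(root, "h2", "h1")
    print(ET.tostring(root, encoding="unicode"))
```

Even for these two small operations the manual version needs care with iteration and trailing text, which is the kind of bookkeeping a one-line command avoids.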

Installing

pip install git+https://github.com/talwrii/clixmod#egg=clixmod

Usage

usage: clixmod [-h] [--debug] [--filter]
               XPATH [selection_xpath [selection_xpath ...]] replacement

Modify xml or html documents

positional arguments:
  XPATH            XPATH to modify
  selection_xpath  relative XPATH to select. Use name=XPATH for named groups
  replacement      XML to replace XPATH with. {} for entire match, {1}... for
                   selection XPATHs

optional arguments:
  -h, --help       show this help message and exit
  --debug          Include debug output (to stderr)
  --filter, -f     Show matches rather than output text
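One possible reading of the selection and replacement arguments described in the help text can be sketched in Python. This is an assumption about the semantics, not clixmod's actual code: the functions `split_selection` and `fill_replacement` are hypothetical, and the sketch relies on Python's str.format, which cannot mix `{}` and `{1}` in a single template and treats literal braces specially.

```python
import re

# Assumption: a named group looks like identifier=XPATH.
_NAMED = re.compile(r"^([A-Za-z_]\w*)=(.+)$", re.DOTALL)


def split_selection(argument):
    """Return (name, xpath); name is None for a positional selection."""
    match = _NAMED.match(argument)
    if match:
        return match.group(1), match.group(2)
    return None, argument


def fill_replacement(template, whole_match, positional=(), named=None):
    """Fill a replacement template.

    Index 0 (or a bare {}) stands for the entire match, {1}, {2}, ... for the
    positional selections, and {name} for the named groups.
    """
    return template.format(whole_match, *positional, **(named or {}))


if __name__ == "__main__":
    name, xpath = split_selection("children=descendant::*")
    print(name, xpath)
    print(fill_replacement("<h1>{children}</h1>", "<h2>T</h2>", named={"children": "T"}))
```

Under this reading, `children=descendant::*` would define a group called `children`, while a bare `descendant::*` would become `{1}`.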

Alternatives and prior work

  • This tool is in many ways nothing more than a convenience wrapper around the lxml library (though it should be much easier to use for its use cases).
  • XSLT is a W3C language for transforming XML.
