EdText: ed-like text selection

This library provides an EdText class for selecting and manipulating lines of text from a string, using addressing inspired by the classic ed text editor.

This isn't on PyPI yet. To use it, install it from GitHub:

python -m pip install git+https://github.com/nedbat/edtext

How it works

Suppose we have this file:

# gettysburg.txt

Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and
dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that
nation, or any nation so conceived and so dedicated, can long
endure. We are met on a great battle-field of that war. We have
come to dedicate a portion of that field, as a final resting
place for those who here gave their lives that that nation
might live. It is altogether fitting and proper that we should
do this.

-- Abraham Lincoln, Gettysburg PA, 1863

Make an EdText object from the text of the file:

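>>> from pathlib import Path
>>> from edtext import EdText  # assumed import path; check the package if this fails
>>> getty = EdText(Path("gettysburg.txt").read_text())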

The lines of text are stored for selection and manipulation. The full text is recreated when the object is turned into a string:

>>> str(getty)[:60]
'# gettysburg.txt\n\nFour score and seven years ago our fathers'
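The store-lines-then-rejoin idea can be sketched in plain Python. This is only an illustration: LineStore is a hypothetical stand-in, not the real EdText class, and how EdText actually stores its lines is an assumption here.

```python
class LineStore:
    """Hypothetical stand-in for illustration; not the real EdText class."""

    def __init__(self, text):
        # keepends=True keeps each line's newline, so joining is lossless.
        self.lines = text.splitlines(keepends=True)

    def __str__(self):
        # Recreate the full text from the stored lines.
        return "".join(self.lines)


text = "# gettysburg.txt\n\nFour score and seven years ago\n"
store = LineStore(text)
print(len(store.lines))    # 3
print(str(store) == text)  # True
```

Keeping the line endings on each stored line is one simple way to make the round trip exact.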

Line selection

Instead of using string slicing, EdText objects provide line selection. It's available via three aliases: range(), ranges(), or list-like slicing with square brackets. All do the same operation: select lines based on the addresses provided, and produce a new EdText object.

Here we select lines starting from the first line that matches "Four" to the line before the next blank line:

>>> print(getty.range("/Four/; /^$/-"))
Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and
dedicated to the proposition that all men are created equal.

The range argument is a string with the ed range to select. In this example, /Four/ means the first line containing the regex "Four", the semicolon means to continue from that point, /^$/ matches the next blank line, and the trailing - backs up one line to select the line before the blank line.
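To make the steps concrete, here is the same selection done by hand with the standard re module. It is a minimal sketch of how the addresses resolve, not EdText's implementation; next_match is a hypothetical helper, and unlike classic ed it does not wrap around to the top of the text.

```python
import re

TEXT = """\
# gettysburg.txt

Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and
dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that
"""


def next_match(lines, pattern, after):
    """Return the 1-based number of the first line after `after` that matches.

    Hypothetical helper: `after` is a 1-based line number, 0 means "start at
    the top". There is no wrap-around in this sketch.
    """
    for number in range(after + 1, len(lines) + 1):
        if re.search(pattern, lines[number - 1]):
            return number
    raise ValueError(f"no line after {after} matches {pattern!r}")


lines = TEXT.splitlines()
start = next_match(lines, r"Four", after=0)      # /Four/
end = next_match(lines, r"^$", after=start) - 1  # ; /^$/ then the trailing -
selected = lines[start - 1:end]                  # 1-based inclusive range
print("\n".join(selected))
```

Here start resolves to line 3 and end to line 5, which is the three-line paragraph printed above.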

You can pass several address ranges to select more than one range at once:

>>> print(getty.range("/Four/; +2", "$"))
Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and
dedicated to the proposition that all men are created equal.
-- Abraham Lincoln, Gettysburg PA, 1863

The /Four/; +2 means the line matching "Four" plus the next two lines. $ means the last line.

With multiple address ranges, each range starts from where the previous range ended.
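One consequence is that the same pattern can pick different lines in successive ranges. The sketch below imitates that carry-over in plain Python. It is an illustration under assumptions, not EdText's code: next_match is a hypothetical helper, and the line numbers refer to this five-line excerpt, not the full file.

```python
import re

lines = [
    "on this continent, a new nation, conceived in Liberty, and",     # line 1
    "dedicated to the proposition that all men are created equal.",   # line 2
    "",                                                               # line 3
    "Now we are engaged in a great civil war, testing whether that",  # line 4
    "nation, or any nation so conceived and so dedicated, can long",  # line 5
]


def next_match(lines, pattern, after):
    """Hypothetical helper: first 1-based line after `after` that matches."""
    for number in range(after + 1, len(lines) + 1):
        if re.search(pattern, lines[number - 1]):
            return number
    raise ValueError(f"no line after {after} matches {pattern!r}")


current = 0  # nothing selected yet, so the first search starts at the top
found = []
for pattern in ("nation", "nation"):
    # Each search begins where the previous one ended.
    current = next_match(lines, pattern, after=current)
    found.append(current)
print(found)  # [1, 5]
```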

Although we are using strings to determine line numbers, this feels like slicing, so square bracket slicing does the same thing as range():

>>> print(getty[r"/Now/;/\./", "$-;$"])
Now we are engaged in a great civil war, testing whether that
nation, or any nation so conceived and so dedicated, can long
endure. We are met on a great battle-field of that war. We have

-- Abraham Lincoln, Gettysburg PA, 1863

Note that you must use strings, not integers, for slicing, and that, like ed, lines are numbered starting from 1. To get lines 10 through 12, [10, 12] won't work; you need to use ["10, 12"]:

>>> print(getty["10, 12"])
come to dedicate a portion of that field, as a final resting
place for those who here gave their lives that that nation
might live. It is altogether fitting and proper that we should

Since we can select a number of ranges at once, ranges() is an alias for range().
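For anyone used to Python's 0-based, end-exclusive slices, the 1-based, inclusive numbering described above maps onto list slicing as follows. The ed_slice helper is hypothetical and only illustrates the arithmetic.

```python
# Stand-in for the 15 lines of gettysburg.txt.
lines = [f"line {n}" for n in range(1, 16)]


def ed_slice(lines, first, last):
    """Return lines first..last, counted from 1 and inclusive at both ends."""
    return lines[first - 1:last]


print(ed_slice(lines, 10, 12))  # ['line 10', 'line 11', 'line 12']
```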

sub(range, pattern, repl)

Another operation is EdText.sub(), which makes regex replacements on selected lines:

>>> print(getty.sub("g/and/", r"e", "E")["1,5"])
# gettysburg.txt

Four scorE and sEvEn yEars ago our fathErs brought forth
on this continEnt, a nEw nation, concEivEd in LibErty, and
dedicated to the proposition that all men are created equal.

The first argument is a range of line addresses: the lines on which to apply the substitution. Note that /pat/ finds the next matching line, not all matching lines. Use g/pat/ to select all lines matching the pattern.
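In plain Python terms, a g/and/ substitution behaves roughly like the following: test every line against the selecting pattern, and run the replacement only on the lines that match. This is a sketch of the idea using the standard re module, not EdText's implementation.

```python
import re

lines = [
    "Four score and seven years ago our fathers brought forth",
    "dedicated to the proposition that all men are created equal.",
]

# g/and/ selects every line matching "and"; other lines pass through unchanged.
result = [
    re.sub(r"e", "E", line) if re.search(r"and", line) else line
    for line in lines
]
print("\n".join(result))
```

The first line contains "and", so every "e" in it becomes "E". The second line does not, so it is left alone.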

The result of sub() is another EdText object. You can do further manipulations or selections.

>>> print(getty["g/and/"])
Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and
nation, or any nation so conceived and so dedicated, can long
might live. It is altogether fitting and proper that we should

Why?

I use cog to interpolate text files or code execution output into documentation, presentations and the like. I often want only a subset of the lines. Over the years I'd built a utility function to make the selection in various ways. It had become baroque, confusing, and cumbersome, and still didn't do everything I wanted. I realized that ed already had the language I needed for selecting and manipulating text. edtext was born.

For more back-story, see my EdText blog post.

Changelog

v0.5.0 – 2026-02-08

First version.
