Three tools for text preparation

Written by Christine Roughan. Feel free to use and adapt if you find them useful.


A diagnostic tool that lists every character present in an input text, along with how many times each one occurs. Expects a .txt file as input.

Usage in the terminal:

python <txt_file>
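
For readers who want to see the idea behind this kind of diagnostic, here is a minimal sketch of character counting. It is not the script from this repository: the function name `count_characters` and the tab-separated output format are assumptions made for illustration.

```python
from collections import Counter


def count_characters(text):
    """Return a Counter mapping each character in `text` to its number of occurrences."""
    # Hypothetical helper, not the repository's own code.
    return Counter(text)


if __name__ == "__main__":
    sample = "hello world\n"
    # Print each character (repr makes spaces and newlines visible) with its count.
    for char, count in sorted(count_characters(sample).items()):
        print(f"{char!r}\t{count}")
```

Printing with `repr` is a design choice that keeps whitespace and control characters visible, which is usually the point of a diagnostic pass before cleaning a text.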


A tool which will remove the desired characters from an input text.

Usage in the terminal:

python -i <input txt> -o <output txt> <characters>

The input must be a .txt file. The output file is optional and defaults to out.txt. To remove multiple characters, separate them with spaces. Characters may also be given as Unicode escape codes.

E.g.: python -i text.txt a b c \u0064
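
The removal step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: `decode_escapes` and `remove_characters` are hypothetical names, and only four-digit `\uXXXX` escapes are handled.

```python
import re

# Matches a literal backslash-u followed by exactly four hex digits, e.g. \u0064.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_escapes(token):
    """Replace each literal \\uXXXX escape in `token` with the character it names."""
    # Hypothetical helper; assumes only four-digit escapes are used.
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), token)


def remove_characters(text, characters):
    """Return `text` with every character listed in `characters` deleted."""
    targets = "".join(decode_escapes(c) for c in characters)
    # str.maketrans with a third argument maps those characters to None.
    return text.translate(str.maketrans("", "", targets))


if __name__ == "__main__":
    # Mirrors the example arguments: a b c \u0064
    print(remove_characters("abcdef", ["a", "b", "c", "\\u0064"]))
```

In this sketch a token longer than one character is treated as a set of individual characters to remove, which may or may not match how the actual tool behaves.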


A tool which takes an input .txt file and splits each paragraph into lines no longer than a specified character length. (Do not use a .txt file which has already been split into lines.) The default length is 65 characters, but this may be changed with the -c option. The output file defaults to out_lines.txt.

Usage in the terminal:

python -c <character_length> -o <output txt> <input txt>
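
As a rough illustration of the splitting behaviour, the sketch below wraps each paragraph with Python's standard `textwrap` module. It assumes that every paragraph sits on a single line of the input and that breaks fall at whitespace; the repository's script may split differently. The function name `split_paragraphs` is hypothetical.

```python
import textwrap


def split_paragraphs(text, width=65):
    """Wrap each single-line paragraph in `text` to lines of at most `width` characters."""
    # Hypothetical helper, not the repository's own code.
    lines = []
    for paragraph in text.splitlines():
        if paragraph.strip():
            # textwrap.wrap breaks at whitespace and splits overlong words if needed.
            lines.extend(textwrap.wrap(paragraph, width=width))
        else:
            # Keep blank lines so paragraph boundaries survive.
            lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog. " * 4
    print(split_paragraphs(sample, width=65))
```

Running input that has already been split into lines through a function like this would treat every short line as its own paragraph, which is one reason to start from unsplit text.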
