Chokitto (チョキっと) is a minimal Python library for extracting highlights and annotations from your Kindle eReader.

  • Create a neat overview of all your notes and highlights in Markdown or JSON
  • Export annotations from side-loaded documents and library books
  • Use filters to extract only the information you need (e.g. title('Book No \d+', 'regex'))
  • Deduplicate entries and merge matching highlights and notes

Store your annotations in a unified and cleaner way for future reference, book clubs and literature reviews. 📚


Chokitto is written in Python 3 (assumed to be the default interpreter for python) and only uses the standard library. Installation merely involves cloning this repository:

git clone
cd chokitto


By default, chokitto requires the path to the clippings file (e.g. /documents/My Clippings.txt on Kindle). It is then parsed, optionally filtered and exported (default: Markdown) to standard output:

python path/to/clippings
# for instance, a Kindle connected to a Mac
python "/Volumes/Kindle/documents/My Clippings.txt"

This will produce a Markdown document with all documents and their clippings sorted by type and location. The output can be written to a file by using a pipe or the -o / --output argument:

python path/to/clippings > path/to/
python path/to/clippings -o path/to/

The -v / --verbose option can be used to print additional parsing and filtering information. It is best used together with a pre-specified output file.

python path/to/clippings -o path/to/ -v

If you just want a quick look at which documents the clippings file contains, use the -ls / --list option.

python path/to/clippings -ls

Chokitto will then parse the data and output an alphabetically sorted list of documents before exiting.

Documents (42 total):
  <Document: "A Great Book" by "Lastname, Name", 6 clippings>
  <Document: "Another Great Book" by "Lastname, Name", 12 clippings>
  <Document: "Unauthored Document", 5 clippings>

For additional information regarding basic usage, please refer to the help text which can be accessed using the -h / --help flag.

python -h


Currently, only the KindleParser is available and enabled by default. It processes the My Clippings.txt file which contains the (slightly chaotic) highlights, annotations and bookmarks made in eBooks, PDFs and other documents on the eReader.

The parser can be explicitly specified by using the -p / --parsers argument:

python path/to/clippings -p "kindle" 

The library itself is written to accommodate any kind of parser which returns documents and clippings, so we hope to extend it in the future.
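
To illustrate what a parser has to deal with, here is a minimal sketch of reading My Clippings.txt. It is not chokitto's KindleParser. The file layout it assumes (entries separated by a line of ten equals signs, a "Title (Author)" line, a "- Your Highlight on ..." metadata line, then the content) is the commonly seen English-language format and may differ between firmware versions and device languages. The names Clipping and parse_clippings are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

SEPARATOR = "=========="  # assumed entry separator

# "Title (Author)" with an optional author in trailing parentheses (assumed layout)
TITLE_RE = re.compile(r"^(?P<title>.*?)(?:\s+\((?P<author>[^()]*)\))?$")
# "- Your Highlight on page ..." (assumed English metadata line)
KIND_RE = re.compile(r"^- Your (?P<kind>\w+)", re.IGNORECASE)


class Clipping(NamedTuple):
    title: str
    author: Optional[str]
    kind: str      # "highlight", "note", "bookmark" or "unknown"
    meta: str      # raw metadata line (page, location, date)
    content: str


def parse_clippings(text: str) -> List[Clipping]:
    """Split the raw file into entries and pull out the basic fields."""
    clippings = []
    for block in text.split(SEPARATOR):
        # Drop byte-order marks, carriage returns and empty lines
        lines = [line.strip("\ufeff\r ") for line in block.strip().splitlines()]
        lines = [line for line in lines if line]
        if len(lines) < 2:
            continue  # trailing separator or malformed entry
        head = TITLE_RE.match(lines[0])
        kind = KIND_RE.match(lines[1])
        clippings.append(Clipping(
            title=head.group("title"),
            author=head.group("author"),
            kind=kind.group("kind").lower() if kind else "unknown",
            meta=lines[1],
            content="\n".join(lines[2:]),
        ))
    return clippings
```

Grouping the resulting clippings by title and author would then give the documents that the -ls option lists.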


Kindle's default behavior is to write every clipping action to the My Clippings.txt file. This means that changing the span of a highlight will produce two entries in the file with different lengths. Furthermore, notes which are added to a highlighted section are stored as separate entries and can be difficult to match and find.

By using the -m / --merge option, chokitto can attempt to remove duplicate entries and reconnect separated highlights and notes:

python path/to/clippings -m

This will produce merged clippings such as highlight+note which appear in the output as follows:

### Page 42, Location 4649-4650

>[Highlight] We are making a point here.

>[Note] They have a point.

Added around 2020-01-01 10:13:17.
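
The idea behind merging can be sketched as follows. This is an illustration under simple assumptions, not chokitto's actual algorithm: a later highlight replaces any earlier highlight it overlaps, and a note is attached to the first highlight whose location range contains it. The names Clip, overlaps and merge are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Clip:
    kind: str                  # "highlight", "note" or "highlight+note"
    location: Tuple[int, int]  # inclusive (start, end) range
    content: str


def overlaps(a: Clip, b: Clip) -> bool:
    """True if the two location ranges share at least one position."""
    return a.location[0] <= b.location[1] and b.location[0] <= a.location[1]


def merge(clips: List[Clip]) -> List[Clip]:
    """Merge the clippings of one document, given in the order they were written."""
    highlights: List[Clip] = []
    for clip in clips:
        if clip.kind != "highlight":
            continue
        # A re-spanned highlight supersedes the older overlapping entries
        highlights = [h for h in highlights if not overlaps(h, clip)]
        highlights.append(clip)

    merged = list(highlights)
    for note in (c for c in clips if c.kind == "note"):
        for i, item in enumerate(merged):
            if item.kind == "highlight" and overlaps(item, note):
                merged[i] = Clip(
                    "highlight+note",
                    item.location,
                    f"[Highlight] {item.content}\n[Note] {note.content}",
                )
                break
        else:
            merged.append(note)  # no matching highlight, keep the note as-is
    return sorted(merged, key=lambda c: c.location)
```

Under these rules a highlight that was shortened or extended collapses into its most recent version, and a note made inside it ends up in a single highlight+note entry.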


Filters can be used to specify which documents and clippings to include in the output. They are specified using the filter('arg', 'arg') syntax or simply as filter if there are no arguments or if they are left at their default values. Any number of them can be combined using the -f / --filters option:

python path/to/clippings -f \
"title('One Great Book')" \
"type('highlight')" \
"after('2020-01-01 00:00:00')"

This will produce output which only includes highlights from "One Great Book" which were made after the beginning of 2020.
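
Because the filter('arg', 'arg') syntax looks like a Python call, one plausible way to read such a specification is with the standard ast module. The sketch below is an assumption about how this could be done, not a description of chokitto's internals, and parse_spec is a hypothetical name.

```python
import ast
from typing import List, Tuple


def parse_spec(spec: str) -> Tuple[str, List[object]]:
    """Split "name('a', 'b')" into ("name", ["a", "b"]).

    A bare "name" is treated as a call without arguments.
    """
    node = ast.parse(spec.strip(), mode="eval").body
    if isinstance(node, ast.Name):
        return node.id, []
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and not node.keywords
    ):
        # literal_eval only accepts plain literals, so no code is executed
        return node.func.id, [ast.literal_eval(arg) for arg in node.args]
    raise ValueError(f"not a valid specification: {spec!r}")
```

Patterns that contain backslashes, such as regular expressions, would need extra care with this approach because Python's own string escaping rules apply to the arguments.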

Filter by String

String filters can be applied to document titles and authors as well as to clipping types. They follow the syntax filter('Exact Match') and can be used together with regular expressions such as filter('One Great (Book|Document) \d+', 'regex').

Filtering by Document Title is done using title('Title'), e.g.:

python path/to/clippings -f "title('One Great Book')"
# filter for an entire series
python path/to/clippings -f "title('^Book No\. \d+', 'regex')"

Filtering by Document Author is done using author('Author'), e.g.:

python path/to/clippings -f "author('That Author')"
# filter for a family of authors
python path/to/clippings -f "author('Lastname, .+', 'regex')"

Filtering by Clipping Type is done using type('type'), e.g.:

python path/to/clippings -f "type('highlight')"
# use '+' to filter for merged types (remember to merge!)
python path/to/clippings -m -f "type('highlight+note')"
# use regular expressions to filter for multiple types
python path/to/clippings -f "type('(bookmark|note)', 'regex')"
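
The difference between the exact and the 'regex' mode can be sketched as below. The function name make_string_filter is hypothetical, and the use of re.search (a match anywhere in the value unless the pattern is anchored with ^) is an assumption suggested by the anchored series example above, not confirmed behavior.

```python
import re
from typing import Callable


def make_string_filter(pattern: str, mode: str = "exact") -> Callable[[str], bool]:
    """Return a predicate that tests a title, author or clipping type."""
    if mode == "exact":
        return lambda value: value == pattern
    if mode == "regex":
        compiled = re.compile(pattern)
        # Assumed search semantics: the pattern may match anywhere in the value
        return lambda value: compiled.search(value) is not None
    raise ValueError(f"unknown mode: {mode!r}")


by_title = make_string_filter("One Great Book")
by_series = make_string_filter(r"^Book No\. \d+", "regex")

print(by_title("One Great Book"))       # exact match
print(by_series("Book No. 7"))          # anchored regex match
print(by_series("Another Book No. 7"))  # rejected because of the ^ anchor
```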

Filter by Date and Time

Date filters are useful for exporting only newer or older clippings, depending on the date and time they were created. They follow the syntax filter('yyyy-mm-dd hh:mm:ss').

# only return clippings created after a certain date
python path/to/clippings -f "after('2020-01-01 00:00:00')"
# only return clippings created before a certain date
python path/to/clippings -f "before('2020-01-01 00:00:00')"
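
A date filter of this kind boils down to parsing the threshold once and comparing it with each clipping's timestamp. The following is a minimal sketch with hypothetical function names. Whether chokitto treats the boundary as inclusive or exclusive is not stated here, so the strict comparison is an assumption.

```python
from datetime import datetime
from typing import Callable

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches 'yyyy-mm-dd hh:mm:ss'


def after(threshold: str) -> Callable[[datetime], bool]:
    """Keep clippings created strictly after the threshold (assumed)."""
    limit = datetime.strptime(threshold, DATE_FORMAT)
    return lambda created: created > limit


def before(threshold: str) -> Callable[[datetime], bool]:
    """Keep clippings created strictly before the threshold (assumed)."""
    limit = datetime.strptime(threshold, DATE_FORMAT)
    return lambda created: created < limit


since_2020 = after("2020-01-01 00:00:00")
print(since_2020(datetime(2020, 1, 1, 10, 13, 17)))    # True
print(since_2020(datetime(2019, 12, 31, 23, 59, 59)))  # False
```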


Exporters handle the formatting of the output. They are specified using the syntax exporter or exporter('arg', 'arg') if you want to change the default arguments. The default exporter is Markdown and it can be changed using the -e / --exporter option:

python path/to/clippings -e "markdown"


The Markdown exporter will produce a document split into "# root → ## document → ### clipping type → #### clipping" sorted by location. If the output contains only a single document, the hierarchy shifts up one heading.

# One Great Book

Lastname, Name

## Bookmarks

### Page 11, Location 48

## Highlights

### Page 25, Location 1602-1603

> This part was especially interesting.

Added on 2020-01-01 9:41:53.

## Highlights + Notes

### Page 42, Location 4649-4650

>[Highlight] We are making a point here.

>[Note] They have a point.

Added around 2020-01-01 10:13:17.

If you would like to change the date formatting or omit it entirely, there's an argument for that:

python path/to/clippings -e "markdown('%m.%d at %H:%M')"
# omit the timestamp entirely
python path/to/clippings -e "markdown('')"
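
The format argument uses the directives known from Python's datetime.strftime. Assuming the exporter passes the string to strftime more or less directly (which this document does not confirm), the example format would render a timestamp as follows. The helper format_timestamp is hypothetical.

```python
from datetime import datetime
from typing import Optional


def format_timestamp(moment: datetime, fmt: str) -> Optional[str]:
    """Render the timestamp, or return None when the format is empty."""
    return moment.strftime(fmt) if fmt else None


created = datetime(2020, 1, 1, 9, 41, 53)
print(format_timestamp(created, "%m.%d at %H:%M"))  # 01.01 at 09:41
print(format_timestamp(created, ""))                # None, timestamp omitted
```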


The JSON exporter will produce a list of document objects containing a list of clipping objects. If the output contains only a single document, the document object is returned directly.

python path/to/clippings -e "json"

This produces an output akin to:

[
    {
        "title": "One Great Book",
        "author": "Lastname, Name",
        "clippings": [
            {
                "type": "bookmark",
                "page": 11,
                "location": 48,
                "datetime": "2020-01-01 8:20:12",
                "content": null
            },
            {
                "type": "highlight",
                "page": 25,
                "location": [1602, 1603],
                "datetime": "2020-01-01 9:41:53",
                "content": "This part was especially interesting."
            }
        ]
    },
    {
        "title": "One More Great Book",
        "author": "Lastname, Name",
        "clippings": [
            {
                "type": "highlight+note",
                "page": 42,
                "location": [4649, 4650],
                "datetime": "2020-01-01 10:13:17",
                "content": [
                    "[highlight] We are making a point here.",
                    "[note] They have a point."
                ]
            }
        ]
    }
]

Similarly to the Markdown exporter, it is possible to change the date formatting or omit it entirely:

python path/to/clippings -e "json('%m.%d at %H:%M')"
# omit the timestamp entirely
python path/to/clippings -e "json('')"
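
When consuming the JSON output in your own scripts, keep in mind that a single document is returned as an object rather than a list. The sketch below normalizes both cases and counts clippings per type. The helper names are hypothetical and the field names are taken from the sample output above.

```python
import json
from typing import Dict, List


def load_documents(raw: str) -> List[dict]:
    """Always return a list of documents, even for single-document output."""
    data = json.loads(raw)
    return [data] if isinstance(data, dict) else data


def count_by_type(documents: List[dict]) -> Dict[str, int]:
    """Count clippings per type across all documents."""
    counts: Dict[str, int] = {}
    for document in documents:
        for clipping in document.get("clippings", []):
            kind = clipping["type"]
            counts[kind] = counts.get(kind, 0) + 1
    return counts


single = '{"title": "One Great Book", "clippings": [{"type": "bookmark"}, {"type": "highlight"}]}'
print(count_by_type(load_documents(single)))  # {'bookmark': 1, 'highlight': 1}
```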

PDFMerger (Experimental)

The PDFMergeExporter attempts to merge highlights and notes with a corresponding PDF document. This is especially useful for research papers.

As this involves some more advanced PDF parsing, it requires the installation of the MuPDF toolkit as well as its Python bindings.

# install MuPDF using your package manager of choice, e.g.:
brew install mupdf
# then install the python binding using pip
pip install PyMuPDF

Experimental Caveats

  • Only one document at a time can be merged, so please find it first using -ls and specify a filter -f which will only return the document in question.
  • Clippings from PDFs only provide vague locations, so matching highlights to the original document will work better the more specific the text is.
  • For the same reason it is difficult to match notes with the correct highlights. Chokitto will merge all potential matches so please remove the incorrect ones from the final output document.

To use the PDFMerger, specify pdfmerge as the exporter -e along with the path to the original PDF document (e.g. the file on Kindle). Use filters -f to retrieve this single PDF's clippings. The output will be printed to standard output or to the file specified in -o (recommended).

# find the unique document title using -ls
python path/to/clippings.txt -ls
# the recommended method for merging clippings and PDFs
python path/to/clippings.txt -v -m -f "title('pdf-title')" -e "pdfmerge('path/to/pdf-title.pdf')" -o path/to/output.pdf
# for example, use the data from a connected Kindle
python "/Volumes/Kindle/documents/My Clippings.txt" -v -m -f "title('pdf-title')" -e "pdfmerge('/Volumes/Kindle/documents/pdf-title.pdf')" -o path/to/output.pdf
# or pipe the output directly to disk
python path/to/clippings.txt -m -f "title('pdf-title')" -e "pdfmerge('path/to/pdf-title.pdf')" > path/to/output.pdf

The output will be the original PDF document plus highlights in yellow and corresponding text bubble style annotations.

