An intuitive Python 3 library for processing and converting text-based data formats like JSON and CSV.
Have you ever sighed when writing code like this?

import csv
import json

with open("names.json") as f:
    data = json.loads(f.read())

data = [row["name"] for row in data if "John" in row["name"]]

with open("names.csv", "w") as f:
    writer = csv.writer(f)
    [writer.writerow([row]) for row in data]

Now you can write it like this:

from dataknead import Knead
Knead("names.json").filter(lambda r:"John" in r["name"]).write("names.csv")


Install dataknead from PyPI:

pip install dataknead

Then import it:

from dataknead import Knead

Note that dataknead is Python 3-only.

Basic example

Let's say you have a small CSV file with cities called cities.csv.


Now you want to load this CSV file and convert it to a JSON file.

from dataknead import Knead

Knead("cities.csv").write("cities.json")

You'll now have a json file called cities.json that looks like this:

[
    {
        "city" : "Amsterdam",
        "country" : "nl",
        "population" : 850000
    },
    ...
]

Maybe you just want the city names, written to a CSV file called city-names.csv.

from dataknead import Knead

Knead("cities.csv").map("city").write("city-names.csv")

That will give you this list


Now you want to extract only the cities that are located in Italy, and write that back to a new csv file called cities-italy.csv:

from dataknead import Knead

Knead("cities.csv").filter(lambda r:r["country"] == "it").write("cities-italy.csv")

This gives you this:


Nice, huh?

Advanced example

Check out the advanced example.


class dataknead.Knead(inp, parse_as = None, read_as = None, is_data = False)

If inp is a string, a filepath is implied and the extension is used to get the correct loader.


To override this behaviour (for a file that doesn't have the correct extension), use the read_as argument.

Knead("cities", read_as="csv")
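
The extension-based lookup described above can be pictured with a small standard-library sketch. The helper name guess_format is hypothetical and not part of dataknead; it only illustrates how an explicit read_as value would take precedence over the file extension.

```python
from pathlib import Path

def guess_format(path, read_as=None):
    """Hypothetical helper: pick a format name for a file path."""
    if read_as is not None:
        # An explicit format always wins over the extension
        return read_as
    suffix = Path(path).suffix  # e.g. ".csv", or "" when there is none
    if not suffix:
        raise ValueError(f"Cannot infer a format for {path!r}")
    return suffix[1:].lower()

print(guess_format("cities.csv"))             # csv
print(guess_format("cities", read_as="csv"))  # csv
```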

If inp is not a string, data is implied.


To force a string to be used as data instead of a file path, set is_data to True.

Knead("", is_data = True)

To force parsing of a string to data (e.g., from a JSON HTTP request), set parse_as to the correct format.

Knead('{"error" : 404}', parse_as="json")
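
For comparison, this is roughly what parsing that string amounts to with the standard library alone. It is a sketch of the idea, not dataknead's internals.

```python
import json

raw = '{"error" : 404}'  # e.g. the body of an HTTP response
data = json.loads(raw)   # parse the string instead of treating it as a path

print(data["error"])  # 404
```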

Some loaders accept extra arguments. For example, the csv loader has an option to force the use of a header if one isn't detected automatically:

Knead("cities.csv", has_header = True)
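
Why the header matters can be seen with Python's own csv module: with a header row each record becomes a dict keyed by column name, without one each record is a plain list. The sample data below is made up for illustration, and whether dataknead produces exactly these shapes is an assumption.

```python
import csv
import io

sample = "city,country\nAmsterdam,nl\nRome,it\n"  # made-up sample data

# Treat the first row as a header: rows become dicts
with_header = list(csv.DictReader(io.StringIO(sample)))

# Treat every row as data: rows become lists
without_header = list(csv.reader(io.StringIO(sample)))

print(with_header[0]["city"])  # Amsterdam
print(without_header[0])       # ['city', 'country']
```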

The default loaders are for csv, json and txt files.


apply(fn)

Runs all data through a function.

Knead(["a", "b", "c"]).apply(lambda x:"".join(x)).print() # 'abc'
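
In plain Python terms, apply hands the whole data structure to the function at once, whereas map works on one element at a time. A minimal sketch of that difference, using only built-ins:

```python
data = ["a", "b", "c"]

# apply-style: the function receives the whole list
applied = "".join(data)

# map-style: the function receives one element at a time
mapped = [x.upper() for x in data]

print(applied)  # abc
print(mapped)   # ['A', 'B', 'C']
```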

data(check_instance = None)

Returns the parsed data.

data = Knead("cities.csv").data()

To raise an exception when the data is not an instance of the expected type, pass that type to check_instance:

data = Knead("cities.csv").data(check_instance = dict)
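
The check is conceptually an isinstance test on the parsed data. The sketch below uses a hypothetical helper, get_data, and raises TypeError; the exception type dataknead actually raises is not stated here, so treat that as an assumption.

```python
def get_data(data, check_instance=None):
    """Hypothetical stand-in for Knead.data(): return data, optionally type-checked."""
    if check_instance is not None and not isinstance(data, check_instance):
        raise TypeError(
            f"Expected {check_instance.__name__}, got {type(data).__name__}"
        )
    return data

rows = [{"city": "Rome", "country": "it"}]  # made-up sample data
get_data(rows, check_instance=list)  # passes and returns the list

try:
    get_data(rows, check_instance=dict)
except TypeError as err:
    print(err)  # Expected dict, got list
```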


filter(fn)

Runs a function over every element in the data and keeps only the elements for which that function returns True.

Knead("cities.csv").filter(lambda city:city["country"] == "it").write("cities-italy.csv")

# Or do this
def is_italian(city):
    return city["country"] == "it"

Knead("cities.csv").filter(is_italian).write("cities-italy.csv")
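
For reference, the same filtering written without dataknead is a list comprehension over the rows. The rows below are made-up sample data, not the contents of cities.csv.

```python
cities = [
    {"city": "Amsterdam", "country": "nl"},
    {"city": "Rome", "country": "it"},
    {"city": "Milan", "country": "it"},
]

def is_italian(city):
    return city["country"] == "it"

# Keep only the elements for which the predicate returns True
italian = [city for city in cities if is_italian(city)]

print([c["city"] for c in italian])  # ['Rome', 'Milan']
```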



keys()

Returns the keys of the data.

map(fn | str | tuple)

Runs a function over all elements in the data.

Knead("cities.csv").map(lambda city:city["city"].upper()).write("cities-uppercased.json")

To return one key in every item, you can pass a string as a shortcut:

Knead("cities.csv").map("city").write("city-names.csv")


# Is the same as

Knead("cities.csv").map(lambda c:c["city"]).write("city-names.csv")

To return multiple keys with values, you can use a tuple:

Knead("cities.csv").map(("city", "country")).write("city-country-names.csv")

# Is the same as

Knead("cities.csv").map(lambda c:{ "city" : c["city"], "country" : c["country"] }).write("city-country-names.csv")

# Or

def mapcity(city):
    return {
        "city" : city["city"],
        "country" : city["country"]
    }

Knead("cities.csv").map(mapcity).write("city-country-names.csv")


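
The three forms of map can be summarised in a short plain-Python sketch. map_items is a hypothetical helper written for illustration; it mirrors the behaviour described above but is not taken from dataknead's source, and the second row of sample data is made up.

```python
def map_items(data, fn):
    """Hypothetical equivalent of map(fn | str | tuple) for a list of dicts."""
    if isinstance(fn, str):
        # String shortcut: return one key from every item
        return [row[fn] for row in data]
    if isinstance(fn, tuple):
        # Tuple shortcut: return a dict with only the given keys
        return [{key: row[key] for key in fn} for row in data]
    # Otherwise fn is a callable applied to every item
    return [fn(row) for row in data]

cities = [  # sample data for illustration
    {"city": "Amsterdam", "country": "nl", "population": 850000},
    {"city": "Rome", "country": "it", "population": 2870000},
]

print(map_items(cities, "city"))                       # ['Amsterdam', 'Rome']
print(map_items(cities, ("city", "country")))
print(map_items(cities, lambda c: c["city"].upper()))  # ['AMSTERDAM', 'ROME']
```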

print()

Prints the current data, formatted using json.dumps. These two lines are equivalent:



query(path)

Queries a dict by using a path, separated by slashes.

Knead({
    "image" : {
        "full" : {
            "src" : ""
        }
    }
}).query("image/full/src").print() # ''
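
Path queries like this boil down to walking the nested dict one key at a time. A minimal sketch, assuming keys never contain a slash; query_path is a hypothetical helper and the example value is made up.

```python
def query_path(data, path):
    """Hypothetical helper: follow a slash-separated path through nested dicts."""
    current = data
    for key in path.split("/"):
        current = current[key]  # raises KeyError if a key is missing
    return current

doc = {"image": {"full": {"src": "photo.jpg"}}}  # made-up example value
print(query_path(doc, "image/full/src"))  # photo.jpg
```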


values()

Returns the values of the data.

write(path, write_as = None)

Writes the data to a file. The format is inferred from the file extension.


To force the type to something else, pass the format to write_as.

Knead("cities.csv").map("city").write("cities.txt", write_as="csv")

Some of the loaders have extra options you can pass to write:

Knead("cities.csv").write("cities.json", indent = 4)
Knead("cities.csv").map("city").write("cities.csv", fieldnames=["city"])
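
Options such as indent and fieldnames have direct counterparts in the standard json and csv modules. Whether dataknead forwards them unchanged is an assumption; the sketch only shows what the options do, writing to in-memory buffers instead of files.

```python
import csv
import io
import json

cities = [{"city": "Amsterdam"}, {"city": "Rome"}]  # made-up sample data

# indent controls pretty-printing of the JSON output
json_text = json.dumps(cities, indent=4)

# fieldnames sets the header row and column order of the CSV output
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["city"], lineterminator="\n")
writer.writeheader()
writer.writerows(cities)
csv_text = buffer.getvalue()

print(json_text)
print(csv_text)
```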

Extending dataknead

You can write your own loaders to read and write formats other than the default ones (csv, json and txt). For an example, take a look at the Excel example and the XML example.


Performance drawbacks should be negligible. See this small performance test.


Written by Hay Kranen.


Licensed under the MIT license.

Release history


  • Adding tuple shortcut to map (#2)
  • Adding support for txt files (#4)
  • Adding support for loader constructor argument passing, and adding a has_header option to CsvLoader (#5)


Initial release