API Documentation

JonMosenkis edited this page Nov 6, 2016 · 11 revisions

Sefaria Torah Texts API

The Torah Texts API allows live access to Sefaria's structured database of Jewish texts and their interconnections. It is designed to make getting up and running with a new web or mobile app as simple as possible. If you want to run your own database, please don't use this API to grab our data; instead you can download a complete dump of our data on GitHub.

What texts are included?

You can see a complete list of known texts on the Sefaria Table of Contents.

Text References

A number of API calls depend on creating valid text references (a.k.a. citations), which are generally called "refs" in our code.

Caveat

Sefaria is still in its early life. This API is very rough, likely to change quickly, and shouldn't be considered stable. We're opening our API early because we want developers to be able to expand on our work in an open way from day one. If you build anything with it, we'd love for you to let us know. Either way, be aware that this version of the API may change without warning.

GET Methods

All API responses are in JSON. If there is an error that doesn't prevent a response, an error message is returned in the field error. All API methods support JSONP.
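As a rough sketch of how a client might handle the error field described above, the helper below decodes a response body and raises if an error message is present. The names SefariaAPIError and parse_api_response are hypothetical, and raising an exception is only one possible policy: since an error may accompany an otherwise usable response, a client could just as well log the message and continue.

```python
import json


class SefariaAPIError(Exception):
    """Hypothetical exception for responses that carry an `error` field."""


def parse_api_response(body):
    """Decode a JSON response body and raise if it reports an error."""
    data = json.loads(body)
    # Only JSON objects can carry an `error` key; some endpoints return arrays.
    if isinstance(data, dict) and "error" in data:
        raise SefariaAPIError(data["error"])
    return data
```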

GET /api/texts/:ref

Retrieves the text and other metadata for the text reference string ref (read more on text references above).

Parameters

You can request a particular version of a text by passing its language and version title in the parameters lang and version. lang should be a two-character language code. Spaces in version may be replaced with "_".

context=0 may be used to suppress the default behavior of returning surrounding texts when only a single segment is requested.

commentary=0 will prevent searching for and returning connections to other texts along with the requested text.

pad=0 will prevent the default behavior of padding a ref down to a section level. Without pad=0, a request for Genesis will be resolved to Genesis 1. With it, a request for Genesis returns the entire book.
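The parameters above can be combined into a request URL along the following lines. This is a minimal sketch: build_texts_url is a hypothetical helper, and replacing spaces in the ref with underscores is an assumption borrowed from the POST examples later on this page.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://www.sefaria.org/api/texts/"


def build_texts_url(ref, lang=None, version=None, context=None,
                    commentary=None, pad=None):
    """Build a GET /api/texts/:ref URL, including only the parameters given."""
    # Assumption: spaces in a ref may be written as underscores.
    path = quote(ref.replace(" ", "_"))
    params = {}
    if lang is not None:
        params["lang"] = lang
    if version is not None:
        # Spaces in a version title may be replaced with "_".
        params["version"] = version.replace(" ", "_")
    for name, value in (("context", context),
                        ("commentary", commentary),
                        ("pad", pad)):
        if value is not None:
            params[name] = value
    query = urlencode(params)
    return BASE_URL + path + ("?" + query if query else "")
```

For instance, build_texts_url("Kohelet 5", context=0, commentary=0) asks for the chapter without surrounding context or connections.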

Result fields of GET /api/texts/

book The name of the requested text, normalized to the name considered primary. Hence "Kohelet" becomes "Ecclesiastes".

categories A hierarchical array of categories to which this text belongs, beginning with the highest level.

commentary An array of connections to this text, including the connected text and information about it. Looking up connections can be slower and produce a larger response. If you're only interested in the text you requested, use the ?commentary=0 parameter.

he The Hebrew (or Aramaic) text requested, as an array of strings. If only a single segment is requested (e.g. "Kohelet 5.3"), the surrounding text is still returned by default. If you only need the segment you request, you can add the parameter ?context=0 in which case this field will be a string. If no Hebrew text exists for this ref, the value will be []. Note, Aramaic is not currently distinguished from Hebrew.

heVersions Only present if multiple Hebrew versions of this text are available, this field contains an array of objects containing the versionTitle of each available version.

heVersionSource The URL or book citation of the source of the Hebrew text.

heVersionTitle The name of the version of the Hebrew text.

length The length of the text in its highest-level section. In the example below, Kohelet has 12 chapters.

maps An array of maps between strings and deeper sections of the text (currently called "Shorthands" in the Sefaria interface). Used to give names to segments of text. E.g., "Rambam Laws of Human Dispositions" maps to "Mishneh Torah 1:2".

next A ref of the next section of text, if any.

order An array specifying the order of this text with regard to its categories. In the example below, Kohelet is the 33rd book of the Tanach, and the 7th book of Writings.

prev A ref of the previous section of this text, if any.

ref A normalized version of the requested reference.

sectionNames An array containing the type names of the sections of this text. The length of sectionNames gives the depth of the structure of this text. For example, Kohelet (["Chapter", "Verse"]) has depth 2, while Mishneh Torah (["Book", "Topic", "Section", "Law"]) has depth 4. Comparing this depth to the depth of sections will show if the request was for the lowest level, or a higher level, of the text.

sections An array of ints which corresponds to sectionNames and represents the sections of the text requested. When sectionNames is ['Chapter', 'Verse'], a request for chapter 4 looks like [4], while a request for chapter 4, verse 7 would look like [4, 7]. Note, for Talmud, Dafs are represented by a string like '42a' or '12b'.

text The English text requested, as an array of strings. If only a single segment is requested (e.g. "Kohelet 5.3"), the surrounding text is still returned by default. If you only need the segment you requested, you can add the parameter ?context=0 in which case this field will be a string. If no English text exists for this ref, the value will be [].

titleVariants An array of alternate titles for this text, including titles in other languages, spelling variants and abbreviations.

toSections An array parallel to sections which specifies the end of the requested range. For "Kohelet 3:1-4", this would be [3, 4], for chapter 3, verse 4.

type A convenience for categories[0], the highest level category of the text.

versionSource The URL or book citation of the source of the English text.

versionTitle The title of the version of the text that was returned in text.

versions Only present if multiple English versions of this text are available, this field contains an array of objects containing the versionTitle of each available version.
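The depth comparison described under sectionNames and sections can be sketched as two small helpers. The function names are hypothetical, and treating an empty sections array as "whole text" is an assumption rather than documented behavior.

```python
def is_lowest_level(response):
    """True when the request addressed the deepest level of the text."""
    return len(response["sections"]) == len(response["sectionNames"])


def requested_level(response):
    """Name of the level the request addressed, e.g. "Chapter" or "Verse"."""
    sections = response["sections"]
    if not sections:
        # Assumption: no sections means the whole text was requested.
        return None
    return response["sectionNames"][len(sections) - 1]
```

With sectionNames of ["Chapter", "Verse"], a sections value of [5] addresses a chapter, while [4, 7] addresses a single verse.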

/api/texts Example request/response

GET http://www.sefaria.org/api/texts/Kohelet.5

200 OK

{
    'book': 'Ecclesiastes',
    'categories': ['Tanach', 'Writings'],
    'commentary': [{'_id': '4f1e04321c81c4cf8a4149c8',
                 'anchorRef': 'Ecclesiastes 5:9',
                 'anchorText': '',
                 'anchorVerse': 9,
                 'category': 'Halacha',
                 'commentaryNum': 0,
                 'commentator': 'Mishneh Torah',
                 'he': '',
                 'ref': 'Mishneh Torah 1:2:1:2',
                 'sourceRef': 'Mishneh Torah 1:2:1:2',
                 'text': 'A person may be very greedy, not satisfied by all the wealth in the world, as it says, "A lover of silver never has his fill of silver" (Ecclesiastes 5:9)...',
                 'type': 'quotation'}],
    'he': ["אַל תְּבַהֵל עַל פִּיךָ וְלִבְּךָ אַל יְמַהֵר לְהוֹצִיא דָבָר לִפְנֵי הָאֱלֹהִים כִּי הָאֱלֹהִים בַּשָּׁמַיִם וְאַתָּה עַל הָאָרֶץ עַל כֵּן יִהְיוּ דְבָרֶיךָ  מְעַטִּים.",
           "כִּי בָּא הַחֲלוֹם בְּרֹב עִנְיָן וְקוֹל כְּסִיל בְּרֹב דְּבָרִים.",
           "כַּאֲשֶׁר תִּדֹּר נֶדֶר לֵאלֹהִים אַל תְּאַחֵר לְשַׁלְּמוֹ כִּי אֵין חֵפֶץ בַּכְּסִילִים אֵת אֲשֶׁר תִּדֹּר שַׁלֵּם.",
           ...],
    'heVersionSource': 'http://he.wikisource.org/wiki/%D7%9E%D7%A7%D7%A8%D7%90',
    'heVersionTitle': 'Wikisource Tanach',
    'length': 12,
    'maps': [],
    'next': 'Ecclesiastes 6',
    'order': [33, 7],
    'prev': 'Ecclesiastes 4',
    'ref': 'Ecclesiastes 5',
    'sectionNames': ['Chapter', 'Verse'],
    'sections': [5],
    'text': ['Be not rash with thy mouth, and let not thy heart be hasty to utter a word before God; for God is in heaven, and thou upon earth; therefore let thy words be few.',
             "For a dream cometh through a multitude of business; and a fool's voice through a multitude of words.",
             'When thou vowest a vow unto God, defer not to pay it; for He hath no pleasure in fools; pay that which thou vowest.',
             ...],
    'titleVariants': ['Ecclesiastes', 'Kohelet', 'Ecc', 'Ecc.'],
    'toSections': [5],
    'type': 'Tanach',
    'versionSource': 'http://opensiddur.org/2010/08/the-holy-scriptures-a-new-translation-jps-1917/',
    'versionTitle': 'The Hebrew Bible in English according to the JPS 1917 Edition',
    'versions': [{'language': 'en', 'versionTitle': 'The Hebrew Bible in English according to the JPS 1917 Edition'},
                 {'language': 'he', 'versionTitle': 'Wikisource Tanach'}]
}
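Because he and text may each be an array of strings, a single string (with ?context=0), or [] when no text exists, client code usually needs to normalize them before display. The sketch below pairs Hebrew and English segments side by side; as_segments and pair_segments are hypothetical helper names, and padding the shorter side with empty strings is just one possible choice.

```python
from itertools import zip_longest


def as_segments(value):
    """Normalize a `he` or `text` field to a list of strings."""
    # With context=0 and a single-segment ref the field is a plain string.
    if isinstance(value, str):
        return [value]
    return list(value)


def pair_segments(response):
    """Return (hebrew, english) pairs, padding the shorter side with ""."""
    he = as_segments(response.get("he", []))
    en = as_segments(response.get("text", []))
    return list(zip_longest(he, en, fillvalue=""))
```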

GET /api/index/

Returns an object representing the table of contents of all texts currently known. Texts are grouped by their categories and ordered along with basic metadata for each. Useful for building a table of contents.

GET /api/index/titles/

Returns a list of all known text titles, including title variants and abbreviations. Useful for checking citations or user input of text names.
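One way to use the title list for checking user input is a case-insensitive lookup table. This sketch assumes you have already extracted a flat list of title strings from the response (the exact response shape is not described here), and the helper names are hypothetical. Case-insensitive matching is a convenience of this sketch, not something the API promises.

```python
def build_title_lookup(titles):
    """Map lower-cased titles to their original spelling."""
    return {title.lower(): title for title in titles}


def find_title(user_input, lookup):
    """Return the known title matching user_input, or None if unknown."""
    return lookup.get(user_input.strip().lower())
```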

GET /api/index/:title

Returns metadata about the requested text, including section names, title variants, categories, and orders.

The data is returned in a legacy form that only supports simple texts. See the two endpoints below.

GET /api/v2/index/:title

Returns an index record, whether simple or complex, in the new format. The index record will be padded with related information:

  • references to shared titles are expanded
  • textual previews for alternate structures are included

To get the raw, unmodified record, use the method below.

GET /api/v2/raw/index/:title

Returns the raw index record, as it is stored in the database.

To edit a complex Index record via the API, use /api/v2/raw/index to GET and POST it.

GET /api/links/:ref

Returns a list of connections known for the given segment of text.

e.g. http://www.sefaria.org/api/links/Exodus.1.12

Returns an array, with each element of the array looking like:

{
    "category": "Commentary",
    "commentator": "Rashi",
    "heCommentator": "רש\"י",
    "type": "commentary",
    "anchorRef": "Exodus 1:12",
    "anchorText": "",
    "sourceRef": "Rashi on Exodus 1:12:1",
    "commentaryNum": 1,
    "anchorVerse": 12,
    "heTitle": "רש\"י על שמות",
    "text": "In any way in which they deigned to oppress, that was where the heart of the Holy Blessed One [saw fit to] increase and spread out.",
    "_id": "5234a6adedbab465c9549b14",
    "ref": "Rashi on Exodus 1.12.1",
    "he": "<b>וכאשר יענו אתו.</b> בכל מה שהם נותנין לב לענות, כן לב הקדוש ברוך הוא להרבות ולהפריץ: "
},

  • The with_text parameter defaults to 1 (true). If it is passed a false value, e.g. ?with_text=0, it will return the links without text.
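Since the response is a flat array of link objects, a client will often want to group them before display. The following sketch groups refs by the category field shown in the example element above; group_links_by_category is a hypothetical name, and the "Unknown" fallback for a missing category is an assumption.

```python
from collections import defaultdict


def group_links_by_category(links):
    """Group the `ref` of each link under its `category`."""
    grouped = defaultdict(list)
    for link in links:
        # Assumption: fall back to "Unknown" if a link has no category.
        grouped[link.get("category", "Unknown")].append(link["ref"])
    return dict(grouped)
```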

POST Methods

Building custom parsers for available digital texts is the fastest way to help Sefaria grow. POST methods allow contributors to work locally on parsing texts and post their results when ready.

All POST methods require an API Key attached on the post parameter apikey. To obtain an API Key, please introduce yourself on the Sefaria developers mailing list.

Before posting large datasets to www.sefaria.org, you may want to run a test on a different environment. dev.sefaria.org is available for such testing, and other environments can be brought up as needed.

POST /api/index/:title

Creates a new index record for the text named title. POST data must be x-www-form-urlencoded and include two parameters: apikey, and json, which contains a JSON object with the fields below:

title The primary title of this text (for now, this should be in Roman characters).

titleVariants An array of strings of title variants (alternate spellings, abbreviations) for this text. This must include the primary title, and each title variant must be unique among all text title variants and maps known to Sefaria.

sectionNames An array listing the names of the different levels of structure within this text. For a text like Bereishit, this would be ['Chapter', 'Verse']. For a larger text like Mishneh Torah this would be ['Book', 'Topic', 'Chapter', 'Halacha'].

categories A hierarchical array of category names that apply to this text.

For example, say we want to add an index record for a Musar text called "Sefer Ploni". In Python we might do the following:

# Python 3, standard library only
import json
import urllib.error
import urllib.parse
import urllib.request

index = {
    "title": "Sefer Ploni",
    "titleVariants": ["Sefer Ploni", "The Book of Someone"],
    "sectionNames": ["Chapter", "Paragraph"],
    "categories": ["Musar"],
}

def post_index(index):
    # Spaces in the title become underscores in the URL
    url = 'http://localhost:8000/api/index/' + index["title"].replace(" ", "_")
    indexJSON = json.dumps(index)
    values = {
        'json': indexJSON,
        'apikey': 'yourapikey'
    }
    # The form-encoded POST body must be bytes
    data = urllib.parse.urlencode(values).encode('utf-8')
    req = urllib.request.Request(url, data)
    try:
        response = urllib.request.urlopen(req)
        print(response.read().decode('utf-8'))
    except urllib.error.HTTPError as e:
        print('Error code: ', e.code)

post_index(index)

POST /api/v2/raw/index/:title

Posts an index record in the new format. See Index Records for Simple & Complex Texts for details on the structure of index records.

To edit a complex Index record via the API, use /api/v2/raw/index to GET and POST it.

POST /api/texts/:ref

Posts the text named by ref. ref must name a text already known to Sefaria. When posting a new text, add a new index record first, either manually on the site or programmatically as described above. POST data must be x-www-form-urlencoded and include two parameters: apikey, and json, which contains a JSON object with the fields below:

versionTitle The name of this text version (e.g., "Wikisource Tanach").

versionSource A URL identifying the original source of the posted text.

language The two character language code of the posted text.

text The text itself, as either a string (if ref names a lowest level segment, e.g., "Iyyov 3:2") or an array of strings (if ref names a higher level segment, e.g., "Shemot Rabbah 4").

count_after When using an API key, this parameter defaults to "0". Set it to "1" to rebuild the counts document after posting new content. Sefaria keeps a counts document that summarizes the number of available text versions for every segment of a text. When the counts document is out of date, the text table of contents page will not list all available content. Generally, if you are posting a text with multiple HTTP requests, include count_after=1 on the last request, or on an additional request after you've finished, so that all the text you've added is counted.

skip_links Setting this to "1" will skip the automatic linking process when uploading a new text. Auto-linking is by far the slowest part of uploading a new text, which prompted us to create an option to skip it. Note, though, that the auto-linking process MUST run, so it is highly recommended that anyone without administrative access not set this parameter.
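The advice about rebuilding counts only once can be sketched as a helper that generates one POST URL per chapter and flags only the last. This assumes the parameter travels in the query string (the page does not say exactly where it goes), and chapter_post_urls is a hypothetical name. The localhost base URL is a placeholder for whichever environment you post to.

```python
def chapter_post_urls(title, chapter_count,
                      base="http://localhost:8000/api/texts/"):
    """Return one POST URL per chapter, rebuilding counts only on the last."""
    urls = []
    for chapter in range(1, chapter_count + 1):
        # Spaces in the ref become underscores.
        ref = "{} {}".format(title, chapter).replace(" ", "_")
        url = base + ref
        if chapter == chapter_count:
            # Assumption: the parameter is sent as a query-string parameter.
            url += "?count_after=1"
        urls.append(url)
    return urls
```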

For an example POST, let's assume we've added a text index record for Sefer Ploni as described above. To post the 5th chapter of this text in Python we could do the following:

# Python 3, standard library only
import json
import urllib.error
import urllib.parse
import urllib.request

text = {
    "versionTitle": "Example Sefer Ploni",
    "versionSource": "http://www.example.com/Sefer_Ploni",
    "language": "en",
    "text": [
        "Paragraph 1",
        "Paragraph 2",
        "Paragraph 3"
    ]
}

def post_text(ref, text):
    textJSON = json.dumps(text)
    # Spaces in the ref become underscores in the URL
    ref = ref.replace(" ", "_")
    url = 'http://localhost:8000/api/texts/%s' % ref
    values = {'json': textJSON, 'apikey': 'yourapikey'}
    # The form-encoded POST body must be bytes
    data = urllib.parse.urlencode(values).encode('utf-8')
    req = urllib.request.Request(url, data)
    try:
        response = urllib.request.urlopen(req)
        print(response.read().decode('utf-8'))
    except urllib.error.HTTPError as e:
        print('Error code: ', e.code)
        print(e.read().decode('utf-8'))

post_text("Sefer Ploni 5", text)

Texts can be posted at any level of granularity, from individual segments to sections to entire texts, so long as the depth of the nesting in the text field matches the depth of the ref passed. In the above example we posted a chapter, but we could also post an individual paragraph or the complete text like below:

text_paragraph = {
    "versionTitle": "Example Sefer Ploni",
    "versionSource": "http://www.example.com/Sefer_Ploni",
    "language": "en",
    "text": "Paragraph 4"
}
post_text("Sefer Ploni 5:4", text_paragraph)

text_whole = {
    "versionTitle": "Example Sefer Ploni",
    "versionSource": "http://www.example.com/Sefer_Ploni",
    "language": "en",
    "text": [
        [
            "Chapter 1, Paragraph 1",
            "Chapter 1, Paragraph 2"
        ],
        [
            "Chapter 2, Paragraph 1",
            "Chapter 2, Paragraph 2",
            "Chapter 2, Paragraph 3"
        ],
        [
            "Chapter 3, Paragraph 1"
        ]
    ]
}
post_text("Sefer Ploni", text_whole)
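Before posting, it can help to check locally that the nesting of the text field matches the level the ref addresses. The sketch below is one way to do that. All three function names are hypothetical, and it takes sectionNames and the parsed sections of the ref as plain inputs rather than parsing the ref string itself.

```python
def nesting_depth(text):
    """0 for a string, 1 for a list of strings, 2 for a list of lists, etc."""
    if isinstance(text, str):
        return 0
    if not text:
        # An empty list still counts as one level of nesting.
        return 1
    return 1 + max(nesting_depth(item) for item in text)


def expected_depth(section_names, sections):
    """Levels remaining below the ref: text depth minus ref depth."""
    return len(section_names) - len(sections)


def depth_matches(text, section_names, sections):
    """True when the text nesting fits the level the ref addresses."""
    return nesting_depth(text) == expected_depth(section_names, sections)
```

For Sefer Ploni, with sectionNames of ["Chapter", "Paragraph"], a single paragraph (sections [5, 4]) expects a string, a chapter (sections [5]) expects a list of strings, and the whole text (no sections) expects a list of lists.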

POST /api/links/

Adds links.

Payload can be one Link object, or an array of many.

Each link object has two required fields: refs, which is an array of two references, and type, which is a string indicating the type of the link. type can be any text value or an empty string.

Commonly used values of type:

commentary
quotation
reference
summary
explication
related
midrash
ein mishpat
mesorat hashas

Optional attributes are auto which is Boolean, and generated_by which is included if auto is true, and is a string identifying the process that added the link.

Example payload:

{
    "refs": [
        "Shabbat 15b:6",
        "Exodus 3:4"
    ],
    "type": "quotation"
}

Example payload of an automated process:

{
    "refs": [
        "Shabbat 5b:4-8",
        "Mishnah Shabbat 1:3-4"
    ],
    "type": "Mishnah in Talmud",
    "auto": true,
    "generated_by": "connect_mishnah"
}
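When links are produced by a script, the payloads above can be built from Python dictionaries and serialized with json.dumps, which takes care of quoting and of writing booleans as JSON true. The following is a minimal sketch; make_link is a hypothetical helper, and setting auto only when generated_by is supplied is a convention of this sketch that follows the description of the optional attributes above.

```python
import json


def make_link(ref_a, ref_b, link_type="", generated_by=None):
    """Build a link object with the two required fields, refs and type."""
    link = {"refs": [ref_a, ref_b], "type": link_type}
    if generated_by is not None:
        # Automated links carry auto plus the name of the generating process.
        link["auto"] = True
        link["generated_by"] = generated_by
    return link


# The payload can be a single link object or an array of many.
payload = json.dumps([
    make_link("Shabbat 15b:6", "Exodus 3:4", "quotation"),
    make_link("Shabbat 5b:4-8", "Mishnah Shabbat 1:3-4",
              "Mishnah in Talmud", generated_by="connect_mishnah"),
])
```

The resulting string would then be sent as the json parameter of the POST, alongside apikey, in the same way as the index and text examples earlier on this page.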