Suggestions for API #922

Open
krixano opened this issue Jul 5, 2022 · 1 comment

krixano commented Jul 5, 2022

I wanted to leave some suggestions for the API that I've gathered from trying to use the API for one of my projects (creating a Sefaria proxy for the Gemini protocol, at gemini://auragem.space/texts/judaism).


isDigitizedBySefaria sometimes comes back as a string and sometimes as a boolean. I assume it is returned as a boolean only when it actually has a value. This should be made consistent: a boolean at all times, defaulting to false when there is no value. That makes it much easier for statically typed languages (Go, Odin, etc.) to unmarshal the JSON. Honestly, there shouldn't be any reason for this field to be a string in some responses and a boolean in others, imo.


There seems to be a problem with poetry. Verses that contain poetry have a line break after every line except the last one, so the next verse starts right after the final line of the poem. In Sefaria's default frontend layout, each verse starts on a new line, which hides the problem until you switch to the other layout, where verses do not start on new lines. Sefaria currently ignores all line breaks in that mode. The expected result, however, is a line break at the end of the verse whenever its last line is part of the poem.
For example, Genesis 2:23-24 would show as this if one were to take the text directly from the API, replacing the line break tags with line breaks:

[verse 22...] (23) Then the Human said,
“This one at last
Is bone of my bones
And flesh of my flesh.
This one shall be called Woman,*
For from a Human* was she taken.” (24) Hence a man* leaves his father and mother and clings to his wife,* so that they become one flesh.

It should, however, display like this:

[verse 22...] (23) Then the Human said,
“This one at last
Is bone of my bones
And flesh of my flesh.
This one shall be called Woman,*
For from a Human* was she taken.”
(24) Hence a man* leaves his father and mother and clings to his wife,* so that they become one flesh.

This also happens to be how the majority of modern Bibles format it. For this to work, a line break needs to be added at the end of all poetry, even when the poetry ends at the end of a verse.
The problem is particularly evident in the book of Job, or in any other predominantly poetic book. Note, though, that in Job 17:1-7 some verses do end with a line break, but those are clearly the stanza breaks, so they would really need two line breaks.
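In the meantime, a client can approximate the expected layout itself. The Go sketch below is a heuristic under my own assumptions, not the API's behavior: JoinVerses is a hypothetical helper, it assumes the line break tags are some form of `<br>`, it treats any verse containing such a tag as poetry, and it treats a tag at the very end of a verse as a stanza break.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// brTag matches <br>, <br/> and <br /> in any letter case.
// Assumption: this is how line breaks appear in the verse text.
var brTag = regexp.MustCompile(`(?i)<br\s*/?>`)

// JoinVerses renders verses as running text, numbering them from firstNum.
// Prose verses are joined with a space; a verse containing line breaks
// (treated as poetry) is followed by a newline so the next verse starts
// on its own line.
func JoinVerses(verses []string, firstNum int) string {
	var b strings.Builder
	for i, verse := range verses {
		text := brTag.ReplaceAllString(verse, "\n")
		fmt.Fprintf(&b, "(%d) %s", firstNum+i, text)
		if i == len(verses)-1 {
			continue // no separator after the last verse
		}
		switch {
		case strings.HasSuffix(text, "\n"):
			// The verse already ends with a break: treat it as a stanza
			// break and add a second newline to leave a blank line.
			b.WriteString("\n")
		case strings.Contains(text, "\n"):
			// Poetry without a trailing break: add the missing one.
			b.WriteString("\n")
		default:
			b.WriteString(" ")
		}
	}
	return b.String()
}

func main() {
	verses := []string{
		"Then the Human said,<br>“This one at last<br>Is bone of my bones",
		"Hence a man leaves his father and mother",
	}
	fmt.Println(JoinVerses(verses, 23))
}
```

The heuristic misfires on a poetic verse that happens to fit on a single line, since it has no tag to detect, which is one more reason the break belongs in the API data itself.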


There is a field called alts in the Text API response that lists the names of the alternative sections (aliyot, etc.). However, as far as I know, it never specifies where those sections fall within the returned text. I'm not sure whether an additional query is needed to get that information, but I would suggest at the very least adding a "sections" field to each object within alts that specifies where that section starts (or, more specifically, the text/verse just after the start of the alt section). Ideally, though, I'd rather have a new API version that integrates these alternative ways of specifying/numbering sections better.


krixano commented Jul 7, 2022

Another thing with the Links API:

I would like a list of links (commentaries) for a whole chapter of the Tanakh (or a Daf of the Talmud, etc.). However, the list of links I get is based on the verses of the commentary. For example, querying for the links for Genesis 1 gives back a bunch of links of the form "Rashi on Genesis 1:[verse]:[verse of Rashi]", with each "verse" of Rashi in a separate object. What I want is just one link for "Rashi on Genesis 1:1". Unfortunately, the API does not make it simple to figure out which links take this format, or to construct the desired reference that covers the "chapter" of the commentary.

Edit: I did end up figuring out a way to handle this with the Go code below. However, I feel like the API could still be improved.

// GetLinks, text and builder are defined elsewhere in the project.
links := GetLinks(text.Ref, "", "") // Get Commentaries
dict := make(map[string]bool)       // Commentary titles already printed

fmt.Fprintf(&builder, "\n\n##Commentaries\n\n")
for _, link := range links {
	// Keep only commentaries that have an English text.
	if link.Category != "Commentary" || !link.SourceHasEn {
		continue
	}
	if _, ok := dict[link.IndexTitle]; !ok {
		// Add to map so we can check whether it repeats, then print the first reference link
		// (relying on the link to display the surrounding context)
		dict[link.IndexTitle] = true
		fmt.Fprintf(&builder, "=> /texts/jewish/t/%s %s\n", url.PathEscape(link.Ref), link.IndexTitle)
	}
}

Edit2: The code above assumed that the commentaries were listed in order, which is apparently not the case. So the links should probably be sorted first, by link.CommentaryNum or something similar.
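A rough sketch of that sort-then-deduplicate idea in Go follows. It is only an illustration: the Link struct here is a hypothetical stand-in with just the fields needed, the assumption that CommentaryNum is a number that orders comments within a commentary is unverified, and the sample values in main are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// Link is a hypothetical stand-in for the project's link type; only the
// fields used here are included. CommentaryNum being a float64 is an assumption.
type Link struct {
	Ref           string
	IndexTitle    string
	CommentaryNum float64
}

// FirstLinkPerCommentary sorts a copy of links by CommentaryNum and keeps
// the first link seen for each IndexTitle, so the surviving link is the
// lowest-numbered comment of each commentary.
func FirstLinkPerCommentary(links []Link) []Link {
	sorted := make([]Link, len(links))
	copy(sorted, links) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].CommentaryNum < sorted[j].CommentaryNum
	})

	seen := make(map[string]bool)
	var out []Link
	for _, l := range sorted {
		if seen[l.IndexTitle] {
			continue
		}
		seen[l.IndexTitle] = true
		out = append(out, l)
	}
	return out
}

func main() {
	// Made-up sample data, deliberately out of order.
	links := []Link{
		{"Rashi on Genesis 1:2:1", "Rashi on Genesis", 2.0001},
		{"Ibn Ezra on Genesis 1:1:1", "Ibn Ezra on Genesis", 1.0001},
		{"Rashi on Genesis 1:1:1", "Rashi on Genesis", 1.0001},
	}
	for _, l := range FirstLinkPerCommentary(links) {
		fmt.Printf("%s -> %s\n", l.IndexTitle, l.Ref)
	}
}
```

The stable sort keeps the API's original order among links with equal numbers, so the result stays deterministic for a given response.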
