Complete text of BDB? #3

Open
biblicalhumanities opened this issue Dec 3, 2013 · 26 comments

Comments

@biblicalhumanities

Nice work! I would like to see the complete text of BDB, including the introduction. Is that something you would consider? Does the answer depend on who does the work?

@DavidTroidl
Member

We have the front matter. It just hasn't made it into the release. The
full text is pretty much beyond the capability of one individual, so it
does depend on who does the work. I had a quirky PHP app for editing
the lexicon, that Daniel and I used to get it into its present form.
I've made some progress in updating it to the current format, and making
it somewhat more stable. Then so many other things came along, it
remains in limbo.


@rrshaban

Hi @DavidTroidl,

How much of the text of BDB is currently posted in BrownDriverBriggs.xml? Is there a rough estimate of how much remains a work in progress? How can people help with getting it completed?

thank you,
Razi

@DavidTroidl
Member

Hi,

Brown, Driver, Briggs is a huge work. We have all the entries
represented. Some of the shorter ones are complete. Most of the others
have the "most significant" information included. We don't really have
a user-friendly method of contributing, but anybody who wants to extend
the work is free to do so. It's really hard to say how much we have
completed. A very uneducated guess would be maybe 35%?

Peace,

David


@rrshaban

Have you given any thought to scraping a website that has the BDB posted? e.g. http://biblehub.com/hebrew/776.htm

I'm not sure what the terms of use for the BDB are, but as the BDB is in the public domain, I don't see a reason why scraping the digital version there would not be allowed. The attribution given there is as follows:

"Brown-Driver-Briggs Hebrew and English Lexicon, Unabridged, Electronic Database.
Copyright © 2002, 2003, 2006 by Biblesoft, Inc.
All rights reserved. Used by permission. BibleSoft.com"

@dowens76
Member

Judging by a quick look at that entry, their database is abridged. I would think that what we have already has at least as much as that one and is unencumbered by their copyright assertions.

@strouptl

strouptl commented Mar 7, 2016

@DavidTroidl this is a wonderful resource! I stumbled across it looking for some lexical information that I was not able to get at through the Accordance UI, and was able to export exactly what I needed using a simple XML parser. I see that "all entries are represented" from your comments above, but I was just wondering if you know for sure if all stems are present for those entries?

@DavidTroidl
Member

I just came across an entry recently that seemed to need its senses
expanded. There may in fact be some verbs that don't have all their
stems represented. I have just uploaded the latest revision.
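
A minimal sketch of how one might list verb entries that have no stems, using a simple XML parser. The element names (`entry`, `pos`, `stem`), the `id` attribute, the `vb` part-of-speech prefix, and the sample data are all assumptions for illustration, not the confirmed schema of BrownDriverBriggs.xml; the real file may also use an XML namespace that the lookups would need to include.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real file's schema and content may differ.
SAMPLE = """<lexicon>
  <entry id="e1"><w>אמר</w><pos>vb</pos>
    <sense><stem>Qal</stem></sense><sense><stem>Niph</stem></sense></entry>
  <entry id="e2"><w>אבה</w><pos>vb</pos></entry>
  <entry id="e3"><w>אב</w><pos>n.m</pos></entry>
</lexicon>"""


def stems_by_verb(root):
    """Map each verb entry id to the list of stem names found inside it."""
    result = {}
    for entry in root.iter("entry"):
        pos = entry.findtext("pos", default="")
        if not pos.startswith("vb"):
            continue  # skip non-verb entries
        result[entry.get("id")] = [s.text for s in entry.iter("stem")]
    return result


root = ET.fromstring(SAMPLE)
stems = stems_by_verb(root)
# Verb entries with an empty stem list are candidates for review.
missing = [entry_id for entry_id, names in stems.items() if not names]
print(stems)
print(missing)
```

A check like this only finds verbs with no stems at all; deciding whether a verb that has some stems is missing others would still need comparison against the printed lexicon.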


@EliezerIsrael

EliezerIsrael commented Dec 21, 2016

http://www.ericlevy.com/Revel/BDB/BDB/main.htm

This version of the BDB appears to be complete, although I have seen a few minor errors - numbering of senses being off, in particular. It looks to be parseable, with some effort.

@dowens76
Member

Wow, that is an impressive piece of work, thanks for the link. I wonder if he would make his source files available.

@EliezerIsrael

From the looks of it, R. Eric Levy copied it from biblecentre.net, which is no longer online. I reached out to R. Levy, but haven't yet heard back. It's relatively easy to download the entire html of the website. Then it's just a small matter of parsing. :)

The base text is in the public domain, but some of the emendations here make me wonder if this was digitized from a newer version that someone may try to assert rights over. In any case, the core material is squarely in the public domain, and no one could protest if the core work of the BDB were parsed and redistributed from here.

@DavidTroidl
Member

DavidTroidl commented Dec 22, 2016 via email

@dowens76
Member

dowens76 commented Dec 26, 2016 via email

@EliezerIsrael

Ah, well that is disappointing. I'm not terribly surprised, though.

Do we have any idea who the proper originator of the BDB data is? I'd love to have a conversation with them. Perhaps there's a way we can get it released into the commons legitimately.

@EliezerIsrael

@dowens76
Member

Oooh, best not to mess with that.

@EliezerIsrael

Here's a gift!
https://github.com/jackweinbender/bdb_parse

https://liberalarts.utexas.edu/mes/news/article.php?id=6768
A team at UTexas Austin got an NEH grant to create an online Lexicon based on the BDB. The grant wasn't renewed, but they got as far as digitizing the public domain BDB printing. I swapped emails with them, and their view is that since public money paid for the work, the resulting data is public property. They gave their blessing to carry the project forward in whatever ways we can.

It's a bit rough, the data - it needs to be converted from its current form into proper Unicode. There's some Node.js code that does some setup, but doesn't go so far as parsing the data.

Even so - this seems like a great bounty of data.

@DavidTroidl
Member

The key map for Bwhebb is at Bible Works Fonts. This should help in constructing a search and replace script for the Hebrew. The consonants appear in reverse order, but each is followed by its vowel: bybia' means אָבִיב
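
A rough sketch of the search-and-replace idea described above: split the Bwhebb string into consonant-plus-vowel clusters, map each key to Unicode, then reverse the cluster order. The key table here is deliberately partial and is inferred only from the single example in this comment (`bybia'`); the function name `bwhebb_to_unicode` is hypothetical, and a real converter would need the full key map, final letter forms, and other marks.

```python
# Partial key map inferred from the one example above; NOT the full Bwhebb table.
CONSONANTS = {
    "a": "\u05d0",  # alef
    "b": "\u05d1",  # bet
    "y": "\u05d9",  # yod
}
VOWELS = {
    "'": "\u05b8",  # qamats
    "i": "\u05b4",  # hiriq
}


def bwhebb_to_unicode(text):
    """Convert a Bwhebb-encoded word to Unicode Hebrew (partial map only)."""
    clusters = []
    for ch in text:
        if ch in CONSONANTS:
            # Each consonant starts a new cluster.
            clusters.append(CONSONANTS[ch])
        elif ch in VOWELS:
            if not clusters:
                raise ValueError("vowel key before any consonant")
            # A vowel attaches to the consonant it follows.
            clusters[-1] += VOWELS[ch]
        else:
            raise ValueError(f"unmapped key: {ch!r}")
    # Consonants are stored in reverse order, so reverse the clusters.
    return "".join(reversed(clusters))


word = bwhebb_to_unicode("bybia'")
print(word)
```

Raising on unmapped keys, rather than passing them through, makes gaps in the key table visible early when the script is run over the whole data set.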

@dajare

dajare commented Nov 5, 2017

There is a macro for Word 2003 that converts BibleWorks fonts to Unicode. It's in the "OLE and DDE" section of the help file (towards the end: section 58 in BWks 9). It includes this guidance:

To implement them just copy the blue text below into the Word Macro editor. If you want to use a different Unicode font you will need to edit the font names in the calling routines below. In other words, change "Ezra SIL" and "Arial Unicode MS" to the names of the fonts you want to use. BibleWorks ships with "SBL Greek" and "SBL Hebrew", as well as "Ezra SIL".

I have put the macro itself in a Gist, if that helps. But anyone with BibleWorks (for many versions back) will have this already.

@jackweinbender

All,

A few things about this data.

  1. I wrote a crosswalk and converter for the legacy > Unicode conversion.

  2. There is one major issue with the Hebrew: all non-final tsades without dagesh have, for some reason, been encoded as a het. That is, there’s no straightforward way of knowing whether any particular “het” should actually be a tsade. You may be able to infer them from their position in the lexicon (all the words that start with het, obviously, are together; root aleph-het would show up before aleph-tsade, if it even exists [in which case a dictionary of BH roots could be helpful]).
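
The dictionary-based inference suggested in item 2 could be sketched roughly as follows: generate every spelling in which a non-final het is read as either het or tsade, then keep the ones found in a list of known roots. The function names and the tiny root set are hypothetical, the input is assumed to be unpointed consonants, and this is not the approach used by any existing transcoder.

```python
from itertools import product

HET = "\u05d7"
TSADE = "\u05e6"


def tsade_candidates(word):
    """All spellings where each non-final het is read as het or tsade."""
    options = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        if ch == HET and i != last:
            # Only non-final tsades were affected, so a final het stays het.
            options.append((HET, TSADE))
        else:
            options.append((ch,))
    return ["".join(p) for p in product(*options)]


def resolve(word, known_roots):
    """Keep only the candidate spellings attested in known_roots."""
    return [c for c in tsade_candidates(word) if c in known_roots]


# Hypothetical root list for illustration only.
KNOWN = {"\u05e6\u05d3\u05e7", "\u05d7\u05e1\u05d3"}
print(resolve("\u05d7\u05d3\u05e7", KNOWN))
```

When `resolve` returns more than one spelling, both readings are attested and the entry's position in the lexicon's alphabetical order would have to break the tie.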

@jackweinbender

jackweinbender commented Feb 9, 2018

Here’s the transcoder (it was private, sry).
https://github.com/jackweinbender/bdb_transcoder
I wrote it in Elixir, for a reason I don’t recall. I’ve stopped working on BDB stuff for the present while I finish my dissertation.

@EliezerIsrael

@jackweinbender This is great. Thank you.
I'd been working on a transcoder independently, over here - https://github.com/Sefaria/bdb_parse
Still have some dangling issues - could be that your work will help.

@jackweinbender

FWIW; the JSON file in the transcoder should be exhaustive.

Is there a plan to encode this as a TEI document? I’ve also got a simple digital site to display the BDB by page like (http://jastrow.semitics-archive.org), if I can find it. I’ve been playing with some computer vision stuff to split up the images into entries/paragraphs that might make transcription (or perhaps corrected OCR?) easier.

@jackweinbender

I’m going to try to keep up with these projects; I’d like to help. I was very disappointed when our NEH grant was not renewed. The BDB is such a fantastic work of scholarship, it is tragic that there isn’t a complete, open, digital edition of it yet.

@dajare

dajare commented Feb 9, 2018

@jackweinbender said:

I’ve also got a simple digital site to display the BDB by page like (http://jastrow.semitics-archive.org), if I can find it.

I hope you can find it! That would be valuable, although something like the GKC on Wikisource would be remarkable. But please ping me if you mount your digi-BDB! Thanks.

@jackweinbender

I will. I’m out of town this week, but I’ll post a link whenever I get it deployed.

@jackweinbender

I actually reimplemented my BDB site using the data from this repo's XML file, since the former iteration used the buggy one referenced above. Everything seems to still work, so... as promised:

http://bdb.semitics-archive.org/

It probably sucks on mobile, FWIW.

7 participants