meta_OpenArabic_complete : the latest version of metadata for OpenArabic
Number of texts: 7,052, of which 4,278 are unique
Fields (CSV structure), with example values:
- ID: `JK000001`
- AUTHOR: `0505Ghazali`
- DATE: `0505`
- TITLE: `IhyaCulumDin`
- #WORDS: `992049`
- URI: `0505Ghazali.IhyaCulumDin.JK000001-ara1`
- STATUS: `PRI`
- URL: https://raw.githubusercontent.com/OpenArabic/0600AH/master/data/0505Ghazali/0505Ghazali.IhyaCulumDin/0505Ghazali.IhyaCulumDin.JK000001-ara1
- MODERN: a modern book that may be worth including in the corpus
- MAJALLA: a run of a journal; not sure how to assign a URI to journals yet
- EXCLUDE: no need to include the text (modern) in the corpus, at least for the moment
- DIFFICULT: a difficult case
- NOTCLEAR: not clear from the metadata what the text is (add a date, if available, for reference: 0381NOTCLEAR)
- PERSIAN: the book is in Persian (mostly modern Shi`i law)
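
The URI field shown in the CSV structure above packs several of the other fields into one string. A minimal sketch of splitting it apart follows; the pattern is inferred from the single example record only, and the names `URI_PATTERN` and `parse_version_uri` are hypothetical, so real records may need a looser pattern.

```python
import re

# Pattern inferred from one example URI in this README
# (0505Ghazali.IhyaCulumDin.JK000001-ara1); it is an assumption, not a spec.
URI_PATTERN = re.compile(
    r"^(?P<date>\d{4})(?P<shuhra>[A-Za-z]+)"                 # author: year of death + name
    r"\.(?P<title>[A-Za-z]+)"                                # abbreviated title
    r"\.(?P<text_id>[A-Za-z]+\d+)-(?P<lang>[a-z]{3}\d+)$"    # text ID and language code
)


def parse_version_uri(uri):
    """Split a full text URI into its components (hypothetical helper)."""
    match = URI_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"unexpected URI shape: {uri!r}")
    parts = match.groupdict()
    parts["author"] = parts["date"] + parts["shuhra"]
    parts["book_uri"] = parts["author"] + "." + parts["title"]
    return parts


record = parse_version_uri("0505Ghazali.IhyaCulumDin.JK000001-ara1")
print(record["author"], record["book_uri"], record["text_id"])
```

A check like this could be used to confirm that the AUTHOR, DATE, TITLE and ID columns of a row agree with its URI column.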
- Use EditPadPro with OpenArabic mARkdown installed
- Open the file `__Entire_Corpus_Working.txt`
  > text in the file should get automatically highlighted
- Text files have already been grouped into the same works, but there are mistakes here and there: 1) wrong books grouped together; 2) not all text files that have the text of the same book were grouped together.
- A new record starts with `######## BEGofRECORD ###########`; the entire bibliography can be folded (right click > Fold all)
- The line that immediately follows is the line where the URI is either already provided or should be added
  - If the URI already exists, nothing needs to be done; in this case the line starts with `#FOLDER#===#` and is followed immediately by the URI of a book (highlighted in orange) and a selection of thematic tags (highlighted in green), for example: `#FOLDER#===# 0748Dhahabi.TarikhIslam (TAGS: BIO, CENT0800, CHR, COL, DHB, PPE, _TABAQAT, _TARAJIM, _TARIKH)`
  - When the URI does not exist, the line starts with one of the following three options (NB: options #2 and #3 can be ignored for now; URIs should be provided for books of #1 only):
    1. `#BookURI##===#` (no highlighting): the URI for this book may be added
    2. `#BookURI?##===#` (highlighted in black): there is no need to add the URI for this book for now
    3. `#BookURI-##===#` (highlighted in red): there is no need to add the URI for this book for now
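
The line markers described above lend themselves to a quick scripted overview of how many records still need a URI. The following is a rough sketch only: it assumes the working file can be read as plain lines, and the function names and status labels (`has_uri`, `needs_uri`, `skip`) are made up for illustration.

```python
RECORD_START = "######## BEGofRECORD ###########"

# Marker -> status label; the labels are hypothetical, the markers come from the list above.
PREFIXES = [
    ("#FOLDER#===#", "has_uri"),       # URI already provided
    ("#BookURI##===#", "needs_uri"),   # option 1: URI may be added
    ("#BookURI?##===#", "skip"),       # option 2: ignore for now
    ("#BookURI-##===#", "skip"),       # option 3: ignore for now
]


def uri_lines(lines):
    """Yield the line that immediately follows each record marker."""
    after_marker = False
    for line in lines:
        if after_marker:
            yield line
        after_marker = line.startswith(RECORD_START)


def classify_uri_line(line):
    """Return (status, remainder of the line) for a record's URI line."""
    for prefix, status in PREFIXES:
        if line.startswith(prefix):
            return status, line[len(prefix):].strip()
    return "unknown", line.strip()


sample = [
    RECORD_START,
    "#FOLDER#===# 0748Dhahabi.TarikhIslam (TAGS: BIO, CENT0800, CHR)",
    "some bibliographic data",
    RECORD_START,
    "#BookURI##===# (TAGS: BIO)",
    RECORD_START,
    "#BookURI?##===# (TAGS: BIO)",
]
statuses = [classify_uri_line(line)[0] for line in uri_lines(sample)]
print(statuses)
```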
- Make sure that the URI of an author does not exist before creating one; the easiest way to do that is to search for the year of death in Annotation to OpenArabic (https://github.com/OpenArabic/Annotation)
  - If it exists, use the existing one; if not, you need to create one
- Creating Author URI:
  - Author URI is made of two elements: (1) YEAR of DEATH (always (!) 4 digits, add zeros in the beginning for dates before 1000 AH); (2) ideally, SHUHRA of the author, for example: `0748Dhahabi`.
  - Issues with the date of death:
    - If only the century is known, use the last year of that century, thus the 4th century hijri > `0400`
    - If pre-hijri (for example, 56 before hijra), use `0001`
    - If current time, use `1450`
  - Issues with shuhra: when it is difficult to identify the SHUHRA, use the following elements (as they seem to be present in the majority of records)
    - In this order of priority: Laqab, Kunya, Nasab (Ibn Fulan, where Fulan is the name of the father).
    - In this order of priority: Nisba geographical; Nisba religious; Nisba tribal
    - Examples: `0430AbuNucaymIsbahani`, `0463KhatibBaghdadi`, `0900AbuCabdAllahHimyari`.
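
The Author URI rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function and parameter names are hypothetical, and the fallback reads the two priority lists as "take the first available element from each list and join them", which matches the examples given but is an interpretation rather than a stated rule.

```python
def author_uri(shuhra, death_year=None, century=None,
               pre_hijri=False, contemporary=False):
    """Build an Author URI: 4-digit year of death followed by the shuhra."""
    if pre_hijri:
        year = 1                  # pre-hijri dates are recorded as 0001
    elif contemporary:
        year = 1450               # current time is recorded as 1450
    elif death_year is not None:
        year = death_year
    elif century is not None:
        year = century * 100      # last year of the century, e.g. 4th century -> 0400
    else:
        raise ValueError("need a death year, a century, or a flag")
    if not 1 <= year <= 9999:
        raise ValueError(f"year out of range: {year}")
    return f"{year:04d}{shuhra}"  # always 4 digits, zero-padded


def fallback_shuhra(laqab=None, kunya=None, nasab=None,
                    nisba_geo=None, nisba_rel=None, nisba_tribal=None):
    """Assumed reading of the fallback: first available name element + first available nisba."""
    first = next((x for x in (laqab, kunya, nasab) if x), "")
    second = next((x for x in (nisba_geo, nisba_rel, nisba_tribal) if x), "")
    return first + second


print(author_uri("Dhahabi", death_year=748))
print(author_uri(fallback_shuhra(kunya="AbuNucaym", nisba_geo="Isbahani"), death_year=430))
```

The name elements passed in are expected to be already transliterated and joined in camelCase; choosing them remains a manual decision.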
- Creating Book URI
  - Add the title to the Author URI, separated with a period (`.`).
  - Drop the word kitāb from the title
  - Abbreviate the title to the shortest meaningful string, for example: al-Kāmil fī-l-taʾrīḫ becomes `Kamil`
  - Use camelCase when connecting words, for example: Taʾrīḫ al-islām becomes `TarikhIslam`
  - Drop hamzas, for example: Suʾāl becomes `Sual`
  - ʿAyns are transliterated with `c` and capitalized when necessary, for example: ʿAlī becomes `Cali`; iʿtidāl becomes `Ictidal`
  - In everything else: Library of Congress transliteration system.
  - Examples: `0430AbuNucaymIsbahani.HilyaAwliya`, `0463KhatibBaghdadi.TarikhBaghdad`, `0900AbuCabdAllahHimyari.RawdMictar`
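
The mechanical parts of the Book URI rules above (dropping hamzas, mapping ʿayn to `c`, stripping diacritics, joining words) can be sketched as follows. Several things here are assumptions: the helper names are hypothetical, the hamza and ʿayn signs are taken to be U+02BE and U+02BF, ḫ is mapped to `kh` because the examples do so, and the article al- is dropped because the examples drop it. Abbreviating the title to its shortest meaningful form stays a manual step, so the function expects the already shortened title.

```python
import unicodedata

HAMZA = "\u02be"  # assumed code point for the hamza sign
AYN = "\u02bf"    # assumed code point for the ayn sign


def ascii_word(word):
    """Reduce one transliterated word to its URI form, e.g. Su'al -> Sual."""
    word = unicodedata.normalize("NFC", word)
    word = word.replace(HAMZA, "").replace(AYN, "c")
    word = word.replace("\u1e2b", "kh").replace("\u1e2a", "Kh")  # h with breve below
    decomposed = unicodedata.normalize("NFD", word)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return plain.capitalize()  # first letter upper, rest lower (for camelCase joining)


def book_uri(author, short_title):
    """Join an Author URI and an already abbreviated title with a period."""
    words = []
    for raw in short_title.split():
        if raw.lower().startswith("al-"):
            raw = raw[3:]  # drop the article, as the examples do
        word = ascii_word(raw)
        if word and word != "Kitab":  # drop the word kitab
            words.append(word)
    return author + "." + "".join(words)


print(book_uri("0748Dhahabi", "Taʾrīḫ al-islām"))
```

Titles with internal hyphenated particles (such as fī-l-) are not handled here and would need to be shortened by hand first.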
- Adding the URI to the record
  - Replace `#BookURI##===#` with `#FOLDER#===#`
  - Insert the URI between `#FOLDER#===#` and `(TAGS: ...)`
  - If everything is done correctly, the line will get highlighted in orange.
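
The two editing steps for adding a URI to a record amount to a simple string rewrite. A minimal sketch, assuming the line has the shape `#BookURI##===# (TAGS: ...)`; the function name `add_uri` is hypothetical, and the spacing of the result follows the `#FOLDER#===#` example line given earlier.

```python
OLD_MARKER = "#BookURI##===#"
NEW_MARKER = "#FOLDER#===#"


def add_uri(line, uri):
    """Swap the marker and insert the URI before the (TAGS: ...) part."""
    if not line.startswith(OLD_MARKER):
        raise ValueError("line does not start with #BookURI##===#")
    rest = line[len(OLD_MARKER):].strip()  # normally the "(TAGS: ...)" part
    return f"{NEW_MARKER} {uri} {rest}".rstrip()


updated = add_uri("#BookURI##===# (TAGS: BIO, _TARIKH)", "0748Dhahabi.TarikhIslam")
print(updated)
```

Lines starting with `#BookURI?##===#` or `#BookURI-##===#` are rejected on purpose, since those records are to be left alone for now.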