## How to use the usfm-grammar Python APIs


### Installation

#### From PyPI

In [None]:
# Setting up a virtual environment first is recommended
# Requires Python >= 3.10
!pip install usfm-grammar

#### From code base

In [None]:
! pip install -e ./../py-usfm-parser/ # from the code base

### Using from CLI

In [None]:
! usfm-grammar ../tests/basic/multiple-chapters/origin.usfm

In [None]:
! usfm-grammar ../tests/basic/footnote/origin.usfm --exclude_markers notes --exclude_markers w

In [None]:
! usfm-grammar ../tests/basic/multiple-chapters/origin.usfm --out_format usx

In [None]:
! usfm-grammar ../tests/basic/multiple-chapters/origin-usj.json --out_format usfm

In [None]:
! usfm-grammar -h # to view the command line options
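
The same CLI calls can also be scripted from Python. The following is a minimal sketch, not part of the library: `build_command` is a hypothetical helper, and it only uses the `--out_format` and `--exclude_markers` options shown in the cells above. The command is executed only if the `usfm-grammar` executable is found on the PATH.

```python
import shutil
import subprocess


def build_command(usfm_path, out_format=None, exclude_markers=()):
    """Assemble the argument list for one usfm-grammar CLI call (hypothetical helper)."""
    command = ["usfm-grammar", usfm_path]
    if out_format is not None:
        command += ["--out_format", out_format]
    # The CLI examples above repeat --exclude_markers once per marker.
    for marker in exclude_markers:
        command += ["--exclude_markers", marker]
    return command


command = build_command(
    "../tests/basic/footnote/origin.usfm",
    exclude_markers=["notes", "w"],
)
print(command)

# Run only when the CLI is actually installed in this environment.
if shutil.which("usfm-grammar") is not None:
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    print(result.stdout)
```

Building the argument list separately from running it makes the call easy to inspect and to reuse in a loop over many files.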

In [None]:
# After updating the local tree-sitter-usfm grammar, rebuild the compiled grammar so the changes take effect.
# Run the following in a terminal from the project root:
# >>> python python-usfm-parser/src/grammar_rebuild.py ./tree-sitter-usfm3/ python-usfm-parser/src/usfm_grammar/my-languages.so

### Parsing an input USFM

In [4]:
from usfm_grammar import USFMParser, Filter # importing from the local module, not from an installed library

In [5]:
input_usfm_str = '''
\\id EXO 02EXOGNT92.SFM, Good News Translation, June 2003
\\h പുറപ്പാടു്
\\toc1 പുറപ്പാടു്
\\toc2 പുറപ്പാടു്
\\mt പുറപ്പാടു്
\\c 1
\\p
\\v 1 യാക്കോബിനോടുകൂടെ കുടുംബസഹിതം ഈജിപ്റ്റിൽ വന്ന 
\\p യിസ്രായേൽമക്കളുടെ പേരുകൾ : 
\\v 2 രൂബേൻ, ശിമെയോൻ, ലേവി,
\\v 3 
\\li1 യെഹൂദാ, 
\\li1 യിസ്സാഖാർ, 
\\li1 സെബൂലൂൻ, 
\\li1 ബെന്യാമീൻ
\\p
\\v 4 ദാൻ, നഫ്താലി, ഗാദ്, ആശേർ.
\\v 12-83 They presented their offerings in the following order:
\\tr \\th1 Day \\th2 Tribe \\th3 Leader
\\tr \\tcr1 1st \\tc2 Judah \\tc3 Nahshon son of Amminadab
\\tr \\tcr1 2nd \\tc2 Issachar \\tc3 Nethanel son of Zuar
\\tr \\tcr1 3rd \\tc2 Zebulun \\tc3 Eliab son of Helon
\\p
\\v 5 യാക്കോബിന്റെ സന്താനപരമ്പരകൾ എല്ലാം കൂടി എഴുപതു പേർ ആയിരുന്നു; യോസേഫ് മുമ്പെ തന്നെ ഈജിപ്റ്റിൽ ആയിരുന്നു. \\w gracious \\+nd Lord\\+nd*|grace\\w* and then a few words later \\w gracious|lemma="grace" x-myattr="metadata"\\w*
\\c 2
\\s1 A Prayer of Habakkuk
\\p
\\v 1 This is a prayer of the prophet Habakkuk:
\\b
\\q1
\\v 2 O \\nd Lord\\nd*, I have heard of what you have done,
\\q2 and I am filled with awe.
\\q1 Now do again in our times
\\q2 the great deeds you used to do.
\\q1 Be merciful, even when you are angry.
\\p
\\v 20 Adam \\f + \\fr 3.20: \\fk Adam: \\ft This name in Hebrew means “all human beings.”\\f*
named his wife Eve, \\f + \\fr 3.20: \\fk Eve: \\ft This name sounds similar to the Hebrew
word for “living,” which is rendered in this context as “human beings.”\\f* because she
was the mother of all human beings.
\\v 21 And the \\nd Lord\\nd* God made clothes out of animal skins for Adam and his wife,
and he clothed them.
\\qt-s |sid="qt_123" who="Pilate"\\*“Are you the king of the Jews?”\\qt-e |eid="qt_123"\\*
\\esb \\cat History\\cat*
\\ms Fish and Fishing
\\p In Jesus' time, fishing took place mostly on lake Galilee, because Jewish people
could not use many of the harbors along the coast of the Mediterranean Sea, since these
harbors were often controlled by unfriendly neighbors. The most common fish in the Lake
of Galilee were carp and catfish. \\wj The Law of Moses \\wj* allowed people to eat any fish with
fins and scales, but since catfish lack scales (as do eels and sharks) they were not to
be eaten. Fish were also probably brought from Tyre and Sidon,
where they were dried and salted.
...
\\p Among early Christians, the fish was a favorite image for Jesus, because the Greek
word for fish ( \\tl ichthus\\tl* ) consists of the first letters of the Greek words that
tell who Jesus is: \\fig Christian Fish Image\\fig*
\\esbe
'''

In [6]:
my_parser = USFMParser(input_usfm_str)

In [7]:
# Check the input USFM for errors.
# The remaining operations still work even if there are small errors.
my_parser.errors

### Converting to USJ and extracting specific contents via filters
- USJ is an easy-to-work-with JSON representation of USFM data
- Markers can be included in or excluded from the output as needed
    - Handy predefined filters are provided
    - Any individual marker can also be specified

In [8]:
my_parser.to_usj()

{'type': 'USJ',
 'version': '0.1.0',
 'content': [{'type': 'book:id',
   'content': ['02EXOGNT92.SFM, Good News Translation, June 2003'],
   'code': 'EXO'},
  {'type': 'para:h', 'content': ['പുറപ്പാടു്']},
  {'type': 'para:toc1', 'content': ['പുറപ്പാടു്']},
  {'type': 'para:toc2', 'content': ['പുറപ്പാടു്']},
  {'type': 'para:mt', 'content': ['പുറപ്പാടു്']},
  {'type': 'chapter:c', 'number': '1', 'sid': 'EXO 1'},
  {'type': 'para:p',
   'content': [{'type': 'verse:v', 'number': '1', 'sid': 'EXO 1:1'},
    'യാക്കോബിനോടുകൂടെ കുടുംബസഹിതം ഈജിപ്റ്റിൽ വന്ന']},
  {'type': 'para:p',
   'content': ['യിസ്രായേൽമക്കളുടെ പേരുകൾ :',
    {'type': 'verse:v', 'number': '2', 'sid': 'EXO 1:2'},
    'രൂബേൻ, ശിമെയോൻ, ലേവി,',
    {'type': 'verse:v', 'number': '3', 'sid': 'EXO 1:3'}]},
  {'type': 'para:li1', 'content': ['യെഹൂദാ,']},
  {'type': 'para:li1', 'content': ['യിസ്സാഖാർ,']},
  {'type': 'para:li1', 'content': ['സെബൂലൂൻ,']},
  {'type': 'para:li1', 'content': ['ബെന്യാമീൻ']},
  {'type': 'para:p',
   'cont
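
Because USJ is plain nested dicts, lists and strings, it can be inspected with ordinary Python. The sketch below assumes only the shape visible in the output above (a `type` key on each node and an optional `content` list). `sample_usj` is a hand-shortened excerpt with placeholder English text, and `iter_nodes` is a hypothetical helper, not a library function.

```python
from collections import Counter

# Hand-shortened excerpt of the output above; verse text replaced by placeholders.
sample_usj = {
    "type": "USJ",
    "version": "0.1.0",
    "content": [
        {"type": "book:id",
         "content": ["02EXOGNT92.SFM, Good News Translation, June 2003"],
         "code": "EXO"},
        {"type": "chapter:c", "number": "1", "sid": "EXO 1"},
        {"type": "para:p",
         "content": [{"type": "verse:v", "number": "1", "sid": "EXO 1:1"},
                     "first verse text"]},
        {"type": "para:p",
         "content": ["more text",
                     {"type": "verse:v", "number": "2", "sid": "EXO 1:2"},
                     "second verse text"]},
    ],
}


def iter_nodes(node):
    """Yield every dict node in a USJ tree, depth first; plain strings are skipped."""
    if isinstance(node, dict):
        yield node
        for child in node.get("content", []):
            yield from iter_nodes(child)


# Count how often each marker type occurs and list the verse identifiers.
type_counts = Counter(node["type"] for node in iter_nodes(sample_usj))
verse_sids = [node["sid"] for node in iter_nodes(sample_usj)
              if node["type"] == "verse:v"]
print(type_counts)
print(verse_sids)
```

With the real parser, the same helper could be applied to the dict returned by `my_parser.to_usj()`.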

In [None]:
my_parser.to_usj(exclude_markers=['s1','h', 'toc1','toc2','mt','b', # inner contents are removed
                                  'p','q1','q2', # inner content is preserved and moved one layer up (flattened)
                                  'w','nd', # inner content is preserved...
                                  'tr','tc2','tcr1', 'tc3', 'th1','th2', 'th3','table', # inner content is preserved...
                                  'li1', # inner content is preserved...
                                  'esb','f', # inner contents are removed
                                 ],
                 # combine_texts=False
                )

In [9]:
my_parser.to_usj(include_markers=Filter.BCV+Filter.TEXT,
                 combine_texts=True
                )

{'type': 'USJ',
 'version': '0.1.0',
 'content': [{'type': 'book:id',
   'content': ['02EXOGNT92.SFM, Good News Translation, June 2003'],
   'code': 'EXO'},
  {'type': 'chapter:c', 'number': '1', 'sid': 'EXO 1', 'content': []},
  {'type': 'verse:v', 'number': '1', 'sid': 'EXO 1:1', 'content': []},
  'യാക്കോബിനോടുകൂടെ കുടുംബസഹിതം ഈജിപ്റ്റിൽ വന്ന യിസ്രായേൽമക്കളുടെ പേരുകൾ :',
  {'type': 'verse:v', 'number': '2', 'sid': 'EXO 1:2', 'content': []},
  'രൂബേൻ, ശിമെയോൻ, ലേവി,',
  {'type': 'verse:v', 'number': '3', 'sid': 'EXO 1:3', 'content': []},
  'യെഹൂദാ, യിസ്സാഖാർ, സെബൂലൂൻ, ബെന്യാമീൻ',
  {'type': 'verse:v', 'number': '4', 'sid': 'EXO 1:4', 'content': []},
  'ദാൻ, നഫ്താലി, ഗാദ്, ആശേർ.',
  {'type': 'verse:v', 'number': '12-83', 'sid': 'EXO 1:12-83', 'content': []},
  'They presented their offerings in the following order: Day Tribe Leader 1st Judah Nahshon son of Amminadab 2nd Issachar Nethanel son of Zuar 3rd Zebulun Eliab son of Helon',
  {'type': 'verse:v', 'number': '5', 'sid': 'EXO 1:5',
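
In this flattened form each verse marker is followed by its text as a sibling string, so the two can be paired in a single pass. The sketch below assumes exactly that layout, as seen in the output above. `flat_content` is a hand-written excerpt with placeholder text, and `verses_by_sid` is a hypothetical helper.

```python
# Hand-written excerpt in the shape of the flattened output above (placeholder text).
flat_content = [
    {"type": "book:id", "content": ["book description"], "code": "EXO"},
    {"type": "chapter:c", "number": "1", "sid": "EXO 1", "content": []},
    {"type": "verse:v", "number": "1", "sid": "EXO 1:1", "content": []},
    "text of verse one",
    {"type": "verse:v", "number": "2", "sid": "EXO 1:2", "content": []},
    "text of verse two",
]


def verses_by_sid(content):
    """Map each verse sid to the text strings that follow it."""
    verses = {}
    current_sid = None
    for item in content:
        if isinstance(item, dict):
            # A verse marker opens a new verse; any other marker closes the current one.
            current_sid = item["sid"] if item.get("type") == "verse:v" else None
        elif current_sid is not None:
            verses[current_sid] = (verses.get(current_sid, "") + " " + item).strip()
    return verses


verse_map = verses_by_sid(flat_content)
print(verse_map)
```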

In [None]:
my_parser.to_usj(exclude_markers=Filter.BOOK_HEADERS+Filter.TITLES+Filter.COMMENTS,
                 # combine_texts=False
                )

In [None]:
# For a flattened JSON
my_parser.to_usj(exclude_markers=Filter.PARAGRAPHS+Filter.CHARACTERS
                 # combine_texts=False
                )

In [None]:
Filter.PARAGRAPHS

In [None]:
# To also drop the text of the excluded markers (here, paragraphs and characters)
my_parser.to_usj(exclude_markers=Filter.PARAGRAPHS+Filter.CHARACTERS+Filter.TEXT
                 # combine_texts=False
                )

### Converting to Table or List format
 - Supports filtering similar to to_usj()
 - The table format makes manual visual inspection of the data easier
 - Data can easily be ported to CSV (and from there to an Excel worksheet)
 - Using this from the command line requires no code at all
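
Porting the table to CSV can be sketched with the standard `csv` module. This assumes `to_list()` returns a list of rows of strings with a header row, as the printed outputs in this notebook suggest. `table_output` here is a small hand-written stand-in, and `rows_to_csv` is a hypothetical helper.

```python
import csv
import io

# Stand-in for the rows returned by my_parser.to_list(); values are hand-written.
table_output = [
    ["Book", "Chapter", "Verse", "Text", "Type"],
    ["EXO", "", "", "02EXOGNT92.SFM, Good News Translation, June 2003", "book:id"],
    ["EXO", "2", "1", "This is a prayer of the prophet Habakkuk:", ""],
]


def rows_to_csv(rows):
    """Serialize rows to CSV text; fields containing commas are quoted automatically."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(table_output)
print(csv_text)
```

To write a file instead, the same `csv.writer` can wrap a handle opened with `open(path, "w", newline="", encoding="utf-8")`, which keeps the Malayalam text intact.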

In [10]:
table_output = my_parser.to_list()
# table_output


In [None]:
print("\n".join(["\t".join(row) for row in table_output]))

In [None]:
table_output = my_parser.to_list(exclude_markers=Filter.BOOK_HEADERS+Filter.TITLES+Filter.COMMENTS)
print("\n".join(["\t".join(row) for row in table_output]))


In [12]:
table_output = my_parser.to_list(include_markers=Filter.BCV+Filter.TEXT)
print("\n".join(["\t".join(row) for row in table_output]))

Book	Chapter	Verse	Text	Type
EXO			02EXOGNT92.SFM, Good News Translation, June 2003	book:id
EXO	1	1	യാക്കോബിനോടുകൂടെ കുടുംബസഹിതം ഈജിപ്റ്റിൽ വന്ന യിസ്രായേൽമക്കളുടെ പേരുകൾ :	
EXO	1	2	രൂബേൻ, ശിമെയോൻ, ലേവി,	
EXO	1	3	യെഹൂദാ, യിസ്സാഖാർ, സെബൂലൂൻ, ബെന്യാമീൻ	
EXO	1	4	ദാൻ, നഫ്താലി, ഗാദ്, ആശേർ.	
EXO	1	12-83	They presented their offerings in the following order: Day Tribe Leader 1st Judah Nahshon son of Amminadab 2nd Issachar Nethanel son of Zuar 3rd Zebulun Eliab son of Helon	
EXO	1	5	യാക്കോബിന്റെ സന്താനപരമ്പരകൾ എല്ലാം കൂടി എഴുപതു പേർ ആയിരുന്നു; യോസേഫ് മുമ്പെ തന്നെ ഈജിപ്റ്റിൽ ആയിരുന്നു. gracious Lord and then a few words later gracious	
EXO	2	1	This is a prayer of the prophet Habakkuk:	
EXO	2	2	O Lord, I have heard of what you have done, and I am filled with awe. Now do again in our times the great deeds you used to do. Be merciful, even when you are angry.	
EXO	2	20	Adam named his wife Eve, because she
was the mother of all human beings.	
EXO	2	21	And the Lord God made clothes out of animal skin

In [None]:
table_output = my_parser.to_list(include_markers=Filter.NOTES)
print("\n".join(["\t".join(row) for row in table_output]))


### Convert to USX

In [None]:
from lxml import etree
usx_elem = my_parser.to_usx()
usx_str = etree.tostring(usx_elem, encoding="unicode", pretty_print=True) 
print(usx_str)
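
Once the data is in USX form, it can be queried like any XML tree. The sketch below uses the standard library's `xml.etree.ElementTree` on a hand-written USX-like fragment. The element and attribute names follow the general USX convention and are an assumption here; the parser's actual output may differ in detail. The lxml element returned by `to_usx()` supports the same `iter()` and `get()` calls.

```python
import xml.etree.ElementTree as ET

# Hand-written USX-like fragment for illustration only, not real parser output.
sample_usx = """<usx version="3.0">
  <book code="EXO" style="id">02EXOGNT92.SFM, Good News Translation, June 2003</book>
  <chapter number="1" style="c" sid="EXO 1"/>
  <para style="p">
    <verse number="1" style="v" sid="EXO 1:1"/>first verse text
    <verse number="2" style="v" sid="EXO 1:2"/>second verse text
  </para>
</usx>"""

root = ET.fromstring(sample_usx)

# Collect the book code and every verse number in document order.
book_code = root.find("book").get("code")
verse_numbers = [verse.get("number") for verse in root.iter("verse")]
print(book_code, verse_numbers)
```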

### To work with the syntax tree itself

In [None]:
my_st = my_parser.syntax_tree
print(my_st.children)

In [None]:
# to just view the syntax-tree
print(my_parser.to_syntax_tree())
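
For a quick overview of a tree, a recursive walk over `children` is often enough. The sketch below assumes the nodes expose `type` and `children` attributes, as tree-sitter nodes do. `FakeNode` is a hypothetical stand-in so the example runs without the parser, and the node type names in it are illustrative, not taken from the grammar.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in with the `type` and `children` attributes of a tree node."""
    type: str
    children: list = field(default_factory=list)


def outline(node, depth=0):
    """Return one indented line per node, depth first."""
    lines = ["  " * depth + node.type]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Illustrative tree; the type names are made up for this example.
tree = FakeNode("File", [
    FakeNode("book", [FakeNode("id")]),
    FakeNode("chapter", [FakeNode("c"), FakeNode("v")]),
])
print("\n".join(outline(tree)))
```

With the real parser, `outline(my_parser.syntax_tree)` would be the equivalent call under the same assumption.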

### USJ to USFM 
 - Round tripping
 - Allows you to make edits on the USJ/JSON and then create a USFM with the new data

In [None]:

usj_obj = my_parser.to_usj()

my_parser2 = USFMParser(from_usj=usj_obj)
print(my_parser2.usfm)
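
An edit step between the two conversions might look like the following sketch. It only manipulates the USJ dict; `replace_text` is a hypothetical helper, and `sample_usj` is a hand-written excerpt in the shape shown earlier. The final conversion back to USFM is left as a comment because it needs the installed library.

```python
def replace_text(node, old, new):
    """Return a copy of a USJ node with `old` replaced by `new` in every text string."""
    if isinstance(node, str):
        return node.replace(old, new)
    if isinstance(node, dict):
        updated = dict(node)
        if "content" in updated:
            updated["content"] = [replace_text(child, old, new)
                                  for child in updated["content"]]
        return updated
    return node


# Hand-written excerpt for illustration.
sample_usj = {
    "type": "USJ",
    "version": "0.1.0",
    "content": [
        {"type": "chapter:c", "number": "2", "sid": "EXO 2"},
        {"type": "para:p",
         "content": [{"type": "verse:v", "number": "1", "sid": "EXO 2:1"},
                     "This is a prayer of the prophet Habakkuk:"]},
    ],
}

edited_usj = replace_text(sample_usj, "prayer", "psalm")
print(edited_usj["content"][1]["content"][1])

# With usfm-grammar installed, the edited dict can then go through the
# constructor shown above: USFMParser(from_usj=edited_usj).usfm
```

Returning a copy rather than editing in place keeps the original USJ available for comparison.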

In [14]:
usj_obj2 = my_parser2.to_usj(include_markers=Filter.BCV+Filter.TEXT)
my_parser3 = USFMParser(from_usj=usj_obj2)
print(my_parser3.usfm)

\id EXO 02EXOGNT92.SFM, Good News Translation, June 2003
\c 1
\v 1 യാക്കോബിനോടുകൂടെ കുടുംബസഹിതം ഈജിപ്റ്റിൽ വന്ന യിസ്രായേൽമക്കളുടെ പേരുകൾ :\v 2 രൂബേൻ, ശിമെയോൻ, ലേവി,\v 3 യെഹൂദാ, യിസ്സാഖാർ, സെബൂലൂൻ, ബെന്യാമീൻ\v 4 ദാൻ, നഫ്താലി, ഗാദ്, ആശേർ.\v 12-83 They presented their offerings in the following order: Day Tribe Leader 1st Judah Nahshon son of Amminadab 2nd Issachar Nethanel son of Zuar 3rd Zebulun Eliab son of Helon\v 5 യാക്കോബിന്റെ സന്താനപരമ്പരകൾ എല്ലാം കൂടി എഴുപതു പേർ ആയിരുന്നു; യോസേഫ് മുമ്പെ തന്നെ ഈജിപ്റ്റിൽ ആയിരുന്നു. gracious Lord and then a few words later gracious\c 2
\v 1 This is a prayer of the prophet Habakkuk:\v 2 O Lord, I have heard of what you have done, and I am filled with awe. Now do again in our times the great deeds you used to do. Be merciful, even when you are angry.\v 20 Adam named his wife Eve, because she
was the mother of all human beings.\v 21 And the Lord God made clothes out of animal skins for Adam and his wife,
and he clothed them. “Are you the king of the Je