## How to use the usfm-grammar Python APIs


### Installation

#### From PyPI

In [None]:
# Setting up a virtual environment first is recommended
# Requires Python >= 3.10
!pip install usfm-grammar

#### From code base

In [None]:
! pip install -e ./../python-usfm-parser/ # from the code base

In [None]:
! usfm-grammar -h # to view the command line options

In [None]:
# To pick up changes made to the local tree-sitter-usfm grammar,
# run the following in a terminal from the project root:
# >>> python python-usfm-parser/src/grammar_rebuild.py ./tree-sitter-usfm3/ python-usfm-parser/src/usfm_grammar/my-languages.so

### Parsing an input USFM

In [None]:
from usfm_grammar import USFMParser, Filter # importing from the local module, not from an installed library

In [None]:
input_usfm_str = '''
\\id EXO 02EXOGNT92.SFM, Good News Translation, June 2003
\\h പുറപ്പാടു്
\\toc1 പുറപ്പാടു്
\\toc2 പുറപ്പാടു്
\\mt പുറപ്പാടു്
\\c 1
\\p
\\v 1 യാക്കോബിനോടുകൂടെ കുടുംബസഹിതം ഈജിപ്റ്റിൽ വന്ന 
\\p യിസ്രായേൽമക്കളുടെ പേരുകൾ : 
\\v 2 രൂബേൻ, ശിമെയോൻ, ലേവി,
\\v 3 
\\li1 യെഹൂദാ, 
\\li1 യിസ്സാഖാർ, 
\\li1 സെബൂലൂൻ, 
\\li1 ബെന്യാമീൻ
\\p
\\v 4 ദാൻ, നഫ്താലി, ഗാദ്, ആശേർ.
\\v 12-83 They presented their offerings in the following order:
\\tr \\th1 Day \\th2 Tribe \\th3 Leader
\\tr \\tcr1 1st \\tc2 Judah \\tc3 Nahshon son of Amminadab
\\tr \\tcr1 2nd \\tc2 Issachar \\tc3 Nethanel son of Zuar
\\tr \\tcr1 3rd \\tc2 Zebulun \\tc3 Eliab son of Helon
\\p
\\v 5 യാക്കോബിന്റെ സന്താനപരമ്പരകൾ എല്ലാം കൂടി എഴുപതു പേർ ആയിരുന്നു; യോസേഫ് മുമ്പെ തന്നെ ഈജിപ്റ്റിൽ ആയിരുന്നു. \\w gracious|grace\\w* and then a few words later \\w gracious|lemma="grace" x-myattr="metadata"\\w*
\\c 2
\\s1 A Prayer of Habakkuk
\\p
\\v 1 This is a prayer of the prophet Habakkuk:
\\b
\\q1
\\v 2 O \\nd Lord\\nd*, I have heard of what you have done,
\\q2 and I am filled with awe.
\\q1 Now do again in our times
\\q2 the great deeds you used to do.
\\q1 Be merciful, even when you are angry.
\\p
\\v 20 Adam \\f + \\fr 3.20: \\fk Adam: \\ft This name in Hebrew means “all human beings.”\\f*
named his wife Eve, \\f + \\fr 3.20: \\fk Eve: \\ft This name sounds similar to the Hebrew
word for “living,” which is rendered in this context as “human beings.”\\f* because she
was the mother of all human beings.
\\v 21 And the \\nd Lord\\nd* God made clothes out of animal skins for Adam and his wife,
and he clothed them.
\\qt-s |sid="qt_123" who="Pilate"\\*“Are you the king of the Jews?”\\qt-e |eid="qt_123"\\*
\\esb \\cat History\\cat*
\\ms Fish and Fishing
\\p In Jesus' time, fishing took place mostly on lake Galilee, because Jewish people
could not use many of the harbors along the coast of the Mediterranean Sea, since these
harbors were often controlled by unfriendly neighbors. The most common fish in the Lake
of Galilee were carp and catfish. \\wj The Law of Moses \\wj* allowed people to eat any fish with
fins and scales, but since catfish lack scales (as do eels and sharks) they were not to
be eaten. Fish were also probably brought from Tyre and Sidon,
where they were dried and salted.
...
\\p Among early Christians, the fish was a favorite image for Jesus, because the Greek
word for fish ( \\tl ichthus\\tl* ) consists of the first letters of the Greek words that
tell who Jesus is: \\fig Christian Fish Image\\fig*
\\esbe
'''

In [None]:
my_parser = USFMParser(input_usfm_str)
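
The parser above takes the USFM content as a string. When the content lives in a file, a minimal sketch for loading it could look like the following. The helper `read_usfm` and the file name `sample.usfm` are hypothetical, introduced only for illustration; decoding explicitly as UTF-8 is assumed here because the sample contains Malayalam text.

```python
import tempfile
from pathlib import Path


def read_usfm(path):
    """Read a USFM file as text, decoding it explicitly as UTF-8."""
    return Path(path).read_text(encoding="utf-8")


# Demonstration with a temporary file; in practice, pass the path to your own USFM file
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.usfm"  # hypothetical file name
    sample_path.write_text("\\id EXO\n\\c 1\n\\p\n\\v 1 പുറപ്പാടു്\n", encoding="utf-8")
    usfm_text = read_usfm(sample_path)

print(usfm_text)
# The text can then be passed to the parser as above: USFMParser(usfm_text)
```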

In [None]:
# Check the input USFM for errors.
# The remaining operations still work when the input has only small errors.
my_parser.errors
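
For longer files it can help to turn the error list into a short report before converting. The sketch below assumes that `my_parser.errors` is a list (possibly empty) and makes no assumption about the shape of each entry; `summarize_errors` and the sample entry are hypothetical.

```python
def summarize_errors(errors):
    """Return a short, human-readable report for a list of parser errors."""
    if not errors:
        return "No errors found."
    lines = [f"{len(errors)} error(s) found:"]
    for index, error in enumerate(errors, start=1):
        # str() formatting keeps this independent of the exact shape of each entry
        lines.append(f"  {index}. {error}")
    return "\n".join(lines)


# Hypothetical stand-in for my_parser.errors
sample_errors = ["unexpected marker near \\v 12-83"]
print(summarize_errors(sample_errors))
```

In the notebook, the call would be `print(summarize_errors(my_parser.errors))`.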

#### Converting to other formats and extracting specific contents via filters

In [None]:
my_parser.to_usj()
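
To save or inspect the USJ output as JSON text, something like the sketch below may be useful. It assumes that `to_usj()` returns a JSON-serializable dictionary; `sample_usj` is a hypothetical stand-in whose keys and values are illustrative only. Passing `ensure_ascii=False` keeps non-Latin scripture text readable instead of escaping it as `\uXXXX` sequences.

```python
import json

# Hypothetical stand-in for the value returned by my_parser.to_usj();
# the real structure may differ
sample_usj = {
    "type": "USJ",
    "content": [
        {"type": "book", "marker": "id", "code": "EXO", "content": ["Good News Translation"]},
        {"type": "para", "marker": "h", "content": ["പുറപ്പാടു്"]},
    ],
}

# ensure_ascii=False writes the Malayalam text as-is; indent=2 makes it readable
usj_json = json.dumps(sample_usj, ensure_ascii=False, indent=2)
print(usj_json)
```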

In [None]:
# my_parser.to_dict([Filter.SCRIPTURE_TEXT])

In [None]:
# my_parser.to_dict([Filter.NOTES])

In [None]:
# my_parser.to_dict([Filter.SCRIPTURE_TEXT, Filter.PARAGRAPHS, Filter.TITLES])

In [None]:
table_output = my_parser.to_list()
table_output


In [None]:
print("\n".join(["\t".join(row) for row in table_output]))
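
Joining cells with `"\t".join` works as long as no cell contains a tab or a newline. A slightly more defensive sketch uses the standard `csv` module, which quotes such cells. The rows below are a hypothetical stand-in for `my_parser.to_list()`; the real column layout may differ.

```python
import csv
import io

# Hypothetical stand-in for my_parser.to_list(): a header row plus data rows
sample_rows = [
    ["Book", "Chapter", "Verse", "Text"],
    ["EXO", "1", "2", "first part\tsecond part"],  # cell with an embedded tab
]


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text, quoting cells that need it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = rows_to_tsv(sample_rows)
print(tsv_text)
```

Reading the text back with `csv.reader(..., delimiter="\t")` recovers the original rows, including the cell with the embedded tab.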

In [None]:
# table_output = my_parser.to_list([Filter.MILESTONES, Filter.NOTES])
# print("\n".join(["\t".join(row) for row in table_output]))


In [None]:
# table_output = my_parser.to_list([Filter.SCRIPTURE_TEXT])
# print("\n".join(["\t".join(row) for row in table_output]))


In [None]:
from lxml import etree
usx_elem = my_parser.to_usx()
usx_str = etree.tostring(usx_elem, encoding="unicode", pretty_print=True) 
print(usx_str)

#### Working with the syntax tree itself

In [None]:
my_st = my_parser.syntax_tree
print(my_st.children)
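
Printing `children` shows only the top level. To get an indented outline of the tree, a small recursive walk can be sketched as below. It assumes each node exposes `.type` and `.children`, as nodes in the tree-sitter Python bindings do; `FakeNode`, `outline`, and the node names in `sample_tree` are hypothetical and stand in for the real tree.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in for a syntax-tree node (only .type and .children)."""
    type: str
    children: list = field(default_factory=list)


def outline(node, depth=0, max_depth=2):
    """Return indented node types, descending at most max_depth levels."""
    lines = ["  " * depth + node.type]
    if depth < max_depth:
        for child in node.children:
            lines.extend(outline(child, depth + 1, max_depth))
    return lines


# Illustrative tree; the node names here are made up
sample_tree = FakeNode("File", [
    FakeNode("book", [FakeNode("id")]),
    FakeNode("chapter", [FakeNode("c"), FakeNode("paragraph", [FakeNode("p")])]),
])
print("\n".join(outline(sample_tree)))
```

With the parser from this notebook, the call would be `outline(my_parser.syntax_tree)`. Limiting the depth keeps the output manageable for a full book.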

In [None]:
# To simply view the syntax tree
print(my_parser.to_syntax_tree())