## Text Processing Services

### textwrap — Text wrapping and filling

The `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class that does all the work. If you’re just wrapping or filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of `TextWrapper` for efficiency.
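
As a minimal sketch of the `TextWrapper` route, the cell below configures one wrapper and reuses it for several paragraphs. The width, the indent strings, and the sample paragraphs are illustrative choices, not values from this notebook.

```python
import textwrap

# One wrapper, configured once and reused for every paragraph.
wrapper = textwrap.TextWrapper(width=40, initial_indent="* ", subsequent_indent="  ")

# Sample paragraphs (made up for this illustration).
paragraphs = [
    "The textwrap module provides some convenience functions.",
    "TextWrapper is the class that does all the work, so one instance can serve many calls.",
]

for paragraph in paragraphs:
    print(wrapper.fill(paragraph))
```

Each paragraph comes out as a bullet whose continuation lines are indented by two spaces.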

* **`textwrap.wrap`** - Wraps the single paragraph in `text` (a string) so that every line is at most `width` characters long. Returns a list of output lines, without final newlines. Optional keyword arguments correspond to the instance attributes of `TextWrapper`. `width` defaults to 70.

In [2]:
txt = "    The   `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class that does all the work. If you’re just wrapping or filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of `TextWrapper` for efficiency"
print(txt)

    The   `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class that does all the work. If you’re just wrapping or filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of `TextWrapper` for efficiency


In [3]:
import textwrap
print(textwrap.wrap(txt))

['    The   `textwrap` module provides some convenience functions, as', 'well as `TextWrapper`, the class that does all the work. If you’re', 'just wrapping or filling one or two text strings, the convenience', 'functions should be good enough; otherwise, you should use an instance', 'of `TextWrapper` for efficiency']


In [4]:
# Default max width is 70 characters

for t in textwrap.wrap(txt):
    print(t)

    The   `textwrap` module provides some convenience functions, as
well as `TextWrapper`, the class that does all the work. If you’re
just wrapping or filling one or two text strings, the convenience
functions should be good enough; otherwise, you should use an instance
of `TextWrapper` for efficiency


In [5]:
for t in textwrap.wrap(txt, width=100):
    print(t)

    The   `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class
that does all the work. If you’re just wrapping or filling one or two text strings, the convenience
functions should be good enough; otherwise, you should use an instance of `TextWrapper` for
efficiency


In [6]:
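# initial_indent counts toward the width of the first line, so the text after the header's "\n" wraps early
for t in textwrap.wrap(txt.strip(), width=60, initial_indent="****** License ******   \n"):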
    print(t)

****** License ******   
The   `textwrap` module provides
some convenience functions, as well as `TextWrapper`, the
class that does all the work. If you’re just wrapping or
filling one or two text strings, the convenience functions
should be good enough; otherwise, you should use an instance
of `TextWrapper` for efficiency


* **`textwrap.fill`** - Wraps the single paragraph in `text` and returns a single string containing the wrapped paragraph. `fill()` is shorthand for `"\n".join(wrap(text, ...))`.

In [7]:
x = textwrap.fill(txt.strip(), width=60, initial_indent="****** License ******   \n")
print(type(x))
print(x)

<class 'str'>
****** License ******   
The   `textwrap` module provides
some convenience functions, as well as `TextWrapper`, the
class that does all the work. If you’re just wrapping or
filling one or two text strings, the convenience functions
should be good enough; otherwise, you should use an instance
of `TextWrapper` for efficiency
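
The relationship between `fill()` and `wrap()` can be checked directly: joining the list returned by `wrap()` with newlines should give the same string as `fill()`. The sample string and width below are arbitrary.

```python
import textwrap

sample = ("The textwrap module provides some convenience functions, "
          "as well as TextWrapper, the class that does all the work.")

wrapped_lines = textwrap.wrap(sample, width=40)  # list of lines
filled = textwrap.fill(sample, width=40)         # single string

# fill() yields the same lines as wrap(), joined with newlines.
print(filled == "\n".join(wrapped_lines))
```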


* **`textwrap.shorten`** - Collapses and truncates the given `text` to fit in the given `width`. First the whitespace in `text` is collapsed (all whitespace is replaced by single spaces). If the result fits in `width`, it is returned. Otherwise, enough words are dropped from the end so that the remaining words plus the `placeholder` (by default ` [...]`) fit within `width`.

In [10]:
print(textwrap.shorten(txt, width=60))
print(textwrap.shorten(txt, width=50))
s = textwrap.shorten(txt, width=25, placeholder=" ...")
print(s)
print(len(s))
s = textwrap.shorten(txt, width=20, placeholder=" ->")
print(s)
print(len(s))

The `textwrap` module provides some convenience [...]
The `textwrap` module provides some [...]
The `textwrap` module ...
25
The `textwrap` ->
17


* **`textwrap.dedent`** - Removes any common leading whitespace from every line in `text`, preserving the relative indentation between individual lines.

In [11]:
txt = """   
   This is a text 
    This is good 
  but not better.... \t 
"""
print(txt)

   
   This is a text 
    This is good 
  but not better.... 	 



In [9]:
print(textwrap.dedent(txt))


 This is a text 
  This is good 
but not better.... 	 
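
A common use of `dedent` is to keep a triple-quoted string aligned with the surrounding code while still producing left-aligned output. The sketch below assumes a hypothetical `usage_message` helper and a made-up command named `tool`; neither comes from this notebook.

```python
import textwrap

def usage_message():
    # The literal is indented to match the code; dedent() strips the
    # margin shared by all non-blank lines and keeps relative indentation.
    message = """\
        Usage: tool [options]
          -h  show help
          -v  verbose output
    """
    return textwrap.dedent(message)

print(usage_message())
```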



* **`textwrap.indent`** - Adds `prefix` to the beginning of selected lines in `text`; by default, to every line that does not consist solely of whitespace.

In [13]:
print(textwrap.indent(txt.strip(), '$ '))

$ This is a text 
$     This is good 
$   but not better....
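
Which lines count as "selected" can be controlled with the `predicate` argument of `textwrap.indent`. The sketch below compares the default behaviour with a predicate that accepts every line; the sample text is made up.

```python
import textwrap

text = "first line\n\nsecond line\n"

# Default: lines consisting solely of whitespace get no prefix.
default_result = textwrap.indent(text, "> ")
print(default_result)

# With a predicate that always returns True, the empty line is prefixed too.
all_lines_result = textwrap.indent(text, "> ", predicate=lambda line: True)
print(all_lines_result)
```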


### unicodedata

The `unicodedata` module exposes Unicode character properties such as the general category and the official name, as shown in the example below.

In [14]:
import unicodedata

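# Walk a slice of the Devanagari block and print each character's
# index, code point, general category, glyph, and Unicode name.
for i, code_point in enumerate(range(0x0958, 0x0968)):
    c = chr(code_point)
    print(i, '%04x' % code_point, unicodedata.category(c), end=" : ")
    print(c, unicodedata.name(c))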

0 0958 Lo : क़ DEVANAGARI LETTER QA
1 0959 Lo : ख़ DEVANAGARI LETTER KHHA
2 095a Lo : ग़ DEVANAGARI LETTER GHHA
3 095b Lo : ज़ DEVANAGARI LETTER ZA
4 095c Lo : ड़ DEVANAGARI LETTER DDDHA
5 095d Lo : ढ़ DEVANAGARI LETTER RHA
6 095e Lo : फ़ DEVANAGARI LETTER FA
7 095f Lo : य़ DEVANAGARI LETTER YYA
8 0960 Lo : ॠ DEVANAGARI LETTER VOCALIC RR
9 0961 Lo : ॡ DEVANAGARI LETTER VOCALIC LL
10 0962 Mn : ॢ DEVANAGARI VOWEL SIGN VOCALIC L
11 0963 Mn : ॣ DEVANAGARI VOWEL SIGN VOCALIC LL
12 0964 Po : । DEVANAGARI DANDA
13 0965 Po : ॥ DEVANAGARI DOUBLE DANDA
14 0966 Nd : ० DEVANAGARI DIGIT ZERO
15 0967 Nd : १ DEVANAGARI DIGIT ONE
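
The same module also works in the other direction and exposes further properties. As a small sketch using two of the characters listed above, the cell below looks a character up by name, reads its decimal value, and inspects the canonical decomposition of DEVANAGARI LETTER QA.

```python
import unicodedata

# Name -> character, then a few of its properties.
one = unicodedata.lookup("DEVANAGARI DIGIT ONE")
print(one, hex(ord(one)))
print(unicodedata.decimal(one))    # value of a decimal digit character
print(unicodedata.category(one))   # 'Nd' = Number, decimal digit

# U+0958 has a canonical decomposition into a base letter plus a nukta sign.
qa = "\u0958"
print(unicodedata.decomposition(qa))
decomposed = unicodedata.normalize("NFD", qa)
print([unicodedata.name(ch) for ch in decomposed])
```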


### print

In [15]:
def print_format_table():
    """
    Print a table of ANSI escape codes, each rendered in its own
    style;foreground;background combination.
    """
    for style in range(8):
        for fg in range(30, 38):
            s1 = ''
            for bg in range(40, 48):
                # `code` avoids shadowing the built-in format()
                code = ';'.join([str(style), str(fg), str(bg)])
                s1 += '\x1b[%sm %s \x1b[0m' % (code, code)
            print(s1)
        print('\n')

print_format_table()

[0;30;40m 0;30;40 [0m[0;30;41m 0;30;41 [0m[0;30;42m 0;30;42 [0m[0;30;43m 0;30;43 [0m[0;30;44m 0;30;44 [0m[0;30;45m 0;30;45 [0m[0;30;46m 0;30;46 [0m[0;30;47m 0;30;47 [0m
[0;31;40m 0;31;40 [0m[0;31;41m 0;31;41 [0m[0;31;42m 0;31;42 [0m[0;31;43m 0;31;43 [0m[0;31;44m 0;31;44 [0m[0;31;45m 0;31;45 [0m[0;31;46m 0;31;46 [0m[0;31;47m 0;31;47 [0m
[0;32;40m 0;32;40 [0m[0;32;41m 0;32;41 [0m[0;32;42m 0;32;42 [0m[0;32;43m 0;32;43 [0m[0;32;44m 0;32;44 [0m[0;32;45m 0;32;45 [0m[0;32;46m 0;32;46 [0m[0;32;47m 0;32;47 [0m
[0;33;40m 0;33;40 [0m[0;33;41m 0;33;41 [0m[0;33;42m 0;33;42 [0m[0;33;43m 0;33;43 [0m[0;33;44m 0;33;44 [0m[0;33;45m 0;33;45 [0m[0;33;46m 0;33;46 [0m[0;33;47m 0;33;47 [0m
[0;34;40m 0;34;40 [0m[0;34;41m 0;34;41 [0m[0;34;42m 0;34;42 [0m[0;34;43m 0;34;43 [0m[0;34;44m 0;34;44 [0m[0;34;45m 0;34;45 [0m[0;34;46m 0;34;46 [0m[0;34;47m 0;34;47 [0m
[0;35;40m 0;35;40 [0m[0;35;41m 0;35;41 [0m[0;35;42m 0;35;42 [0m[0;35

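Each cell of the table above is an ANSI escape sequence of the form `ESC[style;fg;bgm`, followed by `ESC[0m` to reset the attributes. A minimal sketch of a reusable helper follows; the name `colorize` and its defaults are hypothetical, and the colours only appear on terminals that interpret these sequences.

```python
def colorize(text, style=0, fg=37, bg=40):
    """Wrap text in an ANSI escape sequence and reset attributes afterwards.

    `colorize` is a hypothetical helper written for this illustration.
    """
    return f"\x1b[{style};{fg};{bg}m{text}\x1b[0m"

# Style 1 with foreground 33 and background 40, as in the table's numbering.
print(colorize("warning", style=1, fg=33, bg=40))
```
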
In [18]:
mystr = """The   `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class that does all the work. If you’re just wrapping or
filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of `TextWrapper` for efficiency"""

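# mystr is longer than both widths, so ljust() and rjust() return it unchanged (no padding is added)
print(mystr.ljust(100, "."))
print(mystr.rjust(140, "~"))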

The   `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class that does all the work. If you’re just wrapping or
filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of `TextWrapper` for efficiency
The   `textwrap` module provides some convenience functions, as well as `TextWrapper`, the class that does all the work. If you’re just wrapping or
filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of `TextWrapper` for efficiency
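
Padding only becomes visible when the requested width exceeds the length of the string. A short sketch with an 8-character string:

```python
label = "textwrap"  # 8 characters

print(repr(label.ljust(12, ".")))   # 'textwrap....'
print(repr(label.rjust(12, "~")))   # '~~~~textwrap'
print(repr(label.center(12, "*")))  # '**textwrap**'

# A width no larger than the string length returns the string unchanged.
print(label.ljust(4, ".") == label)  # True
```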
