# Did You Know that Numpy Works with Strings?


### 5 Useful functions to Work with String Arrays

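NumPy is best known for numerical computing, but its `np.char` module also provides vectorized string operations that act element-wise on arrays of strings. This notebook covers five of them: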

- np.char.add()
- np.char.capitalize()
- np.char.rjust()
- np.char.zfill()
- np.char.startswith()

Let's begin by importing Numpy and listing out the functions covered in this notebook.

In [2]:
import numpy as np

# List of functions explained 
1. np.char.add()
2. np.char.capitalize()
3. np.char.rjust()
4. np.char.zfill()
5. np.char.startswith()

## Function 1 - np.char.add(x1, x2)

Returns the element-wise concatenation of string arrays x1 and x2, which must have the same shape or shapes that can be broadcast together.

In [7]:
# Example 1
arr1 = np.array(["Javier ", "Gender: "])

arr2 = np.array(["Piedragil", "Male "])

print(np.char.add(arr1, arr2))

['Javier Piedragil' 'Gender: Male ']


This example shows how the concatenation works "element-wise".

In [18]:
# Example 2 - working
names = np.array([["Javier ", "Nancy "], 
                 ["Molly ", "Neil "]]
               ) 

last_names = np.array([["Piedragil", "Drew"], 
                       ["Hatchet", "Peart"]]
                     ) 

print(np.char.add(names, last_names))

[['Javier Piedragil' 'Nancy Drew']
 ['Molly Hatchet' 'Neil Peart']]


Concatenating two 2D arrays shows the "element-wise" nature of the function more clearly.

In [64]:
# Example 3 - breaking (to illustrate when it breaks)
dogs = np.array(["Spot: ", "Fido: "])
        
breeds = np.array(["Boxer", "Bulldog", "Dalmatian"])

print(np.char.add(dogs, breeds))

ValueError: shape mismatch: objects cannot be broadcast to a single shape

The function fails because dogs (shape (2,)) and breeds (shape (3,)) have different shapes that cannot be broadcast together.

Use this function to concatenate strings in an element-wise fashion.
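The "cannot be broadcast" error above suggests that shapes only need to be broadcast-compatible, not identical. The sketch below illustrates this with the same dog data, slightly modified for illustration: a scalar string is applied to every element, and a column of shape (2, 1) is combined with a row of shape (3,).

```python
import numpy as np

# Illustrative data, adapted from the example above.
dogs = np.array(["Spot", "Fido"])
breeds = np.array(["Boxer", "Bulldog", "Dalmatian"])

# A scalar string broadcasts against every element of the array.
labelled = np.char.add(dogs, ": ")

# Shapes (2, 1) and (3,) broadcast to (2, 3):
# each dog label is paired with each breed.
pairs = np.char.add(labelled[:, np.newaxis], breeds)

print(labelled)
print(pairs.shape)
print(pairs)
```

Reshaping one operand with `np.newaxis` is one way to get every combination; when a one-to-one pairing is intended, the two arrays need the same length instead.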

## Function 2 - np.char.capitalize(a)

Returns a copy of "a" with only the first character of each element capitalized.

In [65]:
# Example 1

countries = np.array(["méxico", "india", "china", "argentina", "El salvador"])

print(np.char.capitalize(countries))

['México' 'India' 'China' 'Argentina' 'El salvador']


Capitalizes the first character of each element of the array.

In [67]:
# Example 2
cities_countries = np.array([["venice", "mumbai", "moscow", "beijing"],
                             ["italy", "india", "russia", "china"]
                            ] 
                           )

print(np.char.capitalize(cities_countries))

[['Venice' 'Mumbai' 'Moscow' 'Beijing']
 ['Italy' 'India' 'Russia' 'China']]


This function also works with multi-dimensional arrays.

In [68]:
# Example 3 - breaking (to illustrate when it breaks)
sentence = 4

print(np.char.capitalize(sentence))

TypeError: string operation on non-string array

When passed a non-character argument, the function throws an error.

Use this function to capitalize only the first character of each array element. It does not capitalize every word when an element contains several words.
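When every word in an element should be capitalized, `np.char.title()` is a possible alternative. The short comparison below uses made-up country names and is only a sketch of the difference between the two functions.

```python
import numpy as np

# Illustrative multi-word strings.
countries = np.array(["el salvador", "costa rica", "india"])

# capitalize(): only the first character of each element is upper-cased.
first_only = np.char.capitalize(countries)

# title(): the first character of every word is upper-cased.
every_word = np.char.title(countries)

print(first_only)
print(every_word)
```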

## Function 3 - np.char.rjust(a, width, fillchar=' ')

Returns an array in which each element of a is right-justified in a string of length width, padded on the left with fillchar (a space by default).

In [53]:
# Example 1

totals = np.array(["2", "34", "150", "999", "1000"])

totals_rj = np.char.rjust(totals,4)

for x in totals_rj:
    
    print(x)

   2
  34
 150
 999
1000


Prints a list of strings right-justified, with space as a fill character.

In [54]:
# Example 2 - working
totals = np.array(["2", "34", "150", "999", "1000"])

totals_rj = np.char.rjust(totals, 4, "0")

for x in totals_rj:
    
    print(x)

0002
0034
0150
0999
1000


Prints strings right-justified, with character "0" as a fill character.

In [55]:
# Example 3 - breaking (to illustrate when it breaks)
totals = np.array([2, 34, 150, 999, 1000])

totals_rj = np.char.rjust(totals, 4, "0")

for x in totals_rj:
    
    print(x)

TypeError: string operation on non-string array

This function only accepts arrays whose elements are strings.

This function is useful to display numbers (as strings) in a conventional right-justified fashion.
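One way around the TypeError in the breaking example is to convert the numeric array to strings before padding. The sketch below assumes `astype(str)` is an acceptable conversion for the data at hand; it reuses the totals from the example above.

```python
import numpy as np

# Integer array, as in the breaking example.
totals = np.array([2, 34, 150, 999, 1000])

# Convert the numbers to strings first, then right-justify with "0".
totals_str = totals.astype(str)
totals_rj = np.char.rjust(totals_str, 4, "0")

for x in totals_rj:
    print(x)
```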

## Function 4 - np.char.zfill(a, width)

This function returns a copy of a in which each element is left-filled with zeros to the given width.

In [70]:
# Example 1 - working
numbers = np.array(["1", "2", "3", "4", "5",
                    "6", "7", "8", "9", "10"])

numbers_zf = np.char.zfill(numbers, 2)

for x in numbers_zf:
    
    print(x)

01
02
03
04
05
06
07
08
09
10


Similar to np.char.rjust(), but the fill character is fixed to "0".

In [5]:
# Example 2
numbers = np.array([["1", "2", "3", "4", "5","6", "7", "8", "9", "10"],
                    ["45", "77", "500", "999", "99", "12", "15", "80", "90", "100"]
                   ]
                  )

numbers_zf = np.char.zfill(numbers, 4)

for x in numbers_zf:
    
    for y in x:
        
        print(y)

0001
0002
0003
0004
0005
0006
0007
0008
0009
0010
0045
0077
0500
0999
0099
0012
0015
0080
0090
0100


This function also works with multi-dimensional arrays.

In [9]:
# Example 3 - breaking (to illustrate when it breaks)
numbers = np.array([["4", "2", "3", "4", "5","6", "7", "8", "9", "10"],
                    ["45", "77", "500", "999", "99", "12", "15", "80", "90", "100"]
                   ]
                  )

numbers_zf = np.char.zfill(numbers)

for x in numbers_zf:
    
    for y in x:
        
        print(y)

TypeError: _zfill_dispatcher() missing 1 required positional argument: 'width'

The width argument is not optional.

This function comes in handy when you need to print a neatly aligned list of numbers.
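A difference from `np.char.rjust()` worth knowing shows up with signed numbers: like Python's `str.zfill`, `np.char.zfill()` keeps a leading "+" or "-" in front of the zeros. The values below are invented for illustration.

```python
import numpy as np

# Illustrative signed numbers stored as strings.
values = np.array(["-5", "+7", "42"])

# zfill() inserts the zeros after the sign character.
with_zfill = np.char.zfill(values, 4)

# rjust() with "0" simply pads on the left, so the sign ends up in the middle.
with_rjust = np.char.rjust(values, 4, "0")

print(with_zfill)
print(with_rjust)
```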

## Function 5 - np.char.startswith(a, prefix, start=0, end=None)

This function returns a boolean array that is True where the string element in *a* starts with *prefix*. Optionally, *start* and *end* set the positions within each string where the comparison begins and ends.

In [18]:
# Example 1 - working
urls = np.array(["https://medium.com", "https://github.com", "https:office.com"])

https_urls = np.char.startswith(urls, "https")

print(https_urls)

[ True  True  True]


Returns an array of booleans in which each element is True if the corresponding string in urls begins with "https".

In [30]:
# Example 2 - working
urls = np.array(["https://medium.com", "https://github.com", "https:office.com"])

https_urls = np.char.startswith(urls, ".com", -4)

print(https_urls)

[ True  True  True]


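Returns an array of booleans in which each element is True if the corresponding string in urls ends with ".com". Passing -4 as start makes the comparison begin four characters before the end of each string.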

In [31]:
# Example 3
urls = np.array(["https://medium.com", "https://github.com", "https:office.com"])

https_urls = np.char.startswith(urls, 4)

print(https_urls)

TypeError: startswith first arg must be str or a tuple of str, not numpy.int32

The prefix must be a string.

This function is useful to search for a specific prefix in a string array.
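Because the result is a boolean array, it can be used directly as a mask to filter the original array. The sketch below uses made-up URLs (one of them deliberately non-HTTPS) and also shows `np.char.endswith()`, which checks a suffix without needing a negative start index.

```python
import numpy as np

# Illustrative URLs; the second one is intentionally not HTTPS.
urls = np.array(["https://medium.com", "http://example.org", "https://github.com"])

# Boolean mask: True where the URL starts with "https://".
is_https = np.char.startswith(urls, "https://")

# Boolean indexing keeps only the matching elements.
secure_urls = urls[is_https]

# endswith() tests the suffix directly.
is_com = np.char.endswith(urls, ".com")

print(is_https)
print(secure_urls)
print(is_com)
```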

## Conclusion

In this notebook we reviewed some useful functions for processing strings within a NumPy array. There are many additional functions for working with string arrays, so I encourage you to look into the reference links.

## Reference Links

* Numpy official documentation: https://numpy.org/doc/stable/reference/routines.char.html
* Numpy string functions with examples: https://www.tutorialspoint.com/numpy/numpy_string_functions.htm
