## String
#### Import package 

In [1]:
import numpy as np
import pandas as pd

#### Basic string operation
`str.len(), str.capitalize(), str.swapcase(), str.lower(), str.upper(), str.startswith(), str.index()`

In [2]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str.swapcase()

0      DAVID
1    JEFFREY
2        ZOE
3       LUNA
dtype: object

In [3]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str.len()

0    5
1    7
2    3
3    4
dtype: int64

In [4]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str.startswith('z')

0    False
1    False
2     True
3    False
dtype: bool
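
The other methods listed above follow the same element-wise pattern. The sketch below reuses the same names; `str.find()` is not in the list above and is brought in here only as a non-raising counterpart to `str.index()`.

```python
import pandas as pd

names = pd.Series(['david', 'jeffrey', 'zoe', 'luna'])

# Upper-case the first character of each element, lower-case the rest
capitalized = names.str.capitalize()

# Upper-case every character
uppercased = names.str.upper()

# str.index('e') would raise ValueError here because 'david' has no 'e';
# str.find() returns -1 for elements without the substring instead
position_of_e = names.str.find('e')

print(capitalized.tolist())    # ['David', 'Jeffrey', 'Zoe', 'Luna']
print(uppercased.tolist())     # ['DAVID', 'JEFFREY', 'ZOE', 'LUNA']
print(position_of_e.tolist())  # [-1, 1, 2, -1]
```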

#### Check string attributes
`str.islower(), str.isupper(), str.isnumeric(), str.isdecimal()`

In [5]:
pd.Series(['A', '100', 'B', '200']).str.isnumeric()

0    False
1     True
2    False
3     True
dtype: bool

In [6]:
pd.Series(['A', 'b', 'C', 'd']).str.islower()

0    False
1     True
2    False
3     True
dtype: bool
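
`str.isnumeric()` and `str.isdecimal()` agree on plain digits but differ on other Unicode numerals. A small sketch with sample values chosen purely for illustration:

```python
import pandas as pd

# '\u00bd' is the vulgar fraction one half
values = pd.Series(['100', '\u00bd', '1.5', 'A'])

# isnumeric() accepts any character with a Unicode numeric value
is_numeric = values.str.isnumeric()

# isdecimal() accepts only characters that can form base-10 numbers
is_decimal = values.str.isdecimal()

print(is_numeric.tolist())  # [True, True, False, False]
print(is_decimal.tolist())  # [True, False, False, False]
```

Neither method accepts `'1.5'`, because the decimal point is not a numeric character.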

#### Split string by specific character
`str.split(character)`: split each element on `character`; with no argument, split on whitespace

In [7]:
pd.Series(['david jefferson', 'jeffrey williams']).str.split()

0     [david, jefferson]
1    [jeffrey, williams]
dtype: object

In [8]:
pd.Series(['david-jefferson', 'jeffrey-williams']).str.split('-')

0     [david, jefferson]
1    [jeffrey, williams]
dtype: object
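
When the pieces are needed as separate columns rather than lists, `str.split()` accepts `expand=True`, and `n` limits the number of splits. A minimal sketch; the column names `first` and `last` are illustrative:

```python
import pandas as pd

names = pd.Series(['david-jefferson', 'jeffrey-williams'])

# expand=True returns a DataFrame with one column per piece
parts = names.str.split('-', expand=True)
parts.columns = ['first', 'last']
print(parts['first'].tolist())  # ['david', 'jeffrey']
print(parts['last'].tolist())   # ['jefferson', 'williams']

# n=1 stops after the first split and keeps the remainder intact
limited = pd.Series(['a-b-c']).str.split('-', n=1)
print(limited.tolist())         # [['a', 'b-c']]
```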

#### Regular expression
`str.match(regex)`: calls `re.match()` on each element (the pattern must match at the start of the string) and returns a Boolean Series

In [9]:
pd.Series(['david jefferson', 'jeffrey williams']).str.match(r'\S+\s+j\S+')

0     True
1    False
dtype: bool

`str.contains(regex)`: calls `re.search()` on each element (the pattern may match anywhere in the string) and returns a Boolean Series

In [10]:
pd.Series(['david jefferson', 'jeffrey williams']).str.contains(r'\S+\s+j\S+')

0     True
1    False
dtype: bool
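
The two calls above return the same result for that pattern, so the difference between `re.match()` and `re.search()` is easy to miss. A shorter pattern, chosen here only for illustration, separates them:

```python
import pandas as pd

names = pd.Series(['david jefferson', 'jeffrey williams'])
pattern = r'j\S+'

# match: the pattern has to start at position 0 of each element
anchored = names.str.match(pattern)

# contains: the pattern may start anywhere in each element
anywhere = names.str.contains(pattern)

print(anchored.tolist())  # [False, True]
print(anywhere.tolist())  # [True, True]
```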

`str.extract(regex)`: calls `re.search()` on each element and returns the capture groups of the first match; with a single group and `expand=False` the result is a Series

In [11]:
pd.Series(['david jefferson', 'jeffrey williams']).str.extract(r'^(\S+)\s+\S+', expand=False)

0      david
1    jeffrey
dtype: object
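
With several capture groups, `str.extract()` returns a DataFrame, and named groups become column names. A minimal sketch; the group names `first` and `last` and the extra element `'zoe'` are illustrative:

```python
import pandas as pd

names = pd.Series(['david jefferson', 'jeffrey williams', 'zoe'])

# Each named group becomes a column; elements that do not match yield NaN
parts = names.str.extract(r'^(?P<first>\S+)\s+(?P<last>\S+)')

print(parts.columns.tolist())       # ['first', 'last']
print(parts.loc[0].tolist())        # ['david', 'jefferson']
print(parts.loc[2].isna().all())    # True
```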

`str.findall(regex)`: calls `re.findall()` on each element and returns a Series of lists containing every match

In [12]:
pd.Series(['david jefferson jr.', 'jeffrey williams jr.']).str.findall(r'(\S+)\s+')

0     [david, jefferson]
1    [jeffrey, williams]
dtype: object
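
In the output above, `jr.` is missing because the pattern requires trailing whitespace after each word. A sketch of the same call with a pattern that has no such requirement, checked against plain `re.findall()`:

```python
import re

import pandas as pd

names = pd.Series(['david jefferson jr.', 'jeffrey williams jr.'])

# Without the trailing \s+, the last token of each element is kept too
tokens = names.str.findall(r'\S+')

# The same result computed element by element with the re module
expected = [re.findall(r'\S+', name) for name in names]

print(tokens.tolist())
# [['david', 'jefferson', 'jr.'], ['jeffrey', 'williams', 'jr.']]
print(tokens.tolist() == expected)  # True
```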

#### Access characters by position
`str.get(index)`

In [13]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str.get(0)

0    d
1    j
2    z
3    l
dtype: object

`series.str[index]`

In [14]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str[0]

0    d
1    j
2    z
3    l
dtype: object
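
`str.get()` also indexes into lists, such as those produced by `str.split()`, and yields `NaN` where the position does not exist. A small sketch with sample data chosen for illustration:

```python
import pandas as pd

names = pd.Series(['david jefferson', 'zoe'])

# Second word of each element; 'zoe' has only one word, so the result is NaN
last_names = names.str.split().str.get(1)
print(last_names.iloc[0])           # jefferson
print(pd.isna(last_names.iloc[1]))  # True

# Negative positions count from the end of each string
last_chars = names.str.get(-1)
print(last_chars.tolist())          # ['n', 'e']
```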

`str.slice(start, stop, step)`

In [15]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str.slice(0, 3, 2)

0    dv
1    jf
2    ze
3    ln
dtype: object

`series.str[start:stop:step]`

In [16]:
pd.Series(['david', 'jeffrey', 'zoe', 'luna']).str[0:3:2]

0    dv
1    jf
2    ze
3    ln
dtype: object
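
Slicing follows ordinary Python slice semantics, so bounds can be omitted or negative. A short sketch on the same names:

```python
import pandas as pd

names = pd.Series(['david', 'jeffrey', 'zoe', 'luna'])

# Last two characters of each element
last_two = names.str[-2:]

# The same slice written with the method form
last_two_method = names.str.slice(start=-2)

# A step of -1 reverses each string
reversed_names = names.str[::-1]

print(last_two.tolist())        # ['id', 'ey', 'oe', 'na']
print(reversed_names.tolist())  # ['divad', 'yerffej', 'eoz', 'anul']
```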