#### Pandas Tutorial - Part 52

This notebook covers various Series string methods including:
- Extracting elements with `str.get()`
- Finding substrings with `str.index()`
- Joining strings with `str.join()`
- Removing characters with `str.strip()`
- Changing case with `str.swapcase()` and related case methods

In [1]:
import pandas as pd
import numpy as np

##### Extracting Elements with `str.get()`

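The `str.get()` method extracts the element at a given position from each value in the Series. It works on strings, lists, and tuples; for dictionaries, the argument is looked up as a key, and a missing key yields `None`. Values that cannot be indexed, such as integers, produce `NaN`.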

In [2]:
# Create a Series with different types of elements
s = pd.Series(["String",
               (1, 2, 3),
               ["a", "b", "c"],
               123,
               -456,
               {1: "Hello", "2": "World"}])
print("Original Series:")
print(s)

Original Series:
0                        String
1                     (1, 2, 3)
2                     [a, b, c]
3                           123
4                          -456
5    {1: 'Hello', '2': 'World'}
dtype: object


In [3]:
# Extract element at position 1
result = s.str.get(1)
print("Result of get(1):")
print(result)

Result of get(1):
0        t
1        2
2        b
3      NaN
4      NaN
5    Hello
dtype: object


In [4]:
# Extract element at position -1 (last element)
result_last = s.str.get(-1)
print("Result of get(-1):")
print(result_last)

Result of get(-1):
0       g
1       3
2       c
3     NaN
4     NaN
5    None
dtype: object


In [5]:
# Create a Series with strings only
s_strings = pd.Series(['apple', 'banana', 'cherry'])
print("Series with strings:")
print(s_strings)

Series with strings:
0     apple
1    banana
2    cherry
dtype: object


In [6]:
# Extract first character
first_char = s_strings.str.get(0)
print("First character of each string:")
print(first_char)

First character of each string:
0    a
1    b
2    c
dtype: object


In [7]:
# Extract third character
third_char = s_strings.str.get(2)
print("Third character of each string:")
print(third_char)

Third character of each string:
0    p
1    n
2    e
dtype: object


In [8]:
# Create a Series with lists
s_lists = pd.Series([
    [1, 2, 3, 4],
    ['a', 'b', 'c'],
    [True, False, True]
])
print("Series with lists:")
print(s_lists)

Series with lists:
0           [1, 2, 3, 4]
1              [a, b, c]
2    [True, False, True]
dtype: object


In [9]:
# Extract second element from each list
second_elem = s_lists.str.get(1)
print("Second element from each list:")
print(second_elem)

Second element from each list:
0        2
1        b
2    False
dtype: object
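The cells above always ask for a position that exists. As a minimal sketch of the remaining cases, the example below requests a position that is out of range for one string and looks up dictionary keys by name. The Series `words` and `records` are made up for illustration.

```python
import pandas as pd

# Hypothetical data: 'fig' is too short to have a character at position 3
words = pd.Series(['apple', 'fig', 'kiwi'])
fourth_char = words.str.get(3)
print(fourth_char)  # 'fig' yields NaN instead of raising IndexError

# Hypothetical data: for dictionaries, the argument is used as a key
records = pd.Series([{'name': 'Ann', 'age': 30}, {'name': 'Bob'}])
names = records.str.get('name')
ages = records.str.get('age')
print(names)
print(ages)  # the second record has no 'age' key, so its value is missing
```

Because out-of-range positions and missing keys do not raise, it can be worth checking the result with `isna()` before relying on it.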


##### Finding Substrings with `str.index()`

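The `str.index()` method returns the lowest index at which the substring is found in each string, optionally restricted to the slice between `start` and `end`. If the substring is missing from even one element, the whole call raises a `ValueError`. That is why every cell below prints an error: in each case at least one of the three strings does not contain the substring within the searched range.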

In [10]:
# Create a Series with strings
s = pd.Series(['apple', 'banana', 'cherry'])
print("Original Series:")
print(s)

Original Series:
0     apple
1    banana
2    cherry
dtype: object


In [11]:
# Find index of substring 'a'
try:
    result = s.str.index('a')
    print("Result of index('a'):")
    print(result)
except ValueError as e:
    print(f"Error: {e}")

Error: substring not found


In [12]:
# Find index of substring 'an'
try:
    result_an = s.str.index('an')
    print("Result of index('an'):")
    print(result_an)
except ValueError as e:
    print(f"Error: {e}")

Error: substring not found


In [13]:
# Find index of substring 'z'
try:
    result_z = s.str.index('z')
    print("Result of index('z'):")
    print(result_z)
except ValueError as e:
    print(f"Error: {e}")

Error: substring not found


In [14]:
# Find index of substring 'a' with start index
try:
    result_start = s.str.index('a', 1)
    print("Result of index('a', 1):")
    print(result_start)
except ValueError as e:
    print(f"Error: {e}")

Error: substring not found


In [15]:
# Find index of substring 'a' with start and end indices
try:
    result_start_end = s.str.index('a', 1, 3)
    print("Result of index('a', 1, 3):")
    print(result_start_end)
except ValueError as e:
    print(f"Error: {e}")

Error: substring not found
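Every call above raises because at least one string lacks the substring. The sketch below shows two ways to get positions back, under the assumption that you want to tolerate or filter out the non-matching rows: `str.find()`, which reports a miss as -1, and `str.index()` applied only to rows selected with `str.contains()`. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry'])

# str.find() returns -1 for a missing substring instead of raising
found = s.str.find('a')
print(found.tolist())  # [0, 1, -1]

# str.index() succeeds once every remaining element contains the substring
has_a = s[s.str.contains('a', regex=False)]
positions = has_a.str.index('a')
print(positions.tolist())  # [0, 1]

# It still raises as soon as one element lacks the substring
try:
    s.str.index('a')
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

Use `str.index()` when a missing substring should be treated as a data error, and `str.find()` when a miss is an expected outcome.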


##### Joining Strings with `str.join()`

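The `str.join()` method joins the list stored in each element of the Series into a single string, using the given delimiter. If a list contains any non-string item, the result for that row is `NaN`.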

In [16]:
# Create a Series with lists of strings
s = pd.Series([
    ['lion', 'elephant', 'zebra'],
    ['cat', 'dog', 'mouse'],
    ['apple', 'banana', 'cherry']
])
print("Original Series:")
print(s)

Original Series:
0    [lion, elephant, zebra]
1          [cat, dog, mouse]
2    [apple, banana, cherry]
dtype: object


In [17]:
# Join with comma
result = s.str.join(', ')
print("Result of join(', '):")
print(result)

Result of join(', '):
0    lion, elephant, zebra
1          cat, dog, mouse
2    apple, banana, cherry
dtype: object


In [18]:
# Join with hyphen
result_hyphen = s.str.join('-')
print("Result of join('-'):")
print(result_hyphen)

Result of join('-'):
0    lion-elephant-zebra
1          cat-dog-mouse
2    apple-banana-cherry
dtype: object


In [19]:
# Join with empty string
result_empty = s.str.join('')
print("Result of join(''):")
print(result_empty)

Result of join(''):
0    lionelephantzebra
1          catdogmouse
2    applebananacherry
dtype: object


In [20]:
# Create a Series with lists that contain non-string elements
s_mixed = pd.Series([
    ['lion', 'elephant', 'zebra'],
    [1.1, 2.2, 3.3],
    ['apple', 'banana', 'cherry']
])
print("Series with mixed element types:")
print(s_mixed)

Series with mixed element types:
0    [lion, elephant, zebra]
1            [1.1, 2.2, 3.3]
2    [apple, banana, cherry]
dtype: object


In [21]:
# Join with comma
result_mixed = s_mixed.str.join(', ')
print("Result of join(', ') with mixed types:")
print(result_mixed)

Result of join(', ') with mixed types:
0    lion, elephant, zebra
1                      NaN
2    apple, banana, cherry
dtype: object
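If the `NaN` for the numeric row is not what you want, one possible workaround is to convert each item to a string before joining. The sketch below uses `Series.apply()` with Python's built-in `str.join()`; it is one option under that assumption, not the only way to handle mixed lists.

```python
import pandas as pd

s_mixed = pd.Series([
    ['lion', 'elephant', 'zebra'],
    [1.1, 2.2, 3.3],
    ['apple', 'banana', 'cherry']
])

# Convert every item to str first, then join with the delimiter
joined = s_mixed.apply(lambda items: ', '.join(str(item) for item in items))
print(joined.tolist())
# ['lion, elephant, zebra', '1.1, 2.2, 3.3', 'apple, banana, cherry']
```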


##### Removing Characters with `str.strip()`

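The `str.strip()` method removes leading and trailing whitespace (including newlines and tabs) from each string in the Series. If you pass a string argument, it is treated as a set of characters to remove from both ends rather than as a literal prefix or suffix. `str.lstrip()` and `str.rstrip()` do the same for the left or right end only.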

In [22]:
# Create a Series with strings that have whitespace and special characters
s = pd.Series(['1. Ant.             ', '2. Bee!\n', '3. Cat?\t', np.nan])
print("Original Series:")
print(s)

Original Series:
0    1. Ant.             
1               2. Bee!\n
2               3. Cat?\t
3                     NaN
dtype: object


In [23]:
# Strip whitespace
result = s.str.strip()
print("Result of strip():")
print(result)

Result of strip():
0    1. Ant.
1    2. Bee!
2    3. Cat?
3        NaN
dtype: object


In [24]:
# Strip specific characters from the left
result_lstrip = s.str.lstrip('123.')
print("Result of lstrip('123.'):")
print(result_lstrip)

Result of lstrip('123.'):
0     Ant.             
1                Bee!\n
2                Cat?\t
3                   NaN
dtype: object


In [25]:
# Strip specific characters from the right
result_rstrip = s.str.rstrip('.!? \n\t')
print("Result of rstrip('.!? \\n\\t'):")
print(result_rstrip)

Result of rstrip('.!? \n\t'):
0    1. Ant
1    2. Bee
2    3. Cat
3       NaN
dtype: object


In [26]:
# Strip specific characters from both sides
result_both = s.str.strip('123.!? \n\t')
print("Result of strip('123.!? \\n\\t'):")
print(result_both)

Result of strip('123.!? \n\t'):
0    Ant
1    Bee
2    Cat
3    NaN
dtype: object


In [27]:
# Create a Series with strings that have specific characters to strip
s_special = pd.Series(['###hello###', '***world***', '===python==='])
print("Series with special characters:")
print(s_special)

Series with special characters:
0     ###hello###
1     ***world***
2    ===python===
dtype: object


In [28]:
# Strip specific characters
result_special = s_special.str.strip('#*=')
print("Result of strip('#*='):")
print(result_special)

Result of strip('#*='):
0     hello
1     world
2    python
dtype: object
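Because the argument is a set of characters, stripping can remove more than intended when the text itself starts or ends with one of those characters. The sketch below illustrates this with two made-up host names.

```python
import pandas as pd

# Hypothetical data: try to drop 'www.' and '.com' with a single strip() call
hosts = pd.Series(['www.example.com', 'www.mozilla.com'])
stripped = hosts.str.strip('w.com')
print(stripped.tolist())  # ['example', 'zilla']
```

The leading `mo` of `mozilla` is removed because `m` and `o` are both in the character set. When you need to remove an exact prefix or suffix, a set-based strip is the wrong tool.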


##### Changing Case with `str.swapcase()`

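The `str.swapcase()` method converts uppercase characters to lowercase and lowercase characters to uppercase. For comparison, this section first shows the related methods `str.lower()`, `str.upper()`, `str.title()`, and `str.capitalize()`.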

In [29]:
# Create a Series with strings of different cases
s = pd.Series(['lower', 'CAPITALS', 'this is a sentence', 'SwApCaSe'])
print("Original Series:")
print(s)

Original Series:
0                 lower
1              CAPITALS
2    this is a sentence
3              SwApCaSe
dtype: object


In [30]:
# Convert to lowercase
result_lower = s.str.lower()
print("Result of lower():")
print(result_lower)

Result of lower():
0                 lower
1              capitals
2    this is a sentence
3              swapcase
dtype: object


In [31]:
# Convert to uppercase
result_upper = s.str.upper()
print("Result of upper():")
print(result_upper)

Result of upper():
0                 LOWER
1              CAPITALS
2    THIS IS A SENTENCE
3              SWAPCASE
dtype: object


In [32]:
# Convert to title case
result_title = s.str.title()
print("Result of title():")
print(result_title)

Result of title():
0                 Lower
1              Capitals
2    This Is A Sentence
3              Swapcase
dtype: object


In [33]:
# Capitalize (first character uppercase, rest lowercase)
result_capitalize = s.str.capitalize()
print("Result of capitalize():")
print(result_capitalize)

Result of capitalize():
0                 Lower
1              Capitals
2    This is a sentence
3              Swapcase
dtype: object


In [34]:
# Swap case
result_swapcase = s.str.swapcase()
print("Result of swapcase():")
print(result_swapcase)

Result of swapcase():
0                 LOWER
1              capitals
2    THIS IS A SENTENCE
3              sWaPcAsE
dtype: object


In [35]:
# Create a Series with mixed case strings
s_mixed = pd.Series(['Hello World', 'Python IS Fun', '123ABC', 'aBcDeF'])
print("Series with mixed case strings:")
print(s_mixed)

Series with mixed case strings:
0      Hello World
1    Python IS Fun
2           123ABC
3           aBcDeF
dtype: object


In [36]:
# Swap case
result_mixed = s_mixed.str.swapcase()
print("Result of swapcase():")
print(result_mixed)

Result of swapcase():
0      hELLO wORLD
1    pYTHON is fUN
2           123abc
3           AbCdEf
dtype: object
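As a quick check of the behavior shown above, the sketch below confirms that digits and punctuation pass through `str.swapcase()` unchanged, and that applying it twice restores the original text. The round trip is shown for ASCII input only; it is not guaranteed for every Unicode character. The sample strings are made up.

```python
import pandas as pd

s_ascii = pd.Series(['Hello World', '123ABC', 'a-B_c'])

# Only letters change; digits, spaces, and punctuation are left as they are
swapped = s_ascii.str.swapcase()
print(swapped.tolist())  # ['hELLO wORLD', '123abc', 'A-b_C']

# For ASCII text, swapping twice gives back the original strings
round_trip = swapped.str.swapcase()
print(round_trip.tolist() == s_ascii.tolist())  # True
```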


##### Conclusion

In this notebook, we've explored various Series string methods in pandas:

1. `str.get()`: Extracts the element at a given position from each value, which works for strings, lists, and tuples, and looks up a key for dictionaries. Values that cannot be indexed give `NaN`.
2. `str.index()`: Returns the lowest index where the substring is found and raises a `ValueError` if any element lacks it, making it a stricter alternative to `str.find()`, which returns -1 instead.
3. `str.join()`: Joins the list in each element into one string with the given delimiter. Lists containing non-string items give `NaN`.
4. `str.strip()`, `str.lstrip()`, and `str.rstrip()`: Remove whitespace by default, or a given set of characters, from both ends, the left end, or the right end of each string.
5. `str.swapcase()` and the other case methods (`str.lower()`, `str.upper()`, `str.title()`, `str.capitalize()`): Change the case of the letters in each string.

Together, these methods cover common text-cleaning steps in pandas: picking out elements, locating substrings, combining lists into strings, trimming unwanted characters, and normalizing case.