
# 13 | Strings & Lists in Python

## Strings

- A **string** is a sequence of characters enclosed in single quotes (`'Hello'`) or double quotes (`"Hello"`).
- Strings are _immutable_: a string _cannot_ be changed after it has been created.
- Every character in a string has an index position starting from 0.
- Strings are used for text processing, file manipulation and more.

## String Operations

- **Concatenation (`+`)**: Joining two or more strings together.
- **Repetition (`*`)**: Repeating a string a given number of times.

In [5]:
# creating a string
greeting = "Hello, World!"
print(greeting)

# finding data type
print(type(greeting))

# concatenation
subject = "Python"
message = greeting + " Welcome to " + subject + "!"
print(message)

# repetition
print((greeting + " ") * 3)

Hello, World!
<class 'str'>
Hello, World! Welcome to Python!
Hello, World! Hello, World! Hello, World! 
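
One detail the cell above does not show is that `+` only joins a string with another string. The following is a minimal sketch, using the hypothetical variables `label` and `count`, of what happens when a number is involved and two common ways around it:

```python
label = "Attempts: "
count = 3  # an int, not a str

# "+" between a str and an int raises a TypeError
try:
    broken = label + count
except TypeError as error:
    broken = None
    print(f"TypeError: {error}")

# convert explicitly with str(), or let an f-string do the formatting
with_str = label + str(count)
with_fstring = f"{label}{count}"
print(with_str)      # Attempts: 3
print(with_fstring)  # Attempts: 3

# repetition needs an int on one side of "*"
divider = "-" * 10
print(divider)       # ----------
```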


## Indexing

- String indexing is how you access individual characters within a string.
- A character is accessed by writing its index in square brackets `[]` after the string, for example `text[0]`.
- You can use:
  - **positive indexing:** start counting from the beginning of the string, and the first character is always at index 0
  - **negative indexing:** start counting from the end of the string, and the very last character is at index -1

| Character | P | y | t | h | o | n |
| :--- | --- | --- | --- | --- | --- | --- |
| Positive Index | 0 | 1 | 2 | 3 | 4 | 5 |
| Negative Index | -6 | -5 | -4 | -3 | -2 | -1 |

In [9]:
# accessing characters in a string
text = "Python"
character_2 = text[1] # positive indexing
character_5 = text[-2] # negative indexing

print(f'In the word "{text}", the second character is "{character_2}" and the fifth character is "{character_5}".')

In the word "Python", the second character is "y" and the fifth character is "o".
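
The two index rows in the table are related by a simple rule: index `-k` refers to the same character as index `len(text) - k`. The sketch below checks that rule and shows what happens when an index falls outside the valid range. The variable names are chosen here for illustration only.

```python
text = "Python"
length = len(text)  # 6

# a negative index -k refers to the same character as index len(text) - k
from_the_end = text[-2]
from_the_start = text[length - 2]
print(from_the_end, from_the_start)  # o o

# valid indices run from -len(text) to len(text) - 1;
# anything outside that range raises an IndexError
try:
    text[length]
    out_of_range = False
except IndexError:
    out_of_range = True
print(f"Index {length} is out of range: {out_of_range}")
```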


## Substrings & Slicing

Slicing extracts a portion of a string using the syntax `[start:stop:step]`:

- `start` is the index where the slice begins; it is inclusive and defaults to the beginning of the string
- `stop` is the index where the slice ends; it is exclusive and defaults to the end of the string
- `step` is the distance between the characters that are taken; it defaults to `1`, meaning no characters are skipped. A negative `step` walks the string from right to left, and the defaults for `start` and `stop` swap accordingly

| Character | H | e | l | l | o | , | (space) | W | o | r | l | d | ! |
| :--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Positive Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| Negative Index | -13 | -12 | -11 | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |

In [28]:
text = "Hello, World!"

# how many characters in the string
print(len(text))

13


In [29]:
# omitting start, stop and step returns the whole string
print(text[::])

Hello, World!


In [30]:
# extract "World!" with positive indexing
# from index 7 of the string up to, but not including, index 13
print(text[7:13])

World!


In [31]:
# extract "Hello" with negative indexing
# from beginning of string up to, but not including, index -8
print(text[:-8])

Hello


In [32]:
# extracting every second character (spaces and punctuation included) with step 2
print(text[::2])

Hlo ol!


In [33]:
# reversing the string with step -1
print(text[::-1])

!dlroW ,olleH
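
Two slicing behaviours are easy to miss in the cells above: out-of-range bounds are clipped rather than raising an error, and a negative step changes which end the slice starts from. A short sketch, with variable names picked for illustration:

```python
text = "Hello, World!"

# unlike indexing, slicing never raises IndexError: out-of-range bounds are clipped
clipped = text[7:100]
empty = text[50:60]
print(repr(clipped))  # 'World!'
print(repr(empty))    # ''

# with a negative step the slice runs from right to left
backwards = text[4::-1]             # from index 4 back to the beginning
last_three_reversed = text[:-4:-1]  # from the end back to, but not including, index -4
print(backwards)            # olleH
print(last_three_reversed)  # !dl
```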


## String Methods & Operators

String methods are built-in functions that allow you to **manipulate a string**. A key rule is that they do **not** change the original string. Instead, they **return a new string** with the changes.

- `upper()` and `lower()`: Change the case of a string.
- `strip()`: Removes leading and trailing whitespace.
- `replace()`: Finds and replaces a substring.
- `split()`: Breaks a string into a list of substrings based on a delimiter.
- `find()`: Returns the index where a substring first starts, or `-1` if it is not found.

The `in` operator is a simple and efficient way to check if a substring exists within a string. It returns `True` or `False`.

In [35]:
# String methods in action
text = " Hello, World! "

# .strip() returns a new string without leading/trailing spaces
stripped_text = text.strip()
print(f'"{text}" becomes "{stripped_text}"')

" Hello, World! " becomes "Hello, World!"


In [37]:
# .replace() returns a new string with replacements
# original string is unchanged, so leading/trailing spaces still included
replaced_text = text.replace("World", "All")
print(f'"{text}" becomes "{replaced_text}"')

" Hello, World! " becomes " Hello, All! "


In [38]:
# repeat .replace() on the new string stripped of leading/trailing spaces
replaced_text = stripped_text.replace("World", "All")
print(f'"{stripped_text}" becomes "{replaced_text}"')

"Hello, World!" becomes "Hello, All!"


In [39]:
# .upper() and .lower() change the case of the string.
uppercase_text = stripped_text.upper()
lowercase_text = stripped_text.lower()
print(f".upper() result on stripped string: '{uppercase_text}'")
print(f".lower() result on stripped string: '{lowercase_text}'")

.upper() result on stripped string: 'HELLO, WORLD!'
.lower() result on stripped string: 'hello, world!'


In [40]:
# .find() locates the first occurrence of a substring and returns its index
# returns -1 if the substring is not found.
find_index = stripped_text.find("World")
find_not_found = stripped_text.find("planet")
print(f".find('World') result: {find_index}")
print(f".find('planet') result: {find_not_found}")

.find('World') result: 7
.find('planet') result: -1
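
Because `-1` is also a valid negative index, the result of `.find()` is best checked before it is used. The sketch below illustrates this and compares `.find()` with the closely related `str.index()`, which is not covered in this notebook and raises a `ValueError` instead of returning `-1`.

```python
text = "Hello, World!"

position = text.find("World")  # 7
missing = text.find("planet")  # -1

# check for -1 before using the result as an index
if missing == -1:
    print("'planet' does not occur in the text")

# str.index() behaves like find() but raises ValueError when nothing is found
try:
    text.index("planet")
    index_raised = False
except ValueError:
    index_raised = True
print(f"index() raised ValueError: {index_raised}")

# find() reports only the FIRST occurrence
first_l = text.find("l")
print(first_l)  # 2
```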


In [42]:
# .split() breaks a string into a list of substrings based on a delimiter
split_list = stripped_text.split(", ")
print(f".split(', ') result on stripped string: {split_list}")

.split(', ') result on stripped string: ['Hello', 'World!']
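
The choice of delimiter changes the result noticeably. As a rough sketch, the example below contrasts `.split()` with and without an argument, and uses `str.join()` (a method not listed above) to glue the pieces back together. The sample strings are made up for illustration.

```python
sentence = "  Python   is fun  "

# with no argument, split() splits on any run of whitespace and drops empty pieces
words = sentence.split()
print(words)  # ['Python', 'is', 'fun']

# with an explicit delimiter every occurrence counts, so empty strings can appear
pieces = "a,,b".split(",")
print(pieces)  # ['a', '', 'b']

# str.join() is the reverse operation: it joins a list of strings with a separator
rejoined = " ".join(words)
print(rejoined)  # Python is fun
```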


In [43]:
# 'in' operator checks for the existence of a substring
# returns a boolean value (True or False).
print(f"Is 'World' in the stripped string? {'World' in stripped_text}")
print(f"Is 'world' in the stripped string? {'world' in stripped_text}") # case sensitivity
print(f"Is 'All' in the stripped string? {'All' in stripped_text}")

Is 'World' in the stripped string? True
Is 'world' in the stripped string? False
Is 'All' in the stripped string? False


In [45]:
# write a function to check if a substring exists in a text
def check_substring(text, substring):
  if substring in text:
    return f"'{substring}' found in text."
  else:
    return f"'{substring}' not found in text."

text = "Hello, World!"
substring = "World"

print(check_substring(text, substring))

# reuse variables
text = "Python is fun!"
substring = "Java"

print(check_substring(text, substring))

'World' found in text.
'Java' not found in text.
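
Since the `in` operator is case-sensitive, a search for `"world"` would fail on `"Hello, World!"`. One possible way around this, sketched here with a hypothetical helper named `check_substring_ignore_case`, is to lowercase both strings before comparing:

```python
def check_substring_ignore_case(text, substring):
    """Return True if substring occurs in text, ignoring upper/lower case."""
    return substring.lower() in text.lower()

print(check_substring_ignore_case("Hello, World!", "world"))  # True
print(check_substring_ignore_case("Python is fun!", "Java"))  # False
```

This variant returns a boolean rather than a message, which makes it easier to reuse inside an `if` statement.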


## Lists

- Lists are **ordered**, **mutable** collections of items.
- They can contain various objects of different data types.
- Lists are created using square brackets `[]`.

In [48]:
# Creating lists with various data types
numbers_list = [73, 5, 19, 43]
mixed_list = ['Bob', 'The Builder', 23, 1.78, 'Carpenter']
empty_list = []

print(f'The numbers list is "{numbers_list}" and is of type "{type(numbers_list)}"')
print(f'The mixed list is "{mixed_list}" and is of type "{type(mixed_list)}"')
print(f'The empty list is "{empty_list}" and is of type "{type(empty_list)}"')

The numbers list is "[73, 5, 19, 43]" and is of type "<class 'list'>"
The mixed list is "['Bob', 'The Builder', 23, 1.78, 'Carpenter']" and is of type "<class 'list'>"
The empty list is "[]" and is of type "<class 'list'>"


In [50]:
# Lists are ordered, so you can access items by their index
print(f'The first item in the mixed list is "{mixed_list[0]}"')
print(f'The third item in the mixed list is "{mixed_list[2]}"')
print(f'The last item in the mixed list is "{mixed_list[-1]}"')

The first item in the mixed list is "Bob"
The third item in the mixed list is "23"
The last item in the mixed list is "Carpenter"


In [52]:
# Lists are MUTABLE, meaning their contents can be changed
mixed_list[1] = "The Baker"
mixed_list[-1] = "Baker"

print(f'The mixed list is now "{mixed_list}"')

The mixed list is now "['Bob', 'The Baker', 23, 1.78, 'Baker']"


In [53]:
# Strings, by contrast, are IMMUTABLE: item assignment raises a TypeError
name = "Bib The Builder"
name[1] = "o"

TypeError: 'str' object does not support item assignment
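
The typo can still be corrected, just not in place: the fix is to build a new string. A minimal sketch of two ways to do so with the tools covered earlier (slicing, concatenation and `.replace()`):

```python
name = "Bib The Builder"

# slicing plus concatenation: everything before index 1, the new character, everything after it
fixed_by_slicing = name[:1] + "o" + name[2:]
print(fixed_by_slicing)  # Bob The Builder

# replace() with a count of 1 changes only the first match
fixed_by_replace = name.replace("i", "o", 1)
print(fixed_by_replace)  # Bob The Builder

# the original string is untouched
print(name)  # Bib The Builder
```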

In [56]:
# Slicing works the same for lists as it does for strings
new_list = ['Alice', 'The Neighbour', 18, 1.60, 'Gossip']
print(new_list[0::2])

['Alice', 18, 'Gossip']
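
A slice always produces a new list, which matters because lists are mutable. The sketch below, using illustrative variable names, contrasts a second name for the same list with a copy made by a full slice. Note that the copy is shallow: nested lists inside it would still be shared.

```python
original = ['Alice', 'The Neighbour', 18, 1.60, 'Gossip']

alias = original         # a second name for the SAME list
shallow_copy = original[:]  # a full slice creates a new list

alias[0] = 'Alicia'         # visible through both names
shallow_copy[-1] = 'Quiet'  # affects the copy only

print(original)      # ['Alicia', 'The Neighbour', 18, 1.6, 'Gossip']
print(shallow_copy)  # ['Alice', 'The Neighbour', 18, 1.6, 'Quiet']
```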


## List Methods

Common list methods include:

- `append()`: Adds a single item to the end of the list.
- `insert()`: Adds an item at a specific index.
- `remove()`: Removes the first occurrence of a specified value.
- `extend()`: Adds all items from another iterable (like a list or tuple) to the end of the list.
- `sort()`: Sorts the list items in place, in ascending order by default.
- `pop()`: Removes and returns the item at a specific index (the last item if no index is given).

In [70]:
# creating a list
team = ["Adam", "Blake", "Donna"]
print(f"Our team is {team}")

Our team is ['Adam', 'Blake', 'Donna']


In [71]:
# .append() adds an item to the end of the list
team.append("Fred")
print(f"After .append('Fred'): {team}")

After .append('Fred'): ['Adam', 'Blake', 'Donna', 'Fred']


In [72]:
# .insert() adds an item at a specific index
team.insert(2, "Chad")
print(f"After .insert(2, 'Chad'): {team}")

After .insert(2, 'Chad'): ['Adam', 'Blake', 'Chad', 'Donna', 'Fred']


In [73]:
# .remove() removes the first matching item by VALUE
team.remove("Blake")
print(f"After .remove('Blake'): {team}")

After .remove('Blake'): ['Adam', 'Chad', 'Donna', 'Fred']


In [74]:
# create a second list
new_team = ["Marlon", "Brandon"]
print(f"The new team is {new_team}")

# .extend() adds all items from another list to the current one
team.extend(new_team)
print(f"After .extend(new_team): {team}")

The new team is ['Marlon', 'Brandon']
After .extend(new_team): ['Adam', 'Chad', 'Donna', 'Fred', 'Marlon', 'Brandon']
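
`.append()` and `.extend()` are easy to mix up when the argument is itself a list. A small sketch of the difference, using two hypothetical lists `team_a` and `team_b` that start out identical:

```python
team_a = ["Adam", "Chad"]
team_b = ["Adam", "Chad"]
newcomers = ["Marlon", "Brandon"]

team_a.extend(newcomers)  # adds each item individually
team_b.append(newcomers)  # adds the whole list as ONE nested item

print(team_a)  # ['Adam', 'Chad', 'Marlon', 'Brandon']
print(team_b)  # ['Adam', 'Chad', ['Marlon', 'Brandon']]
print(len(team_a), len(team_b))  # 4 3
```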


In [75]:
# .sort() sorts the list in ascending order (in-place)
team.sort()
print(f"After .sort(): {team}")

After .sort(): ['Adam', 'Brandon', 'Chad', 'Donna', 'Fred', 'Marlon']
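
Because `.sort()` works in place, it returns `None`, which is a common source of bugs when its result is assigned to a variable. The sketch below contrasts it with the built-in `sorted()` function, which this notebook does not cover and which returns a new list. The mixed-case names are made up to show how case affects ordering.

```python
team = ["Fred", "adam", "Donna", "Chad"]

# sorted() returns a new list and leaves the original untouched
alphabetical = sorted(team)
print(alphabetical)  # ['Chad', 'Donna', 'Fred', 'adam']

# uppercase letters sort before lowercase ones, so 'adam' ends up last;
# key=str.lower compares the names case-insensitively instead
ignoring_case = sorted(team, key=str.lower)
print(ignoring_case)  # ['adam', 'Chad', 'Donna', 'Fred']

# .sort() changes the list itself and returns None
result = team.sort(reverse=True)
print(result)  # None
print(team)    # ['adam', 'Fred', 'Donna', 'Chad']
```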


In [76]:
# .pop() removes an item by INDEX
leaver = team.pop(2)
print(f"After .pop(2), the list is: {team}")
print(f"The item removed was: {leaver}")

After .pop(2), the list is: ['Adam', 'Brandon', 'Donna', 'Fred', 'Marlon']
The item removed was: Chad
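
Both removal methods can fail: `.remove()` raises a `ValueError` when the value is absent, and `.pop()` raises an `IndexError` when the index is out of range. A minimal sketch of guarding against both, using a hypothetical name `"Zoe"` that is not on the team:

```python
team = ["Adam", "Brandon", "Donna", "Fred"]

# without an index, pop() removes and returns the LAST item
last_member = team.pop()
print(last_member)  # Fred

# test with "in" before calling remove() to avoid a ValueError
name = "Zoe"  # hypothetical name that is not on the team
if name in team:
    team.remove(name)
    removed = True
else:
    removed = False
print(f"Removed {name}: {removed}")

# pop() with an index outside the list raises an IndexError
try:
    team.pop(10)
    pop_failed = False
except IndexError:
    pop_failed = True
print(f"pop(10) failed: {pop_failed}")
```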


In [77]:
# The 'in' operator checks for existence
print(f"Is 'Jack' in the team? {'Jack' in team}")
print(f"Is 'Brandon' in the team? {'Brandon' in team}")
print(f"Is 'Chad' not in the team? {'Chad' not in team}")

Is 'Jack' in the team? False
Is 'Brandon' in the team? True
Is 'Chad' not in the team? True


## Exercise

Write a function `contain_vowel(word)` that checks if a given word contains a vowel (a, e, i, o, u). Use the `in` operator and return `True` or `False`.

In [80]:
def contain_vowel(word):
  vowels = 'aeiou'
  for char in word:
    if char in vowels:
      return True
  return False

print(contain_vowel('hello'))
print(contain_vowel('world'))
print(contain_vowel('grr'))
print(contain_vowel('AEIOU')) # uppercase vowels are not matched: the check is case-sensitive

True
True
False
False
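
The last call returns `False` because `'A'` and `'a'` are different characters. One possible variation, sketched here under the hypothetical name `contain_vowel_any_case`, lowercases the word first and uses the built-in `any()` in place of the explicit loop:

```python
def contain_vowel_any_case(word):
    """Return True if word contains at least one vowel, in either case."""
    vowels = 'aeiou'
    # lower() normalises the word so that 'A' is compared as 'a'
    return any(char in vowels for char in word.lower())

print(contain_vowel_any_case('hello'))  # True
print(contain_vowel_any_case('grr'))    # False
print(contain_vowel_any_case('AEIOU'))  # True
```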
