# UNIT 1: Variables and Data Structures

## Table of Contents

1. [Variables](#1.-Variables)  
    1.1. [Working with Variables](#1.1.-Working-with-Variables)  
    1.2. [Data Types](#1.2.-Data-Types)  
    1.3. [Operators](#1.3.-Operators)  
2. [Data Structures](#2.-Data-Structures)  
    2.1. [Lists](#2.1.-Lists)  
    2.2. [Tuples and Sets](#2.2.-Tuples-and-Sets)  
    2.3. [Dictionaries](#2.3.-Dictionaries)  
3. [String Manipulation](#3.-String-Manipulation)  
    3.1. [Slicing](#3.1.-Slicing)  
    3.2. [Case Transformations and Whitespace](#3.2.-Case-Transformations-and-Whitespace)  
    3.3. [Evaluation](#3.3.-Evaluation)  
    3.4. [Concatenation and Formatting](#3.4.-Concatenation-and-Formatting)  
    3.5. [Find and Replace](#3.5.-Find-and-Replace)  
    3.6. [Splitting and Joining](#3.6.-Splitting-and-Joining)  


# 1. Variables

A variable is a storage location associated with a symbolic name that can be used to reference the value stored in it.

## 1.1. Working with Variables

We assign values to variables using the `=` operator. Below, the value `2` is assigned to the name `x`. Afterwards, the value can be referenced by that name, for example to output `2` to the console using the `print()` function.

In [None]:
x = 2
print(x)

The values stored in variables are changeable, for example through addition:

In [None]:
year = 2023
year = year + 1
print(year)

The value of one variable can be easily assigned to another:

In [None]:
year2 = year
print(year2)

Multiple values can be assigned to as many variables at the same time:

In [None]:
x, y, z = 1, 2, 3
print(x)
print(y)
print(z)

Similarly, the same value can be assigned to multiple variables at the same time:

In [None]:
x = y = z = 3
print(x)
print(y)
print(z)

### Variable names

Variable names can be arbitrary (like `x` and `y`), although it is *highly advisable* to use descriptive names that make the code more readable (e.g. `age`, `surname`, `total_volume`). The following restrictions apply when naming variables in Python:

- A variable name must start with a letter or the underscore character
- A variable name cannot start with a number
- A variable name can only contain alpha-numeric characters and underscores (`A-Z`, `a-z`, `0-9`, and `_`)
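
As a quick way to try these rules out, the sketch below uses the built-in string method `isidentifier()`, which reports whether a piece of text would be accepted as a name. The sample names are made up for illustration. Note that `isidentifier()` is slightly more permissive than the list above: it also accepts non-ASCII letters, and it does not reject reserved words such as `class`.

```python
# isidentifier() returns True when the text follows Python's naming rules
starts_with_letter = 'total_volume'.isidentifier()
starts_with_underscore = '_hidden'.isidentifier()
starts_with_number = '2nd_place'.isidentifier()
contains_hyphen = 'my-variable'.isidentifier()

print(starts_with_letter)      # True
print(starts_with_underscore)  # True
print(starts_with_number)      # False: a name cannot start with a number
print(contains_hyphen)         # False: hyphens are not allowed in names
```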

Several naming approaches have emerged to make variable names easier to read, particularly when the name incorporates multiple words. Amongst the most commonly used are:

<div class="alert alert-block alert-info" style="width: 50%; float: right; margin-left: 20px;">
<b>Note:</b> <i>Snake case</i> is conventionally accepted as the "correct" approach in Python, as that is the way recommended by the official <a href="https://peps.python.org/pep-0008/#function-and-variable-names">Style Guide for Python Code (PEP 8)</a>
</div>

- **Snake Case**: `my_variable_name` (words are separated by underscores) 
- **Camel Case**: `myVariableName` (words, except the first, are capitalized) 
- **Pascal Case**: `MyVariableName` (words are capitalized)

Lastly, variable names are case-sensitive. In the following example, declaring a new variable `Year` will not overwrite the existing one named `year`:

In [None]:
Year = 1999
print(Year)
print(year)

## 1.2. Data Types

Variables can store data of many different types, each with its own characteristics and limitations. The following are the most commonly used Python built-in data types.

### Strings

Strings are sequences of character data, and have to be enclosed with quotation marks (either single or double) to distinguish them from variable names. 

In [None]:
x = "John"
y = 'John'
print(x)
print(y)

There are a few characters that can't be included *as-is* in strings; these are often called **illegal characters**. One example is a double quote inside a string that is itself surrounded by double quotation marks. To include one of these characters in a string, it is necessary to "escape" it by preceding it with a `\` (backslash, which, because of this usage, must itself be escaped when it is meant literally).

In [None]:
print("the man by the door shouted "Stop!"")  # text outside quotation marks produces an error

In [None]:
print("the man by the door shouted \"Stop!\"")  # escaping the quotation marks fixes the problem

An alternative solution would be to use two different types of quotation marks:

In [None]:
print('the man by the door shouted "Stop!"')  # double quotes inside single quotes need no escaping

Triple quotation marks can be used to deal with complex situations like the following sentence: `He said: "the man by the door shouted 'Stop!'"`

In [None]:
print('''He said: "the man by the door shouted 'Stop!'"''')

Triple quotes can also be used to assign a multiline string to a variable. The resulting text will preserve the line breaks at the same positions.

In [None]:
x = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(x)

Lastly, special *invisible* characters, such as new lines (`\n`), carriage returns (`\r`), and tabs (`\t`), can be included in a string by writing the corresponding escape sequence:

In [None]:
print('first line\n\tsecond line\nthird line')

### Numbers

As shown in previous examples, variables of numeric types are declared by assigning a value to them; quotation marks are not necessary. Python will automatically distinguish between number types, of which the two most commonly used are:

- Integers (`int`), *i.e.* whole numbers of unlimited length, positive or negative, without decimals.
- Floating point numbers (`float`), *i.e.* decimal numbers, positive or negative.

In [None]:
a = 2
b = 2.5
print(a)
print(b)

### Booleans

Variables of `Boolean` type may have one of two values, `True` or `False`. They can be declared directly, without quotation marks.

In [None]:
a = True  # note the capitalization: `True` not `true`
b = False
print(a)
print(b)


Boolean values are commonly used to assess the veracity of an expression (for example a comparison):

In [None]:
print(10 > 9)
print(10 == 9)
print(10 < 9)

### Determine a variable's data type

You can get the data type of a variable with the `type()` function:

In [None]:
x = 5
y = 5.5
z = "John"
print(type(x))
print(type(y))
print(type(z))

### Declare a variable of a specific data type

Variables do not need to be declared with any particular type, and can change type after they have been declared. There may be times, however, when it may be necessary to specify the type when declaring a variable. This can be done by *casting* the variable using a **constructor** function. The constructors for the types discussed above are: 

- `str()` constructs a string from a wide variety of data types, including strings, integer literals and float literals
- `int()` constructs an integer number from an integer literal, a float literal (by removing all decimals), or a string literal (providing the string represents a whole number)
- `float()` constructs a float number from an integer literal, a float literal or a string literal (providing the string represents a float or an integer)
- `bool()` evaluates any value or expression and returns either `True` or `False`


In [None]:
x = str(3)
y = int(3)
z = float(3)
print(type(x))
print(type(y))
print(type(z))
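
The constructor descriptions above also cover inputs that the previous cell does not exercise. The following sketch, using made-up variable names, shows a few of them. One assumption worth stating: `int()` only accepts strings that represent whole numbers, so `int('2.5')` would raise an error.

```python
truncated = int(2.9)               # decimals are removed, not rounded
negative_truncated = int(-2.9)     # truncation moves towards zero
whole_from_text = int('42')        # the string must represent a whole number
decimal_from_text = float('3.5')   # the string may represent a float or an integer

zero_as_bool = bool(0)             # zero evaluates to False
empty_as_bool = bool('')           # an empty string evaluates to False
text_as_bool = bool('False')       # any non-empty string evaluates to True

print(truncated)           # 2
print(negative_truncated)  # -2
print(whole_from_text)     # 42
print(decimal_from_text)   # 3.5
print(zero_as_bool)        # False
print(empty_as_bool)       # False
print(text_as_bool)        # True
```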

### Convert variables between data types

The constructors discussed above can also be used to convert variables between data types:

In [None]:
x = '2'
y = int(x)
z = str(y)
print(type(x))
print(type(y))
print(type(z))

## 1.3. Operators

Operators are constructs which can manipulate the value of operands. Given the expression `2 + 4 = 6`, `2` and `4` are **operands** and `+` is the **operator**. Python organizes operators into a number of groups. **Arithmetic operators** are used with numeric values to perform common mathematical operations:

In [None]:
x = 2
y = 4
print(x + y)  # addition
print(x - y)  # subtraction
print(x / y)  # division
print(x * y)  # multiplication
print(x ** y)  # exponentiation

Some arithmetic operators can be used with strings, but their effects are different. While the `+` operator, for example, adds numbers, it instead *concatenates* strings:

In [None]:
x = 'Now is the winter '
y = 'of our discontent.'
print(x + y)

Using the same operator with mixed types, for example a string and a number, will produce an error:

In [None]:
x = 'Number '
y = 1
print(x + y)

If the goal is to concatenate the variables, then the number should be converted to a string:

In [None]:
print(x + str(y))

**Comparison operators** are used to compare two values:

In [None]:
x = 2
y = 4
print(x == y)  # equal to
print(x != y)  # not equal to
print(x < y)  # less than
print(x > y)  # greater than
print(x <= y)  # less than or equal to
print(x >= y)  # greater than or equal to

**Logical operators** are used to combine conditional statements:

In [None]:
x = 2
y = 2
z = 4
print(x == y and z > y)  # and: both sides are true
print(x == y or y > z)  # or: either side is true
print(not x == y)  # not: `True` if operand is false, `False` if operand is true

**Membership operators** are used to test if an item or sequence of items is present in a collection:

In [None]:
x = 'Now is the winter of our discontent.'
print('Now' in x)  # in: returns True if the collection contains the item
print('Now' not in x)  # combined with not

# 2. Data Structures

Python has four built-in data types used to store collections of data: **lists**, **tuples**, **sets**, and **dictionaries**.

## 2.1. Lists

A list is used to store multiple items in a single variable. Python uses **square brackets** to denote lists, while the items they contain are **separated by commas**. A variable can be declared as a list simply by using square brackets (with or without items included). Alternatively, the `list()` constructor can be used for the same purpose:

In [None]:
my_list = []  # create an empty list
another_list = list() # create an empty list using constructor
fruits = ['apple', 'banana', 'cherry', 'orange', 'kiwi']  # create a list with items
print(my_list)
print(another_list)
print(fruits)

The function `len()` can be used to determine the number of items in a list:

In [None]:
print(len(fruits))

Almost anything can be an item in a list, such as values of any data types, or even other data structures, including lists:

In [None]:
# notice the last item in this list is itself a list
fruits = ['apple', 'banana', 'cherry', 'orange', 'kiwi', ['lemon', 'lime']]
print(fruits)

### Accessing items from a list

Lists will preserve the order in which items are stored, and the items they contain are indexed, meaning they can be retrieved by referring to the index number. The syntax to retrieve a single item from a list consists of the name of the list followed by the index of the desired item in square brackets:

<div class="alert alert-block alert-info">
<b>Note:</b> Python counts from zero, meaning that the index of the first item in a list will be <b>0</b>.
</div>

In [None]:
print(fruits[0])  # retrieves the first item in the list 'fruits'

Negative indices can be used to tell Python to count from the end of the list instead of the beginning, so that `-1` refers to the last item, `-2` refers to the second last item, and so on:

In [None]:
print(fruits[-2])  # retrieves the second item starting from the end of the list

The same approach can be extended to access items in **nested lists**:

In [None]:
print(fruits[-1][0])  # retrieves the first item (0) from the list in the last item (-1)

### List slicing

If two indices separated by a colon are supplied, Python will read that as a range and return a new list containing all items within it. This process of extracting a subset of the list's items is called **slicing**.

<div class="alert alert-block alert-info">
<b>Note:</b> The resulting list will include the item corresponding to the <i>start</i> index of the supplied range but not the one corresponding to the <i>end</i> index.
</div>

In [None]:
print(fruits)
print(fruits[2:5])  # 'cherry', index 2, is included, while the nested list, index 5, is not

If the start value of the range is not provided, Python will start selecting from the beginning of the list:

In [None]:
print(fruits[:4])  # selects from the beginning to index 4 (not inclusive)

Conversely, if the end value of the range is omitted, Python will continue the selection to the end of the list:

In [None]:
print(fruits[2:])  # selects from index 2 (inclusive) to the end of the list

As before, negative indices can be used to reverse the direction of the selection:

In [None]:
print(fruits)
print(fruits[-3:-1])  # selects from the third item from the end up to, but not including, the last item

### Adding items to a list

To add an item to the end of the list, use the `append()` method:

In [None]:
print(fruits)
fruits.append('melon')
print(fruits)

To insert an item at a specified index, use the `insert()` method:

In [None]:
print(fruits)
fruits.insert(1, 'mango')
print(fruits)

### Changing items in a list

To change the value of a specific item, use the item's index:

In [None]:
print(fruits)
fruits[3] = 'blackcurrant'
print(fruits)

To change multiple items within a range, supply a list of the new items and the range in question.

<div class="alert alert-block alert-info">
<b>Note:</b> The length of the list will change when the number of items inserted does not match the number of items replaced. If you insert fewer or more items than you replace, the new items will be inserted where you specified, and the remaining items will move accordingly.
</div>

In [None]:
print(fruits)
fruits[1:3] = ['raspberry', 'watermelon']
print(fruits)
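
The cell above replaces two items with two new ones, so the length stays the same. As a minimal sketch of the situation described in the note, the example below uses a separate, made-up list (`berries`) so that the `fruits` list used in the rest of this section is left untouched:

```python
berries = ['strawberry', 'blueberry', 'cranberry', 'gooseberry']

# replace two items (indices 1 and 2) with a single item: the list shrinks
berries[1:3] = ['raspberry']
print(berries)  # ['strawberry', 'raspberry', 'gooseberry']

# replace one item (index 1) with three items: the list grows
berries[1:2] = ['blackberry', 'mulberry', 'elderberry']
print(berries)  # ['strawberry', 'blackberry', 'mulberry', 'elderberry', 'gooseberry']
```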

To append elements from another list to the current list, use the `extend()` method:

In [None]:
print(fruits)
tropical = ['pineapple', 'papaya']
fruits.extend(tropical)
print(fruits)

### Remove items from a list

The `remove()` method removes the specified **item**:

In [None]:
print(fruits)
fruits.remove('kiwi')  # the item must be present in the list, otherwise an error is raised
print(fruits)

The `del` keyword removes the specified **index**:

In [None]:
print(fruits)
del fruits[0]
print(fruits)

The `pop()` method **extracts** the specified **index** (or the last item, if an index is not provided). The extracted item can, for example, be stored in a variable:

In [None]:
print(fruits)
extracted_fruit = fruits.pop(1)
print(fruits)
print(extracted_fruit)

Lastly, the `clear()` method empties the list. The list itself still exists, but it no longer has any content:

In [None]:
print(fruits)
fruits.clear()
print(fruits)

### Sort a list

Lists have a `sort()` method that will sort the list alphanumerically:

In [None]:
fruits = ['apple', 'banana', 'cherry', 'orange', 'kiwi', 'melon', 'mango']
fruits.sort()
print(fruits)

The default order is **ascending**. To reverse it and sort in **descending** order, use the keyword argument `reverse=True`:

In [None]:
fruits.sort(reverse=True)
print(fruits)
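
One detail that is easy to miss is that `sort()` changes the list in place and does not return the sorted list. The sketch below illustrates this with made-up lists, and contrasts it with the built-in `sorted()` function, which is not otherwise covered in this unit and returns a new list while leaving the original unchanged:

```python
numbers = [3, 1, 2]
result = numbers.sort()  # sorts `numbers` in place
print(result)   # None: sort() does not return the list
print(numbers)  # [1, 2, 3]

scores = [3, 1, 2]
ordered_scores = sorted(scores)  # returns a new, sorted list
print(scores)          # [3, 1, 2]: the original is unchanged
print(ordered_scores)  # [1, 2, 3]
```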

## 2.2. Tuples and Sets

Tuples and sets, like lists, are used to store multiple items in a single variable. However, they differ in their properties:

<div style="width: 50%; margin: 20px;">

| Lists             | Tuples            | Sets          |
|:------------------|:------------------|:--------------|
| Ordered           | Ordered           | Unordered     |
| Changeable        | Unchangeable      | Unchangeable+ |
| Allows duplicates | Allows duplicates | No duplicates |
| Indexed           | Indexed           | Not indexed   |

</div>

**+** set items cannot be altered, but the set itself can, *e.g.* by adding and removing items.
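
Some of these properties can be checked directly. The following sketch uses made-up variable names, together with the round-bracket and curly-bracket notation that is introduced in the subsections below. It shows that lists and tuples keep repeated items while sets discard them, and that the order of items matters for lists but not for sets:

```python
sample_list = ['apple', 'banana', 'apple']
sample_tuple = ('apple', 'banana', 'apple')
sample_set = {'apple', 'banana', 'apple'}

print(len(sample_list))   # 3: the repeated item is kept
print(len(sample_tuple))  # 3: the repeated item is kept
print(len(sample_set))    # 2: the repeated item is discarded

lists_equal = ['apple', 'banana'] == ['banana', 'apple']
sets_equal = {'apple', 'banana'} == {'banana', 'apple'}
print(lists_equal)  # False: lists are ordered
print(sets_equal)   # True: sets are unordered
```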

### Tuples

Tuples are denoted with **round brackets**, and the constructor is `tuple()`. The items in a tuple are **separated by commas**.

<div class="alert alert-block alert-info">
<b>Note:</b> When creating a tuple with only one item, remember to include a comma after the item, otherwise it will not be identified as a tuple.
</div>

In [None]:
my_tuple = ()  # create an empty tuple
single_item = ('apple',)  # create a tuple with a single item, note the comma
another_tuple = tuple() # create an empty tuple using constructor
fruit_tuple = ('apple', 'banana', 'cherry', 'orange')  # create a tuple with items
print(my_tuple)
print(single_item)
print(another_tuple)
print(fruit_tuple)

Most of the methods and functions used to work with lists also apply to tuples:

In [None]:
print(len(fruit_tuple))  # use `len()` to get the number of items
print(fruit_tuple[1])  # tuple items are indexed and can be retrieved just like list items
print(fruit_tuple[1:4])  # tuples can also be sliced like lists
print(fruit_tuple[-1])  # and negative indices work in the same manner
print('apple' in fruit_tuple)  # as does checking whether an item is in the tuple

Tuples, however, are **unchangeable**, meaning that items cannot be added or removed once the tuple has been created. Methods like `append()`, `insert()`, `remove()`, and `pop()` are therefore not available on tuples, and attempting to call them raises an error:

In [None]:
print(fruit_tuple.append('lime'))

It is, however, possible to extract the values of a tuple back into variables. This is called **unpacking**:

In [None]:
(green, yellow, red, orange) = fruit_tuple
print(green)
print(yellow)
print(red)
print(orange)

If the number of variables is less than the number of values in the tuple, one of the variables can be designated to receive the remaining values as a list. The variable to be used for this purpose can be indicated by adding `*` next to its name:

In [None]:
(green, yellow, *rest) = fruit_tuple
print(green)
print(yellow)
print(rest)

### Sets

Sets are denoted using **curly brackets**, and their constructor is `set()`. Items in sets are **separated by commas**.

<div class="alert alert-block alert-info">
<b>Note:</b> Because curly brackets are also used to denote dictionaries, you cannot create an empty set without using the constructor.
</div>

In [None]:
my_set = {}  # cannot create an empty set like this
empty_set = set() # but you can do it using the constructor
fruit_set = {'apple', 'banana', 'cherry', 'orange'}  # create a set with items
print(type(my_set))
print(empty_set)
print(fruit_set)

Because they are neither ordered nor indexed, individual **items in sets cannot be accessed by referencing their indices**. It is possible, however, to add and remove items from the set. The method `add()` can be used to append a new item:

In [None]:
fruit_set.add('lime')
print(fruit_set)

Two different methods can be used to remove an item from a set: `remove()` and `discard()`. The main distinction is that, if the item to remove does not exist, `remove()` will raise an error, while `discard()` will not.

<div class="alert alert-block alert-info">
<b>Note:</b> <span style="font-family: monospace;">pop()</span> can also be used to remove an item but, because sets are <i>unordered</i>, the item removed is arbitrary, so you cannot be sure in advance what is being removed.
</div>

In [None]:
fruit_set.discard('guava')  # does nothing
fruit_set.remove('guava')  # raises error
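
To illustrate the note on `pop()` above, the sketch below uses a made-up set (`colors`). The method returns whichever item it removed, so storing that value is the only way to know which one it was; the first two printed results may differ between runs:

```python
colors = {'red', 'green', 'blue'}
removed_color = colors.pop()  # removes and returns an arbitrary item

print(removed_color)  # one of 'red', 'green' or 'blue'
print(colors)         # the two remaining items
print(len(colors))    # 2
```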

Because sets do not allow duplicate items, a common use-case for them is to **eliminate duplicates from a list**:

In [None]:
duplicates = ['apple', 'banana', 'cherry', 'apple']
no_duplicates = set(duplicates)
print(duplicates)
print(no_duplicates)

A set can be converted to a list by using the `list()` constructor:

In [None]:
back_to_list = list(no_duplicates)
print(type(back_to_list))
print(back_to_list)

## 2.3. Dictionaries

Dictionaries are used to store data as keyed values. They are ordered (as of Python 3.7, items keep the order in which they were inserted), changeable, and do not allow duplicate keys. The values in dictionary items can be of any data type, including lists and other dictionaries. Python denotes dictionaries using **curly brackets**, while items in them are represented as **key:value pairs** and **separated by commas**.

In [None]:
# items are in separate lines to improve legibility
fruit_dictionary = {  
    'name': 'Apple',
    'seeds': 345, 
    'varieties': ['Golden Russet', 'Evercrisp', 'Delicious']
}
print(fruit_dictionary)

Just like with lists, the `len()` function can be used to determine how many items are contained in a dictionary:

In [None]:
print(len(fruit_dictionary))

### Accessing items from a dictionary

**Values** in a dictionary can be accessed by referencing their **key** inside **square brackets**:

In [None]:
print(fruit_dictionary['name'])  # retrieves the value associated with the 'name' key

The same result can be accomplished by invoking the `get()` method of dictionaries:

In [None]:
print(fruit_dictionary.get('name'))

The advantage of using `get()` lies in the fact that, should the supplied **key not exist in the dictionary**, `get()` will still return a value, either the default `None` or one supplied as a parameter: 

In [None]:
print(fruit_dictionary.get('wrong_key'))
print(fruit_dictionary.get('wrong_key', 'provided value'))

In the same situation, the square brackets approach will throw an error:

In [None]:
print(fruit_dictionary['wrong_key'])

It is possible to **test whether a key is in a dictionary** by using the `in` keyword:

In [None]:
print('wrong_key' in fruit_dictionary)

It is sometimes convenient to work either with just the keys, or just the values in a dictionary. Dictionaries have two methods to achieve this: `keys()` will return **all the keys in the dictionary**, while `values()` will return **just the values**. An additional method, `items()`, will return **each item in a dictionary** as a *(key, value)* tuple.

<div class="alert alert-block alert-info">
<b>Note:</b> The objects returned by these methods are not ordinary lists but <i>views</i> of the dictionary, meaning that if they were to be stored as variables, any future changes to the original dictionary will be propagated to them.
</div>

In [None]:
print(fruit_dictionary.keys())
print(fruit_dictionary.values())
print(fruit_dictionary.items())
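
The behaviour described in the note can be seen in a small sketch. The dictionary `stock` and the other names below are made up for illustration. A stored view reflects later changes to the dictionary, while wrapping the view in `list()` takes a snapshot that does not change:

```python
stock = {'apples': 3, 'pears': 5}
stock_keys = stock.keys()          # a view of the keys
key_snapshot = list(stock.keys())  # an independent list

stock['plums'] = 7  # change the dictionary after storing both

print(stock_keys)    # dict_keys(['apples', 'pears', 'plums'])
print(key_snapshot)  # ['apples', 'pears']
```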

### Adding and changing items

Adding an item to the dictionary is done by using a **new key** and assigning a **value** to it:

In [None]:
print(fruit_dictionary)
fruit_dictionary['color'] = 'red'
print(fruit_dictionary)

Similarly, it is possible to **change the value of a specific item** by referring to its key name:

In [None]:
print(fruit_dictionary)
fruit_dictionary['color'] = 'green'
print(fruit_dictionary)

Alternatively, the `update()` method can be used to achieve the same result. It takes another dictionary as an argument and it will update the target dictionary with its contents, or if an item does not exist, it will add it:

In [None]:
print(fruit_dictionary)
fruit_dictionary.update({'color': 'red'})
print(fruit_dictionary)

fruit_dictionary.update({'weight': '2gr'})
print(fruit_dictionary)

### Removing items

There are several ways to remove items from a dictionary. The `del` keyword **removes the item with the specified key name**:

In [None]:
print(fruit_dictionary)
del fruit_dictionary['weight']
print(fruit_dictionary)

The `pop()` method **removes the item with the specified key name, while extracting its value**, which can be assigned to a variable for later use:

In [None]:
print(fruit_dictionary)
removed_value = fruit_dictionary.pop('color')
print(fruit_dictionary)
print(removed_value)

Lastly, the `clear()` method **empties the dictionary**:

In [None]:
print(fruit_dictionary)
fruit_dictionary.clear()
print(fruit_dictionary)

# 3. String Manipulation

Strings are sequences of characters, so many of the operations that work on lists, such as indexing and slicing, also work on strings. The function `len()`, for example, will return the number of characters in a string, and the keyword `in` can be used to test whether a group of characters is present in a string:


In [None]:
sentence = 'Now is the winter of our discontent.'
print(len(sentence))
print('Now' in sentence)

## 3.1. Slicing

In particular, **slicing** can be used to extract substrings from a larger string. As with lists, slices are defined by a start index and an end index, separated by a colon. The exact same rules apply regarding omissions and the use of negative indices:<br />

<div style="width: 90%; margin: 40px; "><img style="float: center; width: 50%;" src="http://www.nltk.org/images/string-slicing.png" align=center /></div>

For example, if we want the word **is** from the sentence below, we could use the start index `4` and end index `6` to get it:

In [None]:
sentence = 'Now is the winter of our discontent.'
print(sentence[4:6])

If we omit the first index, Python will start from the beginning of the string by default, so to get the first word we could use:

In [None]:
print(sentence[:3])

We can also start at index `3` and leave the end index unspecified to get the rest of the sentence. If no end index is supplied, Python continues until it reaches the end of the string:

In [None]:
print(sentence[3:])

If, for example, we wanted to find out what the last character in the sentence is, we can use a negative index:

In [None]:
print(sentence[-1])  # -1 refers to the last character

## 3.2. Case Transformations and Whitespace

Strings also provide their own specific methods, for example to change the case of the characters in them:

- `lower()`, converts all characters to lower case
- `upper()`, converts all characters to upper case
- `capitalize()`, converts the first character to upper case
- `title()`, converts the first letter in each word to upper case
- `swapcase()`, converts lower case letters to upper case and vice versa


Other methods help to deal with unwanted whitespace. `lstrip()` removes any whitespace to the left of the string while `rstrip()` does the same to the right. The `strip()` method does both at the same time.

In [None]:
spaced = ' Now is the winter of our discontent. '  # note added spaces at beginning and end
print('|' + spaced.lstrip() + '|')  # concatenating bars at both ends to show spaces
print('|' + spaced.rstrip() + '|')
print('|' + spaced.strip() + '|')

In [None]:
print(sentence.lower())
print(sentence.upper())
print(sentence.capitalize())
print(sentence.title())
print(sentence.swapcase())
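
Note that none of these methods change the string they are applied to: each one returns a new string, which is why the results above are printed directly. As a brief sketch with a made-up string (`greeting`), the example below also shows that, because each method returns a string, calls can be chained one after the other:

```python
greeting = '  Hello World  '
cleaned = greeting.strip().upper()  # strip whitespace first, then convert to upper case

print('|' + greeting + '|')  # |  Hello World  |  (the original is unchanged)
print('|' + cleaned + '|')   # |HELLO WORLD|
```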

## 3.3. Evaluation

Python strings provide a number of convenience methods designed to evaluate the contents of a string. These methods **return a boolean indicating whether the target string meets the relevant condition**:

In [None]:
spaces = '  '
digits = '234'
print(sentence)
print(sentence.startswith('Now'))  # True if the string starts with the specified value
print(sentence.endswith('.'))  # True if the string ends with the specified value
print(sentence.islower())  # True if all characters in the string are lower case
print(sentence.isupper())  # True if all characters in the string are upper case
print(spaces.isspace())  # True if all characters in the string are whitespaces
print(digits.isdigit())  # True if all characters in the string are digits

## 3.4. Concatenation and Formatting

We already saw how the `+` operator can be used to concatenate strings. A more flexible way to achieve this is to use the `format()` method. Here, sets of **curly brackets** are used as placeholders to be dynamically replaced by the arguments passed to `format()`:

In [None]:
print('Now is the {} of our discontent. Made {} {} by this sun of {}'.format('winter', 'glorious', 'summer', 'York'))

Of course, instead of directly supplying strings to `format()`, the same could be achieved with variables:

In [None]:
s1 = 'winter'
s2 = 'summer'
qual = 'glorious'
loc = 'York'
print('Now is the {} of our discontent. Made {} {} by this sun of {}'.format(s1, qual, s2, loc))

The placeholders are matched to the arguments **one-to-one, in order from left to right**:

In [None]:
print('Now is the {} of our discontent. Made {} {} by this sun of {}'.format(s2, s1, qual, loc))
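
When matching by position is too rigid, placeholders can also carry an index or a name that states which argument they refer to. This is a short sketch of that feature of `format()`, using made-up variable and placeholder names:

```python
# numbered placeholders refer to the arguments by position, starting from 0
numbered = 'Made {1} {0} by this sun of {2}'.format('summer', 'glorious', 'York')

# named placeholders refer to keyword arguments, so the order no longer matters
named = 'Made {quality} {season} by this sun of {place}'.format(
    season='summer', quality='glorious', place='York')

print(numbered)  # Made glorious summer by this sun of York
print(named)     # Made glorious summer by this sun of York
```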

As a convenient shortcut, instead of invoking the `format()` method and passing it arguments, the arguments can be placed directly within the placeholders, provided that the entire string is prepended with `f` to indicate that it must be formatted:

In [None]:
print(f'Now is the {s1} of our discontent. Made {qual} {s2} by this sun of {loc}')

## 3.5. Find and Replace

The `find()` method can be used to search a string for occurrences of the specified value. The value returned corresponds to the start index of the range representing the found occurrence.

<div class="alert alert-block alert-info">
<b>Note:</b> If the specified value is not found, <span style="font-family: monospace;">find()</span> will return <b>-1</b> instead.
</div>

In [None]:
print(sentence.find('winter')) # returns the index where "winter" starts in the sentence
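
The index returned by `find()` can be combined with slicing to extract the match, and the `-1` result can be compared with the `in` keyword seen earlier. The sketch below uses a separate variable (`line`) holding the same text, so that it can be run on its own:

```python
line = 'Now is the winter of our discontent.'

start = line.find('winter')   # index where "winter" starts
end = start + len('winter')   # index just past the end of the match
missing = line.find('summer')  # "summer" does not occur in the text

print(start)            # 11
print(line[start:end])  # winter
print(missing)          # -1
print('summer' in line)  # False
```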

`find()` can also take two indices indicating the **start and end of a range** (in that order). If these arguments are supplied, then the search is restricted only to a subset of the string as indicated by the range:

In [None]:
print(sentence.find('winter', 12, 20))  # searches only between indices 12 and 20; returns -1 because "winter" starts at index 11

The `replace()` method can be used to **substitute a substring with another**. The arguments supplied represent the old and new values (in that order):

In [None]:
print(sentence.replace('winter', 'summer'))  # replace "winter" with "summer"

A third, optional argument can be supplied to indicate the **number of occurrences of the old value that should be replaced**. If this argument is omitted the default is all of them:

In [None]:
repeats = 'abc abc abc abc abc abc abc abc abc'
print(repeats.replace('a', 'z', 5))  # replace "a" with "z", but only five times

## 3.6. Splitting and Joining

Strings can be split into lists using the `split()` method. The character to be used as separator can be supplied to `split()` as an argument:    

In [None]:
csv_string = 'value1,value2,value3,value4'
print(csv_string.split(','))  # uses comma as separator

If no separator is provided, `split()` defaults to using whitespace as separator:

In [None]:
tokens = sentence.split()
print(tokens)
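
The default behaviour is not quite the same as passing a single space as the separator. As the following sketch with a made-up string shows, `split()` without arguments treats any run of whitespace as one separator, whereas `split(' ')` splits at every individual space and so produces empty strings where spaces are repeated:

```python
uneven = 'Now  is   the winter'  # two spaces, then three, then one

default_split = uneven.split()    # runs of whitespace count as one separator
space_split = uneven.split(' ')   # every single space is a separator

print(default_split)  # ['Now', 'is', 'the', 'winter']
print(space_split)    # ['Now', '', 'is', '', '', 'the', 'winter']
```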

Conversely, the `join()` method can be used to join all the items in a list into a string. The string to which the method is applied acts as the separator, while the list to be joined is supplied as an argument:

In [None]:
print(' '.join(tokens))  # use space as separator
print('_'.join(tokens))  # use underscore as separator