# CHAPTER 3
# Built-in Data Structures, Functions, and Files

## 3.1 Data Structures and Sequences

### Tuple
- **Tuple** = fixed-length, immutable sequence of Python objects.
- Simple tuples can be created as a comma-separated sequence of values.
- More complex expressions, such as nested tuples, often require enclosing the values in parentheses.
- You can convert any sequence or iterator to a tuple by invoking the **tuple** function.
- Elements can be accessed with square brackets [] as with most other sequence types.
- While the objects stored in a tuple may be mutable themselves, once the tuple is created it’s not possible to modify which object is stored in each slot.
- You can concatenate tuples using the + operator to produce longer tuples.
- Multiplying a tuple by an integer, as with lists, has the effect of concatenating together that many copies of the tuple.

In [2]:
# Define a simple tuple as a comma-separated sequence of values

tup = 4, 5, 6
tup

(4, 5, 6)

In [3]:
# Check the type of the Python object defined above

type(tup)

tuple

In [1]:
# Define a more complex nested tuple containing 2 tuples of different sizes

nested_tup = (4, 5, 6), (7, 8)
nested_tup

((4, 5, 6), (7, 8))

In [3]:
# Define a list

my_list = [4, 0, 2]
type(my_list)

list

In [4]:
# Convert my_list to a tuple

tuple(my_list)

(4, 0, 2)

**REMEMBER**: Strings are a sequence of Unicode characters and therefore can be treated like other sequences.

In [1]:
# Convert a string to a tuple

tup = tuple('python')
tup

('p', 'y', 't', 'h', 'o', 'n')

**REMEMBER**: Sequences are 0-indexed in Python.

In [2]:
# Check the first element in tup

tup[0]

'p'

In [4]:
# Define another tuple by converting a list to a tuple

tup = tuple(['foo', [1, 2], True])
tup

('foo', [1, 2], True)

In [5]:
# Try to change the value of the tuple at tup[2] - you will get an error

tup[2] = False

TypeError: 'tuple' object does not support item assignment

In [6]:
# If an object inside a tuple is mutable, such as a list, you can modify it in-place

tup[1].append(3)
tup

('foo', [1, 2, 3], True)

In [15]:
# A tuple with a single element needs a trailing comma;
# without it, the parentheses just wrap a plain string

test_tuple = ('bar')
type(test_tuple)

# With the trailing comma the result is a tuple:
# test_tuple = ('bar',)
# type(test_tuple)

str
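
As a small sketch of the single-element case, the comparison below shows that the trailing comma, not the parentheses, is what creates the tuple. The variable names are illustrative only.

```python
# Parentheses alone only group an expression, so this is still a string
not_a_tuple = ('bar')

# The trailing comma is what creates the one-element tuple
one_tuple = ('bar',)

# The parentheses are optional; the comma is not
also_a_tuple = 'bar',

print(type(not_a_tuple), type(one_tuple), type(also_a_tuple))
```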

In [8]:
# Concatenate tuples using the + operator
(4, None, 'foo') + (6, 0) + ('bar',)

(4, None, 'foo', 6, 0, 'bar')

In [16]:
# Multiplying a tuple by an integer

('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')
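
One detail worth checking when multiplying a tuple: the "copies" are copies of references, not of the objects themselves. The sketch below, using a hypothetical list named `inner`, illustrates this with a mutable element.

```python
# Repeating a tuple copies references, not the objects they point to
inner = []
repeated = (inner,) * 3

# Mutating the shared list shows up in every slot
inner.append('x')
print(repeated)

# All three slots refer to the same list object
print(repeated[0] is repeated[1] is repeated[2])
```

With immutable elements such as strings or numbers this sharing is harmless, which is why it usually goes unnoticed.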

#### Unpacking tuples
- If you try to assign to a tuple-like expression of variables, Python will attempt to unpack the value on the righthand side of the equals sign.
- Even sequences with nested tuples can be unpacked.
- Using this functionality you can easily swap the values of two variables.
- A common use of variable unpacking is iterating over sequences of tuples or lists.
- Another common use is returning multiple values from a function. (covered later in the book)
- You can use the * **rest** syntax to pluck a few elements from the beginning of a tuple and collect the remaining ones in a list.
- The **rest** part is sometimes something you want to discard; there is nothing special about the name **rest**.
- Many Python programmers will use the underscore (_) for unwanted variables.
        a, b, *_ = values

In [17]:
# Assign to a tuple-like expression of variables

tup = (4, 5, 6)
a, b, c = tup
print(a, b, c)

4 5 6


In [18]:
# Unpack nested tuples

tup = 4, 5, (6, 7)
a, b, (c, d) = tup
d

7

In [19]:
# Swap the values of two variables

a, b = 1, 2
print(a, b)

b, a = a, b
print(a, b)

1 2
2 1


In [20]:
# Iterating over a list of tuples

seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

for a, b, c in seq:
    print('a={0}, b={1}, c={2}'.format(a, b, c))

a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9


In [30]:
# Using the special *rest syntax to select a few elements from a tuple

values = 1, 2, 3, 4, 5, 6

a, b, *rest = values
print('a =', a, 'b =', b, 'rest = ', rest)

a = 1 b = 2 rest =  [3, 4, 5, 6]
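
Two points from the list above are not shown in the cells: discarding the remainder with `_`, and unpacking a function's return value. The following is a minimal sketch; `min_max` is a hypothetical helper written only for this illustration.

```python
values = 1, 2, 3, 4, 5, 6

# Keep the first two elements and discard the remainder with the _ convention
a, b, *_ = values
print(a, b)

# The starred name does not have to come last
first, *middle, last = values
print(first, middle, last)


def min_max(numbers):
    """Hypothetical helper: return the smallest and largest value as a tuple."""
    return min(numbers), max(numbers)


# The returned tuple is unpacked into two names
lowest, highest = min_max(values)
print(lowest, highest)
```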


#### Tuple methods
- Since the size and contents of a tuple cannot be modified, tuples have very few instance methods.
- **count** = counts the number of occurrences of a value.

In [31]:
# Use the count method on a tuple

tup = (1, 2, 2, 2, 3, 4, 2)
tup.count(2)

4

**REMEMBER**: You can use tab completion in Jupyter notebooks to check which methods are available on a tuple.

In [35]:
# Type "tup." and press Tab to list the available tuple methods

tup.
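
Outside an interactive session, where tab completion is not available, the built-in `dir` function gives similar information. This is only a sketch: filtering out underscore-prefixed names is one possible way to hide the special methods.

```python
tup = (1, 2, 2, 2, 3, 4, 2)

# List the public (non-underscore) attribute names of a tuple
public_methods = [name for name in dir(tup) if not name.startswith('_')]
print(public_methods)

# index returns the position of the first occurrence of a value
print(tup.index(2))
```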

### List
- Lists are variable-length and their contents can be modified in-place.
- You can define them using square brackets [] or using the **list** type function.
- You can convert a tuple to a list using the **list** function.
- The **list** function is used in data processing as a way to materialize an iterator or generator expression.

In [2]:
# Define a list using [] brackets

a_list = [2, 3, 7, None]
a_list

[2, 3, 7, None]

In [3]:
# Convert a tuple to a list

tup = ('foo', 'bar', 'baz')
b_list = list(tup)
b_list

['foo', 'bar', 'baz']

In [4]:
# Modify an element of b_list

b_list[1] = 'peekaboo'
b_list

['foo', 'peekaboo', 'baz']

In [6]:
# Generate a sequence of integers using the range function

gen = range(10)
gen

range(0, 10)

In [9]:
# Check the type of gen

type(gen)

range

In [10]:
# Convert the range object to a list

list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
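
The same idea applies to generator expressions. As a short sketch (the name `squares` is illustrative), `list` consumes the generator and stores every value it produces:

```python
# A generator expression produces values lazily, one at a time
squares = (n ** 2 for n in range(5))

# list() consumes the generator and stores every value
squares_list = list(squares)
print(squares_list)

# The generator is now exhausted, so a second pass yields nothing
print(list(squares))
```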

#### Adding and removing elements
- Elements can be appended to the end of the list with the **append** method.
- Using **insert** you can insert an element at a specific location in the list. The insertion index must be between 0 and the length of the list, inclusive.
- **insert** is computationally expensive compared with **append**, because references to subsequent elements have to be shifted internally to make room for the new element.
- If you need to insert elements at both the beginning and end of a sequence, you may wish to explore **collections.deque**. (https://docs.python.org/2/library/collections.html)
- **pop** removes and returns an element at a particular index.
- Elements can be removed by value with **remove**, which locates the first such value and removes it from the list.
- Check if a list contains a value using the **in** keyword. The keyword **not** can be used to negate **in**.
- Checking whether a list contains a value is much slower than doing so with dicts and sets: Python makes a linear scan across the values of the list, whereas it can check the others (based on hash tables) in constant time.

In [12]:
# Append an element to a list

b_list.append('dwarf')
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

In [13]:
# Insert an element at a specific location in the list

b_list.insert(1, 'red')
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']
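
If insertions at the front are frequent, the **collections.deque** type mentioned above may be a better fit than a list. This is a minimal sketch with illustrative values, separate from `b_list`:

```python
from collections import deque

# A deque supports fast appends and pops at both ends
queue = deque(['peekaboo', 'baz'])

queue.appendleft('foo')   # add at the beginning
queue.append('dwarf')     # add at the end
print(queue)

# Convert back to a list when list-specific behavior is needed
print(list(queue))
```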

In [14]:
# Use pop to remove an element at a certain location in the list

b_list.pop(2)
b_list

['foo', 'red', 'baz', 'dwarf']
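
The cell above displays only the list after the removal. As a sketch on a separate, hypothetical list, the value returned by **pop** can be captured like this:

```python
letters = ['a', 'b', 'c', 'd']

# pop returns the element it removes
removed = letters.pop(1)
print(removed, letters)

# With no argument, pop removes and returns the last element
last = letters.pop()
print(last, letters)
```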

In [15]:
# Append another element to the list

b_list.append('foo')
b_list

['foo', 'red', 'baz', 'dwarf', 'foo']

In [16]:
# Use remove to remove elements by value - only first element with that value is removed

b_list.remove('foo')
b_list

['red', 'baz', 'dwarf', 'foo']

In [17]:
# Check if a list contains a value

'dwarf' in b_list

True

In [18]:
# Check if a list does not contain a value

'dwarf' not in b_list

False
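
The speed difference between list and set membership checks can be measured with the standard `timeit` module. This is a rough sketch on made-up data; the actual timings depend on the machine and the Python version, so none are shown here.

```python
import timeit

# Hypothetical data: the same 100,000 integers stored as a list and as a set
n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Look up a value near the end, the worst case for the list's linear scan
target = n - 1

list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print('list: {:.6f}s  set: {:.6f}s'.format(list_time, set_time))
```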

#### Concatenating and combining lists
- You can use the **+** operator to **concatenate** lists.
- If you already have a list defined, you can append multiple elements to it using the **extend** method.
- Note that list **concatenation by addition** is a comparatively expensive operation since a new list must be created and the objects copied over.
- Using **extend** to append elements to an existing list, especially if you are building up a large list, is usually preferable.
        everything = []
        for chunk in list_of_lists:
            everything.extend(chunk)             # faster
            # everything = everything + chunk    # slower

In [19]:
# Concatenating lists using + operator

[4, None, 'foo'] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

In [20]:
# Use extend method to append multiple elements to a list

x = [4, None, 'foo']
x.extend([7, 8, (2, 3)])
x

[4, None, 'foo', 7, 8, (2, 3)]
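
The **extend** loop from the notes above can be made runnable with some sample data. In this sketch `list_of_lists` is a hypothetical input, and `original` is an extra reference added only to show that **extend** modifies the list in place.

```python
# Hypothetical input: several small lists to be merged into one
list_of_lists = [[1, 2], [3, 4], [5, 6]]

everything = []
original = everything  # second reference to the same list object

for chunk in list_of_lists:
    everything.extend(chunk)  # grows the existing list in place

print(everything)
print(everything is original)

# Concatenation builds a brand-new list on every iteration instead
combined = []
for chunk in list_of_lists:
    combined = combined + chunk

print(combined == everything)
```

Both loops produce the same contents; the difference is that the second one allocates and fills a new list each time through the loop.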

#### Sorting