# Containers and slicing

> *© 2024, Joris Van den Bossche and Stijn Van Hoey  (<mailto:jorisvandenbossche@gmail.com>, <mailto:stijnvanhoey@gmail.com>). Licensed under [CC BY 4.0 Creative Commons](http://creativecommons.org/licenses/by/4.0/)*

---

> This notebook is based on material of the [*Python Scientific Lecture Notes*](https://scipy-lectures.github.io/), and the [*Software Carpentry: Programming with Python course*](https://swcarpentry.github.io/python-novice-gapminder).

## Containers

If I measure air pressure multiple times, doing calculations with a hundred variables called `pressure_001`, `pressure_002`, etc., would be slow. We need _container_ or _collection_ data types to combine multiple values in a single object.

### Lists

We can use a list to store many values together:

- Contained within square brackets `[...]`.
- Values separated by commas `,`.


In [16]:
pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]

In [17]:
len(pressures_hPa)  # len is a built-in function in Python (does not require an import)

10

A list is an ordered collection of objects, which may have different types.

In [18]:
a_list = [2.,'aa', 0.2]
a_list

[2.0, 'aa', 0.2]

Use an item’s index to fetch it from a list and access individual objects:

In [19]:
a_list[1]

'aa'

In [20]:
a_list[-1]  # negative indices are used to count from the back

0.2

<div class="alert alert-warning">

__Warning__: 
    
Indexing in Python starts at 0 (as in C, C++ or Java), not at 1 (as in Fortran or Matlab)!
    
</div>
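To make the zero-based counting concrete, here is a small sketch that reuses the pressure values from above. The variable names `first`, `last` and `also_last` are only introduced for this illustration.

```python
pressures_hPa = [1013, 1003, 1010, 1020, 1032, 993, 989, 1018, 889, 1001]

first = pressures_hPa[0]                       # the first element has index 0
last = pressures_hPa[len(pressures_hPa) - 1]   # the last valid index is len - 1
also_last = pressures_hPa[-1]                  # same element, counted from the back

print(first, last, also_last)

# Index len(pressures_hPa) is one past the end, so Python raises an IndexError
try:
    pressures_hPa[len(pressures_hPa)]
except IndexError as err:
    print('IndexError:', err)
```

So a list with 10 elements has valid indices 0 up to and including 9.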

Lists are __mutable__ objects and can be modified. Lists’ values can be replaced by assigning to them. Use an index expression on the left of assignment to replace a value:

In [21]:
a_list[1] = 3000
a_list

[2.0, 3000, 0.2]

__Slicing__ obtains sublists of regularly spaced elements:

In [24]:
another_list = ['first', 'second', 'third', 'fourth', 'fifth']
print(another_list[2:4])
print(another_list[3:])
print(another_list[:2])
print(another_list[::2])

['third', 'fourth']
['fourth', 'fifth']
['first', 'second']
['first', 'third', 'fifth']


<div class="alert alert-info">

__Info__: 
    
* `L[start:stop]` contains the elements with indices `i` such that `start <= i < stop`.
* `i` ranges from `start` to `stop-1`. Therefore, `L[start:stop]` has `(stop-start)` elements.
* Slicing syntax: `L[start:stop:stride]`.
* All slicing parameters are optional.
</div>
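The rules in the box above can be checked with a short sketch. The list `letters` and the values of `start` and `stop` are chosen only for this illustration.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']
start, stop = 1, 4

sub = letters[start:stop]
print(sub)                        # ['b', 'c', 'd']
print(len(sub) == stop - start)   # True

# All parameters are optional: an empty slice selects the whole list
print(letters[:] == letters)      # True

# A stride of 2 takes every second element between start and stop
print(letters[1:6:2])             # ['b', 'd', 'f']

# Unlike indexing, a slice that runs past the end is silently clipped
print(letters[2:100])             # ['c', 'd', 'e', 'f']
```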

Warning: assigning a list to a second name does not copy it. Both names point to the same object in memory, so changing the list through one name also changes it for the other!

In [25]:
a = ['a',  'b']
b = a
b[0] = 1
print(a)

[1, 'b']
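If an independent list is needed, make an explicit copy. The following is a minimal sketch; the name `c` is only used here for illustration. Note that `copy()` makes a shallow copy, so nested objects would still be shared.

```python
a = ['a', 'b']
b = a           # b is a second name for the same list
c = a.copy()    # c is a new list with the same elements (a[:] or list(a) work too)

b[0] = 1

print(a)        # [1, 'b']    -> changed through b
print(c)        # ['a', 'b']  -> the copy is unaffected
print(b is a, c is a)   # True False
```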


**List methods**:

You can list the available _methods_ of an object using the `TAB` key (tab completion) or the built-in `dir()` function:

In [26]:
#dir(list)

In [27]:
a_third_list = ['red', 'blue', 'green', 'black', 'white']

In [28]:
# Appending
a_third_list.append('pink')
a_third_list

['red', 'blue', 'green', 'black', 'white', 'pink']

In [29]:
# Removes and returns the last element
a_third_list.pop()
a_third_list

['red', 'blue', 'green', 'black', 'white']

In [30]:
# Extends the list in-place
a_third_list.extend(['pink', 'purple'])
a_third_list

['red', 'blue', 'green', 'black', 'white', 'pink', 'purple']

In [31]:
# Reverse the list
a_third_list.reverse()
a_third_list

['purple', 'pink', 'white', 'black', 'green', 'blue', 'red']

In [32]:
# Remove the first occurrence of an element
a_third_list.remove('white')
a_third_list

['purple', 'pink', 'black', 'green', 'blue', 'red']

In [34]:
# Sort list
a_third_list.sort()
a_third_list

['black', 'blue', 'green', 'pink', 'purple', 'red']

------------

In [36]:
a_third_list = ['red', 'blue', 'green', 'black', 'white']

In [37]:
# remove the last two elements
a_third_list = a_third_list[:-2]
a_third_list

['red', 'blue', 'green']

### TODO - exercises

<div class="alert alert-success">

**EXERCISE**:

Mimic the functioning of the *reverse* method using the appropriate slicing syntax:

<details><summary>Hints</summary>

- The slicing syntax is `L[start:stop:stride]` and the stride can be a negative value
- To access from start till end, the `start` and `stop` can be empty.
    
</details>    
    
</div>

In [38]:
a_third_list[::-1]

['green', 'blue', 'red']

------------

Concatenating lists is done with the `+` operator, just like summing two numbers:

In [29]:
a_list = ['pink', 'orange']
a_concatenated_list = a_third_list + a_list
a_concatenated_list

['red', 'blue', 'green', 'pink', 'orange']

<div class="alert alert alert-danger">
    <b>Note</b>: Why is the following not working?
</div>

In [30]:
reverted = a_third_list.reverse()
## uncomment the next lines to test the error:
#a_concatenated_list = a_third_list + reverted
#a_concatenated_list

The list itself is reversed in place and nothing is returned, so `reverted` is `None`, which cannot be added to a list.
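A possible way around this, sketched with a hypothetical list `colors`: use slicing or the built-in `reversed()`, which both give a new reversed sequence and leave the original list untouched.

```python
colors = ['red', 'blue', 'green']

# reverse() works in place and returns None
result = colors.reverse()
print(result)    # None
print(colors)    # ['green', 'blue', 'red']

# Start again from the original order
colors = ['red', 'blue', 'green']

# Slicing with a negative stride returns a new, reversed list
reverted = colors[::-1]

# reversed() returns an iterator; wrap it in list() to get a list
also_reverted = list(reversed(colors))

combined = colors + reverted
print(combined)  # ['red', 'blue', 'green', 'green', 'blue', 'red']
```

The same distinction holds for sorting: `a_list.sort()` sorts in place and returns `None`, while the built-in `sorted(a_list)` returns a new list.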

------------

In [31]:
# Repeating lists
a_repeated_list = a_concatenated_list*10
print(a_repeated_list)

['red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange', 'red', 'blue', 'green', 'pink', 'orange']


**List comprehensions**

List comprehensions are a very powerful feature. They provide an inline for-loop that goes through all the elements of a list and performs an action on each of them, in a single, readable line.

In [32]:
number_list = [1, 2, 3, 4]
[i**2 for i in number_list]

[1, 4, 9, 16]

and with conditional options:

In [33]:
[i**2 for i in number_list if i>1]

[4, 9, 16]
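For comparison, here is a sketch of the same computation written as a regular for-loop (loops are explained later). The name `squares` is only used for this illustration.

```python
number_list = [1, 2, 3, 4]

# Equivalent of [i**2 for i in number_list if i > 1]
squares = []
for i in number_list:
    if i > 1:
        squares.append(i**2)

print(squares)   # [4, 9, 16]
print(squares == [i**2 for i in number_list if i > 1])   # True
```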

In [35]:
# Let's try multiplying with two on a list of strings:
print([i*2 for i in a_repeated_list])

['redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange', 'redred', 'blueblue', 'greengreen', 'pinkpink', 'orangeorange']


Cool, this works! Let's check more about strings:

#### Strings

NOTE 2024 -> just short reference to list (slicing,...) and that it is immutable
https://swcarpentry.github.io/python-novice-gapminder/11-lists.html#character-strings-can-be-indexed-like-lists.

Strings can be written with different syntaxes (single, double or triple quotes):

In [36]:
s = 'Never gonna give you up'
print(s)
s = "never gonna let you down"
print(s)
s = '''Never gonna run around 
    and desert you'''         
print(s)
s = """Never gonna make you cry, 
    never gonna say goodbye"""
print(s)

Never gonna give you up
never gonna let you down
Never gonna run around 
    and desert you
Never gonna make you cry, 
    never gonna say goodbye


In [37]:
## pay attention when using apostrophes! - test out the next two lines one at a time
#print('Hi, what's up?')
#print("Hi, what's up?")

The newline character is **\n**, and the tab character is **\t**.

In [38]:
print('''Never gonna tell a lie and hurt you.
Never gonna give you up,\tnever gonna let you down
Never \ngonna\n run around and\t desert\t you''')

Never gonna tell a lie and hurt you.
Never gonna give you up,	never gonna let you down
Never 
gonna
 run around and	 desert	 you


Strings are collections like lists. Hence they can be indexed and sliced, using the same syntax and rules.

In [39]:
a_string = "hello"
print(a_string[0])
print(a_string[1:5])
print(a_string[-4:-1:2])

h
ello
el


In Python 3 all strings are Unicode strings, so accents and special characters can be handled directly; the `u` prefix is still accepted but no longer needed (see http://docs.python.org/tutorial/introduction.html#unicode-strings).

In [40]:
print(u'Hello\u0020World !')

Hello World !


A string is an immutable object and it is not possible to modify its contents. One may however create new strings from the original one.

In [41]:
#a_string[3] = 'q'   # uncomment this cell
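As a minimal sketch of what this means in practice: assigning to a position raises a `TypeError`, but slicing and concatenation can build a new string. The name `new_string` is only used for this illustration.

```python
a_string = "hello"

# Item assignment is not allowed on strings
try:
    a_string[3] = 'q'
except TypeError as err:
    print('TypeError:', err)

# Build a new string from pieces of the original instead
new_string = a_string[:3] + 'q' + a_string[4:]
print(new_string)   # helqo
print(a_string)     # hello (the original is unchanged)
```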

We won't introduce all methods on strings, but let's check the namespace and apply a few of them:

In [42]:
#dir(str) # uncomment this cell

In [43]:
another_string = "Strawberry-raspBerry pAstry package party"
another_string.lower().replace('r', 'l', 7)

'stlawbelly-laspbelly pastly package party'

String formatting to make the output as wanted can be done as follows:

In [44]:
print('An integer: %i; a float: %f; another string: %s' % (1, 0.1, 'string'))

An integer: 1; a float: 0.100000; another string: string


The [`format` string method](https://pyformat.info/) in Python 3 picks a suitable conversion for each value by itself:

In [45]:
print('An integer: {}; a float: {}; another string: {}'.format(1, 0.1, 'string'))

An integer: 1; a float: 0.1; another string: string


In [46]:
n_dataset_number = 20
sFilename = 'processing_of_dataset_%d.txt' % n_dataset_number
print(sFilename)

processing_of_dataset_20.txt
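Since Python 3.6, formatted string literals (f-strings) offer a third option: prefix the string with `f` and put variables or expressions directly between the curly braces. The sketch below rebuilds the file name from above; the names `filename` and `padded` are only used for this illustration.

```python
n_dataset_number = 20

# The variable is inserted where the curly braces are
filename = f'processing_of_dataset_{n_dataset_number}.txt'
print(filename)   # processing_of_dataset_20.txt

# A format specification after the colon controls the layout,
# here an integer padded with zeros to a width of 4
padded = f'processing_of_dataset_{n_dataset_number:04d}.txt'
print(padded)     # processing_of_dataset_0020.txt

# A float with 3 decimals
print(f'a float: {0.1:.3f}')   # a float: 0.100
```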


<div class="alert alert alert-success">
    <b>Exercise</b>: With the `dir(list)` command, all the methods of the list type are printed. However, we're not interested in the hidden methods. Use a list comprehension to only print the non-hidden methods (methods with no starting or trailing '_'):
</div>

In [47]:
[el for el in dir(list) if not el.startswith('_')]

['append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

<div class="alert alert alert-success">
    <b>Exercise</b>: Given the sentence `the quick brown fox jumps over the lazy dog`, split the sentence into words and put all the word lengths in a list.
</div>

In [48]:
sentence = "the quick brown fox jumps over the lazy dog"

In [49]:
#split in words and get word lengths
[len(word) for word in sentence.split()]

[3, 5, 5, 3, 5, 4, 3, 4, 3]

------------

#### Dictionaries

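A dictionary is basically an efficient table that **maps keys to values**. Since Python 3.7 a dictionary remembers the order in which its keys were inserted, but values are looked up **by key**, not by position.

It can be used to conveniently store and retrieve values associated with a name.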

In [50]:
# Always key : value combinations, datatypes can be mixed
hourly_wage = {'Jos':10, 'Frida': 9, 'Gaspard': '13', 23 : 3}
hourly_wage

{'Jos': 10, 'Frida': 9, 'Gaspard': '13', 23: 3}

In [51]:
hourly_wage['Jos']

10
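Looking up a key that is not in the dictionary raises a `KeyError`. A minimal sketch of how to check for a key first, or to fall back to a default value with `get()`; the name `'Maria'` is a made-up example of a missing key.

```python
hourly_wage = {'Jos': 10, 'Frida': 9}

# Membership tests with `in` check the keys
print('Jos' in hourly_wage)     # True
print('Maria' in hourly_wage)   # False

# get() returns None (or a chosen default) instead of raising an error
missing = hourly_wage.get('Maria')
default = hourly_wage.get('Maria', 0)
print(missing, default)         # None 0

# Square brackets raise a KeyError for an unknown key
try:
    hourly_wage['Maria']
except KeyError as err:
    print('KeyError:', err)
```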

Adding an extra element:

In [52]:
hourly_wage['Antoinette'] = 15
hourly_wage

{'Jos': 10, 'Frida': 9, 'Gaspard': '13', 23: 3, 'Antoinette': 15}

You can get the keys and values separately:

In [53]:
hourly_wage.keys()

dict_keys(['Jos', 'Frida', 'Gaspard', 23, 'Antoinette'])

In [54]:
hourly_wage.values()

dict_values([10, 9, '13', 3, 15])

In [55]:
hourly_wage.items() # all (key, value) pairs, as a view object

dict_items([('Jos', 10), ('Frida', 9), ('Gaspard', '13'), (23, 3), ('Antoinette', 15)])

In [56]:
# ignore this loop for now, this will be explained later
for key, value in hourly_wage.items():
    print(key,' earns ', value, '€/hour')

Jos  earns  10 €/hour
Frida  earns  9 €/hour
Gaspard  earns  13 €/hour
23  earns  3 €/hour
Antoinette  earns  15 €/hour


<div class="alert alert alert-success">
    <b>Exercise</b> Put all keys of the `hourly_wage` dictionary in a list as strings.  If they are not yet a string, convert them:
</div>

In [57]:
hourly_wage = {'Jos':10, 'Frida': 9, 'Gaspard': '13', 23 : 3}

In [58]:
str_key = []
for key in hourly_wage.keys():
    str_key.append(str(key))
str_key

['Jos', 'Frida', 'Gaspard', '23']
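The same result can also be obtained with a list comprehension, as sketched below. Iterating over a dictionary directly goes through its keys, so calling `keys()` is optional here.

```python
hourly_wage = {'Jos': 10, 'Frida': 9, 'Gaspard': '13', 23: 3}

# Convert every key to a string in a single line
str_key = [str(key) for key in hourly_wage]
print(str_key)   # ['Jos', 'Frida', 'Gaspard', '23']
```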

----------------------------

#### Tuples

Tuples are basically immutable lists. The elements of a tuple are written between parentheses, or just separated by commas:

In [59]:
a_tuple = (2, 3, 'aa', [1, 2])
a_tuple

(2, 3, 'aa', [1, 2])

In [60]:
a_second_tuple = 2, 3, 'aa', [1,2]
a_second_tuple

(2, 3, 'aa', [1, 2])

The key concept here is mutable vs. immutable:
* mutable objects can be changed in place
* immutable objects cannot be modified once created
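A short sketch of what this means for tuples, reusing `a_tuple` from above. The names `x` and `y` are only used for this illustration.

```python
a_tuple = (2, 3, 'aa', [1, 2])

# Replacing an element of a tuple is not allowed
try:
    a_tuple[0] = 10
except TypeError as err:
    print('TypeError:', err)

# The tuple itself cannot change, but a mutable object stored inside it can
a_tuple[3].append(3)
print(a_tuple)   # (2, 3, 'aa', [1, 2, 3])

# Tuples are convenient for unpacking several values at once
x, y = (1.5, 2.5)
print(x, y)      # 1.5 2.5
```

Immutability thus applies to which objects the tuple holds, not to the contents of those objects.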