# HEP Software Training: Learn Programming with Python
## Chapter 4: Storing Multiple Values in Lists

Created by: [Hisyam Athaya](https://athayahisyam.github.io/)  
Learning portfolio based on [SWCarpentry Programming with Python: Python Fundamentals](https://swcarpentry.github.io/python-novice-inflammation/)  
Visit [HEP Software Training](https://hepsoftwarefoundation.org/training/curriculum.html) for more information.

### Python Lists

Python lists are built into the language, so we do not need to import any library to use them.

In [1]:
odds=[1, 3, 5, 7]
print('odds are:', odds)

odds are: [1, 3, 5, 7]


In [2]:
# accessing elements in a list using indices

print('first element:', odds[0])
print('second element:', odds[1])
print('-1 element:', odds[-1])

# Negative indices count from the end of the list: -1 is the last element, -2 is the second to last, etc.

first element: 1
second element: 3
-1 element: 7


We can change the values in a list, but we cannot change individual characters in a string.

In [3]:
# Change value in list

names = ["Curie", "Darwing", "Al-Khwarizmi"] # Typo in Darwin's name
print('names is originally:', names)
names[1] = 'Darwin' # attempt to correct the name
print('final value of names: ', names)

names is originally: ['Curie', 'Darwing', 'Al-Khwarizmi']
final value of names:  ['Curie', 'Darwin', 'Al-Khwarizmi']


In [15]:
# Change value in string
# (uncommenting the two lines below raises the TypeError shown)

# name = 'Darwin'
# name[0] = 'd'

TypeError: 'str' object does not support item assignment

Data which can be modified in place is called `mutable`, while data which cannot be modified is called `immutable`. **Strings and numbers are immutable**. This does not mean that variables holding strings or numbers are constants: we can still change what such a variable refers to, but only by replacing the old value with a completely new one.  
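
As a minimal sketch of what "replacing the old value with a completely new value" looks like for a string (the variable names `fixed_name` and `error_message` are made up for this illustration and are not part of the lesson):

```python
# Strings are immutable: item assignment fails, so we build a new string instead.
name = 'darwin'

try:
    name[0] = 'D'                  # not allowed for strings
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)

# Build a new string from the pieces we want and give it a name.
fixed_name = 'D' + name[1:]
print('fixed_name:', fixed_name)
print('name is unchanged:', name)
```

The original string is never altered; `fixed_name` refers to a brand new string object.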
  
Lists and arrays are `mutable`: they can be modified after being created. We can change individual elements, append new ones, or reorder the whole list. Some operations, like sorting, come in two forms: one that changes the data in place (the data inside the list), and a function that returns a modified copy and leaves the original intact.
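
The two forms of sorting can be sketched as follows, using the built-in `sorted()` function and the `list.sort()` method. The list `measurements` is an invented example, not data from the lesson.

```python
measurements = [7, 1, 5, 3]

# sorted() returns a new, sorted list and leaves the original untouched.
ordered_copy = sorted(measurements)
print('copy:', ordered_copy)
print('original after sorted():', measurements)

# list.sort() reorders the list in place and returns None.
result = measurements.sort()
print('original after sort():', measurements)
print('return value of sort():', result)
```

Because `sort()` returns `None`, writing `measurements = measurements.sort()` would throw the list away, which is a common mistake.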

In [5]:
# Be careful when modifying data in place. If two variables refer to the same list,
# and you modify a value in that list, the change is visible through both variables!

salsa = ['peppers', 'onions', 'cilantro', 'tomatoes']
my_salsa = salsa  # my_salsa and salsa point to the *same* list data in memory
salsa[0] = 'hot peppers'
print('Ingredients in my salsa: ', my_salsa)

Ingredients in my salsa:  ['hot peppers', 'onions', 'cilantro', 'tomatoes']


In [6]:
# to make variables with mutable values independent of each other,
# we need to make a copy of the value when we assign it

student_2021 = ['rudy', 'moses', 'ricky', 'thumper']
student_2022 = list(student_2021) # makes a *copy* of the list

student_2022[0] = 'aiden'
print('student class 2021:', student_2021)
print('student class 2022:', student_2022)

student class 2021: ['rudy', 'moses', 'ricky', 'thumper']
student class 2022: ['aiden', 'moses', 'ricky', 'thumper']


Modifying data in place is a common pitfall in programming, because code that does it can be very difficult to understand. However, when you only make a small change, it is often more efficient than making a copy of the data and modifying the copy.  
  
You should consider both of these aspects.
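
When it is unclear whether two names share one list, the `is` operator can tell us. The following is a small sketch using invented names (`alias`, `clone`):

```python
salsa = ['peppers', 'onions', 'cilantro', 'tomatoes']
alias = salsa            # a second name for the same list
clone = list(salsa)      # an independent copy with equal content

print('alias is salsa:', alias is salsa)    # same object in memory
print('clone is salsa:', clone is salsa)    # different object
print('clone == salsa:', clone == salsa)    # but the contents are equal
```

`is` compares identity (the same object), while `==` compares content.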

### Nested Lists

In [7]:
x = [
    ['pepper', 'zucchini', 'onion'],
    ['cabbage', 'lettuce', 'garlic'],
    ['apple', 'pear', 'banana']
    ]

print(x)

[['pepper', 'zucchini', 'onion'], ['cabbage', 'lettuce', 'garlic'], ['apple', 'pear', 'banana']]


To help understand how indexing works in a nested list, see this image, which uses pepper in a bottle as an analogy.

![Pepper Analogy](img/indexing_lists_python.png)

In [8]:
print([x[0]])

# prints a new outer list that contains the first inner list and its content

[['pepper', 'zucchini', 'onion']]


In [9]:
print(x[0])

# prints the first inner list and its content only

['pepper', 'zucchini', 'onion']


In [10]:
print(x[0][0])

# prints the first item of the first inner list

pepper
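
The three cells above can be summarised in one short sketch: each extra index goes one level deeper. The names `first_row`, `first_item` and `last_item` are chosen here for illustration only.

```python
x = [['pepper', 'zucchini', 'onion'],
     ['cabbage', 'lettuce', 'garlic'],
     ['apple', 'pear', 'banana']]

first_row = x[0]         # one index: an inner list
first_item = x[0][0]     # two indices: a single string
last_item = x[-1][-1]    # negative indices work at every level

print(type(first_row).__name__, first_row)
print(type(first_item).__name__, first_item)
print('last item:', last_item)
```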


### Heterogeneous List

In [11]:
sample_age = [10, 12.5, 'unknown']

print(sample_age)

[10, 12.5, 'unknown']


### Modifying List

In [12]:
print(odds)

[1, 3, 5, 7]


In [13]:
# append

odds.append(11)
print('odds after adding a value:', odds)

odds after adding a value: [1, 3, 5, 7, 11]


In [14]:
# remove an element
# and store the removed element in a new variable

removed_elem = odds.pop(0)
print('odds after removing first element', odds)
print('removed_element:', removed_elem)

odds after removing first element [3, 5, 7, 11]
removed_element: 1


In [16]:
# reverse the order

odds.reverse()
print('odds after reversing', odds)

odds after reversing [11, 7, 5, 3]


When modifying lists in place, remember that Python treats lists in a slightly counter-intuitive way.  

If we make a list, (attempt to) copy it, and then modify the copy, we can cause all sorts of trouble. Consider this example:

In [17]:
odds = [1, 3, 5, 7]
primes = odds
primes.append(2)
print('primes:', primes)
print('odds:', odds)

primes: [1, 3, 5, 7, 2]
odds: [1, 3, 5, 7, 2]


Note that when we call `append(2)` on `primes`, `odds` changes as well. This is because Python stores the list once in memory and **can then use multiple names to refer to the same list**. If we want to copy a (simple) list, we can again use the `list()` function, so that we do not modify a list we did not mean to change.

In [20]:
odds = [1, 3, 5, 7]
primes = list(odds) # copying the list
primes.append(2)
print('primes:', primes)
print('odds:', odds)

primes: [1, 3, 5, 7, 2]
odds: [1, 3, 5, 7]
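
The word "simple" matters: `list()` copies only the outer list. For a nested list, the inner lists are still shared. The sketch below is an illustration with an invented `shelves` list and uses `copy.deepcopy` from the standard library, which the lesson itself does not cover.

```python
import copy

shelves = [['pepper', 'onion'], ['apple', 'pear']]

shallow = list(shelves)          # new outer list, same inner lists
deep = copy.deepcopy(shelves)    # new outer list and new inner lists

shelves[0][0] = 'hot pepper'     # modify an inner list in place

print('original:', shelves)
print('shallow: ', shallow)      # sees the change through the shared inner list
print('deep:    ', deep)         # unaffected
```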


In [26]:
# subsets of lists and strings can be accessed by specifying a range of values in brackets,
# similar to how we accessed ranges of positions in a NumPy array.
# this is commonly referred to as 'slicing' the list/string

# slicing strings

binomial_name = 'Drosophila melanogaster'

group = binomial_name[0:10] #slice subset 0 to 10
print('group:', group)

species = binomial_name[11:23] #slice subset 11:23
print('species:', species)

group: Drosophila
species: melanogaster


In [25]:
# slicing lists

chromosomes = ['X', 'Y', '2', '3', '4']
autosomes = chromosomes[2:5]
print('autosomes:', autosomes)

last=chromosomes[-1]
print('last:', last)

autosomes: ['2', '3', '4']
last: 4


Assignment: Use slicing to access only the last four characters of a string or the last four entries of a list.

In [45]:
string_for_slicing = 'Observation date: 02-Feb-2013'
list_for_slicing= [['fluorine', 'F'],
                   ['chlorine', 'Cl'],
                   ['bromine', 'Br'],
                   ['iodine', 'I'],
                   ['astatine', 'At']]

In [43]:
# slice 2013 from string_for_slicing

slc1 = string_for_slicing[-4:]
print(slc1)

2013


In [54]:
# slice the last 4 entries from list_for_slicing

slc2 = list_for_slicing[-4:]
print(slc2)

[['chlorine', 'Cl'], ['bromine', 'Br'], ['iodine', 'I'], ['astatine', 'At']]


In [60]:
# Non-continuous slices
# Adding a step to the slice: [begin:end:step]

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
subset = primes[0:12:3]

print('subset:', subset)

subset: [2, 7, 17, 29]


In [61]:
# Slicing from the 3rd entry.

subset2 = primes[2:12:3]
print(subset2)

[5, 13, 23, 37]
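
One way to read `[begin:end:step]` is that it selects the same positions that `range(begin, end, step)` would produce. The following sketch checks this for the slice above; the names `positions`, `picked_by_loop` and `picked_by_slice` are made up for the illustration.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

# primes[2:12:3] picks the positions produced by range(2, 12, 3).
positions = list(range(2, 12, 3))
picked_by_loop = [primes[i] for i in positions]
picked_by_slice = primes[2:12:3]

print('positions:', positions)
print('by loop:  ', picked_by_loop)
print('by slice: ', picked_by_slice)
```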


Assignment: use a non-continuous slice (with a step) on this string to create a new string.

In [63]:
beatles = "In an octopus's garden in the shade"
slc3 = beatles[0:35:2]
print(slc3)

I notpssgre ntesae


In [64]:
# If the slice begins at the start of the string/list, we can omit the first index

date = "Monday 4 January 2016"
day = date[0:6]
print("Using 0 to begin range:", day)
day = date[:6]
print("Omit the beginning:", day)

Using 0 to begin range: Monday
Omit the beginning: Monday


In [65]:
# We can also omit the ending index if the slice runs to the very end of the sequence

months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
sond = months[8:12]
print("With known last position:", sond)
sond = months[8:len(months)]
print("With len() to get last entry", sond)
sond = months[8:]
print("Omitting the last index:", sond)

With known last position: ['sep', 'oct', 'nov', 'dec']
With len() to get last entry ['sep', 'oct', 'nov', 'dec']
Omitting the last index: ['sep', 'oct', 'nov', 'dec']


In [67]:
# Overloading
# + on lists or strings means concatenation, so multiplying a list by an integer
# concatenates the list with itself that many times.

counts = [2,4,6,8,10]
repeats = counts*2
print(repeats)

# equivalent to counts + counts

[2, 4, 6, 8, 10, 2, 4, 6, 8, 10]
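
As a short sketch, the same two operators behave the same way on strings. The names `doubled`, `added` and `word` are illustrative only.

```python
counts = [2, 4, 6, 8, 10]

doubled = counts * 2          # repetition
added = counts + counts       # concatenation
print('same result:', doubled == added)
print('counts is unchanged:', counts)

word = 'ab'
print(word + 'cd')            # concatenating two strings
print(word * 3)               # repeating a string three times
```

Both `+` and `*` build a new list or string here; the originals are left as they were.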
