# Using Lists

## Creating a list
Here are a few ways of creating a list:

In [4]:
a_list = list() # empty list!
print(a_list)
a_list = [] # Similar. Note the use of square brackets
print(a_list)
a_list = ['apple', 'bananas', 'carrots']
print(a_list)
a_list = [1, "apples", True] # heterogeneous types are OK!
print(a_list)

[]
[]
['apple', 'bananas', 'carrots']
[1, 'apples', True]


## Adding and removing items

In [7]:
a = ['a', 'b', 'c']
a.append('d')
a

['a', 'b', 'c', 'd']

In [9]:
a.insert(2, 'E')
a

['a', 'b', 'E', 'c', 'd']

In [18]:
a.pop()

'd'

In [19]:
print(a)

['a', 'b', 'E', 'c']
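
Lists offer a few other ways of adding and removing items besides `append()`, `insert()` and `pop()`. The following is a minimal sketch using a made-up list called `letters` (it is not part of the cells above); it illustrates `extend()`, `remove()`, `pop()` with an index, and the `del` statement:

```python
letters = ['a', 'b', 'c']    # made-up example data

letters.extend(['d', 'e'])   # add every item of another list at the end
print(letters)               # ['a', 'b', 'c', 'd', 'e']

letters.remove('b')          # remove the first item equal to 'b'
print(letters)               # ['a', 'c', 'd', 'e']

first = letters.pop(0)       # pop() accepts an index; here the first item
print(first, letters)        # a ['c', 'd', 'e']

del letters[-1]              # delete by index without returning the item
print(letters)               # ['c', 'd']
```

Note that `remove()` searches by value while `pop()` and `del` work by position.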


## for loop

We have previously seen how to write for loops that iterate over the items of a list:

In [13]:
number_list = [1, 2, 3]
for number in number_list:
    print(number*2)

2
4
6


### List comprehension

With list comprehensions, we can directly create a list out of the elements returned by a for loop, with the following syntax:

In [14]:
doubled_list = [number*2 for number in number_list]
doubled_list

[2, 4, 6]
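
A list comprehension can also filter items by adding an `if` clause at the end. The sketch below uses its own example list and the made-up names `even_squares` and `even_squares_loop`; it shows a filtered comprehension next to the explicit loop it stands for:

```python
number_list = [1, 2, 3, 4, 5, 6]   # made-up example data

# keep only the even numbers, then square them
even_squares = [number**2 for number in number_list if number % 2 == 0]
print(even_squares)  # [4, 16, 36]

# the equivalent explicit loop
even_squares_loop = []
for number in number_list:
    if number % 2 == 0:
        even_squares_loop.append(number**2)
print(even_squares_loop == even_squares)  # True
```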

## Accessing items

Note that in Python, the first element has index 0! This is a common convention in programming (C, Java and Rust all use zero-based indexing), although some languages, such as Fortran, start at 1 by default.

In [7]:
a = ['a', 'b', 'c', 'd']
a[1]

'b'

We can also access the elements starting from the end, using a minus sign: -1 accesses the last element, -2 the second-to-last, and so on.

In [8]:
a[-1]

'd'

How do we access the third-to-last element?

In [13]:
a[2]   # 'c' (not displayed: a cell only shows its last expression)
a[-3]  # 'b', the third-to-last element

'b'
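
To make the link between positive and negative indexes concrete, here is a small sketch. It reuses the same four-letter list and shows that index `-k` refers to the same item as `len(a) - k`, and that an index beyond the end of the list raises an `IndexError`. The variable name `message` is only for illustration:

```python
a = ['a', 'b', 'c', 'd']

# a negative index -k refers to the same item as len(a) - k
print(a[-3] == a[len(a) - 3])  # True

# an index outside the list raises IndexError
try:
    a[4]
except IndexError as err:
    message = str(err)
    print('IndexError:', message)
```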

## List slicing
Here are several ways of slicing a single list. Try changing the indexes to see what effect it has!

In [16]:
original = [1, 2, 3, 4, 5, 6, 7, 8]
start = original[:4]
end = original[3:]
middle = original[2:5]
alternate = original[::2]
reverse = original[::-1]
print('Start:', start)
print('End:', end)
print('Middle:', middle)
print('Alternate:', alternate)
print('Reverse:', reverse)

Start: [1, 2, 3, 4]
End: [4, 5, 6, 7, 8]
Middle: [3, 4, 5]
Alternate: [1, 3, 5, 7]
Reverse: [8, 7, 6, 5, 4, 3, 2, 1]
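
The rules behind these slices can be summarised in a short sketch: the start index is included, the stop index is excluded, and the optional third value is the step. The example below also shows that a slice is a new list and that out-of-range bounds are tolerated. The name `copy_of_original` is made up for this illustration:

```python
original = [1, 2, 3, 4, 5, 6, 7, 8]

# start is included, stop is excluded: indexes 2, 3 and 4
print(original[2:5])      # [3, 4, 5]

# a slice is a new list, so changing it leaves the original untouched
copy_of_original = original[:]
copy_of_original[0] = 100
print(original[0])        # 1

# unlike single indexes, out-of-range slice bounds do not raise an error
print(original[6:100])    # [7, 8]

# the third value is the step: every third item, starting at index 1
print(original[1::3])     # [2, 5, 8]
```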


### Operators

When you have a collection (a list, tuple, set, dictionary, or other container type), you may be interested in testing whether a specific value is in that collection.
Rather than making you code it yourself, Python provides the `in` operator.

In [20]:
2 in original

True

Using `not in`, you can also check that a value is not in the list:

In [21]:
"a" not in original

True

The `+` operator concatenates two lists into a new list:

In [22]:
original + reverse

[1, 2, 3, 4, 5, 6, 7, 8, 8, 7, 6, 5, 4, 3, 2, 1]
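
A few details of these operators are easy to miss, so here is a hedged sketch using small made-up values (the dictionary `ages` is purely hypothetical). It shows that `+` leaves both operands unchanged, that `*` repeats a list, and that `in` applied to a dictionary tests the keys rather than the values:

```python
original = [1, 2, 3]

# + builds a new list; neither operand is modified
combined = original + [4, 5]
print(combined)   # [1, 2, 3, 4, 5]
print(original)   # [1, 2, 3]

# * repeats a list
zeros = [0] * 3
print(zeros)      # [0, 0, 0]

# for a dictionary, `in` tests the keys, not the values
ages = {'alice': 31, 'bob': 27}   # hypothetical example data
print('alice' in ages)            # True
print(31 in ages)                 # False
```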

## List operation examples
The aim of this section of the Notebook is to give examples of how many of the operators, functions and methods can be used in practice.

First **create a list** from a file containing numbers: 

In [37]:
numbers = []
with open('../data/numbers.txt', 'r') as f:
    numbers = f.readlines()
numbers = [float(n) for n in numbers]
print(numbers)

[1.63, -32.78, -4.1, 25.307, 8.0, 31.33333, 780.4592, -422.343, 87.612, 928.7, -187.0, 1153.04, 4.2, 0.932, 5.65, -8205.9, -2749.655, 5.912, 2347.105, 39.2, 61.5]


Next apply **numeric functions** that work on sequences (including lists):

In [25]:
print('Minimum:', min(numbers))
print('Maximum:', max(numbers))
print('Sum:', sum(numbers))

Minimum: -8205.9
Maximum: 2347.105
Sum: -6121.19747


Next use **enumerate()** to iterate over the list and modify its contents: 

In [53]:
# replace any negative numbers with zeros
for i, n in enumerate(numbers):
    if n < 0:
        numbers[i] = 0
print(numbers)

[1.63, 0, 0, 25.307, 8.0, 31.33333, 780.4592, 0, 87.612, 928.7, 0, 1153.04, 4.2, 0.932, 5.65, 0, 0, 5.912, 2347.105, 39.2, 61.5]
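
When `enumerate()` is used to index back into the list, the count must start at 0 (the default). The `start` argument only changes the number the counter begins at; it does not skip items. A small sketch with a made-up list `fruits` illustrates the difference:

```python
fruits = ['apple', 'banana', 'cherry']   # made-up example data

# default: the count starts at 0, so it matches the list index
pairs_default = list(enumerate(fruits))
print(pairs_default)  # [(0, 'apple'), (1, 'banana'), (2, 'cherry')]

# start=1: handy for human-readable numbering, but the count
# no longer matches the index (fruits[3] would raise IndexError)
pairs_from_one = list(enumerate(fruits, start=1))
for position, fruit in pairs_from_one:
    print(position, fruit)
```

So a non-zero `start` is best kept for display purposes, not for assignments such as `fruits[i] = ...`.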


In [43]:
help(enumerate)

Help on class enumerate in module builtins:

class enumerate(object)
 |  enumerate(iterable, start=0)
 |  
 |  Return an enumerate object.
 |  
 |    iterable
 |      an object supporting iteration
 |  
 |  The enumerate object yields pairs containing a count (from start, which
 |  defaults to zero) and a value yielded by the iterable argument.
 |  
 |  enumerate is useful for obtaining an indexed list:
 |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.



Next: Use the **insert()** method. 

**Question:** Why am I using floor division here? (*Hint:* try it with regular division to see why!)

In [56]:
# insert a star as middle element of list
middle_index = len(numbers) // 2
numbers.insert(middle_index, '*')
print(numbers)

[1.63, 0, 0, 25.307, 8.0, 31.33333, 780.4592, 0, 87.612, 928.7, '*', 0, 1153.04, 4.2, 0.932, 5.65, 0, 0, 5.912, 2347.105, 39.2, 61.5]
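
If you want to check your answer to the question above, the following sketch compares the two kinds of division on a small made-up list. It assumes nothing beyond built-in behaviour: `/` always returns a float, and `list.insert()` only accepts an integer index.

```python
numbers = [10, 20, 30, 40, 50]   # hypothetical example data

print(len(numbers) / 2)    # 2.5  (true division always gives a float)
print(len(numbers) // 2)   # 2    (floor division of two ints gives an int)

# list.insert() requires an integer index, so a float is rejected
try:
    numbers.insert(len(numbers) / 2, '*')
except TypeError as err:
    print('TypeError:', err)

numbers.insert(len(numbers) // 2, '*')
print(numbers)             # [10, 20, '*', 30, 40, 50]
```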


Finally, make a **reversed copy** of the list and **concatenate** the two lists together:

In [27]:
# create a reverse copy of the list and add it to the end
reversed_numbers = list(reversed(numbers))
numbers = numbers + reversed_numbers
print(numbers)

[1.63, 0, 0, 25.307, 8.0, 31.33333, 780.4592, 0, 87.612, 928.7, '*', 0, 1153.04, 4.2, 0.932, 5.65, 0, 0, 5.912, 2347.105, 39.2, 61.5, 61.5, 39.2, 2347.105, 5.912, 0, 0, 5.65, 0.932, 4.2, 1153.04, 0, '*', 928.7, 87.612, 0, 780.4592, 31.33333, 8.0, 25.307, 0, 0, 1.63]
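
The call to `list()` around `reversed()` matters because `reversed()` returns an iterator, not a list. The sketch below, using a made-up list `values`, contrasts three ways of reversing: `reversed()`, slicing with `[::-1]`, and the in-place `list.reverse()` method:

```python
values = [1, 2, 3]           # hypothetical example data

# reversed() returns an iterator, which can only be consumed once
rev_iter = reversed(values)
print(isinstance(rev_iter, list))   # False
first_pass = list(rev_iter)
second_pass = list(rev_iter)
print(first_pass)                   # [3, 2, 1]
print(second_pass)                  # [] (the iterator is now exhausted)

# slicing gives a reversed copy directly
sliced = values[::-1]
print(sliced)                       # [3, 2, 1]

# list.reverse() reverses the list in place and returns None
values.reverse()
print(values)                       # [3, 2, 1]
```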


In [57]:
help(reversed)

Help on class reversed in module builtins:

class reversed(object)
 |  reversed(sequence, /)
 |  
 |  Return a reverse iterator over the values of the given sequence.
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __length_hint__(...)
 |      Private method returning an estimate of len(list(it)).
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  __setstate__(...)
 |      Set state information for unpickling.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.



## Comparing lists
The following examples use two files, `words1.txt` and `words2.txt`; neither file contains any duplicates (you can rely on that property in what follows). Let's examine their contents before we start. This can be done at the Linux command line as follows:

In [62]:
%%bash
echo "words1.txt:"
cat ../data/words1.txt
echo
echo "words2.txt:"
cat ../data/words2.txt

words1.txt:
flycatcher
gnatcatcher
bee-eater
kingfisher

words2.txt:
flycatcher
warbler
kingfisher
thrush
sparrow
hawk


The following script uses iteration and the `list.append()` method to answer the question: How many words are in both `words1.txt` and  `words2.txt`? 

In [67]:
words1 = []
with open('../data/words1.txt', 'r') as f:
    words1 = f.read().splitlines()

words2 = []
with open('../data/words2.txt', 'r') as f:
    words2 = f.read().splitlines()

# create a list containing words in both lists
# and print out its length
words_in_both = []
for w in words1:
    if w in words2:
        words_in_both.append(w)
print(len(words_in_both))

2
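
The same loop can be written as a list comprehension with an `if` clause. This is only an alternative sketch, not a replacement for the script above; the two lists are written out by hand here (copied from the file contents shown earlier) so that the example runs without the data files:

```python
# the same words as in the two files, typed in directly
words1 = ['flycatcher', 'gnatcatcher', 'bee-eater', 'kingfisher']
words2 = ['flycatcher', 'warbler', 'kingfisher', 'thrush', 'sparrow', 'hawk']

# keep each word of words1 that also occurs in words2
words_in_both = [w for w in words1 if w in words2]
print(words_in_both)       # ['flycatcher', 'kingfisher']
print(len(words_in_both))  # 2
```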


In [66]:
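# For comparison: readlines() keeps the trailing newline on every line,
# which is why the script above uses read().splitlines() instead
with open('../data/words1.txt', 'r') as f:
    raw_lines = f.readlines()
print(raw_lines)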

['flycatcher\n', 'gnatcatcher\n', 'bee-eater\n', 'kingfisher\n']


In the cell below, write a script that:
- prints out the words exclusive to list `words2.txt`
- prints out the total number of unique words in lists `words1.txt` and `words2.txt` (remembering that neither file on its own contains any duplicates). 

In [108]:
# NOTE: lists words1 and words2 are still available from the
# previous script (i.e. you don't need to reopen the files)
words2_exclusive = []

for w in words2:
    if w not in words1:
        words2_exclusive.append(w)
print(words2_exclusive)

# Neither list contains duplicates, so the unique words are all of
# words1 plus the words that appear only in words2
words_all = words1 + words2_exclusive
print(words_all)
print("There are", len(words_all), "unique words")

['warbler', 'thrush', 'sparrow', 'hawk']
['flycatcher', 'gnatcatcher', 'bee-eater', 'kingfisher', 'warbler', 'thrush', 'sparrow', 'hawk']
There are 8 unique words
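
Because neither file contains duplicates, Python's built-in `set` type offers another way of answering these questions. The following is a sketch under that assumption, with the words typed in by hand so it runs without the data files. The names `set1`, `set2`, `only_in_words2` and `distinct_count` are made up for this illustration. Sets are unordered, so `sorted()` is used to get a predictable printout:

```python
words1 = ['flycatcher', 'gnatcatcher', 'bee-eater', 'kingfisher']
words2 = ['flycatcher', 'warbler', 'kingfisher', 'thrush', 'sparrow', 'hawk']

set1 = set(words1)
set2 = set(words2)

# difference: words in words2 that are not in words1
only_in_words2 = sorted(set2 - set1)
print(only_in_words2)   # ['hawk', 'sparrow', 'thrush', 'warbler']

# union: every distinct word across both lists, counted once
distinct_count = len(set1 | set2)
print(distinct_count)   # 8

# intersection: words present in both lists
print(sorted(set1 & set2))   # ['flycatcher', 'kingfisher']
```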


Extend the following script so that it removes any item from list `words2` that is in list `words1` and prints out `words2`. *Hint:* use the `list.remove()` method.

In [110]:
# NOTE: Assuming you didn't modify either list words1 or words2,
# you can use them again here (i.e. no need to reopen the files)

# Iterate over a copy so that removing items does not skip elements
for w in words2.copy():
    if w in words1:
        words2.remove(w)
print(words2)

['warbler', 'thrush', 'sparrow', 'hawk']
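
A word of caution when combining a `for` loop with `list.remove()`: removing items from the list you are currently looping over can silently skip elements. The sketch below uses made-up lists (`items`, `items_safe`, `kept`) to show the problem and two common ways around it:

```python
items = [1, 2, 2, 3]   # hypothetical example data

# removing from the list being looped over skips the item that
# slides into the freed position
for x in items:
    if x == 2:
        items.remove(x)
print(items)        # [1, 2, 3]  (one 2 was skipped)

# option 1: loop over a copy, remove from the original
items_safe = [1, 2, 2, 3]
for x in items_safe.copy():
    if x == 2:
        items_safe.remove(x)
print(items_safe)   # [1, 3]

# option 2: build a new list with a comprehension
kept = [x for x in [1, 2, 2, 3] if x != 2]
print(kept)         # [1, 3]
```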


## Sorting lists
Without the conversion to `int`, the following code would sort the items as strings, that is, *not* numerically (hence `'82'` would come before `'9'` in ascending order).

The code has been modified so that it does a **numerical** sort of the list in **descending order**:

In [116]:
numbers = []
with open('../data/integers.txt', 'r') as f:
    numbers = f.read().splitlines()

# Convert the strings to integers so that the sort is numerical
numbers = [int(n) for n in numbers]

# There are two sorting alternatives:
# a) sorted() returns a new sorted list; numbers itself is unchanged
print('Using sorted():\n', sorted(numbers, reverse=True))
# b) list.sort() sorts numbers in place (and returns None)
numbers.sort(reverse=True)
print('Using list.sort():\n', numbers)

Using sorted():
 [92, 82, 41, 34, 25, 17, 13, 11, 9, 8, 8, 7, 5, 5, 1, 0, 0, 0, 0, 0, 0, 0, -3, -4, -32]
Using list.sort():
 [92, 82, 41, 34, 25, 17, 13, 11, 9, 8, 8, 7, 5, 5, 1, 0, 0, 0, 0, 0, 0, 0, -3, -4, -32]
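
To see the difference between sorting strings and sorting numbers without needing the data file, here is a small sketch on a made-up list `raw`. It also shows the `key` argument, which lets you sort the strings by their numeric value without converting the list itself:

```python
raw = ['9', '82', '-3', '11']   # hypothetical example data

# strings are compared character by character
as_strings = sorted(raw)
print(as_strings)    # ['-3', '11', '82', '9']

# key=int compares the items by their integer value;
# the items in the result are still strings
as_numbers = sorted(raw, key=int)
print(as_numbers)    # ['-3', '9', '11', '82']

# reverse=True gives descending order
descending = sorted(raw, key=int, reverse=True)
print(descending)    # ['82', '11', '9', '-3']
```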


In [94]:
help(list.sort)

Help on method_descriptor:

sort(self, /, *, key=None, reverse=False)
    Sort the list in ascending order and return None.
    
    The sort is in-place (i.e. the list itself is modified) and stable (i.e. the
    order of two equal elements is maintained).
    
    If a key function is given, apply it once to each list item and sort them,
    ascending or descending, according to their function values.
    
    The reverse flag can be set to sort in descending order.



In [112]:
help(sorted)

Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.

