<a name="top"></a>
# Introduction to Python Programming for Bioinformatics. Lesson 3

<details>
<summary>
About this notebook
</summary>

This notebook was originally written by [Marc Cohen](https://github.com/mco-gh), an engineer at Google. The original source can be found on [Marc's short link service](https://mco.fyi/), starting with [Python lesson 0](https://mco.fyi/py0). I encourage you to work through that notebook if you find some details missing here.

Rob Edwards edited the notebook, adapted it for bioinformatics using some simple genetics examples, condensed it into a single notebook, and rearranged some of the lessons. If some of it does not make sense, it is Rob's fault!

It is intended as a hands-on companion to an in-person course, and if you would like Rob to teach this course (or one of the other courses) don't hesitate to get in touch with him.

</details>
<details>
<summary>
Using this notebook
</summary>

You can download the original version of this notebook from [GitHub](https://linsalrob.github.io/ComputationalGenomicsManual/Python/Python_Lesson_03.ipynb) and from [Rob's Google Drive]()

**You should make your own copy of this notebook by selecting File->Save a copy in Drive from the menu bar above, and then you can edit the code and run it as your own**

There are several lessons, and you can do them in any order. I've tried to organise them in the order I think most appropriate, but you may disagree!
</details>


<a name="lessons"></a>

# Lesson Links

* [Lesson 3 Lists](#Lesson-3-Lists)
  * [Lists](#Lists)
  * [Creating Lists](#Creating-Lists)
  * [Sets](#Sets)
  * [List Operations](#List-Operations)
  * [Nested Lists](#Nested-Lists)

Previous Lesson: [Local](Python_Lesson_02.ipynb) | [GitHub](https://linsalrob.github.io/ComputationalGenomicsManual/Python/Python_Lesson_02.ipynb) | [Google Colab](https://colab.research.google.com/drive/1Sm7N8Agf0aFj6qbd6GwenGBaqchZYTug)

Next Lesson: [Local](Python_Lesson_04.ipynb) | [GitHub](https://linsalrob.github.io/ComputationalGenomicsManual/Python/Python_Lesson_04.ipynb) | [Google Colab](https://colab.research.google.com/drive/1IyjNTpdtwaulP_QXbrCYK8Yd2J3MgCQo)


# Lesson 3 Lists

**Lists and Sets**


# Lists

* A list is a collection of things, like a shopping list.
* Lists are ordered sequences.
* All the sequence operations you learned about with strings, like `len`, indexing, slicing, looping, `in`, etc. apply to lists as well.

Lists are defined inside square brackets, with list elements separated by commas, for example...
```
['a', 'b', 'c', 1, 2, 3]
```

Note that you can have both strings and numbers inside lists.
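
As a small sketch of that point, the cell below loops over the example list and records the type of each element. The variable names `mixed` and `type_names` are made up for this illustration.

```python
# The example list from above: three strings followed by three integers
mixed = ['a', 'b', 'c', 1, 2, 3]

# Record the name of each element's type as we walk through the list
type_names = []
for item in mixed:
    type_names.append(type(item).__name__)

print(type_names)  # ['str', 'str', 'str', 'int', 'int', 'int']
```

Python does not mind what you put in a list, although in practice most lists hold one kind of thing.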

[List documentation](https://docs.python.org/3/library/stdtypes.html#list)

## Creating Lists

In [None]:
# Create an empty list (lists use square brackets)
li = []
print(f"empty list: {li}")

In [None]:
# Create and initialize a list with some data
li = ['Bacteria', 4500000, 3.14, True]
print(f"non-empty list: {li}")

In [None]:
# the same value can occur multiple times in a list
li = ['a', 'a', 'a']
print(li)

## Sets

Sets are like lists except for two key things:
* Sets are not ordered! The order that you get things back is not necessarily the same as the order that you put them in.
* Everything in a set is unique.

In [None]:
example_set = set()
print(f"empty set: {example_set}")
example_set.add('a')
example_set.add('a')
example_set.add('a')
print(f"non-empty set: {example_set}")
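
One common use of sets is removing duplicates from a list. The sketch below assumes a made-up list of chromosome names called `hits`; passing it to `set()` keeps one copy of each value. Because sets are not ordered, we use `sorted()` to get the names back in a predictable order.

```python
# A hypothetical list of chromosome names with repeats
hits = ["Chr1", "Chr2", "Chr1", "Chr3", "Chr2", "Chr1"]

# Converting the list to a set keeps only the unique values
unique_hits = set(hits)

print(len(hits))            # 6
print(len(unique_hits))     # 3
print(sorted(unique_hits))  # ['Chr1', 'Chr2', 'Chr3']
```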

## List Operations

You can change and edit lists, and add things to them. (We call this mutable, but don't worry about that).

In [None]:
# The len() function gives us the size of a list.
li = ["Chr1", "Chr2", "Chr3"]
# get the size of a list
list_size = len(li)
print(list_size)

In [None]:
# range(7, 10) counts from 7 up to, but not including, 10
for i in range(7, 10):
  print(i)

In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
# iterate (loop) over the elements in a list
for i in range(len(li)):
  print(li[i])

In [None]:
# A better way to iterate over the elements in a list
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
for i in li:
  print(i)

In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
for i, j in enumerate(li):
  print(f"The element at position {i} is {j}")
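
`enumerate` starts counting at zero, just like list indexes. If you would rather count from one, you can pass the optional `start` argument. This is a short sketch using a three-element list; the names `number`, `name`, and `labels` are just for illustration.

```python
li = ["Chr1", "Chr2", "Chr3"]

# start=1 makes enumerate count 1, 2, 3 instead of 0, 1, 2
labels = []
for number, name in enumerate(li, start=1):
    label = f"Chromosome number {number} is {name}"
    labels.append(label)
    print(label)
```

Note that `start` only changes the counter. The list indexes themselves still begin at zero.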

In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
# test membership in a list

x = "ChrX"
if x not in li:
  print(f"{x} is not in the list")
else:
  print(f"{x} is in the list")

In [None]:
# indexing (list indexes start with zero!)
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
print(li[2])

In [None]:
# indexing out of bounds raises a runtime error (an IndexError)
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
print(li[99])
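
One way to avoid that error is to compare the index with the size of the list before using it. This is a minimal sketch, assuming the index is not negative; `index` and `value` are names chosen for this example.

```python
li = ["Chr1", "Chr2", "Chr3"]
index = 99  # hypothetical position we want to look up

# Only index into the list when the position exists.
# This simple check assumes index is not negative.
if index < len(li):
    value = li[index]
    print(value)
else:
    value = None
    print(f"Index {index} is out of range for a list of size {len(li)}")
```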


In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12", "Chr13", "Chr14", "Chr15", "Chr16", "Chr17", "Chr18", "Chr19", "Chr20", "Chr21", "Chr22", "Chr23"]
# slicing
print(li[1:3])
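
Slices work on lists the same way they do on strings: the start is included, the end is not, and either can be left out. The sketch below uses a shorter list so the results are easy to check by eye.

```python
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr5"]

first_two = li[:2]    # from the beginning up to, but not including, index 2
from_third = li[2:]   # from index 2 to the end
last_one = li[-1]     # negative indexes count back from the end
last_two = li[-2:]    # the final two elements

print(first_two)   # ['Chr1', 'Chr2']
print(from_third)  # ['Chr3', 'Chr4', 'Chr5']
print(last_one)    # Chr5
print(last_two)    # ['Chr4', 'Chr5']
```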

In [None]:
# concatenating lists
li1 = ["Chr1", "Chr2", "Chr3"]
li2 = ["Chr4", "Chr5", "Chr6"]
li3 = ["Chr7", "Chr8", "Chr9"]
li4 = li1 + li2 + li3
print(li4)

In [None]:
# add an element
li = ["Chr1", "Chr2", "Chr3", "Chr4"]
print(li)
li.append("ChrX")
print(li)
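
`append` is handy for building a list inside a loop. As a sketch, the cell below creates the same 23 chromosome names used in the cells above without typing them all out. The variable name `chromosomes` is made up for this example.

```python
# Start with an empty list and add one name per pass through the loop
chromosomes = []
for number in range(1, 24):  # 1 up to, but not including, 24
    chromosomes.append(f"Chr{number}")

print(len(chromosomes))                 # 23
print(chromosomes[0], chromosomes[-1])  # Chr1 Chr23
```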

In [None]:
# this cell reuses li from the previous cell, which now has five elements
print(li)
# replace an element by index
li[4] = 'ChrY' # overwrites the value at index 4 ('ChrX')
print(li)

In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr1", "Chr2", "Chr3", "Chr1", "Chr2", "Chr1"]
# get the number of occurrences of a particular value
count = li.count("Chr1")
print(count)

In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4", "Chr1", "Chr2", "Chr3", "Chr1", "Chr2", "Chr1"]
# get the (first) index of a particular value
index = li.index("Chr1")
print(index)
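
Be aware that `index` raises a `ValueError` if the value is not in the list at all. One way to guard against that is to test membership with `in` first, as in this sketch. The names `target` and `position` are just for illustration.

```python
li = ["Chr1", "Chr2", "Chr3"]
target = "ChrX"

# Only ask for the index when we know the value is present
if target in li:
    position = li.index(target)
else:
    position = None

print(position)  # None, because ChrX is not in the list
```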

In [None]:
li = ["Chr1", "Chr2", "Chr3", "Chr4"]
# reverse a list
li.reverse()
print(li)
li.reverse()
print(li)
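
`reverse()` changes the list itself and gives nothing back. If you want to keep the original order as well, one option is a slice with a step of -1, which builds a reversed copy. A short sketch:

```python
li = ["Chr1", "Chr2", "Chr3", "Chr4"]

# A slice with step -1 makes a new list in reverse order
reversed_copy = li[::-1]

print(reversed_copy)  # ['Chr4', 'Chr3', 'Chr2', 'Chr1']
print(li)             # ['Chr1', 'Chr2', 'Chr3', 'Chr4'] (unchanged)
```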

## Nested Lists

We can have lists of lists, and lists of lists of lists.

- list of lists: ```[[1, 2], [3, 4]]```


Later we will look at spreadsheets that are two-dimensional, and you can imagine them being held as lists of lists. The outer list holds the rows of the spreadsheet, and each inner list holds the column values for one row, so every cell is identified by a row index and a column index.
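
Here is a small sketch of that spreadsheet idea. The gene names and lengths are invented purely for illustration, and `table` is a made-up variable name.

```python
# A hypothetical spreadsheet: each inner list is one row
# with the columns gene name, chromosome, and length
table = [
    ["geneA", "Chr1", 1200],
    ["geneB", "Chr2", 950],
    ["geneC", "Chr2", 2100],
]

second_row = table[1]   # the first index picks the row
length = table[1][2]    # the second index picks the column in that row

print(second_row)  # ['geneB', 'Chr2', 950]
print(length)      # 950

# Loop over the rows and print the gene name and length from each
for row in table:
    print(row[0], row[2])
```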

If you want to explore nested lists further, have a look at [Marc's Python lesson 5](https://mco.fyi/py5), which covers these concepts in more detail.


[Return to the lesson listing](#lessons)

[Return to the top of the notebook](#top)