## Comprehensions, Generators
### BIOINF 575 - Fall 2023

### For loop RECAP

### for: the repetitive control structure with a known number of steps

To loop through a sequence of elements is to iterate

```python
for var in sequence:
    statements
```

___ 

### Python Comprehension Statements
Courtesy of Marcurs Sherman - partly adapted

First, the **purpose** of comprehensions:
> "\[...\] comprehensions provide a more concise way to create \[iterables\] in situations where `map()` and `filter()` and/or nested loops would currently be used" - Barry Warsaw, [PEP 202](https://www.python.org/dev/peps/pep-0202/)

Comprehensions are what we call "_syntactic sugar_". 
This means that they do not do anything you could not have done already.     
But with them, some operations become easier to write.
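As a small illustration of the "syntactic sugar" point, the sketch below builds the same list twice: once with `map()` and `filter()`, and once with a comprehension. The `reads` list and the length cutoff of 4 are made-up values used only for this example.

```python
# Hypothetical example data (not from the lecture)
reads = ["ACGT", "AC", "GGGTTT", "A"]

# map/filter version: keep reads with at least 4 bases, then take their lengths
lengths_map = list(map(len, filter(lambda r: len(r) >= 4, reads)))

# comprehension version of the same computation
lengths_comp = [len(r) for r in reads if len(r) >= 4]

print(lengths_map)   # [4, 6]
print(lengths_comp)  # [4, 6]
```

Both versions give the same result; the comprehension simply states the expression, the loop, and the condition in one place.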

<img src="venn_diagram2.png" width=400 />

---
### Comprehension Syntax

#### Legend

<img src="legendary.png" width=250 />

#### Examples
<img src="comprehensions.png" width=500 />

#### Alternate view of the comprehension syntax

<center><img src="http://python-3-patterns-idioms-test.readthedocs.io/en/latest/_images/listComprehensions.gif" width = "500"/></center>

---
#### The Comprehension Categories
1. `list` comprehensions - create a list
2. `dict`ionary comprehensions - create dictionaries
3. `set` comprehensions - create sets
4. `tuple`? comprehensions

In [1]:
sequences = ["ACTTGCCC", "AAAGTC", "CCTAC", "AAACCTA"]

In [2]:
sequences

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA']

#### Basic list comprehension
* Compute simple expression for each element

In [3]:
len_lst = [len(seq) for seq in sequences]
len_lst

[8, 6, 5, 7]

In [6]:
len_lst = []
for seq in sequences:
    len_lst.append(len(seq))
    
len_lst

[8, 6, 5, 7]

#### List comprehension - use [ ]
* Compute complex expression for each element



In [10]:
# compute GC content (fraction of G and C bases)

gc_lst = [(seq.count("G") + seq.count("C"))/len(seq) for seq in sequences]
gc_lst

[0.625, 0.3333333333333333, 0.6, 0.2857142857142857]

#### List comprehension with predicate
* Compute complex expression for specific elements
    * add a predicate - an if expression 
    * if expression - similar to the if statement but with no statements after the header line
        * e.g.: if "#" not in item


In [11]:
# compute GC content only for sequences that contain "AC"

sequences

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA']

In [14]:
[(seq.count("G") + seq.count("C"))/len(seq) for seq in sequences if "AC" in seq]

[0.625, 0.6, 0.2857142857142857]

#### If the comprehension becomes too complex - use a regular for loop

In [None]:
# compute GC content only for sequences that contain "AC"

sequences

In [16]:
gc_lst = []
for seq in sequences:
    if "AC" in seq:
        gc_count = (seq.count("G") + seq.count("C"))/len(seq)
        gc_lst.append(gc_count)
        
gc_lst

[0.625, 0.6, 0.2857142857142857]

#### Set comprehensions - use { } 
* Use when you want unique elements and the order does not matter

In [18]:
# get the first codon in each sequence  

{seq[:3] for seq in sequences}

{'AAA', 'ACT', 'CCT'}

In [19]:
sequences

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA']

#### Dictionary comprehensions - use { }
* must start with something like: key_expression:value_expression
* Use when you want to look up values by key (key:value pairs); since Python 3.7 dictionaries also keep insertion order

In [21]:
# sequence as key, GC content as value
# {key:value for s in sequence}

{seq:(seq.count("G") + seq.count("C"))/len(seq) for seq in sequences}

{'ACTTGCCC': 0.625,
 'AAAGTC': 0.3333333333333333,
 'CCTAC': 0.6,
 'AAACCTA': 0.2857142857142857}

In [4]:
def compute_GC_count(seq):
    return (seq.count("G") + seq.count("C"))/len(seq)

In [5]:
{seq:compute_GC_count(seq) for seq in sequences}

{'ACTTGCCC': 0.625,
 'AAAGTC': 0.3333333333333333,
 'CCTAC': 0.6,
 'AAACCTA': 0.2857142857142857}

In [6]:
# ?open

#### <font color = "red">Exercise:</font>   

* Create a list comprehension that stores whether each sequence can code for the amino acid tyrosine (the codons TAT and TAC code for this amino acid).
* Change this into a dictionary comprehension where the key is "Seq pos", with pos the position of the sequence in the `sequences` list, and the value is the result of the check from the first point.


In [7]:
sequences

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA']

In [8]:
["TAC" in seq for seq in sequences]

[False, False, True, False]

In [9]:
["TAT" in seq for seq in sequences]

[False, False, False, False]

In [10]:
["TAC" in seq or "TAT" in seq for seq in sequences]

[False, False, True, False]

In [12]:
# dir(str)

In [13]:
# make sure it is a proper codon - it starts at index 0,3,6,9...

'AAAGTC'.index("TAC")

ValueError: substring not found

In [14]:
'AAAGTC'.find("TAC")

-1

In [20]:
(-2)%3

1

In [24]:
"AAATACGGG".find("TAC")

3

In [25]:
"AAATACGGG".find("TAC") % 3

0

In [23]:
"AAATACGGG".find("TAC") % 3 == 0

True

In [26]:
"CAAATACGGG".find("TAC") % 3 == 0

False

In [None]:
["TAC" in seq or "TAT" in seq for seq in sequences]

In [27]:
[type(seq) for seq in sequences]

[str, str, str, str]

In [28]:
[seq.index("TAC") for seq in sequences]

ValueError: substring not found

In [29]:
?str.index

Docstring:
S.index(sub[, start[, end]]) -> int

Return the lowest index in S where substring sub is found,
such that sub is contained within S[start:end].  Optional
arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.
Type:      method_descriptor


In [30]:
[seq.find("TAC") for seq in sequences]

[-1, -1, 2, -1]

In [31]:
?str.find

Docstring:
S.find(sub[, start[, end]]) -> int

Return the lowest index in S where substring sub is found,
such that sub is contained within S[start:end].  Optional
arguments start and end are interpreted as in slice notation.

Return -1 on failure.
Type:      method_descriptor


In [32]:
[seq.find("TAC")%3 for seq in sequences]

[2, 2, 2, 2]

In [33]:
[seq.find("TAC")%3 == 0 for seq in sequences]

[False, False, False, False]

In [34]:
[seq.find("TAC")%3 == 0 or seq.find("TAT")%3 == 0 for seq in sequences]

[False, False, False, False]

In [35]:
sequences

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA']

In [36]:
s2 = ['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA', "TATAAACCC"]

In [37]:
[seq.find("TAC")%3 == 0 or seq.find("TAT")%3 == 0 for seq in s2]

[False, False, False, False, True]

In [38]:
s2 = ['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA', "GGGTATAAACCC"]
[seq.find("TAC")%3 == 0 or seq.find("TAT")%3 == 0 for seq in s2]

[False, False, False, False, True]

In [40]:
{seq:seq for seq in sequences}

{'ACTTGCCC': 'ACTTGCCC',
 'AAAGTC': 'AAAGTC',
 'CCTAC': 'CCTAC',
 'AAACCTA': 'AAACCTA'}

In [41]:
{i for i in range(10)}

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [42]:
{"seq 1": 'ACTTGCCC'}

{'seq 1': 'ACTTGCCC'}

In [46]:
{"seq " + str(i) for i in range(len(sequences))}

{'seq 0', 'seq 1', 'seq 2', 'seq 3'}

In [47]:
{"seq " + str(i): sequences[i] for i in range(len(sequences))}

{'seq 0': 'ACTTGCCC', 'seq 1': 'AAAGTC', 'seq 2': 'CCTAC', 'seq 3': 'AAACCTA'}

In [48]:
{"seq " + str(i): sequences[i].find("TAC")%3 == 0 or sequences[i].find("TAT")%3 == 0 for i in range(len(sequences))}

{'seq 0': False, 'seq 1': False, 'seq 2': False, 'seq 3': False}

In [49]:
sequences

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA']

In [50]:
{"seq " + str(i): s2[i].find("TAC")%3 == 0 or s2[i].find("TAT")%3 == 0 for i in range(len(s2))}

{'seq 0': False, 'seq 1': False, 'seq 2': False, 'seq 3': False, 'seq 4': True}

In [51]:
s2

['ACTTGCCC', 'AAAGTC', 'CCTAC', 'AAACCTA', 'GGGTATAAACCC']

In [56]:
def codon_check(seq):
    return seq.find("TAC")%3 == 0 or seq.find("TAT")%3 == 0

In [57]:
{"seq " + str(i): codon_check(sequences[i]) for i in range(len(sequences))}

{'seq 0': False, 'seq 1': False, 'seq 2': False, 'seq 3': False}

In [58]:
list(enumerate(sequences))

[(0, 'ACTTGCCC'), (1, 'AAAGTC'), (2, 'CCTAC'), (3, 'AAACCTA')]

In [60]:
idx, val = (0, 'ACTTGCCC')
print(idx)
print(val)

0
ACTTGCCC


In [62]:
{idx:val for idx, val in enumerate(sequences)}

{0: 'ACTTGCCC', 1: 'AAAGTC', 2: 'CCTAC', 3: 'AAACCTA'}

In [63]:
{idx:codon_check(val) for idx, val in enumerate(sequences)}

{0: False, 1: False, 2: False, 3: False}

In [64]:
{"seq " + str(idx + 1):codon_check(val) for idx, val in enumerate(sequences)}

{'seq 1': False, 'seq 2': False, 'seq 3': False, 'seq 4': False}
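The `codon_check` helper above inspects only the first occurrence of each codon, because `str.find` returns the lowest matching index. A sequence whose first `TAC` is out of frame but whose later `TAC` is in frame would therefore be reported as `False`. The sketch below is one possible alternative, assuming the reading frame starts at index 0; `has_inframe_codon` and the `examples` list are hypothetical names introduced here, and `codon_check` is copied from the notes for comparison.

```python
def codon_check(seq):
    # same check as in the notes above: looks only at the first occurrence
    return seq.find("TAC") % 3 == 0 or seq.find("TAT") % 3 == 0

def has_inframe_codon(seq, codons=("TAC", "TAT")):
    """Return True if any codon starting at index 0, 3, 6, ... is in codons."""
    return any(seq[i:i + 3] in codons for i in range(0, len(seq) - 2, 3))

# Hypothetical test sequences; the second has TAC at index 1 and at index 6
examples = ["CCTAC", "CTACGGTAC", "GGGTATAAACCC"]

print([codon_check(seq) for seq in examples])  # [False, False, True]

# enumerate(..., start=1) avoids writing idx + 1 by hand
result = {"seq " + str(pos): has_inframe_codon(seq)
          for pos, seq in enumerate(examples, start=1)}
print(result)  # {'seq 1': False, 'seq 2': True, 'seq 3': True}
```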

### Some pros of comprehensions
1. Concise - their use can easily distill multiple lines of code into a single, concise statement
1. Efficient (time and other resources) - _slightly_ more performant than regular loops
1. Flexible output - list, set, dictionary ...
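The efficiency claim can be checked on your own machine. The sketch below uses the standard `timeit` module to time a loop and a comprehension that build the same list; the function names and the sizes are arbitrary choices made for this example, and the measured times will vary between machines and Python versions.

```python
import timeit

def with_loop(n):
    result = []
    for i in range(n):
        result.append(i * 2)
    return result

def with_comprehension(n):
    return [i * 2 for i in range(n)]

n = 10_000
# run each version 100 times and report the total time in seconds
loop_time = timeit.timeit(lambda: with_loop(n), number=100)
comp_time = timeit.timeit(lambda: with_comprehension(n), number=100)

print(f"for loop:      {loop_time:.4f} s")
print(f"comprehension: {comp_time:.4f} s")
```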

### Some cons of comprehensions
1. The "declarative" syntax - the expression comes first and the `for` clause after it, the reverse of the order in a regular loop
1. Readability - comprehension statements get more unreadable as complexity is added

### RESOURCES

https://www.tutorialspoint.com/python-list-comprehension  
https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Comprehensions.html  
https://realpython.com/list-comprehension-python/  
http://scipy-lectures.org/advanced/advanced_python/index.html   

#### Did we miss the tuple comprehensions?

In [65]:
# Try to make a `tuple` comprehension
# this will not return a tuple

(number * 2 for number in range(10))

<generator object <genexpr> at 0x7fda0265e3c0>

### Python Generators
Courtesy of Marcurs Sherman - partly adapted

#### The parenthesized form above is not a "tuple comprehension": it is called a "generator expression".

<img src="http://nvie.com/img/relationships.png" width=600 align='middle'/>


* An iterable is an object that one can iterate over.
    * It returns an iterator when passed to the built-in `iter()` function.
* An iterator is an object used to step through an iterable, one item at a time.
    * Iterators have a `__next__()` method, which returns the next item and raises `StopIteration` when no items are left.

* Note that **every iterator** is also an **iterable**, but **_not every iterable is an iterator_**.
    * For example, a list is iterable but a list is not an iterator.
* An iterator can be created from an iterable by using the function `iter()`.
    * To make this possible, the class of the object needs either an `__iter__` method, which returns an iterator, or a `__getitem__` method that accepts sequential indexes starting at 0.
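The two routes to iterability described above can be sketched with two small classes. Both `Countdown` and `Codons` are hypothetical classes written for this illustration: the first implements `__iter__` and `__next__` and is therefore an iterator, the second implements only `__getitem__` and relies on `iter()` calling it with 0, 1, 2, ... until `IndexError` is raised.

```python
class Countdown:
    """Iterator: has both __iter__ and __next__."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class Codons:
    """Iterable through __getitem__ only."""
    def __init__(self, seq):
        self.seq = seq

    def __getitem__(self, i):
        codon = self.seq[3 * i: 3 * i + 3]
        if len(codon) < 3:
            raise IndexError(i)  # signals the end of the iteration
        return codon


print(list(Countdown(3)))        # [3, 2, 1]
print(list(Codons("ACTTGCCC")))  # ['ACT', 'TGC']
```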

https://www.geeksforgeeks.org/python-difference-iterable-iterator/



In [66]:
range(3)

range(0, 3)

In [67]:
dir(range)

['__bool__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'count',
 'index',
 'start',
 'step',
 'stop']

In [68]:
# is range an iterator?
next(range(3))

TypeError: 'range' object is not an iterator

In [70]:
dir(iter(range(3)))

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__length_hint__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

In [72]:
next(iter(range(3)))

0

In [73]:
test_iter = iter(range(3))

In [74]:
next(test_iter)

0

#### and we can do next again and again ...

In [75]:
# each call to next returns the following element
# when we reach the end of the sequence
# the iterator raises StopIteration on next
# we have to create it again to start from the beginning

next(test_iter)

1

In [76]:
next(test_iter)

2

In [77]:
next(test_iter)

StopIteration: 

In [78]:
test_iter = iter(range(3))
for i in test_iter:
    print(i)

0
1
2


In [79]:
test_gen = (number * 2 for number in range(10))

In [81]:
# next(test_gen) was already called once (returning 0), so this call returns 2
next(test_gen)

2

In [82]:
# retrieve all values
tuple(test_gen)

(4, 6, 8, 10, 12, 14, 16, 18)

In [83]:
test_gen = (number * 2 for number in range(10))
tuple(test_gen)

(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)

In [85]:
test_gen = (number * 2 for number in range(10))
set(test_gen)

{0, 2, 4, 6, 8, 10, 12, 14, 16, 18}

In [86]:
next(test_gen)

StopIteration: 
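Generator expressions such as `test_gen` above are often passed straight to a function that consumes an iterable, without being stored in a variable first. The sketch below reuses the lecture's `sequences` list; the names `lengths`, `total_length` and `any_tac` are introduced here for illustration.

```python
sequences = ["ACTTGCCC", "AAAGTC", "CCTAC", "AAACCTA"]

# tuple() consumes the generator expression and builds the tuple
lengths = tuple(len(seq) for seq in sequences)

# functions that accept an iterable can take the generator expression directly
total_length = sum(len(seq) for seq in sequences)
any_tac = any("TAC" in seq for seq in sequences)

print(lengths)       # (8, 6, 5, 7)
print(total_length)  # 26
print(any_tac)       # True
```

When the generator expression is the only argument of the call, the extra pair of parentheses can be omitted, as in `sum(...)` above.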

___
#### Functions RECAP

```python

# DEFINITION - creating a function

def function_name(arg1, arg2, darg=None):
    # instructions to compute result
    return result

# CALL - running a function

function_result = function_name(val1, val2, dval)
```

___


A generator is just a special case of a function. The main difference is how it gives its output. 

How do you make a function give a result?

In [87]:
def number_one():
    number = 1
    return number

In [88]:
number_one()

1

In [89]:
# create a generator for an infinite sequence of numbers
# Note for generators we have yield instead of return

def infinite_sequence():
    number = 0
    while True:
        yield number
        number += 1 # number = number + 1

In [90]:
numbers_seq_gen = infinite_sequence()

In [91]:
numbers_seq_gen

<generator object infinite_sequence at 0x7fda107c5ba0>

In [92]:
next(numbers_seq_gen)

0

#### and we can do next again and again ...

In [98]:
next(numbers_seq_gen)

6

In [99]:
# a generator for a finite sequence of numbers
# this starts to look like range

def finite_sequence(limit):
    number = 0
    while number < limit:
        yield number
        number += 1

In [100]:
numbers_seq_gen = finite_sequence(3)

In [101]:
numbers_seq_gen

<generator object finite_sequence at 0x7fd9f05ad040>

In [102]:
next(numbers_seq_gen)

0

In [103]:
# and we can do next again and again ... and ...that's it

next(numbers_seq_gen)


1

In [105]:
# the generator is exhausted (an earlier call returned 2), so next raises StopIteration
next(numbers_seq_gen)


StopIteration: 

In [106]:
numbers_seq_gen = finite_sequence(3)

In [107]:
tuple(numbers_seq_gen)

(0, 1, 2)

In [108]:
# go through the elements of the generator

x = finite_sequence(10)
y = next(x)
while y < 5:
    print(y)
    y = next(x)

0
1
2
3
4


In [109]:
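# 5 was already consumed by the last next(x) in the while loop above, so the for loop continues from 6
for i in x: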
    print(i)

6
7
8
9


In [112]:
# generator that pairs up the elements of two sequences (a simplified zip)
# the pairs can then be turned into a list or a dictionary

def zip_2sequences(seq1, seq2):
    i = 0 
    n = min(len(seq1), len(seq2))
    while i < n:
        yield (seq1[i], seq2[i])
        i = i + 1

In [111]:
v1 = [1,2,3]
v2 = ["A","G","G"]

s1 = "ACG"
s2 = "123"

In [116]:
list(zip_2sequences(v1,v2))

[(1, 'A'), (2, 'G'), (3, 'G')]

In [117]:
dict(zip_2sequences(v1,v2))

{1: 'A', 2: 'G', 3: 'G'}

In [118]:
dict(zip_2sequences(s1,s2))

{'A': '1', 'C': '2', 'G': '3'}
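`zip_2sequences` relies on `len()` and indexing, so it only works with sequences such as lists and strings. A possible variant that accepts any iterable, including generators, is sketched below; `zip_2iterables` is a hypothetical name, and the approach (calling `iter()` and `next()` directly) is one option among several.

```python
def zip_2iterables(iterable1, iterable2):
    """Pair up elements until either input runs out."""
    iter1, iter2 = iter(iterable1), iter(iterable2)
    while True:
        try:
            first = next(iter1)
            second = next(iter2)
        except StopIteration:
            return  # ends the generator cleanly
        yield (first, second)

# works with a generator expression, which has no len() and no indexing
pairs = list(zip_2iterables((n * 2 for n in range(5)), "ACG"))
print(pairs)  # [(0, 'A'), (2, 'C'), (4, 'G')]
```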

In [110]:
{1:"A"}.items()

dict_items([(1, 'A')])

#### <font color = "red">Exercise:</font>   

* Create a generator of n nucleotides that keeps giving us a nucleotide in the order A,C,G,T and then starts again from A until it reaches n nucleotides. 


In [119]:
def gen_nucleotides(n):
    yield "A"

In [120]:
tuple(gen_nucleotides(10))

('A',)

In [137]:
def gen_nucleotides(n):
    nucleotides = "ACGT"
    i = 0
    while i < n:
        yield nucleotides[i%4]
        i = i + 1

In [139]:
tuple(gen_nucleotides(16))

('A',
 'C',
 'G',
 'T',
 'A',
 'C',
 'G',
 'T',
 'A',
 'C',
 'G',
 'T',
 'A',
 'C',
 'G',
 'T')
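The same exercise can also be approached with the standard library. The sketch below is an alternative, not the solution from the lecture: it combines `itertools.cycle`, which repeats an iterable forever, with `itertools.islice`, which stops after `n` items. The function name `gen_nucleotides_cycle` is made up for this example.

```python
from itertools import cycle, islice

def gen_nucleotides_cycle(n):
    """Yield n nucleotides, repeating A, C, G, T."""
    # cycle repeats "ACGT" forever; islice stops after n items
    yield from islice(cycle("ACGT"), n)

print("".join(gen_nucleotides_cycle(10)))  # ACGTACGTAC
```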

---
# Conclusion
Generators and generator expressions should be a standard tool in every bioinformaticist's tool belt. 

1. Generator expressions can compress simple for loops down to a single line
1. List comprehensions tend to be more efficient than standard for loops when the data is sufficiently large
1. The same syntax to make a list comprehension can be used to make dictionaries, sets, and generators
1. Generators are iterators that lazily evaluate the next value and `yield` it back
1. Once a generator (or any iterator) is consumed, it must be created again to provide its values a second time

### Some pros of generators
1. Lazy evaluation: does not produce all the data at one time
1. Maintains state between steps: does not forget where it left off
1. Easily handles data of any size
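A rough way to see lazy evaluation at work is sketched below. It copies `infinite_sequence` from the notes, takes a few values from it with `itertools.islice`, and then compares the memory footprint of a list with that of a generator expression using `sys.getsizeof`. The sizes chosen are arbitrary, and `sys.getsizeof` reports only the size of the container object itself.

```python
import sys
from itertools import islice

def infinite_sequence():
    number = 0
    while True:
        yield number
        number += 1

# take only the first five values of an endless generator
first_five = list(islice(infinite_sequence(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# a list stores every element; a generator expression stores only its state
squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True
```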

### Some cons of generators
1. Hard to explain to someone who does not use Python
1. When the data is small, the trade-off may not be worth it

#### RESOURCES 
https://www.tutorialspoint.com/generators-in-python   
https://www.geeksforgeeks.org/generators-in-python/   
https://book.pythontips.com/en/latest/generators.html   


---
### Function Examples

___
##### <b>`*args`</b> - unknown number of arguments - unpack collection of argument values
##### <b>`**kargs`</b> - unknown number of arguments - unpack mapping of names and values 

In [140]:
x ,y ,z = [20,30,40]
print(x)
print(y)
print(z)

20
30
40


In [141]:
# what if the number of elements does not match?
x ,y ,z = [20,30,40,50]
print(x)
print(y)
print(z)


ValueError: too many values to unpack (expected 3)

In [142]:
x ,*y ,z = [20,30,50, "A", 40]
print(x)
print(y)
print(z)

20
[30, 50, 'A']
40


In [143]:
x ,y ,*z = [20,30,50, "A", 40]
print(x)
print(y)
print(z)

20
30
[50, 'A', 40]


In [144]:
# with * we can pass any number of positional arguments

def test_arg(*args_list):
    for value in args_list:
        print("value = ", value)

In [145]:
test_arg(1,2,3, {"a":4}, [4,5])

value =  1
value =  2
value =  3
value =  {'a': 4}
value =  [4, 5]


In [146]:
# no key=value arguments allowed
test_arg(args_list = 2)

TypeError: test_arg() got an unexpected keyword argument 'args_list'

In [148]:
# with * we can pass any number of positional arguments
# with ** we can pass any number of key = value arguments

def test_karg(**keys_args_dict):
    for name,value in keys_args_dict.items():
        print("name = ", name)
        print("value = ", value)

In [149]:
test_karg(**{"gene":"EGFR", "expression": 20,"transcript_no": 4})

name =  gene
value =  EGFR
name =  expression
value =  20
name =  transcript_no
value =  4


In [150]:
test_karg(gene = "EGFR", expression = 20, transcript_no = 4, snp_no = 5, genes_regualted = {"TP53", "EGR"})

name =  gene
value =  EGFR
name =  expression
value =  20
name =  transcript_no
value =  4
name =  snp_no
value =  5
name =  genes_regualted
value =  {'TP53', 'EGR'}


In [151]:
# we can check for the key and perform computations with the value for that key
# or retrieve the value for a specific key

def test_karg(**keys_args_dict):
    for name,value in keys_args_dict.items():
        print("name = ", name)
        print("value = ", value)
        if (name == "expression"):
            print("new value", 2*keys_args_dict[name])
        

In [152]:
test_karg(gene = "EGFR", expression = 20, transcript_no = 4, snp_no = 5, genes_regualted = {"TP53", "EGR"})

name =  gene
value =  EGFR
name =  expression
value =  20
new value 40
name =  transcript_no
value =  4
name =  snp_no
value =  5
name =  genes_regualted
value =  {'TP53', 'EGR'}


In [153]:
test_karg(gene = "EGFR", Expression = 20, transcript_no = 4, snp_no = 5, genes_regualted = {"TP53", "EGR"})

name =  gene
value =  EGFR
name =  Expression
value =  20
name =  transcript_no
value =  4
name =  snp_no
value =  5
name =  genes_regualted
value =  {'TP53', 'EGR'}


In [155]:
# if we provide a dictionary then all our key value pairs have to be in the dictionary we create
def test_karg(keys_args_dict):
    for name,value in keys_args_dict.items():
        print("name = ", name)
        print("value = ", value)

In [156]:
test_karg({"gene":"EGFR", "expression": 20,"transcript_no": 4})

name =  gene
value =  EGFR
name =  expression
value =  20
name =  transcript_no
value =  4


In [158]:
test_karg(keys_args_dict = {"gene":"EGFR", "exp": 20,"transcript_no": 4})

name =  gene
value =  EGFR
name =  exp
value =  20
name =  transcript_no
value =  4


In [159]:
# we cannot provide the dictionary items as independent arguments
test_karg(gene = "EGFR", Expression = 20, transcript_no = 4, snp_no = 5, genes_regualted = {"TP53", "EGR"})

TypeError: test_karg() got an unexpected keyword argument 'gene'

In [160]:
# with ** in the definition, the key = value pairs can again be passed as independent arguments
def test_karg(**keys_args_dict):
    for name,value in keys_args_dict.items():
        print("name = ", name)
        print("value = ", value)

In [161]:
test_karg(gene = "EGFR", Expression = 20, transcript_no = 4, snp_no = 5, genes_regualted = {"TP53", "EGR"})

name =  gene
value =  EGFR
name =  Expression
value =  20
name =  transcript_no
value =  4
name =  snp_no
value =  5
name =  genes_regualted
value =  {'TP53', 'EGR'}
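The two mechanisms can be combined in one function, and `*` and `**` can also be used at the call site to unpack a list and a dictionary. The sketch below assumes a made-up function `describe_gene` and made-up values; it is only meant to show where each argument ends up.

```python
def describe_gene(name, *samples, **annotations):
    """Return the three groups of arguments so we can inspect them."""
    return name, samples, annotations

values = [12, 15, 9]
info = {"expression": 20, "transcript_no": 4}

# * unpacks the list into positional arguments, ** unpacks the dict into keywords
result = describe_gene("EGFR", *values, **info)
print(result)
# ('EGFR', (12, 15, 9), {'expression': 20, 'transcript_no': 4})
```

Inside the function, the extra positional values arrive as a tuple and the keyword values arrive as a dictionary.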


____
##### <b>`lambda` function</b> - anonymous function - it has no name
Should be used only with simple expressions

https://docs.python.org/3/reference/expressions.html#lambda<br>
https://www.geeksforgeeks.org/python-lambda-anonymous-functions-filter-map-reduce/<br>
https://realpython.com/python-lambda/<br>

`lambda arguments : expression`

A lambda function can take <b>any number of arguments</b>, but must always have <b>only one expression</b>.

In [162]:
help(compute_expression)

NameError: name 'compute_expression' is not defined

In [163]:
compute_expression = lambda x, y: x + y + x*y

In [164]:
help(compute_expression)

Help on function <lambda> in module __main__:

<lambda> lambda x, y



In [165]:
compute_expression(2, 3)

11
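A common place for a lambda is the `key` argument of `sorted()`, where a short throwaway function is needed. The sketch below sorts the lecture's `sequences` list by GC content, highest first; the variable name `by_gc` is introduced here for illustration.

```python
sequences = ["ACTTGCCC", "AAAGTC", "CCTAC", "AAACCTA"]

# sort by GC content, highest first
by_gc = sorted(sequences,
               key=lambda seq: (seq.count("G") + seq.count("C")) / len(seq),
               reverse=True)
print(by_gc)  # ['ACTTGCCC', 'CCTAC', 'AAAGTC', 'AAACCTA']
```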

____
### Useful functions

#### Built-in functions
https://docs.python.org/3/library/functions.html

In [None]:
# [1,2,3] + [4,5,6]

##### <b>`zip(*iterables)`</b> - make an iterator that aggregates respective elements from each of the iterables.   
https://docs.python.org/3/library/functions.html#zip

##### <b>`map(function, iterable, ...)`</b> - apply function to every element of an iterable - return an iterator over the results
https://docs.python.org/3/library/functions.html#map

##### <b>`filter(function, iterable)`</b> - apply function (bool result) to every element of an iterable - return the elements from the input iterable for which the function returns True
https://docs.python.org/3/library/functions.html#filter

##### <b>`functools.reduce(function, iterable[, initializer])`</b> - apply a function of two arguments cumulatively to the elements of an iterable to reduce the iterable to a single value
https://docs.python.org/3/library/functools.html#functools.reduce

____



<b>`zip(*iterables)`</b> - make an iterator that aggregates respective elements from each of the iterables.  


In [166]:
combined_res = zip([10,20,30],["ACT","GGT","AACT"],[True,False,True])
combined_res

<zip at 0x7fda02a4bb80>

In [167]:
for element in combined_res:
    print(element)

(10, 'ACT', True)
(20, 'GGT', False)
(30, 'AACT', True)


In [None]:
# combined_res was consumed by the for loop above, so this returns []
list(combined_res)

In [168]:
combined_res = zip([10,20,30],["ACT","GGT","AACT"],[True,False,True])
list(combined_res)

[(10, 'ACT', True), (20, 'GGT', False), (30, 'AACT', True)]

In [169]:
combined_res = zip([10,20,30,500],["ACT","GGT","AACT"],[True,False,True])
list(combined_res)

[(10, 'ACT', True), (20, 'GGT', False), (30, 'AACT', True)]
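As the last cell shows, `zip` stops at the shortest input and silently drops the extra value 500. When the leftover elements matter, `itertools.zip_longest` from the standard library can be used instead. The sketch below uses two of the lists from the cell above; the variable names are chosen here for illustration.

```python
from itertools import zip_longest

lengths = [10, 20, 30, 500]
seqs = ["ACT", "GGT", "AACT"]

# zip stops at the shortest input, so 500 is dropped
print(list(zip(lengths, seqs)))
# [(10, 'ACT'), (20, 'GGT'), (30, 'AACT')]

# zip_longest keeps going and fills the gaps with fillvalue
padded = list(zip_longest(lengths, seqs, fillvalue=None))
print(padded)
# [(10, 'ACT'), (20, 'GGT'), (30, 'AACT'), (500, None)]
```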

In [170]:
# unzip list
x, y, z = zip(*[(3,4,7), (12,15,19), (30,60,90)])
print(x, y, z)

(3, 12, 30) (4, 15, 60) (7, 19, 90)


In [171]:
x, y, z = zip(*[(3,4,7,8), (12,15,19), (30,60,90)])
print(x, y, z)

(3, 12, 30) (4, 15, 60) (7, 19, 90)


In [172]:
combined_res = zip(["ACT","GGT","AACT"], [10,20,30])
dict(combined_res)

{'ACT': 10, 'GGT': 20, 'AACT': 30}

In [173]:
dict(zip(["ACT","GGT","AACT"], [10,20,30]))

{'ACT': 10, 'GGT': 20, 'AACT': 30}

_____

<b>`map(function, iterable, ...)`</b> - apply function to every element of an iterable - return an iterator over the results

In [None]:
map(abs,[-2,0,-5,6,-7])

In [174]:
list(map(abs,[-2,0,-5,6,-7]))

[2, 0, 5, 6, 7]

In [176]:
def compute_addition(x,y):
    return x + y


In [177]:
list(map(compute_addition, [1,2,3,4], [50,60,70]))

[51, 62, 73]

In [178]:
def compute_addition(x,y = 10):
    return x + y

In [179]:
list(map(compute_addition, [1,2,3,4]))

[11, 12, 13, 14]

In [180]:
list(map(compute_addition, [1,2,3,4], [50,60,70]))

[51, 62, 73]

https://www.geeksforgeeks.org/python-map-function/

In [182]:
numbers1 = [1, 2, 3] 
numbers2 = [4, 5, 6] 
  
result = map(lambda x, y: x + y, numbers1, numbers2) 
list(result)

[5, 7, 9]

In [183]:
list(map(lambda x, y: x + y, [1,2,3,4], [50,60,70]) )

[51, 62, 73]

____
Use a lambda function and the map function to compute a result from the following 3 lists.<br>
If the element in the third list (z) is divisible by 3, return 3*x, where x is the element in the first list; otherwise return 2*y, where y is the element in the second list.

In [184]:
numbers1 = [1, 2, 3, 4, 5, 9] 
numbers2 = [7, 8, 9, 10, 11, 12] 
numbers3 = [13, 14, 15, 16, 17, 18] 

result = map(lambda x, y, z: 3*x if z % 3 == 0 else 2*y,
             numbers1, numbers2, numbers3) 
list(result)



[14, 16, 9, 20, 22, 27]

In [185]:
def compute_res(x,y,z):
    res = None
    if z%3 == 0:
        res = 3*x
    else:
        res = 2*y
    return res


result = map(compute_res, numbers1, numbers2, numbers3) 
list(result)

[14, 16, 9, 20, 22, 27]

____
<b>`filter(function, iterable)`</b> - apply function (bool result) to every element of an iterable - return the elements from the input iterable for which the function returns True

In [186]:
test_list = [3,4,5,6,7]
result = filter(lambda x: x > 4, test_list)
result

<filter at 0x7fda0246e820>

In [187]:
list(result)

[5, 6, 7]

In [188]:
# filter to remove falsy values (0, None, empty strings and empty collections)
test_list = [3, 0, 5, None, 7, "", "AACG", [], {}, {1:"one"}]
result = filter(bool, test_list)
list(result)

[3, 5, 7, 'AACG', {1: 'one'}]
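Tying this back to comprehensions: the same filtering can be written as a list comprehension with a predicate. The sketch below also uses `filter(None, ...)`, which the Python documentation describes as keeping the elements that are true; the names `with_filter` and `with_comprehension` are introduced here for illustration.

```python
test_list = [3, 0, 5, None, 7, "", "AACG", [], {}, {1: "one"}]

# filter(None, ...) uses the truth value of each element, like filter(bool, ...)
with_filter = list(filter(None, test_list))

# equivalent list comprehension with a predicate
with_comprehension = [item for item in test_list if item]

print(with_filter == with_comprehension)  # True
```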

____
<b>`functools.reduce(function, iterable[, initializer])`</b> - apply a function of two arguments cumulatively to the elements of an iterable to reduce the iterable to a single value



In [189]:
help(reduce)

NameError: name 'reduce' is not defined

In [190]:
from functools import reduce

In [191]:
help(reduce)

Help on built-in function reduce in module _functools:

reduce(...)
    reduce(function, sequence[, initial]) -> value
    
    Apply a function of two arguments cumulatively to the items of a sequence,
    from left to right, so as to reduce the sequence to a single value.
    For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
    ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
    of the sequence in the calculation, and serves as a default when the
    sequence is empty.



In [192]:
reduce(lambda x,y: x+y, [47,11,42,13])

113

<img src="https://www.python-course.eu/images/reduce_diagram.png" width=300 />

https://www.python-course.eu/lambda.php

https://www.geeksforgeeks.org/reduce-in-python/
https://www.tutorialsteacher.com/python/python-reduce-function

In [193]:
test_list = [1,2,3,4,5,6]
reduce(lambda x,y: x+y, test_list)

21

In [194]:
# compute factorial of n
n=5
reduce(lambda x, y: x*y, range(1, n+1))

120

In [195]:
list(range(n))

[0, 1, 2, 3, 4]

In [196]:
list(range(1, n+1))

[1, 2, 3, 4, 5]

In [197]:
reduce(lambda x,y: x+y, ["AACT", "AA", "C", "TTG"])

'AACTAACTTG'

In [199]:
reduce(set.intersection, [{"A","C"}, {"A","G","T"}, {"A","T"}])

{'A'}
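The optional third argument of `reduce` (the initializer) has not been used in the cells above. The sketch below shows what it changes: it gives the accumulator a starting value, which also makes `reduce` safe on an empty input. The lambda and the variable names are made up for this example.

```python
from functools import reduce

sequences = ["ACTTGCCC", "AAAGTC", "CCTAC", "AAACCTA"]

# start from 0 and add the length of each sequence
total = reduce(lambda acc, seq: acc + len(seq), sequences, 0)
print(total)  # 26

# with an initializer, an empty input returns the initializer
empty_total = reduce(lambda acc, seq: acc + len(seq), [], 0)
print(empty_total)  # 0

# without one, an empty input raises TypeError
try:
    reduce(lambda x, y: x + y, [])
except TypeError as err:
    print("TypeError:", err)
```

For a plain sum like this one, `sum(len(seq) for seq in sequences)` is the more readable choice; `reduce` is more useful when no built-in covers the operation.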