# Loops
Loops are coding structures that repeat the same portion of code multiple times.

# For loops
A for loop repeats a block of code once for each item in a sequence, such as a list or a string, so the number of repetitions is set by the length of the sequence.

In Python, the basic for loop looks like this:

In [2]:
values = [1,1,3,5,7,10,22,1,2,3,4,5]

for i in values:
    print(i)


1
1
3
5
7
10
22
1
2
3
4
5


#### Note that in this example, `i` is a new variable that we declare; it holds the current element on each iteration. Every for statement ends with a `:`. The code underneath the for statement, indented by one level, repeats on every iteration. In this example, we print the variable `i` on each iteration.

The iterating variable cycles through each element of the list. 

We can also iterate through a run of numbers by using `range`. A `range` object is a lazy sequence rather than a list: it produces its values one at a time as you iterate over it instead of storing them all in memory:

In [3]:
print(range(5))

range(0, 5)


Note that printing range(5) does not show a list such as [0,1,2,3,4], as you might expect. Iterating over range(5) produces 0, 1, 2, 3, and 4 (the stop value 5 is excluded), but print here shows the range object itself, range(0, 5). We can access the values it produces using a for loop. In practice, range and similar iterables are easy to use with for loops:

In [4]:
for i in range(5):
    print(i)

0
1
2
3
4
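
To see the values a `range` holds without writing a loop, one option is to convert it to a list. The sketch below also shows the optional start and step arguments; the variable names are illustrative and not part of the original notebook.

```python
# range is lazy: it stores only start, stop and step, not every value.
first_five = list(range(5))        # convert to a list to see the values
print(first_five)                  # [0, 1, 2, 3, 4]

# range(start, stop, step): the stop value is always excluded.
evens = list(range(2, 11, 2))
print(evens)                       # [2, 4, 6, 8, 10]

# A negative step counts downwards.
countdown = list(range(5, 0, -1))
print(countdown)                   # [5, 4, 3, 2, 1]
```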


### Exercise: use a for loop to print out every letter in a string

In [5]:
my_str = 'Print out each of these letters'

In [6]:
#answer
for char in my_str:
    print(char)

P
r
i
n
t
 
o
u
t
 
e
a
c
h
 
o
f
 
t
h
e
s
e
 
l
e
t
t
e
r
s


## Quick tangent: string formatting

String formatting is a way to substitute variables into a string. Add `{}` within the string wherever you want a variable placed, then call the `format` method on the string with your variables as the arguments.

In [7]:
apples = 12
oranges = 2
string = "We have {} apples and {} oranges for {} fruit in total."
string = string.format(apples, oranges, apples+oranges)
print(string)

We have 12 apples and 2 oranges for 14 fruit in total.
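
As a rough sketch of two alternatives to positional `{}` placeholders: f-strings (available from Python 3.6) put the expressions directly inside the braces, and named placeholders can make a long template easier to read. The names `message`, `template`, and `named` are illustrative only.

```python
apples = 12
oranges = 2

# f-string: the expression inside each pair of braces is evaluated in place.
message = f"We have {apples} apples and {oranges} oranges for {apples + oranges} fruit in total."
print(message)

# Named placeholders with str.format: values are matched by keyword.
template = "We have {a} apples and {o} oranges."
named = template.format(a=apples, o=oranges)
print(named)
```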


## Now back to for loops:

Let's use what we have learned to print out the string `The value at index X is Y` for each element of the list `values` we created earlier:

In [8]:
for i in range(len(values)):
    print("The value at index {} is {}".format(i, values[i]))

The value at index 0 is 1
The value at index 1 is 1
The value at index 2 is 3
The value at index 3 is 5
The value at index 4 is 7
The value at index 5 is 10
The value at index 6 is 22
The value at index 7 is 1
The value at index 8 is 2
The value at index 9 is 3
The value at index 10 is 4
The value at index 11 is 5
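
The same output can be produced without indexing into the list by hand. A minimal sketch using the built-in `enumerate`, which yields an (index, element) pair on each iteration; the `lines` list is only there so the result can be inspected afterwards.

```python
values = [1, 1, 3, 5, 7, 10, 22, 1, 2, 3, 4, 5]

lines = []
# enumerate pairs each element with its position, starting from 0.
for index, value in enumerate(values):
    line = "The value at index {} is {}".format(index, value)
    lines.append(line)
    print(line)
```

This form tends to be preferred when you need both the position and the element, since it avoids the `range(len(...))` pattern.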


### Exercise: Print out a statement of the form `X*5 = Y` for each X in our list `values`
- For example, the first three printed lines should be:

`1*5 = 5`

`1*5 = 5`

`3*5 = 15`

In [9]:
#answer
for i in range(len(values)):
    x = values[i]
    y = x*5
    print("{}*5 = {}".format(x, y))

1*5 = 5
1*5 = 5
3*5 = 15
5*5 = 25
7*5 = 35
10*5 = 50
22*5 = 110
1*5 = 5
2*5 = 10
3*5 = 15
4*5 = 20
5*5 = 25


## Break and continue:
We can use `break` to stop a for loop:

In [10]:
fruits = ['strawberry', 'orange', 'pineapple', 'apple']
for fruit in fruits:
    if fruit == 'pineapple':
        break
    print(fruit)

strawberry
orange


If we want to skip the rest of the current iteration, but not stop the entire loop, we can use `continue`:

In [11]:
for fruit in fruits:
    if fruit == 'pineapple':
        continue
    print(fruit)

strawberry
orange
apple


### Exercise: Repeat the previous exercise, but only print out a statement if you have not previously printed it
- You can use `continue`, but it is not necessary

- For example, the first three printed lines should now be:

`1*5 = 5`

`3*5 = 15`

`5*5 = 25`

In [30]:
#answer
prev_x = []
for i in range(len(values)):
    
    x = values[i]
    y = x*5
    
    if x in prev_x:
        continue
        
    print("{}*5 = {}".format(x, y))
    
    prev_x.append(x)

## another answer
# prev_x = []
# for i in range(len(values)):
    
#     x = values[i]
#     y = x*5
    
#     if x not in prev_x:
#         print("{}*5 = {}".format(x, y))
#         prev_x.append(x)
    
    

1*5 = 5
3*5 = 15
5*5 = 25
7*5 = 35
10*5 = 50
22*5 = 110
2*5 = 10
4*5 = 20
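
Another possible approach, sketched here as an assumption rather than as the intended answer, tracks the values already printed in a `set` instead of a list. Membership checks on a set stay fast as the collection grows. The names `seen` and `printed` are illustrative.

```python
values = [1, 1, 3, 5, 7, 10, 22, 1, 2, 3, 4, 5]

seen = set()      # values we have already printed
printed = []      # kept only so the result can be inspected afterwards
for x in values:
    if x in seen:
        continue  # skip values that were printed before
    seen.add(x)
    statement = "{}*5 = {}".format(x, x * 5)
    printed.append(statement)
    print(statement)
```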


# While loops
While loops repeat code for as long as a condition remains true; they stop once the condition becomes false.

In [32]:
i = 1
while i < 6:
    print(i)
    i = i + 1

1
2
3
4
5

`break` works inside while loops as well:


In [34]:
i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1

1
2
3


However, you should be careful when using while loops. What will happen with the following loop?

`i = 1
while i < 6:
    print(i)`
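
Because `i` never changes inside that loop, the condition `i < 6` stays true and the loop never ends on its own. A minimal sketch of one defensive pattern follows, assuming a hypothetical `max_iterations` cap that you choose yourself; it is not part of the original example.

```python
i = 1
iterations = 0
max_iterations = 10   # hypothetical safety cap, chosen for illustration

while i < 6:
    print(i)
    iterations += 1
    if iterations >= max_iterations:
        print("Stopping: the loop condition never became false.")
        break
    # Bug kept on purpose: i is never updated, so i < 6 is always true.
```

If a loop like this does run away in a notebook, interrupting the kernel stops it.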
 

## Arrays/matrices and nested loops:
Much of the data you will work with comes in the shape of arrays or matrices, with several rows and columns in a data set. These can be iterated through using nested loops:

In [13]:
row1 = [1,2,3]
row2 = [4,5,6]
row3 = [7,8,9]
mat = [row1,row2,row3]
print(mat)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


In [14]:
#print out each row of our matrix
for row in mat:
    print(row)

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]


#### Print out each value in our matrix:

In [15]:
for row in mat:
    for i in row:
        print(i)

1
2
3
4
5
6
7
8
9
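
When the position of each value matters as well, the nested loops can carry row and column indices along. A small sketch using `enumerate`; the names `r`, `c`, and `cells` are illustrative.

```python
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

cells = []
for r, row in enumerate(mat):          # r is the row index
    for c, value in enumerate(row):    # c is the column index
        cells.append((r, c, value))
        print("mat[{}][{}] = {}".format(r, c, value))
```

The same value can be read directly with two indices, for example `mat[1][2]` for the second row and third column.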


### Exercise: Print out the 3rd column as a single statement:
- Your output should be:

`The third column is [3, 6, 9].`

- Note we store the column as a list

In [16]:
# how to access the 3rd column in general
for row in mat:
    print(row[2])
    
#answer
col3_string = "The third column is {}."
col3_list = []
for row in mat:
    col3_list.append(row[2])
print(col3_string.format(col3_list))
    


3
6
9
The third column is [3, 6, 9].
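
The loop that builds the column can also be written as a list comprehension, which is a compact form of the same append-in-a-loop pattern. This is a sketch of an alternative, not a replacement for the answer above; `col3_message` is an illustrative name.

```python
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Take the element at index 2 (the third column) from every row.
col3_list = [row[2] for row in mat]
col3_message = "The third column is {}.".format(col3_list)
print(col3_message)
```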


### Exercise: Print out the mean of each row as a single statement:
- Your output should be something like:

`The mean of each row is [2.0, 5.0, 8.0] respectively.`

- Note we store each of the means in a single list

In [17]:
row_means_string = "The mean of each row is {} respectively."
row_means_list = []

#answer - note there are other ways to do this
for row in mat:
    sum_row = 0
    for i in row:
        sum_row += i
    row_means_list.append(sum_row/len(row))
print(row_means_string.format(row_means_list))

The mean of each row is [2.0, 5.0, 8.0] respectively.
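
One of the other ways to compute the row means is to let the built-in `sum` function replace the inner loop. A minimal sketch, with `row_means` as an illustrative name:

```python
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

row_means = []
for row in mat:
    # sum(row) adds up the elements; dividing by len(row) gives the mean.
    row_means.append(sum(row) / len(row))
print(row_means)   # [2.0, 5.0, 8.0]
```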


## This is an example of what the contents of a .csv look like when read into Python
- This data set is from PDACGeneExpression.csv. The first column holds the gene names, and each of the other 10 columns holds the expression level of that gene for one of 10 patients
- Let's run the following line to load the contents and then look at what the first row looks like:

In [18]:
file_contents = ['S100A5,0.822784869582219,0.845863228049007,0.406539674722031,0.824984966648277,0.821594633667975,0.789688419503896,0.833973465199686,0.836439694197088,0.864337032765865,0.815665965658554', 'NHLH1,0.394702796775243,0.410730511883324,0.219664947634127,0.41464018581812,0.369068600742691,0.490930117101514,0.456201754094835,0.504766034166945,0.293838620878279,0.492372665985876', 'RRS1,0.251006686303772,0.412312671874319,0.268463901467237,0.390190596382938,0.2523612159335,0.353650749511271,0.394560026736902,0.688632437553388,0.206826682645551,0.586634030132701', 'C6orf168,0.699903413245089,0.326859544355949,0.185823992156753,0.37031061674346,0.409969252552166,0.561154180528344,0.466529317131377,0.250738290392964,0.330373591664031,0.420009547520468', 'C3orf39,0.611252664782601,0.402430789760535,0.41520726144889,0.59948407780805,0.684473244379174,0.646440076061265,0.617034149349823,0.663219107945528,0.809268788902361,0.63163201500637', 'MMACHC,0.596787574192,0.554378067326241,0.458286737064493,0.756125031282779,0.793326533278925,0.624621204862126,0.625752308861302,0.91063404463238,0.935569300037654,0.673479371218657', 'MRPS18C,0.0529750433840126,0.0491274508239888,0.0657395179518438,0.061495757853716,0.0537543243920936,0.0690531184146998,0.05125541681082,0.0555655310871747,0.0518817618549706,0.0482599873862293', 'TMEM186,0.9550460317666,0.925184733352794,0.958076834649601,0.952111322754442,0.945755608447406,0.950115703231161,0.96048407139522,0.946067154320366,0.955159856780996,0.939113264127353', 'MED13,0.0550268866226761,0.0361152380932741,0.0380814379215068,0.0398735187845585,0.0394443634141986,0.0727817265205962,0.0336187647647697,0.0443308300444682,0.0472160699760975,0.0374115580944846', 'STC1,0.447141990764451,0.142263235260851,0.0963066192040709,0.100803999790915,0.243063572249026,0.0871463984537661,0.14795537538702,0.511140850983747,0.449871461944188,0.15046300525056', 
'RUNDC1,0.229862200697369,0.179790742671869,0.175443545811324,0.449457045927617,0.346311917172176,0.313013385993425,0.367310200240279,0.231744135244366,0.288456905819856,0.173402477739019', 'KIDINS220,0.927782473612778,0.902787113212325,0.911093996295196,0.926604211202866,0.915827811148267,0.9191219009756,0.916159766372345,0.929926432715925,0.904038363934294,0.817993959610952', 'TTC13,0.202456861574162,0.168303661137373,0.156719011972649,0.188993486242258,0.252335636166787,0.199947405966139,0.154123374697918,0.132717525474448,0.202536855596487,0.255944521167351', 'C12orf34,0.829553298063444,0.619522721495095,0.73156218322232,0.824081692705947,0.717177219573419,0.798462775198388,0.700916140927931,0.695414211285328,0.862445797620262,0.567483602475862', 'RIN3,0.0400783505667456,0.025652017190695,0.388712273731395,0.0201475259449711,0.0299168368837042,0.02630220836944,0.0332240959164747,0.0238926276378682,0.0312369355425198,0.0352633076088863', 'RBPJL,0.688806656203917,0.676219343207869,0.625913863385265,0.761040780042732,0.6701254138104,0.748399388292281,0.690434100738422,0.553260903403817,0.79618757921868,0.662310361481015', 'TMEM88,0.865582637762921,0.860480561849697,0.811826355446214,0.853883461825556,0.782501899052217,0.898276687733795,0.848577979030571,0.852326444593798,0.903374052031686,0.827197613321023', 'GCSH,0.0260119918818649,0.0307494651836545,0.0317296801737609,0.0219335968733127,0.0330908642092647,0.0305402436034755,0.0337260726674031,0.0281999968418255,0.0255677631390283,0.0333477967480378', 'PRB1,0.730189828539415,0.921580751113737,0.8969088630096,0.912196445560258,0.814575714631769,0.838499575131666,0.869546193733878,0.842307047552913,0.899955298083046,0.856200224174302', 'TSPYL5,0.602477144711941,0.20506699346459,0.536374410190165,0.328910783824717,0.609248776987099,0.426599544661007,0.459691451121452,0.127722111503288,0.792344094650751,0.331791316788236', 
'NDUFB10,0.109888271918239,0.0475045195082116,0.106874234868888,0.202180462006002,0.0780466933922104,0.0796084525331745,0.108671645429693,0.0668068258568089,0.0635635392410561,0.0810001610937127', 'TXNDC12,0.689635877355775,0.704567036718183,0.652967591029437,0.865633587509622,0.674199351264183,0.911640503507107,0.88602453422879,0.918795072143546,0.923148327163197,0.879659797814791', 'CUL5,0.124466215034395,0.0792239218086809,0.138829823028167,0.082846272252537,0.1207286384997,0.10611717001832,0.0729421871913964,0.0920108178011344,0.146658527921657,0.0919148276917869', 'MAPKAPK5,0.0315752243098706,0.0270914131036616,0.0359202291686714,0.0503929866891351,0.0417706859094457,0.0335344010598194,0.0437964807941048,0.037241838792496,0.0424803806212533,0.0361944179031991', 'SERPINB8,0.0358305038769923,0.0221063560854364,0.0295616571841004,0.0285808612929641,0.0345731406311977,0.0349263092387253,0.0313778357421065,0.0224284183384681,0.0347457362511377,0.0437589615391024', 'FOXP1,0.656823504057294,0.772386609617164,0.863387923200724,0.853967664239512,0.723282550689797,0.754639317760414,0.803501172197541,0.662211165084489,0.896867166725881,0.856183580514763', 'ATP1B2,0.420547873750113,0.0591051330262687,0.332089974902671,0.282273458861798,0.287149665320887,0.375999861101193,0.266983195425483,0.397066762350009,0.0771989991520224,0.214106246996239', 'PRR13,0.0222089567699739,0.0303395851155825,0.0251240165478326,0.0245206356456478,0.0285042381067129,0.0241462290033447,0.0298868073909926,0.0264535727979577,0.0256877554206662,0.0249712527356183', 'RLN1,0.505860167202016,0.233732831865145,0.244305520829885,0.341668628892728,0.345470100312765,0.474667275779945,0.329737003108627,0.272371331248946,0.260591584357735,0.356806047566814', 'SLC30A7,0.0387218605057556,0.0841393216905098,0.103825295756435,0.0585108390112747,0.0650439263208631,0.118280693490394,0.0780530055260824,0.0872673009305303,0.0704585256923067,0.0779252473504344', 
'PLS1,0.271558835103972,0.231776645016396,0.165954264451267,0.35003357726844,0.332766017573353,0.303956417444957,0.290728196300926,0.176092935935736,0.272889339181636,0.206063600706175', 'MKI67IP,0.0436839537879851,0.0537846626665239,0.0447333344091698,0.0472296999885088,0.0525630078315515,0.0466384597521588,0.0529081661661211,0.0482618328017328,0.040078606746921,0.0581776813872443', 'DVL3,0.78348422208497,0.802899406726715,0.349828098599701,0.744824912179236,0.765775610217673,0.653986725651838,0.853751186455316,0.863345934380749,0.80377630523838,0.885370305237215', 'CSNK1G3,0.0845137417781484,0.0349807080135393,0.0971820137878518,0.0420013973998826,0.0893573730446338,0.0601207268008986,0.0573997635921294,0.0449188190073141,0.0656162199107375,0.0817752510020167', 'SLC35B2,0.508593684253031,0.383771577206503,0.34157618070133,0.472721542335466,0.453624680516223,0.493502218958843,0.377613141218262,0.350083234418927,0.40247645272981,0.392577715502068', 'FASLG,0.769990074036185,0.718078632909763,0.815902661687022,0.780174123021311,0.716569526910872,0.825004703309107,0.755787753951287,0.848651326320404,0.862674932010742,0.806832559007061', 'EIF4EBP3,0.0799782409143575,0.164855770450311,0.129677652293253,0.0788144215982149,0.0851415439714859,0.0855863504522344,0.189518442817105,0.12667372537326,0.0562125401933348,0.243030737081083', 'ZNF740,0.875408595842671,0.828849904679111,0.898948797367536,0.864879903243498,0.830372368322216,0.8677966503546,0.82690950454911,0.883647838964889,0.883163395810801,0.829053213284776', 'C17orf82,0.409870029526322,0.516562287230567,0.312921457250959,0.349339562388979,0.366416656262796,0.414720311225632,0.499355908550477,0.591267085367012,0.329381914296392,0.625292472919534', 'RGNEF,0.543597749783105,0.675424982295158,0.446315736144528,0.770749599913478,0.632532354389703,0.662919153954135,0.769449947178671,0.656414367890751,0.383035596906268,0.690150413895634', 
'TFF2,0.796776819515759,0.673395010826953,0.646724466394657,0.875198831404756,0.801466468737784,0.849709460726385,0.607280884947497,0.787117755014381,0.694795190091935,0.72514499176784', 'EN2,0.437538592628546,0.0886401132208386,0.0651519278998921,0.0789137995777754,0.102356507054342,0.120334497182443,0.0949274220144457,0.314057330894281,0.654220112433343,0.34255575588114', 'TNFRSF25,0.610273882344768,0.485554762796353,0.672822291790582,0.457304607682965,0.556140546303944,0.598582538849424,0.537526996697809,0.59654066501819,0.660083879719682,0.665129087447639', 'FAM19A2,0.448516608870054,0.0920204972478316,0.442662392947526,0.287930595965232,0.0456362392156737,0.454808943699137,0.345874113372436,0.292324669387098,0.715105270870366,0.271043676633921', 'C1orf55,0.0363052565042572,0.0288653441979218,0.0297994909703115,0.0325147352954109,0.0388998380297855,0.0330203717983193,0.0301100380864881,0.0303593562756209,0.0384924706322884,0.0359822199864667', 'BIN1,0.034874851278973,0.0568658723060041,0.0334398600576489,0.0238021007592397,0.0361548038847578,0.0307311762656438,0.0380512628050859,0.0276240785251695,0.02571057522034,0.0370983281559857', 'PCNT,0.861979111618397,0.775679555127748,0.848168054871485,0.808143843909422,0.80000490621274,0.768138917436458,0.816079690970016,0.837225823188408,0.65299581009104,0.801809693676859', 'NDOR1,0.53628958287787,0.626804992355312,0.560850429445737,0.706318104247813,0.654689411837097,0.621285461499769,0.762329177578142,0.533367447582077,0.560844294318585,0.647479930680209', 'ZNF195,0.119038658408971,0.0857279049108819,0.128586308006941,0.0921376356107959,0.0856403764965316,0.119739411102093,0.0878950787675807,0.0902992925100078,0.0884526231784711,0.0754503066931744', 'PNLIPRP2,0.679914210285335,0.949331327457818,0.727904945713752,0.865048975277703,0.934024987189282,0.937618223098502,0.937072963042651,0.935457636683573,0.940853980262415,0.853095566034016', 
'FAM96A,0.0696452484200918,0.0490787405322701,0.0795993030077767,0.0666297260648747,0.0668076582381803,0.0658055626977855,0.065167193381837,0.0610550262801427,0.0788758318661469,0.0563869277105686', 'RNF11,0.166940240251,0.0900535232540831,0.111675625312308,0.117053927866485,0.0824504754060312,0.0913988150615428,0.114448300470157,0.115017187810291,0.219536962547731,0.108458224230523', 'SARS,0.920053428122312,0.933923736547202,0.944675556461623,0.941520726309559,0.92249328612005,0.929697686954803,0.917375815588686,0.917101095543411,0.917776259035612,0.927455227722441', 'GKAP1,0.0563186108045476,0.0312866897480106,0.0343114312186371,0.0301457508465228,0.0389695387288897,0.0312683675243227,0.03729799633604,0.0624930462425773,0.0369580800180538,0.0395793999363404', 'STK38,0.0449091969397431,0.0417735706058291,0.046904300296125,0.0433728725211192,0.0436255395438173,0.0448378809424052,0.0363057859061741,0.0358678127447701,0.0703602393838619,0.0414780628036977', 'NT5DC2,0.694310156426553,0.634446287713557,0.516533404465239,0.61993235398532,0.697351424893338,0.641073705310058,0.657517568256635,0.673788434714004,0.753819666304673,0.702169168562755', 'AK2,0.0552889895875401,0.0335691887482398,0.0361035670393684,0.0485010819645114,0.0524245203264144,0.0436963286918763,0.0418793500199979,0.0485523592611477,0.0582355285998718,0.0504739423639894', 'TOPORS,0.0776166267659901,0.0795560453256281,0.086468276488682,0.0828430224024449,0.0731338630215473,0.0773969451544003,0.0752255291500178,0.0758788092368437,0.113739558762621,0.0733975792394302', 'ARSG,0.596134317222689,0.745963087234385,0.684919945654795,0.702851651986966,0.732813727355016,0.638837133219303,0.530015512983207,0.733951015100855,0.682788475962212,0.66332025614692', 'NFKB1,0.0407127163913499,0.0345509008654961,0.0408418226654406,0.0299637898429717,0.0332871641373105,0.0382408783216359,0.0442254414578513,0.0364693988252769,0.0398085552818166,0.0377400560982233', 
'C20orf29,0.883720578859916,0.861773137011747,0.897060934834392,0.886078086408836,0.872177076792024,0.921257510299259,0.866064605717885,0.882527517766225,0.92101550062141,0.897694062123232', 'CLSTN1,0.18679704584227,0.139264511451422,0.166635482232543,0.216027306180562,0.299045696516782,0.171939509890993,0.245202370227016,0.162117663425661,0.202683222561338,0.344901634560205', 'TMEM33,0.117092685267855,0.0714997993008811,0.0959515456623929,0.0593044905919364,0.0872876126978351,0.0910038721351183,0.0766514381380437,0.0793162329968912,0.108240971605288,0.068166310914374', 'DSE,0.68060623940661,0.753358833878758,0.760442952833984,0.450387148884677,0.405312403381764,0.406133101550236,0.509119367021542,0.696337571267752,0.665169243346428,0.714479141326585', 'ZNF337,0.0397923311899736,0.0325811255051931,0.0446675864262798,0.0312821705949778,0.0417436452798853,0.0371378180076706,0.0299708676839589,0.0392737936537195,0.0464415399941457,0.0444774996295202', 'CTNNA3,0.685205149861329,0.533716621489397,0.698976683404289,0.81148185089458,0.735944857565703,0.731371919998389,0.762634802181784,0.870839054183075,0.497340902825667,0.811358921127757', 'CLSTN2,0.481289872374823,0.0709004752297662,0.243376627541793,0.178615185951128,0.361685458817014,0.152691036798315,0.288484396466763,0.388196842685589,0.629005711085158,0.255416626117495', 'RC3H2,0.0238222652149452,0.0195718557973369,0.0214018790016246,0.016455229859661,0.0185261934423468,0.0321013468426204,0.0172304663600242,0.0318125161149448,0.0263090368502645,0.0238467349634123', 'AGPS,0.0251708994102106,0.0288311867813493,0.0329715932061922,0.0231265339414979,0.0281683276664108,0.0266327869656798,0.026933338150958,0.0291864813023228,0.0291930846843687,0.0270156599033129', 'TUFM,0.0186905550625506,0.0137406775927072,0.0271117886605636,0.0133067876437399,0.0362315776132978,0.0187385721184044,0.0269765857192027,0.0139416356965423,0.0232017249324632,0.0325435969168344', 
'MESDC1,0.178812264359111,0.213241799151567,0.183644765698311,0.182518128167725,0.28066695373157,0.288404670707237,0.210852701781262,0.248713259874427,0.0903096687677512,0.287456019765428', 'GNPDA1,0.250469492713019,0.25349513346954,0.0898126718762776,0.207570905608437,0.291112870129764,0.217420989416752,0.311335530758959,0.186951440164543,0.473731128159476,0.333863537301164', 'HIST3H2BB,0.19396336498193,0.150283074761364,0.388212084360739,0.140953984360806,0.275267363548491,0.124673861706957,0.108648610109818,0.0907617921988172,0.0773695991798101,0.351260600940542', 'MAP3K14,0.436873891590526,0.293721988011011,0.518661209940832,0.487345719206413,0.438847888279689,0.48608753743471,0.354408974272391,0.537757813670467,0.654455834734356,0.423917972488547', 'MRPL3,0.0380932978753147,0.0171611744551794,0.0377874169843036,0.021171377792222,0.0562153056158577,0.0418297000293666,0.0460671355935135,0.035077559235404,0.0452234454400339,0.0494266071037021', 'SLCO4A1,0.076961259234717,0.0883986659829489,0.0761674297524491,0.083913980426788,0.130619393294692,0.109597068262227,0.154167076877708,0.2603967696759,0.134739185846562,0.158851416310179', 'ZNF692,0.0162552791505354,0.0161823833266986,0.015737080329799,0.0140650005628246,0.0170461478189446,0.0163115312119176,0.0163086987579494,0.0150675798052281,0.0174176381002957,0.018614571386984', 'XCL2,0.847784194708556,0.829969155794203,0.673588463814159,0.781291402593153,0.657582905474667,0.836493796419918,0.761940015045846,0.810156990254257,0.608510865016372,0.79086447533311', 'CDK13,0.0158870725819988,0.014050719405132,0.0191837169877316,0.0173619654608638,0.014534090171433,0.0152501191171696,0.0187610128322387,0.0171706532346209,0.0184355853792855,0.0177871227755527', 'HSD17B8,0.288574712264346,0.290552007118894,0.231921387175291,0.347451960192385,0.448032092038935,0.343669093083262,0.308600625607432,0.220206699886012,0.257134683618561,0.35959581637728', 
'RTF1,0.135939774131505,0.104088149970797,0.15696182442022,0.0984456409571939,0.154192776182538,0.178179578442862,0.14093516327853,0.156594898397641,0.180282396174782,0.122094082438454', 'HRASLS2,0.874692531119797,0.731878086810835,0.957968623923456,0.918687357882255,0.841123929770805,0.928481148903042,0.797485625601035,0.819862688990755,0.96087632699815,0.948510660241433', 'THY1,0.615744729989562,0.548179863906917,0.413014109219387,0.421667205306975,0.500046608119534,0.506803950433968,0.521656164867168,0.649610214088867,0.129448630009757,0.566582345873052', 'DHODH,0.0153364575103417,0.0132843340794327,0.0143483726418113,0.0142869987755522,0.0129446080842546,0.0165919908460082,0.0139220266747052,0.0153137381268074,0.0146569945247342,0.0132075327919956', 'PUF60,0.84465797367604,0.779967010734216,0.829995018491934,0.881142457216267,0.847606474067769,0.848735989457948,0.854167566612465,0.805069405651114,0.882408641966577,0.878440840695694', 'COL16A1,0.714459304782007,0.646020164912211,0.381086807543571,0.642791079805032,0.619770574028887,0.616041734426035,0.757909434324442,0.731407747622728,0.428667656270295,0.795685556220889', 'MTRF1L,0.0516224569542567,0.0433759699759809,0.0382938116226717,0.0406715390714581,0.0418247090240891,0.0443972414698934,0.0379921703190874,0.04167200667895,0.0490992554816367,0.0353438884121533', 'OSGIN2,0.0197705078174223,0.0154817511690732,0.0204726294265001,0.0168986574088382,0.0200191087590649,0.018316441959889,0.0193576261822319,0.0190347680197972,0.0182747723290532,0.0208787115765055', 'SPNS2,0.827230067006923,0.747288291835321,0.655554655796872,0.862507609204601,0.828075166611991,0.826537208151353,0.848083100649545,0.825566460830599,0.847375003873455,0.757109548087844', 'PTPN14,0.374886851796765,0.401893647492845,0.339152799327559,0.883021513111252,0.62264925309401,0.737649992051453,0.880457218073846,0.774946272614756,0.609328978532815,0.812109125675851', 
'CCDC76,0.0459351928052728,0.0874275918539136,0.0851412927385499,0.0550288158870341,0.0603993361126137,0.0867236242947327,0.0689428715040367,0.0623263848639271,0.0996173301320284,0.110538895752022', 'CRLF3,0.554668261530038,0.460586112474897,0.435615327925527,0.46285258461222,0.466877311819269,0.58593833220202,0.407619465198577,0.719022076960497,0.668970304304527,0.574745659732291', 'KHDRBS1,0.0149599931027639,0.0142146977865417,0.0122983548149328,0.0137768373835712,0.015022369765906,0.014145234197751,0.0146289992816608,0.0149441636990473,0.0168922519981911,0.0148058892835458', 'SUMO4,0.952935965823807,0.934945090165215,0.937975335741676,0.95294314934073,0.931655085703313,0.942124583870446,0.953519352060992,0.959566549379233,0.94410495013926,0.962710772346073', 'WSCD1,0.533160850449178,0.0815253764608537,0.119644567082058,0.0701387027725544,0.1347995623236,0.0743532924091846,0.0831160693265598,0.0667790381758574,0.165109007113057,0.108380364368554', 'GNA11,0.374903603961706,0.40561620069453,0.560241278167625,0.589584461613602,0.583708478320601,0.756351275707296,0.67003627900997,0.704521678653024,0.915052605956133,0.593351291944331', 'PCGF3,0.591179445811051,0.778946502141032,0.787292625571513,0.738053910734601,0.734687193267296,0.809315216362622,0.675561218393078,0.818584476869637,0.510415648471889,0.795488131856417', 'NR2E1,0.408198993521937,0.0932106919645039,0.0674177592836206,0.0774462872257402,0.0626137593911525,0.403657089544592,0.107181436009937,0.174270284138826,0.653225453806848,0.236873153014166', 'C1orf101,0.0295455076498613,0.0223087190172124,0.025349918475757,0.0285244312214111,0.0289197512877822,0.0231329022966106,0.021901367322381,0.0241717765369096,0.028095467449468,0.0266651737363404', 'BCAS3,0.0674703790635772,0.0638558706354165,0.0610556119853078,0.0551114301835721,0.0554125453749025,0.0778862601255988,0.0607615612374764,0.0479476724642858,0.080366545994832,0.0648900517919926', 
'ATP5F1,0.0181105947307712,0.0124849406897891,0.0153831778249976,0.0142029348387289,0.0148769400487329,0.0160646371838254,0.0144061860086339,0.014274108083451,0.0142520670557951,0.0179278664214466', 'SUSD4,0.0465258571853469,0.0198199053828117,0.0212700553224964,0.0220099135070982,0.0355043785984458,0.02625280739127,0.0221930954086974,0.0241496745932442,0.0198315550792076,0.0243792636184352', 'TCTA,0.218025335303113,0.199538359287013,0.185173020813844,0.315586307631771,0.235853146448338,0.270834854333344,0.313985856131106,0.246519752087086,0.255973633244882,0.264017396715253', 'ZGPAT,0.0299405289635558,0.036076286465109,0.0584535009691114,0.0384205825137999,0.0326675080055424,0.0432548729978844,0.0322200631676735,0.0318375175542866,0.0316383347485577,0.0322726473659182', 'GLG1,0.766608583087068,0.516101378054748,0.393154497694102,0.591050506352133,0.616170482848411,0.597700805176906,0.594983746711937,0.695729809206461,0.786486257515297,0.749598925754576', 'RPL22L1,0.766867911439995,0.814346846741978,0.898380339768933,0.882681200510043,0.611576912203975,0.82940914382775,0.86814795313635,0.931848804241592,0.935476979862685,0.915218535444051', 'FAM171B,0.664062261127131,0.227204225006408,0.409906352846855,0.523202028651291,0.318759775970957,0.641480632388343,0.336192192497025,0.167818239834002,0.801047758979414,0.393422578724647', 'SPRY2,0.448877616427761,0.515651066851268,0.854909378183338,0.71225677526366,0.470433904394715,0.534441671710483,0.601670793945384,0.361698019888073,0.874278794586598,0.507352433778843', 'WDFY2,0.804850073323499,0.464801170782641,0.240441077592747,0.800533052129332,0.856698373511268,0.843952410717978,0.687431381170438,0.737977849106956,0.323400400119759,0.443519636024393', 'VPS35,0.03171154459088,0.0307079038736174,0.0375838072435212,0.0350224247115355,0.0405554951955893,0.0360921904270381,0.0253311400095743,0.0273177865359405,0.0288110590370979,0.0418807864251972', 
'WDYHV1,0.0492509953341836,0.0365818216746947,0.0517946274995793,0.0314621101901989,0.041500524303817,0.0404105226825499,0.0365755398779616,0.0204051374817011,0.0366181708718115,0.0433008189535932', 'ALKBH2,0.110993574364643,0.0454008306259736,0.0434049511856071,0.0400365113529224,0.0415295150237187,0.0545643906427271,0.0413221123153952,0.0499792178234674,0.0629992270935069,0.0845205978703691', 'NIPA1,0.162776923779837,0.107880824046906,0.133524837890171,0.167270148263107,0.164997371425685,0.174652157887055,0.241175347553986,0.101974774178391,0.102917392147371,0.164427519971284', 'KCNA2,0.0788355181145985,0.0299123542708737,0.0340880432731349,0.0313501393393151,0.0221193708893304,0.0234646748371815,0.0304247009273642,0.354007364899179,0.0265377833578589,0.0252625775562496', 'EHMT2,0.065149934100918,0.0539290815202676,0.0600932790669057,0.052664022055883,0.0456727610786344,0.062962569746241,0.0359789086628142,0.0455029415209081,0.100170804478278,0.0802042313421471', 'LRPAP1,0.933974918973655,0.916653689736404,0.938705995138262,0.937854450313656,0.945442767653398,0.938877185897112,0.933271022976637,0.936793610274411,0.896964033003297,0.948662123878563', 'CCDC90A,0.0950114725803258,0.0561429506560616,0.0737960761526807,0.0698240826434944,0.0757805340504551,0.0637377446550861,0.0550632836635434,0.0800162126153556,0.135253029207209,0.0554756956738796', 'C1QB,0.773713588641391,0.728877032581295,0.567523397731708,0.677268832317797,0.755627200375,0.707686701332914,0.703053016925355,0.744354627452459,0.851590617903951,0.782258015380633', 'FAM36A,0.153844718862568,0.108537184429309,0.133922856367696,0.111986650296324,0.0919037530742933,0.108411217622693,0.0791657042672345,0.068690634176994,0.132728578375565,0.103149773528198', 'CD8A,0.885690440557787,0.890867069799849,0.936898637921851,0.924558679201083,0.890031934806341,0.92183918416937,0.884090407570069,0.912985162359493,0.919053786630715,0.875707868688435', 
'FSCN2,0.336504874307906,0.337744288476516,0.329468741638431,0.552533297647235,0.397427343118025,0.427018842950795,0.402788468077317,0.266155033673701,0.430579325076911,0.408081712088005', 'NME4,0.0347508665613717,0.0248984542726336,0.0224566811960989,0.0248692047561209,0.024371452158728,0.0340285400049789,0.0312480382405061,0.0302030311701744,0.0291290993209722,0.0688488793935344', 'EI24,0.178994965904334,0.191986187480473,0.17845209654143,0.16703332848239,0.20088818674451,0.141543207854055,0.218684630913047,0.186457579622686,0.196878562126328,0.208087603315449', 'SIGMAR1,0.67068378544957,0.823152353141369,0.94073199364957,0.936426060765722,0.958768720990721,0.934686840710689,0.930960171354973,0.932009504126345,0.954962311296339,0.945733038990634', 'FGD2,0.848132879593231,0.821721772624207,0.848910925999149,0.833694325558432,0.79555283949458,0.820879769928544,0.83556675244453,0.817564837101251,0.809907080817823,0.818775522795591', 'RAI14,0.0317355520222266,0.0700265953525159,0.0341606949900879,0.0696112295181282,0.0446117525309868,0.273198149753284,0.0528271702938019,0.213572923170959,0.0280862520833342,0.106810925922575', 'PGAP1,0.0110722177218823,0.00902840668688458,0.0102511220234056,0.00904984833341079,0.01652285605692,0.0110809081510155,0.0177002124315662,0.0131097547880751,0.0130660932929695,0.0223821057741358', 'TUBB2A,0.168113069769831,0.137409350995192,0.117138455946281,0.0635912760047811,0.0915189494411884,0.10159757327151,0.100633712520062,0.125828669284689,0.202281502082102,0.181602254540832', 'ANKRD36,0.0343632731633384,0.0214095858169102,0.0510194378689937,0.0235546945082927,0.0333567503372167,0.0305818588482838,0.0244768200960004,0.0302644818313394,0.0358648397951072,0.0271567459963417', 'SGMS2,0.0410601033533121,0.0322891605937258,0.0319963560081599,0.0366119729426254,0.0373087146926527,0.0340662031063056,0.038394082112374,0.0332990399646538,0.0330791634314041,0.0399260961189426', 
'MALL,0.479323950563631,0.533586759773086,0.459483888041866,0.674279327301063,0.600859727220461,0.499486671642187,0.606208643273248,0.60216855493714,0.913537436612605,0.696235394792583', 'KIAA0748,0.868157398855691,0.840266973490368,0.865465807805581,0.722662177353327,0.850276455999252,0.848651949153264,0.824108991033308,0.844401334040396,0.847009051905628,0.890984330685021', 'A4GALT,0.159619950083502,0.0540659467800054,0.0870046457378278,0.0287486018304255,0.118966627250282,0.0483983591498256,0.101685404659732,0.169711157050465,0.684468681870806,0.460769946089957', 'YIPF4,0.738092546853324,0.644859235784657,0.641682619098793,0.858258575090205,0.824253027897032,0.915759044806983,0.885784427308654,0.696645372615931,0.797874669930469,0.860127867138283', 'COQ9,0.285861943338493,0.143128339937441,0.242964318093965,0.309289053432827,0.400033768788974,0.271345074221165,0.341921206430533,0.39770095329713,0.141053338272378,0.347237601452814', 'CR2,0.307262069439202,0.0243287573304525,0.023407208681154,0.0286914345221789,0.0249683842458965,0.231314335017769,0.0252824806029371,0.0205175486927744,0.0240394974265466,0.0217875519750506', 'TMEM79,0.519301589251608,0.620429188062285,0.351181489315511,0.605875423923647,0.62990997925593,0.577871594406166,0.813531972241782,0.78643295118144,0.483926798778722,0.587245438242814', 'SERPINH1,0.834846343074226,0.810269842416977,0.750080524763762,0.814173657578319,0.770346541214622,0.766064148281077,0.863431969817451,0.866786475061837,0.869849477809444,0.839987692598205', 'ANKRD9,0.694681189854409,0.619631848192056,0.685937622123278,0.711982786289924,0.846050249842538,0.659645640497446,0.687428745179074,0.77350548775368,0.925509559499578,0.753044215678742', 'WNT2B,0.0857750785550902,0.0489142644530931,0.0846242636306354,0.0524753969669448,0.0651144348046424,0.0922011940391231,0.0630188403846466,0.0646380560493543,0.0912368989192229,0.0894973055063514', 
'GALNS,0.92816745227475,0.962283072517701,0.959293777998287,0.862591244335525,0.736123732410954,0.873645305477781,0.964034792946261,0.966227638816454,0.95150785949655,0.951907629181853', 'ESD,0.0307310915673318,0.021817106550168,0.021096762606106,0.0212423061287451,0.0278962373162012,0.0280060375492159,0.0245070655649783,0.0175326912644675,0.0192185547173635,0.0225625230661327', 'TMEM50B,0.0439081332791347,0.038085465832851,0.0446455323120487,0.0327290084625381,0.0432257225449148,0.0452096129938912,0.0438074781872602,0.0306953900602581,0.0339997854432835,0.040573361253199', 'HBEGF,0.463661885284818,0.51695602533605,0.43964136977645,0.705483487721977,0.575179463347097,0.561643952210197,0.649666236641008,0.603955341698109,0.376489472959901,0.698285975233816', 'TNFRSF10B,0.319083369174088,0.393598542651332,0.508697527054287,0.521718303620836,0.448470786491466,0.422645864426469,0.453244923552114,0.332887998809156,0.184887608323456,0.544042610241096']

In [19]:
first_row = file_contents[0]
first_row

'S100A5,0.822784869582219,0.845863228049007,0.406539674722031,0.824984966648277,0.821594633667975,0.789688419503896,0.833973465199686,0.836439694197088,0.864337032765865,0.815665965658554'

#### Here, we can see that the first row is a single string: the gene name S100A5 followed by gene expression counts for 10 patients, with each column separated by a comma (hence .csv, or comma-separated values).
- Strings have a useful method we can use here, split:

In [20]:
first_row_split = first_row.split(',')

print(first_row_split)

['S100A5', '0.822784869582219', '0.845863228049007', '0.406539674722031', '0.824984966648277', '0.821594633667975', '0.789688419503896', '0.833973465199686', '0.836439694197088', '0.864337032765865', '0.815665965658554']


### Side note: mutable vs immutable - sorry for the confusion
- lists are mutable, so a built-in method such as .append() changes the original list
- strings are immutable, so a built-in method such as .split() does NOT change the original string; it returns a new list instead
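
A minimal sketch of the difference, using a shortened, made-up row (`GENE1` and its values are illustrative, not taken from the data file):

```python
# Lists are mutable: append() changes the list in place and returns None.
numbers = [1, 2, 3]
result = numbers.append(4)
print(numbers)  # [1, 2, 3, 4]
print(result)   # None

# Strings are immutable: split() builds a new list and leaves the string alone.
row = 'GENE1,0.5,0.25'  # illustrative row, not real data
parts = row.split(',')
print(row)    # GENE1,0.5,0.25
print(parts)  # ['GENE1', '0.5', '0.25']
```

This is why the result of `.split()` has to be stored in a new variable, while `.append()` is called on its own line without an assignment.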


#### Note that all of the gene expression values are still stored as strings. Let's convert them to floats:

In [21]:
first_row_split_cast = [first_row_split[0]]

for column in range(1,len(first_row_split)):
    first_row_split_cast.append(float(first_row_split[column]))

print(first_row_split_cast)

['S100A5', 0.822784869582219, 0.845863228049007, 0.406539674722031, 0.824984966648277, 0.821594633667975, 0.789688419503896, 0.833973465199686, 0.836439694197088, 0.864337032765865, 0.815665965658554]
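
The same conversion can also be written without indexing by position. The sketch below is one possible alternative, not the notebook's answer; it uses a list comprehension and a shortened, made-up row (`example_row` and its values are assumptions for illustration):

```python
# Illustrative row in the same format as first_row (not the real data).
example_row = 'GENE1,0.5,0.25,0.75'
example_split = example_row.split(',')

# Keep the gene name as a string and convert every later column to a float.
example_cast = [example_split[0]] + [float(value) for value in example_split[1:]]

print(example_cast)  # ['GENE1', 0.5, 0.25, 0.75]
```

Both versions produce the same kind of list; the `for` loop with `range` makes each step explicit, while the comprehension is shorter.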


### Exercise: Determine the average gene expression for S100A5
- You can use a `for` loop, but it is not necessary


In [22]:
#answers

total = sum(first_row_split_cast[1:])
N = len(first_row_split_cast[1:])
print(total/N)

# can also do a for loop if you want

0.7861871949994599
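
For comparison, here is a sketch of the `for` loop version of the same calculation. It runs on a small made-up row (`example_cast` is an assumption for illustration) so that it can be executed on its own:

```python
# Illustrative row: gene name first, then expression values (not real data).
example_cast = ['GENE1', 0.5, 0.25, 0.75]

total = 0
count = 0
# Skip the gene name at index 0 and add up the remaining values.
for value in example_cast[1:]:
    total += value
    count += 1

average = total / count
print(average)  # 0.5
```

Replacing `example_cast` with `first_row_split_cast` should give the same result as the `sum`/`len` answer.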


### Exercise: Create an array (a list of lists) containing all the genes and their expressions, where each inner list holds the name of the gene as a string followed by its gene expression counts as floats.

In [27]:
gene_expression_array = []

#answer
for row in file_contents:
    row_split = row.split(',')
    row_split_cast = [row_split[0]]

    for column in range(1,len(row_split)):
        row_split_cast.append(float(row_split[column]))
        
    gene_expression_array.append(row_split_cast)

gene_expression_array

[['S100A5',
  0.822784869582219,
  0.845863228049007,
  0.406539674722031,
  0.824984966648277,
  0.821594633667975,
  0.789688419503896,
  0.833973465199686,
  0.836439694197088,
  0.864337032765865,
  0.815665965658554],
 ['NHLH1',
  0.394702796775243,
  0.410730511883324,
  0.219664947634127,
  0.41464018581812,
  0.369068600742691,
  0.490930117101514,
  0.456201754094835,
  0.504766034166945,
  0.293838620878279,
  0.492372665985876],
 ['RRS1',
  0.251006686303772,
  0.412312671874319,
  0.268463901467237,
  0.390190596382938,
  0.2523612159335,
  0.353650749511271,
  0.394560026736902,
  0.688632437553388,
  0.206826682645551,
  0.586634030132701],
 ['C6orf168',
  0.699903413245089,
  0.326859544355949,
  0.185823992156753,
  0.37031061674346,
  0.409969252552166,
  0.561154180528344,
  0.466529317131377,
  0.250738290392964,
  0.330373591664031,
  0.420009547520468],
 ['C3orf39',
  0.611252664782601,
  0.402430789760535,
  0.41520726144889,
  0.59948407780805,
  0.68447324437917

### Exercise: Get the average gene expression for the gene SIGMAR1.

In [28]:
#answer

for row in gene_expression_array:
    if row[0] == 'SIGMAR1':
        total = sum(row[1:])
        N = len(row[1:])
        print(total/N)

        

0.902811478047593
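
Note that the loop above prints nothing if the gene name is not in the array. One way to make that case visible is to wrap the search in a function that returns `None` when nothing matches. This is only a sketch: `average_for_gene` and `example_array` are hypothetical names, and the data is made up.

```python
def average_for_gene(rows, gene_name):
    """Return the mean expression for gene_name, or None if it is absent."""
    for row in rows:
        if row[0] == gene_name:
            values = row[1:]
            return sum(values) / len(values)
    return None  # reached only when no row matched


# Illustrative data in the same layout as gene_expression_array.
example_array = [
    ['GENE1', 0.5, 0.25, 0.75],
    ['GENE2', 1.0, 0.5, 0.75],
]

print(average_for_gene(example_array, 'GENE2'))  # 0.75
print(average_for_gene(example_array, 'GENE9'))  # None
```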


### Challenge exercise: Find which gene has the highest average expression. Print the name of the gene and its average expression.

In [29]:
highest_gene = ''
highest_expression = 0

#answer
for row in gene_expression_array:
    total = sum(row[1:])
    N = len(row[1:])
    avg_expression = total/N
    if avg_expression > highest_expression:
        highest_expression = avg_expression
        highest_gene = row[0]
        
print("The gene {} has the highest expression at {}".format(highest_gene, highest_expression))

The gene TMEM186 has the highest expression at 0.948711458082594
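
Starting `highest_expression` at 0 works here because these expression values are never negative. As a hedged alternative, Python's built-in `max` with a `key` function finds the same row without needing a starting value. The helper `average_expression` and the data in `example_array` are assumptions for illustration:

```python
def average_expression(row):
    """Mean of every value after the gene name at index 0."""
    values = row[1:]
    return sum(values) / len(values)


# Illustrative data in the same layout as gene_expression_array.
example_array = [
    ['GENE1', 0.5, 0.25, 0.75],
    ['GENE2', 1.0, 0.5, 0.75],
    ['GENE3', 0.25, 0.25, 0.25],
]

# max() calls average_expression on each row and returns the row that scores highest.
best_row = max(example_array, key=average_expression)
message = "The gene {} has the highest expression at {}".format(
    best_row[0], average_expression(best_row))
print(message)  # The gene GENE2 has the highest expression at 0.75
```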
