# Section 2.2 | Loops and List Comprehension

Let's go a bit further with Python lists and see how we can loop over them with for loops. You may remember for loops from [Module 1: Section 4](https://github.com/bueno646/CIERA-HS-Program-2021/blob/master/IDEASpy-Mike-Updates/Module_1/Section_4_conditional_statements_and_loops.ipynb). We will also introduce "list comprehension", see how it can simplify your life, and learn how we can read data from files into lists.

__Why should you care about list comprehensions?__ They let you quickly access (or "parse") data stored in lists and write more condensed code!

## For Loops Refresher

In Module 1, you learned to construct for loops, which repeat a series of steps once for each item in a predefined sequence, such as a Python list (see [Module 1: Section 4](https://github.com/bueno646/CIERA-HS-Program-2021/blob/master/IDEASpy-Mike-Updates/Module_1/Section_4_conditional_statements_and_loops.ipynb)). For example, if example_list contains 4 items, a loop over each item x in example_list repeats (or "iterates") 4 times, once per item. As a reminder, here is the general structure for a for loop:

> example_list = [2,4,6,8]<br>
> for temporary_variable in example_list:<br>
> &nbsp; &nbsp; &nbsp;    print(temporary_variable)

Try running this code in the cell below!

In [2]:
example_list = [2,4,6,8] 

for temporary_variable in example_list:
    
    print(temporary_variable)

2
4
6
8


__Explanation__:<br>
"temporary_variable" in the cell above is a variable the for loop uses as it goes through the items in the sequence "example_list". Starting with the 0th element in example_list, the int 2, the for loop saves this element as "temporary_variable". The for loop then does what we tell it to: print "temporary_variable" in line 5. Because there are no other commands after the print statement, the for loop then starts over with the next element in example_list, the int 4, which is saved as the new "temporary_variable". Everything under line 3 in the cell above is repeated for each item in "example_list".


### Quick note on variable names
It is important to be careful with the names you use in your code! Choosing descriptive variable names helps prevent you from accidentally reusing a variable later on. Unknowingly reusing a variable that you used earlier can lead to bugs that are difficult to find! In our example from above, the temporary variable "temporary_variable" still exists after the loop finishes and remains assigned to the last element in the Python list it was iterating over. Run the cell below to see for yourself!

In [5]:
print(temporary_variable)

8


##  Grabbing indices for an iterable object

We can also iterate over the indices of the list (e.g., for a list of 4 elements, or length 4, the indices are 0, 1, 2, and 3). Within the loop, we reference the items by the list name and the index variable. You may find it helpful to revisit our discussion on indexing in 1.3 of [Module 2: Section 1](https://github.com/bueno646/CIERA-HS-Program-2021/blob/master/IDEASpy-Mike-Updates/Module_2/Section_1.ipynb).

### Using the enumerate( ) function to access (or 'parse', 'iterate over' etc) your data
__Why should I care?__

Using enumerate allows you to access the data stored in an iterable object (like a list) while also keeping track of the indices for that iterable object. Let's use an example to illustrate the utility of enumerate( ).

## Walkthrough: Enumerate( )

__Context__:

You, an astronomer, are given a data set of 1000 stellar temperatures. From those stellar temperatures, you want to make a subset of Sun-like stars - stars with temperatures between 5300 K and 6000 K. Before you run your code on the entire data set, you want to try it on a smaller portion of this data (a 'subset') - 10 data points.

__Situation__: 

You are going to use the enumerate( ) function to iterate over the data in your subset. You will also use conditional operators (from [Module 1: Section 4](https://github.com/bueno646/CIERA-HS-Program-2021/blob/master/IDEASpy-Mike-Updates/Module_1/Section_4_conditional_statements_and_loops.ipynb)) to check whether each data point is between 5300 K and 6000 K. If the data point is within this temperature range, we will add its index to a list (this list will be empty to start). We will then use that list of indices _to make a copy of the subset that contains __only__ Sun-like stars_.

### Stellar Temperature Data Subset
Below is the data subset.

In [10]:
stellar_data_kelvin_subset = [5600,5000,6500,6600,3000,5708,7000,6300,5200,5900]


### Enumerate( ) Format (or "syntax")
Let's take a look at the format for the enumerate( ) function so we know how to use it correctly!

There are two pieces of syntax that are most important to know for the enumerate function - the syntax for the enumerate function itself and how it is used in loops.

Let's look at the syntax of the enumerate function itself first:

> enumerate(iterable_object, start_index)   # start_index is the number the count begins at; the default is 0 (every element is still visited, beginning with the first one)

__Note:__ "iterable_object" and "start_index" are not the actual names of the arguments that enumerate takes. They are slightly more explicit versions of the arguments that enumerate takes - "iterable" and "start". 
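As a small sketch of what the "start" argument does: it changes the number the count begins at, not which element the loop begins with. The name `first_three` is hypothetical and is used only for this illustration, and wrapping enumerate in list( ) is just a quick way to show the pairs it produces.

```python
stellar_data_kelvin_subset = [5600, 5000, 6500, 6600, 3000, 5708, 7000, 6300, 5200, 5900]

# Hypothetical helper name: just the first three temperatures
first_three = stellar_data_kelvin_subset[:3]

# list() collects the (count, element) pairs that enumerate produces
pairs_default = list(enumerate(first_three))            # count begins at 0
pairs_from_one = list(enumerate(first_three, start=1))  # count begins at 1

print(pairs_default)    # [(0, 5600), (1, 5000), (2, 6500)]
print(pairs_from_one)   # [(1, 5600), (2, 5000), (3, 6500)]
```

Notice that both versions visit the same three elements; only the numbers paired with them differ.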


In the code below, let's see how that would look with an actual iterable object - like our stellar data subset.


In [4]:
enumerate(stellar_data_kelvin_subset)

<enumerate at 0x7fa866b53a40>

Executing the code cell above produces some text on your screen that is not very helpful: it only tells you that an enumerate object was created, not what is inside it. Let's use a loop to make enumerate more useful. When we iterate over an enumerated object (like the one in the code cell above), we get __two__ objects on each pass instead of one (compare this with the for loop example at the start of this notebook).

Let's now look at this syntax below:

> for index,element in enumerate(stellar_data_kelvin_subset): <br>
> &nbsp; &nbsp; &nbsp;    print("the index is",index) <br>
> &nbsp; &nbsp; &nbsp;    print("the element at that index is",element) <br>
> &nbsp; &nbsp; &nbsp;    print( ) # this empty print statement will make the output of the print statements above easier to read

__Note:__ our temporary variables are now "index" and "element". We could choose any name for these variables, but are using these to help remember that the two objects you get from looping over an enumerated object are the index for an element and the element itself.

Let's take a look at this syntax in action. We will use the syntax above for just the first three elements, so we can see the outputs of the print statements more easily.

In [9]:
# placing subset here for your reference
stellar_data_kelvin_subset = [5600,5000,6500,6600,3000,5708,7000,6300,5200,5900]

# note the slice below - we are just iterating over the first 3 objects in our subset
for index,element in enumerate(stellar_data_kelvin_subset[:3]):
    print("the index is",index)
    print("the element at that index is",element)
    print() # this empty print statement will make the output of the print statements above easier to read

the index is 0
the element at that index is 5600

the index is 1
the element at that index is 5000

the index is 2
the element at that index is 6500
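
Putting the pieces of the walkthrough together, one possible sketch of the plan described in the Situation above is shown here. The names `sun_like_indices` and `sun_like_stars` are hypothetical, and treating the temperature range as including both 5300 K and 6000 K is an assumption.

```python
stellar_data_kelvin_subset = [5600, 5000, 6500, 6600, 3000, 5708, 7000, 6300, 5200, 5900]

# Step 1: collect the index of every temperature in the Sun-like range
sun_like_indices = []   # this list is empty to start
for index, element in enumerate(stellar_data_kelvin_subset):
    if element >= 5300 and element <= 6000:   # assumed inclusive range
        sun_like_indices.append(index)

# Step 2: use those indices to copy only the Sun-like stars
sun_like_stars = []
for index in sun_like_indices:
    sun_like_stars.append(stellar_data_kelvin_subset[index])

print(sun_like_indices)   # [0, 5, 9]
print(sun_like_stars)     # [5600, 5708, 5900]
```

Because the code never refers to the length of the list directly, the same steps should work unchanged on the full data set of 1000 temperatures.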



### Using range( ) and len( ) to grab indices
We can quickly grab indices for a list using two familiar functions: len() and range(). 

As a refresher, here is a brief explanation of the range function and an example of it being used on the integer 10.

__Brief Explanation__:<br>
range(max_number) : returns an __iterable__ object consisting of numbers from 0 up to max_number - 1<br>


#### Range Example

In [19]:
range(10)  # this is the iterable object

# Let's use a for loop on this object to see what is in it

for ii in range(10):
    print(ii)

0
1
2
3
4
5
6
7
8
9


__Range Example Explanation__: <br>
As noted in the Brief Explanation above, the range function returns an iterable object consisting of the numbers from 0 up to max_number (10 in this case) minus 1. The output of the Range Example confirms this, as we see 0-9 printed.
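
Another quick way to peek inside a range object, sketched here, is to convert it to a list with list( ). The name `numbers` is hypothetical. Checking the result with len( ) also shows that range(10) holds exactly 10 numbers.

```python
# Convert the iterable range object into a list so we can see its contents
numbers = list(range(10))

print(numbers)        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(len(numbers))   # 10
```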



### How to quickly grab indices for a list (Cont.)
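In the example above, we saw that the range function returns an iterable object of numbers based on the number passed into it. We can use this to our advantage by combining it with the len function: len(example_list) gives the number of items in the list, so range(len(example_list)) gives exactly the indices of that list. Let's look at the code below to illustrate this.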

In [20]:
example_list = [2,4,6,8]

for ii in range(len(example_list)):   # ii is a temp variable
    
    print("The", ii,"th index corresponds to the element", example_list[ii])
    
          # Reference list values via the list name and indices

The 0 th index corresponds to the element 2
The 1 th index corresponds to the element 4
The 2 th index corresponds to the element 6
The 3 th index corresponds to the element 8
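
For comparison, the same indices and elements can also be produced with enumerate( ) from earlier in this section. This is only a sketch; the list names `pairs_from_range` and `pairs_from_enumerate` are hypothetical, and each pair is stored as (index, element).

```python
example_list = [2, 4, 6, 8]

# Approach 1: loop over the indices, then look up each element
pairs_from_range = []
for ii in range(len(example_list)):
    pairs_from_range.append((ii, example_list[ii]))

# Approach 2: let enumerate hand us the index and the element together
pairs_from_enumerate = []
for ii, element in enumerate(example_list):
    pairs_from_enumerate.append((ii, element))

print(pairs_from_range)                           # [(0, 2), (1, 4), (2, 6), (3, 8)]
print(pairs_from_range == pairs_from_enumerate)   # True
```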


In [None]:
example_list = [2,4,6,8]

for ii in range(len(example_list)):   # ii is a temp variable
    # len(example_list) is 4, so range(4) gives the indices 0, 1, 2, 3
    print("The", ii,"th index corresponds to the element", example_list[ii])
    # example_list[ii] looks up the element stored at index ii
    # (we reference list values via the list name and the index)

## List comprehension

List comprehension is a powerful tool for performing concise operations on lists or even creating new lists. As we'll talk about a bit later, besides making your code more concise, using list comprehension can also make your code run more efficiently.

Let's see how we can write a for loop via list comprehension:

In [1]:
my_list = [2, 4, 6, 8]
print([item for item in my_list])       # Builds and prints a new list of the items, all in one line
print([item*10 for item in my_list])    # This will print each item multiplied by 10

[2, 4, 6, 8]
[20, 40, 60, 80]


And here is another way to do the exact same thing, by iterating over the indices of the list:

In [None]:
my_list = [2, 4, 6, 8]
print([my_list[i] for i in range(len(my_list))])
print([my_list[i]*10 for i in range(len(my_list))])

List comprehension can be a bit trickier to understand because it's compressing the code down to the bare minimum; that's also the power of list comprehension, and why it's worth understanding it well. 
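
One way to read a list comprehension is to expand it back into the for loop it replaces. The sketch below does this for the "multiply by 10" example; the names `times_ten_loop` and `times_ten_comprehension` are hypothetical.

```python
my_list = [2, 4, 6, 8]

# Long form: start with an empty list and append inside a for loop
times_ten_loop = []
for item in my_list:
    times_ten_loop.append(item * 10)

# Short form: the same steps compressed into one list comprehension
times_ten_comprehension = [item * 10 for item in my_list]

print(times_ten_loop)            # [20, 40, 60, 80]
print(times_ten_comprehension)   # [20, 40, 60, 80]
```

Reading the comprehension from left to right gives the same recipe as the loop: compute item * 10 for each item in my_list, and collect the results in a new list.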

## Practice

In the cell below, you'll work with a list of the distances to our nearest stars (excluding the Sun!). Using both a for loop and list comprehension, you'll convert the star distances from light years (ly) to parsecs (pc), another common measurement of distance in astronomy. Hint: 1 ly = 0.306601 pc

In [None]:
# List nearest stars distances in light years
distances = [4.2, 4.2, 4.4, 5.9, 7.8]

# By the way, the names of these nearest stars are:
# (1) Proxima Centauri, (2) Alpha Centauri A, (3) Alpha Centauri B, 
# (4) Barnard's Star and (5) Wolf 359
# Alpha Cen A and B are technically part of a single star system

# Conversion factor - from ly to pc
ly_to_pc = 0.306601

# Write a FOR LOOP to print each star distance in units of pc using the conversion factor above
for FILL IN CODE
    FILL IN CODE 
    
# Now, print off the distances in units of pc again, but this time using LIST COMPREHENSION
print('distances in pc:', FILL IN CODE)

# In the above code, we printed the converted values, but we didn't SAVE the converted values

# Using a FOR LOOP, create a NEW list to store the distances in pc, called distances_pc
distances_pc = FILL IN CODE    # must first create the new empty list
for i in FILL IN CODE:   # here we set up temp variable i for iterating
    FILL IN CODE    # hint: use append

# Alternatively, we can change the values of our original list so that all values are in pc
for i in FILL IN CODE:
    distances[i] = FILL IN CODE

# Print out the new list distances_pc and the modified distances list
# They SHOULD be identical!
print('distances_pc =', FILL IN CODE)
print('distances in pc =', FILL IN CODE)

# Now, perform the same two tasks using LIST COMPREHENSION
# First, let's reset back to our original list in light years
distances = [4.2, 4.2, 4.4, 5.9, 7.8]

# Using LIST COMPREHENSION, create a NEW list to store the distances in pc, called distances_pc
distances_pc = FILL IN CODE

# Using LIST COMPREHENSION, change the values of our original list so that all values are in pc
distances = FILL IN CODE


# Print out the new list distances_pc and the modified distances list
# Again, they SHOULD be identical!
print('distances_pc =', FILL IN CODE)
print('distances in pc =', FILL IN CODE)

## Takeaways

> - As an alternative to while loops, for loops are even more common and more flexible when it comes to looping in Python<br>
> - For loops allow you to iterate over a range of pre-determined numbers, or a list of non-numeric items (e.g., every student name in a list)<br>
> - List comprehension is a bit tricky to learn but so powerful that it's probably worth your time to learn to take advantage of it; among other things, it can keep your code clear and concise<br>