## List comprehensions

Let's say we have the following task. We are given a list of words:

In [None]:
words = ["sky", "water", "air", "nature", "forest", "ice"]

We want to create a list that collects the length of each word in `words`. Using what we have learned about Python so far, a solution could look like the one below.

In [None]:
lengths = []



How does this code work? We

- Instantiate an empty list.
- Loop over an iterable or range of elements.
- Compute the length for each element and append it to the end of the list.
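The three steps above can be sketched as a plain `for` loop. This is one possible version; the loop variable name `word` is just a choice made here.

```python
words = ["sky", "water", "air", "nature", "forest", "ice"]

lengths = []                   # 1. start with an empty list
for word in words:             # 2. loop over the words
    lengths.append(len(word))  # 3. compute the length and append it

print(lengths)  # [3, 5, 3, 6, 6, 3]
```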


### Basics of list comprehensions

**List comprehensions** are a second, elegant way of dynamically constructing lists using a single line of code.
Stylewise, this approach is considered more "Pythonic", allowing you to focus on the logic behind the new list and worry less about the way it is constructed.

In the example above, we used the following code:

    new_list = []
    for item in some_iterable:
      new_list.append(function(item))
      
Now, rather than creating an empty list and adding each element to the end, we can do:

    
    new_list = [expression for item in some_iterable]
    
This allows us to simply define the list and its contents at the same time.

In [None]:
lengths = [len(word) for word in words]

Every list comprehension in Python includes three elements:

- the *expression* is the item itself, a call to a method, or any other valid expression that returns a value. In the example above, the expression `len(word)` gets you the length of each item.
- the *item* is the object or value in the list or iterable. In the example above, the item is `word`.
- *some_iterable* is a list, set, sequence, generator, or any other object that can return its elements one at a time. In the example above, the iterable is the list `words`.

Note that the requirements on how the expression is defined are very flexible.
For instance, we can create a new list where every single word from the old list is reversed.
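A possible sketch of the reversal, assuming the slice `[::-1]` as the way to reverse a string (the name `reversed_words` is introduced here only for illustration):

```python
words = ["sky", "water", "air", "nature", "forest", "ice"]

# The expression word[::-1] reads each word backwards
reversed_words = [word[::-1] for word in words]

print(reversed_words)  # ['yks', 'retaw', 'ria', 'erutan', 'tserof', 'eci']
```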

Or maybe you want to create a list where all letters in `words` are masked:
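One way this could look, assuming that "masking" means replacing every letter with `*` (the masking character and the name `masked` are choices made for this sketch):

```python
words = ["sky", "water", "air", "nature", "forest", "ice"]

# Repeat the mask character once per letter of the word
masked = ["*" * len(word) for word in words]

print(masked)  # ['***', '*****', '***', '******', '******', '***']
```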

And, of course, it is possible to map all values of the variable `w` to something that does not depend on `w` at all:
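For example, the expression can be a constant. In this small sketch (the value `0` and the name `zeros` are arbitrary choices), `w` is only used to count the iterations:

```python
words = ["sky", "water", "air", "nature", "forest", "ice"]

# The expression ignores w, so every word is mapped to the same value
zeros = [0 for w in words]

print(zeros)  # [0, 0, 0, 0, 0, 0]
```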

**Practice 1:** create a list containing the last letter of every word in `words`.

**Practice 2:** You are given two lists: `words` and `new_indices`.

In [1]:
words = ["sky", "water", "air", "nature", "forest", "ice"]
new_indices = [3, 0, 5, 1, 2, 4]

Write a list comprehension that would yield the following list:

    ['nature', 'sky', 'ice', 'water', 'air', 'forest']

In [2]:
[words[i] for i in new_indices]

['nature', 'sky', 'ice', 'water', 'air', 'forest']

As mentioned, list comprehensions are also **more declarative** than loops, which means they’re easier to read and understand.
Loops need you to focus on how the list is created. You have to manually create an empty list, loop over the elements, and add each of them to the end of the list. With a list comprehension in Python, you can instead focus on what you want to go in the list and trust that Python will take care of how the list construction takes place.

To read more about the advantages of this technique, you can check out [this blog post and the references therein](https://towardsdatascience.com/python-basics-list-comprehensions-631278f22c40).

### Adding conditions to list comprehensions

So far the list comprehensions we wrote were using the following logic:
  
  1. consider an item from some iterable;
  2. based on the value of item, create some new item;
  3. add this new item to a new list.
  
However, comprehensions would be less useful if we couldn't add *conditionals* to the way they build lists.
Consider now the following two lists: `swadesh` and `words`.

In [None]:
swadesh = ["fish", "bird", "dog", "house", "tree", "seed"]
words = ["bird", "laptop", "puppy", "house", "seed", "Python"]

The task is to make a copy of `words` that will contain only the words that are also included in the Swadesh list.

In [None]:
# Standard implementation Using Loops



It is also possible to do it using a list comprehension. The syntax will be the following:

    new_list = [expression for item in some_iterable if condition]

As with loops, conditionals are important as they allow us to filter out unwanted values.
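Applied to the Swadesh task above, the two approaches might look as follows. This is a sketch; the names `in_swadesh_loop` and `in_swadesh` are introduced here for illustration.

```python
swadesh = ["fish", "bird", "dog", "house", "tree", "seed"]
words = ["bird", "laptop", "puppy", "house", "seed", "Python"]

# Version 1: a standard loop with an if statement
in_swadesh_loop = []
for w in words:
    if w in swadesh:
        in_swadesh_loop.append(w)

# Version 2: the same filter written as a list comprehension
in_swadesh = [w for w in words if w in swadesh]

print(in_swadesh_loop)  # ['bird', 'house', 'seed']
print(in_swadesh)       # ['bird', 'house', 'seed']
```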

**Practice.** Make a copy of the list `swadesh` that will only contain words that end with a vowel. If it helps, write the standard for loop first, and then write a version with a list comprehension.

In [3]:
swadesh = ["fish", "bird", "dog", "house", "tree", "seed"]

# your code
[w for w in swadesh if w[-1] in "aeiou"]

['house', 'tree']

### More complicated list comprehensions 

It is also possible to add `else` and a second `for` loop in list comprehensions. However, the more "overloaded" a list comprehension is, the less readable it becomes.

My general advice would be not to use a list comprehension if you see that it looks very scary and unreadable. :)

However, consider the following examples of list comprehensions and their "unfolded" versions.

**Example 1.** Make a copy of the list `words`. Leave the words unmasked if those words are included in the `swadesh` list, otherwise mask them as "UNK".

In [7]:
swadesh = ["fish", "bird", "dog", "house", "tree", "seed"]
words = ["bird", "laptop", "puppy", "house", "seed", "Python"]

**Solution 1**: No list comprehension:

In [None]:
words_1 = []
for w in words:
    if w in swadesh:
        words_1.append(w)
    else:
        words_1.append("UNK")
print(words_1)

**Solution 2**: With a list comprehension:

In [10]:
[w if w in swadesh else "UNK" for w in words]

['bird', 'UNK', 'UNK', 'house', 'seed', 'UNK']

In this case, the list comprehension is arguably still more readable than the loop with `if` and `else`. Note that here the `if ... else` is part of the expression, so it comes before the `for`.

**Example 2.** Imagine we want to create a list of all letters from `words` while preserving their order.
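With nested loops, this could be sketched as follows (the name `letters` is introduced here for illustration): the outer loop walks over the words, and the inner loop walks over the letters of each word.

```python
words = ["bird", "laptop", "puppy", "house", "seed", "Python"]

letters = []
for w in words:      # outer loop: every word
    for l in w:      # inner loop: every letter of that word
        letters.append(l)

print(letters[:6])  # ['b', 'i', 'r', 'd', 'l', 'a']
```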

It is also possible to do this using a list comprehension:

In [12]:
[l for w in words for l in w]

['b',
 'i',
 'r',
 'd',
 'l',
 'a',
 'p',
 't',
 'o',
 'p',
 'p',
 'u',
 'p',
 'p',
 'y',
 'h',
 'o',
 'u',
 's',
 'e',
 's',
 'e',
 'e',
 'd',
 'P',
 'y',
 't',
 'h',
 'o',
 'n']

Bonus way to get the same output:

In [14]:
list("".join(words))

['b',
 'i',
 'r',
 'd',
 'l',
 'a',
 'p',
 't',
 'o',
 'p',
 'p',
 'u',
 'p',
 'p',
 'y',
 'h',
 'o',
 'u',
 's',
 'e',
 's',
 'e',
 'e',
 'd',
 'P',
 'y',
 't',
 'h',
 'o',
 'n']

In [15]:
numbers = [2,7,14,20,56,30,60,45,99]

# Squares of the numbers divisible by 3 or 5, kept only if the square exceeds 50
[n ** 2 for n in numbers if (n % 3 == 0 or n % 5 == 0) and n ** 2 > 50]

[400, 900, 3600, 2025, 9801]

## Dictionaries

Ok, let's now take a (short) break from loops and talk about a useful new data structure.

Studying objects like strings and lists, we have learned to uniquely identify an item within an object _by its index_. However, in order to be able to access something by index, objects need to be ordered.

When working with complex data though, it is sometimes useful to create objects that are not intrinsically ordered.
However, we would still need to be able to uniquely refer to elements within such objects.
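A **dictionary** (`dict`) is a data type that associates every _value_ with its _key_. Its items are looked up by key, not by position. (Since Python 3.7, a dictionary does preserve the order in which items were inserted, but it still cannot be indexed by position.)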

    dictionary = {key_1:value_1, key_2:value_2, ..., key_n:value_n}
    
Thus, we can uniquely identify elements within a dictionary (the values) by their name (the _key_).
    
### Requirements for keys and values

-  The **keys** must be unique (of course, because they are used _instead_ of indices). The data type of the key must be _immutable_. **Immutable** objects cannot be modified directly after they are created. Remember how difficult it is to modify a string and how easy it is to modify a list? Strings are _immutable objects_, and lists are _mutable_.

**Question** Why do you think it is important to use **immutable** objects as keys?

- The **values** can be anything, and they can be repeated as well.

In [None]:
int_keys = {37: "hello", 9: "world"}
float_keys = {48.2: "hello", 3.0: "world"}
string_keys = {"hello": "world", "goodbye": "earth"}
bool_keys = {True: "hello", False: "world"}

Lists are mutable, and therefore they cannot be used as keys. Note the error Python gives you: it calls the list `unhashable`, which is the case for Python's built-in mutable types (those that can be changed once instantiated).

In [2]:
list_keys = {[1, 2]: "Hello"}

TypeError: unhashable type: 'list'

But they can be easily used as values.

In [1]:
list_val = {"list1": [1, 2]}
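If a list-like key is ever needed, a tuple (the immutable counterpart of a list) might be used instead, as long as its elements are immutable too. A minimal sketch with made-up keys and values:

```python
# Tuples of numbers are hashable, so they are accepted as keys
tuple_keys = {(1, 2): "hello", (3, 4): "world"}
print(tuple_keys[(1, 2)])  # hello

# The same key written as a list is rejected
try:
    bad_keys = {[1, 2]: "hello"}
except TypeError as error:
    print("TypeError:", error)  # TypeError: unhashable type: 'list'
```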

As we learned earlier, keys must be _unique_. If several different dictionary items are defined with the same key, only the last of them will be included in the dictionary.

**Warning** This means that if you instantiate a dictionary with duplicate keys, Python will not raise an error. But your code will not work as expected.

In [3]:
int_keys = {42: "hello", 0: "world", 42: "again"}
print(int_keys)

{42: 'again', 0: 'world'}


**Question:** What is the maximum size of a dictionary where all the keys are of the type `bool`?

**A Case Study: A Dictionary with ISO 639 language codes**

Consider the following dictionary with some of the [ISO 639](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) codes of languages.

In [None]:
iso_639 = {'ny': ['Chichewa', 'Chewa', 'Nyanja'], 
           'zh': ['Chinese'], 
           'cs': ['Czech'], 
           'da': ['Danish'], 
           'dv': ['Divehi', 'Maldivian']}

Values can be accessed by keys in the same way as we were accessing them by indices before:

In [None]:
print("The value of \"ny\" is", iso_639["ny"])
print("The value of \"da\" is", iso_639["da"])

**Question.** How would you access the string "Chewa"?

Adding a new item to the dictionary is extremely easy.
    
    dictionary[new_key] = new_value

In [4]:
iso_639["ru"] = ["Russian"]
print(iso_639)

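{'ny': ['Chichewa', 'Chewa', 'Nyanja'], 'zh': ['Chinese'], 'cs': ['Czech'], 'da': ['Danish'], 'dv': ['Divehi', 'Maldivian'], 'ru': ['Russian']}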

### Iterating over dictionaries

Dictionaries can be iterated over. But beware: iterating over a dictionary actually **iterates over its keys**.

In [None]:
for language in iso_639:
    print(language, end=" ")

**Practice.** Modify the for loop above so that it produces the following output:

    ny -> ['Chichewa', 'Chewa', 'Nyanja']
    zh -> ['Chinese']
    cs -> ['Czech']
    da -> ['Danish']
    dv -> ['Divehi', 'Maldivian']
    ru -> ['Russian']

In [None]:
#write your code here

One way to iterate over the `key:value` pairs is to apply the `.items()` method to the dictionary.
That method extracts each `key:value` pair as a single element, by turning them into a _tuple_. (Remember `zip` and `enumerate`? Tuples are basically _immutable_ versions of lists.)
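A short sketch of `.items()` on a shortened copy of the language dictionary (the loop variable names `code` and `names` are chosen here for illustration):

```python
iso_639 = {"cs": ["Czech"], "da": ["Danish"], "dv": ["Divehi", "Maldivian"]}

# Each element produced by .items() is a (key, value) tuple,
# which can be unpacked into two variables right in the for statement
for code, names in iso_639.items():
    print(code, "->", names)

pairs = list(iso_639.items())
print(pairs[0])  # ('cs', ['Czech'])
```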

### Methods defined for dictionaries

As with lists and strings, there are many useful methods associated with dictionaries.
This section of the notebook illustrates a few important ones.

  * method `.keys()` returns a collection of keys;
  * method `.values()` returns a collection of values;
  * method `.clear()` removes all items from a dictionary;
  * operator `del` deletes an item by its key.
  
Click on [this link](https://www.programiz.com/python-programming/methods/dictionary) if you want to learn more.
Consider now the following dictionary:

In [None]:
zip_codes = {"Stony Brook": [11733, 11790, 11794],
             "Port Jefferson": [11777],
             "Lake Grove": [11755, 11790]
            }

First of all, it is possible to separate a dictionary into independent collections of keys and of values.
The method `.keys()` returns a collection of keys from a dictionary, and it can easily be converted into a list.

In [None]:
print("Keys:            ", zip_codes.keys())
print("Keys (as a list):", list(zip_codes.keys()))

The method `.values()` returns a collection of values, and it can also easily be converted into a list.

In [None]:
print("Values:            ", zip_codes.values())
print("Values (as a list):", list(zip_codes.values()))

**Practice 1.** Create a list of keys in `zip_codes` using a list comprehension.

**Practice 2.** Now create a list of values in `zip_codes` using a list comprehension.

Similarly to lists, the operator `del` removes an item from a dictionary.

When we were looking at the list methods, we saw that `del` deletes an item by its index. Dictionary items do not have indices, but they have keys, so `del` deletes an item from a dictionary _by key_.

In [None]:
del zip_codes["Stony Brook"]
print(zip_codes)

Finally, `.clear()` completely wipes the dictionary:

In [None]:
zip_codes.clear()
print(zip_codes)

**Example.** Let's look at a practical example of how to use dictionaries. We are given the lists `fruits` and `prices`.

In [1]:
fruits = ["banana", "apple", "apple", "peach", "kiwi", "kiwi", "kiwi"]
prices = ["$1.20", "$0.87", "$0.48", "$2.9", "$0.93", "$1.48", "$1.05"]

It means that the only observed price of a banana is $\$1.20$. In different stores, apples cost $\$0.87$ and $\$0.48$, and so on. We want to create a dictionary that will store all the prices in the following way:

    {'banana': ['$1.20'], 'apple': ['$0.87', '$0.48'], 'peach': ['$2.9'], 'kiwi': ['$0.93', '$1.48', '$1.05']}

In [4]:
fruit_prices = {}
for i in range(len(fruits)):
    if fruits[i] not in fruit_prices:
        fruit_prices[fruits[i]] = [prices[i]]
    else:
        fruit_prices[fruits[i]].append(prices[i])

print(fruit_prices)

{'banana': ['$1.20'], 'apple': ['$0.87', '$0.48'], 'peach': ['$2.9'], 'kiwi': ['$0.93', '$1.48', '$1.05']}
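As an alternative sketch (not the only way to write it), the same dictionary could be built without indices by pairing the two lists with `zip` and using the dictionary method `.setdefault()`, which returns the value stored under a key and first stores a default if the key is missing:

```python
fruits = ["banana", "apple", "apple", "peach", "kiwi", "kiwi", "kiwi"]
prices = ["$1.20", "$0.87", "$0.48", "$2.9", "$0.93", "$1.48", "$1.05"]

fruit_prices = {}
for fruit, price in zip(fruits, prices):
    # Create an empty list for a new fruit, then append the price to it
    fruit_prices.setdefault(fruit, []).append(price)

print(fruit_prices)
```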


**Practice.** An easy way to calculate a sum of all numbers before a certain integer (excluding that integer) is the following one:

In [None]:
a = 8
print("1 + 2 + 3 + 4 + 5 + 6 + 7 =", sum(range(a)))

b = 16
print("1 + 2 + 3 + ... + 14 + 15 =", sum(range(b)))

Create a dictionary where keys will be natural numbers from $1$ to $10$. For every key $n$, its value is the sum of all numbers from $0$ up to $n-1$.

    sums = {1: 0, 2: 1, 3: 3, 4: 6, 5: 10, 6: 15, 7: 21, 8: 28, 9: 36, 10: 45}

In [8]:
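# Option 1: add up the numbers with sum(range(k))
sums = {k: sum(range(k)) for k in range(1, 11)}
# Option 2: same result with the closed formula k*(k-1)/2
sums = {k: k*(k-1)//2 for k in range(1, 11)}
print(sums)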

{1: 0, 2: 1, 3: 3, 4: 6, 5: 10, 6: 15, 7: 21, 8: 28, 9: 36, 10: 45}
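The solution above relies on a dictionary comprehension, which follows the same pattern as a list comprehension but uses curly braces and a `key: value` expression. An unfolded loop version might look like this sketch (the name `sums_loop` is introduced here for comparison):

```python
# Unfolded version: start with an empty dictionary and add one key at a time
sums_loop = {}
for k in range(1, 11):
    sums_loop[k] = sum(range(k))

# Dictionary comprehension: {key_expression: value_expression for item in iterable}
sums = {k: sum(range(k)) for k in range(1, 11)}

print(sums_loop == sums)  # True
```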


## (Optional!) Advanced section: `map`

`map` is a third way to create lists, alongside loops and list comprehensions.
The idea is to "map" every single item of a given iterable to a new value. 
    
    new_iterable = map(expression, old_iterable)
    
The parameter `expression` represents any function such as `len`, `sum`, `range`, or others. `map` creates a _map_ object that can be easily converted to a list. (You can also use your own customized functions, as we will learn in a few weeks!)

In [None]:
lengths = map(len, words)
print("Map object:", lengths)
print("As a list:", list(lengths))
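One detail worth knowing: a map object can be consumed only once. A small sketch (the two-word list and the names `first_pass` and `second_pass` are just illustrative):

```python
lengths = map(len, ["sky", "water"])

first_pass = list(lengths)   # consumes the map object
second_pass = list(lengths)  # nothing is left to consume

print(first_pass)   # [3, 5]
print(second_pass)  # []
```

If the values are needed more than once, it is usually simplest to convert the map object to a list right away.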

In [15]:
numbers = range(1, 6)
new_numbers = list(map(lambda x: x*2 + 1, numbers))
print(new_numbers)

[3, 5, 7, 9, 11]
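For comparison, here is a sketch of the same computation written both with `map` and with a list comprehension (the names `with_map` and `with_comprehension` are chosen here for illustration). The `lambda` defines a small unnamed function, while the comprehension writes the expression directly.

```python
numbers = range(1, 6)

with_map = list(map(lambda x: x * 2 + 1, numbers))
with_comprehension = [x * 2 + 1 for x in numbers]

print(with_map)                        # [3, 5, 7, 9, 11]
print(with_map == with_comprehension)  # True
```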


**A very advanced question:** You are given the `initial_range`.

In [9]:
initial_range = list(range(12))
print(initial_range)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]


Find a way to create the following list (the shape of the representation doesn't matter):

        [[],
         [0],
         [0, 1],
         [0, 1, 2],
         [0, 1, 2, 3],
         [0, 1, 2, 3, 4],
         [0, 1, 2, 3, 4, 5],
         [0, 1, 2, 3, 4, 5, 6],
         [0, 1, 2, 3, 4, 5, 6, 7],
         [0, 1, 2, 3, 4, 5, 6, 7, 8],
         [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
         [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]
         
You might need to use `map` twice!

In [13]:
stairs = map(list, map(range, initial_range))
print(list(stairs))

[[], [0], [0, 1], [0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5, 6], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7, 8], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]
