# Comprehension Syntax

Comprehension syntax is a notation found in some programming languages and borrowed from mathematics (set-builder notation). Instead of listing elements one by one, it describes the properties that these elements have in common.

In Python, this can be thought of as a formula that allows us to generate elements instead of writing them all one by one.

This syntax can be used to build different objects (lists, sets and dictionaries) and is widely used, for example to filter iterables concisely.
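As a quick preview, the three forms might look like the sketch below. The `words` list is invented for illustration and is not part of the original notebook; each form is detailed later, except the set form, which follows the same pattern as the list form.

```python
# Hypothetical sample data, used only for this preview.
words = ["kiwi", "melon", "kiwi", "orange"]

# List comprehension: square brackets, keeps order and duplicates.
lengths_list = [len(w) for w in words]

# Set comprehension: braces, duplicates are dropped.
lengths_set = {len(w) for w in words}

# Dictionary comprehension: braces plus a "key: value" pair.
lengths_dict = {w: len(w) for w in words}

print(lengths_list)  # [4, 5, 4, 6]
print(lengths_set)   # contains 4, 5 and 6 (display order may vary)
print(lengths_dict)  # {'kiwi': 4, 'melon': 5, 'orange': 6}
```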

Although this syntax is short and elegant, it is not the right tool every time we need to generate a list, a dictionary or a set. If the operation to be performed is complex, it's probably best to use a `for` loop: it takes up a little more space in the code, but it is easier to write, test and debug, and easier for another programmer to understand.
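To make the trade-off concrete, here is a small sketch with invented data (`raw` is hypothetical). Both versions produce the same list; the comprehension fits on one line, while the loop names its intermediate value and is easier to step through in a debugger.

```python
# Hypothetical raw values: some are numbers stored as text, some are not.
raw = ["3", "x", "12", "7", "", "5"]

# Dense comprehension: filter, convert and transform in a single line.
dense = [int(v) * 2 if int(v) > 4 else int(v) for v in raw if v.isdigit()]

# Equivalent loop: longer, but each step is visible.
spelled_out = []
for v in raw:
    if not v.isdigit():
        continue  # skip values that are not numbers
    n = int(v)
    spelled_out.append(n * 2 if n > 4 else n)

print(dense)        # [3, 24, 14, 10]
print(spelled_out)  # [3, 24, 14, 10]
```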

# List Comprehension

## Definition

List comprehensions are a kind of "magic formula" for creating lists. In general, we start with an iterable object from which we want to create a list (it can be another list, but also a dictionary or any other iterable object).

To write a list comprehension, we open square brackets as if we wanted to create a list, but instead of enumerating the different elements that make it up, we use the following syntax:

```python
[expression for element in iterable]
```
Here:

- **expression** is the value we want to put in the new list, usually computed from **element**.
- **element** is the name we give to each item as we iterate.
- **iterable** is the object we iterate over.

So you could write it like this:

```python
[what_we_want_to_get for variable_containing_successively_each_element in object_iterable]
```

To begin in a very simple way, let's just write a list comprehension that generates a copy of another list:

In [None]:
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

[n for n in seq]
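A short check, as a sketch, that the result really is a copy: the new list has the same contents but is a separate object, so changing it leaves `seq` untouched. The name `copy_of_seq` is introduced here for illustration.

```python
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]
copy_of_seq = [n for n in seq]

print(copy_of_seq == seq)  # True: same contents
print(copy_of_seq is seq)  # False: a distinct list object

# Modifying the copy does not affect the original list.
copy_of_seq.append(100)
print(len(seq))            # 10
```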

Now let's imagine that we want to generate a new list containing the square of each of these numbers. We only need to change the expression:

In [None]:
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

[n ** 2 for n in seq]

Note that we could use a `for` loop as well, but it takes more lines to write:

In [None]:
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

new_list = []

for n in seq:
    new_list.append(n ** 2)

print(new_list)

### Exercise (easy)

From the following list, use a list comprehension to generate a new list in which each character is tripled. For example, the letter "c" will become "ccc" in the new list.

**Tip**:

- Remember, in Python it is possible to multiply strings by integers.

In [None]:
seq = ["a", "c", "d", "f", "g", "z", "e", "i"]

# Code here!

## Filtering with list comprehension and `if`

List comprehensions also make it easy to filter, i.e. to choose the elements that you will process (and those that you will not process).

To do this, you simply add an `if` at the end of the syntax.

```python
[el for el in iterable if cond]
```
In other words, we keep "el" only if the condition is met, where "el" is the element that will be added to our list. Otherwise the element is skipped.

We could also write it this way:

```python
[what_we_want_to_get for variable_containing_successively_each_element in object_iterable if our_condition_is_true]
```

Let's go back to the previous exercise and imagine that we only want to triple the letters that are vowels:

In [None]:
seq = ["a", "c", "d", "f", "g", "z", "e", "i"]

[el * 3 for el in seq if el in ["a", "e", "i", "o", "u", "y"]]
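For comparison, the same filter can be spelled out as a `for` loop. This is only a sketch of the equivalence; the names `vowels` and `tripled` are introduced here for readability.

```python
seq = ["a", "c", "d", "f", "g", "z", "e", "i"]
vowels = ["a", "e", "i", "o", "u", "y"]

tripled = []
for el in seq:
    if el in vowels:            # the trailing `if` of the comprehension
        tripled.append(el * 3)  # the expression at the front

print(tripled)  # ['aaa', 'eee', 'iii']
```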

### Exercise (easy)

Write a list comprehension that returns the squares of the following numbers, keeping only the numbers that are strictly greater than 5.

In [None]:
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

# code here!


## `if` and `else` in a list comprehension

Sometimes we want to apply different transformations to elements depending on their nature. In this case you can use a conditional expression (`if` and `else`) in the first part of the comprehension.

```python
[expression_1 if condition else expression_2 for element in iterable]
```
For example, let's say we want to get the squares of the numbers 2, 4 and 6 and the cubes of the other numbers.

In [None]:
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

[n ** 2 if n in [2, 4, 6] else n ** 3 for n in seq]

Note that a conditional expression (`if` and `else`) can be combined with an `if` at the end of the comprehension. The trailing `if` filters the elements first, and the conditional expression is then applied only to the elements that pass the filter:

In [None]:
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

[n ** 2 if n in [2, 4, 6] else n ** 3 for n in seq if n > 5]
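Written as a loop, the order of operations becomes explicit: the filter runs first, then the conditional expression picks the transformation. A sketch, with `result` as an illustrative name:

```python
seq = [2, 6, 8, 5, 7, 9, 1, 2, 3, 6]

result = []
for n in seq:
    if n > 5:                 # trailing `if`: keep or skip the element
        if n in [2, 4, 6]:    # conditional expression: choose the transformation
            result.append(n ** 2)
        else:
            result.append(n ** 3)

print(result)  # [36, 512, 343, 729, 36]
```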

### Exercise (easy / medium)

Use a list comprehension to generate a new list from the "fruits" list. The new list must have the following characteristics:

- The first letter of each word must be a capital letter.
- Each "banana" item must be replaced by "Gorilla".
- Each "mango" item must be removed from the list.

In [None]:
fruits = ["mango", "banana", "kiwi", "orange",
         "banana", "banana", "banana", "banane",
         "mango", "mango", "banana", "mango", "banana",
         "pineapple", "mango", "banana", "pineapple"]

# Code here!

# Dictionary comprehension

## Definition

Dictionary comprehensions behave in the same way as list comprehensions, except that they must be given two expressions to return: the key and the value. The syntax is therefore slightly different:

```python
{k: v for k, v in iterable}
```
Here:

- **k** (*key*) is the variable that will be used as the key.
- **v** (*value*) is the value associated with this key.
- **iterable** is the object we iterate over. Because each iteration is unpacked into `k` and `v`, this iterable must yield pairs of values.

Let's start simply by creating a copy of a dictionary. In this case we must iterate over it using the `.items()` method, so that each iteration returns a key-value pair.

In [None]:
my_dict = {"kiwis": 3,
           "melons": 12,
           "oranges": 7}

{k: v for k, v in my_dict.items()}
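The sketch below shows why `.items()` is needed. Iterating over a dictionary directly yields only its keys, so unpacking each item into `k, v` fails. The `unpack_failed` flag is just an illustrative way to record the error.

```python
my_dict = {"kiwis": 3, "melons": 12, "oranges": 7}

print(list(my_dict))          # ['kiwis', 'melons', 'oranges']
print(list(my_dict.items()))  # [('kiwis', 3), ('melons', 12), ('oranges', 7)]

unpack_failed = False
try:
    # Each item is a key such as "kiwis": a 5-character string
    # cannot be unpacked into exactly two names.
    {k: v for k, v in my_dict}
except ValueError as err:
    unpack_failed = True
    print("Unpacking failed:", err)
```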

Now let's multiply the quantities by 5.

In [None]:
my_dict = {"kiwis": 3,
           "melons": 12,
           "oranges": 7}

{k: v * 5 for k, v in my_dict.items()}

And let's convert the keys to upper case.

In [None]:
my_dict = {"kiwis": 3,
           "melons": 12,
           "oranges": 7}

{k.upper(): v * 5 for k, v in my_dict.items()}
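One thing to keep in mind when transforming keys: dictionary keys are unique, so if two original keys map to the same new key, the later one overwrites the earlier one. A minimal sketch with a hypothetical `mixed_case` dictionary:

```python
# Hypothetical data: two keys that differ only by case.
mixed_case = {"kiwis": 3, "Kiwis": 10}

# Both keys become "KIWIS"; the value seen last is the one that is kept.
merged = {k.upper(): v for k, v in mixed_case.items()}

print(merged)  # {'KIWIS': 10}
```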

## Generating a dictionary from lists

Let's pretend we have two lists: we want the items of the first one to become the keys and the items of the second to become the values. We can then use the `zip()` function, which pairs each element of one list with the element at the same position in the other, so that we can iterate over both lists simultaneously!

Let's create an object of type `zip` and iterate over it to see how it behaves:

In [None]:
l1 = ["kiwis", "melons", "oranges"]
l2 = [3, 12, 7]

for el in zip(l1, l2):
    print(el)

Zipping the lists together creates a special `zip` object that allows us to iterate over both lists at once. It returns a tuple at each iteration. Now let's generate our dictionary using a dictionary comprehension:

In [None]:
l1 = ["kiwis", "melons", "oranges"]
l2 = [3, 12, 7]

{k: v for k, v in zip(l1, l2)}
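Two details are worth knowing here, sketched below. When no transformation is needed, `dict(zip(...))` builds the same dictionary. Also, `zip()` stops at the shortest input, so extra items in the longer list are silently dropped. The names `from_dict` and `truncated` are illustrative.

```python
l1 = ["kiwis", "melons", "oranges"]
l2 = [3, 12, 7]

# Without any transformation, dict() accepts the pairs directly.
from_dict = dict(zip(l1, l2))
print(from_dict)  # {'kiwis': 3, 'melons': 12, 'oranges': 7}

# zip() stops at the shortest iterable: "oranges" has no partner here.
truncated = {k: v for k, v in zip(l1, [3, 12])}
print(truncated)  # {'kiwis': 3, 'melons': 12}
```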

Conditional structures (`if` and `else`) apply in the same way as in list comprehensions.
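As a sketch, using the same `my_dict` as above: a conditional expression can be applied to the value (or to the key), and a trailing `if` filters the pairs. The labels "plenty" and "few" and the thresholds are invented for the example.

```python
my_dict = {"kiwis": 3, "melons": 12, "oranges": 7}

# Conditional expression on the value.
labels = {k: ("plenty" if v >= 10 else "few") for k, v in my_dict.items()}
print(labels)    # {'kiwis': 'few', 'melons': 'plenty', 'oranges': 'few'}

# Trailing `if` to keep only some pairs.
filtered = {k: v for k, v in my_dict.items() if v > 5}
print(filtered)  # {'melons': 12, 'oranges': 7}
```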

# Exercise (medium)

Create a dictionary from the following two lists. The list containing the names of the different types of food provides the keys; the other list provides the values. Make sure that the following constraints are met:

- If a dictionary key has more than 6 characters, the first letter of the key will be capitalized. Otherwise, if the key is 6 characters or fewer, the entire key should be upper case.
- If the value is 0, replace it with "out of stock".
- Keys beginning with the letter "c" or "b" (regardless of their case) should be ignored.

**Tips**:

- Use `.upper()` to deal with the letter case.
- Do you remember the use of *slices*?
- If you apply `len()` to a string, you'll get the number of characters in the string.

In [None]:
food = ["Courgettes", "quiches", "aubergines", "croissants", "Baguettes", "kiwis", "melons", "oranges", "brioches"]
stock = [1, 0, 2, 16, 46, 0, 1, 4, 5]

# Code here!


# Exercise (difficult)

Use a list comprehension to generate a list of numbers and solve the following problem.

Find a 6-digit number that meets the following conditions:

- The first and last digits are the same.
- The first digit multiplied by 2 produces a 2-digit number.
- This number is the second and third digits.
- The last digit multiplied by 3 produces a 2-digit number.
- This number is the fourth and fifth digits.
- The total of all 6 digits equals 22.

**Tip**:

- Start by writing a list comprehension that generates all the integers with 6 digits. Then use `if` and `and` to filter the list.
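The same approach can be tried on a smaller, made-up problem first: find the 3-digit numbers whose first and last digits are equal and whose digits add up to 10. This is only a sketch of the technique (convert the number to a string to reach its digits), not the solution to the exercise.

```python
# All 3-digit integers, from 100 to 999 inclusive.
candidates = [n for n in range(100, 1000)]

matches = [
    n for n in candidates
    if str(n)[0] == str(n)[-1]               # first and last digits are the same
    and sum(int(d) for d in str(n)) == 10    # the digits add up to 10
]

print(matches)  # [181, 262, 343, 424, 505]
```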

In [None]:
# Code here!



# Exercise (difficult)

Some words in the English language have a unique property: their letters are arranged in alphabetical order. For example, the words "abbey", "cells" and "loops" share this property.

Run the cell below to create a list containing a large number of words in the English language (around 370 000), then write a list comprehension to keep only the words that share this property and contain at least 5 letters.

**Libraries**: To complete our task we will use the `requests` library, which allows us to retrieve information from the internet.

**Tips**:

- The `all` function can be very useful. It takes an iterable as input and returns `True` if all of its elements are true (or if the iterable is empty). If at least one of the elements is false, it returns `False`.

For example:

```python
>>> all([True, False, True])
False
```

But you can also apply `all` to the result of a list comprehension, for example:

```python
>>> l = [n**2 > 200 for n in range(13, 17)]
>>> all(l)
False
```

Or directly, with a generator expression (note the absence of square brackets "[ ]": the values are evaluated one at a time and no intermediate list is built):

```python
>>> all(n**2 > 200 for n in range(13, 17))
False
```
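One practical consequence of the generator form, sketched below: `all` stops as soon as it meets a false value, so the remaining elements are never evaluated. The helper `is_big` and the `checked` list are hypothetical, added only to record which numbers were actually tested.

```python
checked = []  # records every number that gets tested


def is_big(n):
    """Return True if n squared exceeds 200, logging the call."""
    checked.append(n)
    return n ** 2 > 200


result = all(is_big(n) for n in range(13, 17))

print(result)   # False
print(checked)  # [13]: 13 ** 2 is 169, so all() stopped at the first element
```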

- The `enumerate()` function yields two values for each item when we go through an iterable: the index and the element. These are often named "i" and "el". It is very useful when you want to compare characters in a string with other characters, since you can access any character with `s[i]` (where `s` is a string and `i` is an index).

```python
>>> for i, el in enumerate('Hey'):
...     print(i, el)
...
0 H
1 e
2 y
```
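As an illustration of combining `enumerate()` with index access, the sketch below collects the letters of a word that repeat the letter just before them. The variable names are illustrative, and this is a different question from the one asked in the exercise.

```python
s = "loops"

# Keep a letter when it equals the previous one; i > 0 avoids looking before the start.
doubled = [el for i, el in enumerate(s) if i > 0 and s[i - 1] == el]

print(doubled)  # ['o']
```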

- The `ord()` function returns the integer Unicode code point of a given character.

```python
>>> ord("a")
97
```
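A few more facts about `ord()` that may help, shown as a sketch: lowercase letters have increasing code points in alphabetical order, `chr()` is the inverse function, and uppercase letters come before lowercase ones. The name `codes` is illustrative.

```python
codes = [ord(c) for c in "abc"]
print(codes)                # [97, 98, 99]

print(ord("a") < ord("b"))  # True: code points follow alphabetical order
print(chr(97))              # a (chr() converts a code point back to a character)
print(ord("Z") < ord("a"))  # True: 90 < 97, uppercase letters come first
```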

- You need to find 341 words.

In [None]:
import requests

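# Download the word list and stop with an error if the request fails.
r = requests.get('https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt', timeout=30)
r.raise_for_status()
# split() without arguments cuts the text on any whitespace, giving one item per word.
words = r.text.split()
print(f"Number of words in the list 'words': {len(words)}")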

# Code here!
