List Comprehensions
-------------------

A practical data scientist often needs to transform one list into another: changing the values of the input list, filtering out undesired values, and so on. List comprehensions are a natural, flexible way to perform these transformations on the elements of a list.

The syntax of list comprehensions is based on the way mathematicians define sets and lists, a notation that makes it clear what the contents should be:

+ `S = {x² : x in {0 ... 9}}`

Python's list comprehensions offer a natural way to write statements like these. The syntax may look strange at first, but it soon becomes a concise way of creating lists without writing explicit for-loops. For comparison, the cells below first build `S` with a for-loop and then with a list comprehension.

In [None]:
S = [] # initialize the list
for x in range(10):
    S.append(x*x)
print(S)

In [None]:
# The code below creates a list with the squares
# of the numbers from 0 to 9
S = []  # create an empty list
for i in range(10):  # iterate over all numbers from 0 to 9
    S.append(i*i)  # append the square of i to the list
print(S)  # print the list

In [None]:
S = [i*i for i in range(10)]
print(S)

Now let's do one more example:

+ `V = (1, 2, 4, 8, ..., 2¹²)`


In [None]:
V = [2**i for i in range(13)]
print(V)

In [None]:
V = []
for i in range(13):
    V.append(2**i)
print(V)

### The *if* statement within a list comprehension

Now let's consider the following case:

+ `M = {x | x in S and x even}`

**Note that the list comprehension for deriving M uses an `if` clause to filter out the values that are not of interest**, keeping only the even squares.

In [None]:
S = [i*i for i in range(10)]
print(S)

In [None]:
M = []
for i in S: # iterate through all elements in S
    if i%2 == 0: # if i is an even number
        M.append(i) # ..add it to the list
print(M)

In [None]:
M = [x for x in S if x%2 == 0]
print(M)
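
The examples so far either transform values or filter them. As a minimal sketch, the two can be combined in one comprehension of the form `[expression for item in iterable if condition]`. The name `half_even` below is only illustrative and does not appear elsewhere in this notebook:

```python
# Squares of 0..9, rebuilt here so the cell runs on its own
S = [i * i for i in range(10)]

# Filter and transform in one step:
# keep only the even squares (the "if" part),
# then halve each of them (the expression part)
half_even = [x // 2 for x in S if x % 2 == 0]
print(half_even)  # [0, 2, 8, 18, 32]
```

The condition is evaluated first for each element, and the expression is applied only to the elements that pass it.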

These are simple examples that use numerical computation. Let's see a more "practical" use: the following operation transforms a string into a list of tuples, one per word, which is a more complex operation:

In [None]:
words = 'The quick brown fox jumps over the lazy dog'
[(w.upper(), w.lower(), len(w)) for w in words.split()]

#### Exercise

* List each word and its length from the string 'The quick brown fox jumps over the lazy dog', restricted to words that are four or more characters long
* List only words with the letter o in them

In [None]:
# List each word and its length from the string 
# 'The quick brown fox jumps over the lazy dog', 
# conditioned on the length of the word being four characters and above


In [None]:
# List only words with the letter o in them


* You are given the `wsj` article below. Write a list comprehension for getting the words that appear more than once. 
    * Use the `.split()` command for splitting, without passing a parameter.
    * When counting words, case does not matter (i.e., YAHOO is the same as Yahoo).

* Find all the *characters* in the article that are not letters or numbers. You can use the `isdigit()` and `isalpha()` string methods (e.g., `"Panos".isalpha()` and `"1234".isdigit()` both return `True`).
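
As a hint for the last exercise, the sketch below shows how `isalpha()` and `isdigit()` behave on single characters. The string `sample` is a made-up example and is not taken from the article; applying the idea to `wsj` is left to you:

```python
# Hypothetical sample string, used only to illustrate the two methods
sample = "a1-B 2!"

# For each character, record whether it is a letter and whether it is a digit
checks = [(c, c.isalpha(), c.isdigit()) for c in sample]
for c, is_letter, is_digit in checks:
    print(repr(c), is_letter, is_digit)
```

Punctuation and whitespace characters return `False` for both methods.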

In [None]:
wsj = """
Yahoo Inc. disclosed a massive security breach by a “state-sponsored actor” affecting at least 500 million users, potentially the largest such data breach on record and the latest hurdle for the beaten-down internet company as it works through the sale of its core business.
Yahoo said certain user account information—including names, email addresses, telephone numbers, dates of birth, hashed passwords and, in some cases, encrypted or unencrypted security questions and answers—was stolen from the company’s network in late 2014 by what it believes is a state-sponsored actor.
Yahoo said it is notifying potentially affected users and has taken steps to secure their accounts by invalidating unencrypted security questions and answers so they can’t be used to access an account and asking potentially affected users to change their passwords.
Yahoo recommended users who haven’t changed their passwords since 2014 do so. It also encouraged users change their passwords as well as security questions and answers for any other accounts on which they use the same or similar information used for their Yahoo account.
The company, which is working with law enforcement, said the continuing investigation indicates that stolen information didn't include unprotected passwords, payment-card data or bank account information.
With 500 million user accounts affected, this is the largest-ever publicly disclosed data breach, according to Paul Stephens, director of policy and advocacy with Privacy Rights Clearing House, a not-for-profit group that compiles information on data breaches.
No evidence has been found to suggest the state-sponsored actor is currently in Yahoo’s network, and Yahoo didn’t name the country it suspected was involved. In August, a hacker called “Peace” appeared in online forums, offering to sell 200 million of the company’s usernames and passwords for about $1,900 in total. Peace had previously sold data taken from breaches at Myspace and LinkedIn Corp.
"""