#### Knowledge Sharing Content
# <center> Some Helpful Python Tricks
#### [Bhanu Pratap Singh](https://www.linkedin.com/in/bpst/)

A `trick` is a way of accomplishing a task in a surprisingly fast or easy manner.

## List Comprehension

Python offers a powerful way of creating new lists: list comprehension.

**[ expression + context ]**

The enclosing brackets indicate that the result is a new list. The context defines which list elements to select. The expression defines how to modify each list element before adding the result to the list.

In [2]:
# Example
[x * 2 for x in range(3)]

[0, 2, 4]

`for x in range(3)` is the context, and the remaining part, `x * 2`, is the expression.

The expression doubles the values 0, 1, 2 generated by the context. Thus, the list comprehension results in the following list: `[0, 2, 4]`.

Both the expression and the context can be arbitrarily complicated. The expression may be a function of any variable defined in the context and may perform any computation — it can even call outside functions. The goal of the expression is to modify each list element before adding it to the new list.

The context can consist of one or many variables defined using one or many nested for loops. We can also restrict the context by using if statements. In this case, a new value will be added to the list only if the user-defined condition holds.
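To make the role of the context concrete, here is a minimal sketch that combines two nested `for` loops with an `if` condition and compares the result with the equivalent explicit loops. The names `pairs` and `pairs_loop` are illustrative only and do not appear elsewhere in this notebook.

```python
# Comprehension: two nested for loops plus an if condition in the context.
pairs = [(x, y) for x in range(3) for y in range(3) if x != y]

# Equivalent explicit loops: the for clauses nest from left to right,
# and the if condition guards the append.
pairs_loop = []
for x in range(3):
    for y in range(3):
        if x != y:
            pairs_loop.append((x, y))

print(pairs)                 # [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
print(pairs == pairs_loop)   # True
```

Reading a comprehension as "outermost loop first, condition last" is usually enough to translate it back into ordinary loops.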

Let's look at some examples to get a good sense of list comprehension.

In [8]:
# Example
print([x for x in range(5)])

[0, 1, 2, 3, 4]


In the above example,

print([➊x ➋for x in range(5)])

**Expression** ➊: Identity function (does not change the context variable x).

**Context** ➋: Context variable x takes all values returned by the range function: 0, 1, 2, 3, 4.

In [9]:
# Example
print([(x, y) for x in range(3) for y in range(3)])

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]


In the above example,

print([➊(x, y) ➋for x in range(3) for y in range(3)])

**Expression** ➊: Create a new tuple from the context variables x and y.

**Context** ➋: The context variable x iterates over all values returned by the range function (0, 1, 2), while context variable y iterates over all values returned by the range function (0, 1, 2). The two for loops are nested, so the context variable y repeats its iteration procedure for every single value of the context variable x. Thus, there are 3 × 3 = 9 combinations of context variables.

In [10]:
# Example
print([x ** 2 for x in range(10) if x % 2 > 0])

[1, 9, 25, 49, 81]


In the above example,

print([➊x ** 2 ➋for x in range(10) if x % 2 > 0])

**Expression** ➊: Square function on the context variable x.

**Context** ➋: Context variable x iterates over all values returned by the range function — 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 — but only if they are odd values; that is, x % 2 > 0.

In [11]:
# Example
print([x.lower() for x in ['I', 'AM', 'NOT', 'SHOUTING']])

['i', 'am', 'not', 'shouting']


In the above example,

print([➊x.lower() ➋for x in ['I', 'AM', 'NOT', 'SHOUTING']])

**Expression** ➊: String lowercase function on context variable x.

**Context** ➋: Context variable x iterates over all string values in the list: 'I', 'AM', 'NOT', 'SHOUTING'.

### Problem Statement - I

Say you work in the human resources department of a large company and need to find all staff members who earn at least $100,000 per year. 

Your desired output is a list of tuples, each consisting of two values: the employee name and the employee’s yearly salary. 

In [1]:
# Data
employees = {'Alice' : 100000,
             'Bob' : 99817,
             'Carol' : 122908,
             'Frank' : 88123,
             'Eve' : 93121}

# Logic
top_earners = []
for key, val in employees.items():
    if val >= 100000:
        top_earners.append((key,val))

# Result
print(top_earners)

[('Alice', 100000), ('Carol', 122908)]


Now let's see how we can do the same thing using list comprehension:

In [13]:
# Data
employees = {'Alice' : 100000,
             'Bob' : 99817,
             'Carol' : 122908,
             'Frank' : 88123,
             'Eve' : 93121}

# One-Liner
top_earners = [(k, v) for k, v in employees.items() if v >= 100000]

# Result
print(top_earners)

[('Alice', 100000), ('Carol', 122908)]


Let's see how it works,

**top_earners = [ ➊(k, v) ➋for k, v in employees.items() if v >= 100000]**

**Expression** ➊: Creates a simple (key, value) tuple for context variables k and v.

**Context** ➋: The dictionary method `dict.items()` ensures that context variable `k` iterates over all dictionary keys and that context variable `v` iterates over the associated values for context variable k — but only if the value of context variable v is larger than or equal to 100,000 as ensured by the `if` condition.
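If the result should stay a dictionary rather than become a list of tuples, the same context carries over to a dictionary comprehension. This is a small sketch under that assumption; the name `top_earners_dict` is hypothetical.

```python
employees = {'Alice': 100000,
             'Bob': 99817,
             'Carol': 122908,
             'Frank': 88123,
             'Eve': 93121}

# Same context as the list comprehension, but the expression is a key: value pair.
top_earners_dict = {k: v for k, v in employees.items() if v >= 100000}

print(top_earners_dict)  # {'Alice': 100000, 'Carol': 122908}
```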

### Problem Statement - II

Search engines rank textual information according to its relevance to a user query. To accomplish this, search engines analyze the content of the text to be searched. All text consists of words. Some words provide a lot of information about the content of the text — and others don’t. 

Examples of the former are words like white, whale, Captain, Ahab (do you know the text?). Examples of the latter are words like is, to, as, the, a, or how, because most texts contain those words.

Filtering out words that don’t contribute a lot of meaning is common practice when implementing search engines. 

A simple heuristic is to filter out all words with three characters or fewer.

#### Using list comprehension to find words with high information value

In [15]:
# Data
text = '''
Call me Ishmael. Some years ago - never mind how long precisely - having
little or no money in my purse, and nothing particular to interest me
on shore, I thought I would sail about a little and see the watery part
of the world. It is a way I have of driving off the spleen, and regulating
the circulation. - Moby Dick'''


# One-Liner
words = [[x for x in line.split() if len(x)>3] for line in text.split('\n')]


# Result
print(words)

[[], ['Call', 'Ishmael.', 'Some', 'years', 'never', 'mind', 'long', 'precisely', 'having'], ['little', 'money', 'purse,', 'nothing', 'particular', 'interest'], ['shore,', 'thought', 'would', 'sail', 'about', 'little', 'watery', 'part'], ['world.', 'have', 'driving', 'spleen,', 'regulating'], ['circulation.', 'Moby', 'Dick']]


The one-liner creates a list of lists by using two nested list comprehension expressions:

* The inner list comprehension expression `[x for x in line.split() if len(x)>3]` uses the string `split()` function to divide a given line into a sequence of words. We iterate over all words `x` and add them to the list if they have more than three characters.
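* The outer list comprehension expression creates the string `line` used in the previous statement. Again, it uses the `split()` function, this time to divide the text on the newline character `'\n'`. Because the text starts with a newline, the first line is empty, which explains the empty list at the start of the result.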

Thinking in terms of list comprehensions takes practice, so the meaning of a nested expression like this one may not come naturally at first.
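The result above still contains an empty list for the blank first line and words with trailing punctuation such as `'Ishmael.'`. The following sketch shows one possible refinement, assuming we only care about periods and commas; the shortened `text` and the choice of `strip('.,')` are illustrative assumptions, not part of the original one-liner.

```python
text = '''
Call me Ishmael. Some years ago - never mind how long precisely - having
little or no money in my purse, and nothing particular to interest me'''

# Skip blank lines and strip surrounding periods/commas before measuring length.
words = [[w.strip('.,') for w in line.split() if len(w.strip('.,')) > 3]
         for line in text.split('\n') if line.strip()]

print(words)
```

The extra `if line.strip()` in the outer context drops empty lines, while the inner expression cleans each word before it is added.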

## Reading a File

Read a file and store the result as a list of strings (one string per line). We will also remove any leading and trailing whitespace from the lines.

In [6]:
filename = "sample.txt"

with open(filename) as f:  # the with block closes the file when done
    lines = []
    for line in f:
        lines.append(line.strip())

print(lines[0:2])

['Quod equidem non reprehendo;', 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quibus natura iure responderit non esse verum aliunde finem beate vivendi, a se principia rei gerendae peti; Quae enim adhuc protulisti, popularia sunt, ego autem a te elegantiora desidero. Duo Reges: constructio interrete. Tum Lucius: Mihi vero ista valde probata sunt, quod item fratri puto. Bestiarum vero nullum iudicium puto. Nihil enim iam habes, quod ad corpus referas; Deinde prima illa, quae in congressu solemus: Quid tu, inquit, huc? Et homini, qui ceteris animantibus plurimum praestat, praecipue a natura nihil datum esse dicemus?']


The code opens the file `sample.txt`, creates an empty list, `lines`, and fills the list with strings by using the `append()` operation in the for loop body, which iterates over all the lines in the file. We also use the string method `strip()` to remove any leading or trailing whitespace (otherwise, the newline character '\n' would appear in the strings).

Now let's see how we can do it in a one-liner:

In [8]:
print([line.strip() for line in open("sample.txt")])

['Quod equidem non reprehendo;', 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quibus natura iure responderit non esse verum aliunde finem beate vivendi, a se principia rei gerendae peti; Quae enim adhuc protulisti, popularia sunt, ego autem a te elegantiora desidero. Duo Reges: constructio interrete. Tum Lucius: Mihi vero ista valde probata sunt, quod item fratri puto. Bestiarum vero nullum iudicium puto. Nihil enim iam habes, quod ad corpus referas; Deinde prima illa, quae in congressu solemus: Quid tu, inquit, huc? Et homini, qui ceteris animantibus plurimum praestat, praecipue a natura nihil datum esse dicemus?', '', 'Iam id ipsum absurdum, maximum malum neglegi. Quod ea non occurrentia fingunt, vincunt Aristonem; Atqui perspicuum est hominem e corpore animoque constare, cum primae sint animi partes, secundae corporis. Fieri, inquam, Triari, nullo pacto potest, ut non dicas, quid non probes eius, a quo dissentias. Equidem e Cn. An dubium est, quin virtus ita maximam par
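The one-liner leaves closing the file to the interpreter. A common, slightly longer variant wraps the comprehension in a `with` block so that the file is closed as soon as the list is built. The sketch below writes a temporary file purely to be self-contained; the file contents and the names `tmp` and `path` are made up for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the example runs on its own;
# in the notebook you would open "sample.txt" instead.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('  first line  \nsecond line\n')
    path = tmp.name

# The with statement closes the file once the comprehension has finished.
with open(path) as f:
    lines = [line.strip() for line in f]

print(lines)  # ['first line', 'second line']
os.remove(path)  # clean up the temporary file
```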

### Problem Statement
Given a list of strings, create a new list of tuples, each consisting of a Boolean value and the original string. The Boolean value indicates whether the string 'anonymous' appears in the original string. We call the resulting list `mark` because the Boolean values mark the string elements in the list that contain the string 'anonymous'.

In [9]:
# Data
txt = ['lambda functions are anonymous functions.',
       'anonymous functions dont have a name.',
       'functions are objects in Python.']

# One-Liner
mark = map(lambda s: (True, s) if 'anonymous' in s else (False, s), txt)

# Result
print(list(mark))

[(True, 'lambda functions are anonymous functions.'), (True, 'anonymous functions dont have a name.'), (False, 'functions are objects in Python.')]


The `map()` function adds a Boolean value to each string element in the original txt list. This Boolean value is True if the string element contains the word anonymous. The first argument is the anonymous lambda function, and the second is a list of strings we want to check for the desired string.
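For comparison, the same marking can be written as a list comprehension. This is only a sketch of an equivalent formulation; the name `mark_lc` is hypothetical, and the lambda is shortened by using the result of `'anonymous' in s` directly as the Boolean.

```python
txt = ['lambda functions are anonymous functions.',
       'anonymous functions dont have a name.',
       'functions are objects in Python.']

# In Python 3, map() returns a lazy iterator, so list() is needed to see the values.
mark = list(map(lambda s: ('anonymous' in s, s), txt))

# Equivalent list comprehension: same expression, same context.
mark_lc = [('anonymous' in s, s) for s in txt]

print(mark == mark_lc)  # True
```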

## Using Slicing to Extract Matching Substring Environments

This section uses *slicing*, the process of carving out a subsequence from a full sequence, to process simple text queries.

Slicing carves out subsequences of a sequence, such as a part of a string. The syntax is straightforward. Say we have a variable `x` that refers to a string, list, or tuple. We can carve out a subsequence by using the following notation:

### **x[start:stop:step]**
The resulting subsequence starts at index `start` (included) and ends at index `stop` (excluded). An optional third argument, `step`, determines which elements are carved out, so we can choose to include just every `step`-th element. For example, the slicing operations below first take every character from index 1 up to (but not including) index 4, and then every second character in the same range:

In [10]:
x = 'hello world'
x[1:4:1]

'ell'

In [11]:
x[1:4:2]

'el'

If we don’t include the step argument, Python assumes the default step size of one. 
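The same notation applies to lists and tuples, not only to strings. A brief sketch, using made-up values in the hypothetical variables `numbers` and `point`:

```python
numbers = [10, 20, 30, 40, 50]
point = (1, 2, 3, 4)

print(numbers[1:4])   # [20, 30, 40]  (a list slice is a new list)
print(numbers[::2])   # [10, 30, 50]  (every second element)
print(point[1:3])     # (2, 3)        (a tuple slice is a new tuple)
```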

Study the following examples to improve our intuitive understanding even further.

In [12]:
# define string
s = 'Eat more fruits!'

In [13]:
print(s[0:3])

Eat


If start >= stop with a positive step size, the slice is empty:

In [15]:
print(s[3:0])




In [16]:
print(s[:5])

Eat m


In [17]:
print(s[5:])

ore fruits!


If the stop argument is larger than the sequence length, Python slices all the way to and including the rightmost element:

In [18]:
print(s[:100])

Eat more fruits!


In [19]:
print(s[4:8:2])

mr


If the step size is positive, the default start is the leftmost element, and the default stop is the rightmost element (included):

In [20]:
print(s[::3])

E rfi!


If the step size is negative (step < 0), the slice traverses the sequence in reverse order. With empty start and stop arguments, you slice from the rightmost element (included) to the leftmost element (included):

In [23]:
print(s[::-1])

!stiurf erom taE


Note that if the stop argument is given, the respective position is excluded from the slice.

In [22]:
print(s[6:1:-1])

rom t
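One way to check your intuition about any of these slices is to list the index positions it visits. The sketch below uses the built-in `slice.indices()` method, which resolves omitted and out-of-range arguments for a given length; the helper name `slice_positions` is an assumption made for this illustration.

```python
s = 'Eat more fruits!'

def slice_positions(seq, start, stop, step):
    """Return the index positions that seq[start:stop:step] visits."""
    # slice.indices() fills in defaults and clamps values to the sequence length.
    return list(range(*slice(start, stop, step).indices(len(seq))))

print(slice_positions(s, 6, 1, -1))       # [6, 5, 4, 3, 2]
print(slice_positions(s, None, None, 3))  # [0, 3, 6, 9, 12, 15]

# Joining the characters at those positions reproduces the slice itself.
print(''.join(s[i] for i in slice_positions(s, 6, 1, -1)))  # rom t
```

Passing `None` corresponds to leaving an argument empty in the `x[start:stop:step]` notation.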


### Problem Statement
Our goal is to find a particular text query within a multiline string and return it together with its immediate environment: up to 18 characters before the start of the match and 18 characters from the start of the match onward. Extracting the environment as well as the query is useful for seeing the textual context of the found string, just as Google presents text snippets around a searched keyword. Here we look for the string 'SQL' in an Amazon letter to shareholders.

In [24]:
# Data
letters_amazon = '''
We spent several years building our own database engine,
Amazon Aurora, a fully-managed MySQL and PostgreSQL-compatible
service with the same or better durability and availability as
the commercial engines, but at one-tenth of the cost. We were
not surprised when this worked.
'''

# One-Liner
find = lambda x, q: x[x.find(q)-18:x.find(q) + 18] if q in x else -1

# Result
print(find(letters_amazon, 'SQL'))

a fully-managed MySQL and PostgreSQL


The result is the query together with a few surrounding words that give the match its context.
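One caveat of the one-liner: if the match starts fewer than 18 characters into the text, `x.find(q) - 18` becomes negative, and a negative start index counts from the end of the string. The following is a hedged sketch of a variant that clamps the start at 0. The function name `find_context` and the `width` parameter are hypothetical, and it returns `None` instead of `-1` when the query is missing.

```python
def find_context(text, query, width=18):
    """Return the first match of query with up to `width` characters before
    its start and `width` characters from its start onward, or None."""
    pos = text.find(query)
    if pos == -1:
        return None
    # Clamp at 0: a negative start would count from the end of the string.
    return text[max(0, pos - width):pos + width]

short_text = 'MySQL and PostgreSQL'

# The match starts at index 2, so the unclamped slice would be short_text[-16:20].
print(short_text[2 - 18:2 + 18])        # L and PostgreSQL
print(find_context(short_text, 'SQL'))  # MySQL and PostgreSQL
```

For matches that start at least 18 characters into the text, as in the Amazon example, this variant returns the same snippet as the one-liner.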