# Trick 1/10
Do you like "before-and-after" case studies like in those shows where the nerd gets a new set of clothing and becomes the cool kid in town?

Today, you're looking at a special before-and-after case study: how to beautify a Python code snippet.

Let's say you have this code to find all top earners from a dictionary of employee salary data:


In [None]:
# Create employees dictionary of salaries
employees = {'Alice' : 100000,
             'Bob' : 99817,
             'Carol' : 122908,
             'Frank' : 88123,
             'Eve' : 93121}
print(employees)

Who are the top earners making at least $100k per month?

In [None]:
top_earners = []
for key, val in employees.items():
    if val >= 100000:
        top_earners.append((key,val))

# Print everything

In [None]:
print(top_earners)

 [('Alice', 100000), ('Carol', 122908)]

Four lines of code just to extract the top earners! Can we do better?
Let's have a look at the Python One-Liner that makes this more readable and more beautiful!

Who are the top earners making at least $100k per month?

In [None]:
res = [(k,v) for k,v in employees.items() if v>=100000]
print(res)

You use list comprehension to extract all employees who earn at least $100,000.
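
A list comprehension has two parts: the expression `(k, v)` that builds each new element, and the context `for k, v in employees.items() if v >= 100000` that selects which elements to visit. As a minimal sketch, the same pattern can be wrapped in a small helper. The function name `top_earners` and the `threshold` parameter are illustrative assumptions, not part of the original snippet.

```python
# Hypothetical helper: the function name and the `threshold` parameter are assumptions
employees = {'Alice': 100000,
             'Bob': 99817,
             'Carol': 122908,
             'Frank': 88123,
             'Eve': 93121}


def top_earners(salaries, threshold=100000):
    # expression: (k, v); context: every item whose value reaches the threshold
    return [(k, v) for k, v in salaries.items() if v >= threshold]


print(top_earners(employees))
# [('Alice', 100000), ('Carol', 122908)]
print(top_earners(employees, threshold=120000))
# [('Carol', 122908)]
```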

Check out my video where I explain the one-liner to you in a step-by-step manner:
https://click.mlflow.com/link/c/YT0xODE1MTYwMzIwOTM5NzkyNTYzJmM9bDh4OSZlPTAmYj02NDcxNzc2MDgmZD1uNHY0cjFj.V2yPuM_2BR7IbyXGDX0Da2CNNlbFbUS6Qh6UNBJ77Xs    

# Trick 2/10
In today's tutorial, you'll learn how to read a file and strip the leading and trailing whitespaces and store everything in a list of strings---all of this in a single line of code.

But first, let's dive into the less concise and less Pythonic "ordinary" way of reading files...

MULTIPLE LINES: How to read a file and strip whitespaces?

Here's the normal version (assume that this code is stored in a file named 'readFileDefault.py'):

## How to read a file in Python

In [None]:
filename = "readFileDefault.py" # this code
f = open(filename)
lines = []
for line in f:
    lines.append(line.strip())
print(lines)

# This is the output:

In [None]:
['filename = "readFileDefault.py" # this code',
 '', '',
 'f = open(filename)',
 'lines = []',
 'for line in f:',
 'lines.append(line.strip())',
 '', '',
 'print(lines)']
 

A lot of code---just to do some basic file I/O! The variable lines contains all lines with leading and trailing whitespaces removed.

# ONE-LINER

Let's apply your one-liner superpower to this code snippet:

In [None]:
print([l.strip() for l in open("readFileDefault.py")])

That's it! You use list comprehension to iterate over the contents of the dynamically created file object.

# This is the output:

['print([l.strip() for l in open("readFileDefault.py")])']
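
One thing the one-liner leaves open is when the file gets closed: it relies on the interpreter to clean up the file object. A hedged variant, assuming you are willing to spend one extra line, uses a `with` block. The temporary file and its contents below are made up so that the sketch runs on its own.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained (contents are made up)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("  first line  \n\tsecond line\n\nlast line   \n")
    path = tmp.name

# The with block closes the file as soon as the comprehension is done
with open(path) as f:
    lines = [line.strip() for line in f]

print(lines)
# ['first line', 'second line', '', 'last line']

os.remove(path)  # clean up the sample file
```

Note that strip() turns a blank line into an empty string rather than dropping it.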

Check out my video where I explain the one-liner to you in a step-by-step manner:
https://click.mlflow.com/link/c/YT0xODE1ODg1MTU5MjI0ODQyNTM5JmM9eDFlNSZlPTAmYj02NDc5Nzg3MzEmZD13OGkzcjN6.HgWMAB7ft0Z21fiBVzNagM3QhXcXxcs3VA40ii21TZ4

# Trick 3/10
When given a list of strings, our next one-liner creates a new list of tuples, each consisting of a Boolean flag (1 or 0) and the original string.

The Boolean value indicates whether the string 'anonymous' appears in the original string!

Here's the sample code:

## Data

In [None]:
t = ['lambda functions are anonymous functions.',
     'anonymous functions dont have a name.',
     'functions are objects in Python.']
q = 'anonymous' # query
print(t)

## One-Liner

In [None]:
mark = map(lambda s: (1, s) if q in s else (0, s), t)

## Result

In [None]:
print(list(mark))

We call the resulting list mark because the Boolean values mark the string elements in the list that contain the string 'anonymous'.

The map() function adds a Boolean flag to each string element in the original list t. This flag is 1 if the string element contains the word 'anonymous' and 0 otherwise. The first argument is the anonymous lambda function, and the second is the list of strings you want to check for the desired string.

You use the lambda return expression (1, s) if q in s else (0, s) to search for the query q, which is the string 'anonymous' in this example. The value s is the input argument of the lambda function, which, in this example, is a string.

If the string query 'anonymous' exists in the string, the expression returns the tuple (1, s). Otherwise, it returns the tuple (0, s).

The result of the one-liner is the following:

[(1, 'lambda functions are anonymous functions.'),
(1, 'anonymous functions dont have a name.'),
(0, 'functions are objects in Python.')]
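
Keep in mind that map() returns a lazy iterator that can be consumed only once, so a second call of list(mark) yields an empty list. The sketch below shows this behavior, along with an equivalent list comprehension that stores real Booleans instead of 1 and 0. The names `first`, `second`, and `marked` are illustrative.

```python
t = ['lambda functions are anonymous functions.',
     'anonymous functions dont have a name.',
     'functions are objects in Python.']
q = 'anonymous'  # query

mark = map(lambda s: (1, s) if q in s else (0, s), t)
first = list(mark)   # consumes the iterator
second = list(mark)  # nothing is left to iterate over
print(len(first), len(second))
# 3 0

# Equivalent list comprehension; `q in s` already evaluates to a Boolean
marked = [(q in s, s) for s in t]
print([flag for flag, _ in marked])
# [True, True, False]
```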

Video course:
https://click.mlflow.com/link/c/YT0xODE2NjA5OTUyMjg2OTA3MjAwJmM9bjd0OSZlPTAmYj02NDk1NDAzMzAmZD1pNHI0aTdi.RBajVa8Tlp2hZEWdTaliZQiv5OBfQz6d7jtzgxg9jNA

# Trick 4/10
This one-liner teaches you some critical Python skills:

- slicing,
- the ternary operator, and
- the string find() method.

Problem Description

Your goal is to find a particular text query within a multiline string. You want to find the query in the text and return its immediate environment, up to 18 positions around the matching query string.

You may ask - why? - and rightly so! 

Extracting the environment of the query is useful for search context—just as the Google search engine presents text snippets around a found keyword. 

Solution

In the following code example, you’re looking for the string 'SQL' in an Amazon letter to shareholders—with the immediate environment of up to 18 positions around the string 'SQL'.

Here's the one-liner solution:

## Data

In [None]:
letters_amazon = '''
We spent several years building our own database engine,
Amazon Aurora, a fully-managed MySQL and PostgreSQL-compatible
service with the same or better durability and availability as
the commercial engines, but at one-tenth of the cost. We were
not surprised when this worked.
'''

## One-Liner

In [None]:
find = lambda x, q: x[x.find(q)-18:x.find(q)+18] if q in x else -1

## Result

In [None]:
print(find(letters_amazon, 'SQL'))

## a fully-managed MySQL and PostgreSQL

You define a lambda function with two arguments: string value x, and query q to search for in the text. 

You assign the lambda function to the name find. The function find(x, q) finds the string query q in the string text x.

If query q does not appear in string x, you directly return the result -1. 

Otherwise, you use slicing on the text string to carve out the first occurrence of the query, plus 18 characters to the left of the query and 18 characters to the right, to capture the query’s environment.

You obtain the index of the first occurrence of q in x with the string method x.find(q). You call the method twice, once to determine the start index and once to determine the stop index of the slice. Both calls return the same value because query q and string x do not change.

## Discussion

Although this code works perfectly fine, the redundant function call causes unnecessary computations—a disadvantage that could easily be fixed by adding a helper variable to temporarily store the result of the first function call. You could then reuse the result from the first function call by accessing the value in the helper variable.

This highlights an important trade-off: by restricting yourself to one line of code, you cannot define and reuse a helper variable to store the index of the first occurrence of the query. Instead, you must execute the same function find to compute the start index (and decrement the result by 18 index positions) and to compute the end index (and increment the result by 18 index positions).
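
If you relax the one-line constraint, a helper variable avoids the second call. The following is a minimal sketch, not the original solution: the name `find_context` and the `width` parameter are assumptions. The `max(0, ...)` guard is also an addition, for matches closer than `width` characters to the start of the text, where a negative start index would otherwise select the wrong slice.

```python
def find_context(text, query, width=18):
    """Return the slice from `width` characters before the first match
    to `width` characters after the start of the match, or -1 if absent."""
    i = text.find(query)           # computed once and reused
    if i == -1:
        return -1                  # same convention as the one-liner
    start = max(0, i - width)      # guard against a negative start index
    return text[start:i + width]


letters_amazon = '''
We spent several years building our own database engine,
Amazon Aurora, a fully-managed MySQL and PostgreSQL-compatible
service with the same or better durability and availability as
the commercial engines, but at one-tenth of the cost. We were
not surprised when this worked.
'''

print(find_context(letters_amazon, 'SQL'))
# a fully-managed MySQL and PostgreSQL
print(find_context(letters_amazon, 'Oracle'))
# -1
print(find_context('SQL is a query language', 'SQL', width=6))
# SQL is
```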

## Video

If you struggled with understanding this one-liner code, check out my explainer video:
https://click.mlflow.com/link/c/YT0xODE3MzM0NzM2NzUwNjQ3OTM0JmM9cTZ2OSZlPTAmYj02NTA4NTExNDcmZD1sNGwxZjd4.ZLLMY2G0IU_Tv79gwWKLKaqU3PtFipk2FpijDa6ye0E


# Trick 5/10

In this email, you'll combine list comprehension and slicing to sample a two-dimensional data set. You aim to create a smaller but representative sample of data from a prohibitively large sample.

Motivation and Basics

Say you work as a financial analyst for a large bank and are training a new machine learning model for stock-price forecasting.

You have a training data set of real-world stock prices. However, the data set is huge, and the model training seems to take forever on your computer.

To speed things up, you reduce the data set by half by excluding every other stock-price data point. You don’t expect this modification to decrease the model’s accuracy significantly.

In this email, you’ll use two Python features you learned about previously in this email series:

- List comprehension allows you to iterate over each list element and modify it subsequently.
- Slicing allows you to select every other element from a given list quickly, and it lends itself naturally to simple filtering operations.

Let’s have a detailed look at how these two features can be used in combination.

Source Code

Your goal is to create a new training data sample from our data—a list of lists, each consisting of six floats—by including only every other float value from the original data set. Take a look at the following code snippet:

## Data (daily stock prices ($))

In [1]:
price = [[9.9, 9.8, 9.8, 9.4, 9.5, 9.7],
         [9.5, 9.4, 9.4, 9.3, 9.2, 9.1],
         [8.4, 7.9, 7.9, 8.1, 8.0, 8.0],
         [7.1, 5.9, 4.8, 4.8, 4.7, 3.9]]

## One-Liner

In [2]:
sample = [line[::2] for line in price]

## Result

In [None]:
print(sample)

[[9.9, 9.8, 9.5],
 [9.5, 9.4, 9.2],
 [8.4, 7.9, 8.0],
 [7.1, 4.8, 4.7]]

Your solution is a two-step approach.
First, you use list comprehension to iterate over all lines of the original list, price.

Second, you create a new list of floats by slicing each line; you use line[start:stop:step] with default start and stop parameters and step size 2.

The new list of floats consists of only three (instead of six) floats.
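
To get a feel for the line[start:stop:step] notation, here is a short sketch on the first row of the price data. The alternative start and stop values are chosen only for illustration.

```python
line = [9.9, 9.8, 9.8, 9.4, 9.5, 9.7]

print(line[::2])     # default start and stop, step size 2
# [9.9, 9.8, 9.5]
print(line[1::2])    # start at index 1 to get the other half
# [9.8, 9.4, 9.7]
print(line[0:4:2])   # explicit start and stop
# [9.9, 9.8]
print(len(line))     # slicing returns a new list; the original is unchanged
# 6
```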

Explainer Video

If you struggle with understanding this one-liner, check out our explainer video:
https://click.mlflow.com/link/c/YT0xODE4MDU5NTc4Nzg1NDA1NzU0JmM9ZDNzMSZlPTAmYj02NTUxMDM4MjcmZD13NnoybzV5.2q-MZS3oKGCviX6Redgb-po-Rwwn7u_eqs73KvqYMX0


# Trick 6/10

In this email, you'll strengthen your understanding of two important Python concepts: slicing and slice assignment.

## Problem Description

Imagine you work at a small internet startup that keeps track of its users' web browsers (Chrome, Firefox, Safari). You store the data in a database.

To analyze the data, you load the gathered browser data into a large list of strings, but because of a bug in your tracking algorithm, every second string is corrupted and needs to be replaced by the correct string.

Assume that your web server always redirects the first web request of a user to another URL. This is a common practice in web development, known by the HTTP status code 301: Moved Permanently.

You conclude that the first browser value will be equal to the second one in most cases because the browser of a user stays the same while waiting for the redirection to occur. This means that you can easily reproduce the original data.

Essentially, you want to duplicate every other string: the list 

['Firefox', 'corrupted', 'Chrome', 'corrupted'] 

becomes 

['Firefox', 'Firefox', 'Chrome', 'Chrome'].

How can you achieve this in a fast, readable, and efficient way (preferably in a single line of code)?

## Solution Overview

Your first idea is to create a new list, iterate over the corrupted list, and add every non-corrupted browser twice to the new list. But you reject the idea because you’d then have to maintain two lists in your code—and each may have millions of entries. Also, this solution would require a few lines of code, which would reduce conciseness and readability of your source code.

Luckily, you’ve read about a beautiful Python feature: slice assignments. 

You’ll use slice assignments to select and replace a sequence of elements between indices i and j by using the slicing notation lst[i:j] = [0, 0, ..., 0].

Because you are using slicing lst[i:j] on the left-hand side of the assignment operation (rather than on the right-hand side), the feature is called slice assignment.

The idea of slice assignments is simple: replace all selected elements in the original sequence on the left with the elements on the right.
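
The following sketch illustrates the idea on a made-up list `lst`. It also shows one detail worth knowing: a contiguous slice may be replaced by a sequence of a different length, whereas a slice with a step size other than 1 requires both sides to have the same number of elements.

```python
lst = [1, 2, 3, 4, 5, 6]

# Contiguous slice: the replacement may have a different length
lst[1:3] = [0, 0, 0]
print(lst)
# [1, 0, 0, 0, 4, 5, 6]

# Slice with a step size: both sides must have the same length
lst[::2] = ['a', 'b', 'c', 'd']
print(lst)
# ['a', 0, 'b', 0, 'c', 5, 'd']

try:
    lst[::2] = ['x']          # one element for four selected positions
except ValueError as err:
    print('ValueError:', err)
```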

## Source Code

Your goal is to replace every other string with the string immediately in front of it.

In [9]:
## Data
visitors = ['Firefox', 'corrupted',
            'Chrome', 'corrupted',
            'Safari', 'corrupted',
            'Safari', 'corrupted',
            'Chrome', 'corrupted',
            'Firefox', 'corrupted']
print(visitors)

['Firefox', 'corrupted', 'Chrome', 'corrupted', 'Safari', 'corrupted', 'Safari', 'corrupted', 'Chrome', 'corrupted', 'Firefox', 'corrupted']


In [10]:
## One-Liner
visitors[1::2] = visitors[::2]

In [11]:
## Result
print(visitors)

['Firefox', 'Firefox', 'Chrome', 'Chrome', 'Safari', 'Safari', 'Safari', 'Safari', 'Chrome', 'Chrome', 'Firefox', 'Firefox']


The one-liner solution replaces the 'corrupted' strings with the browser strings that precede them in the list. You use the slice assignment notation to access every corrupted element in the visitors list.

## Explainer Video

As usual, if you struggle with understanding this one-liner, check out my explainer video:
https://click.mlflow.com/link/c/YT0xODE4Nzg0NDM3NDM3OTk1MDQ0JmM9cTdiNSZlPTAmYj02NTUxMTQ0MDgmZD1qNmQyZTlv.YFiswCGXp0d-04RCaTP9fjFi_MyhDY_s_zImT8hvMFI


# Trick 7/10

This time, you’re working on a small code project for a hospital. Your goal is to monitor and visualize the health statistics of patients by tracking their cardiac cycles.

By plotting expected cardiac cycle data, you’ll enable patients and doctors to monitor any deviation from that cycle. For example, given a series of measurements stored in the list [62, 60, 62, 64, 68, 77, 80, 76, 71, 66, 61, 60, 62] for a single cardiac cycle, you want to achieve the visualization:

https://bucket.mlcdn.com/a/3230/3230158/images/3bfa0e35ea5603bcc72c80605c5bdf827d2dc8ef.jpeg

The problem is that the first value and the last two values in the list are redundant:

[62, 60, 62, 64, 68, 77, 80, 76, 71, 66, 61, 60, 62].

You need to clean the original list by removing the redundant first value and the last two values:

[62, 60, 62, 64, 68, 77, 80, 76, 71, 66, 61, 60, 62] 

becomes

[60, 62, 64, 68, 77, 80, 76, 71, 66, 61].

You’ll combine slicing with the Python feature list concatenation, which creates a new list by concatenating (that is, joining) existing lists.

For example, the operation [1, 2, 3] + [4, 5] generates the new list [1, 2, 3, 4, 5], but doesn’t replace the original lists.

You can use this with the * operator to concatenate the same list again and again to create large lists: for example, the operation [1, 2, 3] * 3 generates the new list [1, 2, 3, 1, 2, 3, 1, 2, 3].

In addition, you’ll use the matplotlib.pyplot module to plot the cardiac data you generate. The matplotlib function plot(data) expects an iterable argument data—an iterable is simply an object over which you can iterate, such as a list—and uses it as y values for subsequent data points in a two-dimensional plot.

The Code

Given a list of integers that reflect the measured cardiac cycle, you first want to clean the data by removing the first value and the last two values from the list. Second, you create a new list with expected future heart rates by copying the cardiac cycle to future time instances.

## Dependencies
import matplotlib.pyplot as plt

## Data
cardiac_cycle = [62, 60, 62, 64, 68, 77,
                 80, 76, 71, 66, 61, 60, 62]

## One-Liner
expected_cycles = cardiac_cycle[1:-2] * 10

## Result
plt.plot(expected_cycles)
plt.show()

This one-liner consists of two steps.

First, you use slicing to clean the data: the start index 1 skips the redundant first value, and the negative stop index -2 slices all the way to the right but skips the last two redundant values.

Second, you concatenate the resulting data values ten times by using the replication operator *.

The result is a list of 10 × 10 = 100 integers made up of the concatenated cardiac cycle data. When you plot the result, you get the desired output shown previously.
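
If you want to check the two steps without opening a plot window, the following sketch splits the one-liner into its parts. The intermediate name `cleaned` is introduced here only for illustration.

```python
cardiac_cycle = [62, 60, 62, 64, 68, 77,
                 80, 76, 71, 66, 61, 60, 62]

cleaned = cardiac_cycle[1:-2]       # drop the first value and the last two
print(cleaned)
# [60, 62, 64, 68, 77, 80, 76, 71, 66, 61]

expected_cycles = cleaned * 10      # replicate the cleaned cycle ten times
print(len(cleaned), len(expected_cycles))
# 10 100

# Slicing and replication build new lists; the original stays untouched
print(len(cardiac_cycle))
# 13
print(expected_cycles[:10] == expected_cycles[10:20])
# True
```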

Video

If you struggle with understanding this one-liner, check out my explainer video:
https://click.mlflow.com/link/c/YT0xODE5NTA5Mjc3ODcwNTI4Nzk5JmM9ZzdxNiZlPTAmYj02NTc3NjI5MTYmZD1xM3o2dTV3.TfZTmJnPWSGVfbEFGNEr7JG7clm7AjpplNoOqzCOmS0

# Trick 8/10

This email combines some of the Python basics you’ve already learned and introduces the useful function any().

The Basics

You work in law enforcement for the US Department of Labor, finding companies that pay below minimum wage so you can initiate further investigations. Like hungry dogs on the back of a meat truck, your Fair Labor Standards Act (FLSA) officers are already waiting for the list of companies that violated the minimum wage law. Can you give it to them?

Here’s your weapon: Python’s any() function, which takes an iterable, such as a list, and returns True if at least one element of the iterable evaluates to True.

For example, the expression any([True, False, False, False]) evaluates to True, while the expression any([2<1, 3+2>5+5, 3-2<0, 0]) evaluates to False.

Note: Python’s creator, Guido van Rossum, was a huge fan of any() and even proposed including it as a built-in function in Python 3.

An interesting Python extension is a generalization of list comprehension: generator expressions.

Generator expressions work exactly like list comprehensions—but without creating an actual list in memory. The numbers are created on the fly, without storing them explicitly in a list.

For example, instead of using list comprehension to calculate the squares of the first 20 numbers, sum([x*x for x in range(20)]), you can use a generator expression: sum(x*x for x in range(20)).
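
A generator expression pairs well with any(), because any() stops consuming its input as soon as it finds the first true value. The sketch below makes this visible with a hypothetical helper, `below_minimum`, that records every wage it inspects. The helper, the `visited` list, and the sample wages are assumptions for illustration.

```python
# Both forms compute the same sum; only the generator avoids the temporary list
print(sum([x * x for x in range(20)]))
# 2470
print(sum(x * x for x in range(20)))
# 2470

visited = []  # records which wages were actually inspected


def below_minimum(wage, minimum=9):
    visited.append(wage)
    return wage < minimum


wages = [4, 9, 7]
print(any(below_minimum(w) for w in wages))
# True
print(visited)
# [4]
```

Because the first wage already violates the minimum, the remaining wages are never evaluated.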

The Code

Our data is a dictionary of dictionaries storing the hourly wages of company employees. You want to extract a list of the companies paying below your state’s minimum wage (< $9) for at least one employee.

## Companies
c = {'CoolCompany' : {'Alice' : 33,
                      'Bob' : 28,
                      'Frank' : 29},
     'CheapCompany' : {'Ann' : 4,
                       'Lee' : 9,
                       'Chrisi' : 7},
     'SosoCompany' : {'Esther' : 38,
                      'Cole' : 8,
                      'Paris' : 18}}

## One-Liner to find illegal companies
i = [x for x in c if any(y<9 for y in
                         c[x].values())]

## Result
print(i)
# ['CheapCompany', 'SosoCompany']


You combine a generator expression and a list comprehension in this one-liner.

(1) The generator expression y<9 for y in c[x].values() generates the input to the function any(). It checks each of the company’s employees to see whether they are being paid below minimum wage, y<9. The result is an iterable of Booleans.

You use the dictionary method values() to return the collection of values stored in the dictionary. For example, the expression c['CoolCompany'].values() returns the collection of hourly wages dict_values([33, 28, 29]).

If at least one of them is below minimum wage, the function any() returns True, and the company name x is stored as a string in the resulting list i, as described next.

(2) The list comprehension [x for x in c if any(...)] creates a list of the company names for which the call of the function any() returns True. Those are the companies that pay below minimum wage.

Note that the expression for x in c visits all dictionary keys—the company names 'CoolCompany', 'CheapCompany', and 'SosoCompany'.

Two out of three companies must be investigated further because they pay too little money to at least one employee. Your officers can start to talk to Ann, Chrisi, and Cole!
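
To hand your officers the employee names as well, one possible extension combines a dictionary comprehension with the same any() check. This is a sketch under assumptions: the names `MIN_WAGE` and `underpaid` do not appear in the original code, and the threshold of 9 is taken from the example above.

```python
c = {'CoolCompany': {'Alice': 33, 'Bob': 28, 'Frank': 29},
     'CheapCompany': {'Ann': 4, 'Lee': 9, 'Chrisi': 7},
     'SosoCompany': {'Esther': 38, 'Cole': 8, 'Paris': 18}}

MIN_WAGE = 9  # assumed threshold, taken from the example above

# Map each violating company to the employees paid below the minimum wage
underpaid = {company: [name for name, wage in staff.items() if wage < MIN_WAGE]
             for company, staff in c.items()
             if any(wage < MIN_WAGE for wage in staff.values())}

print(underpaid)
# {'CheapCompany': ['Ann', 'Chrisi'], 'SosoCompany': ['Cole']}
```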

Video

If you struggle with understanding this one-liner, check out my explainer video:
https://click.mlflow.com/link/c/YT0xODIwMjM0MTAyNTk5NTg4NTA0JmM9dzBlNiZlPTAmYj03NDI3MDg0OTAmZD12M2ExbjNv.1O7VUpRtbByPhCPDB4168SrrXK-ENHjCewDmfyUROPU