Module 00 Part A 
---
Programming in Python: Getting Started
---
# What is Python
- Developed in the late 1980s by Guido van Rossum
- Used widely within Google, now widely used in industry
- Name of the language was chosen simply because van Rossum is a fan of Monty Python

# Why use Python?
The choice of programming language is a subjective decision that is best
decided by the type of problem.  There is strong application library support for Python with
support for web programming, data analysis, and machine learning.   The
type structure in Python isn't as clean as Perl's design. That can cause
problems when maintaining large programs.

# Python 3.0 vs. Python 2.x
- Python 3.0 was not completely backward compatible with prior versions
    - Many of the 3.0 features were backported to Python 2.7
- Means there are distinct versions of the language
    - We will focus on Python 3.0


# Getting started
- Finding the tools
    - Python included with the OS install in macOS
    - Available with Visual Studio in Windows
    - We will be using the Anaconda Python data science distribution: [https://www.anaconda.com/products/individual-d](https://www.anaconda.com/products/individual-d)
    
# Compiler vs. interpreter
Python is an interpreted language.   Rather than having a compiler convert your code from statements to machine language in a batch mode, the Python interpreter processes your program as you go: each statement is interpreted when it is reached.
## REPL 
The Python interpreter is an example of a "Read-Evaluate-Print Loop", often abbreviated as a REPL.  The interpreter reads a Python statement,  evaluates it, and prints the result.    You will note that we're working in a Jupyter Notebook for these lectures,  which is a "cell-based IDE" that allows you to mix text, graphics, and executable code in a single document.   The executable code is implemented by having code cells serve as a front-end to the Python REPL interpreter.

# What's the first program you always write in a new language?
That's right: it's "Hello World"!   It's rather simple in this case as Python's an interpreted language.
    
    

In [1]:
print("Hello CS530!")
print("hello world")

Hello CS530!
hello world


In [3]:
# Note that as an interpreted language, there are things you can do interactively
a = 6
a
a+2

8

Let's talk about what happened here.   You entered three statements.  They are combined together in the cell, and the value of the last expression is displayed as the result.

In [4]:
print(a)

6


# The Basics of the Python Programming Language
Let's consider the following Python program.   This is what a complete Python program would look like in a separate file.  Python files are assumed to have the `.py` extension.

In [2]:
#!/usr/bin/env python3

# Import modules you use in your code
# Sys is the base system module
import sys
# Provide a main function
def main():
    print('Hello world from ', sys.argv[1])

# Add some boiler plate code to call the
# main() function
if __name__ == '__main__':
    main()

Hello world from  --f=/Users/alewis/Library/Jupyter/runtime/kernel-v2-44117OQAgqdufF6Hd.json


A few important points:
- Note the sha-bang.  The extra level of indirection using the
  `/usr/bin/env` command deals with the fact that the python
  interpreter can be in different places depending upon the Linux
  distribution.
- Python is a *fixed-format* language. This means that it expects
  to see things in a particular format rather than using delimiters (such
  as braces) to mark blocks.
  - We see this with the definition of the `main()` function.
  - Note how it affects the `if` statement
  - Interestingly enough, `COBOL` is another well-known language where source layout is significant, though it is not the only one.
- Note the use of the `name-main` idiom
  - Python does not have a `main()` function like the C++ entry point
  - Instead the Python interpreter executes the top-level statements in your Python file, in order, from top to bottom
  - Each module has a `__name__` variable, which the interpreter sets to `'__main__'` when the file is run as a script and to the module's name when the file is imported
  - So, you put a check in your code to determine if `__name__` is `'__main__'` and, if so, call the function you want to execute
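
To make the idiom concrete, here is a minimal sketch of a script that behaves sensibly both when run directly and when imported. The `greet()` helper and the fallback name are illustrative assumptions, not part of the program shown above.

```python
import sys


def greet(argv):
    """Build the greeting from an argument list shaped like sys.argv."""
    # argv[0] is the script name; argv[1] (if present) is the first argument.
    who = argv[1] if len(argv) > 1 else "nobody in particular"
    return f"Hello world from {who}"


def main():
    print(greet(sys.argv))


# __name__ is "__main__" only when this file is run as a script.
# When the file is imported as a module, __name__ is the module's name
# and main() is not called automatically.
if __name__ == "__main__":
    main()
```

Checking the length of the argument list first avoids an `IndexError` when the script is started without arguments.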

## Identifiers: What's In A Name

- Identifiers in Python start with an uppercase letter, lowercase letter, or underscore
    - Followed by letters, underscores, and/or digits
    - No punctuation characters other than underscore
   
The language is case sensitive: `SequenceValues` and `sequencevalues` are different identifiers.
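
If you want to check these rules yourself, the following sketch leans on two standard-library tools: the `str.isidentifier()` method and the `keyword` module. The `is_valid_name()` helper is a hypothetical name made up for this example.

```python
import keyword


def is_valid_name(name):
    """True if `name` follows the identifier rules and is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["SequenceValues", "sequencevalues", "_private", "2fast", "my-var", "class"]:
    print(candidate, is_valid_name(candidate))

# Case matters: these are two different names
SequenceValues = 1
sequencevalues = 2
print(SequenceValues, sequencevalues)   # 1 2
```

Note that `class` passes the lexical rules but is rejected because it is a reserved word.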

### Naming conventions
- Class names should start with an uppercase letter
    - All other identifiers should start with lowercase letters
- Starting an identifier with a leading underscore indicates this is a private identifier
- Two underscores indicate a *strongly private* identifier (don't worry about this at the moment)
    - If a strongly private identifier ends with two trailing underscores, then this is a language-defined special name

### Quotation In Python
- Python accepts either single or double quotes to denote string literals
- Triple quotes are used to denote strings that span multiple lines


In [None]:
word = 'word'
sentence = "This is a sentence"
paragraph = """This is a paragraph 
made up of multiple lines of text that are delimited
using triple quotes"""

print(word, sentence, paragraph)

word This is a sentence This is a paragraph 
made up of multiple lines of text that are delimited
using triple quotes



Be very careful about using the triple quote operator.   If you span multiple lines in a string literal, any of the white space is included within the string (it is a string literal, after all).
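
Here is a small sketch of that pitfall. It uses `repr()` to make the hidden white space visible, and `textwrap.dedent()` from the standard library as one possible way to clean it up; the sample text is made up for illustration.

```python
import textwrap

# The line breaks and the leading spaces are all part of the string.
text = """
    first line
    second line
"""
print(repr(text))      # '\n    first line\n    second line\n'

# dedent() removes the common leading white space; strip() trims the ends.
cleaned = textwrap.dedent(text).strip()
print(repr(cleaned))   # 'first line\nsecond line'
```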

### Wait A Second: Where Are The Semicolons?
In most cases, end-of-line is treated as terminating a statement in Python
- One can use a semicolon to separate multiple statements on a single line
- Use the backslash character at the end of a line to indicate a multi-line statement continuing to the next line
    - Not required when the line break falls inside parentheses, brackets, or braces, such as when you're entering things in an initializer
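
A short sketch of the three cases just listed (the variable names are arbitrary):

```python
# Two statements on one line, separated by a semicolon (legal, but rarely used)
x = 1; y = 2

# A backslash at the end of a line continues the statement on the next line
total = x + \
        y

# Inside brackets, parentheses, or braces no backslash is needed
values = [
    x,
    y,
    total,
]
print(total, values)   # 3 [1, 2, 3]
```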

### Blocks, Suites, and Indentation

Remember what I said about Python being a fixed-format language?   You don't do blocks of code surrounded by braces or other delimiters as you see in C++.  Blocks are indented with nested blocks just indented further.

**Suite**: A single code block in Python
- Begins with a colon-terminated header line
    - Followed by one or more indented lines that make up the suite
- All of the control-of-flow statements use suites

## Data Types in Python

### It's all objects!
Everything in Python is an object.  
- Means no distinction like we see between fundamental data types and objects in C++
- Python is a *pass by object reference* language
    - This introduces some oddities when dealing with variables and values
    
### It's all values!


In [7]:
count = 100 
miles = 1000.0
name = "john"

a = b = c = 1
a,b,c = 1, 2, "john"

print(count, miles, name, a, b, c)


100 1000.0 john 1 2 john


Python permits multiple assignment.   In the example code, an integer object is created with a value of 1 and then the variables `a`, `b`, and `c` are all bound to the memory location where that object is stored.

Note that you can also assign multiple objects to multiple variables.   The number of variables and the number of objects must match.
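
The following sketch shows what "bound to the same object" means in practice, using the `is` operator to test object identity. The names `first`, `second`, `p`, and `q` are made up for this example.

```python
a = b = c = 1
print(a is b is c)      # True: all three names refer to the same object

# Rebinding one name does not affect the others
a = 2
print(a, b, c)          # 2 1 1

# With a mutable object, the sharing becomes visible
first = second = []
first.append("x")
print(second)           # ['x']

# The number of names and the number of objects must match
try:
    p, q = 1, 2, 3
except ValueError as err:
    print("ValueError:", err)
```

This is one of the "oddities" of object references: changing a shared list through one name changes what every other name sees.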

### Let's get dynamic: duck typing!

> Duck typing: if it waddles like a duck, quacks like a duck, and smells like a duck,
> then it's a duck.

How does this apply to Python?  Python is a duck-typed language.   The data type of a variable is determined at first use by how the variable is initialized.  And the type of a variable is mutable:  reassignment to a value of a different type results in a change to the data type.

In fancy $5 words, we say that Python is a dynamically typed language.
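
As a quick illustration, the sketch below rebinds one name to values of different types and then calls a function on anything that "quacks" like a sized container. The `describe()` function is a hypothetical helper written for this example.

```python
value = 42
print(type(value).__name__)    # int

value = "forty-two"            # same name, now bound to a string
print(type(value).__name__)    # str


def describe(thing):
    # No type declarations: anything that supports len() is accepted.
    return f"{type(thing).__name__} of length {len(thing)}"


print(describe("duck"))        # str of length 4
print(describe([1, 2, 3]))     # list of length 3
```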

## Standard Data Types

In [8]:
# Standard Data Types: Numbers
var1, var2, var3, var4  = 10, 51924361488, 0.0, 3.14+1.712j
print(var1, var2, var3, var4)
# Note that in Python 3 all integers are long integers.  
# And that we can define complex numbers as well

10 51924361488 0.0 (3.14+1.712j)


In [9]:
# Standard Data Types: Strings
aStr = 'Dr. Lewis is a really great human being'
print(aStr)
print(aStr[0])
print(aStr[2:6])
print(aStr[2:])
print(aStr * 2)
print(aStr + " even if he's sometimes quite evil")

Dr. Lewis is a really great human being
D
. Le
. Lewis is a really great human being
Dr. Lewis is a really great human beingDr. Lewis is a really great human being
Dr. Lewis is a really great human being even if he's sometimes quite evil


A few notes:
- Substrings can be accessed using the slice operators
- Note the overloading of the addition and multiplication operators
    - "+" is string concatenation
    - while "*" is string repetition
- There is no character data type in Python.
    - A single character is represented by a string of length 1

In [10]:
# Standard Data Types: Lists
mylist = [ 'abcd', 786 , 2.23, 'john', 70.2 ]   # avoid naming a variable `list`: that hides the built-in type
tinylist = [123, 'john']

print(mylist)             # Prints complete list
print(mylist[0])          # Prints first element of the list
print(mylist[1:3])        # Prints 2nd and 3rd elements (the end index is excluded)
print(mylist[2:])         # Prints elements starting from 3rd element
print(tinylist * 2)       # Prints list two times
print(mylist + tinylist)  # Prints concatenated lists

['abcd', 786, 2.23, 'john', 70.2]
abcd
[786, 2.23]
[2.23, 'john', 70.2]
[123, 'john', 123, 'john']
['abcd', 786, 2.23, 'john', 70.2, 123, 'john']


Lists in Python fill multiple roles in the language: they provide the functionality of arrays, vectors, lists, stacks, and queues!  Here's where duck typing comes into play as the elements of a list don't need to be of the same data types.  It's all objects!
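
Here is a minimal sketch of a list acting as a stack and as a queue. The `deque` class from the standard `collections` module is shown as one alternative for queues; it is not covered in this lecture, and the sample values are made up.

```python
from collections import deque

# A list used as a stack: push and pop at the end
stack = []
stack.append("first")
stack.append("second")
top = stack.pop()            # removes and returns "second"

# A list used as a queue: add at the end, remove from the front
queue = ["first", "second", "third"]
front = queue.pop(0)         # removes and returns "first"

# deque offers the same idea with efficient removal from the left
line = deque(["first", "second", "third"])
served = line.popleft()      # removes and returns "first"

print(top, front, served)
print(stack, queue, list(line))
```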

In [11]:
# Standard Data Types: Tuples
mytuple = ( 'abcd', 786 , 2.23, 'john', 70.2  )   # avoid naming a variable `tuple`: that hides the built-in type
tinytuple = (123, 'john')
print(mytuple)           # Prints complete tuple
print(mytuple[0])        # Prints first element of the tuple
print(mytuple[1:3])      # Prints elements starting from 2nd till 3rd 
print(mytuple[2:])       # Prints elements starting from 3rd element
print(tinytuple * 2)     # Prints tuple two times
print(mytuple + tinytuple) # Prints concatenated tuples


('abcd', 786, 2.23, 'john', 70.2)
abcd
(786, 2.23)
(2.23, 'john', 70.2)
(123, 'john', 123, 'john')
('abcd', 786, 2.23, 'john', 70.2, 123, 'john')


The Python `tuple` data type is a read-only list.  You cannot update or change the contents of a tuple. Note how you distinguish a tuple from a list by using parentheses as delimiters rather than square brackets.   Note that you get a tuple when you slice a tuple.

In [12]:
mytuple = ( 'abcd', 786 , 2.23, 'john', 70.2  )
mylist = [ 'abcd', 786 , 2.23, 'john', 70.2  ]
mytuple[2] = 1000    # Raises TypeError at run time: tuples are immutable
mylist[2] = 1000     # Valid for a list, but never reached because of the error above


TypeError: 'tuple' object does not support item assignment
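
If you need a "changed" tuple, one common approach is to build a new one from pieces of the old one. A minimal sketch, with made-up values:

```python
point = (1, 2, 3)

try:
    point[0] = 10               # not allowed: tuples are immutable
except TypeError as err:
    print("TypeError:", err)

# Build a new tuple instead of modifying the existing one
moved = (10,) + point[1:]
print(moved)    # (10, 2, 3)
print(point)    # (1, 2, 3), the original is untouched
```

Note the trailing comma in `(10,)`: that is how you write a tuple with a single element.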

In [None]:
# Standard Data Types: Dictionaries
d = {}
d['one'] = "This is one"
d[2]     = "This is two"

td = {'name': 'john','code':6734, 'dept': 'sales'}

print(d['one'])    # Prints value for 'one' key
print(d[2])        # Prints value for 2 key
print(td)          # Prints complete dictionary
print(td.keys())   # Prints all the keys
print(td.values()) # Prints all the values

This is one
This is two
{'name': 'john', 'code': 6734, 'dept': 'sales'}
dict_keys(['name', 'code', 'dept'])
dict_values(['john', 6734, 'sales'])


Associative data is stored in Python dictionaries.  Like C++ maps, dictionaries store key-value pairs.   Note the `keys()` and `values()` methods.   These give you back dictionary *view* objects that you can iterate over, or pass to `list()`, to get the keys and values in a dictionary.
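
A short sketch of working with what `keys()` returns, plus the related `items()` method for walking key-value pairs. The `'room'` entry is a made-up addition for illustration.

```python
td = {'name': 'john', 'code': 6734, 'dept': 'sales'}

keys = td.keys()
print(list(keys))        # ['name', 'code', 'dept']

# The object returned by keys() stays connected to the dictionary
td['room'] = 101
print(list(keys))        # ['name', 'code', 'dept', 'room']

# items() yields (key, value) pairs
pairs = []
for key, value in td.items():
    pairs.append(f"{key}={value}")
print(pairs)
```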

### Data Type Conversion Functions

| Function | Purpose |
|----------|---------|
| str(x)   | Convert object x to a string |
| repr(x)  | Convert object x to an expression string |
| eval(str)| Evaluate a string and return an object |
| chr(x)   | Convert an integer to a character |
| unichr(x)| Python 2 only: convert an integer to a Unicode character (in Python 3, `chr(x)` does this) |
| ord(x)   | Convert single char to its integer value |
| hex(x)   | Convert integer to a hexadecimal string |
| oct(x)   | Convert integer to an octal string |
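
Here is a quick sketch that exercises the functions in the table. The `int()` call on the last line is not in the table; it is included as the usual way to go from a string back to a number. Be careful with `eval()`: it runs whatever code is in the string, so it should not be used on untrusted input.

```python
print(str(3.5))         # 3.5
print(repr("hi"))       # 'hi'  (note the quotes: this is an expression string)
print(eval("2 + 3"))    # 5
print(chr(65))          # A
print(ord("A"))         # 65
print(hex(255))         # 0xff
print(oct(8))           # 0o10
print(int("42") + 1)    # 43
```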

## Control Structures


In [13]:
# IF - ELIF - ELSE
var = 100
if var == 200:
    print("1 - Got a true expression value")
    print(var)
elif var == 150:
    print("2 - Got a true expression value")
    print(var)
elif var == 100:
    print("3 - Got a true expression value")
    print(var)
else:
    print("4 - Got a false expression value")
    print(var)

3 - Got a true expression value
100


In [14]:
# Nested if
if var == 100:
    print(var1)
    if var2 == 200:
        print(var2)
    else:
        print(var1, var2)
else:
    print(var1)
print("we're done")    

10
10 51924361488
we're done


`if` statements work as expected, with the note that we are building them using suites.  Note in the nesting example how suites are nested by indenting.

In [None]:
# WHILE loops
count = 0
while (count < 9):
    print("The count is: ", count)
    count = count + 1
print("We're done")

The count is:  0
The count is:  1
The count is:  2
The count is:  3
The count is:  4
The count is:  5
The count is:  6
The count is:  7
The count is:  8
We're done


In [None]:
# And now for something completely different (get the Python
# reference?)
count = 0
while count < 5:
    print(count, " is less than 5")
    count = count + 1
else:
    print(count, " is not less than 5")
    

0  is less than 5
1  is less than 5
2  is less than 5
3  is less than 5
4  is less than 5
5  is not less than 5


Loop structures in Python can have `else` clauses.   The `else` clause is executed when the loop control condition becomes false, but it is skipped if the loop is exited with `break`.   Can you explain why we see the `else` executes in this case?
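
A loop `else` earns its keep in search loops, where it means "we got to the end without a `break`". The sketch below assumes a hypothetical helper named `find_first_negative()`:

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break              # leaving via break skips the else clause
    else:
        result = None          # runs only if the loop was never broken out of
    return result


print(find_first_negative([3, -1, -5]))   # -1
print(find_first_negative([1, 2, 3]))     # None
```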

In [None]:
# FOR loops
# FOR loops in Python are range based.
for i in range(10):  # Counted loop
    print(i)
    
for letter in 'Python':
    print("Current letter: ", letter)
    
fruits = ['banana', 'apple', 'mango']
for fruit in fruits:
    print("Current fruit:", fruit)


0
1
2
3
4
5
6
7
8
9
Current letter:  P
Current letter:  y
Current letter:  t
Current letter:  h
Current letter:  o
Current letter:  n
Current fruit: banana
Current fruit: apple
Current fruit: mango


### Adjusting the Loop Control Flow

| Statement | Effect|
|-----------|-------|
|`break` | Terminate loop and transfer execution to statement immediately following loop |
| `continue` | Skip remainder of loop and continue at loop test |
|`pass` | Ignore this condition and do nothing|

What's up with `pass`?   You can view it as being a 'No-Op', a placeholder for some future case that you aren't including in a loop or function at this time.
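
A small sketch that uses all three statements in one loop (the numbers are arbitrary):

```python
collected = []
for n in range(10):
    if n == 7:
        break          # leave the loop entirely
    if n % 2 == 0:
        continue       # skip even numbers and go to the next iteration
    if n == 3:
        pass           # placeholder: nothing special for 3 yet
    collected.append(n)

print(collected)       # [1, 3, 5]
```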

# Classes and Objects in Python

Class definitions in Python are indicated by the `class` keyword.   Any code contained in the suite associated with the class is part of the class's body.

In [None]:
# A simple class with an initialization method
class Dog:
    species = "Canis familiaris"
    def __init__(self, name, age):
        self.name = name
        self.age = age

Important things to note:
- The class `__init__` method is the Python equivalent of a C++ constructor
    - The Python term for this method is that it is the class *initializer*
- Class attributes (static member variables in C++) are defined by assigning a value to a variable outside of the initializer
- Note that we have to pass `self` to all methods as the first parameter
    - You don't have to do this in C++ as it occurs automagically in that language
    - And, as you expect, `self` does refer to the object itself
- Assigning values to class instance attributes in the initializer is what defines the attributes
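
The difference between class attributes and instance attributes is easiest to see by changing them. The following is a minimal sketch; the replacement species strings are made up for illustration.

```python
class Dog:
    species = "Canis familiaris"      # class attribute: shared by all Dogs

    def __init__(self, name, age):
        self.name = name              # instance attributes: one set per object
        self.age = age


a = Dog("Biscuit", 1)
b = Dog("Pasta", 2)
print(a.species, "/", b.species)

# Changing the class attribute is seen through every instance
Dog.species = "Canis lupus familiaris"
print(a.species, "/", b.species)

# Assigning through an instance creates an instance attribute that hides it
a.species = "robot dog"
print(a.species, "/", b.species)
```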

In [None]:
# This example creates two Dog objects and shows they are unique
a = Dog("Biscuit", "1")
b = Dog("Biscuit", "1")
a == b

False

In [None]:
print(a.name)
print(b.name)

Biscuit
Biscuit


In [None]:
# A simple class with an initialization method
class Dog:
    species = "Canis familiaris"
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def description(self):
        return f"{self.name} is {self.age} years old"
    
    def speak(self,sound):
        return f"{self.name} said {sound}"

biscuit = Dog("biscuit", 1)    
pasta = Dog("spaghetti", 2)
print(biscuit.description())
pasta.speak("Woof")

biscuit is 1 years old


'spaghetti said Woof'

See how we don't include the `self` parameter in the arguments.  Python provides that for you automagically.

In Pythonic speak, we have what are called *dunder methods*, which are methods in a class that begin and end with double underscore characters (see... dunder).    These are overridable methods from the `object` class that are useful things.  For example, we have the `__str__` dunder method, which takes `self` and returns a string.

In [18]:
class Dog:
    species = "Canis familiaris"
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def description(self):
        return f"{self.name} is {self.age} years old"
    
    def speak(self,sound):
        return f"{self.name} said {sound}"
    
    def __str__(self):
        return self.description()
    
biscuit = Dog("biscuit", 3)
print(biscuit)

biscuit is 3 years old
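
Another dunder method worth knowing is `__eq__`, which is what the `==` operator calls. By default, two objects are equal only if they are the very same object, which is why two separately created dogs with the same name and age compare as `False`. Here is a minimal sketch of overriding that; comparing by name and age is an assumption made for this example, not the only reasonable choice.

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        # Only compare against other Dog objects; let Python handle the rest.
        if not isinstance(other, Dog):
            return NotImplemented
        return (self.name, self.age) == (other.name, other.age)


a = Dog("Biscuit", 1)
b = Dog("Biscuit", 1)
print(a == b)   # True: same name and age
print(a is b)   # False: still two distinct objects
```

One side effect to be aware of: a class that defines `__eq__` without also defining `__hash__` produces objects that cannot be used as dictionary keys or set members.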


## And now we need to consider inheritance

Python supports both single and multiple inheritance; we will stick to single inheritance here.  All classes eventually inherit from the `object` base class (walking back up the inheritance tree from your class, its parent class,  its parent's parent class, and so on).

In [19]:
# Consider the following
class JackRussellTerrier(Dog):
    pass

class Daschhound(Dog):
    pass

class Bulldog(Dog):
    pass

We now have three derived classes of the `Dog` class, with each of the subclasses doing nothing new (remember the meaning of `pass`).   We have ways of finding out what class an object is instantiated from:

In [20]:
schultz = Daschhound("schultz", 3)

print(type(schultz))
if (isinstance(schultz, Dog)):
    print("Yep, schultz is a Dog")

<class '__main__.Daschhound'>
Yep, schultz is a Dog


In [22]:
# Overriding works like you might expect
class Daschound(Dog):
    def speak(self, sound="WOOF"):
        return f"{self.name} says {sound}"

schultz.speak()

TypeError: Dog.speak() missing 1 required positional argument: 'sound'

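What? Note that we defined the overriding class (`Daschound`) after `schultz` was created, so `schultz` is still an instance of the earlier class, which only has `Dog.speak()` and its required `sound` argument.   Defining or redefining a class does not update objects that already exist.   Let's fix that: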

In [None]:
schultz = Daschound("Hund", 4)
schultz.speak()

'Hund says WOOF'

The other issue is making certain that we can call a superclass's version of a method from a subclass's override.  This is done using the `super()` function.

In [None]:
class JackRussellTerrier(Dog):
    def speak(self,sound="loud yippy bark"):
        print(super().speak(sound))
        return f"{self.name} barks out a {sound}"

spot = JackRussellTerrier("TargetDog", 15)
print(spot.speak())

TargetDog said loud yippy bark
TargetDog barks out a loud yippy bark


Be careful with calling `super()` as it does more than just search the immediate superclass for a method or an attribute.  It will walk up the inheritance tree, following the class's method resolution order, trying to find a matching method or attribute (this is different behavior than C++).   This sometimes causes unexpected results.
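
The sketch below shows that walk in action. The `Animal` and `Puppy` classes are made up for this example, and `Dog` is deliberately stripped down so that it has no `speak()` of its own. The `__mro__` attribute lists the classes in the order Python searches them.

```python
class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    pass                      # does not define speak()


class Puppy(Dog):
    def speak(self):
        # Dog has no speak(), so super() keeps walking up and finds Animal's
        return "squeaky " + super().speak()


print(Puppy().speak())                           # squeaky ...
print([cls.__name__ for cls in Puppy.__mro__])   # ['Puppy', 'Dog', 'Animal', 'object']
```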

# More Complex Operations on Lists (and other data structures)

## Adding and deleting items from a list

You have three ways to remove an item from a list:
1. Use the `del` keyword
2. Use the `list.remove()` method
3. Use the `list.pop()` method



In [None]:
# Using the del keyword to remove items from a list
# list of products
products = ['table', 'chair', 'lamp', 'closet', 'bed', 'shelf', 'computer']

# remove the first item of the list using the del keyword
del products[0]
# check the modification
print(products)
# ['chair', 'lamp', 'closet', 'bed', 'shelf', 'computer']

# remove the last item of the list using the del keyword
del products[-1]
# check the modification
print(products)
# ['chair', 'lamp', 'closet', 'bed', 'shelf']

# remove the first and the second items of the list using the del keyword
del products[:2]
# check the modification
print(products)
# ['closet', 'bed', 'shelf']

# remove the entire list
del products
# check the modification
print(products)
# NameError

['chair', 'lamp', 'closet', 'bed', 'shelf', 'computer']
['chair', 'lamp', 'closet', 'bed', 'shelf']
['closet', 'bed', 'shelf']


NameError: name 'products' is not defined

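Note that in Python, the `del` keyword is an explicit delete, loosely comparable to `delete` in C++.   The difference is that `del` removes a name or a list item rather than freeing memory directly: the underlying object is reclaimed automatically once nothing refers to it.   That is why, unlike C++, it is very rare that you have to do an explicit delete of a piece of memory.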
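
A small sketch of what `del` does and does not do when a second name refers to the same list (the names `first` and `second` are made up for this example):

```python
first = [1, 2, 3]
second = first       # a second name for the same list object

del first            # removes the name `first`, not the list itself
print(second)        # [1, 2, 3]

try:
    print(first)
except NameError as err:
    print("NameError:", err)
```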

### The `pop()` method

The `pop()` method removes the item at a given index and returns it to the caller.   If an index isn't specified, then the method removes and returns the last item in the list.   You will get an `IndexError` if you provide an index that is outside the range of the list.

In [None]:
# Illustrating the pop() method
# list of cities 
cities = ['Valencia', 'Munich', 'Ingolstadt', 'Stuttgart']

# remove the item at index 2
city_removed = cities.pop(2)

# the pop function returns the removed item
print(city_removed)
# Ingolstadt

# list of cities
print(cities)
# ['Valencia', 'Munich', 'Stuttgart']

Ingolstadt
['Valencia', 'Munich', 'Stuttgart']
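
The other two behaviors of `pop()`, removing the last item when no index is given and raising `IndexError` for an out-of-range index, can be sketched like this:

```python
cities = ['Valencia', 'Munich', 'Stuttgart']

# No index: remove and return the last item
last = cities.pop()
print(last, cities)      # Stuttgart ['Valencia', 'Munich']

# An index outside the list raises IndexError
try:
    cities.pop(5)
except IndexError as err:
    print("IndexError:", err)
```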


### The `remove()` method

The `remove()` method deletes the first item in the list that matches the argument and returns `None`.  If the argument cannot be found in the list, then you will get a `ValueError` error.

In [None]:
# list of cities 
cities = ['Valencia', 'Munich', 'Madrid', 'Barcelona', 'Valencia', 'Madrid']

# remove the first matching element with the list.remove() method
cities.remove('Valencia')

# check the modification
print(cities)
# ['Munich', 'Madrid', 'Barcelona', 'Valencia', 'Madrid']

# if the element we try to remove is not present an exception is raised
cities.remove('Paris')
# ValueError

['Munich', 'Madrid', 'Barcelona', 'Valencia', 'Madrid']


ValueError: list.remove(x): x not in list

### Inserting into a list

We have two ways to insert items into a list:
1. the `list.insert(i,x)` method
2. the `list.append(x)` method

The `insert()` method inserts an element `x` at a given index `i` and returns `None`:

In [None]:
# list of products
products = ['table', 'chair', 'lamp']

# insert an element at index 0
products.insert(0, 'closet')

# check the modification
print(products)
# ['closet', 'table', 'chair', 'lamp']

# insert an element at the end of the list
products.insert(len(products), 'computer')

# check the modification
print(products)
# ['closet', 'table', 'chair', 'lamp', 'computer']

# insert an element at the second-to-last index
products.insert(-1, 'bed')

# check the modification
print(products)
# ['closet', 'table', 'chair', 'lamp', 'bed', 'computer']

['closet', 'table', 'chair', 'lamp']
['closet', 'table', 'chair', 'lamp', 'computer']
['closet', 'table', 'chair', 'lamp', 'bed', 'computer']


As shown above, the first argument `i` is the index of the element before which we insert an item. That is why, if we provide an index of -1 as input, the element is inserted in the second-to-last position rather than at the end of the list. To insert an item at the end of the list, we have to provide the length of the list as input.

Watch what happens when I try to insert one of the Python container types into a list:

In [None]:
# list of numbers
numbers = [5, 10, 16]

# insert an integer (4) at index 0
numbers.insert(0, 4)

# insert a list [2,3] at index 1
numbers.insert(1, [2, 3])

# check the modification
print(numbers)
# [4, [2, 3], 5, 10, 16]

[4, [2, 3], 5, 10, 16]


The `append()` method adds an item to the end of a list, in the same fashion as if you said `aList.insert(len(aList), x)`:


In [None]:
# list of numbers
numbers = [1, 2, 3, 4]

# add an integer to the end of the list
numbers.append(5)

print(numbers)
# [1, 2, 3, 4, 5]

# add a list to the end of the list
numbers.append([6, 7])

print(numbers)
# [1, 2, 3, 4, 5, [6, 7]]

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, [6, 7]]
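
Notice that appending a list adds it as a single nested item. If what you wanted was to add each element individually, the standard `list.extend()` method does that. It is not covered elsewhere in this lecture, so treat this as a side note:

```python
nested = [1, 2, 3]
nested.append([4, 5])     # the whole list becomes one item
print(nested)             # [1, 2, 3, [4, 5]]

flat = [1, 2, 3]
flat.extend([4, 5])       # each element is added separately
print(flat)               # [1, 2, 3, 4, 5]
```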


## Sorting lists

There are two ways of sorting a list: 
1. Use the `sorted()` function
2. The list `sort()` method

Why two different functions? It's a question of whether or not you want to keep the original list in an unsorted fashion.

#### The `sorted()` function

The `sorted()` function returns a sorted list but leaves the original list unchanged.   The function has two optional parameters: `key` and `reverse`.   More on that when we look at the `sort()` method.

#### The list `sort()` method

The `list` class `sort()` method sorts the list in-place, returning `None`.

In [None]:
# sorted function - returns a sorted list
numbers = [3, 7, 1, 5, 4]
numbers_sorted = sorted(numbers)

# numbers is not modified
print(numbers)
# [3, 7, 1, 5, 4]

# sorted function returns a sorted list
print(numbers_sorted)
# [1, 3, 4, 5, 7]


# sort method - modifies the list in-place and returns None
letters = ['c', 'b', 'd', 'a']
letters_sorted = letters.sort()

# letters is modified
print(letters)
# ['a', 'b', 'c', 'd']

# sort method returns None
print(letters_sorted)
# None

[3, 7, 1, 5, 4]
[1, 3, 4, 5, 7]
['a', 'b', 'c', 'd']
None


So what's up with `key` and `reverse`?  Both ways of sorting a list have those optional parameters.   First, `reverse`... The default value for this parameter is `False`, which means the list will be sorted in ascending order (as indicated by the content of the list).   For descending order, set this named parameter to `True` as an argument.

In [None]:
numbers = [2, 8, 3, 1, 9]

numbers_ascending = sorted(numbers)
print(numbers_ascending)
# [1, 2, 3, 8, 9]

numbers_descending = sorted(numbers, reverse=True)
print(numbers_descending)
# [9, 8, 3, 2, 1]

[1, 2, 3, 8, 9]
[9, 8, 3, 2, 1]


The `key` parameter is used to select what we want to use as a sort key for the list.   Its default value is `None`, which means the elements themselves are compared.   But you can provide a function that is applied to each element of the list, and the elements are then sorted based on the value returned by this function.   Guess what... that function can be a Python anonymous function! That's right.... Python supports lambda expressions!

Let's look at some examples:

In [None]:
# Sort a list of strings in order of increasing length
cities = ['Munich', 'Rome', 'Barcelona', 'Paris']

# Sort strings by length ascending order
cities_sorted = sorted(cities, key=len)
print(cities_sorted)
# ['Rome', 'Paris', 'Munich', 'Barcelona']

# Sort strings by length descending order
cities_sorted_r = sorted(cities, key=len, reverse=True)
print(cities_sorted_r)
# ['Barcelona', 'Munich', 'Paris', 'Rome']

['Rome', 'Paris', 'Munich', 'Barcelona']
['Barcelona', 'Munich', 'Paris', 'Rome']


In [None]:
# Sort a list of strings by the number of vowels

# function defined with def 
def num_vowel(x):
  cnt = 0
  for char in x.lower():
    if char in ['a', 'e', 'i', 'o', 'u']:
      cnt += 1
  return cnt

names = ['Paula', 'Amanda', 'Ana', 'Amaranta', 'Li']

names_sorted = sorted(names, key=num_vowel)

print(names_sorted)
# ['Li', 'Ana', 'Paula', 'Amanda', 'Amaranta']

['Li', 'Ana', 'Paula', 'Amanda', 'Amaranta']


In [None]:
# Sort a nested list by the third value in the inner lists
# list of lists [item,price,quantity]

shop_list = [['table', 150, 2], ['chair', 50, 4], ['carpet', 100, 1], ['painting', 200, 7]]

shop_list_sorted = sorted(shop_list, key=lambda x: x[2])
print(shop_list_sorted)
# [['carpet', 100, 1], ['table', 150, 2], ['chair', 50, 4], ['painting', 200, 7]]


[['carpet', 100, 1], ['table', 150, 2], ['chair', 50, 4], ['painting', 200, 7]]


Note what we did in the last example: the `key` argument is set to a reference to a Python lambda expression, i.e., an anonymous function.  For this example, we are defining an anonymous function that just returns the third value of a list.

Note the syntax: the keyword `lambda` followed by the parameter list, a colon, and then the body of the function.
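
To see that a lambda is just a function without a name, the sketch below sorts the same kind of data twice, once with a `def` function and once with the equivalent lambda, and then uses a lambda with the in-place `sort()` method. The `price()` function is a hypothetical helper for this example.

```python
shop_list = [['table', 150, 2], ['chair', 50, 4], ['carpet', 100, 1]]


def price(item):
    return item[1]


# A named function and the equivalent lambda give the same ordering
by_price_def = sorted(shop_list, key=price)
by_price_lambda = sorted(shop_list, key=lambda item: item[1])
print(by_price_def == by_price_lambda)   # True

# key and reverse work the same way with the in-place sort() method
shop_list.sort(key=lambda item: item[1], reverse=True)
print(shop_list)
# [['table', 150, 2], ['carpet', 100, 1], ['chair', 50, 4]]
```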

## List Comprehensions

Here's where things start to get funky for noob Pythonistas: list comprehensions. A **list comprehension** is syntax in the language that allows one to create a new list based on an existing list.  Languages that provide this construct are trying to get the list processing in the language to look and smell like notation used for defining sequences and sets in discrete math.   There's some funky math going on here that is more of a point for your Programming Languages and Theory of Computation courses.   But it's a language construct that is very, very, very useful.

Consider the following snippet of Python code:


In [None]:
# list of cities
cities = ['valencia', 'barcelona', 'madrid']

# new empty list
cities_capitalized = []

# we loop through the list (cities) and we append the capitalized word to the new list (cities_capitalized)
for city in cities:
    cities_capitalized.append(city.capitalize())
    
print(cities_capitalized)
# ['Valencia', 'Barcelona', 'Madrid']

['Valencia', 'Barcelona', 'Madrid']


Creating a new list required three steps: (1) create an empty list, (2) loop through the source list, and (3) append each transformed entry to the new list.   Look at how we can do this in a single line of code using a list comprehension:

In [None]:
# list of cities
cities = ['valencia', 'barcelona', 'madrid']

# new list with capitalized names
cities_capitalized = [city.capitalize() for city in cities]

print(cities_capitalized)
# ['Valencia', 'Barcelona', 'Madrid']

['Valencia', 'Barcelona', 'Madrid']


And we can also apply filters to the comprehension by adding conditional statements (theory note: this is similar to the "what and why" behind SQL's `WHERE` clauses on `SELECT` statements).


In [None]:
# list of numbers
numbers = [1, 2, 3, 4, 5, 6]

# filter out odd numbers from the list
numbers_even = [number for number in numbers if number % 2 == 0]

print(numbers_even)
# [2, 4, 6]

[2, 4, 6]


List comprehensions can be nested. 

Consider how we can implement a 3x4 matrix as a list of 3 lists of length 4:


In [None]:
matrix = [
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
]

Now I want the transpose of that matrix (4x3 matrix).   I can do this using a nested list comprehension:


In [None]:
[[row[i] for row in matrix] for i in range(4)]


[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

Note: you wouldn't deal with linear algebra stuff like this directly: Remember one of the Pythonic mantras: **Somebody else has already done it**.  You would use the `numpy` and `pandas` packages... more on that in a later lecture.

Another useful list comprehension-related thing is the `map()` function.  The `map()` function is one of those *higher order functions* that we spoke about when we introduced lambda expressions in C++:  it takes a function as an argument.   In this case,  it "maps" the function onto a list.   Consider the following:

In [None]:
squares = []
for x in range(10):
    squares.append(x**2)
print(squares)

squares = list(map(lambda x: x**2, range(10)))
print(squares)

squares = [x**2 for x in range(10)]
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
