# Lecture 2: Python Language Basics

In [1]:
import numpy as np
np.random.seed(12345)
np.set_printoptions(precision=4, suppress=True)

## Python Language Basics

### Language Semantics

#### Indentation, not braces

Example: Calculation of the Pythagorean Numbers

It is generally assumed that the Pythagorean theorem was discovered by Pythagoras, hence its name, although it is debated whether it was discovered earlier or independently by others. For the Pythagoreans, a mystical movement based on mathematics, religion, and philosophy, the integers satisfying the theorem were special numbers that they held sacred.

These days Pythagorean numbers are no longer mystical, though they may still seem so to pupils and others who are not on good terms with mathematics.

The definition is very simple:
Three integers satisfying $a^2+b^2=c^2$ are called Pythagorean numbers.

The following program calculates all Pythagorean numbers whose two smaller members a and b do not exceed a given maximum.
Remark: We import sqrt from the math module to be able to calculate the square root of a number.

```python
print('hello Data Curation course!')
    print('hello again, Data Curation course!')  # unexpected indent: raises IndentationError
```

In [2]:
from math import sqrt
n = input("Maximum Number? ")
n = int(n)+1
for a in range(1,n):
    for b in range(a,n):
        c_square = a**2 + b**2
        c = int(sqrt(c_square))
        if ((c_square - c**2) == 0):
            print(a, b, c)

Maximum Number? 10
3 4 5
6 8 10
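The float-based test above works well for small inputs, but rounding in sqrt could in principle give a wrong integer part for very large values. As a hedged variation (the function name pythagorean_triples is only illustrative, and math.isqrt requires Python 3.8 or later), the same search can be sketched with exact integer arithmetic and without input(). Note how the indentation alone marks which statements belong to each loop and to the if:

```python
from math import isqrt

def pythagorean_triples(limit):
    """Return all (a, b, c) with a <= b <= limit and a**2 + b**2 == c**2."""
    triples = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            c_square = a**2 + b**2
            c = isqrt(c_square)        # exact integer square root
            if c * c == c_square:      # c_square is a perfect square
                triples.append((a, b, c))
    return triples

print(pythagorean_triples(10))  # [(3, 4, 5), (6, 8, 10)]
```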


#### Everything is an object

In [3]:
n = "pedro"
type(n)

str

In [4]:
a = 123
type(a)

int

In [5]:
a = "hello world"
b = a

print(id(a))
print(id(b))



4438209520
4438209520


In [6]:
ab = 123
cd = ab

cd = ab +100
print(ab)
print(cd)

123
223


What does the id() function do? id() returns an integer that identifies an object for its lifetime; in CPython this is the object's memory address. In the example with b = a, id(a) equals id(b), so a and b both refer to a single object that resides in a single memory location. This is what we mean by “multiple names bound to a single object”.

In [7]:
a = [1, 2, 3]
b = [1, 2, 3]

print(id(a))
print(id(b))

4438037824
4438037376


In [8]:
variable = 3
variable = "hello"



print(type(variable))


<class 'str'>


In [9]:
variable = "hello"
print(type(variable))


<class 'str'>


In [10]:
print(a is b)

False


In this case, you can see that the objects that a and b refer to occupy different places in memory. Why did Python behave differently here? In the string example, b = a bound a second name to an existing object, whereas the two list literals above created two separate list objects. Because a list is *mutable* (unlike a string, which is *immutable*), Python must create a new list for each literal. To have the two names refer to the same object, you could write the following:

In [11]:
a = [1, 2, 3]
b = a
print(b is a)

True


An immutable object cannot be changed after it is created. If you wish to change the value of a name bound to an immutable object, such as a string, you must create a new object and bind the name to it. A mutable object can be changed in place.

In [12]:
a.append(4)
print(a)

[1, 2, 3, 4]


In [13]:
print(b is a)

True


This is because the list is mutable: append changed it in place, and both names are still bound to the same object. The expression a is b is equivalent to id(a) == id(b):

In [14]:
print(id(a))
print(id(b))

4433522112
4433522112


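However, because a string is an immutable object, assigning the same literal twice may create two separate string objects bound to different names (CPython sometimes reuses a single object instead, so the result is an implementation detail):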

In [15]:
a = "hello world"
b = "hello world"

print(id(a))
print(id(b))

4438289968
4438289840


In [16]:
print(a == b)

True


In [17]:
print(a is b)

False
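The difference between changing an object in place and rebinding a name can be sketched by keeping a second name for each object. This is only an illustration; the variable names are arbitrary:

```python
# Lists are mutable: += extends the existing object in place.
numbers = [1, 2, 3]
alias = numbers
numbers += [4]
print(alias)             # [1, 2, 3, 4]
print(alias is numbers)  # True

# Strings are immutable: += builds a new object and rebinds the name.
text = "hello"
other = text
text += " world"
print(other)             # hello
print(other is text)     # False
```

Use == when you care about equal values and is only when you care about object identity.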


**Exercises**

What is the output of the following code?
```python
a = 256
b = 256
print(a == b)
print(a is b)
```

Then what is the output of the following code?
```python
a = 257
b = 257
print(a == b)
print(a is b)
```

Read the explanation of this behavior at [wtfPython](https://github.com/satwikkansal/wtfPython?utm_source=mybridge&utm_medium=blog&utm_campaign=read_more#-is-is-not-what-it-is). Does it contradict what you expected?

#### Comments

Any text preceded by the hash mark (pound sign) # is ignored by the Python interpreter.
This is often used to add comments to code. At times you may also want to
exclude certain blocks of code without deleting them.

Comments can also follow code on the same line. Some programmers
prefer to place a comment on the line preceding the code it describes, but an
inline comment can be useful at times.

In [18]:
results = []
for number in range(10):
    # find the even numbers
    if number % 2 == 0:
        results.append(number)

print(results) # list all even numbers

[0, 2, 4, 6, 8]


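Comments that span multiple lines, used to explain things in more detail, are usually written as a string literal with a triple-quote delimiter (`"""`) on each end. Strictly speaking, this is a string expression that Python evaluates and discards rather than a true comment.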

In [19]:
"""
This would be a multiline comment
in Python that spans several lines and
describes your code, your day, or anything you want it to
"""

results = []
for number in range(10):
    # find the even numbers
    if number % 2 == 0:
        results.append(number)

print(results)


[0, 2, 4, 6, 8]
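A triple-quoted string placed as the first statement of a function is kept by Python as the function's docstring. The following is a minimal sketch of that idea; the function name even_numbers is made up for this illustration:

```python
def even_numbers(limit):
    """Return a list of the even numbers in range(limit)."""
    results = []
    for number in range(limit):
        if number % 2 == 0:  # keep only numbers divisible by 2
            results.append(number)
    return results

print(even_numbers.__doc__)  # the docstring is stored on the function
print(even_numbers(10))      # [0, 2, 4, 6, 8]
```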


#### Function and object method calls

You call functions using parentheses and passing zero or more arguments, optionally
assigning the returned value to a variable:

In [20]:
def add(a, b):
    result = a + b
    
    return result

result = add(10, 20)

print(result)

30


**Exercises**

Write a function *square* that takes one numeric argument and returns its square. For example, square(5) returns 25.

In [2]:
def square(n):
    return n**2

square(5)

25

#### Variables and argument passing

When assigning a variable (or name) in Python, you are creating a reference to the
object on the right-hand side of the equals sign.

In practical terms, consider a list of
integers:

In [21]:
a = [1, 2, 3]

Suppose we assign a to a new variable b:

In some languages, this assignment would cause the data [1, 2, 3] to be copied. In
Python, a and b actually now refer to the same object, the original list [1, 2, 3]

In [22]:
b = a

In [23]:
a.append(4)
b

[1, 2, 3, 4]

When you pass objects as arguments to a function, new local variables are created referencing
the original objects without any copying. If you bind a new object to a variable
inside a function, that change will not be reflected in the parent scope. It is,
however, possible to alter the internals of a mutable argument, because the local
variable refers to the caller's object. Suppose we had the following function:

In [24]:
def append_element(some_list, element):
    some_list.append(element)

In [25]:
data = [1, 2, 3]
append_element(data, 4)
print(data)

[1, 2, 3, 4]
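To contrast the two cases described above, here is a small sketch. The function names rebind and mutate are hypothetical and chosen only for this illustration:

```python
def rebind(some_list):
    # Binding a new object to the local name leaves the caller's list alone.
    some_list = [0, 0, 0]

def mutate(some_list):
    # Changing the object in place is visible to the caller.
    some_list.append(0)

data = [1, 2, 3]
rebind(data)
print(data)  # [1, 2, 3]
mutate(data)
print(data)  # [1, 2, 3, 0]
```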


#### Dynamic references, strong types

In contrast with many compiled languages, such as Java and C++, object references in
Python have no type associated with them. There is no problem with the following:

In [26]:
a = 5
type(a)

int

In [27]:
a = 'foo'
type(a)

str

Variables are names for objects within a particular namespace; the type information is
stored in the object itself. Some observers might hastily conclude that Python is not a
“typed language.” This is not true; consider this example:

In [28]:
'5' + 5

TypeError: can only concatenate str (not "int") to str

In some languages, such as Visual Basic, the string '5' might get implicitly converted
(or cast) to an integer, thus yielding 10. Yet in other languages, such as JavaScript,
the integer 5 might be cast to a string, yielding the concatenated string '55'. In this
regard Python is considered a strongly typed language, which means that every object
has a specific type (or class), and implicit conversions will occur only in certain obvious
circumstances, such as the following:

In [29]:
a = 4.5
b = 2
# String formatting, to be visited later
print('a is {0}, b is {1}'.format(type(a), type(b)))
a / b

a is <class 'float'>, b is <class 'int'>


2.25

Knowing the type of an object is important, and it’s useful to be able to write functions
that can handle many different kinds of input. You can check that an object is an
instance of a particular type using the isinstance function:

In [30]:
a = 5
isinstance(a, int)

True

isinstance can accept a tuple of types if you want to check that an object’s type is
among those present in the tuple:

In [31]:
a = 5; b = 4.5
isinstance(a, (int, float))
isinstance(b, (int, float))

True
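As a hedged sketch of how isinstance helps a function accept several kinds of input, the hypothetical helper describe below branches on the type of its argument:

```python
def describe(value):
    # A tuple of types matches if the value is an instance of any of them.
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "text"
    return "other"

print(describe(5))     # number
print(describe(4.5))   # number
print(describe("5"))   # text
print(describe(None))  # other
```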

**Exercises**

What is the output of the following code?
```python
a = True
print(a + 1)
print(type(a + 1))
```

What is the output of the following code?
```python
a = None
print(a + 1)
print(type(a + 1))
```

In [4]:
a = None
print(a + 1)
print(type(a + 1))

TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

In [55]:
a = True
print(a + 1)
print(type(a + 1))

2
<class 'int'>


#### Attributes and methods

Objects in Python typically have both attributes (other Python objects stored “inside”
the object) and methods (functions associated with an object that can have access to
the object’s internal data). Both of them are accessed via the syntax
obj.attribute_name:

```python
In [1]: a = 'hello world'

In [2]: a.<Press Tab>
a.capitalize  a.format      a.isupper     a.rindex      a.strip
a.center      a.index       a.join        a.rjust       a.swapcase
a.count       a.isalnum     a.ljust       a.rpartition  a.title
a.decode      a.isalpha     a.lower       a.rsplit      a.translate
a.encode      a.isdigit     a.lstrip      a.rstrip      a.upper
a.endswith    a.islower     a.partition   a.split       a.zfill
a.expandtabs  a.isspace     a.replace     a.splitlines
a.find        a.istitle     a.rfind       a.startswith
```

In [32]:
a = 'hello world'

Attributes and methods can also be accessed by name via the getattr function:

In [33]:
getattr(a, 'split')

<function str.split(sep=None, maxsplit=-1)>

In [34]:
a.split()

['hello', 'world']
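getattr is mostly useful when the attribute name is only known at run time. A minimal sketch follows; the attribute name 'no_such_attribute' is deliberately fictitious, to show the optional default argument:

```python
a = 'hello world'

method_name = 'upper'              # could come from user input or a config
method = getattr(a, method_name)   # look the method up by name
print(method())                    # HELLO WORLD

# A third argument is returned instead of raising AttributeError.
print(getattr(a, 'no_such_attribute', 'missing'))  # missing
```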

**Exercises**

Assign the string 'hello data curation course' to a. Use the method *title* to convert the first character of each word to uppercase and the remaining characters to lowercase, then convert all characters to uppercase.

In [57]:
a = "hello data curation course"
a.title()

'Hello Data Curation Course'

In [58]:
a.upper()

'HELLO DATA CURATION COURSE'

#### Imports

In Python a module is simply a file with the .py extension containing Python code.

In [35]:
import numpy

numpy.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

If we wanted to bring specific variables or functions defined in a module directly
into the current namespace, we could use the from ... import form:

In [36]:
from numpy import arange

arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

By using the as keyword you can give imports different variable names:

In [37]:
import numpy as np

np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [38]:
# All three forms make arange from the numpy module available
import numpy
print(numpy.arange(10))

from numpy import arange
print(arange(10))

import numpy as np
print(np.arange(10))

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]


#### Import conventions
The Python community has adopted a number of naming conventions for commonly
used modules:

In [39]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import statsmodels as sm

This means that when you see np.arange, this is a reference to the arange function in
NumPy. This is done because it’s considered bad practice in Python software development
to import everything (from numpy import *) from a large package like NumPy.

**Exercises**

Import the *math* module, which has many useful mathematical functions. Use the functions *log* and *pow* to check which value is larger, $\log(1000000000)$ or $2^5$.

Hint: use *?function* to look up a function's documentation if you do not know it yet, e.g.:

```python
?math.log
```

In [61]:
import math
a = math.log(1000000000)
b = math.pow(2, 5)
if a > b:
    print("log(1000000000) = ", a)
else:
    print("2^5 = ", b)

2^5 =  32.0


In [62]:
?math.log

#### Binary operators and comparisons

Most of the binary math operations and comparisons are as you might expect:

In [40]:
5 - 7

-2

In [41]:
12 + 21.5

33.5

In [42]:
5 <= 2

False

To check if two references refer to the same object, use the **is** keyword. **is not** is also
perfectly valid if you want to check that two objects are not the same:

In [43]:
a = [1, 2, 3]
b = a

In [44]:
c = list(a)
a is b

True

Since list always creates a new Python list (i.e., a copy), we can be sure that c is distinct
from a.

In [45]:
a is not c

True

Comparing with is is not the same as the == operator, because in this
case we have:

In [46]:
a == c

True

A very common use of is and is not is to check if a variable is None, since there is
only one instance of None:

In [47]:
a = None
a is None

True

**Exercises**

What is the output of the following code:

```python
print(10 > 0 and 5 < 10)
print(10 != 100 or 5 < 10)
print(not 5<10)
print(10!=100 and not 5<10)
```

In [63]:
print(10 > 0 and 5 < 10)
print(10 != 100 or 5 < 10)
print(not 5<10)
print(10!=100 and not 5<10)

True
True
False
False


#### Assignment Operators

When programming, it is common to use compound assignment operators that perform an operation on a variable’s value and then assign the resulting new value to that variable. These compound operators combine an arithmetic operator with the *=* operator, so for addition we’ll combine *+* with *=* to get the compound operator *+=*:

In [48]:
w = 10
w += 10
print(w)

20


In [49]:
w = 10
w *= 5
print(w)

50


In [50]:
w = 2
w **= 10
print(w)

1024


#### Membership operators

**in** and **not in** are the membership operators in Python. They are used to test whether a value is found in a collection (string, list, tuple, set, or dictionary). For a dictionary, the test checks the keys.

In [51]:
x = 'Hello data curation course'

print('H' in x)

True


In [65]:
print('hello' not in x)

True
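The same operators work on other collections. A short sketch with made-up data (the names primes and ages are only illustrative):

```python
primes = [2, 3, 5, 7]
ages = {'ana': 31, 'bo': 27}

print(3 in primes)           # True
print(4 not in primes)       # True
print('ana' in ages)         # True, dictionaries test their keys
print(31 in ages)            # False, 31 is a value, not a key
print(31 in ages.values())   # True
```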


#### Mutable and immutable objects

Most objects in Python, such as lists, dicts, NumPy arrays, and most user-defined
types (classes), are mutable. This means that the object or values that they contain can
be modified:

In [53]:
a_list = ['foo', 2, [4, 5]]
a_list[2] = (3, 4)
a_list

['foo', 2, (3, 4)]

Others, like strings and tuples, are immutable:

In [54]:
a_tuple = (3, 5, (4, 5))
a_tuple[1] = 'four'

TypeError: 'tuple' object does not support item assignment

Remember that just because you can mutate an object does not mean that you always
should. Such actions are known as side effects. For example, when writing a function,
any side effects should be explicitly communicated to the user in the function’s documentation
or comments. If possible, try to avoid side effects and
favor immutability, even though there may be mutable objects involved.

#### Implications of passing mutable vs. immutable variables to functions …


In [7]:
value = "old value"
def assign(param):
    #global value
    value = "new value"
    

assign(value)
print(value)

old value


The call passes a string (an immutable type of object) into the function *assign*. Within the scope of *assign*, *param* is bound to the same object that *value* is bound to outside the function. The assignment inside the function does not modify "old value": strings are immutable, so the statement creates a new string object and binds it to a new local name *value* that exists only inside the function. Once we leave the scope of *assign*, its local names are no longer in the namespace, and the object that the global *value* refers to was never changed. (Uncommenting the line *global value* would make the assignment rebind the global name instead.)



In [67]:
def assign(param):
    param[2] = "nothing"
    
value = ['You', 'know', 'something', 'Jon', 'Snow']
assign(value)
print(value)

['You', 'know', 'nothing', 'Jon', 'Snow']


### Scalar Types

Python along with its standard library has a small set of built-in types for handling
numerical data, strings, boolean (True or False) values, and dates and times. These
“single value” types are sometimes called scalar types.

#### Numeric types

The primary Python types for numbers are int and float. An int can store arbitrarily
large numbers:

In [68]:
ival = 17239871
ival ** 6

26254519291092456596965462913230729701102721

Floating-point numbers are represented with the Python float type. Under the hood
each one is a double-precision (64-bit) value. They can also be expressed with scientific
notation:

In [69]:
fval = 7.243
fval2 = 6.78e-5

Integer division not resulting in a whole number will always yield a floating-point
number:

In [70]:
3 / 2

1.5

To get integer division that rounds the result down to a whole number (for positive
operands this matches C-style integer division, which drops the fractional part), use
the floor division operator //:

In [71]:
3 // 2

1

In [8]:
3%2

1
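The operators // and % are tied together by the identity a == (a // b) * b + a % b. The following sketch shows how they behave for a negative operand, where floor division differs from simply dropping the fractional part:

```python
print(7 // 2, 7 % 2)     # 3 1
print(-7 // 2, -7 % 2)   # -4 1  (the quotient is rounded down, not toward zero)
print(int(-7 / 2))       # -3    (int() truncates toward zero)

# divmod returns the quotient and the remainder in one call.
print(divmod(45, 8))     # (5, 5)
```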

**Exercises**

Write a function *remainder* that returns the remainder of dividing a by b, e.g.

```python
results = remainder(45, 8)
print(results) #results = 5
```

In [89]:
def remainder(a, b):
    # subtract the largest multiple of b that fits into a
    return a - (a // b) * b

results = remainder(45, 8)
print(results)

5


#### Strings

Many people use Python for its powerful and flexible built-in string processing capabilities.
You can write string literals using either single quotes ' or double quotes ":

In [72]:
a = 'one way of writing a string'
b = "another way"

For multiline strings with line breaks, you can use triple quotes, either ''' or """:

In [73]:
c = """
This is a longer string that
spans multiple lines
"""

It may surprise you that this string c actually contains four lines of text; the line
breaks after """ and after lines are included in the string. We can count the new line
characters with the count method on c:

In [74]:
c.count('\n')

3

Python strings are immutable; you cannot modify a string:

In [75]:
a = 'this is a string'
a[10] = 'f'

TypeError: 'str' object does not support item assignment

After this failed assignment, the variable a is unmodified:

In [90]:
a

'this is a string'

In [91]:
b = a.replace('string', 'longer string')
b

'this is a longer string'

Many Python objects can be converted to a string using the str function:

In [92]:
a = 5.6
s = str(a)
print(s)

5.6


Strings are a sequence of Unicode characters and therefore can be treated like other
sequences, such as lists and tuples (which we will explore in more detail in the next
chapter):

In [93]:
s = 'python'
list(s)

['p', 'y', 't', 'h', 'o', 'n']

The syntax s[:3] is called slicing and is implemented for many kinds of Python
sequences.

Indexing starts at 0. Trying to access a character outside the index range raises an IndexError. The index must be an integer; using a float or another type results in a TypeError.

In [94]:
s[:3]

'pyt'

In [95]:
s[10]

IndexError: string index out of range

The index of -1 refers to the last item, -2 to the second last item and so on. We can access a range of items in a string by using the slicing operator (colon).

In [102]:
s[-2:-1]

'o'
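A slice has the general form s[start:stop:step], where stop is excluded and any part may be omitted. A few more cases, as a quick sketch on the same string:

```python
s = 'python'

print(s[1:4])    # yth     characters at index 1, 2 and 3
print(s[-3:])    # hon     the last three characters
print(s[::2])    # pto     every second character
print(s[::-1])   # nohtyp  a negative step walks backwards
```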

The backslash character \ is an escape character, meaning that it is used to specify
special characters like newline \n or Unicode characters. To write a string literal with
backslashes, you need to escape them:

In [103]:
s = '12\\34'
print(s)

12\34


If you have a string with a lot of backslashes and no special characters, you might find
this a bit annoying. Fortunately you can preface the leading quote of the string with r,
which means that the characters should be interpreted as is (The r stands for raw):

In [104]:
s = r'this\has\no\special\characters'
s

'this\\has\\no\\special\\characters'

Adding two strings together concatenates them and produces a new string:

In [105]:
a = 'this is the first half '
b = 'and this is the second half'
a + b

'this is the first half and this is the second half'

String objects have a format method that
can be used to substitute formatted arguments into the string, producing a new
string:

In [106]:
template = '{0:.2f} {1:s} are worth USD${2:d}'

In this string,
* {0:.2f} means to format the first argument as a floating-point number with two
decimal places.
* {1:s} means to format the second argument as a string.
* {2:d} means to format the third argument as an exact integer.

In [107]:
template.format(0.86, 'USD', 1)

'0.86 USD are worth USD$1'
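Since Python 3.6 the same format specifications can also be written inline with formatted string literals (f-strings). This is offered as an alternative sketch rather than a replacement for format; the variable names are illustrative:

```python
amount = 0.86
currency = 'USD'
value = 1

# The part after the colon uses the same mini-language as str.format.
message = f'{amount:.2f} {currency:s} are worth USD${value:d}'
print(message)  # 0.86 USD are worth USD$1
```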

**Exercises**

Replace the word "Hello" in the string "Hello, Data Curation Course!" with "Ola".

(Hint: you can either first split the string to the list of words and replace the word or use the ```replace``` method, check ?str.replace for more details)

In [161]:
string = "Hello, Data Curation Course!"
string.replace(string[:5], "Ola")

'Ola, Data Curation Course!'

#### Bytes and Unicode

In modern Python (i.e., Python 3.0 and up), Unicode has become the first-class string
type to enable more consistent handling of ASCII and non-ASCII text. In older versions
of Python, strings were all bytes without any explicit Unicode encoding. You
could convert to Unicode assuming you knew the character encoding. Let’s look at an
example:

In [108]:
val = "español"
val

'español'

We can convert this Unicode string to its UTF-8 bytes representation using the
encode method:

In [109]:
val_utf8 = val.encode('utf-8')
val_utf8

b'espa\xc3\xb1ol'

In [110]:
type(val_utf8)

bytes

Assuming you know the Unicode encoding of a bytes object, you can go back using
the decode method:

In [111]:
val_utf8.decode('utf-8')

'español'

While it’s become preferred to use UTF-8 for any encoding, for historical reasons you
may encounter data in any number of different encodings:

In [112]:
val.encode('latin1')

b'espa\xf1ol'

In [113]:
val.encode('utf-16')

b'\xff\xfee\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [114]:
val.encode('utf-16le')

b'e\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'
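Decoding only works when the bytes are interpreted with the encoding that produced them. The sketch below shows what can happen otherwise; it relies only on the standard errors argument of bytes.decode:

```python
val = "español"
latin1_bytes = val.encode('latin1')

try:
    latin1_bytes.decode('utf-8')      # wrong encoding for these bytes
except UnicodeDecodeError as error:
    print('cannot decode as UTF-8:', error.reason)

print(latin1_bytes.decode('latin1'))  # español

# errors='replace' substitutes U+FFFD for bytes that cannot be decoded.
print(latin1_bytes.decode('utf-8', errors='replace'))
```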

#### Booleans

The two boolean values in Python are written as True and False. Comparisons and
other conditional expressions evaluate to either True or False. Boolean values are
combined with the and and or keywords:

In [115]:
True and True
False or True

True

#### Type casting

The str, bool, int, and float types are also functions that can be used to cast values
to those types:

In [116]:
s = '3.14159'
type(s)

str

In [117]:
fval = float(s)
type(fval)

float

In [118]:
int(fval)

3

In [119]:
bool(fval)

True

In [120]:
bool(0)

False

**Exercises**

Cast the course number 2489 to a string and concatenate it with the string "Hello, Data Curation Course".

In [162]:
Course_number = str(2489)
Course_name = "Hello, Data Curation Course"
Course_number + " - " + Course_name


'2489 - Hello, Data Curation Course'

#### None

None is the Python null value type. If a function does not explicitly return a value, it
implicitly returns None:

In [121]:
a = None
a is None

True

In [122]:
b = 5
b is not None

True

None is also a common default value for function arguments:

In [123]:
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result

In [124]:
add_and_maybe_multiply(2, 5)

7

In [125]:
add_and_maybe_multiply(2, 5, 8)

56

None is not only a reserved keyword but also a unique instance of NoneType:

In [126]:
type(None)

NoneType
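One reason None is preferred as a default is that default values are evaluated only once, when the function is defined, so a mutable default such as [] would be shared between calls. A minimal sketch, with the hypothetical function add_item:

```python
def add_item(item, bucket=None):
    # Create a fresh list on each call unless the caller supplies one.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = add_item(1)
second = add_item(2)
print(first, second)         # [1] [2]
print(add_item(3, first))    # [1, 3]
```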

#### Dates and times

The built-in Python datetime module provides datetime, date, and time types. The
datetime type, as you may imagine, combines the information stored in date and
time and is the most commonly used:

In [127]:
from datetime import datetime, date, time
dt = datetime(2011, 10, 29, 20, 30, 21)

In [128]:
dt.day

29

In [129]:
dt.minute

30

In [130]:
dt.date()

datetime.date(2011, 10, 29)

In [131]:
dt.time()

datetime.time(20, 30, 21)

The strftime method formats a datetime as a string:

In [132]:
dt.strftime('%m/%d/%Y %H:%M')

'10/29/2011 20:30'

Strings can be converted (parsed) into datetime objects with the strptime function:

In [133]:
datetime.strptime('20091031', '%Y%m%d')

datetime.datetime(2009, 10, 31, 0, 0)

When you are aggregating or otherwise grouping time series data, it will occasionally
be useful to replace time fields of a series of datetimes—for example, replacing the
minute and second fields with zero:

In [134]:
dt.replace(minute=0, second=0)

datetime.datetime(2011, 10, 29, 20, 0)

Since datetime.datetime is an immutable type, methods like these always produce
new objects.


The difference of two datetime objects produces a datetime.timedelta type:

In [135]:
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt
delta

datetime.timedelta(days=17, seconds=7179)

In [136]:
type(delta)

datetime.timedelta

The output timedelta(days=17, seconds=7179) indicates that the timedelta encodes an offset of 17
days and 7,179 seconds.

In [137]:
dt

datetime.datetime(2011, 10, 29, 20, 30, 21)

Adding a timedelta to a datetime produces a new shifted datetime:

In [138]:
dt + delta

datetime.datetime(2011, 11, 15, 22, 30)
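A timedelta can also be built directly and inspected. The following sketch repeats the two datetimes from above so that it runs on its own:

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt

print(delta.days, delta.seconds)        # 17 7179
print(delta.total_seconds())            # 1475979.0

# Shift a datetime by an explicitly constructed offset.
print(dt + timedelta(days=1, hours=2))  # 2011-10-30 22:30:21
```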

### Control Flow

Python has several built-in keywords for conditional logic, loops, and other standard
control flow concepts found in other programming languages.

#### if, elif, and else

The if statement is one of the most well-known control flow statement types. It
checks a condition that, if True, evaluates the code in the block that follows:

```python
if x < 0:
    print("It's negative")
```

An if statement can be optionally followed by one or more elif blocks and a catchall
else block if all of the conditions are False:

```python
if x < 0:
    print("It's negative")
elif x == 0:
    print('Equal to zero')
elif 0 < x < 5:
    print('Positive but smaller than 5')
else:
    print('Positive and larger than or equal to 5')
```

If any of the conditions is **True**, no further **elif** or **else** blocks will be reached. With
a compound condition using **and** or **or**, conditions are evaluated left to right and will
short-circuit:

In [139]:
a = 5; b = 7
c = 8; d = 4
if a < b or c > d:
    print('Made it')

Made it


In this example, the comparison c > d never gets evaluated because the first comparison
was True.
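Short-circuiting can be made visible by recording which operands are actually evaluated. In this sketch the helper check and the list calls are hypothetical names used only for the demonstration:

```python
calls = []

def check(label, result):
    # Record that this operand was evaluated, then return its truth value.
    calls.append(label)
    return result

if check('a < b', 5 < 7) or check('c > d', 8 > 4):
    print('Made it')
print(calls)  # ['a < b'], the second operand was skipped

calls.clear()
if check('a > b', 5 > 7) and check('c > d', 8 > 4):
    print('not reached')
print(calls)  # ['a > b'], and stops at the first False operand
```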

It is also possible to chain comparisons:

In [10]:
4 > 3 > 2 > 1

True

In [141]:
a = True

if a == True:
    print('a is True')
else:
    print('a is False')

a is True


In [142]:
a = 0


if a == True:
    print('a is True')
else:
    print('a is False')

a is False


**Exercises**

Input an integer and check whether it is an odd or an even number:

```python
number = input()

...

print("Number 35 is an odd number")
print("Number 42 is an even number")
```

In [164]:
number = int(input())
if number % 2 == 0:
    print("Number", number, "is an even number")
else:
    print("Number", number, "is an odd number")

42
Number 42 is an even number


#### for loops

**for** loops are for iterating over a collection (like a list or tuple) or an iterator. The
standard syntax for a for loop is:

```python
for value in collection:
    # do something with value
```

You can advance a **for** loop to the next iteration, skipping the remainder of the block,
using the **continue** keyword. Consider this code, which sums up integers in a list and
skips None values:

In [143]:
sequence = [1, 2, None, 4, None, 5]
total = 0
for value in sequence:
    if value is None:
        continue
    total += value

print(total)

12


A **for** loop can be exited altogether with the **break** keyword. This code sums elements
of the list until a 5 is reached:

In [144]:
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        break
    total_until_5 += value
    
print(total_until_5)

13


The break keyword only terminates the innermost for loop; any outer for loops will
continue to run:

In [145]:
for i in range(4):
    for j in range(4):
        if j > i:
            break
        print((i, j))

(0, 0)
(1, 0)
(1, 1)
(2, 0)
(2, 1)
(2, 2)
(3, 0)
(3, 1)
(3, 2)
(3, 3)


**Exercises**

Print all odd numbers from 0 to 100

```python
print("Number 1 is an odd number")
print("Number 3 is an odd number")

...

print("Number 99 is an odd number")
```

In [165]:
def is_even(a):
    # the comparison already evaluates to True or False
    return a % 2 == 0

for i in range(100):
    if not is_even(i):
        print("Number", i, "is an odd number")

Number 1 is an odd number
Number 3 is an odd number
Number 5 is an odd number
Number 7 is an odd number
Number 9 is an odd number
Number 11 is an odd number
Number 13 is an odd number
Number 15 is an odd number
Number 17 is an odd number
Number 19 is an odd number
Number 21 is an odd number
Number 23 is an odd number
Number 25 is an odd number
Number 27 is an odd number
Number 29 is an odd number
Number 31 is an odd number
Number 33 is an odd number
Number 35 is an odd number
Number 37 is an odd number
Number 39 is an odd number
Number 41 is an odd number
Number 43 is an odd number
Number 45 is an odd number
Number 47 is an odd number
Number 49 is an odd number
Number 51 is an odd number
Number 53 is an odd number
Number 55 is an odd number
Number 57 is an odd number
Number 59 is an odd number
Number 61 is an odd number
Number 63 is an odd number
Number 65 is an odd number
Number 67 is an odd number
Number 69 is an odd number
Number 71 is an odd number
Number 73 is an odd number
...

#### while loops

A **while** loop specifies a condition and a block of code that is to be executed until the
condition evaluates to **False** or the loop is explicitly ended with **break**:

In [146]:
n = 100

# initialize total and counter (the name sum would shadow the built-in function)
total = 0
i = 1

while i <= n:
    total = total + i
    i = i+1    # update counter

# print the sum
print("The sum is", total)

The sum is 5050


The **break** statement terminates the loop containing it. Control of the program flows to the statement immediately after the body of the loop.

If a break statement is inside a nested loop (a loop inside another loop), break terminates only the innermost loop.


In [147]:
x = 256
total = 0
while x > 0:
    if total > 500:
        break
    total += x
    x = x // 2

print(total)

504


The **continue** statement is used to skip the rest of the code inside a loop for the current iteration only. The loop does not terminate but continues with the next iteration.

In [148]:
# print the sum of all odd numbers smaller than 100
x = 0
total = 0
while x < 100:
    if x % 2 == 0:
        x += 1
        continue
        
    total += x
    x += 1

print(total)

2500


#### pass

**pass** is the “no-op” statement in Python. It can be used in blocks where no action is to
be taken (or as a placeholder for code not yet implemented); it is only required
because Python uses whitespace to delimit blocks:

```python

if x < 0:
    print('negative!')
elif x == 0:
    # TODO: put something smart here
    pass
else:
    print('positive!')
```

**Exercises**

Loop through and print out all even numbers from the numbers list in the same order they are received. Don't print any numbers that come after 237 in the sequence.

```python
numbers = [
    951, 402, 984, 651, 360, 69, 408, 319, 601, 485, 980, 507, 725, 547, 544,
    615, 83, 165, 141, 501, 263, 617, 865, 575, 219, 390, 984, 592, 236, 105, 942, 941,
    386, 462, 47, 418, 907, 344, 236, 375, 823, 566, 597, 978, 328, 615, 953, 345,
    399, 162, 758, 219, 918, 237, 412, 566, 826, 248, 866, 950, 626, 949, 687, 217,
    815, 67, 104, 58, 512, 24, 892, 894, 767, 553, 81, 379, 843, 831, 445, 742, 717,
    958, 609, 842, 451, 688, 753, 854, 685, 93, 857, 440, 380, 126, 721, 328, 753, 470,
    743, 527
]
```



In [166]:
numbers = [
    951, 402, 984, 651, 360, 69, 408, 319, 601, 485, 980, 507, 725, 547, 544,
    615, 83, 165, 141, 501, 263, 617, 865, 575, 219, 390, 984, 592, 236, 105, 942, 941,
    386, 462, 47, 418, 907, 344, 236, 375, 823, 566, 597, 978, 328, 615, 953, 345,
    399, 162, 758, 219, 918, 237, 412, 566, 826, 248, 866, 950, 626, 949, 687, 217,
    815, 67, 104, 58, 512, 24, 892, 894, 767, 553, 81, 379, 843, 831, 445, 742, 717,
    958, 609, 842, 451, 688, 753, 854, 685, 93, 857, 440, 380, 126, 721, 328, 753, 470,
    743, 527
]

for number in numbers: 
    if is_even(number):
        print(number)
    if number == 237:
        break

402
984
360
408
980
544
390
984
592
236
942
386
462
418
344
236
566
978
328
162
758
918


#### range

The range function returns a range object, a lazy sequence that produces evenly spaced
integers on demand:

In [149]:
range(10)

range(0, 10)

In [150]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

A start, end, and step (which may be negative) can be given:

In [151]:
list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [152]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]

As you can see, range produces integers up to but not including the endpoint. A
common use of range is for iterating through sequences by index:

In [153]:
seq = [1, 2, 3, 4]
for i in range(len(seq)):
    val = seq[i]
    print(val)

1
2
3
4
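When both the index and the value are needed, the built-in enumerate function is a common alternative to range(len(seq)). A small sketch (the list pairs exists only to collect the results):

```python
seq = [1, 2, 3, 4]
pairs = []

for i, val in enumerate(seq):
    # enumerate yields (index, value) tuples, unpacked here into i and val
    pairs.append((i, val))
    print(i, val)
```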


While you can use functions like list to store all the integers generated by range in
some other data structure, often the default iterator form will be what you want. This
snippet sums all numbers from 0 to 99,999 that are multiples of 3 or 5:

In [154]:
total = 0  # the name sum would shadow the built-in function
for i in range(100000):
    # % is the modulo operator
    if i % 3 == 0 or i % 5 == 0:
        total += i

print(total)

2333316668
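The same total can be computed without building a list by passing a generator expression to the built-in sum function. This is a sketch that assumes the name sum still refers to the built-in (that is, it has not been reassigned earlier in the session):

```python
# Each value is produced on demand and added immediately.
total = sum(i for i in range(100000) if i % 3 == 0 or i % 5 == 0)
print(total)  # 2333316668

# A range object knows its length and supports membership tests
# without generating its values.
big = range(0, 10**12, 5)
print(len(big))    # 200000000000
print(35 in big)   # True
```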


#### Ternary expressions

A ternary expression in Python allows you to combine an if-else block that produces
a value into a single line or expression. The syntax for this in Python is:

```python
value = true-expr if condition else false-expr
```

Here, true-expr and false-expr can be any Python expressions. It has the identical
effect as the more verbose:
    
```python
if condition:
    value = true-expr
else:
    value = false-expr
```

In [155]:
x = 5
'Non-negative' if x >= 0 else 'Negative'

'Non-negative'

As with if-else blocks, only one of the expressions will be executed. Thus, the “if ”
and “else” sides of the ternary expression could contain costly computations, but only
the branch selected by the condition is ever evaluated.
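This lazy evaluation can be checked by recording which side runs. In the sketch below, costly and evaluated are hypothetical names standing in for an expensive computation:

```python
evaluated = []

def costly(label):
    # Stand-in for an expensive computation; records that it ran.
    evaluated.append(label)
    return label

x = 5
result = costly('Non-negative') if x >= 0 else costly('Negative')

print(result)     # Non-negative
print(evaluated)  # ['Non-negative'], the other side never ran
```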

**Exercises**

- A prime number is an integer greater than 1 whose only factors are 1 and itself. Print all prime numbers up to 1000, and their sum.
- Write a function **factorial** that computes the factorial of a given number. The results should be printed in a comma-separated sequence on a single line. For example:
```python
results = factorial(5)
print(results) #results = 120
```
- Given the following definition of Data Curation on Wikipedia
```python
"data curation is a broad term used to indicate processes and activities related to the organization and integration of data collected from various sources, annotation of the data, and publication and presentation of the data such that the value of the data is maintained over time, and the data remains available for reuse and preservation. data curation includes 'all the processes needed for principled and controlled data creation, maintenance, and management, together with the capacity to add value to data'"
```
    1. count how many words are in this text (for simplicity, punctuation marks can be considered as words)
    2. count how many times 'data' occurs in this text
    3. convert this text to uppercase
    4. replace the word 'data' with 'information'

In [169]:
def is_prime(a):
    for i in range(2, a):
        if a % i == 0:
            return False
    return True

total = 0
for i in range(2, 1000):
    if is_prime(i):
        print(i)
        total += i  # add only the primes

print("sum: ", total)

2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71
73
79
83
89
97
101
103
107
109
113
127
131
137
139
149
151
157
163
167
173
179
181
191
193
197
199
211
223
227
229
233
239
241
251
257
263
269
271
277
281
283
293
307
311
313
317
331
337
347
349
353
359
367
373
379
383
389
397
401
409
419
421
431
433
439
443
449
457
461
463
467
479
487
491
499
503
509
521
523
541
547
557
563
569
571
577
587
593
599
601
607
613
617
619
631
641
643
647
653
659
661
673
677
683
691
701
709
719
727
733
739
743
751
757
761
769
773
787
797
809
811
821
823
827
829
839
853
857
859
863
877
881
883
887
907
911
919
929
937
941
947
953
967
971
977
983
991
997
sum:  76127


In [172]:
def factorial(a):
    fact = 1
    if a == 0:
        return 1
    elif a < 0: 
        return "not defined"
    else:
        for i in range(1, a+1):
            fact *= i
    return fact

results = factorial(5)
print(results)
        

120
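For the text exercise, one possible approach is sketched below on a shortened, made-up sample sentence rather than the full definition. It assumes that words are separated by whitespace; note that str.count counts substrings, so 'data' inside a longer word would also be counted:

```python
# Hypothetical short sample; substitute the full definition to solve the exercise.
text = "data curation adds value to data"

words = text.split()                  # split on whitespace
print(len(words))                     # 6
print(text.count('data'))             # 2
print(text.upper())                   # DATA CURATION ADDS VALUE TO DATA
print(text.replace('data', 'information'))
```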
