# Python Language Basics, IPython, and Jupyter Notebooks

In [1]:
import numpy as np
np.random.seed(12345)
np.set_printoptions(precision=4, suppress=True)

## The Python Interpreter

```shell
$ python
Python 3.6.0 | packaged by conda-forge | (default, Jan 13 2017, 23:17:12)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 5
>>> print(a)
5
```

```python
print('Hello world')
```

```shell
$ python hello_world.py
Hello world
```

```shell
$ ipython
Python 3.6.0 | packaged by conda-forge | (default, Jan 13 2017, 23:17:12)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: %run hello_world.py
Hello world

In [2]:
```

## IPython Basics

### Running the IPython Shell

$ 

In [3]:
import numpy as np
data = {i : np.random.randn() for i in range(7)}
data

{0: -0.20470765948471295,
 1: 0.47894333805754824,
 2: -0.5194387150567381,
 3: -0.55573030434749,
 4: 1.9657805725027142,
 5: 1.3934058329729904,
 6: 0.09290787674371767}

>>> from numpy.random import randn
>>> data = {i : randn() for i in range(7)}
>>> print(data)
{0: -1.5948255432744511, 1: 0.10569006472787983, 2: 1.972367135977295,
3: 0.15455217573074576, 4: -0.24058577449429575, 5: -1.2904897053651216,
6: 0.3308507317325902}

### Running the Jupyter Notebook

```shell
$ jupyter notebook
[I 15:20:52.739 NotebookApp] Serving notebooks from local directory:
/home/wesm/code/pydata-book
[I 15:20:52.739 NotebookApp] 0 active kernels
[I 15:20:52.739 NotebookApp] The Jupyter Notebook is running at:
http://localhost:8888/
[I 15:20:52.740 NotebookApp] Use Control-C to stop this server and shut down
all kernels (twice to skip confirmation).
Created new window in existing browser session.
```

### Tab Completion

In [4]:
an_apple = 27

an_example = 42

#an_

In [None]:
b = [1, 2, 3]

#b.

In [None]:
import datetime
#datetime.

In [None]:
datasets/movielens/

### Introspection

In [5]:
b = [1, 2, 3]
b?

In [6]:
print?

In [7]:
def add_numbers(a, b):
    """
    Add two numbers together

    Returns
    -------
    the_sum : type of arguments
    """
    return a + b

In [8]:
add_numbers?

In [9]:
add_numbers??

In [10]:
np.*load*?
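
Outside IPython, a similar wildcard search can be approximated with `dir` and the standard library's `fnmatch` module. The sketch below is an illustration only: it searches the standard library's `json` module instead of NumPy so that it runs without extra dependencies, and the helper name `search_names` is hypothetical.

```python
import fnmatch
import json


def search_names(obj, pattern):
    """Return the public attribute names of obj matching a shell-style pattern."""
    return sorted(
        name for name in dir(obj)
        if not name.startswith('_') and fnmatch.fnmatchcase(name, pattern)
    )


# Roughly what `json.*load*?` would list in IPython
matches = search_names(json, '*load*')
print(matches)
```

Unlike `?`, this only returns the matching names; it does not display docstrings or signatures.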

### The %run Command

In [11]:
def f(x, y, z):
    return (x + y) / z

a = 5
b = 6
c = 7.5

result = f(a, b, c)

In [12]:
%ls scripts/

ipython_script_test.py


In [13]:
%run scripts/ipython_script_test.py

In [14]:
c

7.5

In [15]:
result

1.4666666666666666
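
A rough standard-library analogue of `%run` is `runpy.run_path`, which executes a script and returns the resulting namespace as a dictionary instead of injecting the names into the interactive session. The sketch below writes a copy of the script to a temporary directory so that it is self-contained; the temporary location is an assumption made for the example.

```python
import os
import runpy
import tempfile

script_source = '''
def f(x, y, z):
    return (x + y) / z

a = 5
b = 6
c = 7.5

result = f(a, b, c)
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = os.path.join(tmp_dir, 'ipython_script_test.py')
    with open(script_path, 'w') as handle:
        handle.write(script_source)

    # run_path executes the file and returns its global namespace as a dict
    namespace = runpy.run_path(script_path)

print(namespace['c'])
print(namespace['result'])
```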

In [None]:
# %load scripts/ipython_script_test.py
def f(x, y, z):
	return (x + y) / z

a = 5
b = 6
c = 7.5

result = f(a, b, c) 


#### Interrupting running code

### Executing Code from the Clipboard

This is relevant when running IPython in a terminal, less so for the Jupyter notebook, where copy and paste works well.

```python
x = 5
y = 7
if x > 5:
    x += 1

    y = 8
```

```python
In [17]: %paste
x = 5
y = 7
if x > 5:
    x += 1

    y = 8
## -- End pasted text --
```

```python
In [18]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:x = 5
:y = 7
:if x > 5:
:    x += 1
:
:    y = 8
:--
```

### Terminal Keyboard Shortcuts

### About Magic Commands

In [18]:
a = np.random.randn(100, 100)

%timeit np.dot(a, a)

43.1 µs ± 1.92 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
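
`%timeit` is only available inside IPython and Jupyter. In a plain Python script, the standard library's `timeit` module can be used for a similar measurement. The following is a minimal sketch; the workload (sorting a reversed list) and the names `values`, `n_loops`, and `best_per_loop` are stand-ins chosen for the example, not part of the notebook.

```python
import timeit

values = list(range(1000, 0, -1))  # stand-in workload: a reversed list to sort

n_loops = 1000
# repeat() returns one total elapsed time (in seconds) per run
run_totals = timeit.repeat(lambda: sorted(values), repeat=5, number=n_loops)

# The fastest run is usually the least disturbed by other processes
best_per_loop = min(run_totals) / n_loops
print('best of 5 runs: {:.2f} microseconds per loop'.format(best_per_loop * 1e6))
```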


In [19]:
%debug?                      

UsageError: Line magic function `%debug?` not found.


In [20]:
%pwd

'/Users/verenakutschera/Dropbox/Projects/Miscellanea/Workshops/meetup_Python_DataAnalysis/pydata-book'

In [21]:
foo = %pwd

foo

'/Users/verenakutschera/Dropbox/Projects/Miscellanea/Workshops/meetup_Python_DataAnalysis/pydata-book'

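Potentially useful magic command: `%config`, which lists IPython's configurable objects and their options
* e.g. for adjusting how output is formatted, such as the number of lines shown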

In [27]:
%config

Available objects for config:
     AliasManager
     DisplayFormatter
     HistoryManager
     IPCompleter
     IPKernelApp
     InlineBackend
     LoggingMagics
     MagicsManager
     OSMagics
     PrefilterManager
     ScriptMagics
     StoreMagics
     ZMQInteractiveShell


In [26]:
%config DisplayFormatter

DisplayFormatter options
----------------------
DisplayFormatter.active_types=<List>
    Current: ['text/plain', 'text/html', 'text/markdown', 'image/svg+xml', 'image/png', 'application/pdf', 'image/jpeg', 'text/latex', 'application/json', 'application/javascript']
    List of currently active mime-types to display. You can use this to set a
    white-list for formats to display.
    Most users will not need to change this value.


### Matplotlib Integration

In [22]:
%matplotlib

Using matplotlib backend: MacOSX


In [23]:
%matplotlib inline

## Python Language Basics

### Language Semantics

#### Indentation, not braces

The following loop raises a `NameError` because `array` has not been defined (the same would apply to `pivot`, `less`, and `greater`):

In [28]:
for x in array:
    if x < pivot:
        less.append(x)
    else:
        greater.append(x)

NameError: name 'array' is not defined

Semicolons can be used to separate multiple statements on a single line, but this is not very readable:

In [29]:
a = 5; b = 6; c = 7

#### Everything is an object

#### Comments

In [32]:
file_handle = open("scripts/ipython_script_test.py", 'r')
results = []
for line in file_handle:
    # keep the empty lines for now
    # if len(line) == 0:
    #   continue
    results.append(line.replace('foo', 'bar'))
print(results)

['def f(x, y, z):\n', '\treturn (x + y) / z\n', '\n', 'a = 5\n', 'b = 6\n', 'c = 7.5\n', '\n', 'result = f(a, b, c) \n']


In [33]:
print("Reached this line")  # Simple status report

Reached this line


#### Function and object method calls

```python
result = f(x, y, z)
g()
```

```python
obj.some_method(x, y, z)
```

```python
result = f(a, b, c, d=5, e='foo')
```

#### Variables and argument passing

Assigning an existing variable to a new name does not make a copy; both names refer to the same object!

In [35]:
a = [1, 2, 3]

In [36]:
b = a

In [37]:
a.append(4)
b

[1, 2, 3, 4]

In [40]:
def append_element(some_list, element):
    some_list.append(element)

In [38]:
data = [1, 2, 3]

In [41]:
append_element(data, 4)
data

[1, 2, 3, 4]
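
When an independent copy is needed instead of a second reference, the object has to be copied explicitly. The sketch below contrasts plain assignment, a shallow copy via `list`, and a deep copy via the standard library's `copy.deepcopy`; the variable names are chosen for illustration.

```python
import copy

original = [1, 2, [3, 4]]
alias = original                  # second name for the same list
shallow = list(original)          # new outer list, inner list still shared
deep = copy.deepcopy(original)    # fully independent copy

original.append(5)                # visible through alias only
original[2].append(99)            # visible through alias and shallow

print(alias)    # [1, 2, [3, 4, 99], 5]
print(shallow)  # [1, 2, [3, 4, 99]]
print(deep)     # [1, 2, [3, 4]]
```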

#### Dynamic references, strong types

In [42]:
a = 5
type(a)

int

In [43]:
a = 'foo'
type(a)

str

In [44]:
'5' + 5

TypeError: must be str, not int

In [45]:
a = 4.5
b = 2
# String formatting, to be visited later
print('a is {0}, b is {1}'.format(type(a), type(b)))
a / b

a is <class 'float'>, b is <class 'int'>


2.25

In [46]:
a = 5
isinstance(a, int)

True

In [47]:
a = 5; b = 4.5
isinstance(a, (int, float))
isinstance(b, (int, float))

True

#### Attributes and methods

```python
In [1]: a = 'foo'

In [2]: a.<Press Tab>
a.capitalize  a.format      a.isupper     a.rindex      a.strip
a.center      a.index       a.join        a.rjust       a.swapcase
a.count       a.isalnum     a.ljust       a.rpartition  a.title
a.decode      a.isalpha     a.lower       a.rsplit      a.translate
a.encode      a.isdigit     a.lstrip      a.rstrip      a.upper
a.endswith    a.islower     a.partition   a.split       a.zfill
a.expandtabs  a.isspace     a.replace     a.splitlines
a.find        a.istitle     a.rfind       a.startswith
```

In [49]:
a = 'foo'

In [50]:
getattr(a, 'split')

<function str.split>

Methods are functions associated with an object that have access to the object's internal data, whereas attributes are other objects stored "inside" the object.

Another example from the meetup:

In [51]:
#attribute/method example (ignore the part about forming the class if it's confusing):
class Person:
    def __init__(self, name):
        self.name = name    
    def say_my_name(self):
        print("Hi my name is", self.name)

In [52]:
p1 = Person('jenny')
p1.name # <- attribute, not called as a function
# 'jenny'

'jenny'

In [53]:
p1.say_my_name() # <- method, called as a function
# Hi my name is jenny

Hi my name is jenny


#### Duck typing

In [54]:
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError: # not iterable
        return False

In [62]:
isiterable('a string')

True

In [63]:
isiterable([1, 2, 3])

True

In [64]:
isiterable(5)

False

In [60]:
x = 5
isiterable(x)

False

In [61]:
if not isinstance(x, list) and isiterable(x):
    x = list(x)
    
isiterable(x)

False
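
In the cell above nothing is converted because `5` is not iterable. To see the conversion take effect, the same check can be applied to values that are iterable but not lists. This is a small sketch; the wrapper function `as_list` is a hypothetical helper introduced only for this example.

```python
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False


def as_list(x):
    """Convert any non-list iterable to a list; leave everything else as is."""
    if not isinstance(x, list) and isiterable(x):
        x = list(x)
    return x


print(as_list((1, 2, 3)))  # [1, 2, 3]
print(as_list('abc'))      # ['a', 'b', 'c']
print(as_list(5))          # 5
```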

#### Imports

The Python script `some_module.py`, stored in the directory `scripts/`, can be loaded as a module:

In [71]:
# %load scripts/some_module.py
PI = 3.14159

def f(x):
    return x + 2

def g(a, b):
    return a + b

Python only finds the module if its directory is on the module search path (`sys.path`). Using Python's `sys` module, we can add a directory to this path for the current session only; the change is not persisted once Python stops running:

In [76]:
import sys
sys.path.insert(0, 'scripts/')

In [77]:
from scripts import some_module
result = some_module.f(5)
pi = some_module.PI

In [79]:
# this would not work without importing sys and adding the directory with the module to the path
from some_module import f, g, PI
result = g(5, PI)

In [80]:
import some_module as sm
from some_module import PI as pi, g as gf

r1 = sm.f(pi)
r2 = gf(6, pi)
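
The effect of modifying `sys.path` can be tried without touching the `scripts/` directory. The sketch below writes a throwaway module into a temporary directory, adds that directory to the search path, imports the module, and removes the path entry again. The module name `demo_module` and its contents are assumptions made for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

module_source = 'PI = 3.14159\n\ndef f(x):\n    return x + 2\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, 'demo_module.py').write_text(module_source)

    sys.path.insert(0, tmp_dir)          # searched first from now on
    try:
        importlib.invalidate_caches()    # make sure the new file is noticed
        demo_module = importlib.import_module('demo_module')
        value = demo_module.f(5)
    finally:
        sys.path.remove(tmp_dir)         # undo the change to the search path

# The module object stays usable because it is already loaded
print(value)
print(demo_module.PI)
```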

#### Binary operators and comparisons

In [83]:
5 - 7

-2

In [84]:
12 + 21.5

33.5

In [85]:
5 <= 2

False

In [94]:
a = [1, 2, 3]
b = a

In [95]:
# list() always creates a new list, so c is a copy of a and not the same object
c = list(a)

In [97]:
a is b

True

In [98]:
a is not c

True

In [99]:
a == c

True

In [100]:
a = None

In [101]:
a is None

True

#### Mutable and immutable objects

In [102]:
a_list = ['foo', 2, [4, 5]]

In [103]:
a_list[2] = (3, 4)
a_list

['foo', 2, (3, 4)]

Strings and tuples are immutable

In [104]:
a_tuple = (3, 5, (4, 5))
a_tuple[1] = 'four'

TypeError: 'tuple' object does not support item assignment
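
Immutability applies to the tuple itself, not to the objects it contains. As a brief illustration (with values chosen for the example), a list stored inside a tuple can still be modified in place, while replacing the element raises a `TypeError`:

```python
a_tuple = (3, 5, [4, 5])

# The list inside the tuple is mutable and can be changed in place
a_tuple[2].append(6)
print(a_tuple)  # (3, 5, [4, 5, 6])

# Rebinding a position of the tuple is still not allowed
try:
    a_tuple[2] = [7, 8]
    reassigned = True
except TypeError:
    reassigned = False

print(reassigned)  # False
```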

### Scalar Types

#### Numeric types

In [105]:
ival = 17239871
ival ** 6

26254519291092456596965462913230729701102721

In [109]:
fval = 7.243
fval2 = 6.78e-5
fval ** fval2

1.0001342554173493

In [107]:
3 / 2

1.5

In [108]:
3 // 2

1

#### Strings

In [110]:
a = 'one way of writing a string'
b = "another way"

In [None]:
c = """
This is a longer string that
spans multiple lines
"""

In [113]:
c

'\nThis is a longer string that\nspans multiple lines\n'

In [112]:
print(c)


This is a longer string that
spans multiple lines



In [114]:
c.count('\n')

3

In [115]:
a = 'this is a string' # again, strings are not mutable
a[10] = 'f'

TypeError: 'str' object does not support item assignment

In [117]:
b = a.replace('string', 'longer string') # replace returns a new, modified string; a itself is unchanged
b

'this is a longer string'

In [118]:
a

'this is a string'

In [119]:
a = 5.6
s = str(a)
print(s)

5.6


In [121]:
s = 'python'
list(s) # returns a new, mutable list of characters; s itself is unchanged
s[:3]

'pyt'

In [122]:
s = '12\\34'
print(s)

12\34


In [124]:
s = r'this\has\no\special\characters'
print(s)

this\has\no\special\characters


In [125]:
s = 'this\has\no\special\characters'
print(s)

this\has
o\special\characters


In [126]:
a = 'this is the first half '
b = 'and this is the second half'
a + b

'this is the first half and this is the second half'

In [127]:
template = '{0:.2f} {1:s} are worth US${2:d}'

In [128]:
template.format(4.5560, 'Argentine Pesos', 1)

'4.56 Argentine Pesos are worth US$1'
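
The same format specifications (`.2f`, `s`, `d`) also work in f-strings, which are available from Python 3.6 onwards. A short sketch, with the variable names `amount`, `currency`, and `usd` chosen for illustration:

```python
template = '{0:.2f} {1:s} are worth US${2:d}'

amount = 4.5560
currency = 'Argentine Pesos'
usd = 1

# The expression before the colon is evaluated, the part after it is the format spec
message = f'{amount:.2f} {currency:s} are worth US${usd:d}'

print(message)
print(message == template.format(amount, currency, usd))  # True
```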

#### Bytes and Unicode

In [129]:
val = "español"
val

'español'

In [130]:
val_utf8 = val.encode('utf-8')
val_utf8
type(val_utf8)

bytes

In [131]:
val_utf8.decode('utf-8')

'español'

In [133]:
val.encode('latin1')

b'espa\xf1ol'

In [134]:
val.encode('utf-16')

b'\xff\xfee\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [135]:
val.encode('utf-16le')

b'e\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [136]:
bytes_val = b'this is bytes'
bytes_val

b'this is bytes'

In [137]:
decoded = bytes_val.decode('utf8')
decoded  # this is str (Unicode) now

'this is bytes'
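
Bytes have to be decoded with the encoding that produced them. The sketch below, using the same example string, shows that the encoded length depends on the encoding and that decoding Latin-1 bytes as UTF-8 fails:

```python
val = 'español'
utf8_bytes = val.encode('utf-8')
latin1_bytes = val.encode('latin1')

# 'ñ' takes two bytes in UTF-8 but only one in Latin-1
print(len(val), len(utf8_bytes), len(latin1_bytes))  # 7 8 7

# b'\xf1' followed by b'o' is not a valid UTF-8 sequence
try:
    latin1_bytes.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(decoded_ok)                           # False
print(latin1_bytes.decode('latin1') == val)  # True
```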

#### Booleans

In [138]:
True and True
False or True

True

#### Type casting

In [140]:
s = '3.14159'

In [141]:
fval = float(s)

In [142]:
type(fval)

float

In [143]:
int(fval)

3

In [144]:
bool(fval)

True

In [145]:
bool(0)

False
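
A few edge cases of type casting are worth keeping in mind. As a small sketch: `int` does not parse a string containing a decimal point, it truncates floats toward zero, and `bool` treats empty containers and zero as false but any non-empty string as true.

```python
s = '3.14159'

# int() cannot parse a decimal string directly
try:
    int(s)
    direct_ok = True
except ValueError:
    direct_ok = False

print(direct_ok)      # False
print(int(float(s)))  # 3
print(int(-3.9))      # -3 (truncation toward zero, not rounding)

# Non-empty strings are true, even '0'
print(bool(''), bool('0'), bool([]), bool(0.0))  # False True False False
```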

#### None

In [146]:
a = None

In [147]:
a is None

True

In [148]:
b = 5

In [149]:
b is not None

True

In [150]:
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result
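
The function above can be called with or without the third argument. The usage sketch below also illustrates why the check is written as `c is not None` instead of a plain truth test: with `if c:`, passing `c=0` would be silently ignored.

```python
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result


print(add_and_maybe_multiply(2, 3))     # 5
print(add_and_maybe_multiply(2, 3, 4))  # 20
print(add_and_maybe_multiply(2, 3, 0))  # 0, because 0 is not None
```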

In [151]:
type(None)

NoneType

#### Dates and times

In [154]:
from datetime import datetime, date, time
dt = datetime(2011, 10, 29, 20, 30, 21)
dt.day

29

In [155]:
dt.minute

30

In [156]:
dt.date()

datetime.date(2011, 10, 29)

In [157]:
dt.time()

datetime.time(20, 30, 21)

In [158]:
dt.strftime('%m/%d/%Y %H:%M')

'10/29/2011 20:30'

In [159]:
datetime.strptime('20091031', '%Y%m%d')

datetime.datetime(2009, 10, 31, 0, 0)

In [160]:
dt.replace(minute=0, second=0)

datetime.datetime(2011, 10, 29, 20, 0)

In [161]:
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt
delta

datetime.timedelta(17, 7179)

In [162]:
type(delta)

datetime.timedelta

In [163]:
dt
dt + delta

datetime.datetime(2011, 11, 15, 22, 30)
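
A `timedelta` stores the difference as days and seconds, which is what the two numbers in its representation above stand for. A short sketch using the same two timestamps:

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt

print(delta.days)             # 17
print(delta.seconds)          # 7179
print(delta.total_seconds())  # 1475979.0

# A timedelta can also be constructed directly and subtracted from a datetime
print(delta == timedelta(days=17, seconds=7179))  # True
print(dt2 - delta == dt)                          # True
```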

### Control Flow

#### if, elif, and else

In [167]:
x = -2
if x < 0:
    print("It's negative")

It's negative


In [168]:
if x < 0:
    print("It's negative")
elif x == 0:
    print('Equal to zero')
elif 0 < x < 5:
    print('Positive but smaller than 5')
else:
    print('Positive and larger than or equal to 5')

It's negative


In [169]:
a = 5; b = 7
c = 8; d = 4
if a < b or c > d:
    print('Made it')

Made it


In [170]:
4 > 3 > 2 > 1

True

#### for loops

```python
for value in collection:
    # do something with value
```

In [173]:
sequence = [1, 2, None, 4, None, 5]
total = 0
for value in sequence:
    if value is None:
        continue
    total += value
    
print(total)

12


In [174]:
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        break
    total_until_5 += value
    
total_until_5

13

```python
for a, b, c in iterator:
    # do something
```

In [175]:
for i in range(4):
    for j in range(4):
        if j > i:
            break
        print((i, j))

(0, 0)
(1, 0)
(1, 1)
(2, 0)
(2, 1)
(2, 2)
(3, 0)
(3, 1)
(3, 2)
(3, 3)


#### while loops

In [176]:
x = 256
total = 0
while x > 0:
    if total > 500:
        break
    total += x
    x = x // 2

#### pass

In [177]:
if x < 0:
    print('negative!')
elif x == 0:
    # TODO: put something smart here
    pass
else:
    print('positive!')

positive!


#### range

In [180]:
range(10)

range(0, 10)

In [181]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [182]:
list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [183]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]

In [185]:
seq = [1, 2, 3, 4]
for i in range(len(seq)):
    val = seq[i]

print(val)

4
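
When both the index and the value are needed, the built-in `enumerate` is a common alternative to `range(len(seq))`. A minimal sketch, with the list `pairs` added only to collect the results:

```python
seq = [1, 2, 3, 4]

pairs = []
for i, val in enumerate(seq):
    # i is the position, val is seq[i]
    pairs.append((i, val))

print(pairs)  # [(0, 1), (1, 2), (2, 3), (3, 4)]
```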


In [187]:
total = 0  # avoid naming this `sum`, which would shadow the built-in function
for i in range(100000):
    # % is the modulo operator
    if i % 3 == 0 or i % 5 == 0:
        total += i

print(total)

2333316668


#### Ternary expressions

It is tempting to condense code with ternary expressions, but readability suffers when the condition or the value expressions become complex!

In [188]:
x = 5
'Non-negative' if x >= 0 else 'Negative'

'Non-negative'
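
For comparison, here is the same logic written as a regular `if`/`else` block and as a ternary expression. The function names `describe_block` and `describe_ternary` are hypothetical and only serve to put the two forms side by side:

```python
def describe_block(x):
    # Explicit form: more lines, but each branch stands on its own
    if x >= 0:
        return 'Non-negative'
    else:
        return 'Negative'


def describe_ternary(x):
    # Condensed form: value_if_true if condition else value_if_false
    return 'Non-negative' if x >= 0 else 'Negative'


for x in (-2, 0, 5):
    print(x, describe_block(x), describe_ternary(x))
```

Both functions return the same result for every input; only the false branch or the true branch of the ternary expression is evaluated, never both.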