# Python Language Basics, IPython, and Jupyter Notebooks

In [7]:
import numpy as np
np.random.seed(12345)
np.set_printoptions(precision=4, suppress=True)

## IPython Basics

### Running the IPython Shell

<mark>Comprehension: a dict comprehension quickly builds a dictionary, e.g. {i: random_value for i in ...}</mark>

In [1]:
import numpy as np
data = {i : np.random.randn() for i in range(7)}
data

{0: -0.9162493812635564,
 1: -0.4354767170376472,
 2: -1.0334032105090856,
 3: 0.4398732556110766,
 4: 0.7853700400212276,
 5: 1.2837787638092049,
 6: 0.8719533865964082}

In [None]:
print(data)

### Tab Completion

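> 1. Pressing Tab after typing part of a name completes it.
> 2. Pressing Tab after `object.` lists its methods and attributes.
> 3. Tab also completes function keyword arguments.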

```python
In [1]: an_apple = 27

In [2]: an_example = 42

In [3]: an
```

```python
In [3]: b = [1, 2, 3]

In [4]: b.
```

```python
In [1]: import datetime

In [2]: datetime.
```

```python
In [7]: datasets/movielens/
```

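<mark>Note that, by default, IPython hides methods and attributes starting with underscores, such as magic methods and internal "private" methods and attributes, to avoid cluttering the display (and confusing novice users!). These too can be tab-completed, but you must first type an underscore to see them.</mark>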
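
The split between public and underscore-prefixed names can also be inspected without tab completion, using the built-in `dir`. This is a minimal sketch. The variable names are illustrative, and the filtering only approximates what IPython shows by default:

```python
b = [1, 2, 3]

# dir() lists every attribute name, including underscore-prefixed ones
all_names = dir(b)

# Roughly what appears when you press Tab after "b."
public_names = [name for name in all_names if not name.startswith('_')]

# Names that appear only after you type "b._" and press Tab
hidden_names = [name for name in all_names if name.startswith('_')]

print(public_names)
print(hidden_names[:5])
```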

### Introspection

```python
In [8]: b = [1, 2, 3]

In [9]: b?
Type:       list
String Form:[1, 2, 3]
Length:     3
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items

In [10]: print?
Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method
```

```python
def add_numbers(a, b):
    """
    Add two numbers together

    Returns
    -------
    the_sum : type of arguments
    """
    return a + b
```

```python
In [11]: add_numbers?
Signature: add_numbers(a, b)
Docstring:
Add two numbers together

Returns
-------
the_sum : type of arguments
File:      <ipython-input-9-6a548a216e27>
Type:      function
```

<mark>`??` also shows the function's source code when it is available</mark>

```python
In [12]: add_numbers??
Signature: add_numbers(a, b)
Source:
def add_numbers(a, b):
    """
    Add two numbers together

    Returns
    -------
    the_sum : type of arguments
    """
    return a + b
File:      <ipython-input-9-6a548a216e27>
Type:      function
```

<mark>`*` acts as a wildcard when searching a namespace with `?`</mark>

In [3]:
?np.load*

np.load
np.loads
np.loadtxt

In [5]:
np.*load*?

np.__loader__
np.load
np.loads
np.loadtxt

### The %run Command

```python
def f(x, y, z):
    return (x + y) / z

a = 5
b = 6
c = 7.5

result = f(a, b, c)
```

```python
In [14]: %run ipython_script_test.py
```

```python
In [15]: c
Out[15]: 7.5

In [16]: result
Out[16]: 1.4666666666666666
```

```python
>>> %load ipython_script_test.py

    def f(x, y, z):
        return (x + y) / z

    a = 5
    b = 6
    c = 7.5

    result = f(a, b, c)
```

<mark>Interrupting running code: press Ctrl-C while the code is running</mark>

### Executing Code from the Clipboard

```python
x = 5
y = 7
if x > 5:
    x += 1

    y = 8
```

```python
In [17]: %paste
x = 5
y = 7
if x > 5:
    x += 1

    y = 8
## -- End pasted text --
```

```python
In [18]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:x = 5
:y = 7
:if x > 5:
:    x += 1
:
:    y = 8
:--
```

### Terminal Keyboard Shortcuts

![Standard IPython keyboard shortcuts.png](attachment:8c8a0171-a9d7-4926-a5f5-153f0d2f826d.png)

### About Magic Commands

```python
In [20]: a = np.random.randn(100, 100)

In [21]: %timeit np.dot(a, a)
10000 loops, best of 3: 20.9 µs per loop
```

<mark>Timing a statement with the magic command %timeit ...</mark>

In [6]:
a = np.random.randn(100, 100)

%timeit np.dot(a, a)

224 µs ± 69.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
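
`%timeit` is an IPython magic, so it is not available in a plain Python script. A rough equivalent can be sketched with the standard-library `timeit` module. The workload `square_sum` is a hypothetical stand-in, not code from the original:

```python
import timeit

def square_sum(n=1000):
    """Toy workload: sum of squares below n."""
    return sum(i * i for i in range(n))

# Take 5 measurements, each running the workload 100 times
timings = timeit.repeat(square_sum, repeat=5, number=100)

# Report the best per-call time in microseconds
best_us = min(timings) / 100 * 1e6
print(f'best of 5: {best_us:.1f} µs per loop')
```

The absolute numbers depend on the machine, which is why the timings in this section differ between runs.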


In [8]:
a

array([[-1.42215132, -0.15226025, -0.41619018, ...,  0.25212144,
        -0.3390018 , -1.50348787],
       [-1.00767352, -0.12289105, -0.03063463, ..., -1.46499677,
        -0.1875935 , -0.78000066],
       [ 1.46702925,  0.39419305,  0.2841049 , ..., -0.35517461,
        -0.8324227 ,  1.3014608 ],
       ...,
       [ 1.69093506,  1.24690235, -0.73674753, ...,  0.38176916,
         0.4561317 ,  0.05312409],
       [-0.44663617, -0.7785484 ,  1.71368396, ..., -0.57286395,
         1.02842187, -0.02136854],
       [ 0.69698105,  2.48811466, -0.48396955, ...,  0.15460071,
         1.34379542,  0.37619965]])

> Some magic functions behave like Python functions, and their output can be assigned to a variable

```python
In [22]: %pwd
Out[22]: '/home/wesm/code/pydata-book'

In [23]: foo = %pwd

In [24]: foo
Out[24]: '/home/wesm/code/pydata-book'
```

![Table 2-2. Some frequently used IPython magic commands.png](attachment:86468e79-e74b-432b-8a7e-115264fbe972.png)

### Matplotlib Integration

```python
In [26]: %matplotlib
Using matplotlib backend: Qt4Agg
```

> Inline plotting (in Jupyter notebooks)

```python
In [26]: %matplotlib inline
```

## Python Language Basics

### Language Semantics

#### Indentation, not braces

```python
for x in array:
    if x < pivot:
        less.append(x)
    else:
        greater.append(x)
```

```python
a = 5; b = 6; c = 7
```

#### Everything is an object

#### Comments

```python
results = []
for line in file_handle:
    # keep the empty lines for now
    # if len(line) == 0:
    #   continue
    results.append(line.replace('foo', 'bar'))
```

```python
print("Reached this line")  # Simple status report
```

#### Function and object method calls

```
result = f(x, y, z)
g()
```

```
obj.some_method(x, y, z)
```

```python
result = f(a, b, c, d=5, e='foo')
```

#### Variables and argument passing

In [5]:
a = [1, 2, 3]

In [6]:
b = a

In [7]:
a.append(4)
b

[1, 2, 3, 4]

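<mark>Assignment only creates a reference: a variable name is simply bound to an object.</mark>

> Understanding the semantics of references in Python, and when, how, and why data is copied, is especially critical when you are working with larger datasets.
>
> Note: assignment is also referred to as binding, because we are binding a name to an object. Variable names that have been assigned are occasionally referred to as bound variables.
>
> When you pass objects as arguments to a function, new local variables are created that reference the original objects, without any copying.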

```python
def append_element(some_list, element):
    some_list.append(element)
```

```python
In [27]: data = [1, 2, 3]

In [28]: append_element(data, 4)

In [29]: data
Out[29]: [1, 2, 3, 4]
```
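
Mutating an argument and rebinding its local name have different effects on the caller. The sketch below contrasts the two. `rebind_list` is a hypothetical function added only for comparison:

```python
def append_element(some_list, element):
    # Mutates the object that the caller also references
    some_list.append(element)

def rebind_list(some_list, element):
    # Rebinds the local name to a new list; the caller's object is untouched
    some_list = some_list + [element]
    return some_list

data = [1, 2, 3]
append_element(data, 4)
print(data)              # [1, 2, 3, 4]

new_data = rebind_list(data, 5)
print(data)              # [1, 2, 3, 4]
print(new_data)          # [1, 2, 3, 4, 5]
print(new_data is data)  # False
```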

#### Dynamic references, strong types

In [8]:
a = 5
type(a)
a = 'foo'
type(a)

str

In [9]:
# '5' + 5

<mark>print(''.format())</mark>

In [10]:
a = 4.5
b = 2
# String formatting, to be visited later
print('a is {0}, b is {1}'.format(type(a), type(b)))
a / b

a is <class 'float'>, b is <class 'int'>


2.25

<mark>isinstance() checks whether an object is an instance of a given type (or of any type in a tuple of types)</mark>

In [13]:
a = 5
isinstance(a, int)

True

In [13]:
a = 5; b = 4.5
isinstance(a, (int, float))
isinstance(b, (int, float))

True

#### Attributes and methods

```python
In [1]: a = 'foo'

In [2]: a.<Press Tab>
a.capitalize  a.format      a.isupper     a.rindex      a.strip
a.center      a.index       a.join        a.rjust       a.swapcase
a.count       a.isalnum     a.ljust       a.rpartition  a.title
a.decode      a.isalpha     a.lower       a.rsplit      a.translate
a.encode      a.isdigit     a.lstrip      a.rstrip      a.upper
a.endswith    a.islower     a.partition   a.split       a.zfill
a.expandtabs  a.isspace     a.replace     a.splitlines
a.find        a.istitle     a.rfind       a.startswith
```

In [14]:
a = 'foo'

In [15]:
getattr(a, 'split')

<function str.split(sep=None, maxsplit=-1)>
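
Looking attributes up by name is mainly useful when the name is only known at runtime. A short sketch using the built-ins `getattr` and `hasattr`, where `'no_such_name'` is a deliberately missing attribute chosen for illustration:

```python
a = 'foo'

# Look up a method by its name, then call it
split_method = getattr(a, 'split')
print(split_method('o'))            # ['f', '', '']

# hasattr reports whether the attribute exists
print(hasattr(a, 'upper'))          # True
print(hasattr(a, 'no_such_name'))   # False

# A third argument to getattr is returned instead of raising AttributeError
fallback = getattr(a, 'no_such_name', None)
print(fallback)                     # None
```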

#### Duck typing

> “If it walks like a duck and quacks like a duck, then it’s a duck.”

<mark>iter(): Get an iterator from an object.</mark>

In [17]:
iter?

Docstring:
iter(iterable) -> iterator
iter(callable, sentinel) -> iterator

Get an iterator from an object.  In the first form, the argument must
supply its own iterator, or be a sequence.
In the second form, the callable is called until it returns the sentinel.
Type:      builtin_function_or_method


In [16]:
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError: # not iterable
        return False

In [16]:
iter('dsjovdf')

<str_iterator at 0x7fad98e76390>

In [18]:
isiterable(3)

False

In [19]:
isiterable('a string')

True

In [20]:
isiterable([1, 2, 3])

True

In [21]:
isiterable(5)

False

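> A common case is writing a function that accepts any kind of sequence (list, tuple, ndarray) or even an iterator. You can first check whether the object is a list (or a NumPy array) and, if it is not, convert it to one: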

```python
if not isinstance(x, list) and isiterable(x):
    x = list(x)
```
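
The check above can be wrapped in a small helper. This is a sketch only. `as_list` is a hypothetical name, and `isiterable` is repeated so the block runs on its own:

```python
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False

def as_list(x):
    """Return x converted to a list when it is a non-list iterable."""
    if not isinstance(x, list) and isiterable(x):
        x = list(x)
    return x

print(as_list((1, 2, 3)))  # [1, 2, 3]
print(as_list(range(3)))   # [0, 1, 2]
print(as_list('ab'))       # ['a', 'b']
print(as_list(5))          # 5 (not iterable, returned unchanged)
```

Note that strings are iterable, so a string is split into characters. Whether that is desirable depends on the function being written.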

#### Imports

```python
# some_module.py
PI = 3.14159

def f(x):
    return x + 2

def g(a, b):
    return a + b
```

```python
import some_module
result = some_module.f(5)
pi = some_module.PI
```

```python
from some_module import f, g, PI
result = g(5, PI)
```

```python
import some_module as sm
from some_module import PI as pi, g as gf

r1 = sm.f(pi)
r2 = gf(6, pi)
```

#### Binary operators and comparisons

In [22]:
5 - 7

-2

In [23]:
12 + 21.5

33.5

In [24]:
5 <= 2

False

<mark>Use `is` to check whether two references point to the same object; this is different from `==`, which compares values</mark>

> True if a and b reference the same Python object

In [18]:
a = [1, 2, 3]
b = a
c = list(a)
a is b

True

In [19]:
a is not c

True

In [20]:
a == c

True

In [28]:
a = None
a is None

True

> Note: there is only one instance of None

In [22]:
a == None

True
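
One reason `is None` is usually preferred over `== None` is that a class can override `==`. A contrived sketch, where `AlwaysEqual` is a hypothetical class written only to show the difference:

```python
class AlwaysEqual:
    """Compares equal to everything, including None."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()

print(obj == None)  # True, because __eq__ says so
print(obj is None)  # False, obj is not the None object
```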

#### Mutable and immutable objects

In [30]:
a_list = ['foo', 2, [4, 5]]
a_list[2] = (3, 4)
a_list

['foo', 2, (3, 4)]

<mark>Example: strings and tuples are immutable</mark>

In [31]:
a_tuple = (3, 5, (4, 5))
# a_tuple[1] = 'four'

> Remember that just because you can mutate an object does not mean that you always should. Such actions are known as side effects.
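
The commented-out assignments in this section raise `TypeError` when run. The sketch below catches the errors so the messages can be seen. Unlike the cell above, the tuple here holds a list as its last element (an assumption made for illustration) to show that a mutable object inside a tuple can still change:

```python
a_tuple = (3, 5, [4, 5])

try:
    a_tuple[1] = 'four'
except TypeError as exc:
    tuple_error = str(exc)
    print(tuple_error)

# The tuple's slots are fixed, but the list stored in a slot is still mutable
a_tuple[2].append(6)
print(a_tuple)  # (3, 5, [4, 5, 6])

a_string = 'this is a string'
try:
    a_string[10] = 'f'
except TypeError as exc:
    string_error = str(exc)
    print(string_error)
```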

### Scalar Types

![Standard Python scalar types](http://upload-images.jianshu.io/upload_images/7178691-27a30ac3e7d262a1.png?imageMogr2/auto-orient/strip|imageView2/2/w/1240)

#### Numeric types

In [32]:
ival = 17239871
ival ** 6

26254519291092456596965462913230729701102721

In [33]:
fval = 7.243
fval2 = 6.78e-5

In [34]:
3 / 2

1.5

In [35]:
3 // 2

1

#### Strings

a = 'one way of writing a string'
b = "another way"

<mark>Triple quotes allow a string to span multiple lines, including the line breaks</mark>

In [36]:
c = """
This is a longer string that
spans multiple lines
"""

In [37]:
c.count('\n')

3

In [38]:
print(c)


This is a longer string that
spans multiple lines



In [2]:
a = 'this is a string'
# a[10] = 'f'

In [3]:
a.replace?

In [40]:
b = a.replace('string', 'longer string')
b

'this is a longer string'

In [41]:
a

'this is a string'

In [42]:
a = [1,2]
a

[1, 2]

In [43]:
b = a.append(3)
print(b)  # None, because append modifies the list in place and returns nothing

None


In [44]:
#a.append?

In [45]:
a = 5.6
s = str(a)
print(s)

5.6


In [46]:
s = 'python'
list(s)
s[:3]

'pyt'

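> The backslash is an escape character, meaning it is used to specify special characters such as the newline \n or Unicode characters. To write a string literal containing backslashes, you need to escape them: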

In [47]:
s = '12\\34'
print(s)

12\34


In [48]:
s = r'this\has\no\special\characters'
s

'this\\has\\no\\special\\characters'

In [49]:
s = 'this\\has\\no\\special\\characters'
s

'this\\has\\no\\special\\characters'
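
The two spellings above produce the same string; only the literal syntax differs. A quick check, with variable names chosen for illustration:

```python
raw = r'this\has\no\special\characters'
escaped = 'this\\has\\no\\special\\characters'

print(raw == escaped)  # True
print(raw)             # this\has\no\special\characters

# Each escaped backslash counts as a single character
print(len('12\\34'))   # 5
```

The doubled backslashes in the notebook output come from `repr`, which shows how the string would be written as a literal.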

In [50]:
a = 'this is the first half '
b = 'and this is the second half'
a + b

'this is the first half and this is the second half'

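<mark>String formatting</mark>

> In the template string below,
- {0:.2f} formats the first argument as a floating-point number with two decimal places.
- {1:s} formats the second argument as a string.
- {2:d} formats the third argument as an integer.

See Chapter 8 for details.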

In [51]:
template = '{0:.2f} {1:s} are worth US${2:d}'

In [52]:
template.format(4.5560, 'Argentine Pesos', 1)

'4.56 Argentine Pesos are worth US$1'
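
On Python 3.6 and later, the same format specifications can be used inside an f-string. This is a sketch for comparison; the variable names `amount`, `currency`, and `usd` are assumptions added here:

```python
amount = 4.5560
currency = 'Argentine Pesos'
usd = 1

# Positional template, as in the cell above
template = '{0:.2f} {1:s} are worth US${2:d}'
formatted = template.format(amount, currency, usd)

# Equivalent f-string using the same format specs after the colon
as_fstring = f'{amount:.2f} {currency:s} are worth US${usd:d}'

print(formatted)                # 4.56 Argentine Pesos are worth US$1
print(formatted == as_fstring)  # True
```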

#### Bytes and Unicode

<mark>Encoding and decoding</mark>

In [53]:
val = "español"
val

'español'

In [54]:
val_utf8 = val.encode('utf-8')
val_utf8

b'espa\xc3\xb1ol'

In [55]:
type(val_utf8)

bytes

In [56]:
val_utf8.decode('utf-8')

'español'

In [57]:
val.encode('latin1')

b'espa\xf1ol'

In [58]:
val.encode('utf-16')

b'\xff\xfee\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [59]:
val.encode('utf-16le')

b'e\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [60]:
bytes_val = b'this is bytes'
bytes_val

b'this is bytes'

In [61]:
decoded = bytes_val.decode('utf8')
decoded  # this is str (Unicode) now

'this is bytes'
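
Encoding and decoding are inverse operations only when the same codec is used, and not every codec can represent every character. A small sketch of both points:

```python
val = 'español'

# Round trip through UTF-8 returns the original string
round_trip = val.encode('utf-8').decode('utf-8')
print(round_trip == val)                   # True

# 'ñ' takes two bytes in UTF-8, so the byte length exceeds the character count
print(len(val), len(val.encode('utf-8')))  # 7 8

# ASCII has no code for 'ñ', so encoding fails
encode_failed = False
try:
    val.encode('ascii')
except UnicodeEncodeError as exc:
    encode_failed = True
    print('cannot encode:', exc.reason)
```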

#### Booleans

In [62]:
True and True

True

In [63]:
True and False

False

In [64]:
False or True

True

#### Type casting

In [65]:
s = '3.14159'

In [66]:
fval = float(s)

In [67]:
type(fval)

float

In [68]:
int(fval)

3

In [69]:
bool(fval)

True

In [70]:
bool(0)

False

#### None

In [71]:
a = None
a is None

True

In [72]:
b = 5
b is not None

True

```python
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result
```
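
A usage sketch for the function above, repeated here so the block runs on its own. Testing `c is not None` rather than the truthiness of `c` means that a multiplier of `0` is still applied:

```python
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result

print(add_and_maybe_multiply(2, 3))     # 5
print(add_and_maybe_multiply(2, 3, 4))  # 20
print(add_and_maybe_multiply(2, 3, 0))  # 0
```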

<mark>None is the only instance of NoneType</mark>

In [73]:
type(None)

NoneType

#### Dates and times

In [74]:
from datetime import datetime, date, time
dt = datetime(2011, 10, 29, 20, 30, 21)
dt.day

29

In [75]:
dt.minute

30

In [76]:
dt.date()

datetime.date(2011, 10, 29)

In [77]:
dt.time()

datetime.time(20, 30, 21)

> The strftime method formats a datetime as a string; strptime parses a string into a datetime:

In [78]:
dt.strftime('%m/%d/%Y %H:%M')

'10/29/2011 20:30'

In [79]:
datetime.strptime('20091031', '%Y%m%d')

datetime.datetime(2009, 10, 31, 0, 0)

In [80]:
dt.replace(minute=0, second=0)

datetime.datetime(2011, 10, 29, 20, 0)

In [81]:
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt
delta

datetime.timedelta(days=17, seconds=7179)

In [82]:
type(delta)

datetime.timedelta

In [83]:
dt

datetime.datetime(2011, 10, 29, 20, 30, 21)

In [84]:
dt + delta

datetime.datetime(2011, 11, 15, 22, 30)
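
A `timedelta` stores its value as days, seconds, and microseconds. The sketch below rebuilds the difference computed above and inspects those parts. The one-hour offset at the end is an extra illustration, not from the original:

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt

print(delta.days, delta.seconds)  # 17 7179
print(delta.total_seconds())      # 1475979.0

# A timedelta can also be constructed directly and compared
print(delta == timedelta(days=17, seconds=7179))  # True

# Adding a timedelta produces a new, shifted datetime
one_hour_later = dt + timedelta(hours=1)
print(one_hour_later)             # 2011-10-29 21:30:21
```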

![Datetime format specification (ISO C89 compatible)](http://upload-images.jianshu.io/upload_images/7178691-100f9a20c1536553.png?imageMogr2/auto-orient/strip|imageView2/2/w/1240)

### Control Flow

#### if, elif, and else

```python
if x < 0:
    print("It's negative")
```

```python
if x < 0:
    print("It's negative")
elif x == 0:
    print('Equal to zero')
elif 0 < x < 5:
    print('Positive but smaller than 5')
else:
    print('Positive and larger than or equal to 5')
```

In [85]:
a = 5; b = 7
c = 8; d = 4
if a < b or c > d:
    print('Made it')

Made it


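<mark>Compound conditions are evaluated left to right and short-circuit: because `a < b` is True, `c > d` is never evaluated</mark>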
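
Short-circuit evaluation can be observed directly by recording which conditions actually run. A small sketch, where `check` and `calls` are hypothetical helpers added for illustration:

```python
calls = []

def check(label, value):
    # Record that this condition was evaluated, then pass the value through
    calls.append(label)
    return value

a = 5; b = 7
c = 8; d = 4

# The second call is skipped entirely because the first operand is already True
if check('a < b', a < b) or check('c > d', c > d):
    print('Made it')

print(calls)  # ['a < b']
```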

In [86]:
4 > 3 > 2 > 1

True

#### for loops

```
for value in collection:
    # do something with value
```

<mark>continue skips the rest of the current iteration and moves on to the next value</mark>

```python
sequence = [1, 2, None, 4, None, 5]
total = 0
for value in sequence:
    if value is None:
        continue
    total += value
```

<mark>break exits the loop entirely; no further iterations run</mark>

```python
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        break
    total_until_5 += value
```

In [87]:
for i in range(4):
    for j in range(4):
        if j > i:
            break
        print((i, j))

(0, 0)
(1, 0)
(1, 1)
(2, 0)
(2, 1)
(2, 2)
(3, 0)
(3, 1)
(3, 2)
(3, 3)


for a, b, c in iterator:
    # do something
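
When each element of the collection is a tuple or list, it can be unpacked into variables in the `for` statement. A runnable sketch, where `records` is made-up data for illustration:

```python
records = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

row_sums = []
for a, b, c in records:
    # Each 3-tuple is unpacked into a, b, and c
    row_sums.append(a + b + c)

print(row_sums)  # [6, 15, 24]
```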

#### while loops

```python
x = 256
total = 0
while x > 0:
    if total > 500:
        break
    total += x
    x = x // 2
```

In [88]:
x = 256
total = 0
while x > 0:
    if total > 500:
        break
    total += x
    x = x // 2
print(total, x)

504 4


In [89]:
x = 256
total = 0
while (x > 0 and total <= 500):
    total += x
    x = x // 2
print(total, x)

504 4


#### pass

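> pass is the "no-op" statement in Python. It can be used in blocks where no action is to be taken (or as a placeholder for code not yet implemented); it is required only because Python uses whitespace to delimit blocks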

```python
if x < 0:
    print('negative!')
elif x == 0:
    # TODO: put something smart here
    pass
else:
    print('positive!')
```

#### range

<mark>range returns a lazy range object that yields evenly spaced integers on demand, not a list</mark>

In [90]:
range(10)

range(0, 10)

In [91]:
type(range(10))

range

In [92]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [93]:
list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [94]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]

<mark>A common use of range is iterating through a sequence by index; even a very large range uses little memory</mark>

```python
seq = [1, 2, 3, 4]
for i in range(len(seq)):
    val = seq[i]
```

```python
total = 0  # avoid the name "sum", which would shadow the built-in function
for i in range(100000):
    # % is the modulo operator
    if i % 3 == 0 or i % 5 == 0:
        total += i
```
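
The memory claim can be checked with `sys.getsizeof`, which reports the size of the container object itself. This is a sketch: exact byte counts vary by platform and Python version, so only the comparison is shown. The last lines recompute the sum of multiples of 3 or 5 below 100,000 with a generator expression:

```python
import sys

big_range = range(1000000)
big_list = list(big_range)

# The range object stores only start, stop, and step
range_is_smaller = sys.getsizeof(big_range) < sys.getsizeof(big_list)
print(range_is_smaller)  # True

# Same computation as the loop above, written as a generator expression
total = sum(i for i in range(100000) if i % 3 == 0 or i % 5 == 0)
print(total)             # 2333316668
```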

#### Ternary expressions

```python
value = true-expr if condition else false-expr
```


In [95]:
x = 5
'Non-negative' if x >= 0 else 'Negative'

'Non-negative'

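> As with if-else blocks, only one of the expressions will be evaluated. The "if" and "else" sides of a ternary expression can therefore contain costly computations, but only the branch selected by the condition is ever run.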
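
That only one branch runs can be confirmed by recording which side is evaluated. A minimal sketch, where `costly` and `evaluated` are hypothetical names standing in for an expensive computation:

```python
evaluated = []

def costly(label):
    # Stand-in for an expensive computation; records that it ran
    evaluated.append(label)
    return label

x = 5
result = costly('Non-negative') if x >= 0 else costly('Negative')

print(result)     # Non-negative
print(evaluated)  # ['Non-negative']
```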