# Mastering Python
## 2. Pythonic Syntax, Common Pitfalls, and Style Guide

## Index

<H3>
- A. Code style – or what is Pythonic code? 
  - Formatting strings – printf-style or str.format? 
  - PEP20, the Zen of Python
  - Explaining PEP8 
<br>
<br>
<br>
- B. Common pitfalls
  - Scope matters!
  - Overwriting and/or creating extra built-ins
  - Modifying while iterating
  - Catching exceptions – differences between Python 2 and 3
  - Late binding – be careful with closures
  - Circular imports
  - Import collisions

------------------------------------------

## A. Code style – or what is Pythonic code?: Python style efficient code

- Clean 
- Simple 
- Beautiful 
- Explicit 
- Readable
<br>
<br>
<br>
<br>


---------------------------

## PEP: Python Enhancement Proposal

- PEP 8: Style Guide for Python Code
- PEP 20: The Zen of Python

-------

## PEP 20, the Zen of Python

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


---------------------------------

### PEP 20 1. Beautiful is better than ugly

In short, instead of a somewhat complex function such as this:

In [3]:
def filter_modulo(items, modulo):
    output_items = []
    for i in range(len(items)):
        if items[i] % modulo:
            output_items.append(items[i])
    return output_items

Or this:

In [5]:
filter_modulo = lambda i, m: [i[j] for j in range(len(i)) if i[j] % m]

Just do the following:

In [8]:
def filter_modulo(items, modulo):
    for item in items:
        if item % modulo:
            yield item
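
Because the last version is a generator, it produces items lazily. A minimal usage sketch (the input range is an arbitrary illustration, not from the original notes):

```python
def filter_modulo(items, modulo):
    # Yield only the items that are not evenly divisible by modulo
    for item in items:
        if item % modulo:
            yield item


# The generator is lazy; list() consumes it and collects the results
odd_numbers = list(filter_modulo(range(10), 2))
print(odd_numbers)  # [1, 3, 5, 7, 9]
```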

----------------------------------

### PEP 20 2. Explicit is better than implicit

When names are pulled in from modules with * as below, you cannot tell which module the function being called belongs to.

In [None]:
from spam import *
from eggs import *
some_function()

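Naming the module explicitly makes it clear which function is being called. <br>
(As in the example above, when both modules define a function with the same name, you may not get the function from the module you expected.)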

In [None]:
# better use
import spam
import eggs
spam.some_function()
eggs.some_function()
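
The spam and eggs modules above are placeholders. The same effect can be seen with two standard-library modules that both define sqrt. The sketch below runs the star imports inside a scratch namespace dict via exec; that detour is only an assumption of this example, used so the demonstration does not pollute the current namespace:

```python
import cmath
import math

# Run both star imports in a scratch namespace; the later import wins
namespace = {}
exec('from math import *', namespace)
exec('from cmath import *', namespace)

# sqrt now silently refers to cmath.sqrt, not math.sqrt
print(namespace['sqrt'] is cmath.sqrt)  # True
print(namespace['sqrt'](-1))            # 1j

# With explicit module names there is nothing to guess
print(math.sqrt(4))    # 2.0
print(cmath.sqrt(-1))  # 1j
```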

<br>
<br>
<br>
<br>

--------------

### PEP 20 3. Simple is better than complex

_"Simple is better than complex. Complex is better than complicated."_

The example below stores data through SQLite. <br>
Replacing it with pickle cuts the code down to three lines.

In [None]:
import sqlite3
connection = sqlite3.connect('database.sqlite')
cursor = connection.cursor()
cursor.execute('CREATE TABLE data (key text, value text)')
cursor.execute('''INSERT INTO data VALUES ('key', 'value')''')
connection.commit()
connection.close()

<h3> vs.</h3>

In [None]:
# simpler
import pickle # Or json/yaml
data = {'key': 'value'}  # example payload (assumed; mirrors the SQLite row above)
with open('data.pickle', 'wb') as fh:
    pickle.dump(data, fh, pickle.HIGHEST_PROTOCOL)
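
To check that nothing is lost, a round trip can be sketched as follows. The payload and the temporary directory are assumptions made for illustration:

```python
import os
import pickle
import tempfile

data = {'key': 'value'}  # hypothetical payload

# Write to a temporary directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.pickle')
    with open(path, 'wb') as fh:
        pickle.dump(data, fh, pickle.HIGHEST_PROTOCOL)
    with open(path, 'rb') as fh:
        restored = pickle.load(fh)

print(restored == data)  # True
```

Note that pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.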

--------------

### PEP 20 4. Flat is better than nested

Nested code quickly becomes hard to read, so avoid it. <br>
There is no hard rule, but splitting the code up wherever it can be split is recommended.

In [9]:
# nested
def print_matrices(matrices):
    for matrix in matrices:
        print('Matrix:')
        for row in matrix:
            for col in row:
                print(col, end='')
            print()
        print()

In [None]:
# flatter version
def print_row(row):
    for col in row:
        print(col, end='')

def print_matrix(matrix):
    for row in matrix:
        print_row(row)
        print()

def print_matrices(matrices):
    for matrix in matrices:
        print('Matrix:')
        print_matrix(matrix)
        print()

--------------

### PEP 20 5. Sparse is better than dense

_"Whitespace is generally a good thing"_ <br>
Spreading code over several lines takes more space, but it is worth doing for readability.

In [23]:
# dense code example
>>> def make_eggs(a,b):'while',['technically'];print('correct');\
... {'this':'is','highly':'unreadable'};print(1-a+b**4/2**2)
...
>>> make_eggs(1,2)

correct
4.0


In [22]:
# sparse code example
>>> def make_eggs(a, b):
...     'while', ['technically']
...     print('correct')
...     {'this': 'is', 'highly': 'unreadable'}
...     print(1 - a + ((b ** 4) / (2 ** 2)))
...
>>> make_eggs(1, 2)

correct
4.0


---------

### PEP 20 6. Readability Counts

Shorter does not always mean easier to read:

In [24]:
from functools import reduce  # reduce is not a built-in in Python 3
fib=lambda n:reduce(lambda x,y:(x[0]+x[1],x[0]),[(1,1)]*(n-2))[0]

vs.

In [28]:
# easy to read
def fib(n):
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
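
The generator version never terminates on its own, so the caller decides how many values to take. A minimal sketch using itertools.islice (the unused n parameter is dropped here, which is an assumption of this example):

```python
from itertools import islice


def fib():
    # Infinite Fibonacci generator; the caller limits how much is consumed
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


first_ten = list(islice(fib(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```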

------

### PEP 20 7. Practicality beats purity

_"Special cases aren't special enough to break the rules. Although practicality
beats purity."_

To prevent long lines, imports can be made shorter by using a few methods, adding
a backslash, adding parentheses, or just shortening the imports:

In [None]:
from spam.eggs.foo.bar import spam, eggs, extra_spam, extra_eggs, extra_stuff
from spam.eggs.foo.bar import spam, eggs, extra_spam, extra_eggs

This case can easily be avoided by just following PEP8 (one import per line)

In [None]:
from spam.eggs.foo.bar import spam
from spam.eggs.foo.bar import eggs
from spam.eggs.foo.bar import extra_spam
from spam.eggs.foo.bar import extra_eggs

In [None]:
# worse
from spam_eggs_and_some_extra_spam_stuff import my_spam_and_eggs_stuff_which_is_too_long_for_a_line

# better
from spam_eggs_and_some_extra_spam_stuff \
import my_spam_and_eggs_stuff_which_is_too_long_for_a_line
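
The text above also mentions parentheses as a way to shorten lines. Since the spam modules are placeholders, this sketch uses the standard-library collections module instead; the names used after the import are only there to show that it works:

```python
# Parentheses let a long import span several lines without backslashes
from collections import (
    OrderedDict,
    defaultdict,
    namedtuple,
)

Point = namedtuple('Point', ['x', 'y'])
counts = defaultdict(int)
counts['spam'] += 1
ordered = OrderedDict(a=1)
print(Point(1, 2).x, counts['spam'], ordered['a'])  # 1 1 1
```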

------

### PEP 20 8. Errors should never pass silently

_"Errors should never pass silently. Unless explicitly silenced."_

In [None]:
import logging

# Bad: a bare except silently swallows every error:
try:
    value = int(user_input)
except:
    pass

# If you really need to catch all errors, be very explicit about it:
try:
    value = int(user_input)
except Exception as e:
    logging.warning('Uncaught exception %r', e)

# Or even better, catch it specifically and add a sane default:
try:
    value = int(user_input)
except ValueError:
    value = 0

The problem is actually even more complicated. What about blocks of code that
depend on whatever is happening within the exception? For example, consider the
following code block:

In [None]:
# Problematic: the try block covers more than the conversion itself:
try:
    value = int(user_input)
    value = do_some_processing(value)
    value = do_some_other_processing(value)
except ValueError:
    value = 0

If ValueError is raised, which line is causing it? Is it int(user_input), do_some_processing(value), or do_some_other_processing(value)? <br>
With silent catching of the error, there is no way to know when regularly executing the code, and this can be quite dangerous. <br>
If the processing inside the other functions changes for some reason, handling exceptions this way becomes a problem. <br>
So, unless it was actually intended to behave like that, use this instead:

In [None]:
# instead
try:
    value = int(user_input)
except ValueError:
    value = 0
else:
    value = do_some_processing(value)
    value = do_some_other_processing(value)
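
The difference becomes visible when the processing step itself raises ValueError. In this sketch, do_some_processing is a hypothetical stand-in with a deliberate bug, and parse is a helper name introduced only for the illustration:

```python
def do_some_processing(value):
    # Hypothetical processing step that has a bug of its own
    raise ValueError('bug inside processing')


def parse(user_input):
    try:
        value = int(user_input)
    except ValueError:
        # Only a failed conversion falls back to the default
        value = 0
    else:
        # Errors raised here are NOT caught by the except clause above
        value = do_some_processing(value)
    return value


print(parse('abc'))  # 0

try:
    parse('42')
except ValueError as error:
    print('Processing error surfaced:', error)
```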

-------------------------

### PEP 20 9. In the face of ambiguity, refuse the temptation to guess

Ambiguity should generally be avoided, so guessing can be avoided. Clear and
unambiguous code generates fewer bugs. A useful case where ambiguity is likely is
function calling. <br> Take, for example, the following two function calls:

In [None]:
spam(1, 2, 3, 4, 5)
spam(spam=1, eggs=2, a=3, b=4, c=5)

They could be the same, but they might also not be. It's impossible to say without
seeing the function. <br> If the function were implemented in the following way, the
results would be vastly different between the two:

In [63]:
def spam(a=0, b=0, c=0, d=0, e=0, spam=1, eggs=2):
    print('a:{} b:{} c:{} d:{} e:{} spam:{} eggs:{}'.format(a, b, c, d, e, spam, eggs))

In [64]:
spam(1, 2, 3, 4, 5)
spam(spam=1, eggs=2, a=3, b=4, c=5)

a:1 b:2 c:3 d:4 e:5 spam:1 eggs:2
a:3 b:4 c:5 d:0 e:0 spam:1 eggs:2
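
One way to remove this ambiguity at the call site is to make the parameters keyword-only with a bare `*`. This is a sketch of that option, not something the original function does:

```python
def spam(*, a=0, b=0, c=0, d=0, e=0, spam=1, eggs=2):
    # The bare * makes every parameter keyword-only
    return 'a:{} b:{} c:{} d:{} e:{} spam:{} eggs:{}'.format(
        a, b, c, d, e, spam, eggs)


print(spam(spam=1, eggs=2, a=3, b=4, c=5))
# a:3 b:4 c:5 d:0 e:0 spam:1 eggs:2

try:
    spam(1, 2, 3, 4, 5)
except TypeError as error:
    # Positional calls are rejected, so the caller has to be explicit
    print('Rejected:', error)
```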


-------

### PEP 20 10. One obvious way to do it

_"There should be one—and preferably only one—obvious way to do it. Although
that way may not be obvious at first unless you're Dutch."_

------

### PEP 20 11. Now is better than never

_"Now is better than never. Although never is often better than *right* now."_

In [None]:
import warnings
warnings.warn('Something deprecated', DeprecationWarning)
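
A deprecation warning lets a change be announced now while the removal is postponed. A minimal sketch, where old_spam and new_spam are hypothetical function names:

```python
import warnings


def new_spam():
    return 'spam'


def old_spam():
    # Hypothetical deprecated alias: keeps working, but tells callers to migrate
    warnings.warn('old_spam() is deprecated, use new_spam()',
                  DeprecationWarning, stacklevel=2)
    return new_spam()


# DeprecationWarning is often hidden by default, so record it explicitly
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = old_spam()

print(result)                       # spam
print(caught[0].category.__name__)  # DeprecationWarning
```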

### PEP 20 12. Hard to explain, easy to explain

_"If the implementation is hard to explain, it's a bad idea. If the implementation is
easy to explain, it may be a good idea."_

-------

### PEP 20 13. Namespaces are one honking great idea

_"Namespaces are one honking great idea—let's do more of those!"_

In [None]:
# bad: not easy to figure out
load(fh)

# good (using namespace)
pickle.load(fh)

## Conclusion

- Beautiful
- Readable
- Unambiguous
- Explicit enough
- Not completely void of whitespace

---------------

------------------

## PEP 8, Style Guide for Python Code

### PEP 8 1. Duck typing

_"Duck typing is a method of handling variables by behavior"_
<br>In other words, a scheme in which the language puts off checking an object's type for as long as possible <br>
(the benefit is maximum flexibility for the programmer)

In [69]:
def sumprod(a, b, m):
    return (a + b) * m
 
x = sumprod(1, 2, 3)
l = sumprod([1], [2, 3], 2)
s = sumprod('one ', 'and two, ', 3)
 
print(x)
print(l)
print(s)

9
[1, 2, 3, 1, 2, 3]
one and two, one and two, one and two, 


In [None]:
# Suppose a timestamp is used to generate a filename:
filename = '%s.csv' % timestamp

In [None]:
# To check whether it is actually a date or datetime object
import datetime
if isinstance(timestamp, (datetime.date, datetime.datetime)):
    filename = '%s.csv' % timestamp
else:
    raise TypeError(
        'Timestamp %r should be date(time) object, got %s'
        % (timestamp, type(timestamp)))

In Python there is often no need for such a check. <br>
Just use the following:

In [None]:
import datetime
timestamp = datetime.date(2000, 10, 5)
filename = '%s.csv' % timestamp
print('Filename from date: %s' % filename)

timestamp = '2000-10-05'
filename = '%s.csv' % timestamp
print('Filename from str: %s' % filename)

In [None]:
# Again, don't use
if isinstance(value, int):
    ...

# Just use
value = int(value)

# And instead of
import io
if isinstance(fh, io.IOBase):
    ...

# Simply use the following
if hasattr(fh, 'read'):
    ...
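
The hasattr check accepts any object that behaves like a readable file, whether or not it inherits from io.IOBase. In this sketch, UpperReader and load are hypothetical names introduced for illustration:

```python
import io


class UpperReader:
    # Hypothetical class: not an io.IOBase subclass, but it has read()
    def __init__(self, text):
        self.text = text

    def read(self):
        return self.text.upper()


def load(fh):
    # Accept anything that behaves like a readable file
    if not hasattr(fh, 'read'):
        raise TypeError('Expected an object with a read() method')
    return fh.read()


print(load(io.StringIO('spam')))  # spam
print(load(UpperReader('eggs')))  # EGGS
```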

--------

### PEP 8 2. Differences between value and identity comparisons

You need to understand the difference between value comparison (==) and identity comparison (is).

In [74]:
a = 200 + 56
b = 256
c = 200 + 57
d = 257

print('%r == %r: %r' % (a, b, a == b))
print('%r is %r: %r' % (a, b, a is b))
print('%r == %r: %r' % (c, d, c == d))
print('%r is %r: %r' % (c, d, c is d))

256 == 256: True
256 is 256: True
257 == 257: True
257 is 257: False
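
The `257 is 257: False` result relies on CPython caching small integers (-5 through 256), an implementation detail that can vary with the Python version and with how the code is compiled. A sketch with lists shows the same distinction without depending on that cache:

```python
first = [1, 2, 3]
second = [1, 2, 3]
alias = first

print(first == second)  # True: same value
print(first is second)  # False: two separate objects
print(first is alias)   # True: two names for one object

# Identity comparison is the right tool for singletons such as None
value = None
print(value is None)  # True
```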


--------

### PEP 8 3. Loops

In [None]:
# More complex than needed
i = 0
while i < len(my_list):
    item = my_list[i]
    do_something(i, item)
    i += 1

In [None]:
# Instead you can do the following
for i, item in enumerate(my_list):
    do_something(i, item)

In [None]:
# Even shorter, but not recommended: it builds a throwaway list just for the side effects
[do_something(i, item) for i, item in enumerate(my_list)]

--------

### PEP 8 4. Maximum line length

Limit all lines to a maximum of 79 characters.

In [None]:
# bad
with open('/path/to/some/file/you/want/to/read') as file_1, \
        open('/path/to/some/file/being/written', 'w') as file_2:
    file_2.write(file_1.read())

In [None]:
# better
filename_1 = '/path/to/some/file/you/want/to/read'
filename_2 = '/path/to/some/file/being/written'
with open(filename_1) as file_1:
    with open(filename_2, 'w') as file_2:
        file_2.write(file_1.read())

--------

## B. Common pitfalls

### 1. Scope matters!

In [21]:
# Function arguments
def spam(key, value, list_=[], dict_={}):
    list_.append(value)
    dict_[key] = value
    
    print('List: %r' % list_)
    print('Dict: %r' % dict_)
    
spam('key 1', 'value 1')
spam('key 2', 'value 2')

# We might expect:
# List: ['value 1']
# Dict: {'key 1': 'value 1'}
# List: ['value 2']
# Dict: {'key 2': 'value 2'}

List: ['value 1']
Dict: {'key 1': 'value 1'}
List: ['value 1', 'value 2']
Dict: {'key 1': 'value 1', 'key 2': 'value 2'}

The reason is that default argument values are evaluated only once, when the function is defined, so list_ and dict_ are actually shared between multiple calls.

In [10]:
def spam(key, value, list_=None, dict_=None):
    if list_ is None:
        list_ = []
    if dict_ is None:
        dict_ = {}
    list_.append(value)
    dict_[key] = value
    
    print('List: %r' % list_)
    print('Dict: %r' % dict_)

spam('key 1', 'value 1')
spam('key 2', 'value 2')

List: ['value 1']
Dict: {'key 1': 'value 1'}
List: ['value 2']
Dict: {'key 2': 'value 2'}
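
The shared default can be inspected directly, because Python stores default values on the function object. A small sketch, where append_to is a hypothetical function used only for this illustration:

```python
def append_to(value, bucket=[]):
    # The default list is created once and stored on the function object
    bucket.append(value)
    return bucket


print(append_to.__defaults__)  # ([],)
append_to(1)
append_to(2)
print(append_to.__defaults__)  # ([1, 2],)
```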


----------

In [9]:
# Class properties

Modifying variables in the global scope: <br>
any assignment to spam inside a function (spam = ..., spam += ...) makes the variable local to that function's scope.

In [7]:
spam = 1
def eggs():
    spam += 1
    print('Spam: %r' % spam)

eggs()

UnboundLocalError: local variable 'spam' referenced before assignment

In [14]:
# It works with a global statement, but this is not recommended
spam = 1
def eggs():
    global spam
    spam += 1
    print('Spam: %r' % spam)

eggs()

Spam: 2


In [9]:
# Overwriting and/or creating extra built-ins

# Don't
list = [1, 2, 3]
# Instead,
list_ = [1, 2, 3]

In [1]:
list = list((1, 2, 3))
list

[1, 2, 3]

In [2]:
list((4, 5, 6))

TypeError: 'list' object is not callable
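
If a built-in name has already been shadowed, the original object is still reachable through the standard builtins module. A sketch using sum (an arbitrary choice for illustration):

```python
import builtins

sum = 10  # Don't do this: the name sum no longer refers to the built-in

try:
    sum([1, 2])
except TypeError as error:
    print('Shadowed:', error)  # 'int' object is not callable

# The original function still lives in the builtins module
print(builtins.sum([1, 2]))  # 3
```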

--------

### 2. Modifying while iterating

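While iterating through mutable objects such as lists, dicts, or sets, you should not modify them. Changing the size of a dict or set during iteration raises a RuntimeError; modifying a list raises no error but silently skips items.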

In [1]:
dict_ = {'spam': 'eggs'}
list_ = ['spam']
set_ = {'spam', 'eggs'}
for key in dict_:
    del dict_[key]
for item in list_:
    list_.remove(item)
for item in set_:
    set_.remove(item)

RuntimeError: dictionary changed size during iteration

This can be avoided by copying the object. The most convenient option is by using
the list function:

In [3]:
dict_ = {'spam': 'eggs'}
list_ = ['spam']
set_ = {'spam', 'eggs'}
for key in list(dict_):
    del dict_[key]
for item in list(list_):
    list_.remove(item)
for item in list(set_):
    set_.remove(item)
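
Lists deserve extra care because removing items during iteration fails silently instead of raising. A minimal sketch with made-up data, followed by the common alternative of building a new list:

```python
# Removing while iterating: the second 2 is skipped, no error is raised
skipped = [1, 2, 2, 3]
for number in skipped:
    if number == 2:
        skipped.remove(number)
print(skipped)  # [1, 2, 3]

# Building a new list avoids the problem entirely
rebuilt = [number for number in [1, 2, 2, 3] if number != 2]
print(rebuilt)  # [1, 3]
```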

--------

### 3. Catching exceptions – differences between Python 2 and 3

In [110]:
# Python 2
value = 'a'
try:
    # do something here
    value = int(value)
except (ValueError, TypeError) as e:
    print('Exception: %r' % e)

Exception: ValueError("invalid literal for int() with base 10: 'a'",)


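Python 3 automatically deletes anything saved with as variable at the end of the except block. <br>
The reason for this is that exceptions in Python 3 contain a "\__traceback\__" attribute; the traceback refers back to the stack frame, so keeping the exception alive would create a reference cycle.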

In [111]:
# Python 3 makes this variable local to the exception scope
exception = None
try:
    value = int(value)
except ValueError as exception:
    try:
        print('We caught an exception: %r' % exception)
    finally:
        del exception

We caught an exception: ValueError("invalid literal for int() with base 10: 'a'",)


In [112]:
def spam(value):
    exception = None
    try:
        value = int(value)
    except ValueError as e:
        exception = e
        print('We caught an exception: %r' % exception)
    
    return exception
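
The automatic deletion can be observed directly: after the except block the bound name is gone, while a copy saved under another name survives. A minimal sketch (the names exc, saved, and name_deleted are illustrative):

```python
saved = None
name_deleted = False

try:
    int('a')
except ValueError as exc:
    saved = exc  # keep a reference under a different name

try:
    exc  # the name bound by "as" no longer exists here
except NameError:
    name_deleted = True

print(name_deleted)             # True
print(type(saved).__name__)     # ValueError
```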

--------

### 4. Late binding – be careful with closures

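_"In programming languages, a closure is a name-binding technique used in languages that support first-class functions. A closure is a record that stores a function together with its environment. It also maps the function's free variables to the values or references they had when the closure was created. Unlike a plain function, a closure copies and stores the values and references of variables from the scope outside its own, and gives access to those captured values."_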

In [120]:
# first_class_function.py
def square_(x):
    return x * x

print(square_(5))

f = square_

print(square_)
print(f)

25
<function square_ at 0x000000000705AB70>
<function square_ at 0x000000000705AB70>


This confirms that both names refer to the same function object at the same address.

In [124]:
# first_class_function.py
def square(x):
    return x * x

f = square

# Call the square function through f(5).
print(f(5))

25


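As shown above, when a programming language supports first-class functions, <br>
a function can not only be assigned to a variable but also be passed to another function as an argument <br>
and be used as a function's return value.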

In [127]:
def logger(msg):

    def log_message():
        print('Log: ', msg)

    return log_message

log_hi = logger('Hi')
print(log_hi)  # Prints the log_message function object.
log_hi()  # Prints "Log:  Hi".

del logger  # Remove the logger object from the global namespace.

# Confirm that the logger object has been removed.
try:
    print(logger)
except NameError:
    print('NameError: logger does not exist.')

log_hi()  # Still prints "Log:  Hi" even after logger has been deleted.

<function logger.<locals>.log_message at 0x0000000006DC6048>
Log:  Hi
NameError: logger does not exist.
Log:  Hi
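
Late binding itself means that a closure looks up its free variables when it is called, not when it is defined. A minimal sketch of the pitfall and one common workaround (binding the current value through a default argument); the variable names are illustrative:

```python
# Every lambda closes over the same variable i, which ends up as 2
late = [lambda: i for i in range(3)]
late_results = [callback() for callback in late]
print(late_results)  # [2, 2, 2]

# A default argument is evaluated at definition time, freezing each value
bound = [lambda i=i: i for i in range(3)]
bound_results = [callback() for callback in bound]
print(bound_results)  # [0, 1, 2]
```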


[Reference] <br>
http://schoolofweb.net/blog/posts/%ED%8C%8C%EC%9D%B4%EC%8D%AC-%ED%8D%BC%EC%8A%A4%ED%8A%B8%ED%81%B4%EB%9E%98%EC%8A%A4-%ED%95%A8%EC%88%98-first-class-function/

--------

### 5. Circular imports

Even though Python is fairly tolerant towards **circular imports**, there are some cases
where you will get errors.

In [None]:
# eggs.py:
from spam import spam
def eggs():
    print('This is eggs')
    spam()

# spam.py:
from eggs import eggs
def spam():
    print('This is spam')
if __name__ == '__main__':
    eggs()

In [None]:
# output:
"""
Running spam.py will result in a circular import error: 
Traceback (most recent call last):
    File "spam.py", line 1, in <module>
        from eggs import eggs
    File "eggs.py", line 1, in <module>
        from spam import spam
    File "spam.py", line 1, in <module>
        from eggs import eggs
    ImportError: cannot import name 'eggs'
"""

In [144]:
# Solution 1: Move the imports inside the functions so that they occur at runtime.

# eggs.py:
def eggs():
    from spam import spam
    print('This is eggs')
    spam()

# spam.py:
def spam():
    from eggs import eggs
    print('This is spam')

if __name__ == '__main__':
    eggs()

In [145]:
# Solution 2: Move the imports below the code that actually uses them

# eggs.py:
def eggs():
    print('This is eggs')
    spam()

from spam import spam

# spam.py:
def spam():
    print('This is spam')

from eggs import eggs

if __name__ == '__main__':
    eggs()

### 6. Import collisions

In [None]:
# stl.py: a local file that wants to use the stl package from the numpy-stl project

import stl  # but this imports the file itself, not the stl package

In [147]:
# Solution 1: within a package, a relative import is generally a better option.
# Instead of writing
import spam
# write
from . import spam

# Solution 2: avoid module names that duplicate the names of installed packages.

------

## Ref.

[1]: An explanation of printable (string) formatting <br>
https://pyformat.info/ <br>
[2]: about-python-coding-convention <br>
https://spoqa.github.io/2012/08/03/about-python-coding-convention.html <br>
[3]: What do different aphorisms in The Zen of Python mean? <br>
https://www.quora.com/What-do-different-aphorisms-in-The-Zen-of-Python-mean <br>
[4]: PEP 20 code example <br>
https://gist.github.com/evandrix/2030615 <br>
[5]: jupyter_contrib_nbextensions <br>
https://github.com/ipython-contrib/jupyter_contrib_nbextensions <br>