# Python Language - Lecture 7

Python3: "A general goal is to reduce feature duplication by removing old ways of doing things" (PEP 3100)

Differences between Python 2.x and 3.x
  - https://docs.python.org/3/whatsnew/3.0.html
  - https://wiki.python.org/moin/Python2orPython3
  - http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html

 - print vs print()
 - UTF and character encoding
 - zip vs zip_longest
 - metaclasses
 - new-style and old-style classes
 - keywords, True and False
 - libraries

## Strings - a reminder

In [None]:
s = "ala ma kota"

In [None]:
s[0]

In [None]:
s[0] = 'b'  # raises TypeError: str objects are immutable

## Character encoding
Unicode is both a consortium and a standard. Its encodings:
  - UTF-8
  - UTF-16 / UCS-2
  - UTF-32 / UCS-4

Other popular encodings:
  - ASCII
  - ISO 8859-2 (latin2)
  - Windows 1250

Concepts:
    - character
    - code (code point)
    - glyph
    - encoding
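
The concepts listed above can be made concrete in a few lines. This is a minimal sketch; the variable names are illustrative and not part of the lecture. One character has one code point, but its byte representation depends on the encoding chosen.

```python
import unicodedata

ch = "ą"                        # a character
code_point = ord(ch)            # its Unicode code point (an integer)
name = unicodedata.name(ch)     # its official Unicode name

# An encoding maps the code point to concrete bytes; the bytes differ per encoding.
encoded = {enc: ch.encode(enc) for enc in ("utf-8", "iso8859-2", "cp1250")}

print(hex(code_point), name)
for enc, raw in encoded.items():
    print(enc, list(raw))
```

The glyph is the only concept not visible here: it is the shape a font draws for the character, so it depends on the terminal or notebook rather than on Python.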

## Applications
### Terminal
 - LANG
This variable determines the locale category for native language, local customs and coded character set in the absence of the LC_ALL and other LC_* (LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME) environment variables. This can be used by applications to determine the language to use for error messages and instructions, collating sequences, date formats, and so forth.
 - LC_ALL
This variable determines the values for all locale categories. The value of the LC_ALL environment variable has precedence over any of the other environment variables starting with LC_ (LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME) and the LANG environment variable.

Examples:
 - export LANG=C  - POSIX locale (ASCII)
 - export LANG=en_US.UTF-8
 - export LANG=pl_PL.UTF-8
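
From inside Python, the effect of these variables can be inspected with the standard library. This is a rough sketch, not a complete picture of locale handling; the printed values depend entirely on the machine it runs on.

```python
import locale
import os
import sys

# Environment variables that influence the character encoding, highest precedence first.
settings = {var: os.environ.get(var) for var in ("LC_ALL", "LC_CTYPE", "LANG")}

preferred = locale.getpreferredencoding(False)      # encoding implied by the locale
filesystem_encoding = sys.getfilesystemencoding()   # encoding used for file names

print(settings)
print(preferred, filesystem_encoding)
```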
### Command-line arguments
### File and directory names

### ASCII
  - 7-bit code (1968, USA)
  - control characters: 0-31 and 127
  - printable characters: 32-126
  - https://pl.wikipedia.org/wiki/ASCII

In [None]:
" ".join(chr(i) for i in range(32,127))

In [None]:
print("a\nb")

In [None]:
print("a\r\nb")

In [None]:
print("a\r\rb")

## The *bytes* class
sequences of 8-bit integers in the range 0 <= x < 256

In [None]:
bytes(10)

In [None]:
help(bytes)

In [None]:
bytes([10])

In [None]:
bytes([65])

In [None]:
tmp = bytes([1,2,3,4])
s = "1234"

In [None]:
type(tmp),type(s)

In [None]:
tmp[1],type(tmp[1])

In [None]:
s[1],type(s[1])

In [None]:
tmp[1:],type(tmp[1:])

In [None]:
list(tmp)[1],type(list(tmp)[1])
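
The limits of a single byte can be checked directly. In this sketch `try_bytes` is a hypothetical helper introduced only for illustration; it returns the error message instead of raising, so that all three cases can be printed side by side.

```python
def try_bytes(values):
    """Return bytes(values), or the ValueError message if a value is out of range."""
    try:
        return bytes(values)
    except ValueError as exc:
        return str(exc)

print(try_bytes([0, 255]))   # both values fit in one byte
print(try_bytes([256]))      # too large for a single byte
print(try_bytes([-1]))       # negative values are rejected as well

data = bytes([1, 2, 3, 4])
first = data[0]              # indexing yields an int
head = data[:1]              # slicing yields bytes
```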

## Windows 1250
  - Microsoft
  - 8-bit
  - alphabets of Central and Eastern Europe (Polish, Czech, etc.)
  - ASCII-compatible

In [None]:
b1 = "agh".encode("cp1250")
b1, list(b1),len(b1)

In [None]:
b2 = "ąę".encode("cp1250")
b2, list(b2),len(b2)

In [None]:
b'\xbf'.decode('cp1250')

## ISO 8859-2 / ISO Latin2
  - standard ISO
  - 8-bit
  - alphabets of Central and Eastern Europe (Polish, Czech, etc.)
  - ASCII-compatible

In [None]:
b1 = "agh".encode("iso8859-2")
b1, list(b1)

In [None]:
b2 = "ąę".encode("iso8859-2")
b2, list(b2)

In [None]:
# range(256) covers every byte value, including 255
iso8859_codes = [bytes([x]).decode("iso8859-2", errors='replace') for x in range(256)]
cp1250_codes = [bytes([x]).decode("cp1250", errors='replace') for x in range(256)]
[(i, it) for i, it in enumerate(zip(iso8859_codes, cp1250_codes)) if it[0] != it[1]]
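
Because the two code pages disagree on some byte values, decoding with the wrong one fails silently. A small sketch of that effect, using the same sample text as the cells above:

```python
text = "ąę"
raw = text.encode("cp1250")

right = raw.decode("cp1250")       # same codec on both sides: round trip
wrong = raw.decode("iso8859-2")    # different codec: no exception, wrong text

print(repr(right), repr(wrong))
print(right == text, wrong == text)
```

No exception is raised in the second case, which is why this kind of mistake usually shows up only as garbled characters on screen.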

## UTF-8
  - Unicode standard
  - variable length (1-4 bytes)
  - ASCII-compatible


In [None]:
b1 = "agh".encode("utf-8")
b1, list(b1)

In [None]:
b2 = "ąę".encode("utf-8")
b2, list(b2)
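
The variable length is easy to observe by encoding characters from different parts of the Unicode range. The sample characters below are an arbitrary choice made for this sketch.

```python
# U+0061 (a), U+0105 (ą), U+20AC (euro sign), U+1F600 (an emoji)
samples = ["\u0061", "\u0105", "\u20ac", "\U0001F600"]

for ch in samples:
    raw = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(raw)} byte(s): {list(raw)}")

lengths = [len(ch.encode("utf-8")) for ch in samples]
```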

## UTF-16
  - Unicode standard
  - variable length (2 or 4 bytes)
  - not ASCII-compatible

In [None]:
b1 = "agh".encode("utf-16")
b1, list(b1)

In [None]:
b2 = "ąę".encode("utf-16")
b2, list(b2)
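
Two details of the output above are worth isolating: the extra two bytes at the start (the byte order mark) and the 4-byte form used for characters outside the Basic Multilingual Plane. A short sketch, using the `utf-16-le` codec to fix the byte order explicitly:

```python
import codecs

with_bom = "a".encode("utf-16")             # byte order mark + one 16-bit unit
no_bom = "a".encode("utf-16-le")            # explicit byte order, no mark
astral = "\U0001F600".encode("utf-16-le")   # outside the BMP: a surrogate pair

has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
print(list(with_bom), list(no_bom), list(astral))
print(has_bom, len(no_bom), len(astral))
```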

## UTF-32 / UCS-4
  - fixed length (4 bytes)

In [None]:
b1 = "agh".encode("utf-32")
b1, list(b1)

In [None]:
b2 = "ąę".encode("utf-32")
b2, list(b2)

## Hint
Software should only work with Unicode strings internally, decoding the input data as soon as possible and encoding the output only at the end.
 - https://docs.python.org/3/howto/unicode.html
 - http://farmdev.com/talks/unicode/
 - http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/
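
The hint above can be sketched as a function with three steps. The function name `shout` and the default encodings are assumptions made for this example only.

```python
def shout(raw, in_encoding="cp1250", out_encoding="utf-8"):
    """Upper-case encoded text: bytes in, bytes out, str in between."""
    text = raw.decode(in_encoding)        # 1. decode as early as possible
    result = text.upper()                 # 2. work on str only
    return result.encode(out_encoding)    # 3. encode as late as possible

incoming = "żółw".encode("cp1250")
outgoing = shout(incoming)
print(outgoing.decode("utf-8"))
```

Keeping all processing in the middle step means the encodings appear only at the boundaries, so changing the input or output format does not touch the logic.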

## Python 3.x
  - Unicode by default (ASCII included) - str
  - a separate type for binary data - bytes
  - internal representation of str:
     - before 3.3 - UCS-2 or UCS-4, depending on the build
     - since 3.3 - PEP 393 - 1 (ASCII/Latin-1), 2 (UCS-2) or 4 (UCS-4) bytes per character
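
The flexible representation can be observed indirectly through `sys.getsizeof`. This sketch assumes CPython 3.3 or later; the numbers are an implementation detail and other interpreters may report something different. The helper `bytes_per_char` is introduced here only for illustration.

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate per-character storage by growing a string by one character."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

widths = {
    "ASCII a": bytes_per_char("a"),
    "BMP U+0105": bytes_per_char("\u0105"),
    "astral U+1F600": bytes_per_char("\U0001F600"),
}
print(widths)
```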

In [None]:
type("AGH")

In [None]:
import sys
print(sys.getdefaultencoding())

In [None]:
type("ąę")

In [None]:
répertoire = "/tmp/records.log"
with open(répertoire, "w") as f:
    f.write("test\n")

In [None]:
%sx cat /tmp/records.log

In [None]:
"\N{GREEK CAPITAL LETTER DELTA}"

In [None]:
"\u0394"

In [None]:
ord('Δ')

In [None]:
chr(916)

Source file encoding declaration:
    #!/usr/bin/env python
    # -*- coding: latin-1 -*-

In [None]:
# text mode: the content is decoded with the platform's default encoding (no detection takes place)
with open('1.txt','r') as f:
    tmp = f.read()
    print(tmp)
    print(len(tmp),type(tmp))

In [None]:
# binary mode: reads raw bytes, no decoding
with open('1.txt','br') as f:
    tmp = f.read()
    print(tmp)
    print(len(tmp),type(tmp))
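
To avoid depending on the default encoding, `open()` accepts an explicit `encoding` argument. The sketch below uses a temporary directory and a made-up file name, `sample.txt`, so that it does not rely on the lecture's `1.txt`.

```python
import os
import tempfile

text = "zażółć gęślą jaźń\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")

    # Write with an explicit encoding instead of relying on the locale.
    with open(path, "w", encoding="iso8859-2", newline="") as f:
        f.write(text)

    # Text mode with the matching encoding returns the original str.
    with open(path, "r", encoding="iso8859-2", newline="") as f:
        as_text = f.read()

    # Binary mode returns the raw bytes, one per character for this codec.
    with open(path, "rb") as f:
        as_bytes = f.read()

print(type(as_text), len(as_text))
print(type(as_bytes), len(as_bytes))
```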

In [None]:
%sx file L7.ipynb

In [None]:
%sx iconv -c -f UTF-8 -t ISO8859-2 L7.ipynb -o tmp.ipynb

In [None]:
%sx file tmp.ipynb

In [None]:
%cat test/ls.py

In [None]:
%sx cd test && LANG=C python2 ls.py && cd .

In [None]:
s = input()

In [None]:
type(s)

In [None]:
names = ['Leszek', 'Łukasz', 'Maria']

In [None]:
sorted(names)

In [None]:
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
sorted(names,key=locale.strxfrm)

Collation - defines the sort order
  - https://en.wikipedia.org/wiki/Collation
  - https://en.wikipedia.org/wiki/Unicode_collation_algorithm
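
The difference between code point order and collation order can be sketched as follows. The helper `sorted_in_locale` is hypothetical, and the locale name `pl_PL.UTF-8` is an assumption: it may not be installed on a given system, in which case the helper returns `None` instead of failing.

```python
import locale

names = ['Leszek', 'Łukasz', 'Maria']

# Plain sorted() compares code points, so 'Ł' (U+0141) lands after 'M'.
by_code_point = sorted(names)

def sorted_in_locale(items, locale_name="pl_PL.UTF-8"):
    """Sort using the collation rules of locale_name, or return None if unavailable."""
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)

print(by_code_point)
print(sorted_in_locale(names))
```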

## Python 2.x
  - ASCII by default
  - binary data by default
  - a separate type for Unicode (beyond ASCII)

In [None]:
%%python2
print type("AGH")

In [None]:
%%python2
import sys
print sys.getdefaultencoding()

In [None]:
%%python2
print type(u"aa\u0107")

In [None]:
%%python2
d = u"au\u0107"
print type(d),len(d)

# encode : unicode -> str

e = d.encode("utf-8")
print e,type(e),len(e)

# decode : str -> unicode

dd = e.decode("utf-8")
print type(dd),len(dd)

In [None]:
%%python2
f = open('1.txt','r')
tmp = f.read()
print tmp
print len(tmp),type(tmp)
f.close()
print tmp[0]

In [None]:
%%python2
import codecs
f = codecs.open('1.txt','r',encoding="utf-8")
tmp = f.read()
print len(tmp),type(tmp)
f.close()
print tmp[0].encode('utf-8')

In [None]:
%%python2
f = open('2.txt','w')
f.write(u'au\u0107')
f.close()

In [None]:
%%python2
names = [u'\u0141ukasz', u'Maria',u'Leszek']
print names
print sorted(names)
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
print sorted(names,cmp=locale.strcoll)

### Varia
http://www.laurentluce.com/posts/python-string-objects-implementation/
"Wow, the sophistication of the implementation contrasted with the simplicity of the API reminds me how much I am standing on the shoulders of giants every time I program in Python."

## print

Arguments (PEP 3105 -- Make print a function):
  - print is the only application-level functionality that has a statement (syntax!) dedicated to it. Syntax should cover only necessary items.
  - print() can easily be replaced by a more sophisticated function, while the print statement cannot (>> syntax!)
  - with the print statement it is not easy to change the separator from a space to another character
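
The last two points map onto keyword arguments of the Python 3 function. A minimal sketch that writes into an in-memory buffer so the result can be inspected:

```python
import io

buf = io.StringIO()
# sep replaces the default space, end replaces the default newline,
# file replaces the >> redirection of the print statement.
print("a", "b", "c", sep=", ", end="!\n", file=buf)

output = buf.getvalue()
print(repr(output))
```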

In [None]:
%%python3
print('Hello world', 'here I come', sep="")

In [None]:
%%python2
print 'Hello world','here I come'

In [None]:
%%python3
import builtins  # the documented module; __builtins__ is an implementation detail
def printnew(*args, **kwargs):
    builtins.print('AGH')
    return builtins.print(*args, **kwargs)
print = printnew
print('Hello world')
print = builtins.print

In [None]:
%%python2
def printnew(*args):
    print 'AGH'
    print args
print = printnew
print 'Hello world'

### Chevron print

Chevron (Polish: szewron) - a badge worn on the sleeve or shoulder strap of a uniform, shaped like an upright or inverted letter "V"
https://pl.wikipedia.org/wiki/Szewron_(naszywka) https://en.wikipedia.org/wiki/Chevron_(insignia)

In [None]:
%%python3
import sys
print('Error!', file=sys.stderr)
print('Not an error', file=sys.stdout)
print('Error!', file=sys.stderr)

In [None]:
%%python2
import sys
print >> sys.stderr, 'Error!'
print >> sys.stdout, 'Not an error'
print >> sys.stderr, 'Error!'

## map, filter, reduce

### Views And Iterators Instead Of Lists

In [None]:
%%python2
isPrime=lambda x: all(x % i != 0 for i in range(int(x**0.5)+1)[2:])
import numpy as np
print np.array(filter(isPrime,[10,20,30,13,7]))

In [None]:
%%python3
isPrime=lambda x: all(x % i != 0 for i in range(int(x**0.5)+1)[2:])
import numpy as np
print(np.array(list(filter(isPrime,[10,20,30,13,7]))))
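
The practical consequence of getting iterators and views instead of lists can be sketched without NumPy. The names `is_odd` and `ages` are made up for this example.

```python
def is_odd(n):
    return n % 2 == 1

numbers = [10, 20, 30, 13, 7]

odd = filter(is_odd, numbers)    # an iterator, not a list
first_pass = list(odd)           # consumes the iterator
second_pass = list(odd)          # already exhausted

ages = {"ala": 7}
keys = ages.keys()               # a view, not a snapshot
ages["kot"] = 3                  # the view sees the new key

print(first_pass, second_pass, list(keys))
```

This is why the Python 3 cell above wraps `filter(...)` in `list(...)` before handing it to `np.array`.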

### map

In [None]:
%%python2
def p(*args):
   print [str(x) for x in args]
list(map( p, [1,2,3,4],[100] ))

In [None]:
%%python3
def p(*args):
   print([str(x) for x in args])
list(map( p, [1,2,3,4],[100] ))

In [None]:
%%python2
def transposed(matrix):
    """Return transposed matrix (list of lists).

    This function can handle non-square matrices.
    In this case it fills shorter list with None.

    >>> transposed( [[1,2,3], [3,4]] )
    [[1, 3], [2, 4], [3, None]]
    """
    return map(lambda *row: list(row), *matrix)
print transposed( [[1,2,3], [3,4]] )

In [None]:
%%python3
from itertools import zip_longest
def transposed(matrix):
    """Return transposed matrix (list of lists).

    This function can handle non-square matrices.
    In this case it fills shorter list with None.

    >>> transposed( [[1,2,3], [3,4]] )
    [[1, 3], [2, 4], [3, None]]
    """
    return list(map(list, zip_longest(*matrix)))
print(transposed( [[1,2,3], [3,4]] ))

### reduce

Python 3000 FAQ
http://www.artima.com/weblogs/viewpost.jsp?thread=211200

Q. If you're killing reduce(), why are you keeping map() and filter()?

A. I'm not killing reduce() because I hate functional programming; I'm killing it because almost all code using reduce() is less readable than the same thing written out using a for loop and an accumulator variable. On the other hand, map() and filter() are often useful and when used with a pre-existing function (e.g. a built-in) they are clearer than a list comprehension or generator expression. (Don't use these with a lambda though; then a list comprehension is clearer and faster.)
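
The readability argument from the answer above can be judged on a simple case, a product of the numbers 1 to 10. Both variants below are a sketch for comparison; `operator.mul` is used as the pre-existing function.

```python
import functools
import operator

numbers = range(1, 11)

# Variant 1: reduce() with a pre-existing function.
product_reduce = functools.reduce(operator.mul, numbers)

# Variant 2: a for loop with an accumulator variable.
product_loop = 1
for n in numbers:
    product_loop *= n

print(product_reduce, product_loop)
```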

In [None]:
%%python2
def mul(x,y): return x/2.0+1/x
x = reduce(mul, range(1, 11))
print(x)

In [None]:
%%python3
import functools
def mul(x,y): return x/2.0+1/x
x = functools.reduce(mul, range(1, 11))
print(x)

In [None]:
%%python3
def mul(x,y): return x/2.0+1/x
result = 1  # reduce() takes the first element, 1, as the initial value
for x in range(2, 11):
    result = mul(result, x)
x = result
print(x)

### range

In [None]:
%%python2
range(10)
xrange(10)
print [x for x in range(10)]

In [None]:
%%python3
print(range(10))
print(xrange(10))
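
In Python 3, `range` takes over the role of `xrange`: it is a lazy sequence that still supports the usual sequence operations. A short sketch:

```python
r = range(10**9)       # no list of a billion items is built

size = len(r)
item = r[5]
tail = r[-3:]          # slicing returns another range
found = 10**8 in r     # membership is computed arithmetically for ints

print(size, item, tail, found)
```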

### zip

In [None]:
%%python2
A = [1,2,3,4]
B = ['a','b','c']
zip(A,B)

In [None]:
%%python3
A = [1,2,3]
B = ['a','b','c']
zip(A,B)
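
In Python 3 `zip` returns an iterator, so its content becomes visible only after it is consumed, for example with `list()`. The sketch below also shows that `zip` stops at the shortest input, while `itertools.zip_longest` pads the missing values.

```python
from itertools import zip_longest

A = [1, 2, 3, 4]
B = ['a', 'b', 'c']

pairs = list(zip(A, B))             # stops at the shortest input
padded = list(zip_longest(A, B))    # pads missing values with None

print(pairs)
print(padded)
```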

## Ordering comparisons

In [None]:
%%python2
print 1 < 'a'
print sorted([3,1,2,'b','a','c'])

In [None]:
%%python3
print(1 < 'a')
print(sorted([3,1,2,'b','a','c']))
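
In Python 3 the comparison above raises `TypeError` instead of returning an arbitrary result. If a mixed list really has to be sorted, an explicit key makes the intended order visible; using `str` as the key is just one possible choice for this sketch.

```python
mixed = [3, 1, 2, 'b', 'a', 'c']

try:
    1 < 'a'
    comparable = True
except TypeError:
    comparable = False

by_text = sorted(mixed, key=str)   # compare the textual form of every item

print(comparable, by_text)
```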

## integers
long renamed to int

In [None]:
%%python2
int("100")
long("100")

In [None]:
%%python3
int("100")
long("100")

https://www.python.org/dev/peps/pep-0238/
The current division (/) operator has an ambiguous meaning for
    numerical arguments: it returns the floor of the mathematical
    result of division if the arguments are ints or longs, but it
    returns a reasonable approximation of the division result if the
    arguments are floats or complex.  This makes expressions expecting
    float or complex results error-prone when integers are not
    expected but possible as inputs.
    
    The problem is unique to dynamically typed languages: in a
    statically typed language like C, the inputs, typically function
    arguments, would be declared as double or float, and when a call
    passes an integer argument, it is converted to double or float at
    the time of the call.
    
    The correct work-around is subtle: casting an argument to float()
    is wrong if it could be a complex number; adding 0.0 to an
    argument doesn't preserve the sign of the argument if it was minus
    zero.  The only solution without either downside is multiplying an
    argument (typically the first) by 1.0.  This leaves the value and
    sign unchanged for float and complex, and turns int and long into
    a float with the corresponding value.
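
The subtlety described in the last paragraph of the quote can be checked with `math.copysign`, which exposes the sign of a zero. The helper `sign` is introduced here only for illustration.

```python
import math

def sign(x):
    """Return 1.0 or -1.0 depending on the sign bit of x, including zeros."""
    return math.copysign(1.0, x)

neg_zero = -0.0
after_add = sign(neg_zero + 0.0)    # adding 0.0 loses the minus sign
after_mul = sign(neg_zero * 1.0)    # multiplying by 1.0 keeps it

as_float = 3 * 1.0                  # an int becomes a float
as_complex = (1 + 2j) * 1.0         # a complex value passes through unchanged

print(after_add, after_mul, as_float, as_complex)
```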

In [None]:
%%python2
print 1/2

In [None]:
%%python3
print(1/2, 1//2)

In [None]:
import math
print(math.atan2(0.0,-0.0))

In [None]:
import math
print(math.atan2(-0.0,-0.0))

In [None]:
0.0 == -0.0

## Exceptions

One of Python's guiding maxims is "there should be one -- and preferably only one -- obvious way to do it". Python 2.x's raise statement violates this principle, permitting multiple ways of expressing the same thought. (PEP 3109)

In [None]:
%%python2
class MyEx(Exception):
    x = 0
    y = 0
    
einst = MyEx()
einst.x = 1
einst.y = 1

try:
    raise einst
#    raise MyEx, einst
except MyEx, e:
    print e.x, e.y

In [None]:
%%python3
class MyEx(Exception):
    x = 0
    y = 0
    
einst = MyEx()
einst.x = 1
einst.y = 1

try:
    raise einst
except MyEx as e:
    print(e.x, e.y)

In [None]:
%%python2
try:
    1/0
except Exception:
    raise TypeError

In [None]:
%%python2
class FooException(Exception):
    pass

class BarException(Exception):
    pass

class foo(object):
    def d(self):
       self.e()

    def e(self):
       self.f()

    def f(self):
       raise FooException('Problem')

class bar(object):
    def a(self):
       self.c()

    def b(self):
       self.c()

    def c(self):
        try:
            f = foo()
            f.d()
        except FooException as e:
            raise BarException(e)

bar().a()
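
In the Python 2 example above, the traceback shows only `BarException`; the place where `FooException` was raised is lost. Python 3 adds exception chaining (PEP 3134) for this case. The sketch below is a shortened variant of the example with plain functions, `low_level` and `high_level`, instead of the `foo` and `bar` classes.

```python
class FooException(Exception):
    pass

class BarException(Exception):
    pass

def low_level():
    raise FooException('Problem')

def high_level():
    try:
        low_level()
    except FooException as e:
        # "from e" records the original exception as __cause__.
        raise BarException('wrapped') from e

try:
    high_level()
except BarException as caught:
    cause = caught.__cause__
    print(repr(caught), '<-', repr(cause))
```

When such an exception is not caught, the traceback printed by Python 3 contains both exceptions, joined by a note that the first was the direct cause of the second.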

## List comprehension

In [None]:
%%python3
x = 'before'
a = [x for x in [1, 2, 3]]
print(x)
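
In Python 3 the loop variable of a list comprehension lives in its own scope, so the cell above prints `before`. A plain `for` loop behaves differently, which the following sketch shows side by side:

```python
x = 'before'
squares = [x * x for x in [1, 2, 3]]   # this x is local to the comprehension
after_comprehension = x

for x in [1, 2, 3]:                    # a plain for loop does rebind x
    pass
after_loop = x

print(after_comprehension, after_loop)
```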

## Tuple unpacking
PEP 3113 -- Removal of Tuple Parameter Unpacking
Unfortunately this feature of Python's rich function signature abilities, while handy in some situations, causes more issues than they are worth. 

In [None]:
%%python2
def middle( (a,b), (c,d) ):
    return 0.5*(a+c),0.5*(b+d)
print middle( (1,0), (13,-4))

In [None]:
%%python3
def middle( p1, p2 ):
    a,b = p1
    c,d = p2
    return 0.5*(a+c),0.5*(b+d)
print(middle( (1,0), (13,-4)))