# New features in Python (starting from version 3.5)

## Python 3.5
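* A dedicated operator for matrix multiplication: @
* \* for unpacking lists and \*\* for unpacking dictionaries
* The typing module for annotating parameter and return types (covered in the Python 3.8 section of this document)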
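
The unpacking item in the list above has no cell of its own, so here is a minimal sketch of what \* and \*\* unpacking allow since Python 3.5 (PEP 448). All names (first, second, defaults, overrides, describe) are made up for illustration.

```python
# * and ** unpacking may appear several times in one literal or call.
first = [1, 2]
second = (3, 4)
merged_list = [*first, *second, 5]        # unpack two sequences into a new list

defaults = {"host": "localhost", "port": 8080}
overrides = {"port": 9090}
merged_dict = {**defaults, **overrides}   # for duplicate keys, the later dict wins

def describe(host, port):
    """Hypothetical helper that just formats its two arguments."""
    return "{}:{}".format(host, port)

print(merged_list)
print(merged_dict)
print(describe(**merged_dict))            # ** unpacks the dict into keyword arguments
```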

In [6]:
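#A dedicated operator for matrix multiplication: @
import numpy as np
x = np.array([3, 2])
p = np.array([5, 2])
print(x.shape, p.shape)
#p = np.array([5, 2]).transpose() #not needed: transposing a 1-D array changes nothing
print(x@p) #for two 1-D arrays, @ computes the inner product: 3*5 + 2*2 = 19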

(2,) (2,)
19


General use of parameter and return type annotations

In [3]:
#General use
import json

def get_value(text: str) -> str:
    #The annotations say that text "should" be a str and that the return value
    #"should" be a str, but Python does not enforce either at run time.
    x = json.loads(text)
    return str(x)

## Python 3.6
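* f-strings (later versions add further features)
* Asynchronous generators with async and await (updated again in later versions)
* The secrets module for generating passwords, tokens, and similar secrets
* Underscores in numeric literals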
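
The f-string and secrets items in the list above are not demonstrated elsewhere in this section, so the following is a small sketch of both. The variable names and the six-digit PIN are illustrative assumptions, not part of the original notebook.

```python
import secrets

name = "Wang"
salary = 24000
# f-string: expressions inside {} are evaluated; a format spec follows the colon.
line = f"{name} earns {salary:,} per month"
print(line)

# secrets draws from the operating system's cryptographically strong random source.
token = secrets.token_hex(16)   # 16 random bytes rendered as 32 hex characters
pin = "".join(secrets.choice("0123456789") for _ in range(6))
print(len(token), len(pin))
```

The random module is meant for simulation and is not suitable for passwords or tokens, which is the reason secrets exists.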

Underscores in numeric literals

In [8]:
#Underscores in numeric literals
a = 1_000_000_000_000_000    # 1000000000000000
b = 0x_FF_FF_FF_FF       # 4294967295
print(a)
print(b)
print(1_000_000_000_000_000)

1000000000000000
4294967295
1000000000000000


Asynchronous generators with async and await

In [49]:
#Asynchronous generators with async and await
import asyncio

async def ticker(delay, to):

  for i in range(to):
    yield i
    await asyncio.sleep(delay)

n=5
print(ticker(3,n))
print(ticker(3,n))

<async_generator object ticker at 0x000002108188BC10>
<async_generator object ticker at 0x000002108188BC10>
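
The cell above only creates the asynchronous generator objects; nothing is executed until they are iterated. A minimal sketch of consuming one with async for follows. The collect helper and the short delay are assumptions made for the example, and asyncio.run requires Python 3.7 or later.

```python
import asyncio

async def ticker(delay, to):
    """Yield 0..to-1, sleeping `delay` seconds after each value."""
    for i in range(to):
        yield i
        await asyncio.sleep(delay)

async def collect(delay, to):
    # An asynchronous comprehension drives the generator with `async for`.
    return [i async for i in ticker(delay, to)]

result = asyncio.run(collect(0.01, 5))
print(result)
```

Inside a Jupyter notebook an event loop is already running, so there one would write `result = await collect(0.01, 5)` instead of calling asyncio.run.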


## Python 3.7
* Built-in breakpoint()
* Types and annotations
* Dictionaries preserve insertion order
* dataclass

Built-in breakpoint()

In [11]:
# default breakpoint
for i in range(3):
    breakpoint() #enters the debugger (pdb by default); most convenient in an IDE such as Spyder or PyCharm
    print(i)

0
1
2


Types and annotations

In [105]:
# previous way
def foo(bar, baz):
    pass
foo(3, 1)

In [13]:
# with annotations
def foo(bar: 'Describe the bar', baz: print('random value')) -> 'return a thing':
    pass
foo(3, 1)

random value


In [7]:
#Forward-reference annotations
from __future__ import annotations #opt-in since Python 3.7 (PEP 563); not the default as of Python 3.11
class User:
    def __init__(self, name: str, prev_user: User) -> None:
        pass

**The important update in Python 3.7 -> dataclasses**  
* No need to write \_\_init\_\_() boilerplate
* Using field()
* Parameters of @dataclass()

In [12]:
#The original way
class Employee:
    """Class that contains basic information about an employee."""
    def __init__(self, name: str, job: str, salary: int = 0) -> None:
        self.name = name
        self.job = job
        self.salary = salary
employee_1 = Employee("Huang", "Data Analyst", 22_000)
print(employee_1)

<__main__.Employee object at 0x000001E6D8A6FAF0>


Using dataclass &rarr; no need to write the \_\_init\_\_() method

In [82]:
from dataclasses import dataclass

@dataclass #The __repr__ is defined automatically...
class Employee_dc:
    name: str
    job: str
    salary: int = 0
        
employee_2 = Employee_dc("Wang", "Data Engineer", 24_000)
print(employee_2)

Employee_dc(name='Wang', job='Data Engineer', salary=24000)


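\_\_post\_init\_\_() is called immediately after all attributes of the instance have been initialized. To use it, define a method named \_\_post\_init\_\_() inside the dataclass. It is most often used when new data has to be derived from other attributes set at initialization.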

In [89]:
from dataclasses import dataclass, field #import field
import datetime

@dataclass #The __repr__ is defined automatically...
class Employee_dc:
    name: str
    job: str
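    salary: int = 0 #default
    create_time:datetime.datetime = 0 #placeholder default, overwritten in __post_init__
    company:str = 0 #placeholder default, overwritten in __post_init__
        
    def __post_init__(self): #__post_init__ is called automatically at the end of the generated __init__.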
        self.create_time = datetime.datetime.now()
        self.company = 'Python_learning_company'
        
employee_2 = Employee_dc("Wang", "Data Engineer", 24_000) #datetime.datetime.now()
print(employee_2)

Employee_dc(name='Wang', job='Data Engineer', salary=24000, create_time=datetime.datetime(2023, 2, 5, 3, 4, 50, 659908), company='Python_learning_company')


**field(*, default=MISSING, default_factory=MISSING, repr=True, hash=None, init=True, compare=True, metadata=None): customizing data attributes**  
field() lets you flexibly customize each attribute of a dataclass.  
Suppose the class should record its creation time automatically.

In [21]:
import datetime
print(datetime.datetime.now())

2023-02-05 00:28:12.741874


field keyword  
***init=False*** &rarr; the attribute cannot be passed as an argument to \_\_init\_\_().

In [None]:
from dataclasses import dataclass, field #import field
import datetime

@dataclass #The __repr__ is defined automatically...
class Employee_dc:
    name: str
    job: str
    salary: int = 0
    create_time:datetime.datetime = field(init=False)
    company:str = field(init=False)
        
    def __post_init__(self): #__post_init__ is called automatically at the end of the generated __init__.
        self.create_time = datetime.datetime.now()
        self.company = 'Python_learning_company'
        
employee_2 = Employee_dc("Wang", "Data Engineer", 24_000) #datetime.datetime.now()
print(employee_2)

field keywords &rarr; **default_factory=callable or default=value**  
Only one of default and default_factory may be given for a field.

In [41]:
from dataclasses import dataclass, field
import datetime

@dataclass #The __repr__ is defined automatically...
class Employee_dc:
    name: str
    job: str
    salary: int = 0
    create_time:datetime.datetime = field(init=False, default_factory=datetime.datetime.now) #default=datetime.datetime.now()
        
#     def __post_init__(self):
#         self.create_time = datetime.datetime.now()
        
employee_2 = Employee_dc("Wang", "Data Engineer", 24_000) #datetime.datetime.now()
print(employee_2)

Employee_dc(name='Wang', job='Data Engineer', salary=24000, create_time=datetime.datetime(2023, 2, 5, 1, 19, 56, 866048))


Mutable default values are an important case in which default_factory is required.

In [57]:
@dataclass
class Employee:
    """Class that contains basic information about an employee."""
    name: str
    job: str
    salary: int = 0
    skillset: list[str] = field(default_factory=list) #instead of skillset: list[str] = [], because [] is mutable.
em1 = Employee("Huang", "Data Analyst", 22_000) 
print(em1)
em1.skillset.append("Python")
em1.skillset.append("C++")
print(em1)

Employee(name='Huang', job='Data Analyst', salary=22000, skillset=[])
Employee(name='Huang', job='Data Analyst', salary=22000, skillset=['Python', 'C++'])
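
To make the reason for default_factory concrete, the sketch below first shows that dataclass rejects a bare list as a default, then shows that default_factory gives every instance its own list. The class names Broken and Staff are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

try:
    @dataclass
    class Broken:
        skillset: List[str] = []   # mutable default: dataclass refuses this
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

@dataclass
class Staff:
    name: str
    skillset: List[str] = field(default_factory=list)  # list() is called per instance

em1 = Staff("Huang")
em2 = Staff("Wang")
em1.skillset.append("Python")
print(em1.skillset, em2.skillset)  # em2 is unaffected by the change to em1
```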


field keyword &rarr; **repr=False**  
dataclass implements \_\_repr\_\_() for us automatically, which is why print() can show meaningful content. When some attributes should not be printed, set field(repr=False) on them.

In [42]:
from dataclasses import dataclass, field
import datetime

@dataclass #The __repr__ is defined automatically...
class Employee_dc:
    name: str
    age: int = field(repr=False) #field(default=18, repr=False)
    job: str
    salary: int = 0
    create_time:datetime.datetime = field(init=False, default_factory=datetime.datetime.now) #default=datetime.datetime.now()
        
#     def __post_init__(self):
#         self.create_time = datetime.datetime.now()
        
employee = Employee_dc("Wang", 18, "Data Engineer", 24_000) #datetime.datetime.now()
print(employee)
print(employee.age)

Employee_dc(name='Wang', job='Data Engineer', salary=24000, create_time=datetime.datetime(2023, 2, 5, 1, 20, 1, 319835))
18


When you define a dataclass, the methods (dunder methods) it can implement for you in advance include:
* \_\_eq\_\_() [default]
* \_\_repr\_\_() [default]
* \_\_lt\_\_() and \_\_gt\_\_()
* \_\_hash\_\_()
* and more; see PEP 557

In [92]:
class RegularCard:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __repr__(self):
        return (f'{self.__class__.__name__}'
                f'(rank={self.rank!r}, suit={self.suit!r})')
    
    def __eq__(self, other): #equality
        if not(isinstance(other, self.__class__)): #if other.__class__ is not self.__class__:
            return NotImplemented #or False
        return (self.rank, self.suit) == (other.rank, other.suit)
    
    def __lt__(self, other): #less than -> introduce later
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.rank, self.suit) < (other.rank, other.suit)
    
A = RegularCard(3,5)
B = RegularCard(4,6)
print(A, B)
print(A == B, A < B)

RegularCard(rank=3, suit=5) RegularCard(rank=4, suit=6)
False True


In [93]:
#dataclass
@dataclass
class RegularCard:
    rank:int = field(default=0)
    suit:int = field(default=0)
        
A = RegularCard(3,5)
B = RegularCard(4,6)
C = RegularCard(3,5)

# generated automatically by dataclass
#     def __repr__(self):
#         return (f'{self.__class__.__name__}'
#                 f'(rank={self.rank!r}, suit={self.suit!r})')
    
#     def __eq__(self, other): #equality
#         if not(isinstance(other, self.__class__)): #if other.__class__ is not self.__class__:
#             return NotImplemented #or False
#         return (self.rank, self.suit) == (other.rank, other.suit)

print(A, B, C) #by __repr__
print(A == B)
print(A == C, A is C)

RegularCard(rank=3, suit=5) RegularCard(rank=4, suit=6) RegularCard(rank=3, suit=5)
False
True False


**Extending dataclass further with decorator parameters**  **(the arguments as of Python 3.11)**  
@dataclass(init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False, match_args=True, kw_only=False, slots=False, weakref_slot=False) 

In [106]:
# Original definition
from datetime import datetime
import dateutil.parser #third-party package: python-dateutil

class Article(object):
    def __init__(self, _id, author_id, title, text, tags=None, created=None, edited=None):
        self._id = _id
        self.author_id = author_id
        self.title = title
        self.text = text
        self.tags = list() if tags is None else tags
        #datetime.now() as a default in the signature would be evaluated only once,
        #when the function is defined, so it is called here for each instance instead.
        self.created = datetime.now() if created is None else created
        self.edited = datetime.now() if edited is None else edited

        if type(self.created) is str:
            self.created = dateutil.parser.parse(self.created)

        if type(self.edited) is str:
            self.edited = dateutil.parser.parse(self.edited)

    #The special methods must be defined at class level, not nested inside __init__.
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self._id, self.author_id) == (other._id, other.author_id)

    def __lt__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self._id, self.author_id) < (other._id, other.author_id)

    def __repr__(self):
        return '{}(id={}, author_id={}, title={})'.format(
                self.__class__.__name__, self._id, self.author_id, self.title)

order: setting this parameter lets instances of your dataclass be compared with each other and sorted.  
It generates \_\_gt\_\_, \_\_ge\_\_, \_\_lt\_\_, and \_\_le\_\_.

In [109]:
#dataclass
from dataclasses import dataclass, field
from typing import List
from datetime import datetime
import dateutil.parser #third-party package: python-dateutil

@dataclass(order=True)
class Article(object):
    _id: int
    author_id: int
    title: str = field(compare=False) #excluded from __eq__ and the ordering methods
    text: str = field(repr=False, compare=False)
    tags: List[str] = field(default_factory=list, repr=False, compare=False)
    #default_factory calls datetime.now() for every new instance;
    #default=datetime.now() would be evaluated only once, at class definition.
    created: datetime = field(default_factory=datetime.now, repr=False, compare=False)
    edited: datetime = field(default_factory=datetime.now, repr=False, compare=False)

    def __post_init__(self):
        if type(self.created) is str:
            self.created = dateutil.parser.parse(self.created)

        if type(self.edited) is str:
            self.edited = dateutil.parser.parse(self.edited)

Another simple example:

In [110]:
@dataclass(order=True)
class T:
    a: int
    b: int
    c: int
t1 = T(1, 2, 3)
t2 = T(1, 0, 20)
print(t1==t2, t1>=t2, t1<t2)

False True False


**frozen**  
With @dataclass(frozen=True), assigning to *any field* of the dataclass raises an exception. In other words, the data is "frozen" and cannot be modified.

In [8]:
from dataclasses import dataclass, field
@dataclass(frozen=True)
class T:
    a: int
    b: int
    c: int
t1 = T(1, 2, 3)
t2 = T(1, 0, 20)

print(t1, t2)
t1.a = 100 #FrozenInstanceError

T(a=1, b=2, c=3) T(a=1, b=0, c=20)


FrozenInstanceError: cannot assign to field 'a'

If only one or two attributes need to be frozen, use the @property encapsulation technique from OOP instead.
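
As a rough sketch of that @property approach, the class below keeps one attribute read-only while another stays writable. The Account class and its attribute names are invented for this example.

```python
class Account:
    def __init__(self, owner, balance):
        self._owner = owner      # "private" by convention only
        self.balance = balance   # ordinary attribute, freely writable

    @property
    def owner(self):
        # No setter is defined, so assigning to .owner raises AttributeError.
        return self._owner

acct = Account("Wang", 100)
acct.balance = 150               # allowed
try:
    acct.owner = "Huang"         # blocked by the read-only property
except AttributeError:
    blocked = True
else:
    blocked = False
print(acct.owner, acct.balance, blocked)
```

This is protection by convention: code that writes to acct._owner directly can still change the value.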

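**slots**  
If you do not need to add or remove instance attributes at run time, slots are a good fit. They declare in advance which attributes an instance has, which reduces memory use and speeds up attribute access.  
By default, Python stores the attributes you define on an instance in a dict, which you can inspect with \_\_dict\_\_: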

In [15]:
from dataclasses import dataclass, field
@dataclass()
class T:
    a: int
    b: int
    c: int
        
t1 = T(1, 2, 3)
print(t1.__dict__)

{'a': 1, 'b': 2, 'c': 3}


In [20]:
@dataclass()
class T:
    __slots__ = ['a', 'b', 'c']
    a: int
    b: int
    c: int
        
t1 = T(1, 2, 3)

print(t1)
print(t1.__slots__) #print(t1.__dict__) -> 'T' object has no attribute '__dict__'

T(a=1, b=2, c=3)
['a', 'b', 'c']


In Python 3.10 and later, setting the decorator parameter @dataclass(slots=True) is enough: dataclass implements \_\_slots\_\_ for you automatically.

**kw_only**  (keyword-only, Python 3.10 and later)  
Requires callers to create instances with keyword arguments only; positional arguments are not allowed. For example:  
t1 = T(a=1, b=2, c=3)

**dataclass with pandas**  
pandas.DataFrame accepts a dict-like container of Series objects, and it can also be built from a list of dataclass instances.

In [45]:
import pandas as pd
 
class Product:
    def __init__(self, name, qty):
        self.name = name
        self.qty = qty
 
products = []
for i in range(10):
    products.append(Product(name=i, qty=i*2))
 
df = pd.DataFrame(products)

In [37]:
print(products)
print(products[0].__dict__)
df

[<__main__.Product object at 0x000002108105F640>, <__main__.Product object at 0x000002108105F3D0>, <__main__.Product object at 0x000002108105F310>, <__main__.Product object at 0x000002108105FF70>, <__main__.Product object at 0x0000021081165190>, <__main__.Product object at 0x0000021081165580>, <__main__.Product object at 0x0000021081165CA0>, <__main__.Product object at 0x0000021081165820>, <__main__.Product object at 0x00000210811655E0>, <__main__.Product object at 0x00000210811655B0>]
{'name': 0, 'qty': 0}


| | 0 |
|---|---|
| 0 | <\_\_main\_\_.Product object at 0x000002108105F640> |
| 1 | <\_\_main\_\_.Product object at 0x000002108105F3D0> |
| 2 | <\_\_main\_\_.Product object at 0x000002108105F310> |
| 3 | <\_\_main\_\_.Product object at 0x000002108105FF70> |
| 4 | <\_\_main\_\_.Product object at 0x0000021081165190> |
| 5 | <\_\_main\_\_.Product object at 0x0000021081165580> |
| 6 | <\_\_main\_\_.Product object at 0x0000021081165CA0> |
| 7 | <\_\_main\_\_.Product object at 0x0000021081165820> |
| 8 | <\_\_main\_\_.Product object at 0x00000210811655E0> |
| 9 | <\_\_main\_\_.Product object at 0x00000210811655B0> |


In [43]:
from dataclasses import dataclass
import pandas as pd

@dataclass
class Product:
    name: str
    qty: int

products = []
for i in range(10):
    products.append(Product(name=i, qty=i*2))
df = pd.DataFrame(products)

In [44]:
print(products)
df.head()

[Product(name=0, qty=0), Product(name=1, qty=2), Product(name=2, qty=4), Product(name=3, qty=6), Product(name=4, qty=8), Product(name=5, qty=10), Product(name=6, qty=12), Product(name=7, qty=14), Product(name=8, qty=16), Product(name=9, qty=18)]


| | name | qty |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 1 | 2 |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | 8 |


## Python 3.8
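* The := walrus assignment &rarr; (variable_name := expression), used in if, while, and similar statements to reduce the number of separate assignments and to streamline the logic
* Explicit separation of positional-only and keyword-only parameters
* f-strings can use = to show a variable together with its value
* Improvements to the typing module: Python is a dynamically typed language, but type hints can be added with the typing module so that third-party tools can verify Python code. Python 3.8 adds some new elements to typing so that it supports more robust checks:
    - The final decorator and the Final type annotation indicate that the decorated or annotated object should never be overridden, subclassed, or reassigned.
    - The Literal type restricts an expression to a specific value or a list of values (not necessarily all of the same type).
    - TypedDict can be used to create dictionaries in which the values of specific keys are restricted to one or more types. Note that these restrictions are only used by static type checkers to validate values; they are not enforced at run time.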
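
The typing additions listed above can be sketched as follows. MAX_RETRIES, Mode, open_mode, and Config are hypothetical names; the point of the sketch is that the interpreter runs the "wrong" lines without complaint, while a static type checker would be expected to report them.

```python
from typing import Final, Literal, final

MAX_RETRIES: Final = 3            # should never be reassigned

Mode = Literal["r", "w", "a"]     # only these three string values are valid

def open_mode(mode: Mode) -> str:
    return f"opening in mode {mode!r}"

@final
class Config:                     # should never be subclassed
    pass

# Both lines below violate the hints, yet they run: the hints are for type checkers.
MAX_RETRIES = 4
message = open_mode("x")
print(MAX_RETRIES, message)
```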

:= walrus operator &rarr; (variable_name := expression)

In [102]:
#:= walrus operator -> (variable_name := expression)
#original 
a = 5
if (a >= 3):
    print('do something...')
    print(a)
### by :=
if (a:=5)>=(3):
    print('do something...')
    print(a)
### another case
a = 'abcdefg'
if (n:=len(a))>=2:
    print(n) #we don't need to use len(a) twice.

do something...
5
do something...
5
7


In [103]:
#original 
print("original ")
n = 3
while (n):
    print(n)
    n -= 1
print("by :=")
#by :=
n = 4
while (n:=n-1):
    print(n)

original 
3
2
1
by :=
3
2
1


Explicit separation of positional-only and keyword-only parameters

In [104]:
#Explicit separation of positional-only and keyword-only parameters
def f(a, b, /, c, d, *, e, f):   #a, b: positional-only / c, d: positional or keyword / e, f: keyword-only
    print(a, b, c, d, e, f)

f(10, 20, 30, d=40, e=50, f=60) #valid
f(10, 20, 30, 40, e=50, f=60) #valid
f(10, b=20, c=30, d=40, e=50, f=60) #invalid
#f(10, 20, 30, 40, 50, f=60) #invalid

10 20 30 40 50 60
10 20 30 40 50 60


TypeError: f() got some positional-only arguments passed as keyword arguments: 'b'

f-strings can use = to show a variable together with its value

In [47]:
# f-strings can use = to show a variable together with its value
import datetime
user = 'eric_idle'  
member_since = datetime.date(1975, 7, 31)    
print(f'{user=} {member_since=}' )
delta = datetime.date.today() - member_since 
print(f'{user=!s}  {delta.days=:,d}')

user='eric_idle' member_since=datetime.date(1975, 7, 31)
user=eric_idle  delta.days=17,355


**Typing module introduction**

In [19]:
#General use
import json

def get_value(text: str) -> str:
    #The annotations say that text "should" be a str and that the return value
    #"should" be a str, but Python does not enforce either at run time.
    x = json.loads(text)
    return str(x)

**typing.List**  
Python's built-in types (for example int, float, dict, list, ...) can be used directly as type annotations, but more complex data types need help from the typing module.

In [24]:
from typing import List

def list_fun(l: List[int]):  # a list whose elements are all int
    print(l)

list_fun([3, 3, 3])
list_fun([3, 'test'])  # still runs: annotations are not enforced at run time

[3, 3, 3]
[3, 'test']


**typing.Union**  
Accepts several types at once

In [31]:
from typing import Union
def print_something(x: Union[int, str]):
    if isinstance(x, (int, str, )):#isinstance(object, classinfo) checks the type
        print(f'Got {x}') 
    else:
        raise TypeError('Only int & str are accepted')
print_something(True) #accepted because bool is a subclass of int
#print_something(20.7)

Got True


**typing.Dict**  
When using a dictionary, we usually store data of consistent types in it. For example, in word_count_mapping below, each key is a string and each value is an integer.  
word_count_mapping = {
    'zoo': 1,
    'zip': 2,
    'google': 10,
    'python': 12,
}

In [4]:
from typing import Dict
word_count_mapping:Dict[str, int]= {
    'zoo': 1, 'zip': 2, 
    'google': 10, 
    'python': 12,
}

def update_count(mapping: Dict[str, int], key: str, count: int):
    mapping[key] = count
    
update_count(word_count_mapping,'test', 30)
print(word_count_mapping)
#Optional[Dict[str, str]] means either None or a dict that maps str to str

{'zoo': 1, 'zip': 2, 'google': 10, 'python': 12, 'test': 30}


**typing.TypedDict**  
Specifies the expected shape of a dictionary-typed variable

In [47]:
from typing import TypedDict
class LoginDict(TypedDict):
    username: str
    password: str

login_dict: LoginDict = {
    'username': 'abc',
    'password': 'pass123456',
    #an optional key would be declared in the class as car: NotRequired[str] (Python 3.11)
}

def valid_username_password(login_dict: LoginDict):
    print(login_dict['username'])
    
valid_username_password(login_dict)

abc


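**typing.Iterable**  
In Python, an Iterable is an object that implements \_\_iter\_\_(); the iterator that \_\_iter\_\_() returns is what implements \_\_next\_\_(). For example, str, dict, list, and tuple are all Iterable, so all of them can be traversed with for. A function or method that accepts any Iterable can therefore be annotated with typing.Iterable: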

In [49]:
from typing import Iterable

def print_iterable(x: Iterable):
    for i in x:
        print(i)
print_iterable(range(3))

0
1
2
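
To illustrate the difference between an iterable and the iterator it produces, here is a small check using the abstract base classes in collections.abc; the variable names are arbitrary.

```python
from collections.abc import Iterable, Iterator

values = [1, 2, 3]
is_iterable = isinstance(values, Iterable)   # a list defines __iter__
is_iterator = isinstance(values, Iterator)   # but it has no __next__

it = iter(values)                            # calls values.__iter__()
first = next(it)                             # calls it.__next__()
print(is_iterable, is_iterator, isinstance(it, Iterator), first)
```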


**typing.Optional**  
Accepts None in addition to another type; Optional[X] is equivalent to Union[X, None].

In [5]:
from typing import Optional
def foo(a: Optional[list] = None):
    if a is None:
        a = []
    a.append(5)
    return a

**typing.Any**  
When the type is not restricted, use typing.Any (or object)

In [56]:
from typing import Any
def print_all(*obj: Any): #def print_all(*obj:object):
    print(*obj)

print_all(*[1,2,3])

1 2 3


Finally, a type checker such as mypy can help us detect cases where types are misused.

## Python 3.9
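* Dictionary merge and update: the | operator merges dictionaries, and the |= operator updates a dictionary in place
* Least common multiple: math.lcm
* Python 3.9 adds two new methods to str objects:
  - The first removes a prefix: str.removeprefix(prefix)
  - The second removes a suffix: str.removesuffix(suffix)
* Optimizations for asynchronous programming (asyncio) and the multiprocessing library
* Added randbytes (random.randbytes)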

In [70]:
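#Dictionary merge and update: | merges dictionaries, |= updates one in place
a = {'farhad': 1, 'blog': 2,'python': 3}
b = {'farhad': 'malik','topic': 'python3.9'}
print(a | b)
#a |= b #in-place update; it is a statement, so it cannot be placed inside print()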

{'farhad': 'malik', 'blog': 2, 'python': 3, 'topic': 'python3.9'}
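
The |= form is only commented out in the cell above, so here is a brief sketch of how it differs from |, assuming Python 3.9 or later. The names merged and a_before are introduced for the comparison.

```python
a = {'farhad': 1, 'blog': 2, 'python': 3}
b = {'farhad': 'malik', 'topic': 'python3.9'}

merged = a | b        # new dict; a and b are left unchanged
a_before = dict(a)    # copy of a, taken before the in-place update

a |= b                # updates a in place, like a.update(b)
print(merged)
print(a == merged, a_before == a)
```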


In [71]:
#removeprefix
print('farhad_python'.removeprefix('farhad_'))
# returns python
print('farhad_python'.removesuffix('_python'))
# returns farhad

python
farhad


In [3]:
#GCD and LCM
import math
print(math.gcd(49,14))
print(math.lcm(49, 14))

7
98


# As of 2023/02/04, Anaconda has not yet been updated

## Python 3.10
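* match-case statement (structural pattern matching, similar to switch-case)
* More detailed error messages
* Union can be written with an operator instead (X | Y)
* Strict zip (zip(..., strict=True))
* The with statement accepts parenthesized context managers, which makes it easier to open several files at once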

In [2]:
!python --version

Python 3.9.16


In [5]:
#match code
def main():
    n = 3
    match (n):
        case (1):
            print("OK")
        case (2):
            print("YES")
        case (3):
            print("THANKS")
        case _:
            print("Something error")
        

if __name__ == "__main__":
    main()

SyntaxError: invalid syntax (1523415427.py, line 4)

In [77]:
def sum_(a:int, b:int):
    print(a/b)
    return

sum_(12,0)

ZeroDivisionError: division by zero

## Python 3.11
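* 10-60% faster than Python 3.10, about 1.25x on average on the standard benchmark suite (the most important one)
* Improved error messages
* self can also be type-hinted with typing.Self (earlier versions rely on from \_\_future\_\_ import annotations)
* TypedDict in typing can declare non-required keys (NotRequired)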