# Writing a Python Tutorial

## Overview

### Python Learning Path

Basics:
- Liao Xuefeng's Python tutorial: https://www.liaoxuefeng.com/wiki/0014316089557264a6b348958f449949df42a6d3a2e542c000
- Python best practices guide: https://pythonguidecn.readthedocs.io/zh/latest/

Intermediate:
- Python Cookbook: https://book.douban.com/subject/26381341/
- Python handbook: https://book.douban.com/subject/6049132/
- Fluent Python: https://book.douban.com/subject/27028517/
- Python guide: http://pythonguidecn.readthedocs.io/zh/latest/

Advanced:
- Read open source projects and keep coding.

### Python 2 vs Python 3
- Python 2 reaches end of maintenance in 2020.
- Python 3 offers better performance and resolves Python 2's confusion between bytes and text encodings.
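
The encoding point can be illustrated with a small sketch. In Python 3, `str` holds Unicode text and `bytes` holds raw bytes, and converting between the two is always explicit. The sample string and variable names are illustrative only.

```python
# Python 3 keeps text (str) and binary data (bytes) as separate types.
text = "Python 教程"           # str: a sequence of Unicode code points
raw = text.encode("utf-8")     # bytes: the UTF-8 encoding of that text

print(type(text).__name__, len(text))   # length counted in characters
print(type(raw).__name__, len(raw))     # length counted in bytes

# Decoding with the same codec restores the original text.
assert raw.decode("utf-8") == text

# Mixing the two types raises an error instead of converting silently.
try:
    text + raw
except TypeError as exc:
    print("TypeError:", exc)
```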

### Python Environment Management
- pipenv: https://robots.thoughtbot.com/how-to-manage-your-python-projects-with-pipenv

### Consistent Code Style: PEP 8
- PEP 8 document: https://www.python.org/dev/peps/pep-0008/
- Style checking and automatic formatting (black): https://github.com/ambv/black
- On the coding discipline of software engineers: http://wetest.qq.com/lab/view/385.html

### IDE
- Standardize on PyCharm

### Capturing Error Information
- Traceback: capture the context in which the error occurred
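
A minimal sketch of capturing a traceback with the standard `traceback` module follows. The helper names `risky_division` and `run_and_capture` are hypothetical and exist only for this illustration.

```python
import traceback


def risky_division(a, b):
    # Hypothetical helper used only to trigger an exception.
    return a / b


def run_and_capture(func, *args):
    """Call func and return (result, traceback_text); one of the two is None."""
    try:
        return func(*args), None
    except Exception:
        # format_exc() returns the traceback of the exception being handled as a string
        return None, traceback.format_exc()


result, tb_text = run_and_capture(risky_division, 1, 0)
print(tb_text)
```

Keeping the traceback as a string makes it easy to write to a log or attach to an error report instead of only printing it.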

### Testing
- Pandas performance benchmarks: https://github.com/mm-mansour/Fast-Pandas
- Unit testing (pytest): https://github.com/pytest-dev/pytest
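
As a rough sketch of what a pytest-style unit test looks like: pytest collects functions whose names start with `test_` and reports any failing `assert`. The function `slugify` and the file name `test_slugify.py` are made up for this example.

```python
# Contents of a hypothetical file test_slugify.py.
def slugify(title):
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


def test_slugify_joins_words():
    assert slugify("Hello World") == "hello-world"


def test_slugify_collapses_whitespace():
    assert slugify("  Python   3  ") == "python-3"
```

With pytest installed, running `pytest test_slugify.py` in the same directory would execute both tests.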


## Python Basic Syntax

Please read the slides in the `files` folder:

- [Lecture1_python.pptx](files/Lecture1_python.pptx)
- [Lecture2_cis4930.pptx](files/Lecture2_cis4930.pptx)
- [Lecture3_cis4930.pptx](files/Lecture3_cis4930.pptx)
- [Lecture4_cis4930.pptx](files/Lecture4_cis4930.pptx)
- [Lecture5_cis4930.pptx](files/Lecture5_cis4930.pptx)
- [Lecture6_cis4930.pptx](files/Lecture6_cis4930.pptx)

## Python Environment Setup

- how-to: http://docs.python-guide.org/en/latest/dev/virtualenvs/

Three options:
- conda --> recommended for data science projects
- pipenv --> recommended for web projects
- virtualenv
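
Whichever tool is used, it helps to confirm which interpreter is actually running. The sketch below assumes a `venv` or recent `virtualenv` environment, where `sys.prefix` differs from `sys.base_prefix`. Conda environments generally are not detected by this check. The function name is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtual_environment())
```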

## Standard Library

### datetime

```python
# datetime
from datetime import datetime

today = datetime.today()
now = datetime.now()

print(today, "|", now)
print(now.strftime("%Y-%m-%d %H:%M:%S"))
```

```
2018-07-01 18:33:40.558459 | 2018-07-01 18:33:40.558506
2018-07-01 18:33:40
```


```python
# to Unix time
print(now.timestamp())
```

```
1530441220.558506
```


```python
# from Unix time
unix_time = 1530441220.558506
print(datetime.fromtimestamp(unix_time))
```

```
2018-07-01 18:33:40.558506
```
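
Note that `datetime.fromtimestamp` without a `tz` argument returns local time, so the printed value depends on the machine's time zone. A small sketch of the timezone-aware variant, reusing the timestamp from the example above:

```python
from datetime import datetime, timezone

unix_time = 1530441220.558506

# Passing tz makes the result timezone-aware and independent of the local zone.
utc_dt = datetime.fromtimestamp(unix_time, tz=timezone.utc)
print(utc_dt.isoformat())

# An aware datetime converts back to the same Unix timestamp.
assert abs(utc_dt.timestamp() - unix_time) < 1e-6
```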


### functools

```python
from functools import reduce

a = [1, 2, 3, 4, 5, 6]
f = lambda x, y: x * y

# multiply all elements together: ((((1*2)*3)*4)*5)*6
print(reduce(f, a))
```

```
720
```
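
A slightly more defensive variant is sketched below. It uses `operator.mul` instead of a lambda and passes an initial value, so an empty list does not raise `TypeError`. The function name `product` is illustrative.

```python
from functools import reduce
from operator import mul


def product(numbers):
    """Multiply all numbers together; an empty input gives 1."""
    # The third argument is the initial value, so reduce() accepts an empty sequence.
    return reduce(mul, numbers, 1)


print(product([1, 2, 3, 4, 5, 6]))  # 720
print(product([]))                  # 1
```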


### itertools

```python
from itertools import chain

a = [1, 2, 3, 4]
b = [1, 2, 3, 4, 6, 7, 8, 8]

# chain iterates over a, then b, without building an intermediate list
print(list(chain(a, b)))
```

```
[1, 2, 3, 4, 1, 2, 3, 4, 6, 7, 8, 8]
```
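
Two related patterns, sketched with made-up data: `chain.from_iterable` flattens a list of lists, and because `chain` is lazy, `islice` can take just the first few items.

```python
from itertools import chain, islice

rows = [[1, 2], [3, 4], [5, 6]]

# chain.from_iterable flattens one level of nesting without unpacking the outer list.
flat = list(chain.from_iterable(rows))
print(flat)  # [1, 2, 3, 4, 5, 6]

# chain is lazy: islice pulls only the first three items.
first_three = list(islice(chain(rows[0], rows[1], rows[2]), 3))
print(first_three)  # [1, 2, 3]
```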


## Python Network Operations

### requests

```shell
pip install requests
```

```python
# simple GET
import requests

url = "http://python.org"
resp = requests.get(url)

print(resp.text[:300])
```

```
<!doctype html>
<!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->
<!--[if IE 7]>      <html class="no-js ie7 lt-ie8 lt-ie9">          <![endif]-->
<!--[if IE 8]>      <html class="no-js ie8 lt-ie9">                 <![endif]-->
<!--[if gt IE 8]><!--><html class="no-js"
```


```python
# GET with query parameters and headers
import requests

url = "http://restapi.amap.com/v3/place/text"

# "$key" is a placeholder for your own API key
querystring = {"key": "$key", "keywords": "Bazhong", "offset": "25", "page": "1", "city": "010"}

headers = {
    "Cache-Control": "no-cache",
}

response = requests.request("GET", url, headers=headers, params=querystring)

print(response.text)
```

```
{"status":"1","count":"81","info":"OK","infocode":"10000","suggestion":{"keywords":[],"cities":[]},"pois":[{"id":"B000A7HBQS","name":"北京市第八中学(高中部)","type":"科教文化服务;学校;中学","typecode":"141202","biz_type":[],"address":"学院小街2号","location":"116.361969,39.911127","tel":"010-59733532","distance":[],"biz_ext":{"rating":"5.0","cost":[]},"pname":"北京市","cityname":"北京市","adname":"西城区","importance":[],"shopid":[],"shopinfo":"0","poiweight":[]},{"id":"B0FFG4GB1Q","name":"北京八中(北1门)","type":"通行设施;临街院门;临街院正门","typecode":"991401","biz_type":[],"address":"西便门外东大街乙2","location":"116.353917,39.901233","tel":[],"distance":[],"biz_ext":{"rating":[],"cost":[]},"pname":"北京市","cityname":"北京市","adname":"西城区","importance":[],"shopid":[],"shopinfo":"2","poiweight":[]},{"id":"B0FFG11IR0","name":"北京八中(北2门)","type":"通行设施;临街院门;临街院门","typecode":"991400","biz_type":[],"address":"西便门外大街10-16号附近","location":"116.354564,39.901258","tel":[],"distance":[],"biz_ext":{"rating":[],"cost":[]},"pname":"北京市","cityname":"北京市","adnam
```
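
The `params` dictionary ends up as the query string of the request URL. The sketch below uses only the standard library's `urllib.parse` to show roughly what that encoding looks like. It sends no request, and the exact escaping done by requests may differ in detail.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "http://restapi.amap.com/v3/place/text"
# "$key" is the placeholder from the example above, not a real API key.
params = {"key": "$key", "keywords": "Bazhong", "offset": "25", "page": "1", "city": "010"}

# urlencode percent-escapes the values and joins the pairs with "&".
full_url = base_url + "?" + urlencode(params)
print(full_url)

# Parsing the URL again recovers the original parameters.
decoded = parse_qs(urlsplit(full_url).query)
assert decoded["keywords"] == ["Bazhong"]
assert decoded["key"] == ["$key"]
```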

## Database Operations

- basic read/write --> `pd.read_sql` / `pd.read_sql_table` / `DataFrame.to_sql`
- using sqlalchemy --> http://www.sqlalchemy.org/

```python
from sqlalchemy import create_engine

# create an engine for a local SQLite database file
db_url = "sqlite:///test.db"
engine = create_engine(db_url)
```

```python
import pandas as pd
import numpy as np

# sample data to write to the database
data = pd.DataFrame(np.random.rand(4, 3), columns=["a", "b", "c"])
data
```

```
          a         b         c
0  0.959857  0.404704  0.203658
1  0.564886  0.001970  0.327876
2  0.457470  0.041059  0.881885
3  0.372469  0.655776  0.086884
```


```python
# to_sql: write the DataFrame to table t_sample, replacing it if it already exists
data.to_sql(con=engine, name="t_sample", index=False, if_exists="replace")

# read_sql_table: load the whole table back into a DataFrame
data_read = pd.read_sql_table(con=engine, table_name="t_sample")

# read_sql: run an arbitrary SQL query
data_query = pd.read_sql(con=engine, sql="select count(1) from t_sample")
```

```python
print(data_read)
```

```
          a         b         c
0  0.959857  0.404704  0.203658
1  0.564886  0.001970  0.327876
2  0.457470  0.041059  0.881885
3  0.372469  0.655776  0.086884
```

```python
print(data_query)
```

```
   count(1)
0         4
```
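
For a quick check without pandas or SQLAlchemy, the same kind of count query can be run with the standard library's `sqlite3` module. This is a sketch under stated assumptions: it uses an in-memory database and made-up rows rather than the `test.db` file above, and it also shows `?` placeholders for passing values into a query.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_sample (a REAL, b REAL, c REAL)")

rows = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (0.7, 0.8, 0.9), (1.0, 1.1, 1.2)]  # made-up values
# "?" placeholders let the driver bind values instead of formatting them into the SQL string.
conn.executemany("INSERT INTO t_sample VALUES (?, ?, ?)", rows)
conn.commit()

row_count = conn.execute("SELECT count(1) FROM t_sample").fetchone()[0]
above_half = conn.execute("SELECT count(1) FROM t_sample WHERE a > ?", (0.5,)).fetchone()[0]
print(row_count, above_half)
conn.close()
```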


## Web Crawling

https://cuiqingcai.com/1052.html

- scrapy --> heavyweight framework
- requests + lxml + bs4
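
To illustrate the parsing step of a crawler without third-party dependencies, the sketch below swaps in the standard library's `html.parser` where lxml or bs4 would normally be used. The class name `LinkCollector` and the HTML snippet are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up HTML standing in for a downloaded page (e.g. resp.text from requests).
sample_html = '<ul><li><a href="/page/1">One</a></li><li><a href="/page/2">Two</a></li></ul>'

collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)  # ['/page/1', '/page/2']
```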

## Machine Learning

- check out `notebook/sklearn`
- https://github.com/ageron/handson-ml

## Jupyter 

- https://www.dataquest.io/blog/jupyter-notebook-tutorial/