# G7 Python Introductory Course
## Project 1: Getting Started with Python

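This introductory project reads each person's pre-tax salary and a claimed tax amount, recomputes the tax with the correct formula, and checks whether the claimed amount is right.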

**Hint**: Text like this will guide you through using the iPython Notebook to complete the project.

In [1]:
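# Check your Python version
from sys import version_info
# Reject anything other than 2.7: either a wrong major or a wrong minor version fails
if version_info.major != 2 or version_info.minor != 7:
    raise Exception('Please use Python 2.7 to complete this project')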

In [2]:
import numpy as np
import pandas as pd

# Notebook display helpers
from IPython.display import display
%matplotlib inline

# Load the dataset
in_file = 'data.csv'
full_data = pd.read_csv(in_file)

# Show the first few rows of the dataset
display(full_data.head())

| Unnamed: 0 | name | salary | tax_maybe |
|---|---|---|---|
| 0 | wang | 2500 | 0 |
| 1 | zhang | 7000 | 105 |
| 2 | li | 8000 | 205 |
| 3 | song | 9000 | 405 |
| 4 | tang | 50000 | 800 |


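The data sample contains the following features:

- **name**: the person's name
- **salary**: pre-tax salary
- **tax_maybe**: the claimed tax amount, which may or may not be correct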



How individual income tax is calculated:
![xxx](https://img-blog.csdn.net/20171017113915227?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvVG9nZXRoZXJfQ1o=/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
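
As a worked example of the quick-deduction method used in this project: tax = taxable income × rate − quick deduction, where taxable income is the salary minus the social insurance and housing fund contributions and the 3500 exemption. The sketch below walks through the 7000 salary from the sample data, using the rates defined in this notebook. The variable names are chosen for illustration only, and the salary is assumed to sit inside the contribution base limits so no capping applies.

```python
# Worked example for a pre-tax salary of 7000, using this notebook's rates.
salary = 7000.0
# pension 8% + medical 2% + unemployment 1% + housing fund 12%
contribution_rate = 0.08 + 0.02 + 0.01 + 0.12
insure = salary * contribution_rate      # 1610.00
taxable = salary - insure - 3500         # 1890.00, after the 3500 exemption
# 1890 is above 1500 and not above 4500: rate 10%, quick deduction 105
tax = taxable * 0.10 - 105               # 84.00
print("insurance: {:.2f}, taxable: {:.2f}, tax: {:.2f}".format(insure, taxable, tax))
```

The result, 84.00, agrees with the tax the notebook prints for this salary.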

In [26]:
# Quick-reference table of individual income tax rates
# Each item is a tuple of three values: (taxable income threshold, tax rate, quick deduction)
tax_table = [
         (80000, 0.45, 13505),
         (55000, 0.35, 5505),
         (35000, 0.3, 2755),
         (9000, 0.25, 1005),
         (4500, 0.2, 555),
         (1500, 0.1, 105),
         (0, 0.03, 0),
]
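# Social insurance and housing fund ("five insurances and one fund") settings
point = 3500  # tax exemption threshold
endowment_insurance_rate = 0.08  # pension insurance rate
hospital_rate = 0.02  # medical insurance rate
losejob_rate = 0.01  # unemployment insurance rate
provident_rate = 0.12  # housing provident fund rate
provident_max = 20972  # maximum housing provident fund contribution base
provident_min = 1500  # minimum housing provident fund contribution base
endowment_insurance_min = 2193  # minimum pension insurance contribution base
endowment_insurance_max = 16445  # maximum pension insurance contribution base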


# Compute the pension insurance contribution
def _get_endowment_insurance(salary):
    if salary < endowment_insurance_min:
        endowment_insurance = endowment_insurance_min * endowment_insurance_rate
    elif salary > endowment_insurance_max:
        endowment_insurance = endowment_insurance_max * endowment_insurance_rate
    else:
        endowment_insurance = salary * endowment_insurance_rate
    return endowment_insurance


# Compute the housing provident fund contribution
def _get_provident(salary):
    if salary < provident_min:
        provident = provident_min * provident_rate
    elif salary > provident_max:
        provident = provident_max * provident_rate
    else:
        provident = salary * provident_rate
    return provident


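# Compute the total contributions and the amount subject to tax
def get_tax_salary(salary):
    # pension insurance
    endowment_insurance = _get_endowment_insurance(salary)
    # housing provident fund
    provident = _get_provident(salary)
    # total social insurance and housing fund contributions
    insure = endowment_insurance + provident + salary * hospital_rate + salary * losejob_rate
    # amount on which tax is computed (may be negative)
    tax_salary = salary - insure - point
    return insure, tax_salary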
    

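# Compute the contributions, the tax, and the take-home pay
def calculator(salary):
    """ Return (insurance, tax, after-tax salary), each rounded to 2 decimals
    :param salary: pre-tax salary
    """
    insure, tax_salary = get_tax_salary(salary)
    # No tax is due when the taxable amount is zero or negative
    tax = 0
    if tax_salary > 0:
        # tax_table runs from the highest bracket down, so the first
        # threshold that tax_salary exceeds identifies the bracket
        for item in tax_table:
            if tax_salary > item[0]:
                tax = tax_salary * item[1] - item[2]
                break
    res_money = salary - insure - tax
    print 'Pre-tax salary: {:.0f}, insurance: {:.2f}, tax: {:.2f}, after-tax salary: {:.2f}'.format(salary, insure, tax, res_money)
    return float('%.2f' % insure), float('%.2f' % tax), float('%.2f' % res_money)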
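
The quick-deduction numbers in `tax_table` are a shortcut for taxing each slice of income at its own rate. As a sanity check, the sketch below recomputes the tax bracket by bracket and compares it with the shortcut. `tax_by_quick_deduction` and `tax_by_brackets` are hypothetical helper names that do not appear in the notebook, and the table is repeated as `TAX_TABLE` so the block runs on its own.

```python
# Same brackets as tax_table: (threshold, rate, quick deduction)
TAX_TABLE = [
    (80000, 0.45, 13505),
    (55000, 0.35, 5505),
    (35000, 0.30, 2755),
    (9000, 0.25, 1005),
    (4500, 0.20, 555),
    (1500, 0.10, 105),
    (0, 0.03, 0),
]


def tax_by_quick_deduction(taxable):
    """Shortcut: taxable * rate - quick deduction for the matching bracket."""
    for threshold, rate, deduction in TAX_TABLE:
        if taxable > threshold:
            return taxable * rate - deduction
    return 0.0  # nothing is due when taxable income is zero or negative


def tax_by_brackets(taxable):
    """Long form: tax each slice of income at its own bracket's rate."""
    total = 0.0
    upper = taxable
    for threshold, rate, _ in TAX_TABLE:
        if upper > threshold:
            total += (upper - threshold) * rate
            upper = threshold
    return total


for amount in (0, 1890.0, 4500, 41167.76, 100000):
    quick = tax_by_quick_deduction(amount)
    slow = tax_by_brackets(amount)
    print("{:>10.2f}: quick={:.2f} brackets={:.2f}".format(amount, quick, slow))
```

If the two columns ever disagree, a threshold, rate, or quick deduction in the table has most likely been mistyped.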


In [30]:
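taxs = {}
for index, row in full_data.iterrows():
    # calculator returns (insurance, tax, after-tax salary); keep the tax
    tax = calculator(row['salary'])[1]
    taxs[row['name']] = tax
    # Report rows whose claimed tax matches the computed tax
    if tax == row['tax_maybe']:
        print "so cool"
print taxs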

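Pre-tax salary: 2500, insurance: 575.00, tax: 0.00, after-tax salary: 1925.00
so cool
Pre-tax salary: 7000, insurance: 1610.00, tax: 84.00, after-tax salary: 5306.00
Pre-tax salary: 8000, insurance: 1840.00, tax: 161.00, after-tax salary: 5999.00
Pre-tax salary: 9000, insurance: 2070.00, tax: 238.00, after-tax salary: 6692.00
Pre-tax salary: 50000, insurance: 5332.24, tax: 9595.33, after-tax salary: 35072.43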
{'tang': 9595.33, 'song': 238.0, 'li': 161.0, 'wang': 0.0, 'zhang': 84.0}
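
The equality check between the computed tax and `tax_maybe` compares floating-point values exactly, which can fail when a computed amount carries rounding error. One alternative, shown here as a sketch, is to treat two amounts as equal when they agree to within half a cent. `same_amount` is a hypothetical helper, not part of the original notebook, and the two dictionaries simply restate the sample values.

```python
def same_amount(a, b, tolerance=0.005):
    """Return True when two money amounts agree to within half a cent."""
    return abs(a - b) < tolerance


# Computed tax vs. claimed tax for the five sample rows
computed = {'wang': 0.0, 'zhang': 84.0, 'li': 161.0, 'song': 238.0, 'tang': 9595.33}
claimed = {'wang': 0, 'zhang': 105, 'li': 205, 'song': 405, 'tang': 800}

matches = sorted(name for name in computed if same_amount(computed[name], claimed[name]))
print(matches)  # ['wang']

# Rounding error no longer matters: 0.1 + 0.2 is not exactly 0.3 in binary floats
print(0.1 + 0.2 == 0.3)             # False
print(same_amount(0.1 + 0.2, 0.3))  # True
```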


## Extension
Export the correct tax amounts to export.csv, then compute how accurate the claimed amounts are.


In [31]:
def export() :
    # TODO: export the correct tax amounts to export.csv, with tax as the last column
    name = []
    salary = []
    insure = []
    tax = []
    res_money = []
    tax_maybe = []
    for index, info in full_data.iterrows():
        data = calculator(info['salary'])
        name.append(info['name'])
        salary.append(int(info['salary']))
        tax_maybe.append(int(info['tax_maybe']))
        insure.append(data[0])
        tax.append(data[1])
        res_money.append(data[2])
    columns = ['name','salary','insure','res_money','tax_maybe','tax']
    dataframe = pd.DataFrame({'name':name,'salary':salary,'insure':insure,'res_money':res_money,'tax_maybe':tax_maybe,'tax':tax})
    dataframe.to_csv("export.csv",index=False,columns=columns)
    print "export done"
    return tax_maybe, tax
    
    
export()

# Load the exported dataset
in_file = 'export.csv'
export_data = pd.read_csv(in_file)

# Show the first few rows of the exported data
display(export_data.head())

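Pre-tax salary: 2500, insurance: 575.00, tax: 0.00, after-tax salary: 1925.00
Pre-tax salary: 7000, insurance: 1610.00, tax: 84.00, after-tax salary: 5306.00
Pre-tax salary: 8000, insurance: 1840.00, tax: 161.00, after-tax salary: 5999.00
Pre-tax salary: 9000, insurance: 2070.00, tax: 238.00, after-tax salary: 6692.00
Pre-tax salary: 50000, insurance: 5332.24, tax: 9595.33, after-tax salary: 35072.43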
export done


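| Unnamed: 0 | name | salary | insure | res_money | tax_maybe | tax |
|---|---|---|---|---|---|---|
| 0 | wang | 2500 | 575.0 | 1925.0 | 0 | 0.0 |
| 1 | zhang | 7000 | 1610.0 | 5306.0 | 105 | 84.0 |
| 2 | li | 8000 | 1840.0 | 5999.0 | 205 | 161.0 |
| 3 | song | 9000 | 2070.0 | 6692.0 | 405 | 238.0 |
| 4 | tang | 50000 | 5332.24 | 35072.43 | 800 | 9595.33 |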
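
A quick way to sanity-check the exported rows is that insurance, tax, and after-tax salary should add back up to the pre-tax salary. The sketch below applies that check to the five rows shown above, typed in by hand so the block runs without `export.csv`. The tuple layout is an assumption made for this example only.

```python
# (name, salary, insure, res_money, tax) copied from the exported table
rows = [
    ('wang', 2500, 575.0, 1925.0, 0.0),
    ('zhang', 7000, 1610.0, 5306.0, 84.0),
    ('li', 8000, 1840.0, 5999.0, 161.0),
    ('song', 9000, 2070.0, 6692.0, 238.0),
    ('tang', 50000, 5332.24, 35072.43, 9595.33),
]

inconsistent = []
for name, salary, insure, res_money, tax in rows:
    # Allow one cent of slack because each column was rounded separately
    if abs(insure + tax + res_money - salary) > 0.01:
        inconsistent.append(name)

print(inconsistent)  # [] means every row adds up
```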


In [62]:
def accuracy_score(tax, tax_maybe):
    # TODO: compute the accuracy
    # Signed deviation of the claimed tax from the computed tax, as a
    # percentage of the computed tax; 0% means the claim is exactly right.
    # Returns None when the computed tax is 0, since the ratio is undefined.
    if tax != 0:
        score = (float(tax_maybe) - float(tax)) / float(tax) * 100
        return "accuracy of {:.2f}%.".format(score)

def get_tax_list():
    # Read the claimed and computed tax columns back from export.csv
    out_file = 'export.csv'
    export_data = pd.read_csv(out_file)
    return list(export_data['tax_maybe']), list(export_data['tax'])

# Read the file once and unpack both columns
tax_maybe, tax = get_tax_list()
for index in range(len(tax)):
    print accuracy_score(tax[index], tax_maybe[index])

None
accuracy of 25.00%.
accuracy of 27.33%.
accuracy of 70.17%.
accuracy of -91.66%.
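
The percentages above measure how far each claimed amount is from the computed tax, so 0% would mean an exact match. If "accuracy" is instead read as the share of rows whose claimed tax is correct, it could be sketched as follows. `match_rate` is a hypothetical helper, and the two lists restate the `tax_maybe` and `tax` columns from the sample data.

```python
def match_rate(claimed, computed, tolerance=0.005):
    """Return the percentage of rows where claimed and computed tax agree."""
    if not claimed:
        return 0.0
    hits = sum(1 for c, t in zip(claimed, computed) if abs(c - t) < tolerance)
    return 100.0 * hits / len(claimed)


tax_maybe = [0, 105, 205, 405, 800]
tax = [0.0, 84.0, 161.0, 238.0, 9595.33]
print("accuracy of {:.2f}%.".format(match_rate(tax_maybe, tax)))  # accuracy of 20.00%.
```

Only the first row (tax of 0) matches, so one claim out of five is correct.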


> **Note**: Once you have completed all **4 TODOs**, you can export your iPython Notebook as an HTML file from the menu bar: **File -> Download as -> HTML (.html)**. Submit this HTML file together with the iPython notebook as your assignment.