# Pokémon Data Generator for China

This Jupyter notebook randomly generates a batch of Pokémon records located at scenic spots across China, for use in the SimplePokemonGO project.

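First we load the Chinese scenic-spot data and our Pokémon database; the newly generated records will be added to this database. We use pandas to read, write, and process the data.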

In [24]:
import pandas as pd

# read scenic spots data 
df = pd.read_csv('china.csv', index_col=0)
# read database
db = pd.read_excel('Pokemon.xlsx')

Once the data is loaded, we can call `head` for a quick look at what it contains.

In [25]:
# show some data
df.head(5)

Unnamed: 0,citys,latlong,spots,provinces
0,张家界,"110.482434,29.111799",天门山国家森林公园,湖南
1,张家界,"110.442167,29.282828",张家界国家森林公园,湖南
2,衡阳,"112.864018,27.225530",衡山,湖南
3,长沙,"112.937353,28.182849",岳麓山,湖南
4,岳阳,"113.129702,29.371903",岳阳楼—君山岛景区,湖南


In [26]:
db.head(5)

Unnamed: 0,AddressLine,City,AdminDivision,Country,PostCode,Latitude,Longitude,Pokemon,Level
0,Tiananmen Square,Beijing,BJ,CN,100010,39.90733,116.39108,Pikachu,81
1,Summer Palace,Beijing,BJ,CN,100087,39.9975,116.2689,Eevee,66
2,Fawn Creek Township,Kansas,KS,US,30808,37.09024,-95.7,Charizard,88


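Next we generate the data. After importing a few libraries, we set N, the number of Pokémon records to generate, fill in the fields of each record inside a loop, and add the records to the database.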

In [27]:
import random
import pinyin
# read pokemon names, one per line, skipping blank lines
with open('dex.txt', 'r') as f:
    dex = [line.strip() for line in f if line.strip()]
# number of data to generate
N = 100
# collect the new rows here (DataFrame.append was removed in pandas 2.0)
rows = []
for i in range(N):
    # create a dict
    d = {}
    # randomly sample one scenic spot
    data = df.sample(1)
    # AddressLine
    d[db.columns[0]] = data['spots'].item()
    # City
    d[db.columns[1]] = data['citys'].item()
    # AdminDivision: upper-case pinyin initials of the province name
    d[db.columns[2]] = pinyin.get_initial(data['provinces'].item(), delimiter="").upper()
    # Country
    d[db.columns[3]] = 'CN'
    # PostCode: six random digits, not a real postal code
    d[db.columns[4]] = ''.join(str(random.randint(0, 9)) for _ in range(6))
    # Latitude and Longitude ('latlong' is stored as "longitude,latitude")
    d[db.columns[6]], d[db.columns[5]] = map(float, data['latlong'].item().split(','))
    # Pokemon
    d[db.columns[7]] = random.choice(dex).capitalize()
    # Level
    d[db.columns[8]] = random.randint(1, 100)
    # queue the row for insertion
    rows.append(d)
# add all new rows to the database in one step
db = pd.concat([db, pd.DataFrame(rows)], ignore_index=True)
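
The `PostCode` and `Level` fields above are drawn with Python's `random` module, so every run produces different values. If repeatable output is wanted, one option is to draw from a seeded `random.Random` instance. The following is a minimal sketch under that assumption; `random_postcode` and `random_level` are hypothetical helper names, not part of the notebook.

```python
import random


def random_postcode(rng: random.Random) -> str:
    """Return six random digits as a string (not a real postal code)."""
    return "".join(str(rng.randint(0, 9)) for _ in range(6))


def random_level(rng: random.Random) -> int:
    """Return a level between 1 and 100, inclusive."""
    return rng.randint(1, 100)


# Using the same seed reproduces the same sequence of values.
rng = random.Random(42)
postcode = random_postcode(rng)
level = random_level(rng)
print(postcode, level)
```

Seeding this generator does not affect `df.sample`, which takes its own `random_state` argument.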

Let's take another look at the database, now with the new Pokémon records added.

In [28]:
db.head(10)

Unnamed: 0,AddressLine,City,AdminDivision,Country,PostCode,Latitude,Longitude,Pokemon,Level
0,Tiananmen Square,Beijing,BJ,CN,100010,39.9073,116.391,Pikachu,81
1,Summer Palace,Beijing,BJ,CN,100087,39.9975,116.269,Eevee,66
2,Fawn Creek Township,Kansas,KS,US,30808,37.0902,-95.7,Charizard,88
3,九莲山,新乡,HN,CN,806456,35.609058,113.572193,Metagross,77
4,关山草原,宝鸡,SX,CN,111582,34.025982,107.232166,Cinccino,57
5,西安城墙,西安,SX,CN,561544,34.251854,108.948302,Necrozmadusk,27
6,鸡笼山,合肥,AH,CN,621060,31.776348,117.55869,Darmanitanzen,20
7,胡耀邦故居,长沙,HN,CN,220161,28.078068,113.886339,Mareanie,81
8,桑耶寺,山南,XC,CN,855011,29.325332,91.504096,Trevenant,73
9,京辉高尔夫俱乐部,北京,BJ,CN,129244,39.805695,116.074097,Sceptilemega,60
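
In the output above, pinyin initials give `HN` both for 新乡 (in Henan) and for 长沙 (in Hunan), and `XC` for 山南, which suggests the initials are not always unique or as expected. If distinct codes matter, an explicit lookup table is one alternative. The sketch below is an assumption rather than part of the original notebook: `PROVINCE_CODES` and `province_code` are hypothetical names, the table is deliberately partial, and the codes follow ISO 3166-2:CN as best I recall them, so they should be checked before use.

```python
# Partial, illustrative mapping from province name to a two-letter code.
# Codes are assumed to follow ISO 3166-2:CN; verify before relying on them.
PROVINCE_CODES = {
    "北京": "BJ",
    "安徽": "AH",
    "河南": "HA",
    "湖南": "HN",
    "山西": "SX",
    "陕西": "SN",
    "西藏": "XZ",
}


def province_code(name: str, fallback: str = "??") -> str:
    """Look up a province code, returning `fallback` for unknown names."""
    return PROVINCE_CODES.get(name, fallback)


print(province_code("河南"), province_code("湖南"))  # HA HN
```

A lookup table costs a little maintenance but avoids collisions between provinces whose names share the same initials.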


Finally, export the database to an Excel file.

In [29]:
db.to_excel('Pokemon.xlsx', index=False)

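To finish, open the saved file and save a copy in txt format, under the same file name, in the /data directory of the project root. This completes the Pokémon data generation.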
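
The manual "save as txt" step could also be scripted. The sketch below assumes the txt file is tab-delimited, as Excel's "Text (Tab delimited)" option would produce; the source does not state the exact format SimplePokemonGO expects, so the separator and encoding here are assumptions to verify. `export_txt` is a hypothetical helper name.

```python
from pathlib import Path

import pandas as pd


def export_txt(frame: pd.DataFrame, path: str) -> None:
    """Write `frame` as tab-delimited text, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: tab-separated, UTF-8, no index column.
    frame.to_csv(target, sep="\t", index=False, encoding="utf-8")


# Example with a single record taken from the database shown earlier
sample = pd.DataFrame(
    [{"AddressLine": "Tiananmen Square", "City": "Beijing", "Pokemon": "Pikachu", "Level": 81}]
)
export_txt(sample, "data/Pokemon.txt")
```

In the notebook itself the call would be `export_txt(db, 'data/Pokemon.txt')`, with the path adjusted to wherever the project root's data directory lives.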