## Restaurant Order Data Analysis

'''

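Information the analysis can yield:

1. The size of the order table: length, shape, columns

2. The average dish price (amounts)

3. Which dish is most popular

4. Which order ID ordered the most dishes

......

~~Load the data~~

~~Preprocess the data (merge the sheets, handle NA values) and inspect it~~

~~Compute the average price of the dishes sold in August~~

~~Frequency count: which dishes are most popular (count dish names, take the top 10)~~

~~Data visualization with matplotlib~~

~~Orders with the most kinds of dishes~~

~~Top 10 August orders by number of dish kinds; they average 25 dishes each~~

~~Top 10 order IDs by dish quantity (group by order_id, sum counts, sort, take the top 10)~~

~~Top 10 August orders by dish quantity~~

~~Which order ID spent the most~~

~~Which order ID has the highest average spend~~

~~Which hours of the day see the most dishes ordered (hour)~~

~~Which day in August had the most dishes ordered~~

~~Extension: sort and take the 5 days with the most dishes ordered (Done)~~

Map the dates to weekdays to see which day of the week has the most diners and the most dishes ordered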

'''

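Analyzing the data along different dimensions:

By order (order_id):

        Which dish is most popular

        Kinds of dishes ordered

        Quantity of dishes ordered

        Largest total spend

        Average spend

By date and time:

        Hours when ordering is concentrated

        Which day had the most dishes ordered

        Which day of the week has the most diners

Techniques:

        Concatenating data: pd.concat([col1, ...])

        Grouped statistics (sums)

        Sorting and slicing the top 10

        Bar charts showing trend and height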

'''
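The techniques listed above can be sketched on toy data. The two small frames below are hypothetical stand-ins for the workbook sheets, and their values are made up for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for two worksheets of order details
sheet_a = pd.DataFrame({"order_id": [1, 1, 2],
                        "dishes_name": ["A", "B", "A"],
                        "counts": [1, 2, 1]})
sheet_b = pd.DataFrame({"order_id": [2, 3, 3],
                        "dishes_name": ["C", "A", "B"],
                        "counts": [3, 1, 1]})

# Concatenate row-wise; ignore_index=True gives a fresh 0..n-1 index
orders = pd.concat([sheet_a, sheet_b], axis=0, ignore_index=True)

# Group by order, sum the quantities, sort descending, keep the top 2
top_orders = (orders.groupby("order_id")["counts"]
                    .sum()
                    .sort_values(ascending=False)
                    .head(2))
print(top_orders)
```

`nlargest(n)` on the grouped sums is a shorter way to do the sort-and-slice step in one call.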



In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei'] # Render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False # Render the minus sign correctly

In [2]:
# Load the data (one DataFrame per worksheet)
data1 = pd.read_excel(r'C:\Users\wdl\Data-analysis\五大实战项目\data\meal_order_detail.xlsx', sheet_name=0)
data2 = pd.read_excel(r'C:\Users\wdl\Data-analysis\五大实战项目\data\meal_order_detail.xlsx', sheet_name=1)
data3 = pd.read_excel(r'C:\Users\wdl\Data-analysis\五大实战项目\data\meal_order_detail.xlsx', sheet_name=2)

In [3]:
# Preprocess the data (merge the sheets, handle NA values) and inspect it
data = pd.concat([data1, data2, data3], axis=0)
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10037 entries, 0 to 3610
Data columns (total 19 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   detail_id          10037 non-null  int64         
 1   order_id           10037 non-null  int64         
 2   dishes_id          10037 non-null  int64         
 3   logicprn_name      0 non-null      float64       
 4   parent_class_name  0 non-null      float64       
 5   dishes_name        10037 non-null  object        
 6   itemis_add         10037 non-null  int64         
 7   counts             10037 non-null  int64         
 8   amounts            10037 non-null  int64         
 9   cost               0 non-null      float64       
 10  place_order_time   10037 non-null  datetime64[ns]
 11  discount_amt       0 non-null      float64       
 12  discount_reason    0 non-null      float64       
 13  kick_back          0 non-null      float64       
 14  add_inp

In [4]:
# Drop every column that contains at least one missing value
data.dropna(axis=1, how='any', inplace=True)
data.head(5)

Unnamed: 0,detail_id,order_id,dishes_id,dishes_name,itemis_add,counts,amounts,place_order_time,add_inprice,picture_file,emp_id
0,2956,417,610062,蒜蓉生蚝,0,1,49,2016-08-01 11:05:36,0,caipu/104001.jpg,1442
1,2958,417,609957,蒙古烤羊腿,0,1,48,2016-08-01 11:07:07,0,caipu/202003.jpg,1442
2,2961,417,609950,大蒜苋菜,0,1,30,2016-08-01 11:07:40,0,caipu/303001.jpg,1442
3,2966,417,610038,芝麻烤紫菜,0,1,25,2016-08-01 11:11:11,0,caipu/105002.jpg,1442
4,2968,417,610003,蒜香包,0,1,13,2016-08-01 11:11:30,0,caipu/503002.jpg,1442
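The column-wise `dropna` call can be illustrated on a small frame. This is a minimal sketch: the `remark` column and all values are hypothetical, chosen only to show how `how='any'` differs from `how='all'`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "order_id": [417, 417, 301],
    "cost": [np.nan, np.nan, np.nan],       # entirely empty column
    "remark": ["no chili", None, None],     # partly empty column (hypothetical)
})

# how='any' drops a column as soon as it has one missing value
any_dropped = df.dropna(axis=1, how="any")

# how='all' drops a column only when every value is missing
all_dropped = df.dropna(axis=1, how="all")

print(list(any_dropped.columns))
print(list(all_dropped.columns))
```

With `how='any'`, a mostly complete column is lost along with the empty ones, so `how='all'` may be the safer choice when partially filled columns still carry useful information.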


In [5]:
# Average price of the dishes sold in August
np.round(data.amounts.mean(), 1)

44.8

In [6]:
# Strip the stray '_x000D_\n' suffix from the dish names
data.dishes_name = data.dishes_name.str.replace(r'_x000D_\n', '', regex=True)
data.dishes_name.value_counts()

白饭/大碗         323
凉拌菠菜          269
谷稻小庄          239
麻辣小龙虾         216
辣炒鱿鱼          189
             ... 
特醇嘉士伯啤酒罐装      13
鸡蛋、肉末肠粉        12
三丝鳝鱼           10
百里香奶油烤紅酒牛肉      5
铁板牛肉            3
Name: dishes_name, Length: 145, dtype: int64
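A small sketch of why the cleanup matters before counting: names that differ only by the `_x000D_\n` suffix would otherwise be tallied as separate dishes. The three sample names are hypothetical inputs built for illustration.

```python
import pandas as pd

# Hypothetical raw names; two of them carry the '_x000D_' + newline residue
names = pd.Series(["蒜蓉生蚝_x000D_\n", "蒜蓉生蚝", "水煮鱼_x000D_\n"])

# Without cleaning, the same dish appears under two different keys
raw_counts = names.value_counts()

# Remove the residue, then count again
cleaned = names.str.replace(r"_x000D_\n", "", regex=True)
clean_counts = cleaned.value_counts()

print(len(raw_counts), len(clean_counts))
```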

In [7]:
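# Exclude plain rice so it does not dominate the ranking.
# Caveat: the concatenated frame keeps each sheet's own index, so index labels repeat;
# drop(...index) therefore also removes other rows that share a label with a rice row
# (凉拌菠菜 falls from 269 to 245 below).
new_data = data.drop(data[(data.dishes_name=='白饭/大碗') | (data.dishes_name=='白饭/小碗')].index)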
new_data.dishes_name.value_counts()

凉拌菠菜          245
谷稻小庄          218
麻辣小龙虾         198
五色糯米饭(七色)     175
芝士烩波士顿龙虾      171
             ... 
鸡蛋、肉末肠粉        11
特醇嘉士伯啤酒罐装      11
三丝鳝鱼            9
百里香奶油烤紅酒牛肉      5
铁板牛肉            3
Name: dishes_name, Length: 143, dtype: int64
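The drop in counts for dishes other than rice appears consistent with label-based dropping on a repeated index, since `pd.concat` without `ignore_index=True` keeps each sheet's own labels. The sketch below reproduces the effect on two hypothetical two-row sheets and shows a boolean mask as one possible alternative.

```python
import pandas as pd

# Two hypothetical sheets; each keeps its own 0-based index after concat
s1 = pd.DataFrame({"dishes_name": ["白饭/大碗", "凉拌菠菜"]})
s2 = pd.DataFrame({"dishes_name": ["凉拌菠菜", "白饭/小碗"]})
data = pd.concat([s1, s2], axis=0)            # index is 0, 1, 0, 1

rice = ["白饭/大碗", "白饭/小碗"]
is_rice = data.dishes_name.isin(rice)

# Label-based drop: removes every row labelled 0 or 1, rice or not
by_label = data.drop(data[is_rice].index)

# Mask-based filter: removes only the rows that are actually rice
by_mask = data[~is_rice]

print(len(by_label), len(by_mask))
```

Passing `ignore_index=True` to `pd.concat` would also avoid the problem, because every row then has a unique label.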

In [8]:
# Frequency count: which dishes are most popular (count dish names, take the top 10)
ser1 = new_data.dishes_name.value_counts().nlargest(10)
ser1

凉拌菠菜         245
谷稻小庄         218
麻辣小龙虾        198
五色糯米饭(七色)    175
芝士烩波士顿龙虾     171
辣炒鱿鱼         168
香酥两吃大虾       165
焖猪手          159
水煮鱼          148
蒙古烤羊腿        140
Name: dishes_name, dtype: int64

In [9]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Line
from pyecharts.faker import Faker

bar = Bar(init_opts=opts.InitOpts(theme='dark'))
bar.add_xaxis(ser1.index.to_list())
bar.add_yaxis(
    "销量", ser1.values.tolist(), bar_max_width=30, category_gap='50%',
    itemstyle_opts=opts.ItemStyleOpts(
        border_width=1, opacity=0.7),
    label_opts=opts.LabelOpts(is_show=False),
)
bar.set_global_opts(
    xaxis_opts=opts.AxisOpts(
        axislabel_opts=opts.LabelOpts(
            rotate=-15, position='bottom', interval=0, font_size=12, margin=20, horizontal_align='center'),
        name_location='end', name_gap=10, name='菜名'),
    yaxis_opts=opts.AxisOpts(
        name='销量', name_location='end', name_gap=10, name_rotate=0),
    title_opts=opts.TitleOpts(title="最受欢迎的10个菜"),
    legend_opts=opts.LegendOpts(
        pos_top='2%', pos_left='center', border_width=0),
    toolbox_opts=opts.ToolboxOpts(
        is_show=True, orient='vertical', pos_left='right', pos_top='center'),
    visualmap_opts=opts.VisualMapOpts(
        is_show=True, type_='color', min_=130, max_=250, range_color=Faker.visual_color, pos_bottom='7%'),
)

line = Line(init_opts=opts.InitOpts(theme='dark'))
line.add_xaxis(ser1.index.to_list())
line.add_yaxis("销量", ser1.values.tolist(), is_smooth=True, is_symbol_show=True, symbol='circle',
               symbol_size=6, linestyle_opts=opts.LineStyleOpts(width=3, type_='dashed'),
               )


bar.overlap(line).render_notebook()


In [10]:
# pyecharts word cloud
from pyecharts.charts import WordCloud

wordcloud = WordCloud(init_opts=opts.InitOpts(theme='dark'))
wordcloud.add("欢迎", [list(z) for z in zip(ser1.index.to_list(), ser1.values.tolist())],
                word_size_range=[20, 100], shape='diamond')
wordcloud.set_global_opts(title_opts=opts.TitleOpts(title="最受欢迎的10个菜"),
                          toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', pos_left='right', pos_top='center'),
                          )

wordcloud.render_notebook()

In [11]:
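# Orders with the most kinds of dishes
# (value_counts counts detail rows per order_id, used here as a proxy for dish kinds)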
ser2 = new_data.order_id.value_counts()[:10]
ser2

398     34
1078    25
582     25
392     24
1295    24
1311    23
465     23
1318    23
672     23
1166    22
Name: order_id, dtype: int64
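`value_counts` on `order_id` counts detail rows, which equals the number of distinct dishes only when no dish is listed twice in an order. A minimal sketch of the difference, using hypothetical rows and placeholder dish names:

```python
import pandas as pd

# Hypothetical detail rows; order 398 lists dish "A" on two separate lines
lines = pd.DataFrame({
    "order_id":    [398, 398, 398, 417, 417],
    "dishes_name": ["A", "A", "B", "A", "C"],
})

# Rows per order (what value_counts measures)
rows_per_order = lines.order_id.value_counts()

# Distinct dish names per order
kinds_per_order = lines.groupby("order_id").dishes_name.nunique()

print(rows_per_order[398], kinds_per_order[398])
```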

In [12]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Line
from pyecharts.faker import Faker

bar = Bar(init_opts=opts.InitOpts(theme='dark'))
bar.add_xaxis([f'ID: {i}' for i in ser2.index.to_list()])
bar.add_yaxis(
    "种类", ser2.values.tolist(), bar_max_width=30, category_gap='50%',
    itemstyle_opts=opts.ItemStyleOpts(
        border_width=1, opacity=0.7),
    label_opts=opts.LabelOpts(
        is_show=True, font_size=16, color='black', font_weight='bold'),
)
bar.set_global_opts(
    xaxis_opts=opts.AxisOpts(
        axislabel_opts=opts.LabelOpts(
            position='bottom', interval=0, font_size=12, margin=20, horizontal_align='center'),
        name_location='end', name_gap=10, name='订单ID'),
    yaxis_opts=opts.AxisOpts(
        name='种类', name_location='end', name_gap=10, name_rotate=0),
    title_opts=opts.TitleOpts(title="订单点菜的种类最多"),
    legend_opts=opts.LegendOpts(
        pos_top='2%', pos_left='center', border_width=0),
    toolbox_opts=opts.ToolboxOpts(
        is_show=True, orient='vertical', pos_left='right', pos_top='center'),
    visualmap_opts=opts.VisualMapOpts(
        is_show=True, type_='color', min_=20, max_=40, range_color=Faker.visual_color, pos_bottom='7%'),
)
bar.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=True, 
                                                    data=[opts.MarkLineItem(type_='average', name='平均值',)], 
                                                    label_opts=opts.LabelOpts(is_show=True, 
                                                                              position='end', 
                                                                              formatter='{b}:\n{c}种',
                                                                              )))

bar.render_notebook()


In [13]:
# The top 10 orders average about 25 dishes each
np.round(new_data.order_id.value_counts()[:10].mean(), 0)

25.0

In [14]:
# Top 10 order IDs by dish quantity (sum of counts per order)
ser3 = new_data.groupby('order_id').counts.sum().nlargest(10)
ser3

order_id
398     34
1051    33
1033    30
1318    30
1150    28
752     27
1078    27
1186    27
557     26
1019    26
Name: counts, dtype: int64

In [15]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Line
from pyecharts.faker import Faker

bar = Bar(init_opts=opts.InitOpts(theme='dark'))
bar.add_xaxis([f'ID: {i}' for i in ser3.index.to_list()])
bar.add_yaxis(
    "个数", ser3.values.tolist(), bar_max_width=30, category_gap='50%',
    itemstyle_opts=opts.ItemStyleOpts(
        border_width=1, opacity=0.7),
    label_opts=opts.LabelOpts(is_show=True, font_size=16, color='black', font_weight='bold'),
)
bar.set_global_opts(
    xaxis_opts=opts.AxisOpts(
        axislabel_opts=opts.LabelOpts(
         position='bottom', interval=0, font_size=12, margin=20, horizontal_align='center'),
        name_location='end', name_gap=10, name='订单ID'),
    yaxis_opts=opts.AxisOpts(
        name='个数', name_location='end', name_gap=10, name_rotate=0,),
    title_opts=opts.TitleOpts(title="订单ID点菜数量Top10"),
    legend_opts=opts.LegendOpts(
        pos_top='2%', pos_left='center', border_width=0),
    toolbox_opts=opts.ToolboxOpts(
        is_show=True, orient='vertical', pos_left='right', pos_top='center'),
    visualmap_opts=opts.VisualMapOpts(
        is_show=True, type_='color', min_=25, max_=35, range_color=Faker.visual_color, pos_bottom='7%'),
)
bar.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=True, 
                                                    data=[opts.MarkLineItem(type_='average', name='平均值')],
                                                    label_opts=opts.LabelOpts(is_show=True,formatter='{b}:\n{c}个')),
                    )

bar.render_notebook()


In [16]:
# Which order IDs spent the most (sum of amounts per order, top 10)
ser4 = data.groupby('order_id').amounts.sum().nlargest(10)
ser4

order_id
1166    1314
743     1214
1317    1210
576     1162
408     1148
1121    1146
561     1144
1178    1129
385     1125
584     1121
Name: amounts, dtype: int64

In [17]:
# Which order ID has the highest average amount per dish line
data.groupby('order_id').amounts.mean().nlargest(1)

order_id
909    117.75
Name: amounts, dtype: float64
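The sums above add up `amounts` without looking at `counts`. If `amounts` is a per-unit price, as the average-price step seems to treat it, an order's spend would be the sum of `counts * amounts`. That reading is an assumption about the data, not something the notebook confirms; the rows below are hypothetical.

```python
import pandas as pd

# Hypothetical detail rows: order 1 buys two portions of a 30-yuan dish
lines = pd.DataFrame({
    "order_id": [1, 1, 2],
    "counts":   [2, 1, 1],
    "amounts":  [30, 50, 70],
})

# Sum of listed prices, ignoring quantity
price_sum = lines.groupby("order_id").amounts.sum()

# Quantity-weighted spend (assumes amounts is a unit price)
lines["line_total"] = lines.counts * lines.amounts
spend = lines.groupby("order_id").line_total.sum()

print(price_sum[1], spend[1])
```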

In [18]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Line
from pyecharts.faker import Faker

bar = Bar(init_opts=opts.InitOpts(theme='dark'))
bar.add_xaxis([f'ID: {i}' for i in ser4.index.to_list()])
bar.add_yaxis(
    "金额：单位（元）", ser4.values.tolist(), bar_max_width=30, category_gap='50%',
    itemstyle_opts=opts.ItemStyleOpts(
        border_width=1, opacity=0.7),
    label_opts=opts.LabelOpts(is_show=True, font_size=16, color='black', font_weight='bold'),
)
bar.set_global_opts(
    xaxis_opts=opts.AxisOpts(
        axislabel_opts=opts.LabelOpts(
         position='bottom', interval=0, font_size=12, margin=20, horizontal_align='center'),
        name_location='end', name_gap=10, name='订单ID'),
    yaxis_opts=opts.AxisOpts(
        name='金额：单位（元）', name_location='end', name_gap=10, name_rotate=0,),
    title_opts=opts.TitleOpts(title="哪个订单ID吃的钱最多"),
    legend_opts=opts.LegendOpts(
        pos_top='2%', pos_left='center', border_width=0),
    toolbox_opts=opts.ToolboxOpts(
        is_show=True, orient='vertical', pos_left='right', pos_top='center'),
    visualmap_opts=opts.VisualMapOpts(
        is_show=True, type_='color', min_=1100, max_=1320, range_color=Faker.visual_color, pos_bottom='7%'),
)
bar.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=True, 
                                                    data=[opts.MarkLineItem(type_='average', name='平均值')],
                                                    label_opts=opts.LabelOpts(is_show=True,formatter='{b}:\n{c}元')),
                    )

bar.render_notebook()


In [19]:
# Hour of day (0-23) at which each dish was ordered
data['hour'] = data.place_order_time.dt.hour

In [20]:
# Total dishes ordered in each hour
hour_data = data.groupby('hour').counts.sum()
hour_data

hour
11    1022
12     915
13     888
14     140
17    1231
18    1726
19    1619
20    1709
21    1662
22     214
Name: counts, dtype: int64
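The hour-of-day grouping can be sketched end to end on a few rows. The timestamps and quantities are hypothetical; `idxmax` is one way to read off the busiest hour without a chart.

```python
import pandas as pd

# Hypothetical order lines with a timestamp and a quantity
lines = pd.DataFrame({
    "place_order_time": pd.to_datetime([
        "2016-08-01 11:05:36",
        "2016-08-01 11:07:07",
        "2016-08-01 18:30:00",
    ]),
    "counts": [1, 2, 4],
})

# Extract the hour, then total the quantities per hour
lines["hour"] = lines.place_order_time.dt.hour
per_hour = lines.groupby("hour").counts.sum()

# Hour with the largest total
peak_hour = per_hour.idxmax()
print(per_hour.to_dict(), peak_hour)
```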

In [21]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Line
from pyecharts.faker import Faker


bar = Bar(init_opts=opts.InitOpts(theme='dark', width='1100px'))
bar.add_xaxis([f'{i}点' for i in hour_data.index.to_list()])
bar.add_yaxis(
    "点菜数量",  # hour_data holds dish quantities (counts), not money
    hour_data.values.tolist(),
    label_opts=opts.LabelOpts(is_show=False),
    bar_max_width=30, category_gap='50%',
    markline_opts=opts.MarkLineOpts(data=[opts.MarkLineItem(type_="average", name="平均值")]),
    itemstyle_opts=opts.ItemStyleOpts(opacity=0.7)
)
bar.set_global_opts(
    title_opts=opts.TitleOpts(title="一天中某时间段，点菜数量"),
    datazoom_opts=[opts.DataZoomOpts(range_start=0, range_end=100,
                                     is_show=True, type_="slider",
                                     pos_bottom="7%", pos_top="96%")],
    # Color range widened to cover the hourly totals (140 to 1726)
    visualmap_opts=opts.VisualMapOpts(is_show=True, type_='color',
                                      min_=100, max_=1800, range_color=Faker.visual_color, pos_bottom='2%',),
    toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', 
                                  pos_left='right', pos_top='center'),
    legend_opts=opts.LegendOpts(pos_top='2%', pos_left='center', border_width=0),
    xaxis_opts=opts.AxisOpts(name='时间：单位（点）', name_location='end', name_gap=10, name_rotate=0,),
    yaxis_opts=opts.AxisOpts(name='数量（个）', name_location='end', name_gap=10, name_rotate=0,),
    tooltip_opts=opts.TooltipOpts(is_show=True, formatter='{b}：{c}个'),
)
bar.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=False, data=[opts.MarkLineItem(type_='average', name='平均值')], label_opts=opts.LabelOpts(is_show=True,formatter='{b}:\n{c}个')),
                    )


line = Line(init_opts=opts.InitOpts(theme='dark'))
line.add_xaxis(xaxis_data=[f'{i}点' for i in hour_data.index.to_list()])
line.add_yaxis(
    series_name="",
    y_axis=hour_data.values.tolist(),
    linestyle_opts=opts.LineStyleOpts(width=4, type_='dashed'),
    is_symbol_show=True, symbol='circle', symbol_size=10,
)
bar.overlap(line).render_notebook()


In [22]:
# Day of the month for each order line
data['day'] = data.place_order_time.dt.day

In [23]:
# Which day in August had the most dishes ordered
ser5 = data.groupby('day').counts.sum()
ser5

day
1     233
2     151
3     192
4     169
5     224
6     793
7     761
8     171
9     167
10    227
11    191
12    196
13    824
14    770
15    230
16    118
17    229
18    258
19    238
20    996
21    853
22    156
23    201
24    154
25    153
26    224
27    831
28    892
29    163
30    167
31    194
Name: counts, dtype: int64

In [24]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Line
from pyecharts.faker import Faker


bar = Bar(init_opts=opts.InitOpts(theme='dark', width='1100px'))
bar.add_xaxis([f'{i}' for i in ser5.index.to_list()])
bar.add_yaxis(
    "点菜数量",
    ser5.values.tolist(),
    label_opts=opts.LabelOpts(is_show=False),
    bar_max_width=30, category_gap='50%',
    markline_opts=opts.MarkLineOpts(data=[opts.MarkLineItem(type_="average", name="平均值")]),
    itemstyle_opts=opts.ItemStyleOpts(opacity=0.7)
)
bar.set_global_opts(
    title_opts=opts.TitleOpts(title="8月份哪一天订餐数量最多"),
    datazoom_opts=[opts.DataZoomOpts(range_start=0, range_end=100,
                                     is_show=True, type_="slider",
                                     pos_bottom="7%", pos_top="96%")],
    visualmap_opts=opts.VisualMapOpts(is_show=True, type_='color', 
                                      min_=150, max_=1000, range_color=Faker.visual_color, pos_bottom='2%',),
    toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', 
                                  pos_left='right', pos_top='center'),
    legend_opts=opts.LegendOpts(pos_top='2%', pos_left='center', border_width=0),
    xaxis_opts=opts.AxisOpts(name='单位（天）', name_location='end', name_gap=10, name_rotate=0,),
    yaxis_opts=opts.AxisOpts(name='数量（个）', name_location='end', name_gap=10, name_rotate=0,),
    tooltip_opts=opts.TooltipOpts(is_show=True, formatter='{b}：{c}个'),
)
bar.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=False, data=[opts.MarkLineItem(type_='average', name='平均值')], label_opts=opts.LabelOpts(is_show=True,formatter='{b}:\n{c}个')),
)

line = Line(init_opts=opts.InitOpts(theme='dark'))
line.add_xaxis(xaxis_data=[f'{i}' for i in ser5.index.to_list()])
line.add_yaxis(
    series_name="",
    y_axis=ser5.values.tolist(),
    linestyle_opts=opts.LineStyleOpts(width=4, type_='dashed'),
    is_symbol_show=True, symbol='circle', symbol_size=10,
)
bar.overlap(line).render_notebook()


In [25]:
# Sort and take the 5 days with the most dishes ordered
data.groupby('day').counts.sum().nlargest(5)

day
20    996
28    892
21    853
27    831
13    824
Name: counts, dtype: int64

Map the dates to weekdays to see which day of the week has the most diners and the most dishes ordered.

In [26]:
# Day of week, 1 = Monday ... 7 = Sunday (dt.dayofweek itself is 0-based)
data['day_of_week'] = data.place_order_time.dt.dayofweek + 1

In [27]:
ser6 = data.groupby('day_of_week').counts.sum() # Dishes ordered on each day of the week
ser6

day_of_week
1     953
2     804
3     996
4     771
5     882
6    3444
7    3276
Name: counts, dtype: int64

In [28]:
# Number of distinct orders on each day of the week
ser7 = data.groupby('day_of_week').order_id.nunique()
ser7

day_of_week
1     88
2     76
3     89
4     64
5     80
6    282
7    264
Name: order_id, dtype: int64
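The weekday mapping relies on `dt.dayofweek` numbering Monday as 0, which is why the notebook adds 1. A short sketch with three dates from August 2016 shows the numbering next to `dt.day_name()`, which returns the weekday name directly.

```python
import pandas as pd

# Three dates from the month covered by the data
times = pd.Series(pd.to_datetime(["2016-08-01", "2016-08-06", "2016-08-07"]))

# 0-based weekday shifted so that Monday = 1 ... Sunday = 7
day_of_week = times.dt.dayofweek + 1

# Weekday names, English by default
names = times.dt.day_name()

print(list(zip(day_of_week, names)))
```

Under this numbering, values 6 and 7 in the grouped results correspond to Saturday and Sunday.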

In [29]:
from pyecharts import options as opts
from pyecharts.charts import Bar, Grid, Line
from pyecharts.faker import Faker


bar1 = Bar(init_opts=opts.InitOpts(theme='dark'))
bar1.add_xaxis([f'星期{i}' for i in ser6.index.to_list()])
bar1.add_yaxis(
    "点菜数量",  # ser6 sums counts per weekday, so this is a quantity, not money
    ser6.values.tolist(),
    label_opts=opts.LabelOpts(is_show=False),
    bar_max_width=30, category_gap='50%',
    markline_opts=opts.MarkLineOpts(data=[opts.MarkLineItem(type_="average", name="平均值")]),
    itemstyle_opts=opts.ItemStyleOpts(opacity=0.7)
)
bar1.set_global_opts(
    title_opts=opts.TitleOpts(title="一周哪一天点菜数量最多"),
    datazoom_opts=[opts.DataZoomOpts(range_start=0, range_end=100,
                                     is_show=True, type_="slider",
                                     pos_bottom="3%", pos_top="99%")],
    # visualmap_opts=opts.VisualMapOpts(is_show=True, type_='color', 
    #                                   min_=60, max_=3500, range_color=Faker.visual_color, pos_bottom='2%',),
    toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', 
                                  pos_left='right', pos_top='center'),
    legend_opts=opts.LegendOpts(pos_top='2%', pos_left='center', border_width=0),
    xaxis_opts=opts.AxisOpts(name='单位（天）', name_location='end', name_gap=10, name_rotate=0,),
    yaxis_opts=opts.AxisOpts(name='数量（个）', name_location='end', name_gap=10, name_rotate=0,),
    tooltip_opts=opts.TooltipOpts(is_show=True, formatter='{b}：{c}个'),
)
bar1.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=False, data=[opts.MarkLineItem(type_='average', name='平均值')], label_opts=opts.LabelOpts(is_show=True,formatter='{b}:\n{c}个')),
)

line1 = Line(init_opts=opts.InitOpts(theme='dark'))
line1.add_xaxis(xaxis_data=[f'星期{i}' for i in ser6.index.to_list()])
line1.add_yaxis(
    series_name="",
    y_axis=ser6.values.tolist(),
    linestyle_opts=opts.LineStyleOpts(width=4, type_='dashed'),
    is_symbol_show=True, symbol='circle', symbol_size=10,
)


bar2 = Bar(init_opts=opts.InitOpts(theme='dark'))
bar2.add_xaxis([f'星期{i}' for i in ser7.index.to_list()])
bar2.add_yaxis(
    "订单数量",  # ser7 counts distinct orders per weekday
    ser7.values.tolist(),
    label_opts=opts.LabelOpts(is_show=False),
    bar_max_width=30, category_gap='50%',
    markline_opts=opts.MarkLineOpts(data=[opts.MarkLineItem(type_="average", name="平均值")]),
    itemstyle_opts=opts.ItemStyleOpts(opacity=0.7)
)
bar2.set_global_opts(
    title_opts=opts.TitleOpts(title="一周哪一天订餐数量最多", pos_top='50%'),
    # datazoom_opts=[opts.DataZoomOpts(range_start=0, range_end=100,
    #                                  is_show=True, type_="slider",
    #                                  pos_bottom="7%", pos_top="96%")],
    # visualmap_opts=opts.VisualMapOpts(is_show=True, type_='color',
    #                                   min_=50, max_=3500, range_color=Faker.visual_color, pos_bottom='2%',),
    toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', 
                                  pos_left='right', pos_top='center'),
    legend_opts=opts.LegendOpts(pos_top='50%', pos_left='center', border_width=0),
    xaxis_opts=opts.AxisOpts(name='单位（天）', name_location='end', name_gap=10, name_rotate=0,),
    yaxis_opts=opts.AxisOpts(name='数量（个）', name_location='end', name_gap=10, name_rotate=0,),
    tooltip_opts=opts.TooltipOpts(is_show=True, formatter='{b}：{c}个'),
)
bar2.set_series_opts(markline_opts=opts.MarkLineOpts(is_silent=False, data=[opts.MarkLineItem(type_='average', name='平均值')], label_opts=opts.LabelOpts(is_show=True,formatter='{b}:\n{c}个')),
)

line2 = Line(init_opts=opts.InitOpts(theme='dark'))
line2.add_xaxis(xaxis_data=[f'星期{i}' for i in ser7.index.to_list()])
line2.add_yaxis(
    series_name="",
    y_axis=ser7.values.tolist(),
    linestyle_opts=opts.LineStyleOpts(width=4, type_='dashed'),
    is_symbol_show=True, symbol='circle', symbol_size=10,
)

overlap_1 = bar1.overlap(line1)
overlap_2 = bar2.overlap(line2)

grid = Grid(init_opts=opts.InitOpts(theme='dark', width='1200px', height='1200px'))
grid.add(overlap_1, grid_opts=opts.GridOpts(pos_bottom="58%"))
grid.add(overlap_2, grid_opts=opts.GridOpts(pos_top="58%"))
grid.render_notebook()

Plot a donut chart of each weekday's share of total sales.

In [30]:
# Each weekday's share of total sales, in percent
ser10 = np.round(data.groupby('day_of_week').amounts.sum() / data.amounts.sum() * 100, 2)
ser10

day_of_week
1     8.78
2     7.04
3     9.12
4     6.87
5     8.11
6    30.12
7    29.95
Name: amounts, dtype: float64
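The share calculation divides each group's sum by the grand total. A minimal sketch on hypothetical rows, with values chosen so the shares are easy to check by hand:

```python
import pandas as pd

# Hypothetical order lines; amounts total 100, so shares equal the group sums
lines = pd.DataFrame({
    "day_of_week": [1, 6, 6, 7],
    "amounts":     [10, 30, 20, 40],
})

# Group sum divided by grand total, as a percentage rounded to 2 places
share = (lines.groupby("day_of_week").amounts.sum()
         / lines.amounts.sum() * 100).round(2)

print(share.to_dict())
```

Because each share is rounded separately, the rounded values need not add up to exactly 100; the seven percentages in the notebook output sum to 99.99.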

In [31]:
ser10.index = ser10.index.map(lambda x: f'星期{x}')
ser10

day_of_week
星期1     8.78
星期2     7.04
星期3     9.12
星期4     6.87
星期5     8.11
星期6    30.12
星期7    29.95
Name: amounts, dtype: float64

In [32]:
[list(z) for z in zip(ser10.index.to_list(), ser10.values.tolist())],


([['星期1', 8.78],
  ['星期2', 7.04],
  ['星期3', 9.12],
  ['星期4', 6.87],
  ['星期5', 8.11],
  ['星期6', 30.12],
  ['星期7', 29.95]],)

In [33]:
from pyecharts import options as opts
from pyecharts.charts import Pie
from pyecharts.faker import Faker

pie = Pie(init_opts=opts.InitOpts(theme='dark'))
pie.add(
    "星期与销售额占比",
    [list(z) for z in zip(ser10.index.to_list(), ser10.values.tolist())],
    radius=["30%", "55%"],
    percent_precision=2,
    tooltip_opts=opts.TooltipOpts(formatter="{b}：{c}%"),
    label_opts=opts.LabelOpts(
        position="outside",
        formatter="{b|{b}: }  {per|{d}%}  ",
        border_width=1,
        border_radius=4,
        rich={
            "a": {"color": "#999", "lineHeight": 22, "align": "center"},
            "abg": {
                "backgroundColor": "#e3e3e3",
                "width": "100%",
                "align": "right",
                "height": 22,
                "borderRadius": [4, 4, 0, 0],
            },
            "hr": {
                "borderColor": "#aaa",
                "width": "100%",
                "borderWidth": 0.5,
                "height": 0,
            },
            "b": {"fontSize": 16, "lineHeight": 33},
            "per": {
                "color": "#eee",
                "backgroundColor": "#334455",
                "padding": [2, 4],
                "borderRadius": 2,
            },
        },
    ),
)
pie.set_global_opts(title_opts=opts.TitleOpts(title="星期与销售额占比情况"),
                    legend_opts=opts.LegendOpts(border_width=0),
                    toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', pos_left='right', pos_top='center'),
)
pie.render_notebook()


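Plot a scatter chart to analyze how date, order volume, and sales relate. The version below plots daily sales only; order volume is not yet encoded as bubble size.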

In [34]:
data.head(3)

Unnamed: 0,detail_id,order_id,dishes_id,dishes_name,itemis_add,counts,amounts,place_order_time,add_inprice,picture_file,emp_id,hour,day,day_of_week
0,2956,417,610062,蒜蓉生蚝,0,1,49,2016-08-01 11:05:36,0,caipu/104001.jpg,1442,11,1,1
1,2958,417,609957,蒙古烤羊腿,0,1,48,2016-08-01 11:07:07,0,caipu/202003.jpg,1442,11,1,1
2,2961,417,609950,大蒜苋菜,0,1,30,2016-08-01 11:07:40,0,caipu/303001.jpg,1442,11,1,1


In [35]:
ser5 = data.groupby('day').counts.sum()
ser5

day
1     233
2     151
3     192
4     169
5     224
6     793
7     761
8     171
9     167
10    227
11    191
12    196
13    824
14    770
15    230
16    118
17    229
18    258
19    238
20    996
21    853
22    156
23    201
24    154
25    153
26    224
27    831
28    892
29    163
30    167
31    194
Name: counts, dtype: int64

In [36]:
# Total sales (sum of amounts) for each day of the month
ser11 = data.groupby('day').amounts.sum()
ser11

day
1      9366
2      6125
3      6890
4      7549
5      8671
6     32167
7     31306
8      6532
9      7155
10    10231
11     7202
12     7448
13    32672
14    31347
15    10223
16     4278
17     9014
18     9539
19    10812
20    39757
21    35126
22     6671
23     7766
24     6174
25     6614
26     9563
27    30914
28    36975
29     6701
30     6357
31     8727
Name: amounts, dtype: int64

In [37]:
# Combine daily dish quantity and daily sales into one table, aligned on day
new_df = pd.concat([ser5, ser11], axis=1)
new_df

day,counts,amounts
1,233,9366
2,151,6125
3,192,6890
4,169,7549
5,224,8671
6,793,32167
7,761,31306
8,171,6532
9,167,7155
10,227,10231


In [None]:
new_df

In [50]:
new_df.amounts.values.tolist()

[9366,
 6125,
 6890,
 7549,
 8671,
 32167,
 31306,
 6532,
 7155,
 10231,
 7202,
 7448,
 32672,
 31347,
 10223,
 4278,
 9014,
 9539,
 10812,
 39757,
 35126,
 6671,
 7766,
 6174,
 6614,
 9563,
 30914,
 36975,
 6701,
 6357,
 8727]

In [133]:
from pyecharts import options as opts
from pyecharts.charts import Scatter
from pyecharts.commons.utils import JsCode
from pyecharts.faker import Faker

scatter = Scatter(init_opts=opts.InitOpts(theme='dark'))
scatter.add_xaxis(new_df.index.to_list())
scatter.add_yaxis("销售额", new_df.amounts.values.tolist(),  # the series is daily sales (amounts)
                  )
scatter.set_global_opts(
        title_opts=opts.TitleOpts(title="时间，订单量，销售额的关系"),
        legend_opts=opts.LegendOpts(is_show=True, border_width=0),
        toolbox_opts=opts.ToolboxOpts(is_show=True, orient='vertical', pos_left='right', pos_top='bottom'),
        visualmap_opts=opts.VisualMapOpts(max_=40000, min_=4000, range_color=Faker.visual_color,pos_right=0, pos_bottom='65%',textstyle_opts=opts.TextStyleOpts(color='white', font_size=10, vertical_align='right')),
        xaxis_opts=opts.AxisOpts(name='日期', name_location='end', name_gap=10, name_rotate=0,),
        yaxis_opts=opts.AxisOpts(name='销售额（元）', name_location='end', name_gap=10, name_rotate=0, max_=45000, split_number=10),
        tooltip_opts=opts.TooltipOpts(is_show=True, formatter='{b}号：{c}元'),
    )
scatter.render_notebook()
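Since the scatter shows sales by day only, a quick numeric check of how daily dish quantity and daily sales move together is the Pearson correlation. The sketch below uses the first ten rows of `new_df` as printed above; rebuilding them by hand into a frame named `first_ten` is an assumption made so the example runs on its own.

```python
import pandas as pd

# First ten days of new_df, copied from the printed output
first_ten = pd.DataFrame(
    {
        "counts":  [233, 151, 192, 169, 224, 793, 761, 171, 167, 227],
        "amounts": [9366, 6125, 6890, 7549, 8671, 32167, 31306, 6532, 7155, 10231],
    },
    index=pd.Index(range(1, 11), name="day"),
)

# Pearson correlation between daily quantity and daily sales
r = first_ten.counts.corr(first_ten.amounts)

# Average listed price per dish ordered, by day
per_dish = (first_ten.amounts / first_ten.counts).round(1)

print(round(r, 3))
print(per_dish.head(3).to_dict())
```

On the full table the same call would be `new_df.counts.corr(new_df.amounts)`.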

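By order (order_id):

        Most popular dish - 凉拌菠菜

        Most-ordered dishes (top 10) - '凉拌菠菜', '谷稻小庄 ', '麻辣小龙虾', '五色糯米饭(七色)', '芝士烩波士顿龙虾', '辣炒鱿鱼', '香酥两吃大虾',
       '焖猪手', '水煮鱼', '蒙古烤羊腿'

        Largest dish quantity - order 398, 34 dishes

        Largest total spend - order 1166, 1314 yuan

        Average spend - 1171.3 yuan (mean of the top 10 orders by total spend)

By date and time:

        Hours when ordering is concentrated - 18:00 to 21:00

        Day with the most dishes ordered - 2016-08-20

        Days of the week with the most diners - Saturday and Sunday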
