Let's add legends, text, and annotations to a plot
First steps 3: Adding legends, text, and annotations

Reference: https://docs.bokeh.org/en/latest/docs/first_steps/first_steps_3.html

![image-2.png](attachment:image-2.png)
![image-3.png](attachment:image-3.png)

# Rendering a chart with a legend

In [1]:
from bokeh.io import output_notebook
from bokeh.plotting import figure, show
output_notebook()

In [2]:
# prepare some data
x = [1, 2, 3, 4, 5]
y1 = [4, 5, 5, 7, 2]
y2 = [2, 3, 4, 5, 6]

In [3]:
# create a new plot
p = figure(title="Legend example")

Create a line chart and a circle chart, passing `legend_label` to each renderer.

In [4]:
# add line and circle renderers with legend_label arguments
line = p.line(x, y1, legend_label="Temp.", line_color="blue", line_width=2)
circle = p.circle(
    x,
    y2,
    legend_label="Objects",
    fill_color="red",
    fill_alpha=0.5,
    line_color="blue",
    size=80,
)
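
As a rough sketch of what `legend_label` does: each distinct label becomes one entry in the figure's legend. The snippet below rebuilds a smaller version of the plot and reads the legend entries back. The shortened data and the use of `p.scatter` in place of `p.circle` are assumptions made for illustration, not part of the original cells.

```python
from bokeh.plotting import figure

# Smaller stand-in for the plot above (data shortened for illustration)
p = figure(title="Legend example")
p.line([1, 2, 3], [4, 5, 5], legend_label="Temp.", line_color="blue")
p.scatter([1, 2, 3], [2, 3, 4], legend_label="Objects", fill_color="red", size=20)

# p.legend is a list-like wrapper; the first element is the Legend model
legend = p.legend[0]

# One legend entry per distinct legend_label, each tied to its renderer
n_entries = len(legend.items)
renderers_per_entry = [len(item.renderers) for item in legend.items]
print(n_entries, renderers_per_entry)
```

Reusing the same `legend_label` on several renderers would group them under a single entry, which is why the counts are checked per entry here.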

Set the legend's location, title, font, colors, and so on.

In [5]:
# display legend in top left corner (default is top right corner)
p.legend.location = "top_left"

# add a title to your legend
p.legend.title = "Observations"

# change appearance of legend text
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "navy"

# change border and background of legend
p.legend.border_line_width = 3
p.legend.border_line_color = "navy"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "navy"
p.legend.background_fill_alpha = 0.2

In [6]:
# show the results
show(p)

# Customizing the headline

Prepare the data

In [7]:
# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

Create a new plot

In [8]:
# create new plot
p = figure(title="Headline example")

Add a line chart

In [9]:
# add line renderer with a legend
p.line(x, y, legend_label="Temp.", line_width=2)

Show the plot with its default headline

In [10]:
# show the results
show(p)

The chart above shows the headline "Headline example" at the top of the plot. It can be changed as follows.

## Changing the headline

In [11]:
# create new plot
p = figure()

Add a line chart

In [12]:
# add line renderer with a legend
p.line(x, y, legend_label="Temp.", line_width=2)

Adjust the headline (title)

In [13]:
# change headline location to the left
p.title_location = "left"

# change headline text
p.title.text = "Changing headline text example"

# style the headline
p.title.text_font_size = "25px"
p.title.align = "right"
p.title.background_fill_color = "darkgrey"
p.title.text_color = "white"

In [14]:
# show the results
show(p)
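
A minimal sketch, assuming a figure created with a `title` as in the first headline example: the text passed to `figure()` is stored on `p.title`, so it can be read and changed on an existing figure instead of rebuilding the plot. The variable name `original_text` is introduced here only for illustration.

```python
from bokeh.plotting import figure

# Start from a figure that already has a headline
p = figure(title="Headline example")
original_text = p.title.text

# Change the headline in place
p.title.text = "Changing headline text example"
p.title_location = "left"  # side of the plot the headline sits on
p.title.align = "right"    # alignment of the text along that side

print(original_text, "->", p.title.text)
```

`title_location` is set on the figure, while `align` and the text styling live on the title object itself.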

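# Adding annotations

Annotations are visual elements that you add to a plot to make it easier to read. For details on the different kinds of annotations, see Adding Annotations in the user guide.

The example below uses box annotations to highlight specific regions of a plot.

## Basic plot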

In [15]:
from bokeh.models import BoxAnnotation

Prepare the data

In [16]:
import random

# generate some data (0-50 for x, random values for y)
x = list(range(0, 51))
y = random.sample(range(0, 100), 51)
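
Because `random.sample` draws different values on every run, the plots in this section will look different each time the notebook is executed. A small sketch of one way to make the data reproducible, assuming a fixed seed is acceptable (the seed value 42 is arbitrary and not part of the original notebook):

```python
import random

# Fix the seed so the same "random" y values are drawn on every run
random.seed(42)  # arbitrary value chosen for illustration
x = list(range(0, 51))
y = random.sample(range(0, 100), 51)

# Re-seeding with the same value reproduces the same sample
random.seed(42)
y_again = random.sample(range(0, 100), 51)
print(y == y_again)
```

`random.sample` draws without replacement, so the 51 y values are all distinct.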

Create the plot

In [17]:
# create new plot
p = figure(title="Box annotation example")

Add a line renderer

In [18]:
# add line renderer
line = p.line(x, y, line_color="#000000", line_width=2)

In [19]:
# show the results
show(p)

Create the plot

In [20]:
# create new plot
p = figure(title="Box annotation example")

Add a line renderer

In [21]:
# add line renderer
line = p.line(x, y, line_color="#000000", line_width=2)

Add box annotations

In [22]:
# add box annotations
low_box = BoxAnnotation(
    top=20, fill_alpha=0.2, fill_color="#F0E442")
mid_box = BoxAnnotation(
    bottom=20, top=80, fill_alpha=0.2, fill_color="#009E73")
high_box = BoxAnnotation(
    bottom=80, fill_alpha=0.2, fill_color="#F0E442")

# add boxes to existing figure
p.add_layout(low_box)
p.add_layout(mid_box)
p.add_layout(high_box)

In [23]:
# show the results
show(p)
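
The three boxes above follow a pattern: consecutive thresholds bound each band, and the outermost bands are open-ended. As a sketch of how that could be generalized, the helper below builds the bands from a list of thresholds. `make_bands` is a hypothetical function written for this example, not a Bokeh API.

```python
from bokeh.models import BoxAnnotation
from bokeh.plotting import figure


def make_bands(thresholds, colors, alpha=0.2):
    """Hypothetical helper: build horizontal bands split at `thresholds`."""
    if len(colors) != len(thresholds) + 1:
        raise ValueError("need exactly one color per band")
    bounds = [None, *thresholds, None]  # None marks an open-ended side
    bands = []
    for lower, upper, color in zip(bounds[:-1], bounds[1:], colors):
        kwargs = {"fill_alpha": alpha, "fill_color": color}
        # Only pass the sides that are actually bounded
        if lower is not None:
            kwargs["bottom"] = lower
        if upper is not None:
            kwargs["top"] = upper
        bands.append(BoxAnnotation(**kwargs))
    return bands


p = figure(title="Box annotation example")
bands = make_bands([20, 80], ["#F0E442", "#009E73", "#F0E442"])
for band in bands:
    p.add_layout(band)
```

With thresholds `[20, 80]` this reproduces the low, mid, and high boxes from the cell above.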

---

# Breast-cancer Example

Let's customize the legend, headline, and annotations using the breast cancer dataset.

In [24]:
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
data.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

In [25]:
import pandas as pd
df_data = pd.DataFrame(data['data'], columns=data['feature_names'])
df_data.head()

Unnamed: 0,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678
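
The `'frame'` key listed in the output of `data.keys()` hints at a shortcut. As a sketch, assuming a scikit-learn version that supports the `as_frame` parameter, the loader can return pandas objects directly, which avoids building the DataFrame by hand. The variable names `bunch` and `df_features` are introduced here for illustration.

```python
from sklearn.datasets import load_breast_cancer

# as_frame=True asks the loader to return pandas objects
bunch = load_breast_cancer(as_frame=True)

# bunch.data holds only the feature columns as a DataFrame
df_features = bunch.data
print(df_features.shape)
print(df_features[["mean radius", "mean texture"]].head())
```

With `as_frame=True`, `bunch.frame` additionally contains the `target` column alongside the features.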


Of the many attributes, let's visualize two of them, `'mean radius'` and `'mean texture'`, in a single plot.

In [26]:
samples = df_data.index
mean_radius = df_data['mean radius']
mean_texture = df_data['mean texture']

In [27]:
p = figure(title='Breast Cancer Histogram',
           x_axis_label='Attributes [mean radius, mean texture]',
           # np.histogram is called with density=True below, so the bars show densities
           y_axis_label='Density')

In [28]:
import numpy as np

radius_hist, radius_edges = np.histogram(mean_radius, density=True, bins=10)
texture_hist, texture_edges = np.histogram(mean_texture, density=True, bins=10)
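
Because `density=True` is passed, the bar heights returned by `np.histogram` are probability densities rather than raw counts: each height times its bin width gives the share of samples in that bin, and the shares sum to 1. The sketch below checks this on synthetic data. The normally distributed `values` array is a stand-in made up for this example, not the breast cancer data.

```python
import numpy as np

# Synthetic stand-in for a feature column (not the real dataset)
rng = np.random.default_rng(0)
values = rng.normal(loc=14.0, scale=3.5, size=569)

hist, edges = np.histogram(values, density=True, bins=10)

# 10 bins are described by 11 edges
widths = np.diff(edges)

# Height * width per bin sums to 1 when density=True
area = float(np.sum(hist * widths))
print(len(hist), len(edges), round(area, 6))
```

This is also why `edges[:-1]` and `edges[1:]` are used as the left and right sides of each `quad`: they pair up the 11 edges into 10 bars.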

In [29]:
radius = p.quad(top=radius_hist, bottom=0, 
                left=radius_edges[:-1], right=radius_edges[1:], 
                legend_label='mean radius',
                color='blue', line_color="white", alpha=0.5)
texture = p.quad(top=texture_hist, bottom=0, 
                 left=texture_edges[:-1], right=texture_edges[1:], 
                 legend_label='mean texture',
                 color='green', line_color="white", alpha=0.5)

### Adding annotations

In [30]:
from bokeh.models import BoxAnnotation

# add box annotations
left_box = BoxAnnotation(
    right=10, fill_alpha=0.2, fill_color='ivory')
mid_box = BoxAnnotation(
    left=10, right=28, fill_alpha=0.2, fill_color='pink')
right_box = BoxAnnotation(
    left=28, fill_alpha=0.2, fill_color='ivory')

# add boxes to existing figure
p.add_layout(left_box)
p.add_layout(mid_box)
p.add_layout(right_box)

### Customizing the legend

In [31]:
# display legend in top right corner (this is also the default)
p.legend.location = "top_right"

# add a title to your legend
p.legend.title = "features"

# change appearance of legend text
p.legend.label_text_font = "gothic"
p.legend.label_text_color = "black"

# change border and background of legend
p.legend.border_line_width = 1
p.legend.border_line_color = "gray"
p.legend.border_line_alpha = 0.8

### Headline

In [32]:
# place the headline above the plot
p.title_location = "above"

# change headline text
p.title.text = "Breast Cancer Histogram Example (Radius/Texture)"

# style the headline
p.title.text_font_size = "15px"
p.title.align = "center"
# p.title.background_fill_color = "lightgray"
p.title.text_color = "black"

In [33]:
show(p)

---

Leftovers from an attempt to draw the histogram with as few libraries as possible

In [34]:
# min_val = mean_radius.min()
# max_val = mean_radius.max()
# bins = 10


# data_range = max_val - min_val
# box_size = data_range / bins

# min_val, max_val, [x+box_size*(i+1) for i, x in enumerate(range(bins))]

In [35]:
# import numpy as np

# lvl_freqs = []

# for lvl in range(bins):
# #     print(lvl)
#     bottom = min_val+box_size*lvl
#     upper = min_val+box_size*(lvl+1)
# #     print(bottom, upper)
    
#     if lvl == bins-1:
#         lvl_freq = mean_radius.apply(lambda x: True if (x>=bottom) & (x<=upper) else False).sum()
#     else:
#         lvl_freq = mean_radius.apply(lambda x: True if (x>=bottom) & (x<upper) else False).sum()
    
#     lvl_freqs.append([bottom, lvl_freq])

# lvl_freqs = np.array(lvl_freqs)

In [36]:
# from matplotlib import pyplot as plt

# plt.bar(x=lvl_freqs[:, 0], height=lvl_freqs[:, 1], width=2.1)

In [37]:
# lvl_freqs[:, 0]

In [38]:
# p.line(samples, mean_compactness, legend_label="meanCompactness", color='black', line_width=2)
# p.line(samples, mean_concavity, legend_label="meanConcavity", color='blue', line_width=2)

# radius = p.vbar(
#     x=lvl_freqs[:, 0], top=lvl_freqs[:, 1], legend_label="Rate", width=0.5, bottom=0, color="red")

# radius = p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color="white")
# texture = p.vbar(x=x, top=y2, legend_label="Rate", width=0.5, bottom=0, color="red")
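
For comparison, the manual binning attempted in the commented-out cells above can be sketched in plain Python. This is one possible reading of that attempt, not the author's final method: `manual_histogram` is a hypothetical helper, and like the commented code it treats the last bin as closed on the right so the maximum value is counted.

```python
def manual_histogram(values, bins=10):
    """Hypothetical helper: count values in equal-width bins, no libraries."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot form bins")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    left_edges = [lo + width * i for i in range(bins)]
    return left_edges, counts


left_edges, counts = manual_histogram([1, 2, 2, 3, 10], bins=3)
print(left_edges, counts)
```

The left edges and counts could then be passed to a bar glyph, which is what the commented-out `p.vbar` call appears to have been aiming for.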