### Association Rules
* In this session we use association rules to find numbers that tend to appear together in lotto draws.

### 1. Loading the data

In [1]:
# Load the lotto draw data (one row per draw)
import pandas as pd
data = pd.read_csv('lotto.csv')
display(data)

Unnamed: 0,회차,1,2,3,4,5,6,보너스
0,1014,3,11,14,18,26,27,21
1,1013,21,22,26,34,36,41,32
2,1012,5,11,18,20,35,45,3
3,1011,1,9,12,26,35,38,42
4,1010,9,12,15,25,34,36,3
...,...,...,...,...,...,...,...,...
1009,5,16,24,29,40,41,42,3
1010,4,14,27,30,31,40,42,2
1011,3,11,16,19,21,27,31,30
1012,2,9,13,21,25,32,42,2


In [2]:
# Reshape to long format: one row per (draw, number), bonus number included
data_as = pd.melt(data, id_vars=['회차'], value_vars=['1', '2', '3', '4', '5', '6', '보너스'])
data_as

Unnamed: 0,회차,variable,value
0,1014,1,3
1,1013,1,21
2,1012,1,5
3,1011,1,1
4,1010,1,9
...,...,...,...
7093,5,보너스,3
7094,4,보너스,2
7095,3,보너스,30
7096,2,보너스,2
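
* The reshape above is easier to follow on a small input. The sketch below applies the same `pd.melt` call to only the two most recent draws from the first table (1014 and 1013); `wide` and `long_df` are names chosen here for illustration.

```python
import pandas as pd

# Two draws copied from the table above (wide format: one row per draw).
# '회차' is the draw number and '보너스' is the bonus number.
wide = pd.DataFrame({
    '회차': [1014, 1013],
    '1': [3, 21], '2': [11, 22], '3': [14, 26],
    '4': [18, 34], '5': [26, 36], '6': [27, 41],
    '보너스': [21, 32],
})

# Long format: each of the 7 number columns becomes its own row per draw.
long_df = pd.melt(wide, id_vars=['회차'],
                  value_vars=['1', '2', '3', '4', '5', '6', '보너스'])

print(long_df.shape)
print(long_df[long_df['회차'] == 1014]['value'].tolist())
```

* Each draw contributes 7 rows, which is why the full `data_as` has roughly 7 times as many rows as `data`.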


### 2. Environment setup

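* The first step of a machine learning experiment in PyCaret is to import the module for the task at hand (here, arules) and set up the environment.
* This exercise uses the pycaret.arules module for association rule mining.
* setup() takes the DataFrame (`data_as`), the transaction ID column ('회차', the draw number), and the item column ('value'), and initializes the association rule environment.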
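
* The roles of the transaction ID and item columns can be illustrated without PyCaret. The sketch below is not what setup() does internally; it only shows, under that caveat, how long-format rows group into one basket per draw. `to_baskets` is a hypothetical helper, and the rows are the two most recent draws from the first table.

```python
from collections import defaultdict

# (transaction id, item) pairs in long format: draws 1014 and 1013
# from the table in section 1, bonus number included.
long_rows = [
    (1014, 3), (1014, 11), (1014, 14), (1014, 18), (1014, 26), (1014, 27), (1014, 21),
    (1013, 21), (1013, 22), (1013, 26), (1013, 34), (1013, 36), (1013, 41), (1013, 32),
]

def to_baskets(rows):
    """Group items by transaction id (hypothetical helper, not a PyCaret API)."""
    baskets = defaultdict(set)
    for transaction_id, item in rows:
        baskets[transaction_id].add(item)
    return dict(baskets)

baskets = to_baskets(long_rows)
# Distinct items across all baskets, analogous to "# Items" in the setup summary.
items = set().union(*baskets.values())
print(len(baskets), "transactions,", len(items), "distinct items")
```

* On the full data, the same grouping corresponds to the `# Transactions` and `# Items` counts that setup() reports.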

In [3]:
from pycaret.arules import *

In [4]:
s = setup(data = data_as, transaction_id ='회차', item_id='value')

Description,Value
session_id,3911.0
# Transactions,1014.0
# Items,45.0
Ignore Items,


### 3. Creating the model

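* The model is created with the create_model() function. Here, rules are kept when their lift is at least 1.2 and their support is at least 0.03, with itemsets limited to at most 3 items.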

In [5]:
model = create_model(metric='lift', threshold=1.2, min_support=0.03, max_len=3)
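
# The thresholds in the call above can be illustrated with a plain-Python sketch.
# It is not PyCaret's implementation and it only considers pairs of single items
# (so it does not cover 3-item sets allowed by max_len=3). `pair_rules` and the
# toy baskets are hypothetical, chosen only for illustration. It may also serve as
# a fallback where pycaret.arules is unavailable (the module appears to have been
# dropped in PyCaret 3.x).

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.03, min_lift=1.2):
    """Return single-item rules x -> y that pass both thresholds, best lift first."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                      # P(a and b)
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):            # both rule directions
            confidence = count / item_counts[x]  # P(y | x)
            lift = confidence / (item_counts[y] / n)
            if lift >= min_lift:
                rules.append({"antecedent": x, "consequent": y,
                              "support": support, "confidence": confidence,
                              "lift": lift})
    return sorted(rules, key=lambda r: r["lift"], reverse=True)

# Hypothetical toy baskets (not lotto data), with looser thresholds so
# that a handful of rules survive.
toy = [{1, 2, 3}, {1, 2}, {1, 4}, {3, 4}, {2, 3}]
rules = pair_rules(toy, min_support=0.4, min_lift=1.0)
for r in rules:
    print(r)
```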

In [6]:
model

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
0,(25),(2),0.146,0.1607,0.0325,0.223,1.3871,0.0091,1.0801
1,(2),(25),0.1607,0.146,0.0325,0.2025,1.3871,0.0091,1.0708
2,(3),(20),0.1578,0.1637,0.0345,0.2188,1.3362,0.0087,1.0705
3,(20),(3),0.1637,0.1578,0.0345,0.2108,1.3362,0.0087,1.0672
4,(21),(11),0.1568,0.1578,0.0325,0.2075,1.3153,0.0078,1.0628
5,(11),(21),0.1578,0.1568,0.0325,0.2062,1.3153,0.0078,1.0623
6,(12),(24),0.1647,0.1598,0.0335,0.2036,1.2743,0.0072,1.055
7,(24),(12),0.1598,0.1647,0.0335,0.2099,1.2743,0.0072,1.0572
8,(7),(18),0.1558,0.1647,0.0325,0.2089,1.2682,0.0069,1.0558
9,(18),(7),0.1647,0.1558,0.0325,0.1976,1.2682,0.0069,1.0521
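
* The metric columns follow from the three support values. As a check, the sketch below recomputes confidence, lift, leverage, and conviction for row 0, the rule (25) -> (2), using the standard definitions and the rounded values printed above. Because the inputs are already rounded to four decimals, the results agree with the table only to about two or three decimals.

```python
# Rounded values copied from row 0 of the output above: rule (25) -> (2).
antecedent_support = 0.146   # share of draws containing 25
consequent_support = 0.1607  # share of draws containing 2
support = 0.0325             # share of draws containing both 25 and 2

# confidence: how often 2 appears among the draws that contain 25
confidence = support / antecedent_support
# lift: confidence relative to the baseline frequency of 2 (1.0 = independent)
lift = confidence / consequent_support
# leverage: observed co-occurrence minus what independence would predict
leverage = support - antecedent_support * consequent_support
# conviction: ratio of expected to observed frequency of "25 without 2"
conviction = (1 - consequent_support) / (1 - confidence)

print(f"confidence={confidence:.4f} lift={lift:.4f} "
      f"leverage={leverage:.4f} conviction={conviction:.4f}")
```

* Rows 0 and 1 share the same support, lift, and leverage because those measures are symmetric in the two items; only confidence and conviction depend on the rule's direction.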


In [7]:
# Show every row of the rule table instead of a truncated view
pd.set_option('display.max_rows', None)
model

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
0,(25),(2),0.146,0.1607,0.0325,0.223,1.3871,0.0091,1.0801
1,(2),(25),0.1607,0.146,0.0325,0.2025,1.3871,0.0091,1.0708
2,(3),(20),0.1578,0.1637,0.0345,0.2188,1.3362,0.0087,1.0705
3,(20),(3),0.1637,0.1578,0.0345,0.2108,1.3362,0.0087,1.0672
4,(21),(11),0.1568,0.1578,0.0325,0.2075,1.3153,0.0078,1.0628
5,(11),(21),0.1578,0.1568,0.0325,0.2062,1.3153,0.0078,1.0623
6,(12),(24),0.1647,0.1598,0.0335,0.2036,1.2743,0.0072,1.055
7,(24),(12),0.1598,0.1647,0.0335,0.2099,1.2743,0.0072,1.0572
8,(7),(18),0.1558,0.1647,0.0325,0.2089,1.2682,0.0069,1.0558
9,(18),(7),0.1647,0.1558,0.0325,0.1976,1.2682,0.0069,1.0521
