Let's take apart the source code in order to build the FFM model.

Source: https://github.com/shenweichen/DeepCTR

In [1]:
# Import modules
from itertools import chain
import tensorflow as tf

In [3]:
# Parameter settings
DEFAULT_GROUP_NAME = "default_group"

# utils.py


## Computation

 The computations(?) that the code relies on are defined here. ~~Are reduce_sum, reduce_max, and reduce_mean redefined depending on the pooling mode? Why not just use the tf built-ins?~~ 

* ~~reduce_sum~~
* ~~reduce_max~~ 
* ~~div~~
* ~~softmax~~
* ~~reduce_mean~~

In [22]:
import tensorflow as tf


def reduce_mean(input_tensor, axis=None, keep_dims=False, name=None, reduction_indices=None):
    try:
        return tf.reduce_mean(input_tensor, axis=axis, keep_dims=keep_dims, name=name, reduction_indices=reduction_indices)
    except TypeError:
        return tf.reduce_mean(input_tensor, axis=axis, keepdims=keep_dims, name=name)

def reduce_sum(input_tensor, axis=None, keep_dims=False, name=None, reduction_indices=None):
    try:
        return tf.reduce_sum(input_tensor, axis=axis, keep_dims=keep_dims, name=name, reduction_indices=reduction_indices)
    except TypeError:
        return tf.reduce_sum(input_tensor, axis=axis, keepdims=keep_dims, name=name)

def reduce_max(input_tensor, axis=None, keep_dims=False, name=None, reduction_indices=None):
    try:
        return tf.reduce_max(input_tensor, axis=axis, keep_dims=keep_dims, name=name, reduction_indices=reduction_indices)
    except TypeError:
        return tf.reduce_max(input_tensor, axis=axis, keepdims=keep_dims, name=name)

def div(x, y, name=None):
    try:
        return tf.div(x, y, name=name)
    except AttributeError:
        return tf.divide(x, y, name=name)

def softmax(logits, dim=-1, name=None):
    try:
        return tf.nn.softmax(logits, dim=dim, name=name)
    except TypeError:
        return tf.nn.softmax(logits, axis=dim, name=name)
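Judging from the code, the wrappers above look like compatibility shims between TensorFlow versions rather than anything tied to the pooling mode: each one tries the older spelling first (`keep_dims`, `dim`, `tf.div`) and falls back to the newer one when that raises. A minimal sketch of the same pattern in plain Python follows. `new_style_sum` and `compat_sum` are hypothetical names made up for this illustration, not part of TensorFlow or DeepCTR.

```python
def new_style_sum(values, keepdims=False):
    """Hypothetical stand-in for an API that only knows the newer `keepdims` keyword."""
    total = sum(values)
    return [total] if keepdims else total


def compat_sum(values, keep_dims=False):
    """Try the older keyword first; fall back to the newer one on TypeError."""
    try:
        return new_style_sum(values, keep_dims=keep_dims)
    except TypeError:
        # The older keyword was rejected, so retry with the newer spelling.
        return new_style_sum(values, keepdims=keep_dims)


result_scalar = compat_sum([1, 2, 3])
result_kept = compat_sum([1, 2, 3], keep_dims=True)
```

Callers keep using one signature (`keep_dims`) no matter which spelling the underlying function accepts.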

## Layer concat, add

- concat_func
- add_func

### Classes for concat and add

In [29]:
import tensorflow as tf


class Add(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(Add, self).__init__(**kwargs)

    def build(self, input_shape):
        super(Add, self).build(input_shape)  # must be called.

    def call(self, inputs, **kwargs):
        if not isinstance(inputs, list):
            return inputs
        if len(inputs) == 1:
            return inputs[0]
        if len(inputs) == 0:
            return tf.constant([[0.0]])

        return tf.keras.layers.add(inputs)


class NoMask(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(NoMask, self).__init__(**kwargs)

    def build(self, input_shape):
        # Be sure to call this somewhere!
        super(NoMask, self).build(input_shape)

    def call(self, x, mask=None, **kwargs):
        return x

    def compute_mask(self, inputs, mask):
        return None

In [30]:
def add_func(inputs):
    return Add()(inputs)

def concat_func(inputs, axis=-1, mask=False):
    if not mask:
        inputs = list(map(NoMask(), inputs))
    if len(inputs) == 1:
        return inputs[0]
    else:
        return tf.keras.layers.Concatenate(axis=axis)(inputs)

## combined_dnn_input

In [31]:
from tensorflow.keras.layers import Flatten


def combined_dnn_input(sparse_embedding_list, dense_value_list):
    if len(sparse_embedding_list) > 0 and len(dense_value_list) > 0:
        sparse_dnn_input = Flatten()(concat_func(sparse_embedding_list))
        dense_dnn_input = Flatten()(concat_func(dense_value_list))
        return concat_func([sparse_dnn_input, dense_dnn_input])
    elif len(sparse_embedding_list) > 0:
        return Flatten()(concat_func(sparse_embedding_list))
    elif len(dense_value_list) > 0:
        return Flatten()(concat_func(dense_value_list))
    else:
        raise NotImplementedError("dnn_feature_columns must not be empty.")

# Layer classes

- sequence: SequencePoolingLayer, WeightedSequenceLayer



## SequencePoolingLayer

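> Originally in layers.sequence.

 Inherits from the Keras layer. Applies a sum, mean, or max pooling operation to variable-length sequences or multi-value features. 
 
1. Input: [seq_value, seq_len] : a list of two(??) tensors holding the values and the lengths of the sequences
    - seq_value : `(batch_size, T, embedding_size)`
    - seq_len : `(batch_size, 1)`

2. Output: a 3D tensor of shape `(batch_size, 1, embedding_size)`
    - Is it turned back into 3D after the pooling operation??

3. Arguments
    - mode: the kind of pooling operation (mean, max, sum)
    - supports_masking: if `True`, the input must support masking.

<br>

 The behavior depends on whether masking is supported. With masking support, the mask handed in by Keras is cast to float and given an extra dimension. Without it, the mask is built from the length input with the TensorFlow built-in `sequence_mask`, and only the axes need to be swapped. Applying the mask through `tile` works the same way as the masking we looked at before. For max pooling, a large number (`(1 - mask) * 1e9`) is subtracted at the padded positions so that they can never be the maximum.

<br>

 Finally an axis is expanded to make the result 3D!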
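The masking-then-pooling logic described above can be sketched without TensorFlow for a single sample. This is only an illustration under simplifying assumptions (nested Python lists instead of tensors, one sequence instead of a batch). The function name `sequence_pool` is hypothetical.

```python
def sequence_pool(seq_value, seq_len, mode="mean", eps=1e-8):
    """Pool one padded sequence of embedding vectors down to a single vector.

    seq_value: list of T embedding vectors (lists of floats), padded to length T.
    seq_len:   number of valid (non-padded) positions at the start of the sequence.
    """
    if mode not in ("sum", "mean", "max"):
        raise ValueError("mode must be sum, mean or max")
    embedding_size = len(seq_value[0])
    # 1.0 for valid positions, 0.0 for padding (the role tf.sequence_mask plays).
    mask = [1.0 if t < seq_len else 0.0 for t in range(len(seq_value))]

    if mode == "max":
        # Push padded positions far below any real value so they never win the max.
        shifted = [[v - (1.0 - m) * 1e9 for v in vec] for vec, m in zip(seq_value, mask)]
        pooled = [max(vec[d] for vec in shifted) for d in range(embedding_size)]
    else:
        # Zero out padded positions, then sum over the time axis.
        pooled = [sum(vec[d] * m for vec, m in zip(seq_value, mask)) for d in range(embedding_size)]
        if mode == "mean":
            pooled = [p / (seq_len + eps) for p in pooled]
    # Keep a length-1 time axis, mirroring the (batch_size, 1, embedding_size) output.
    return [pooled]


padded = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # the last row is padding
pooled_sum = sequence_pool(padded, seq_len=2, mode="sum")
pooled_max = sequence_pool(padded, seq_len=2, mode="max")
pooled_mean = sequence_pool(padded, seq_len=2, mode="mean")
```

The padded row `[100.0, 100.0]` influences none of the three results, which is the whole point of the mask.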

In [24]:
from tensorflow.keras.layers import Layer
import tensorflow as tf

class SequencePoolingLayer(Layer):

    def __init__(self, mode='mean', supports_masking=False, **kwargs):
        if mode not in ['sum', 'mean', 'max']:
            raise ValueError("The pooling mode must be one of sum, mean, max.")
        # Initialize the base Layer before setting attributes on self.
        super(SequencePoolingLayer, self).__init__(**kwargs)
        self.mode = mode
        self.eps = tf.constant(1e-8, dtype=tf.float32)
        self.supports_masking = supports_masking
    
    def build(self, input_shape):
        if not self.supports_masking:
            self.seq_len_max = int(input_shape[0][1])
        super(SequencePoolingLayer, self).build(input_shape)  # note: this must be called.
    
    def call(self, seq_value_len_list, mask=None, **kwargs):
        if self.supports_masking:
            if mask is None:
                raise ValueError("When supports_masking=True, the input must support masking.")
            uiseq_embed_list = seq_value_len_list
            mask = tf.cast(mask, dtype=tf.float32)  # tf.to_float(mask)
            user_behavior_length = reduce_sum(mask, axis=-1, keep_dims=True)
            mask = tf.expand_dims(mask, axis=2)
        else:
            uiseq_embed_list, user_behavior_length = seq_value_len_list
            mask = tf.sequence_mask(user_behavior_length, self.seq_len_max, dtype=tf.float32)
            # (batch_size, 1, T) -> (batch_size, T, 1) so the mask lines up with the embeddings.
            mask = tf.transpose(mask, (0, 2, 1))
        
        embedding_size = uiseq_embed_list.shape[-1]
        mask = tf.tile(mask, [1, 1, embedding_size])

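        if self.mode == 'max':
            hist = uiseq_embed_list - (1 - mask) * 1e9
            return reduce_max(hist, 1, keep_dims=True)

        # Zero out padded positions and sum over the time axis (shared by sum and mean).
        hist = reduce_sum(uiseq_embed_list * mask, 1, keep_dims=False)

        if self.mode == 'mean':
            hist = div(hist, tf.cast(user_behavior_length, dtype=tf.float32) + self.eps)

        hist = tf.expand_dims(hist, axis=1)
        return hist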
    
    def compute_output_shape(self, input_shape):
        if self.supports_masking:
            return (None, 1, input_shape[-1])
        else:
            return (None, 1, input_shape[0][-1])
    
    def compute_mask(self, inputs, mask):
        return None
    
    def get_config(self, ):
        config = {'mode': self.mode,
                  'supports_masking': self.supports_masking}
        base_config = super(SequencePoolingLayer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

## WeightedSequenceLayer

> Originally in layers.sequence.


 Inherits from the Keras layer. Applies weight scores to variable-length sequences or multi-value features.
 
1. Input: [seq_value, seq_len, seq_weight] : a list of three tensors holding the values, lengths, and weights of the sequences.
    - seq_value: `(batch_size, T, embedding_size)`
    - seq_len: `(batch_size, 1)`
    - seq_weight: `(batch_size, T, 1)`

2. Output: a 3D tensor of shape `(batch_size, T, embedding_size)`
    - No pooling happens in this layer; `compute_output_shape` returns the shape of seq_value unchanged.

3. Arguments
    - weight_normalization: whether to normalize the weights before applying them.
    - supports_masking: if `True`, the input must support masking.

<br>

 The behavior depends on whether masking is supported and on whether weight_normalization is set. The weight normalization is a softmax. 
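To see how the padding value interacts with the softmax, here is a plain-Python sketch for a single sequence. It assumes scalar weights per time step. The name `weighted_sequence` is hypothetical, and subtracting the maximum before `exp` is an extra numerical-stability step added for this sketch.

```python
import math


def weighted_sequence(seq_value, seq_len, seq_weight, weight_normalization=True):
    """Scale each embedding vector in one padded sequence by its weight score."""
    steps = len(seq_value)
    valid = [t < seq_len for t in range(steps)]
    # Padded positions get a very negative score (softmax case) or zero (raw case).
    padding = float(-2 ** 32 + 1) if weight_normalization else 0.0
    scores = [w if ok else padding for w, ok in zip(seq_weight, valid)]

    if weight_normalization:
        # Softmax over the time axis; subtracting the max keeps exp() stable.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        scores = [e / total for e in exps]

    # Broadcast each scalar score across the embedding dimension.
    return [[v * s for v in vec] for vec, s in zip(seq_value, scores)]


values = [[1.0, 1.0], [2.0, 2.0], [9.0, 9.0]]  # the last row is padding
weighted = weighted_sequence(values, seq_len=2, seq_weight=[0.0, 0.0, 5.0])
raw = weighted_sequence(values, seq_len=2, seq_weight=[2.0, 3.0, 5.0],
                        weight_normalization=False)
```

With normalization on, the padded step ends up with weight 0 because its score is hugely negative before the softmax. With normalization off, it is simply multiplied by 0.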

In [25]:
class WeightedSequenceLayer(Layer):

    def __init__(self, weight_normalization=True, supports_masking=False, **kwargs):
        super(WeightedSequenceLayer, self).__init__(**kwargs)
        self.weight_normalization = weight_normalization
        self.supports_masking = supports_masking

    def build(self, input_shape):
        if not self.supports_masking:
            self.seq_len_max = int(input_shape[0][1])
        super(WeightedSequenceLayer, self).build(input_shape)  # note: this must be called.

    def call(self, input_list, mask=None, **kwargs):
        if self.supports_masking:
            if mask is None:
                raise ValueError("When supports_masking=True, the input must support masking.")
            key_input, value_input = input_list
            mask = tf.expand_dims(mask[0], axis=2)
        else:
            key_input, key_length_input, value_input = input_list
            mask = tf.sequence_mask(key_length_input, self.seq_len_max, dtype=tf.bool)
            mask = tf.transpose(mask, (0, 2, 1))

        embedding_size = key_input.shape[-1]

        if self.weight_normalization:
            paddings = tf.ones_like(value_input) * (-2 ** 32 + 1)
        else:
            paddings = tf.zeros_like(value_input)
        value_input = tf.where(mask, value_input, paddings)

        if self.weight_normalization:
            value_input = softmax(value_input, dim=1)

        if len(value_input.shape) == 2:
            value_input = tf.expand_dims(value_input, axis=2)
            value_input = tf.tile(value_input, [1, 1, embedding_size])

        return tf.multiply(key_input, value_input)

    def compute_output_shape(self, input_shape):
        return input_shape[0]

    def compute_mask(self, inputs, mask):
        if self.supports_masking:
            return mask[0]
        else:
            return None

    def get_config(self, ):
        config = {'weight_normalization': self.weight_normalization, 
                  'supports_masking': self.supports_masking}
        base_config = super(WeightedSequenceLayer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

##  FM Layer

> Originally in layers.interaction.


In [None]:
from tensorflow.keras.layers import Layer


class FM(Layer):
    
    def __init__(self, **kwargs):
        super(FM, self).__init__(**kwargs)
    
    def build(self, input_shape):
        if len(input_shape) != 3:
            raise ValueError("Unexpected input dimensions %d, "
                             "expected 3 dimensions" % (len(input_shape)))
        super(FM, self).build(input_shape)  # must be called.

'''
Pick up again from here!
'''
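The FM class above stops after `build`. For reference while finishing it: a Factorization Machine layer's second-order term is commonly computed with the "square of sum minus sum of squares" identity instead of looping over every pair of fields. The sketch below checks that identity in plain Python for one sample. It is not the notebook's or DeepCTR's code, and both function names are hypothetical.

```python
from itertools import combinations


def fm_cross_term(embeddings):
    """0.5 * sum_d ((sum_i v_id)^2 - sum_i v_id^2) for one sample's field embeddings."""
    embedding_size = len(embeddings[0])
    total = 0.0
    for d in range(embedding_size):
        column = [vec[d] for vec in embeddings]
        square_of_sum = sum(column) ** 2
        sum_of_square = sum(x * x for x in column)
        total += square_of_sum - sum_of_square
    return 0.5 * total


def pairwise_dot_sum(embeddings):
    """Reference: explicit sum of dot products over all pairs of fields."""
    return sum(
        sum(a * b for a, b in zip(u, v))
        for u, v in combinations(embeddings, 2)
    )


fields = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 fields, embedding_size 2
fast = fm_cross_term(fields)
slow = pairwise_dot_sum(fields)
```

Both functions give the same number, but the first needs only sums along the field axis, which is why the layer expects a 3D `(batch_size, field_size, embedding_size)` input.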

# Feature classes


1. SparseFeat
    - Must be an int type.

2. DenseFeat
    - Must be a float type.

3. VarLenSparseFeat
    - ~~The properties seem to be defined for attribute access because the attributes need to be changed.~~ Hmm, apparently not?

In [32]:
from collections import namedtuple
from tensorflow.keras.initializers import RandomNormal, Zeros


class SparseFeat(namedtuple('SparseFeat',
                            ['name', 'vocabulary_size', 'embedding_dim', 'use_hash', 'dtype',
                             'embeddings_initializer', 'embedding_name', 'group_name', 
                             'trainable'])):
    __slots__ = ()

    def __new__(cls, name, vocabulary_size, embedding_dim=4, use_hash=False, dtype='int32',
                embeddings_initializer=None, embedding_name=None, group_name=DEFAULT_GROUP_NAME,
                trainable=False):        

        print("SparseFeat: __new__ method called")

        if embedding_dim == 'auto':
            embedding_dim = 6 * int(pow(vocabulary_size, 0.25))
        if embeddings_initializer is None:
            embeddings_initializer = RandomNormal(mean=0.0, stddev=0.0001, seed=2020)
        if embedding_name is None:
            embedding_name = name
        
        return super(SparseFeat, cls).__new__(cls, name, vocabulary_size, embedding_dim, use_hash, dtype,
                                              embeddings_initializer, embedding_name, group_name,
                                              trainable)
        
    def __hash__(self):
        return self.name.__hash__()


class DenseFeat(namedtuple('DenseFeat',
                           ['name', 'dimension', 'dtype'])):
    __slots__ = ()

    def __new__(cls, name, dimension=1, dtype='float32'):
        return super(DenseFeat, cls).__new__(cls, name, dimension, dtype)
    
    def __hash__(self):
        return self.name.__hash__()


class VarLenSparseFeat(namedtuple('VarLenSparseFeat',
                                  ['sparsefeat', 'maxlen', 'combiner', 'length_name', 'weight_name', 'weight_norm'])):
    __slots__ = ()

    def __new__(cls, sparsefeat, maxlen, combiner='mean', length_name=None, weight_name=None, weight_norm=True):
        return super(VarLenSparseFeat, cls).__new__(cls, sparsefeat, maxlen, combiner, length_name, weight_name, weight_norm)

    @property
    def name(self):
        return self.sparsefeat.name

    @property
    def vocabulary_size(self):
        return self.sparsefeat.vocabulary_size
    
    @property
    def embedding_dim(self):
        return self.sparsefeat.embedding_dim
    
    @property
    def use_hash(self):
        return self.sparsefeat.use_hash
    
    @property
    def dtype(self):
        return self.sparsefeat.dtype
    
    @property
    def embeddings_initializer(self):
        return self.sparsefeat.embeddings_initializer
    
    @property
    def embedding_name(self):
        return self.sparsefeat.embedding_name
    
    @property
    def group_name(self):
        return self.sparsefeat.group_name
    
    @property
    def trainable(self):
        return self.sparsefeat.trainable
    
    def __hash__(self):
        return self.name.__hash__()
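The feature classes above all follow one pattern: subclass a `namedtuple`, resolve defaults in `__new__`, and hash by name. A cut-down sketch of that pattern, runnable without TensorFlow, is below. `TinySparseFeat` is a hypothetical three-field stand-in for `SparseFeat`, kept small to show how the `'auto'` embedding size is worked out.

```python
from collections import namedtuple


class TinySparseFeat(namedtuple("TinySparseFeat",
                                ["name", "vocabulary_size", "embedding_dim"])):
    """Hypothetical cut-down version of SparseFeat with only three fields."""
    __slots__ = ()

    def __new__(cls, name, vocabulary_size, embedding_dim=4):
        # Defaults and derived values are resolved before the tuple is created,
        # because a namedtuple cannot be modified afterwards.
        if embedding_dim == "auto":
            embedding_dim = 6 * int(pow(vocabulary_size, 0.25))
        return super(TinySparseFeat, cls).__new__(cls, name, vocabulary_size, embedding_dim)

    def __hash__(self):
        # Hash by name only, as the classes above do.
        return hash(self.name)


default_feat = TinySparseFeat("user_id", vocabulary_size=1000)
auto_feat = TinySparseFeat("item_id", vocabulary_size=1000, embedding_dim="auto")
```

For a vocabulary of 1000, the fourth root is about 5.62, which truncates to 5, so the `'auto'` size comes out as 6 * 5 = 30.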

# Input Feature building

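- How an input feature is built depends on whether the column is a sparsefeat, a densefeat, or a varlensparsefeat.
    - For variable-length sparse features, the input is sized to the max length.
    - Check whether this could be switched to batch_shape.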

In [10]:
from collections import OrderedDict
from tensorflow.keras.layers import Input


def build_input_features(feature_columns, prefix=''):
    input_features = OrderedDict()
    for fc in feature_columns:
        if isinstance(fc, SparseFeat):
            input_features[fc.name] = Input(shape=(1, ), name=prefix+fc.name, dtype=fc.dtype)
        elif isinstance(fc, DenseFeat):
            input_features[fc.name] = Input(shape=(fc.dimension, ), name=prefix+fc.name, dtype=fc.dtype)
        elif isinstance(fc, VarLenSparseFeat):
            input_features[fc.name] = Input(shape=(fc.maxlen,), name=prefix+fc.name, dtype=fc.dtype)
            if fc.weight_name is not None:
                input_features[fc.weight_name] = Input(shape=(fc.maxlen, 1), name=prefix+fc.weight_name, dtype='float32')
            if fc.length_name is not None:
                input_features[fc.length_name] = Input(shape=(1, ), name=prefix+fc.length_name, dtype='int32')
        else:
            raise TypeError("Invalid feature column. Current feature column type: {}".format(type(fc)))
    return input_features
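As a quick way to see which inputs (and shapes) `build_input_features` would create, the branching can be mirrored without Keras. This is only a sketch: the three namedtuples are hypothetical stand-ins carrying just the fields the shape logic reads, and the function returns shape tuples rather than `Input` tensors.

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-ins that carry only the fields the shape logic reads.
Sparse = namedtuple("Sparse", ["name"])
Dense = namedtuple("Dense", ["name", "dimension"])
VarLen = namedtuple("VarLen", ["name", "maxlen", "weight_name", "length_name"])


def input_shapes(feature_columns):
    """Return the per-sample shape each feature column would get, keyed by input name."""
    shapes = OrderedDict()
    for fc in feature_columns:
        if isinstance(fc, Sparse):
            shapes[fc.name] = (1,)
        elif isinstance(fc, Dense):
            shapes[fc.name] = (fc.dimension,)
        elif isinstance(fc, VarLen):
            shapes[fc.name] = (fc.maxlen,)
            # Optional companion inputs for the weights and the true length.
            if fc.weight_name is not None:
                shapes[fc.weight_name] = (fc.maxlen, 1)
            if fc.length_name is not None:
                shapes[fc.length_name] = (1,)
        else:
            raise TypeError("Unknown feature column type: {}".format(type(fc)))
    return shapes


shapes = input_shapes([
    Sparse("user_id"),
    Dense("price", 3),
    VarLen("history", 20, weight_name=None, length_name="history_len"),
])
```

A variable-length column can therefore contribute up to three inputs: the padded ids, the weights, and the length.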

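# Building the input from feature columns

 The input values are built with the functions defined in `inputs.py`.



- Use filter to check which kind each column is and return the matches as lists. If the feature_columns argument is empty, the result is an empty list.
- Build the embedding matrix with the `create_embedding_matrix` function, then
- use embedding_lookup to find each feature's embedding in the dictionary built above. This only covers the sparse features, though.
- Dense features are collected with get_dense_input. After that, the variable-length sparse features are looked up and pooled with functions that work much like the ones above.
- If dense is not supported but a dense column is present, that is a mistake, so an error is raised.
- At the end everything is merged with mergeDict. If grouping is not wanted, the result is flattened with chain.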

## inputs.py

 The functions needed to build the input values are defined here.

- ~~`create_embedding_dict`~~
- ~~`create_embedding_matrix`~~
- ~~`embedding_lookup`~~
- ~~`get_dense_input`~~
- ~~`varlen_embedding_lookup`~~
- ~~`get_varlen_pooling_list`~~
- ~~`mergeDict`~~

### create_embedding_dict

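 Creates embeddings for the sparse features and the varlen sparse features. Each one is a Keras Embedding layer; input_dim and output_dim are already fixed on the feature, and whether the layer is trainable follows the feature's trainable setting. The configured embedding layers are stored in a dictionary.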

 


In [17]:
from tensorflow.keras.layers import Embedding
from tensorflow.keras.regularizers import l2


def create_embedding_dict(sparse_feature_columns, varlen_sparse_feature_columns, seed, l2_reg,
                          prefix='sparse_', seq_mask_zero=True):
    sparse_embedding = {}
    for feat in sparse_feature_columns:
        emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                        embeddings_initializer=feat.embeddings_initializer,
                        embeddings_regularizer=l2(l2_reg),
                        name=prefix+'_emb_'+feat.embedding_name)
        emb.trainable = feat.trainable
        sparse_embedding[feat.embedding_name] = emb
    
    if varlen_sparse_feature_columns and len(varlen_sparse_feature_columns) > 0:
        for feat in varlen_sparse_feature_columns:
            emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                            embeddings_initializer=feat.embeddings_initializer,
                            embeddings_regularizer=l2(l2_reg),
                            name=prefix+'_seq_emb_'+feat.name,
                            mask_zero=seq_mask_zero)
            emb.trainable = feat.trainable
            sparse_embedding[feat.embedding_name] = emb
    return sparse_embedding

### create_embedding_matrix

- This uses the classes defined in `feature_column`. 
- Since this is a notebook for now, the import can be skipped.
- Originally it has to be written as below.
```
from . import feature_column as fc_lib

sparse_feature_columns = list(
    filter(lambda x: isinstance(x, fc_lib.SparseFeat), feature_columns)) if feature_columns else []
```

In [18]:
def create_embedding_matrix(feature_columns, l2_reg, seed, prefix='', seq_mask_zero=True):
    sparse_feature_columns = list(
        filter(lambda x: isinstance(x, SparseFeat), feature_columns)) if feature_columns else []
    varlen_sparse_feature_columns = list(
        filter(lambda x: isinstance(x, VarLenSparseFeat), feature_columns)) if feature_columns else []
    sparse_emb_dict = create_embedding_dict(sparse_feature_columns, varlen_sparse_feature_columns, seed,
                                            l2_reg, prefix=prefix + 'sparse', seq_mask_zero=seq_mask_zero)
    return sparse_emb_dict

### embedding_lookup / varlen_embedding_lookup

 Looks up the embedding using either the hash or the feature name.

 - embedding_lookup collects the results in a defaultdict and flattens them into a list.
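The grouping and flattening step can be tried in isolation. In this sketch, strings stand in for the embedding tensors, and the `(feature name, group name)` pairs are hypothetical stand-ins for feature columns.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical (feature name, group name) pairs standing in for feature columns.
columns = [
    ("user_id", "default_group"),
    ("item_id", "default_group"),
    ("query", "search_group"),
]

group_embedding_dict = defaultdict(list)
for feature_name, group_name in columns:
    # A string stands in for the embedding tensor that the lookup would return.
    group_embedding_dict[group_name].append("emb(" + feature_name + ")")

# to_list=True case: concatenate every group's list into one flat list.
flat = list(chain.from_iterable(group_embedding_dict.values()))
```

`defaultdict(list)` removes the need to check whether a group already exists, and `chain.from_iterable` joins the per-group lists without nesting.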

In [19]:
from collections import defaultdict
from itertools import chain


def embedding_lookup(sparse_embedding_dict, sparse_input_dict, sparse_feature_columns, return_feat_list=(),
                     mask_feat_list=(), to_list=False):
    # NOTE: Hash is not defined in this notebook; it refers to the Hash layer in DeepCTR's layers.utils.
    group_embedding_dict = defaultdict(list)
    for fc in sparse_feature_columns:
        feature_name = fc.name
        embedding_name = fc.embedding_name
        if len(return_feat_list) == 0 or feature_name in return_feat_list:
            if fc.use_hash:
                lookup_idx = Hash(fc.vocabulary_size, mask_zero=(feature_name in mask_feat_list))(
                    sparse_input_dict[feature_name])
            else:
                lookup_idx = sparse_input_dict[feature_name]

            group_embedding_dict[fc.group_name].append(sparse_embedding_dict[embedding_name](lookup_idx))
    # Return only after every column has been processed.
    if to_list:
        return list(chain.from_iterable(group_embedding_dict.values()))
    return group_embedding_dict

In [20]:
def varlen_embedding_lookup(embedding_dict, sequence_input_dict, varlen_sparse_feature_columns):
    varlen_embedding_vec_dict = {}
    for fc in varlen_sparse_feature_columns:
        feature_name = fc.name
        embedding_name = fc.embedding_name
        if fc.use_hash:
            lookup_idx = Hash(fc.vocabulary_size, mask_zero=True)(sequence_input_dict[feature_name])
        else:
            lookup_idx = sequence_input_dict[feature_name]
        varlen_embedding_vec_dict[feature_name] = embedding_dict[embedding_name](lookup_idx)
    return varlen_embedding_vec_dict

### get_dense_input


- This uses the classes defined in `feature_column`. 
- Since this is a notebook for now, the import can be skipped.
- Originally it has to be written as below.
```
from . import feature_column as fc_lib

dense_feature_columns = list(
    filter(lambda x: isinstance(x, fc_lib.DenseFeat), feature_columns)) if feature_columns else []
```

In [21]:
def get_dense_input(features, feature_columns):
    dense_feature_columns = list(
        filter(lambda x: isinstance(x, DenseFeat), feature_columns)) if feature_columns else []
    dense_input_list = [features[fc.name] for fc in dense_feature_columns]
    return dense_input_list

### get_varlen_pooling_list

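 Pools the variable-length sequences and returns them as a list.

 - If weight_name is set, WeightedSequenceLayer is used;
 - if weight_name is not set, no weights are applied and the embedding is used as-is.
<br>

 Either way, the result then goes through SequencePoolingLayer.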

In [26]:
from collections import defaultdict


def get_varlen_pooling_list(embedding_dict, features, varlen_sparse_columns, to_list=False):
    pooling_vec_list = defaultdict(list)
    for fc in varlen_sparse_columns:
        feature_name = fc.name
        combiner = fc.combiner
        feature_length_name = fc.length_name
        if feature_length_name is not None:
            if fc.weight_name is not None:
                seq_input = WeightedSequenceLayer(weight_normalization=fc.weight_norm)(
                    [embedding_dict[feature_name], features[feature_length_name], features[fc.weight_name]])
            else:
                seq_input = embedding_dict[feature_name]
            vec = SequencePoolingLayer(combiner, supports_masking=False)(
                [seq_input, features[feature_length_name]])
        else:
            if fc.weight_name is not None:
                seq_input = WeightedSequenceLayer(weight_normalization=fc.weight_norm, supports_masking=True)(
                    [embedding_dict[feature_name], features[fc.weight_name]])
            else:
                seq_input = embedding_dict[feature_name]
            vec = SequencePoolingLayer(combiner, supports_masking=True)(seq_input)
        
        pooling_vec_list[fc.group_name].append(vec)
        
    if to_list:
        return chain.from_iterable(pooling_vec_list.values())
    return pooling_vec_list

### mergeDict

In [27]:
from collections import defaultdict


def mergeDict(a, b):
    c = defaultdict(list)
    for k, v in a.items():
        c[k].extend(v)
    for k, v in b.items():
        c[k].extend(v)
    return c
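A small usage check for `mergeDict`, with strings standing in for embedding tensors. The function is repeated so that the snippet runs on its own, and the group names are made up for the example.

```python
from collections import defaultdict


def mergeDict(a, b):
    # Same logic as the cell above, repeated so this sketch runs on its own.
    c = defaultdict(list)
    for k, v in a.items():
        c[k].extend(v)
    for k, v in b.items():
        c[k].extend(v)
    return c


sparse_groups = {"default_group": ["emb_a", "emb_b"]}
varlen_groups = {"default_group": ["pooled_c"], "history_group": ["pooled_d"]}
merged = mergeDict(sparse_groups, varlen_groups)
```

Lists under a shared key are concatenated in order (`a` first, then `b`), and neither input dictionary is modified, because the values are copied into fresh lists.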

## Back to building the input.

 The embedded results are merged. If grouping is not allowed, they are flattened and returned as a list.

In [28]:
def input_from_feature_columns(features, feature_columns, l2_reg, seed, prefix='', seq_mask_zero=True,
                               support_dense=True, support_group=False):
    
    sparse_feature_columns = list(
        filter(lambda x: isinstance(x, SparseFeat), feature_columns)) if feature_columns else []
    varlen_sparse_feature_columns = list(
        filter(lambda x: isinstance(x, VarLenSparseFeat), feature_columns)) if feature_columns else []
    
    embedding_matrix_dict = create_embedding_matrix(feature_columns, l2_reg, seed, prefix=prefix, 
                                                    seq_mask_zero=seq_mask_zero)

    group_sparse_embedding_dict = embedding_lookup(embedding_matrix_dict, features, sparse_feature_columns)
    dense_value_list = get_dense_input(features, feature_columns)

    if not support_dense and len(dense_value_list) > 0:
        raise ValueError("DenseFeat cannot be used in the DNN.")
    
    seq_embed_dict = varlen_embedding_lookup(embedding_matrix_dict, features, varlen_sparse_feature_columns)
    group_varlen_sparse_embedding_dict = get_varlen_pooling_list(seq_embed_dict, features,
                                                                 varlen_sparse_feature_columns)
    group_embedding_dict = mergeDict(group_sparse_embedding_dict, group_varlen_sparse_embedding_dict)
    if not support_group:
        group_embedding_dict = list(chain.from_iterable(group_embedding_dict.values()))
    return group_embedding_dict, dense_value_list

# Computing the linear logit

- Is it fine without deepcopy?
- Change the namedtuple fields with `_replace`.
- The field has to be changed and then assigned back, so enumerate probably won't do.
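The three notes above can be checked with plain namedtuples. In this sketch, `Sparse` and `VarLen` are hypothetical two-field stand-ins for `SparseFeat` and `VarLenSparseFeat`. It shows that a shallow `copy` of the list is enough, because `_replace` builds a new tuple and never touches the old one.

```python
from collections import namedtuple
from copy import copy

# Hypothetical two-field stand-ins for SparseFeat and VarLenSparseFeat.
Sparse = namedtuple("Sparse", ["name", "embedding_dim"])
VarLen = namedtuple("VarLen", ["sparsefeat", "maxlen"])

original = [Sparse("user_id", 8), VarLen(Sparse("history", 8), 20)]
linear = copy(original)  # shallow copy: a new list holding the same tuples

for i in range(len(linear)):
    if isinstance(linear[i], Sparse):
        # _replace returns a new tuple, so it has to be assigned back by index.
        linear[i] = linear[i]._replace(embedding_dim=1)
    elif isinstance(linear[i], VarLen):
        # embedding_dim lives on the wrapped tuple, so replace it there.
        linear[i] = linear[i]._replace(
            sparsefeat=linear[i].sparsefeat._replace(embedding_dim=1))
```

After the loop, `original` still holds the 8-dimensional features while `linear` holds the 1-dimensional copies. Calling `_replace(embedding_dim=1)` directly on the wrapper would raise, since `embedding_dim` is not one of its fields.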

In [None]:
from copy import copy


def get_linear_logit(features, feature_columns, units=1, use_bias=False, seed=1024, prefix='linear', l2_reg=0):
    linear_feature_columns = copy(feature_columns)
    for i in range(len(linear_feature_columns)):
        if isinstance(linear_feature_columns[i], SparseFeat):
            linear_feature_columns[i] = linear_feature_columns[i]._replace(embedding_dim=1, 
                                                                           embeddings_initializer=Zeros())
        if isinstance(linear_feature_columns[i], VarLenSparseFeat):
            # embedding_dim and embeddings_initializer are fields of the wrapped SparseFeat.
            linear_feature_columns[i] = linear_feature_columns[i]._replace(
                sparsefeat=linear_feature_columns[i].sparsefeat._replace(embedding_dim=1,
                                                                         embeddings_initializer=Zeros()))
    
    linear_emb_list = [input_from_feature_columns(

        '''
        This still needs to be filled in.
        '''

    )]

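# Building the DeepFM model

1. Parameters
    - `linear_feature_columns`: an iterable of the features used in the linear part.
    - `dnn_feature_columns`: an iterable of the features used in the deep part.
    - `fm_group`: a list of the group names of the features used for feature interaction (is it enough to just specify them?)
    - `dnn_hidden_units`: a list giving the number of layers and the units of each DNN layer (a list of positive integers, or an empty list).
    - `l2_reg_linear`: float. L2 regularization strength applied to the linear part.
    - `l2_reg_embedding`: float. L2 regularization strength applied to the embedding vectors.
    - `l2_reg_dnn`: float. L2 regularization strength applied to the DNN.
    - `seed`: the random seed.
    - `dnn_dropout`: dropout rate for the DNN nodes.
    - `dnn_activation`: activation function of the DNN.
    - `dnn_use_bn`: whether to use batch normalization in the DNN.
    - `task`: `binary` (binary logloss) or `regression` (regression loss)

2. Returns: a Keras model object.

* Takes the linear columns and the dnn columns as lists and builds the input features.
* An ordered dictionary comes back; only its values are taken and turned into a list.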

## feature_column

- ~~build_input_features~~
- ~~get_linear_logit~~
- ~~DEFAULT_GROUP_NAME~~
- ~~input_from_feature_columns~~

## layers.core
- PredictionLayer
- DNN

## layers.interaction
- ~~FM~~

## layers.utils
- ~~concat_func~~
- ~~add_func~~
- ~~combined_dnn_input~~

In [None]:
# Model build function
def DeepFM(linear_feature_columns, dnn_feature_columns, fm_group=[DEFAULT_GROUP_NAME], dnn_hidden_units=(128, 128),
           l2_reg_linear=0.00001, l2_reg_embedding=0.00001, l2_reg_dnn=0, seed=1024, dnn_dropout=0,
           dnn_activation='relu', dnn_use_bn=False, task='binary'):
    features = build_input_features(linear_feature_columns + dnn_feature_columns)
    inputs_list = list(features.values())
    linear_logit = get_linear_logit(features, linear_feature_columns, seed=seed, prefix='linear',
                                    l2_reg=l2_reg_linear)
    group_embedding_dict, dense_value_list = input_from_feature_columns(features, dnn_feature_columns, l2_reg_embedding,
                                                                        seed, support_group=True)
    
    fm_logit = add_func()
    