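(Vanilla) RNN
A vanilla RNN is the most basic neural sequence model: it processes both its input and its output as sequences.
Unlike a feed-forward neural network, an RNN sends the result of the hidden-layer activation function toward the output layer and, at the same time, feeds it back into the hidden layer as an input to the next computation. The cell in the hidden layer that emits this result acts as a kind of memory that tries to remember earlier values, so it is called a memory cell or RNN cell.   
The memory cell works recursively: at each time step it takes the value it produced at the previous time step (the hidden state) as an input.    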
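
The recurrence described above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming the common formulation h_t = tanh(W_x x_t + W_h h_{t-1} + b); the function name `rnn_forward`, the weight names, and the toy sizes are made up for this example and are not part of the notebook.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b, h0=None):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    hidden_size = W_h.shape[0]
    h = np.zeros(hidden_size) if h0 is None else h0
    states = []
    for x_t in xs:
        # The new hidden state depends on the current input
        # and on the hidden state of the previous time step.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
input_size, hidden_size, steps = 3, 4, 5  # toy sizes (assumption)
W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)
xs = rng.normal(size=(steps, input_size))

states = rnn_forward(xs, W_x, W_h, b)
print(states.shape)  # (5, 4)
```

Each row of `states` is the hidden state after one time step; the last row summarizes the whole sequence.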

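The vanilla RNN described above suffers from the **long-term dependency problem**. It works well on relatively short sequences, but as the number of time steps grows, information from the early steps is gradually lost and is not carried far enough toward the later steps.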
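
One way to see this effect numerically is to look at how strongly the last hidden state still depends on the initial one. The sketch below assumes a scalar RNN h_t = tanh(w * h_{t-1} + x_t) with a hypothetical recurrent weight `w = 0.9`; by the chain rule, the sensitivity of h_T to h_0 is a product of per-step factors w * (1 - h_t^2), each of which is smaller than 1 in magnitude here.

```python
import numpy as np

def gradient_through_time(w, xs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x_t in xs:
        h = np.tanh(w * h + x_t)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h ** 2)
    return grad

rng = np.random.default_rng(0)
xs = rng.normal(size=50)
w = 0.9  # hypothetical recurrent weight

grad_short = abs(gradient_through_time(w, xs[:5]))  # 5 time steps
grad_long = abs(gradient_through_time(w, xs))       # 50 time steps
print(grad_short, grad_long)
```

The value for 50 steps is far smaller than the value for 5 steps, which is the same shrinking that makes it hard for a vanilla RNN to retain early information.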

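# LSTUR: Neural News Recommendation with Long- and Short-term User Representations
LSTUR is a recommendation approach that captures both a user's long-term preferences and short-term interests. Its core components are a news encoder and a user encoder. The news encoder learns a news representation from the news title. The user encoder learns a long-term user representation from the embedding of the user ID and a short-term user representation from the recently browsed news.   
There are two ways to combine the long-term and short-term user representations.
- The first uses the long-term user representation to initialize the hidden state of the GRU network that produces the short-term user representation.
- The second combines the long-term and short-term user representations into a single user vector.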
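
The two combination strategies can be sketched with a toy GRU in NumPy. This is an illustration under stated assumptions, not the `LSTURModel` implementation: the GRU here has no bias terms, the sizes are tiny, the helper names are invented, and the second method is implemented as plain concatenation. The mode name `"ini"` mirrors the `type` value in the hyper-parameters printed later in this notebook; `"con"` is an assumed name for the second variant.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_last_state(xs, h0, params):
    """Run a minimal GRU (no biases) over xs and return the last hidden state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = h0
    for x in xs:
        z = sigmoid(Wz @ x + Uz @ h)             # update gate
        r = sigmoid(Wr @ x + Ur @ h)             # reset gate
        h_cand = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
        h = (1.0 - z) * h + z * h_cand
    return h

def user_vector(clicked_news, long_term, params, mode):
    hidden = params[1].shape[0]
    if mode == "ini":
        # Method 1: the long-term representation initializes the GRU state.
        return gru_last_state(clicked_news, long_term, params)
    if mode == "con":
        # Method 2: the GRU starts from zeros; both representations are joined.
        short_term = gru_last_state(clicked_news, np.zeros(hidden), params)
        return np.concatenate([short_term, long_term])
    raise ValueError(f"unknown mode: {mode}")

rng = np.random.default_rng(0)
news_dim, hidden = 6, 4  # toy sizes (assumption)
params = tuple(
    rng.normal(scale=0.3, size=(hidden, news_dim if i % 2 == 0 else hidden))
    for i in range(6)
)
clicked_news = rng.normal(size=(5, news_dim))  # 5 recently browsed news vectors
long_term = rng.normal(size=hidden)            # stands in for the user-ID embedding

u_ini = user_vector(clicked_news, long_term, params, "ini")
u_con = user_vector(clicked_news, long_term, params, "con")
print(u_ini.shape, u_con.shape)  # (4,) (8,)
```

Note that in the first method the long-term vector must have the same size as the GRU hidden state, while in the second the resulting user vector grows by the size of the long-term vector.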

## Data format
For a quick run, this notebook uses the MINDdemo dataset. To use MINDsmall or MINDlarge instead, set the `MIND_type` parameter to one of ['large', 'small', 'demo'] to change the download source.    

### News data
This file contains the news ID, category, subcategory, title, abstract, and URL of each news article, along with the entities found in its title and abstract.   
A word_dict file is generated to transform the words in news titles into word indexes, and an embedding matrix is initialized from pretrained GloVe embeddings.
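
As a rough illustration of the word-index step, the sketch below maps a title to a fixed-length list of indexes. It assumes whitespace tokenization, lowercase matching, and index 0 for both padding and unknown words; the function name and the tiny vocabulary are hypothetical, and the library's own tokenizer may differ.

```python
def title_to_indexes(title, word_dict, title_size):
    """Map a title to a fixed-length list of word indexes (0 = padding/unknown)."""
    tokens = title.lower().split()
    indexes = [word_dict.get(tok, 0) for tok in tokens][:title_size]
    # Pad with zeros so every title has exactly title_size entries.
    return indexes + [0] * (title_size - len(indexes))

word_dict = {"stocks": 1, "rally": 2, "after": 3, "report": 4}  # hypothetical vocabulary
indexes = title_to_indexes("Stocks rally after earnings report", word_dict, title_size=8)
print(indexes)  # [1, 2, 3, 0, 4, 0, 0, 0]
```

Each index can then be used as a row number into the embedding matrix.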

### Behaviors data
Each line in this file describes one impression.   
e.g. [Impression ID] [User ID] [Impression Time] [User Click History] [Impression News]   
User Click History lists the news the user clicked before Impression Time. Impression News lists the news displayed in the impression, in the following form:   
e.g. [News ID 1]-[label1] ... [News ID n]-[labeln]   
The label indicates whether or not the user clicked the news.
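
A line in this format could be parsed as follows. The sketch assumes the five fields are tab-separated and that the items inside the history and impression fields are space-separated; the function name and the sample line are invented for illustration.

```python
def parse_behavior_line(line):
    """Split one behaviors line into its five fields."""
    impr_id, user_id, time, history, impressions = line.rstrip("\n").split("\t")
    candidates = []
    for item in impressions.split():
        # Each item looks like "<news id>-<label>".
        news_id, label = item.rsplit("-", 1)
        candidates.append((news_id, int(label)))
    return {
        "impression_id": impr_id,
        "user_id": user_id,
        "time": time,
        "history": history.split(),
        "impressions": candidates,
    }

# Hypothetical line, made up for illustration.
line = "1\tU100\t11/15/2019 8:55:22 AM\tN1 N2 N3\tN4-1 N5-0 N6-0\n"
record = parse_behavior_line(line)
print(record["history"])      # ['N1', 'N2', 'N3']
print(record["impressions"])  # [('N4', 1), ('N5', 0), ('N6', 0)]
```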

##  Global settings and imports

In [1]:
import sys
import os
import zipfile
import numpy as np
import scrapbook as sb
import tensorflow as tf
from tqdm import tqdm
from tempfile import TemporaryDirectory
tf.get_logger().setLevel('ERROR')

from recommenders.models.deeprec.deeprec_utils import download_deeprec_resources
from recommenders.models.newsrec.newsrec_utils import prepare_hparams
from recommenders.models.newsrec.models.lstur import LSTURModel
from recommenders.models.newsrec.io.mind_iterator import MINDIterator
from recommenders.models.newsrec.newsrec_utils import get_mind_data_set

print('System version: {}'.format(sys.version))
print('Tensorflow version: {}'.format(tf.__version__))

System version: 3.7.13 (default, Mar 29 2022, 02:18:16) 
[GCC 7.5.0]
Tensorflow version: 2.7.3


In [2]:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpu = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), 'Physical GPUs', len(logical_gpu), 'Logical GPUs')
    except RuntimeError as e:
        print(e)

1 Physical GPUs 1 Logical GPUs



## Parameters

In [3]:
epochs = 5
seed = 40
batch_size = 8
MIND_type = 'demo'

## Download and load data

In [5]:
tmpdir = TemporaryDirectory()
data_path = tmpdir.name
print('data path: {}'.format(data_path))

train_news_file = os.path.join(data_path, 'train', r'news.tsv')
train_behaviors_file = os.path.join(data_path, 'train', r'behaviors.tsv')
valid_news_file = os.path.join(data_path, 'valid', r'news.tsv')
valid_behaviors_file = os.path.join(data_path, 'valid', r'behaviors.tsv')
wordEmb_file = os.path.join(data_path, "utils", "embedding.npy")
userDict_file = os.path.join(data_path, "utils", "uid2index.pkl")
wordDict_file = os.path.join(data_path, "utils", "word_dict.pkl")
yaml_file = os.path.join(data_path, "utils", r'lstur.yaml')

mind_url, mind_train_dataset, mind_dev_dataset, mind_utils = get_mind_data_set(MIND_type)

if not os.path.exists(train_news_file):
    download_deeprec_resources(mind_url, os.path.join(data_path, 'train'), mind_train_dataset)
    
if not os.path.exists(valid_news_file):
    download_deeprec_resources(mind_url, os.path.join(data_path, 'valid'), mind_dev_dataset)

if not os.path.exists(yaml_file):
    download_deeprec_resources(r'https://recodatasets.z20.web.core.windows.net/newsrec/', os.path.join(data_path, 'utils'), mind_utils)

data path: /tmp/tmpa0rezubw


100%|███████████████████████████████████████| 17.0k/17.0k [00:32<00:00, 520KB/s]
100%|███████████████████████████████████████| 9.84k/9.84k [00:33<00:00, 294KB/s]
100%|███████████████████████████████████████| 95.0k/95.0k [03:18<00:00, 479KB/s]


## Create hyper-parameters

In [6]:
hparams = prepare_hparams(yaml_file, wordEmb_file=wordEmb_file,
                         wordDict_file=wordDict_file, userDict_file=userDict_file,
                         batch_size=batch_size, epochs=epochs)
print(hparams)

HParams object with values {'support_quick_scoring': True, 'dropout': 0.2, 'attention_hidden_dim': 200, 'head_num': 4, 'head_dim': 100, 'filter_num': 400, 'window_size': 3, 'vert_emb_dim': 100, 'subvert_emb_dim': 100, 'gru_unit': 400, 'type': 'ini', 'user_emb_dim': 50, 'learning_rate': 0.0001, 'optimizer': 'adam', 'epochs': 5, 'batch_size': 8, 'show_step': 100000, 'title_size': 30, 'his_size': 50, 'data_format': 'news', 'npratio': 4, 'metrics': ['group_auc', 'mean_mrr', 'ndcg@5;10'], 'word_emb_dim': 300, 'cnn_activation': 'relu', 'model_type': 'lstur', 'loss': 'cross_entropy_loss', 'wordEmb_file': '/tmp/tmpa0rezubw/utils/embedding.npy', 'wordDict_file': '/tmp/tmpa0rezubw/utils/word_dict.pkl', 'userDict_file': '/tmp/tmpa0rezubw/utils/uid2index.pkl'}


## Train the LSTUR model

In [7]:
model = LSTURModel(hparams, MINDIterator, seed=seed)


Tensor("conv1d/Relu:0", shape=(None, 30, 400), dtype=float32)
Tensor("att_layer2/Sum_1:0", shape=(None, 400), dtype=float32)




In [14]:
print(model.run_eval(valid_news_file, valid_behaviors_file))

2341it [00:02, 950.20it/s] 
943it [00:16, 58.82it/s]
7538it [00:00, 10592.78it/s]


{'group_auc': 0.5201, 'mean_mrr': 0.2214, 'ndcg@5': 0.2292, 'ndcg@10': 0.2912}


In [8]:
model

<recommenders.models.newsrec.models.lstur.LSTURModel at 0x7f80199b1890>

In [8]:
%%time
model.fit(train_news_file, train_behaviors_file, valid_news_file, valid_behaviors_file)


at epoch 1
train info: logloss loss:1.4731498061512935
eval info: group_auc:0.6065, mean_mrr:0.2668, ndcg@10:0.3605, ndcg@5:0.2993
at epoch 1 , train time: 428.5 eval time: 22.4


4344it [07:11, 10.06it/s]
2341it [00:01, 1563.08it/s]
943it [00:15, 59.73it/s]
7538it [00:00, 11083.28it/s]


at epoch 2
train info: logloss loss:1.3782375785033347
eval info: group_auc:0.6238, mean_mrr:0.2812, ndcg@10:0.3745, ndcg@5:0.3099
at epoch 2 , train time: 431.8 eval time: 21.3


4344it [07:12, 10.04it/s]
2341it [00:01, 1603.91it/s]
943it [00:15, 60.15it/s]
7538it [00:00, 13492.68it/s]


at epoch 3
train info: logloss loss:1.3325865667923928
eval info: group_auc:0.6245, mean_mrr:0.2877, ndcg@10:0.3788, ndcg@5:0.3156
at epoch 3 , train time: 432.8 eval time: 20.9


4344it [07:13, 10.01it/s]
2341it [00:01, 1613.11it/s]
943it [00:15, 59.85it/s]
7538it [00:00, 13775.39it/s]


at epoch 4
train info: logloss loss:1.2822263821287159
eval info: group_auc:0.6345, mean_mrr:0.2912, ndcg@10:0.3864, ndcg@5:0.3221
at epoch 4 , train time: 433.8 eval time: 21.1


4344it [07:08, 10.15it/s]
2341it [00:01, 1664.62it/s]
943it [00:15, 61.45it/s]
7538it [00:00, 15580.55it/s]


at epoch 5
train info: logloss loss:1.224057599325865
eval info: group_auc:0.6438, mean_mrr:0.2955, ndcg@10:0.3896, ndcg@5:0.3262
at epoch 5 , train time: 428.0 eval time: 20.5
CPU times: user 37min 13s, sys: 1min 50s, total: 39min 3s
Wall time: 37min 41s


<recommenders.models.newsrec.models.lstur.LSTURModel at 0x7f05702a2990>

In [10]:
res_syn = model.run_eval(valid_news_file, valid_behaviors_file)
print(res_syn)

2341it [00:01, 1608.45it/s]
943it [00:15, 61.71it/s]
7538it [00:00, 14911.29it/s]


{'group_auc': 0.6438, 'mean_mrr': 0.2955, 'ndcg@5': 0.3262, 'ndcg@10': 0.3896}
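
For readers unfamiliar with these metrics, the sketch below shows one common way to compute MRR and nDCG@k for a single impression; the reported values would then be averages over all impressions. The function names and the toy labels and scores are assumptions for this example, and the library's own implementation may differ in details.

```python
import numpy as np

def mrr_score(y_true, y_score):
    """Mean reciprocal rank of the clicked items within one impression."""
    order = np.argsort(y_score)[::-1]           # best score first
    ranked = np.take(y_true, order)
    rr = ranked / (np.arange(len(ranked)) + 1)  # 1/rank for clicked items
    return np.sum(rr) / np.sum(ranked)

def dcg_score(y_true, y_score, k):
    """Discounted cumulative gain over the top-k scored items."""
    order = np.argsort(y_score)[::-1]
    ranked = np.take(y_true, order[:k])
    gains = 2 ** ranked - 1
    discounts = np.log2(np.arange(len(ranked)) + 2)
    return np.sum(gains / discounts)

def ndcg_score(y_true, y_score, k):
    """DCG normalized by the DCG of a perfect ordering."""
    return dcg_score(y_true, y_score, k) / dcg_score(y_true, y_true, k)

labels = np.array([0, 1, 0, 0])          # hypothetical: only the second item was clicked
scores = np.array([0.2, 0.4, 0.9, 0.1])  # hypothetical model scores
mrr = mrr_score(labels, scores)
ndcg = ndcg_score(labels, scores, 5)
print(mrr)   # 0.5
print(ndcg)  # 1 / log2(3), about 0.631
```

In this toy impression the clicked item is ranked second, so its reciprocal rank is 1/2.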


In [11]:
sb.glue('res_syn',res_syn)

## Save the model

In [13]:
model_path = os.path.join(data_path, 'model')
os.makedirs(model_path, exist_ok=True)
model.model.save_weights(os.path.join(model_path, 'lstur_ckpt'))

## Output Prediction File

In [14]:
group_impr_indexes, group_labels, group_preds = model.run_fast_eval(valid_news_file, valid_behaviors_file)

2341it [00:01, 1569.10it/s]
943it [00:15, 60.29it/s]
7538it [00:00, 12226.84it/s]


In [15]:
# Write one line per impression: "<impression index> [rank_1,rank_2,...]".
with open(os.path.join(data_path, 'prediction.txt'), 'w') as f:
    for impr_index, preds in tqdm(zip(group_impr_indexes, group_preds)):
        impr_index += 1
        # Double argsort turns scores into 1-based ranks (highest score gets rank 1).
        pred_rank = (np.argsort(np.argsort(preds)[::-1]) + 1).tolist()
        pred_rank = '[' + ','.join([str(i) for i in pred_rank]) + ']'
        f.write(' '.join([str(impr_index), pred_rank]) + '\n')

7538it [00:01, 6523.14it/s]
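
The expression `np.argsort(np.argsort(preds)[::-1]) + 1` used in the cell above converts scores into ranks. A small worked example, with made-up scores for three candidates, shows what each step produces:

```python
import numpy as np

preds = np.array([0.1, 0.9, 0.5])  # hypothetical scores for three candidates
order = np.argsort(preds)[::-1]    # candidate positions, best score first
ranks = np.argsort(order) + 1      # 1-based rank of each candidate
print(order.tolist())  # [1, 2, 0]
print(ranks.tolist())  # [3, 1, 2]
```

The candidate with the highest score (0.9) receives rank 1, and the ranks stay in the original candidate order, which is the layout written to `prediction.txt`.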




In [16]:
with zipfile.ZipFile(os.path.join(data_path, 'prediction.zip'), 'w', zipfile.ZIP_DEFLATED) as f:
    f.write(os.path.join(data_path, 'prediction.txt'), arcname='prediction.txt')

In [17]:
data_path

'/tmp/tmpa0rezubw'