# Version 2 - Using Covisitation Matrices (Score 0.57845)
In this notebook, we present a new approach for our recommendation system: we use a data structure called a **co-visitation matrix**.
A co-visitation matrix captures how often pairs of items are interacted with together within user sessions. It helps identify items that are often viewed, clicked, or purchased in close temporal proximity, so we can recommend items that are frequently co-engaged with the items already in a session.
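
To make the idea concrete, here is a minimal sketch in plain Python, not the GPU code used later in this notebook. The toy `sessions` dictionary and the helper `top_covisited` are illustrative assumptions; each pair of distinct items is counted once per session.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy sessions: each is the list of item ids (aids) a user touched.
sessions = {
    1: [10, 20, 30],
    2: [10, 20],
    3: [20, 30],
}

covisit = Counter()
for aids in sessions.values():
    # Count each ordered pair of distinct items once per session.
    for a, b in permutations(set(aids), 2):
        covisit[(a, b)] += 1

def top_covisited(aid, n=2):
    """Return up to n items most often seen in the same session as `aid`."""
    scores = Counter({b: c for (a, b), c in covisit.items() if a == aid})
    return [b for b, _ in scores.most_common(n)]

print(covisit[(10, 20)])   # sessions 1 and 2 both contain 10 and 20
print(top_covisited(10))
```

Item 20 ranks ahead of item 30 for item 10 because it shares two sessions with it rather than one.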

When choosing the items to recommend, we also take popular items and session lengths into account.


### Step 1 - Generate Candidates
For each test session, we generate possible choices (candidates). In this notebook, we generate candidates from the following sources:
* Session history of clicks, carts, orders
* The 20 most popular clicks, carts, and orders during the test week
* Co-visitation matrices
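
The three sources above can be merged roughly as follows. This is a simplified sketch under assumptions: `generate_candidates`, its arguments, and the toy inputs are hypothetical, and the real functions later in the notebook differ in detail (for example, they combine several matrices for carts and orders).

```python
from collections import Counter
from itertools import chain

def generate_candidates(history, covisit, popular, k=20):
    """Merge three candidate sources, most trusted first, without duplicates."""
    # 1) Session history, most recent item first.
    seen = list(dict.fromkeys(reversed(history)))
    # 2) Items co-visited with the history, ranked by how often they appear.
    related = Counter(chain.from_iterable(covisit.get(a, []) for a in seen))
    ranked = [a for a, _ in related.most_common() if a not in seen]
    # 3) Popular items pad whatever is still missing.
    merged = list(dict.fromkeys(seen + ranked + list(popular)))
    return merged[:k]

# Hypothetical toy inputs.
covisit = {1: [2, 3], 2: [3, 4]}
print(generate_candidates([1, 2, 1], covisit, popular=[9, 3, 8], k=5))
```

History items come first, then co-visited items ordered by how many history items point to them, and popular items only fill the remaining slots.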

### Step 2 - ReRank and Choose 20
Given the list of candidates (possibly more than 20), we must select 20 as our predictions.
The rules give priority to:
* Most recently visited items
* Items previously visited multiple times
* Items previously in cart or order
* Items from the cart/order to cart/order co-visitation matrix
* Popular items
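
The first three rules can be expressed as a single score per item. The sketch below is a pure-Python approximation of the scoring used later in `suggest_clicks` (which applies it only to sessions with at least 20 unique items); `recency_weights` is an assumed stdlib stand-in for `np.logspace(0.1, 1, n, base=2) - 1`, and the toy session is hypothetical.

```python
from collections import Counter

TYPE_WEIGHT = {0: 1, 1: 6, 2: 3}  # clicks, carts, orders (values from this notebook)

def recency_weights(n, lo=0.1, hi=1.0):
    """Stdlib equivalent of np.logspace(lo, hi, n, base=2) - 1."""
    if n == 1:
        return [2 ** lo - 1]
    step = (hi - lo) / (n - 1)
    return [2 ** (lo + i * step) - 1 for i in range(n)]

def rerank(aids, types, k=20):
    """Score each item by recency weight times type weight, summed over repeats."""
    scores = Counter()
    for aid, w, t in zip(aids, recency_weights(len(aids)), types):
        scores[aid] += w * TYPE_WEIGHT[t]
    return [aid for aid, _ in scores.most_common(k)]

# Hypothetical session, oldest event first: item 7 is clicked, then carted,
# item 5 is only clicked, item 9 is clicked last.
print(rerank(aids=[7, 5, 7, 9], types=[0, 0, 1, 0]))
```

Item 7 wins because it appears twice and once as a cart event; item 9 beats item 5 purely on recency.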




# Step 1 - Candidate Generation with RAPIDS
For candidate generation, we use four co-visitation matrices. Three are computed in this notebook and one is loaded pre-computed.

* One computes the popularity of cart/order given a user's previous click/cart/order. We apply type weighting to this matrix.
* One computes the popularity of cart/order given a user's previous cart/order. We call this the "buy2buy" matrix.
* One computes the popularity of clicks given a user's previous click/cart/order. We apply time weighting to this matrix.
* One computes the popularity of cart/order given a user's previous click/cart/order, but only for interactions after 2 pm. (This matrix is pre-computed in this dataset: https://www.kaggle.com/datasets/cdeotte/otto-covisit-matrix)
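
The two weighting schemes mentioned above can be written as small functions. This is a sketch: the constants are copied from the clicks and carts-orders cells of this notebook, while the function names `time_weight` and `type_weight` are illustrative and do not appear in the original code.

```python
# Constants copied from the "Clicks" cell of this notebook (seconds since epoch).
TS_START = 1659304800
TS_END = 1662328791

def time_weight(ts):
    """Linear weight: 1 at TS_START rising to 4 at TS_END."""
    return 1 + 3 * (ts - TS_START) / (TS_END - TS_START)

TYPE_WEIGHT = {0: 1, 1: 6, 2: 3}  # clicks, carts, orders

def type_weight(event_type):
    """Weight of a pair, looked up from the type of the paired item's event."""
    return TYPE_WEIGHT[event_type]

print(time_weight(TS_START), time_weight(TS_END))
print(type_weight(1))
```

With time weighting, a pair observed at the latest timestamp counts four times as much as one observed at the earliest.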


In [1]:
VER = 5

import pandas as pd, numpy as np
from tqdm.notebook import tqdm
import os, sys, pickle, glob, gc
from collections import Counter
import cudf, itertools
print('We will use RAPIDS version',cudf.__version__)

We will use RAPIDS version 21.10.01


## Compute the Co-visitation Matrices
We compute the co-visitation matrices using RAPIDS cuDF on GPU, which is faster than using pandas on CPU. 
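
All three matrices start from the same pairing step: a self-join of the events on `session`, filtered to distinct items within a time window. The sketch below mirrors that step in plain Python rather than cuDF; the toy `events` list is hypothetical.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # seconds

# Hypothetical events: (session, aid, ts in seconds, type).
events = [
    (1, 10, 0, 0),
    (1, 20, 100, 1),
    (1, 30, 2 * DAY, 0),   # more than 24 hours after the first two events
    (2, 10, 50, 0),
    (2, 20, 60, 2),
]

by_session = defaultdict(list)
for session, aid, ts, typ in events:
    by_session[session].append((aid, ts, typ))

pairs = set()
for session, rows in by_session.items():
    # Self-join: compare every event with every other event of the session.
    for aid_x, ts_x, _ in rows:
        for aid_y, ts_y, _ in rows:
            if aid_x != aid_y and abs(ts_x - ts_y) < DAY:
                # The set keeps one (session, aid_x, aid_y) triple at most.
                pairs.add((session, aid_x, aid_y))

print(sorted(pairs))
```

Item 30 produces no pairs because it falls outside the 24-hour window of the other events in its session.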

In [2]:
%%time
# CACHE FUNCTIONS
def read_file(f):
    return cudf.DataFrame( data_cache[f] )
def read_file_to_cache(f):
    df = pd.read_parquet(f)
    df.ts = (df.ts/1000).astype('int32')
    df['type'] = df['type'].map(type_labels).astype('int8')
    return df

# CACHE THE DATA ON CPU BEFORE PROCESSING ON GPU
data_cache = {}
type_labels = {'clicks':0, 'carts':1, 'orders':2}
files = glob.glob('../input/otto-chunk-data-inparquet-format/*_parquet/*')
for f in files: data_cache[f] = read_file_to_cache(f)

# CHUNK PARAMETERS
READ_CT = 5
CHUNK = int( np.ceil( len(files)/6 ))
print(f'We will process {len(files)} files, in groups of {READ_CT} and chunks of {CHUNK}.')

We will process 146 files, in groups of 5 and chunks of 25.
CPU times: user 46.9 s, sys: 12.7 s, total: 59.6 s
Wall time: 57.6 s
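
The chunk parameters drive two nested loops in the cells that follow: six outer chunks of `CHUNK` files, each read in groups of `READ_CT`. The sketch below only reproduces the index arithmetic so the progress output is easier to read; `schedule` is an illustrative name.

```python
import math

N_FILES = 146   # from the output above
READ_CT = 5
CHUNK = math.ceil(N_FILES / 6)

schedule = []
for j in range(6):
    start = j * CHUNK
    end = min((j + 1) * CHUNK, N_FILES)
    # Each inner step reads files k .. min(k + READ_CT, end) - 1 onto the GPU.
    groups = [(k, min(k + READ_CT, end) - 1) for k in range(start, end, READ_CT)]
    schedule.append(groups)

print(CHUNK)
print(schedule[0])
print(schedule[-1])
```

The last chunk ends with a group that holds a single file (index 145), since 146 is not a multiple of 5.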


## 1) "Carts Orders" Co-visitation Matrix - Type Weighted


In [3]:
%%time

# Computes type-weighted co-visitation scores from session data (all event types are used).
# For each product, it finds other products from the same session within 24 hours,
# weights each pair by the paired product's interaction type (click=1, cart=6, order=3),
# and saves the top 15 most relevant product-to-product recommendations.

type_weight = {0: 1, 1: 6, 2: 3}

# Split data into manageable parts to avoid memory overflow
DISK_PIECES = 4
SIZE = 1.86e6 / DISK_PIECES  # Range size per disk part

# Process in parts to handle large data efficiently
for PART in range(DISK_PIECES):
    print()
    print('### DISK PART', PART + 1)

    # Outer chunk loop: split full file list into groups of CHUNK size
    for j in range(6):
        start_file_idx = j * CHUNK
        end_file_idx = min((j + 1) * CHUNK, len(files))
        print(f'Processing files {start_file_idx} through {end_file_idx - 1} in groups of {READ_CT}...')

        # Inner chunk loop: read and process files in groups of READ_CT
        for k in range(start_file_idx, end_file_idx, READ_CT):
            # Read a group of READ_CT files
            df = [read_file(files[k])]
            for i in range(1, READ_CT):
                if k + i < end_file_idx:
                    df.append(read_file(files[k + i]))
            df = cudf.concat(df, ignore_index=True, axis=0)

            # Sort and keep only the last 30 interactions per session
            df = df.sort_values(['session', 'ts'], ascending=[True, False])
            df = df.reset_index(drop=True)
            df['n'] = df.groupby('session').cumcount()
            df = df.loc[df.n < 30].drop('n', axis=1)

            # Create product pairs from the same session within a 24-hour window
            df = df.merge(df, on='session')
            df = df.loc[((df.ts_x - df.ts_y).abs() < 24 * 60 * 60) & (df.aid_x != df.aid_y)]

            # Filter by current disk part range to reduce memory usage
            df = df.loc[(df.aid_x >= PART * SIZE) & (df.aid_x < (PART + 1) * SIZE)]

            # Assign weights and aggregate
            df = df[['session', 'aid_x', 'aid_y', 'type_y']].drop_duplicates(['session', 'aid_x', 'aid_y'])
            df['wgt'] = df.type_y.map(type_weight)
            df = df[['aid_x', 'aid_y', 'wgt']]
            df.wgt = df.wgt.astype('float32')
            df = df.groupby(['aid_x', 'aid_y']).wgt.sum()

            # Combine results from inner chunks
            if k == start_file_idx:
                tmp2 = df
            else:
                tmp2 = tmp2.add(df, fill_value=0)
            print(k, ', ', end='')
        print()

        # Combine results from outer chunks
        if start_file_idx == 0:
            tmp = tmp2
        else:
            tmp = tmp.add(tmp2, fill_value=0)
        del tmp2, df
        gc.collect()

    # Format and extract top 15 co-visitation scores per product
    tmp = tmp.reset_index()
    tmp = tmp.sort_values(['aid_x', 'wgt'], ascending=[True, False])
    tmp = tmp.reset_index(drop=True)
    tmp['n'] = tmp.groupby('aid_x').aid_y.cumcount()
    tmp = tmp.loc[tmp.n < 15].drop('n', axis=1)

    # Save to disk (convert to pandas to reduce memory use)
    tmp.to_pandas().to_parquet(f'top_15_carts_orders_v{VER}_{PART}.pqt')



### DISK PART 1
Processing files 0 through 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 through 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 through 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 through 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 through 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 through 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 2
Processing files 0 through 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 through 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 through 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 through 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 through 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 through 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 3
Processing files 0 through 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 
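
The `PART`/`SIZE` filter in the cell above assigns every `aid_x` to exactly one disk part, so each part can be aggregated and saved independently. A minimal sketch of that assignment follows; the helper `disk_part` is hypothetical and assumes item ids lie in the range 0 to 1.86e6.

```python
DISK_PIECES = 4
SIZE = 1.86e6 / DISK_PIECES  # 465000.0 item ids per part

def disk_part(aid):
    """Index of the part whose range [PART * SIZE, (PART + 1) * SIZE) holds aid."""
    for part in range(DISK_PIECES):
        if part * SIZE <= aid < (part + 1) * SIZE:
            return part
    return None  # aid outside the assumed 0 .. 1.86e6 range

# Item ids taken from the test-data preview in this notebook.
print(disk_part(245308), disk_part(972319))
```

Because the ranges are disjoint, the per-part dictionaries can later be merged with a plain `update` without overwriting each other.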

## 2) "Buy2Buy" Co-visitation Matrix

In [4]:
%%time

# Computes co-visitation scores between products added to cart or ordered (buy2buy).
# Pairs are created from products in the same session within 14 days, then top 15 per product are saved.

# Use the smallest possible number of disk pieces that avoids memory errors
DISK_PIECES = 1
SIZE = 1.86e6 / DISK_PIECES  # Range size per disk part

# Process in parts to handle large data
for PART in range(DISK_PIECES):
    print()
    print('### DISK PART', PART + 1)

    # Outer chunk loop: divide full file list into CHUNK-sized groups
    for j in range(6):
        start_file_idx = j * CHUNK
        end_file_idx = min((j + 1) * CHUNK, len(files))
        print(f'Processing files {start_file_idx} thru {end_file_idx - 1} in groups of {READ_CT}...')

        # Inner chunk loop: read and process files in groups of READ_CT
        for k in range(start_file_idx, end_file_idx, READ_CT):
            # Read a group of files
            df = [read_file(files[k])]
            for i in range(1, READ_CT):
                if k + i < end_file_idx:
                    df.append(read_file(files[k + i]))
            df = cudf.concat(df, ignore_index=True, axis=0)

            # Keep only cart and order interactions
            df = df.loc[df['type'].isin([1, 2])]

            # Sort and keep only the last 30 interactions per session
            df = df.sort_values(['session', 'ts'], ascending=[True, False])
            df = df.reset_index(drop=True)
            df['n'] = df.groupby('session').cumcount()
            df = df.loc[df.n < 30].drop('n', axis=1)

            # Create product pairs from same session within 14 days
            df = df.merge(df, on='session')
            df = df.loc[((df.ts_x - df.ts_y).abs() < 14 * 24 * 60 * 60) & (df.aid_x != df.aid_y)]

            # Filter by current disk part range to manage memory
            df = df.loc[(df.aid_x >= PART * SIZE) & (df.aid_x < (PART + 1) * SIZE)]

            # Assign uniform weight and aggregate
            df = df[['session', 'aid_x', 'aid_y', 'type_y']].drop_duplicates(['session', 'aid_x', 'aid_y'])
            df['wgt'] = 1
            df = df[['aid_x', 'aid_y', 'wgt']]
            df.wgt = df.wgt.astype('float32')
            df = df.groupby(['aid_x', 'aid_y']).wgt.sum()

            # Combine results from inner chunks
            if k == start_file_idx:
                tmp2 = df
            else:
                tmp2 = tmp2.add(df, fill_value=0)
            print(k, ', ', end='')
        print()

        # Combine results from outer chunks
        if start_file_idx == 0:
            tmp = tmp2
        else:
            tmp = tmp.add(tmp2, fill_value=0)
        del tmp2, df
        gc.collect()

    # Format and extract top 15 co-visitation scores per product
    tmp = tmp.reset_index()
    tmp = tmp.sort_values(['aid_x', 'wgt'], ascending=[True, False])
    tmp = tmp.reset_index(drop=True)
    tmp['n'] = tmp.groupby('aid_x').aid_y.cumcount()
    tmp = tmp.loc[tmp.n < 15].drop('n', axis=1)

    # Save to disk (convert to pandas to reduce memory use)
    tmp.to_pandas().to_parquet(f'top_15_buy2buy_v{VER}_{PART}.pqt')



### DISK PART 1
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 
CPU times: user 20.6 s, sys: 8.81 s, total: 29.4 s
Wall time: 29.8 s
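
Each matrix cell ends by sorting the aggregated pairs by weight and keeping the top 15 (or 20) partners per `aid_x`. The sketch below shows the same selection in plain Python; `top_n_per_item` and the toy weights are illustrative assumptions.

```python
from collections import defaultdict

def top_n_per_item(pair_weights, n):
    """Keep the n highest-weighted partners for every aid_x."""
    grouped = defaultdict(list)
    for (aid_x, aid_y), wgt in pair_weights.items():
        grouped[aid_x].append((wgt, aid_y))
    return {
        aid_x: [aid_y for wgt, aid_y in sorted(partners, key=lambda p: -p[0])[:n]]
        for aid_x, partners in grouped.items()
    }

# Hypothetical aggregated weights for (aid_x, aid_y) pairs.
pair_weights = {(1, 2): 7.0, (1, 3): 12.0, (1, 4): 1.0, (2, 1): 7.0}
print(top_n_per_item(pair_weights, n=2))
```

Truncating to a fixed number of partners keeps the saved files small, at the cost of dropping weakly related items.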


## 3) "Clicks" Co-visitation Matrix - Time Weighted

In [5]:
%%time

# Computes time-weighted co-visitation scores used for click predictions.
# For each product, finds other products interacted with in the same session within 24 hours,
# applies time-based weights, and saves the top 20 most relevant co-visited products.

# Use the smallest number of disk pieces that avoids memory errors
DISK_PIECES = 4
SIZE = 1.86e6 / DISK_PIECES  # Range size per disk part

# Process in parts to handle large data efficiently
for PART in range(DISK_PIECES):
    print()
    print('### DISK PART', PART + 1)

    # Outer chunk loop: split full file list into CHUNK-sized groups
    for j in range(6):
        start_file_idx = j * CHUNK
        end_file_idx = min((j + 1) * CHUNK, len(files))
        print(f'Processing files {start_file_idx} thru {end_file_idx - 1} in groups of {READ_CT}...')

        # Inner chunk loop: read and process files in groups of READ_CT
        for k in range(start_file_idx, end_file_idx, READ_CT):
            # Read a group of files
            df = [read_file(files[k])]
            for i in range(1, READ_CT):
                if k + i < end_file_idx:
                    df.append(read_file(files[k + i]))
            df = cudf.concat(df, ignore_index=True, axis=0)

            # Sort and keep only the last 30 interactions per session
            df = df.sort_values(['session', 'ts'], ascending=[True, False])
            df = df.reset_index(drop=True)
            df['n'] = df.groupby('session').cumcount()
            df = df.loc[df.n < 30].drop('n', axis=1)

            # Create product pairs from same session within 24 hours
            df = df.merge(df, on='session')
            df = df.loc[((df.ts_x - df.ts_y).abs() < 24 * 60 * 60) & (df.aid_x != df.aid_y)]

            # Filter by current disk part range to manage memory
            df = df.loc[(df.aid_x >= PART * SIZE) & (df.aid_x < (PART + 1) * SIZE)]

            # Assign time-decayed weights
            df = df[['session', 'aid_x', 'aid_y', 'ts_x']].drop_duplicates(['session', 'aid_x', 'aid_y'])
            df['wgt'] = 1 + 3 * (df.ts_x - 1659304800) / (1662328791 - 1659304800)
            df = df[['aid_x', 'aid_y', 'wgt']]
            df.wgt = df.wgt.astype('float32')
            df = df.groupby(['aid_x', 'aid_y']).wgt.sum()

            # Combine results from inner chunks
            if k == start_file_idx:
                tmp2 = df
            else:
                tmp2 = tmp2.add(df, fill_value=0)
            print(k, ', ', end='')
        print()

        # Combine results from outer chunks
        if start_file_idx == 0:
            tmp = tmp2
        else:
            tmp = tmp.add(tmp2, fill_value=0)
        del tmp2, df
        gc.collect()

    # Format and extract top 20 co-visitation scores per product
    tmp = tmp.reset_index()
    tmp = tmp.sort_values(['aid_x', 'wgt'], ascending=[True, False])
    tmp = tmp.reset_index(drop=True)
    tmp['n'] = tmp.groupby('aid_x').aid_y.cumcount()
    tmp = tmp.loc[tmp.n < 20].drop('n', axis=1)

    # Save to disk (convert to pandas to reduce memory use)
    tmp.to_pandas().to_parquet(f'top_20_clicks_v{VER}_{PART}.pqt')



### DISK PART 1
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 2
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 3
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 

In [6]:
# FREE MEMORY
del data_cache, tmp
_ = gc.collect()

# Step 2 - ReRank (choose 20) using handcrafted rules
For a description of the handcrafted rules, see this notebook's introduction.

In [7]:
def load_test():    
    dfs = []
    # Iterate over all test parquet files in the specified directory
    for e, chunk_file in enumerate(glob.glob('../input/otto-chunk-data-inparquet-format/test_parquet/*')):
        chunk = pd.read_parquet(chunk_file)  # Load each chunk
        chunk.ts = (chunk.ts / 1000).astype('int32')  # Convert timestamp from ms to seconds
        chunk['type'] = chunk['type'].map(type_labels).astype('int8')  # Map interaction type to int8 label
        dfs.append(chunk)  # Add to list of DataFrames

    # Concatenate all test chunks into a single DataFrame
    return pd.concat(dfs).reset_index(drop=True)  # .astype({"ts": "datetime64[ms]"}) optional

# Load and inspect test data
test_df = load_test()
print('Test data has shape', test_df.shape)
test_df.head()


Test data has shape (6928123, 4)


|   | session | aid | ts | type |
|---|---|---|---|---|
| 0 | 13099779 | 245308 | 1661795832 | 0 |
| 1 | 13099779 | 245308 | 1661795862 | 1 |
| 2 | 13099779 | 972319 | 1661795888 | 0 |
| 3 | 13099779 | 972319 | 1661795898 | 1 |
| 4 | 13099779 | 245308 | 1661795907 | 0 |


In [8]:
%%time
def pqt_to_dict(df):
    return df.groupby('aid_x').aid_y.apply(list).to_dict()
# LOAD THREE CO-VISITATION MATRICES
top_20_clicks = pqt_to_dict( pd.read_parquet(f'top_20_clicks_v{VER}_0.pqt') )
for k in range(1,DISK_PIECES): 
    top_20_clicks.update( pqt_to_dict( pd.read_parquet(f'top_20_clicks_v{VER}_{k}.pqt') ) )
top_15_buys = pqt_to_dict( pd.read_parquet(f'top_15_carts_orders_v{VER}_0.pqt') )
for k in range(1,DISK_PIECES): 
    top_15_buys.update( pqt_to_dict( pd.read_parquet(f'top_15_carts_orders_v{VER}_{k}.pqt') ) )
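# buy2buy was saved as a single part (DISK_PIECES = 1); despite the variable name, the file holds the top 15 per item
top_20_buy2buy = pqt_to_dict( pd.read_parquet(f'top_15_buy2buy_v{VER}_0.pqt') )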

# TOP CLICKED AND ORDERED ITEMS IN THE TEST DATASET
# load_test() maps `type` to integer codes, so compare against the codes, not the strings
top_clicks = test_df.loc[test_df['type']==type_labels['clicks'],'aid'].value_counts().index.values[:20]
top_orders = test_df.loc[test_df['type']==type_labels['orders'],'aid'].value_counts().index.values[:20]


CPU times: user 1min 19s, sys: 5.93 s, total: 1min 25s
Wall time: 1min 19s
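
The `pqt_to_dict` helper turns each saved table into a lookup from `aid_x` to its list of partner items. A plain-Python sketch of the same transformation follows; `rows_to_dict` and the toy rows are hypothetical stand-ins for the parquet parts.

```python
def rows_to_dict(rows):
    """Group (aid_x, aid_y) rows into {aid_x: [aid_y, ...]}, keeping row order."""
    lookup = {}
    for aid_x, aid_y in rows:
        lookup.setdefault(aid_x, []).append(aid_y)
    return lookup

# Hypothetical rows as they might be stored in one parquet part, already
# sorted by aid_x and by descending weight.
part_0 = [(1, 3), (1, 2), (2, 1)]
part_1 = [(5, 1)]

top_pairs = rows_to_dict(part_0)
top_pairs.update(rows_to_dict(part_1))  # parts cover disjoint aid_x ranges
print(top_pairs)
```

A dictionary lookup per item is what makes the per-session candidate functions cheap enough to call millions of times.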


In [9]:
# Type weight mapping for different interaction types
# clicks: 1, carts: 6, orders: 3
type_weight_multipliers = {0: 1, 1: 6, 2: 3}

def suggest_clicks(df):
    # Extract user history: item IDs and interaction types
    aids = df.aid.tolist()
    types = df.type.tolist()
    
    # Keep unique item IDs in reverse session order (most recent first)
    unique_aids = list(dict.fromkeys(aids[::-1]))

    # If there are enough items in history, rerank based on weighted interactions
    if len(unique_aids) >= 20:
        weights = np.logspace(0.1, 1, len(aids), base=2, endpoint=True) - 1
        aid_scores = Counter()
        
        # Score items by recency-weighted type importance
        for aid, w, t in zip(aids, weights, types):
            aid_scores[aid] += w * type_weight_multipliers[t]

        # Return top 20 scored item IDs
        sorted_aids = [k for k, v in aid_scores.most_common(20)]
        return sorted_aids

    # Otherwise, use co-visitation candidates from the click matrix
    co_visitation_candidates = list(itertools.chain(*[top_20_clicks[aid] for aid in unique_aids if aid in top_20_clicks]))

    # Rerank candidates by frequency, excluding already seen items
    top_candidates = [aid for aid, cnt in Counter(co_visitation_candidates).most_common(20) if aid not in unique_aids]

    # Combine unique user history with top candidates
    result = unique_aids + top_candidates[:20 - len(unique_aids)]

    # If still under 20, pad with global top clicked items
    return result + list(top_clicks)[:20 - len(result)]


# Load the pre-computed top-40 co-visitation lists (interactions after 2 PM) from the external dataset
PATH = '/kaggle/input/otto-covisit-matrix/'
with open(PATH + 'top_40_aids_v181_0.pkl', 'rb') as f:
    top_40_day = pickle.load(f)

def suggest_buys(df, custom_type_weights):
    # Extract user history: item IDs and interaction types
    aids = df.aid.tolist()
    types = df.type.tolist()

    # Keep unique item IDs in reverse session order
    unique_aids = list(dict.fromkeys(aids[::-1]))

    # Filter for cart/order interactions only to define "buy" items
    df = df.loc[(df['type'] == 1) | (df['type'] == 2)]
    unique_buys = list(dict.fromkeys(df.aid.tolist()[::-1]))

    # If enough history exists, use weighted reranking
    if len(unique_aids) >= 20:
        weights = np.logspace(0.5, 1, len(aids), base=2, endpoint=True) - 1
        aid_scores = Counter()

        # Score items based on repeat interactions and type weights
        for aid, w, t in zip(aids, weights, types):
            aid_scores[aid] += w * custom_type_weights[t]

        # Boost scores of items found in the buy2buy co-visitation matrix
        buy2buy_candidates = list(itertools.chain(*[top_20_buy2buy[aid] for aid in unique_buys if aid in top_20_buy2buy]))
        for aid in buy2buy_candidates:
            aid_scores[aid] += 0.1

        # Return top 20 scored item IDs
        sorted_aids = [k for k, v in aid_scores.most_common(20)]
        return sorted_aids

    # Otherwise, use fallback co-visitation sources

    # From cart/order co-visitation matrix
    co_cart_order_candidates = list(itertools.chain(*[top_15_buys[aid] for aid in unique_aids if aid in top_15_buys]))

    # From buy2buy co-visitation matrix
    buy2buy_candidates = list(itertools.chain(*[top_20_buy2buy[aid] for aid in unique_buys if aid in top_20_buy2buy]))

    # From time-of-day based co-visitation matrix (top 40 after 2 PM)
    time_filtered_candidates = list(itertools.chain(*[top_40_day[aid][:10] for aid in unique_aids if aid in top_40_day]))

    # Combine and rerank candidates by frequency, excluding items already seen
    top_candidates = [aid for aid, cnt in Counter(
        co_cart_order_candidates + buy2buy_candidates + time_filtered_candidates
    ).most_common(20) if aid not in unique_aids]

    # Combine history with top candidates
    result = unique_aids + top_candidates[:20 - len(unique_aids)]

    # If still under 20, pad with global top ordered items
    return result + list(top_orders)[:20 - len(result)]


# Create Submission CSV
Running inference over the test data with a pandas groupby is slow (about 45 minutes for the cell below), so this code is a candidate for acceleration.
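
One possible direction, shown here only as a sketch and not benchmarked, is to sort and group the events once and run all predictors on each session in the same pass, instead of sorting and grouping three times. `predict_all` and `last_items` are hypothetical names, and `last_items` is a stand-in for the real `suggest_clicks` and `suggest_buys` functions.

```python
from itertools import groupby
from operator import itemgetter

def predict_all(rows, predictors):
    """Sort once, walk each session once, and run every predictor on it."""
    rows = sorted(rows, key=itemgetter(0, 2))  # by session, then ts
    out = {}
    for session, group in groupby(rows, key=itemgetter(0)):
        events = list(group)
        for name, fn in predictors.items():
            out[f"{session}_{name}"] = fn(events)
    return out

def last_items(events, k=2):
    """Hypothetical stand-in predictor: the k most recent distinct aids."""
    aids = [aid for _, aid, _, _ in events]
    return list(dict.fromkeys(reversed(aids)))[:k]

# Hypothetical rows: (session, aid, ts, type).
rows = [(2, 7, 5, 0), (1, 4, 2, 0), (1, 3, 1, 0), (1, 4, 3, 1)]
preds = predict_all(rows, {"clicks": last_items, "carts": last_items})
print(preds)
```

Whether this helps in practice depends on how much of the runtime is spent in the grouping itself versus inside the per-session functions.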

In [10]:
%%time
pred_df_clicks = test_df.sort_values(["session", "ts"]).groupby(["session"]).apply(
    lambda x: suggest_clicks(x)
)

cart_weights = {0: 1, 1: 6, 2: 3}
order_weights = {0: 1, 1: 12, 2: 2}

pred_df_carts = test_df.sort_values(["session", "ts"]).groupby(["session"]).apply(
    lambda x: suggest_buys(x,cart_weights)
)

pred_df_orders = test_df.sort_values(["session", "ts"]).groupby(["session"]).apply(
    lambda x: suggest_buys(x,order_weights)
)

CPU times: user 45min 3s, sys: 9.38 s, total: 45min 12s
Wall time: 45min 8s


In [11]:
clicks_pred_df = pd.DataFrame(pred_df_clicks.add_suffix("_clicks"), columns=["labels"]).reset_index()
orders_pred_df = pd.DataFrame(pred_df_orders.add_suffix("_orders"), columns=["labels"]).reset_index()
carts_pred_df = pd.DataFrame(pred_df_carts.add_suffix("_carts"), columns=["labels"]).reset_index()

In [12]:
pred_df = pd.concat([clicks_pred_df, orders_pred_df, carts_pred_df])
pred_df.columns = ["session_type", "labels"]
pred_df["labels"] = pred_df.labels.apply(lambda x: " ".join(map(str,x)))
pred_df.to_csv("submission.csv", index=False)
pred_df.head()

|   | session_type | labels |
|---|---|---|
| 0 | 12899779_clicks | 59625 1253524 737445 438191 731692 1790770 942... |
| 1 | 12899780_clicks | 1142000 736515 973453 582732 1502122 889686 48... |
| 2 | 12899781_clicks | 918667 199008 194067 57315 141736 1460571 7594... |
| 3 | 12899782_clicks | 834354 595994 740494 889671 987399 779477 1344... |
| 4 | 12899783_clicks | 1817895 607638 1754419 1216820 1729553 300127 ... |
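
The submission layout shown above is one row per session and type, with the predicted item ids joined by spaces. A minimal stdlib sketch of that format follows; `write_submission` and the toy predictions are illustrative, and the notebook itself uses pandas `to_csv`.

```python
import csv
import io

def write_submission(predictions, fh):
    """Write rows of `session_type,labels`, with labels joined by single spaces."""
    writer = csv.writer(fh, lineterminator="\n")
    writer.writerow(["session_type", "labels"])
    for session_type, aids in predictions.items():
        writer.writerow([session_type, " ".join(map(str, aids))])

# Hypothetical predictions keyed by "<session>_<type>".
predictions = {"1_clicks": [4, 3], "1_carts": [4, 3], "1_orders": [4]}
buffer = io.StringIO()
write_submission(predictions, buffer)
print(buffer.getvalue())
```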
