# Twitch data analysis

This notebook builds the four datasets needed to create the graph model:

1. **Dataset Streamer**
- `ID_streamer`
- `streamer`
- `minutes_live`
- `viewer_mean`
- `viewer_peak`
- `spect_mean`
- `spect_tot`
- `social_link(?)`

2. **Dataset Games**
- `ID_game`
- `game_name`
- ...

3. **Dataset Streamer-Games**
- `ID_streamer`
- `ID_game`
- `minutes`

4. **Dataset Streamer-Streamer**
- `ID_streamer_i`
- `ID_streamer_j`
- `overlap_percentage`

## Preprocessing

### Setup parameters

In [1]:
interval = 5                      # Interval between observations, in minutes (fixed)
freq = 3                          # Sampling frequency (1 = 5 min, 2 = 10 min, 3 = 15 min, etc.)
min_stream_time = 4               # Minimum live time, in samples
min_watch_time = 2                # Minimum time as a spectator, in samples (min_watch_time <= min_stream_time)
min_viewer_mean = 10              # Minimum mean number of viewers
min_spect_mean = min_viewer_mean  # Minimum mean number of spectators
min_overlap_percentage = 1        # Minimum overlap percentage between two streamers

print(f'Minimum live time threshold:\t\t\t {interval * freq * min_stream_time} min')
print(f'Minimum threshold of time as a spectator:\t {interval * freq * min_watch_time} min')

Minimum live time threshold:			 60 min
Minimum threshold of time as a spectator:	 30 min
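
The time thresholds above are expressed as numbers of sampled snapshots, and the printed values convert them to minutes. A minimal sketch of that conversion, where `samples_to_minutes` is a hypothetical helper that is not part of the notebook:

```python
def samples_to_minutes(n_samples, interval=5, freq=3):
    """Convert a number of sampled snapshots into minutes.

    interval: minutes between two collected files (5 in this notebook)
    freq: one file is kept out of every `freq` files (3 in this notebook)
    """
    return n_samples * interval * freq


min_stream_minutes = samples_to_minutes(4)  # min_stream_time = 4
min_watch_minutes = samples_to_minutes(2)   # min_watch_time = 2
print(min_stream_minutes, min_watch_minutes)
```

Keeping the conversion in one place makes it harder to mix up thresholds in samples with durations in minutes later on.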


### Import libraries and data

In [2]:
import os
import numpy as np
import pandas as pd
from datetime import datetime
import json
from pandas import json_normalize
from collections import Counter
from collections import defaultdict
import gzip

#### Files Stream

In [3]:
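relpath1 = '..\\DataCollection\\Twitch_stream_spect\\Files_stream\\'  # Backslashes escaped explicitly
# Sort the paths: os.walk gives no ordering guarantee, and the sampling below relies on chronological order
completelist = sorted(os.path.join(path, name) for path, subdirs, files in os.walk(relpath1) for name in files)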
print("Number of files:", len(completelist))
print(*completelist[:5], sep = "\n")

Number of files: 3823
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-09.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-14.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-20.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-24.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-29.json


Sampling files based on frequency

In [4]:
flist = completelist[::freq]
print("Number of files:", len(flist))
print(*flist[:5], sep = "\n")

Number of files: 1275
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-09.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-24.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-39.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_15-54.json
..\DataCollection\Twitch_stream_spect\Files_stream\2022-05-05_16-09.json
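
Slicing with `[::freq]` only yields evenly spaced snapshots if the file list is in chronological order. Because the file names use zero-padded timestamps, sorting them as strings is enough. The sketch below illustrates this with made-up file names that follow the same pattern; it is an illustration, not the notebook's own data.

```python
# Hypothetical file names following the '%Y-%m-%d_%H-%M.json' pattern
names = [
    '2022-05-05_15-24.json',
    '2022-05-05_15-09.json',
    '2022-05-05_15-29.json',
    '2022-05-05_15-14.json',
    '2022-05-05_15-20.json',
    '2022-05-05_15-34.json',
]
freq = 3

# Zero-padded timestamps sort chronologically when compared as strings
ordered = sorted(names)
sampled = ordered[::freq]  # keep one file out of every `freq`
print(sampled)
```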


Import selected files

In [5]:
data_dict = {}

for file in flist:
    start = relpath1
    end = '.json'
    time = file[len(start): - len(end)]
    time_obj = datetime.strptime(time, '%Y-%m-%d_%H-%M') #extract timestamp
    
    with open(file, 'r') as f:
        data_dict[time_obj] = json.load(f)

print("Number of keys (time intervals):", len(data_dict.keys()))

Number of keys (time intervals): 1275
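
The timestamp is recovered from each file name by stripping the directory prefix and the `.json` suffix. A possible alternative, sketched here under the assumption that every file is named like `2022-05-05_15-09.json`, uses `os.path` so that it does not depend on the length of the directory prefix. `snapshot_time` is a hypothetical helper name.

```python
import os
from datetime import datetime


def snapshot_time(path):
    """Parse the snapshot timestamp from a path ending in e.g. '2022-05-05_15-09.json'."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    return datetime.strptime(stem, '%Y-%m-%d_%H-%M')


# Build the example path with os.path.join so it matches the local path separator
example = os.path.join('Files_stream', '2022-05-05_15-09.json')
print(snapshot_time(example))
```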


Example of an entry in a snapshot: `'pasquale890': {'spect': ['anytow'], 'game_name': 'FIFA 22', 'viewer_count': 0}`


#### Bot list

In [6]:
relpath2 = '..\\DataCollection\\Other_datasets\\'  # Backslashes escaped explicitly

bot_list = pd.read_csv(f'{relpath2}Twitch_bot_list.csv')
bot_list = bot_list['Twitch Username'].tolist()
bot_list[:10]

['turbopascai',
 'tteyyd',
 'twitchdetails',
 '0liviajadeee',
 'judgejudysiayer',
 'thelurkertv',
 'nothingbutgay',
 'streamers_on_discord',
 '0_applebadapple_0',
 'agressive_sock']

### Remove bots

In [7]:
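# Spectator counts before bot removal, for the last loaded snapshot only
# (time_obj still holds the last timestamp from the import loop)
spect_with_bot = {}

for streamer in data_dict[time_obj].keys():
    spect_with_bot[streamer] = len(data_dict[time_obj][streamer]['spect'])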

In [8]:
def remove_bot(spect_list, bot_set):
    # Set difference: drops bots and duplicate names; the original order is not preserved
    return list(set(spect_list) - bot_set)

In [9]:
bot_set = set(bot_list)
for time_obj in data_dict.keys():
    for streamer in data_dict[time_obj].keys():
        data_dict[time_obj][streamer]['spect'] = remove_bot(data_dict[time_obj][streamer]['spect'], bot_set)
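
A small illustration of the set-difference approach used by `remove_bot`, with made-up user names. Note that converting to a set also removes duplicate names and discards the original order.

```python
def remove_bot(spectators, bot_set):
    """Return the spectators that are not in bot_set (order is not preserved)."""
    return list(set(spectators) - bot_set)


# Hypothetical chat list and bot names, for illustration only
chatters = ['alice', 'some_bot', 'bob', 'alice']
bots = {'some_bot', 'other_bot'}

cleaned = remove_bot(chatters, bots)
print(sorted(cleaned))
```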

Check the difference between the viewer count and the number of spectators at a specific instant (the last snapshot):

In [10]:
for streamer in data_dict[time_obj].keys():
    data_dict[time_obj][streamer]['spect_count'] = len(data_dict[time_obj][streamer]['spect'])
    data_dict[time_obj][streamer]['diff_count'] = data_dict[time_obj][streamer]['viewer_count'] - data_dict[time_obj][streamer]['spect_count']
    
    print(f"{streamer}\n\
    \tViewer Count: {data_dict[time_obj][streamer]['viewer_count']}\
    \tSpect Count: {len(data_dict[time_obj][streamer]['spect'])}\
    \tDifference: {data_dict[time_obj][streamer]['diff_count']}\
    \tBot Number: {spect_with_bot[streamer] - len(data_dict[time_obj][streamer]['spect'])}\
    ")

Juventibus
    	Viewer Count: 2824    	Spect Count: 1914    	Difference: 910    	Bot Number: 11    
ocwsport
    	Viewer Count: 2575    	Spect Count: 1966    	Difference: 609    	Bot Number: 11    
pizfn
    	Viewer Count: 2115    	Spect Count: 1576    	Difference: 539    	Bot Number: 10    
Justees
    	Viewer Count: 1768    	Spect Count: 1280    	Difference: 488    	Bot Number: 10    
Xiuder_
    	Viewer Count: 1752    	Spect Count: 1185    	Difference: 567    	Bot Number: 17    
Sabaku_no_Sutoriimaa
    	Viewer Count: 1664    	Spect Count: 1172    	Difference: 492    	Bot Number: 6    
Paoloidolo
    	Viewer Count: 1319    	Spect Count: 1049    	Difference: 270    	Bot Number: 15    
NanniTwitch
    	Viewer Count: 1277    	Spect Count: 1040    	Difference: 237    	Bot Number: 15    
Everyeyeit
    	Viewer Count: 1268    	Spect Count: 102    	Difference: 1166    	Bot Number: 16    
Terenas
    	Viewer Count: 1215    	Spect Count: 674    	Difference: 541    	Bot Number: 17    
...

At a given instant, the difference between the viewer count and the number of spectators can be either positive or negative. Possible reasons:

- bots are excluded from the viewer count
- some viewers are not connected to the chat
- the two values are updated at different instants
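
As a toy illustration of the sign of this difference, the sketch below computes it for a made-up snapshot with the same shape as the entries of `data_dict`. The streamer and user names are hypothetical.

```python
# Hypothetical snapshot, shaped like one value of data_dict
snapshot = {
    'streamer_a': {'spect': ['u1', 'u2', 'u3'], 'game_name': 'FIFA 22', 'viewer_count': 5},
    'streamer_b': {'spect': ['u1', 'u4'], 'game_name': 'Fortnite', 'viewer_count': 1},
}

# viewer_count minus the number of spectators listed in the chat
diffs = {
    name: info['viewer_count'] - len(info['spect'])
    for name, info in snapshot.items()
}
print(diffs)
```

Here `streamer_a` has more viewers than chat spectators (positive difference), while `streamer_b` has more chat spectators than viewers (negative difference).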

### Streamer repeated dataset

In [11]:
streamer_repeated = []
game_name_repeated = []
viewer_count_repeated = []
spect_count_repeated = []
spect_repeated = []

for istante, value in data_dict.items():
    streamer_repeated.extend(list(value.keys()))
    for streamer, val in value.items():
        game_name_repeated.append(val['game_name'])
        viewer_count_repeated.append(val['viewer_count'])
        spect_count_repeated.append(len(val['spect']))
        spect_repeated.append(val['spect'])

len(streamer_repeated)

1723899

In [12]:
streamer_repeated = pd.DataFrame({'streamer': streamer_repeated,
                                  'game_name': game_name_repeated,
                                  'viewer_count': viewer_count_repeated,
                                  'spect_count': spect_count_repeated,
                                  'spect': spect_repeated
                                 })
streamer_repeated

,streamer,game_name,viewer_count,spect_count,spect
0,ilMasseo,Nintendo Switch Sports,3899,3014,"[giuseppez_00, lu_pacciocciu, cosimo96le, spri..."
1,pizfn,Fortnite,2522,1892,"[vezingodmode, il_merio_, falciootv, cycooo, f..."
2,Xiuder_,Fortnite,2223,1504,"[gamerofitalyofficial, falciootv, antoniocavut..."
3,ocwsport,Sports,2183,1695,"[andrea___18, drfrapep, matteo290403, francesc..."
4,Juventibus,Sports,1939,1358,"[matt_man222, zamplano, salvotj, la_faccia_di_..."
...,...,...,...,...,...
1723894,goldmeember,Call Of Duty: Modern Warfare,0,0,[]
1723895,3nzo_ny,Call Of Duty: Modern Warfare,0,0,[]
1723896,WoZniaK_TV,League of Legends,0,0,[]
1723897,Triac92,Life is Strange: True Colors,0,0,[]


### Streamer grouped dataset

In [13]:
streamer_grouped = streamer_repeated.groupby("streamer").agg({'viewer_count': ['count', 'mean', 'max'],
                                                              'spect_count': 'mean',
                                                              'spect': 'sum',
                                                              'game_name': (lambda x: list(set(x)))
                                                             })

In [14]:
streamer_grouped.columns = ["_".join(pair) for pair in streamer_grouped.columns]
streamer_grouped['spect_sum'] = streamer_grouped['spect_sum'].apply(set)
streamer_grouped['spect_count_tot'] = streamer_grouped['spect_sum'].apply(len)
streamer_grouped['viewer_count_count'] = streamer_grouped['viewer_count_count'] * interval * freq

streamer_grouped.sort_values('viewer_count_mean', ascending = False, inplace = True)
streamer_grouped

streamer,viewer_count_count,viewer_count_mean,viewer_count_max,spect_count_mean,spect_sum,game_name_<lambda>,spect_count_tot
LyonWGFLive,1155,14011.454545,19727,9337.805195,"{francowgf042, checcowgf_, fratm_zebby, thecry...","[Minecraft, Roblox, Grand Theft Auto V, Phasmo...",58801
Tumblurr,2925,13683.697436,32436,10655.635897,"{ayarstormxx, m477303310, davies0907, ilbotsup...","[Slots, Bee Simulator, Peggle 2, Sakura Succub...",148019
ChristianVieriOfficial,570,11014.263158,24931,7056.315789,"{biechry, simone2323_, davies0907, el_puma311,...",[Sports],64614
GrenBaud,1005,10285.805970,30017,7600.000000,"{buzdelcul, ale02022002, russottogiulio, 32404...","[Just Chatting, Music]",78521
ZanoXVII,3480,7681.939655,13973,5699.521552,"{mattiiax, kejuis_lqkhe, j4ck_130, samuelesa18...","[Call of Duty: Modern Warfare 2, Golf With You...",157718
...,...,...,...,...,...,...,...
dragoinvincibile,15,0.000000,0,0.000000,{},[],0
querxyyyy,15,0.000000,0,0.000000,{},[],0
albert_simmiz,15,0.000000,0,0.000000,{},[FIFA 22],0
ild4dd4,30,0.000000,0,0.000000,{},[Minecraft],0
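
The grouping step collapses all snapshots of a streamer into one row: number of snapshots (converted to minutes), mean and peak viewers, and the union of all spectators seen. The sketch below reproduces the same idea in plain Python on made-up rows. It is an illustration of the aggregation, not the pandas code used by the notebook.

```python
from collections import defaultdict

# Hypothetical per-snapshot rows: (streamer, viewer_count, spectators)
rows = [
    ('a', 10, ['u1', 'u2']),
    ('a', 30, ['u2', 'u3']),
    ('b', 5, ['u1']),
]
interval, freq = 5, 3

counts = defaultdict(list)
spectators = defaultdict(set)
for streamer, viewer_count, spect in rows:
    counts[streamer].append(viewer_count)
    spectators[streamer].update(spect)  # union of spectators across snapshots

summary = {
    s: {
        'minutes_live': len(v) * interval * freq,  # snapshots -> minutes
        'viewer_mean': sum(v) / len(v),
        'viewer_peak': max(v),
        'spect_tot': len(spectators[s]),           # unique spectators
    }
    for s, v in counts.items()
}
print(summary)
```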


In [15]:
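# NOTE: viewer_count_count is already in minutes (it was multiplied by interval * freq above),
# while min_stream_time is a number of samples; comparing against
# interval * freq * min_stream_time would enforce the 60-minute threshold printed in the setup.
# NOTE: the second condition filters on the total of unique spectators, not on spect_count_mean.
streamer_grouped_cut = streamer_grouped.drop(streamer_grouped[(streamer_grouped.viewer_count_count < min_stream_time)
                                                              | (streamer_grouped.spect_count_tot < min_spect_mean)
                                                              | (streamer_grouped.viewer_count_mean < min_viewer_mean)
                                                             ].index)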
streamer_grouped_cut

streamer,viewer_count_count,viewer_count_mean,viewer_count_max,spect_count_mean,spect_sum,game_name_<lambda>,spect_count_tot
LyonWGFLive,1155,14011.454545,19727,9337.805195,"{francowgf042, checcowgf_, fratm_zebby, thecry...","[Minecraft, Roblox, Grand Theft Auto V, Phasmo...",58801
Tumblurr,2925,13683.697436,32436,10655.635897,"{ayarstormxx, m477303310, davies0907, ilbotsup...","[Slots, Bee Simulator, Peggle 2, Sakura Succub...",148019
ChristianVieriOfficial,570,11014.263158,24931,7056.315789,"{biechry, simone2323_, davies0907, el_puma311,...",[Sports],64614
GrenBaud,1005,10285.805970,30017,7600.000000,"{buzdelcul, ale02022002, russottogiulio, 32404...","[Just Chatting, Music]",78521
ZanoXVII,3480,7681.939655,13973,5699.521552,"{mattiiax, kejuis_lqkhe, j4ck_130, samuelesa18...","[Call of Duty: Modern Warfare 2, Golf With You...",157718
...,...,...,...,...,...,...,...
QLASH_Simracing,180,10.000000,15,5.083333,"{davidsonsws, marrguitar75, majo256, exozon, g...","[F1 2021, Assetto Corsa]",20
teoKrazia,120,10.000000,15,8.125000,"{jet_spot_stream, claudiob0077, markamaterasu,...","[Elden Ring, Sifu]",21
yume940,225,10.000000,23,6.466667,"{costa_n_teen, darththoraj, retroforeverr, onl...",[INSIDE],27
zanella_productions,165,10.000000,12,8.727273,"{sixi736, rosh_van, 0xbalder, barbonestztermin...",[Just Chatting],17
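
In the filter above, `viewer_count_count` holds minutes (it was multiplied by `interval * freq`), whereas `min_stream_time` is a number of samples. The sketch below, on made-up values, shows how the two choices of threshold can select different streamers. The dictionary `minutes_live` and its contents are hypothetical.

```python
interval, freq = 5, 3
min_stream_time = 4                              # expressed in sampled snapshots
min_minutes = min_stream_time * interval * freq  # same threshold in minutes

# Hypothetical minutes of live time for three streamers
minutes_live = {'a': 15, 'b': 45, 'c': 120}

# Comparing minutes against a sample count keeps every streamer
kept_raw = [s for s, m in minutes_live.items() if m >= min_stream_time]
# Comparing minutes against minutes keeps only streamers live for at least an hour
kept_minutes = [s for s, m in minutes_live.items() if m >= min_minutes]
print(kept_raw, kept_minutes)
```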


In [16]:
columns = ['streamer',
           'minutes_live',
           'viewer_mean',
           'viewer_peak',
           'spect_mean',
           'spect_list',
           'game_name',
           'spect_tot'
          ]
columns_ordered = ['streamer',
                   'minutes_live',
                   'viewer_mean',
                   'viewer_peak',
                   'spect_mean',
                   'spect_tot',
                   #'spect_list',
                   #'game_name'
                  ]

In [17]:
streamer_grouped_cut.reset_index(inplace = True)
streamer_grouped_cut.columns = columns
streamer_grouped_cut.index.name = 'ID_streamer'
streamer_grouped_cut

ID_streamer,streamer,minutes_live,viewer_mean,viewer_peak,spect_mean,spect_list,game_name,spect_tot
0,LyonWGFLive,1155,14011.454545,19727,9337.805195,"{francowgf042, checcowgf_, fratm_zebby, thecry...","[Minecraft, Roblox, Grand Theft Auto V, Phasmo...",58801
1,Tumblurr,2925,13683.697436,32436,10655.635897,"{ayarstormxx, m477303310, davies0907, ilbotsup...","[Slots, Bee Simulator, Peggle 2, Sakura Succub...",148019
2,ChristianVieriOfficial,570,11014.263158,24931,7056.315789,"{biechry, simone2323_, davies0907, el_puma311,...",[Sports],64614
3,GrenBaud,1005,10285.805970,30017,7600.000000,"{buzdelcul, ale02022002, russottogiulio, 32404...","[Just Chatting, Music]",78521
4,ZanoXVII,3480,7681.939655,13973,5699.521552,"{mattiiax, kejuis_lqkhe, j4ck_130, samuelesa18...","[Call of Duty: Modern Warfare 2, Golf With You...",157718
...,...,...,...,...,...,...,...,...
2972,QLASH_Simracing,180,10.000000,15,5.083333,"{davidsonsws, marrguitar75, majo256, exozon, g...","[F1 2021, Assetto Corsa]",20
2973,teoKrazia,120,10.000000,15,8.125000,"{jet_spot_stream, claudiob0077, markamaterasu,...","[Elden Ring, Sifu]",21
2974,yume940,225,10.000000,23,6.466667,"{costa_n_teen, darththoraj, retroforeverr, onl...",[INSIDE],27
2975,zanella_productions,165,10.000000,12,8.727273,"{sixi736, rosh_van, 0xbalder, barbonestztermin...",[Just Chatting],17


## Streamer dataset

In [18]:
# Select the exported columns and copy, so the assignments below do not act on a view
streamer_dataset = streamer_grouped_cut[columns_ordered].copy()
#streamer_dataset['spect_list'] = streamer_dataset['spect_list'].apply(list)
# Truncate the means to integers
streamer_dataset['viewer_mean'] = streamer_dataset['viewer_mean'].astype(int)
streamer_dataset['spect_mean'] = streamer_dataset['spect_mean'].astype(int)
streamer_dataset

ID_streamer,streamer,minutes_live,viewer_mean,viewer_peak,spect_mean,spect_tot
0,LyonWGFLive,1155,14011,19727,9337,58801
1,Tumblurr,2925,13683,32436,10655,148019
2,ChristianVieriOfficial,570,11014,24931,7056,64614
3,GrenBaud,1005,10285,30017,7600,78521
4,ZanoXVII,3480,7681,13973,5699,157718
...,...,...,...,...,...,...
2972,QLASH_Simracing,180,10,15,5,20
2973,teoKrazia,120,10,15,8,21
2974,yume940,225,10,23,6,27
2975,zanella_productions,165,10,12,8,17


### Export streamer dataset

In [19]:
streamer_dataset.to_csv('Streamer_dataset.csv')

## Games dataset

In [20]:
games_list = list(set(streamer_grouped_cut.game_name.sum()))
games_dataset = pd.DataFrame({'game_name': games_list})
games_dataset.index.name = 'ID_game'
games_dataset

ID_game,game_name
0,
1,Animal Crossing: New Horizons
2,Astroneer
3,Baba is You
4,Bloodstained: Ritual of the Night
...,...
1140,Poppy Playtime
1141,Final Fantasy IX
1142,Ghost Exile
1143,Higurashi: When They Cry
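
The game IDs are simply positions in the list of unique game names. Since the iteration order of a set of strings can differ between Python runs, one way to get reproducible IDs would be to sort the names first. The sketch below shows this on made-up game lists; sorting is a suggestion here, not what the notebook does.

```python
# Hypothetical per-streamer game lists, shaped like the game_name column
game_lists = [['Fortnite', 'Sports'], ['Sports'], ['Just Chatting', 'Fortnite']]

# Flatten, deduplicate, and sort so that IDs are stable across runs
unique_games = sorted({game for games in game_lists for game in games})
game_ids = {name: idx for idx, name in enumerate(unique_games)}
print(game_ids)
```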


### Export games dataset

In [21]:
games_dataset.to_csv('Games_dataset.csv')

## Streamer-Games dataset

In [22]:
streamer_games_grouped = streamer_repeated.groupby(['streamer', 'game_name'], as_index = False).agg(minutes = ('viewer_count', 'count'))
streamer_games_grouped['minutes'] = streamer_games_grouped['minutes'] * interval * freq
streamer_games_grouped

,streamer,game_name,minutes
0,0000clubtv,Music,15
1,000Skillz,Just Chatting,180
2,000dani0,Minecraft,390
3,000murasaki000,Fortnite,60
4,000smoke,League of Legends,285
...,...,...,...
82905,ばかやろう,Destiny 2,120
82906,リュウザ,Overprime,210
82907,打個遊戲,Just Chatting,300
82908,늦덕위즈원,SUPERSTAR IZ*ONE,225


In [23]:
streamer_games_grouped = pd.merge(streamer_games_grouped, streamer_dataset['streamer'].reset_index(), on = 'streamer', how = 'right')
streamer_games_grouped = pd.merge(streamer_games_grouped, games_dataset['game_name'].reset_index(), on = 'game_name', how = 'left')
streamer_games_grouped

,streamer,game_name,minutes,ID_streamer,ID_game
0,LyonWGFLive,Among Us,15,0,772
1,LyonWGFLive,Fortnite,45,0,259
2,LyonWGFLive,Gartic Phone,30,0,548
3,LyonWGFLive,Grand Theft Auto V,45,0,465
4,LyonWGFLive,Just Chatting,300,0,693
...,...,...,...,...,...
8092,teoKrazia,Elden Ring,30,2973,133
8093,teoKrazia,Sifu,90,2973,683
8094,yume940,INSIDE,225,2974,492
8095,zanella_productions,Just Chatting,165,2975,693


In [24]:
streamer_games_grouped.set_index(['ID_streamer', 'ID_game'], inplace = True)

### Export streamer-games datasets

In [25]:
streamer_games_grouped['minutes'].to_csv('Streamer-Games_dataset.csv')

## Streamer-Streamer dataset

In [26]:
spect_dict = streamer_grouped_cut['spect_list'].to_dict()  # {ID_streamer: set of spectators}
len(spect_dict.keys())

2977

In [27]:
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger('Progress')

viewerOverlapDict = {}
completedStreamers = set()  # Streamers already processed, so each pair is compared only once
totalLength = len(spect_dict)

for count, key in enumerate(spect_dict, start=1):
    tempList = {}
    logger.info(f"{count}/{totalLength}")  # Log progress

    for comparisonKey in spect_dict:
        # Skip self-comparisons and pairs that have already been compared
        if comparisonKey == key or comparisonKey in completedStreamers:
            continue
        # Number of spectators shared by the two streamers (set intersection)
        overlapSize = len(spect_dict[key] & spect_dict[comparisonKey])
        # Express the overlap as a percentage of the smaller of the two audiences
        smallerSize = min(len(spect_dict[key]), len(spect_dict[comparisonKey]))
        temp = round(overlapSize / smallerSize * 100, 1)
        if temp >= min_overlap_percentage:
            tempList[comparisonKey] = temp
    viewerOverlapDict[key] = tempList  # Overlaps of this streamer with the streamers not yet processed
    completedStreamers.add(key)

INFO:Progress:1/2977
INFO:Progress:2/2977
INFO:Progress:3/2977
...
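
The loop above compares each unordered pair of streamers once and stores the shared spectators as a percentage of the smaller audience. The same computation can be sketched more compactly with `itertools.combinations`, shown here on made-up spectator sets. `overlap_percentage` is a hypothetical helper, and the sketch assumes that no set is empty, which the earlier filtering on `spect_tot` guarantees in the notebook.

```python
from itertools import combinations


def overlap_percentage(a, b):
    """Shared spectators as a percentage of the smaller audience, rounded to 1 decimal."""
    smaller = min(len(a), len(b))
    return round(len(a & b) / smaller * 100, 1)


# Hypothetical spectator sets keyed by streamer ID
spect_sets = {
    0: {'u1', 'u2', 'u3', 'u4'},
    1: {'u3', 'u4'},
    2: {'u5'},
}
min_overlap_percentage = 1

edges = {}
for i, j in combinations(spect_sets, 2):  # each unordered pair exactly once
    pct = overlap_percentage(spect_sets[i], spect_sets[j])
    if pct >= min_overlap_percentage:
        edges[(i, j)] = pct
print(edges)
```

Normalizing by the smaller audience means that a small channel whose spectators all watch a large channel gets an overlap of 100%, regardless of the size of the large channel.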

In [28]:
prova = 1  # ID of the streamer to inspect

# Streamer with the largest overlap with `prova`, and the corresponding percentage
best_match = max(viewerOverlapDict[prova], key=viewerOverlapDict[prova].get)
print(best_match)
print(viewerOverlapDict[prova][best_match])

760
86.2


In [29]:
streamer_streamer_dataset = pd.concat({k: pd.DataFrame.from_dict(v, 'index') for k, v in viewerOverlapDict.items()}, axis=0)
streamer_streamer_dataset.tail(10)

,,0
2961,2967,1.6
2962,2964,1.0
2962,2966,1.1
2962,2967,1.0
2962,2969,3.4
2962,2973,4.8
2963,2966,1.1
2964,2966,1.1
2964,2974,3.7
2966,2969,3.4


In [30]:
streamer_streamer_dataset.index.names = ['ID_streamer_i', 'ID_streamer_j']
streamer_streamer_dataset.columns = ['overlap_percentage']
streamer_streamer_dataset

ID_streamer_i,ID_streamer_j,overlap_percentage
0,1,2.4
0,3,1.8
0,4,3.9
0,5,2.5
0,7,1.8
...,...,...
2962,2973,4.8
2963,2966,1.1
2964,2966,1.1
2964,2974,3.7
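
The resulting table is an edge list: one row per pair of streamers with its overlap percentage. As an illustration of that structure, the sketch below flattens a made-up nested dictionary into rows and writes them as CSV using only the standard library. It is an equivalent in spirit, not the pandas code used above, and the values are hypothetical.

```python
import csv
import io

# Hypothetical nested overlaps: {streamer_i: {streamer_j: overlap_percentage}}
overlaps = {0: {1: 2.4, 3: 1.8}, 1: {}, 2: {3: 5.0}}

# One row per (i, j) pair; streamers without partners contribute no rows
rows = [
    (i, j, pct)
    for i, partners in overlaps.items()
    for j, pct in partners.items()
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['ID_streamer_i', 'ID_streamer_j', 'overlap_percentage'])
writer.writerows(rows)
print(buffer.getvalue())
```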


### Export streamer-streamer dataset

In [31]:
streamer_streamer_dataset.to_csv('Streamer-Streamer_dataset.csv')