#### DATA VISUALIZATION

<br>

# Gaming Data with Twitch
<hr>

[Twitch](https://www.twitch.tv/) is the world’s leading live streaming platform for gamers, with 15 million daily active users. Using data to understand its users and products is one of the main responsibilities of the Twitch [Science Team](https://science.twitch.tv/).
<br>
In the Data Science Path Cumulative Project, we partnered with the Twitch Science Team, who gave us a scrubbed dataset (800,000 rows) that describes user engagement with Twitch streams and Twitch chat on January 1st, 2015:
- `stream.csv`
- `chat.csv`

In [1]:
%%html
<style>
    table {
        display: inline-block
    }
</style>

### `stream.csv`

<table>
    <tr>
        <th>Headers</th>
        <th>Description</th>
    </tr>
    <tr>
        <td><code>time</code></td>
        <td>date and time (YYYY-MM-DD HH:MM:SS)</td>
    </tr>
    <tr>
        <td><code>device_id</code></td>
        <td>device ID</td>
    </tr>
    <tr>
        <td><code>login</code></td>
        <td>login ID</td>
    </tr>
    <tr>
        <td><code>channel</code></td>
        <td>streamer name</td>
    </tr>
    <tr>
        <td><code>country</code></td>
        <td>country name abbreviation</td>
    </tr>
    <tr>
        <td><code>player</code></td>
        <td>streamed device</td>
    </tr>
    <tr>
        <td><code>game</code></td>
        <td>game name</td>
    </tr>
    <tr>
        <td><code>stream_format</code></td>
        <td>stream quality</td>
    </tr>
    <tr>
        <td><code>subscriber</code></td>
        <td>is the viewer a subscriber? (true/false)</td>
    </tr>
</table>
<br>

### `chat.csv`

<table>
    <tr>
        <th>Headers</th>
        <th>Description</th>
    </tr>
    <tr>
        <td><code>time</code></td>
        <td>date and time (YYYY-MM-DD HH:MM:SS)</td>
    </tr>
    <tr>
        <td><code>device_id</code></td>
        <td>device ID</td>
    </tr>
    <tr>
        <td><code>login</code></td>
        <td>login ID</td>
    </tr>
    <tr>
        <td><code>channel</code></td>
        <td>streamer name</td>
    </tr>
    <tr>
        <td><code>country</code></td>
        <td>country name abbreviation</td>
    </tr>
    <tr>
        <td><code>player</code></td>
        <td>chat device</td>
    </tr>
    <tr>
        <td><code>game</code></td>
        <td>game name</td>
    </tr>
</table>

<br>
<hr>

The Twitch Science Team provided this data:
- June Dershewitz, Head of Data Governance & Analytics, Twitch
- Sharmeen Browarek, Product Manager, Twitch
- Carson Forter, Data Scientist, Twitch

In [123]:
from collections import Counter
import pandas as pd
import numpy as np
from datetime import datetime, timedelta  # explicit names instead of a wildcard import

In [124]:
stream = pd.read_csv('stream.csv')
stream.head(20)

Unnamed: 0,time,device_id,login,channel,country,player,game,stream_format,subscriber
0,2015-01-01 18:33:52,40ffc2fa6534cf760becbdbf5311e31ad069e46e,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,
1,2015-01-01 23:35:33,9a8cc2b7162b99c0a0f501dc9a5ec4f68586a760,5ad49a7b408ce452140b180dd6efb57a9f4d22c7,george,US,site,DayZ,chunked,False
2,2015-01-01 04:39:38,2f9c3f9ee1033b71a3819564243f20ef3bec0183,5b9a43e68f019185f55615d0b83019dee4b5d06f,frank,US,site,League of Legends,chunked,False
3,2015-01-01 11:15:30,0cda8226ba2583424c80c3c1c22c1256b080ad17,02c7797faa4d8a3ff4b0c14ee1764b6817b53d0b,estelle,CH,site,Dota 2,high,False
4,2015-01-01 11:28:19,e3288ca5e3153aa85e32f64cdd994b7666968dcf,b920c228acbcbebee26d9c79f6eb73b73a9480c7,morty,FR,site,Heroes of the Storm,medium,False
5,2015-01-01 23:27:36,343fe2bfd58595d5c18602d420ecf6f9d694d5a8,7814f661a54349ff5eee84f9d6f476918c9b7270,george,US,iphone_t,DayZ,,
6,2015-01-01 21:09:23,80a0c7d1abb6a5a0060e18202b77bef831e08ca5,eb158cab3f606d3894a32e20bddbfd2d589095a9,frank,US,site,League of Legends,high,True
7,2015-01-01 19:14:27,1e342e5e4e228f617449029054b3bb19c5224528,2aaf6a414bc3dc923b04f986de7ba1b8101c6698,frank,CA,site,League of Legends,high,False
8,2015-01-01 13:51:04,272cffbb1a9a33ad3bb48a2ee9ae5cbcac5ca22e,401716920e3435b5e3eec9fc4ccd6a44c7af38f5,kramer,TR,site,Counter-Strike: Global Offensive,chunked,False
9,2015-01-01 22:00:14,593ed161c456eeeb9e18b8005786d42abc1a7373,ef24dc49ceb4bcd3cccb0fa862d8a150ecf935e1,frank,US,site,League of Legends,medium,False
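
The `Unnamed: 0` column in the output above looks like a row index that was written into the CSV when it was saved. A minimal sketch, assuming that is the case, of reading it back as the index and parsing `time` while loading. The two inline rows are copied from the output above and stand in for `stream.csv`; the name `sample` is only for illustration.

```python
import io

import pandas as pd

# Two rows copied from the head() output above; a stand-in for stream.csv.
sample_csv = io.StringIO(
    ",time,device_id,login,channel,country,player,game,stream_format,subscriber\n"
    "0,2015-01-01 18:33:52,40ffc2fa6534cf760becbdbf5311e31ad069e46e,"
    "085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,\n"
    "1,2015-01-01 23:35:33,9a8cc2b7162b99c0a0f501dc9a5ec4f68586a760,"
    "5ad49a7b408ce452140b180dd6efb57a9f4d22c7,george,US,site,DayZ,chunked,False\n"
)

# index_col=0 treats the first column as the index, so no "Unnamed: 0" column
# appears; parse_dates converts `time` to datetimes during the read.
sample = pd.read_csv(sample_csv, index_col=0, parse_dates=["time"])

print(sample.dtypes)
```

With the real file, the same two keyword arguments would be passed to `pd.read_csv('stream.csv', ...)`.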


In [125]:
print(len(stream))

526299


In [126]:
chat = pd.read_csv('chat.csv')
chat.head(20)

Unnamed: 0,time,device_id,login,channel,country,player,game
0,2015-01-01 18:45:50,70e2b95b5ac0d4c227e46966658d16b3e044996e,5c2f5c1f19a7738e16ed0be551d865e8a8fce71d,jerry,BY,,Dota 2
1,2015-01-01 01:16:57,f2b9065b55fd80d6aa653ce989b489f4ec5198be,0d77740e4fb5ce77d94f9f6c8ef1f762990d0344,elaine,HK,,Devil May Cry 4: Special Edition
2,2015-01-01 16:22:10,d448ba963d7e1023dd1b0a40b95d4f6611750692,77ab14c1fb815e1c369ba0cf7d4c56b4fe489997,frank,GB,iphone_t,League of Legends
3,2015-01-01 03:58:13,8d6823dc52b400b50aebf269bf1f03a36d19eeaa,91cb88c0743761589273fc5e800e7743ece46494,frank,US,iphone_t,League of Legends
4,2015-01-01 11:47:35,16c1e39594d62358d27ae604ad43a071f0d86bc4,51a9234f83d656607cfd7f26690c12d2ffbce353,estelle,DE,,Dota 2
5,2015-01-01 17:59:51,6fcc75522de37833a0fb21fba4965aad3b63ea57,f628d1cb946ea2e8cffc0b327bc9d77775b8d3c0,jerry,RU,,Dota 2
6,2015-01-01 02:24:33,dea94b3030025d837dd841fbfd479e775987f65d,9dbbcf6c7792074771c4c7284807041eac467ad5,elaine,TW,,Gaming Talk Shows
7,2015-01-01 18:26:34,671bee0f3d66077876d9bc231990597292392cc2,51c286a41daa8e060275f622f2b8436bee9fab91,jerry,UA,,Dota 2
8,2015-01-01 13:13:18,8b31d5ebd1f4f41d4365ae4a471c1686dd256745,06decad1d9565150791e183da017f47123433a4c,estelle,GB,ipad_t,Dota 2
9,2015-01-01 20:20:55,f2ebb129e6930e608f2ed3f5fb52bc4d533c4891,4679f8113aa157ba76fc6db5878d7ee625e88d55,frank,CA,,League of Legends


In [127]:
print(len(chat))

148562


In [128]:
unique_games = pd.unique(stream['game'])

for game in unique_games:
    print(game)

League of Legends
DayZ
Dota 2
Heroes of the Storm
Counter-Strike: Global Offensive
Hearthstone: Heroes of Warcraft
The Binding of Isaac: Rebirth
Agar.io
Gaming Talk Shows
nan
Rocket League
World of Tanks
ARK: Survival Evolved
SpeedRunners
Breaking Point
Duck Game
Devil May Cry 4: Special Edition
Block N Load
Fallout 3
Batman: Arkham Knight
Reign Of Kings
The Witcher 3: Wild Hunt
The Elder Scrolls V: Skyrim
Super Mario Bros.
H1Z1
The Last of Us
Depth
Mortal Kombat X
Senran Kagura: Estival Versus
The Sims 4
You Must Build A Boat
Choice Chamber
Music
Risk of Rain
Grand Theft Auto V
Besiege
Super Mario Bros. 3
Hektor
Bridge Constructor Medieval
Lucius
Blackjack
Cities: Skylines


In [129]:
unique_channels = pd.unique(stream['channel'])

for channel in unique_channels:
    print(channel)

frank
george
estelle
morty
kramer
jerry
helen
newman
elaine
susan


### Aggregate Functions:

In [130]:
# Count rows per game; most_common() returns (game, count) pairs sorted by count, descending.
unique_game_name_count = dict(Counter(stream['game']).most_common())

print("Viewers Per Game")
for name, count in unique_game_name_count.items():
    if(count == 1):
        print(str(name) + ": " + str(count) + " viewer")
    else:
        print(str(name) + ": " + str(count) + " viewers")

Viewers Per Game
League of Legends: 193533 viewers
Dota 2: 85608 viewers
Counter-Strike: Global Offensive: 54438 viewers
DayZ: 38004 viewers
Heroes of the Storm: 35310 viewers
The Binding of Isaac: Rebirth: 29467 viewers
Gaming Talk Shows: 28115 viewers
World of Tanks: 15932 viewers
Hearthstone: Heroes of Warcraft: 14399 viewers
Agar.io: 11480 viewers
Rocket League: 7087 viewers
ARK: Survival Evolved: 4158 viewers
SpeedRunners: 3367 viewers
nan: 3124 viewers
Duck Game: 1063 viewers
Fallout 3: 485 viewers
Devil May Cry 4: Special Edition: 231 viewers
Breaking Point: 161 viewers
Batman: Arkham Knight: 117 viewers
Reign Of Kings: 50 viewers
The Witcher 3: Wild Hunt: 45 viewers
Block N Load: 34 viewers
Depth: 27 viewers
Mortal Kombat X: 22 viewers
H1Z1: 7 viewers
Super Mario Bros.: 6 viewers
Grand Theft Auto V: 5 viewers
Music: 4 viewers
The Elder Scrolls V: Skyrim: 3 viewers
The Last of Us: 3 viewers
Senran Kagura: Estival Versus: 2 viewers
Risk of Rain: 2 viewers
The Sims 4: 1 viewer
You
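
The `Counter` tally above can also be written with pandas alone. A minimal sketch on a tiny made-up frame (`toy` is hypothetical, not the Twitch data), assuming each row is one view event, so the totals are row counts rather than distinct viewers:

```python
import pandas as pd

# Tiny illustrative frame; the notebook itself would use the loaded `stream`.
toy = pd.DataFrame({"game": ["Dota 2", "DayZ", "Dota 2", None, "Dota 2", "DayZ"]})

# value_counts() sorts by count, descending. dropna=False keeps rows with a
# missing game name as their own bucket, matching the `nan` line above.
views_per_game = toy["game"].value_counts(dropna=False)

for name, count in views_per_game.items():
    print(f"{name}: {count} rows")
```

The default `dropna=True` would drop the missing-name bucket instead.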

In [131]:
countries_lol = stream.country[stream.game == 'League of Legends']
# Count rows per country for League of Legends, sorted by count (descending).
unique_countries_lol = dict(Counter(countries_lol).most_common())

print("League of Legends Viewers Per Country")
for country, viewers in unique_countries_lol.items():
    if(viewers == 1):
        print(str(country) + ": " + str(viewers) + " viewer")
    else:
        print(str(country) + ": " + str(viewers) + " viewers")

#values = unique_countries_lol.values()
#total = sum(values)
#print(total) => 193533

#league_of_legends = stream[stream.game == 'League of Legends']
#league_of_legends.head()

League of Legends Viewers Per Country
US: 85606 viewers
CA: 13034 viewers
DE: 10835 viewers
nan: 7641 viewers
GB: 6964 viewers
TR: 4412 viewers
AU: 3911 viewers
SE: 3533 viewers
NL: 3213 viewers
DK: 2909 viewers
GR: 2885 viewers
PL: 2776 viewers
PT: 2757 viewers
RO: 2464 viewers
IT: 2333 viewers
FR: 2316 viewers
TW: 2257 viewers
BR: 2252 viewers
MX: 2244 viewers
NO: 2047 viewers
BE: 1712 viewers
ES: 1574 viewers
FI: 1416 viewers
CZ: 1143 viewers
NZ: 1104 viewers
RU: 1058 viewers
HU: 1037 viewers
AT: 952 viewers
LT: 903 viewers
BG: 894 viewers
HK: 892 viewers
HR: 869 viewers
CL: 835 viewers
RS: 748 viewers
IL: 593 viewers
CH: 577 viewers
JP: 562 viewers
SG: 558 viewers
AR: 541 viewers
MA: 462 viewers
IE: 461 viewers
CO: 446 viewers
SK: 437 viewers
SI: 436 viewers
SA: 422 viewers
AE: 371 viewers
BA: 330 viewers
PH: 315 viewers
KW: 314 viewers
PR: 292 viewers
EE: 250 viewers
LV: 231 viewers
MY: 229 viewers
CR: 222 viewers
UA: 220 viewers
TN: 193 viewers
VE: 192 viewers
KR: 192 viewers
CY: 
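
The counts above are row counts, so a device that appears on several rows is counted several times. A minimal sketch, on made-up data, of comparing rows per country with distinct devices per country. It assumes `device_id` identifies one viewer's device; the frame `toy` and its values are hypothetical.

```python
import pandas as pd

# Made-up rows: device "a" watches League of Legends twice from the US.
toy = pd.DataFrame({
    "game": ["League of Legends"] * 4 + ["Dota 2"],
    "country": ["US", "US", "CA", "US", "US"],
    "device_id": ["a", "a", "b", "c", "d"],
})

lol = toy[toy["game"] == "League of Legends"]

# Rows per country (what the Counter-based cell reports).
rows_per_country = lol["country"].value_counts()

# Distinct devices per country, sorted from most to fewest.
devices_per_country = (
    lol.groupby("country")["device_id"].nunique().sort_values(ascending=False)
)

print(rows_per_country)
print(devices_per_country)
```

Note that `groupby` drops rows whose country is missing by default, unlike the `Counter` approach, which keeps a `nan` bucket.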

In [137]:
# Count rows per player (viewing source), sorted by count (descending).
player_source = dict(Counter(stream['player']).most_common())

print("The Source Users Use to Watch Stream")
for source, viewer in player_source.items():
    print(str(source) + ": " + str(viewer) + " viewers")

The Source Users Use to Watch Stream
site: 246115 viewers
iphone_t: 100689 viewers
android: 93508 viewers
ipad_t: 53646 viewers
embed: 19819 viewers
xbox_one: 4863 viewers
home: 3479 viewers
frontpage: 1567 viewers
amazon: 1155 viewers
xbox360: 985 viewers
roku: 233 viewers
chromecast: 149 viewers
facebook: 83 viewers
ouya: 3 viewers
nvidia shield: 3 viewers
android_pip: 2 viewers
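
Raw counts are easier to compare as shares of the total. A minimal sketch that turns the counts printed above into percentages; the numbers are copied from that output, and grouping `iphone_t`, `android`, and `ipad_t` as "mobile" is an assumption made here for illustration.

```python
import pandas as pd

# Counts copied from the "The Source Users Use to Watch Stream" output above.
counts = pd.Series({
    "site": 246115, "iphone_t": 100689, "android": 93508, "ipad_t": 53646,
    "embed": 19819, "xbox_one": 4863, "home": 3479, "frontpage": 1567,
    "amazon": 1155, "xbox360": 985, "roku": 233, "chromecast": 149,
    "facebook": 83, "ouya": 3, "nvidia shield": 3, "android_pip": 2,
})

# Percentage of all rows per source, rounded to one decimal place.
share = (counts / counts.sum() * 100).round(1)

# Assumed grouping: treat these three sources as mobile apps.
mobile_rows = counts[["iphone_t", "android", "ipad_t"]].sum()

print(share.head())
print("mobile rows:", mobile_rows, "site rows:", counts["site"])
```

On a raw column, `stream['player'].value_counts(normalize=True)` would give the same shares as fractions.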


In [138]:
# Map the most-watched games to a genre; every other row (including rows
# with a missing game name) falls through to the default, 'Other'.
conditions = [
    stream['game'].isin(['League of Legends', 'Dota 2', 'Heroes of the Storm']),
    stream['game'] == 'Counter-Strike: Global Offensive',
    stream['game'].isin(['DayZ', 'ARK: Survival Evolved']),
]

values = ['MOBA', 'FPS', 'Survival']

# An explicit string default avoids mixing the integer default (0) with string choices.
stream['genre'] = np.select(conditions, values, default='Other')
#stream.head()


# Count rows per genre, sorted by count (descending).
genre_count = dict(Counter(stream['genre']).most_common())

print("Viewers Per Genre of Game")
for genre, count in genre_count.items():
    print(str(genre) + ": " + str(count) + " viewers")

Viewers Per Genre of Game
MOBA: 314451 viewers
Other: 115248 viewers
FPS: 54438 viewers
Survival: 42162 viewers
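
The same genre labels can be produced from a lookup table instead of a list of conditions. A minimal sketch using `Series.map` on a small made-up frame (`toy` and `genre_by_game` are hypothetical names), assuming the same game-to-genre assignments as the cell above:

```python
import pandas as pd

# Game-to-genre lookup mirroring the assignments used above.
genre_by_game = {
    "League of Legends": "MOBA",
    "Dota 2": "MOBA",
    "Heroes of the Storm": "MOBA",
    "Counter-Strike: Global Offensive": "FPS",
    "DayZ": "Survival",
    "ARK: Survival Evolved": "Survival",
}

toy = pd.DataFrame({
    "game": ["League of Legends", "DayZ", "Counter-Strike: Global Offensive",
             "Agar.io", None],
})

# map() returns NaN for games missing from the lookup (and for missing names);
# fillna() then labels those rows "Other".
toy["genre"] = toy["game"].map(genre_by_game).fillna("Other")

print(toy)
```

A dictionary keeps the mapping in one place, which may be easier to extend than chained boolean conditions.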


In [134]:
#print(len(stream['time'])) => 526299 rows
stream['time'] = pd.to_datetime(stream['time'])
stream['time'].head(10)

0   2015-01-01 18:33:52
1   2015-01-01 23:35:33
2   2015-01-01 04:39:38
3   2015-01-01 11:15:30
4   2015-01-01 11:28:19
5   2015-01-01 23:27:36
6   2015-01-01 21:09:23
7   2015-01-01 19:14:27
8   2015-01-01 13:51:04
9   2015-01-01 22:00:14
Name: time, dtype: datetime64[ns]

In [139]:
hours = stream['time'].dt.strftime("%H")  # zero-padded hour of day, e.g. "04"

# Count rows per hour and sort by hour so the output reads chronologically.
hour_count = dict(sorted(Counter(hours).items()))

print("Viewers Per Hour")
for hour, count in hour_count.items():
    print(str(hour) + ": " + str(count) + " viewers")

Viewers Per Hour
00: 15411 viewers
01: 14407 viewers
02: 24141 viewers
03: 16205 viewers
04: 15098 viewers
05: 6265 viewers
06: 1483 viewers
07: 8505 viewers
08: 11223 viewers
09: 9863 viewers
10: 11584 viewers
11: 33645 viewers
12: 50261 viewers
13: 43390 viewers
14: 26219 viewers
15: 26707 viewers
16: 25191 viewers
17: 28350 viewers
18: 28863 viewers
19: 28374 viewers
20: 29816 viewers
21: 29399 viewers
22: 22062 viewers
23: 19837 viewers
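
An hourly tally can also be built with the `.dt.hour` accessor, which yields integers and makes it easy to show hours with no rows as zero. A minimal sketch on a few timestamps: the first three are taken from the data shown earlier, and the fourth is made up so that one hour has two rows.

```python
import pandas as pd

times = pd.to_datetime(pd.Series([
    "2015-01-01 18:33:52",
    "2015-01-01 23:35:33",
    "2015-01-01 04:39:38",
    "2015-01-01 18:05:00",  # hypothetical extra row
]))

# Count rows per integer hour, then reindex to all 24 hours so that
# hours without any rows appear with a count of 0.
views_per_hour = times.dt.hour.value_counts().reindex(range(24), fill_value=0)

print(views_per_hour)
```

Integer hours also sort numerically, so no string formatting is needed to keep the output in order.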


In [136]:
merged = pd.merge(stream, chat, on = 'device_id', how = 'inner')
merged.to_csv("merged.csv", index = False)

new_file = pd.read_csv("merged.csv")
new_file.head()

Unnamed: 0,time_x,device_id,login_x,channel_x,country_x,player_x,game_x,stream_format,subscriber,genre,time_y,login_y,channel_y,country_y,player_y,game_y
0,2015-01-01 18:33:52,40ffc2fa6534cf760becbdbf5311e31ad069e46e,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,,MOBA,2015-01-01 22:43:57,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends
1,2015-01-01 23:11:18,40ffc2fa6534cf760becbdbf5311e31ad069e46e,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,,MOBA,2015-01-01 22:43:57,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends
2,2015-01-01 17:29:23,40ffc2fa6534cf760becbdbf5311e31ad069e46e,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,,MOBA,2015-01-01 22:43:57,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends
3,2015-01-01 16:39:05,40ffc2fa6534cf760becbdbf5311e31ad069e46e,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,,MOBA,2015-01-01 22:43:57,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends
4,2015-01-01 21:54:57,40ffc2fa6534cf760becbdbf5311e31ad069e46e,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends,,,MOBA,2015-01-01 22:43:57,085c1eb7b587bfe654f0df7b4ba7f4fc4013636c,frank,US,iphone_t,League of Legends
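
In the rows above, one device's single chat record is repeated against each of its stream records, because an inner merge on `device_id` pairs every matching stream row with every matching chat row. A minimal sketch of that effect on made-up frames, plus one possible alternative that flags stream rows whose device also chatted. The names `stream_toy`, `chat_toy`, and `chatted` are hypothetical.

```python
import pandas as pd

stream_toy = pd.DataFrame({
    "device_id": ["d1", "d1", "d1", "d2"],
    "game": ["Dota 2", "Dota 2", "DayZ", "DayZ"],
})
chat_toy = pd.DataFrame({
    "device_id": ["d1", "d1", "d3"],
    "game": ["Dota 2", "DayZ", "Dota 2"],
})

# Many-to-many merge: d1 has 3 stream rows and 2 chat rows, giving 3 * 2 = 6 rows.
merged_toy = stream_toy.merge(
    chat_toy, on="device_id", how="inner", suffixes=("_stream", "_chat")
)
print(len(merged_toy))

# Alternative: keep one row per stream record and mark whether the device chatted.
stream_toy["chatted"] = stream_toy["device_id"].isin(chat_toy["device_id"].unique())
print(stream_toy)
```

Whether the multiplied rows or the flag is more useful depends on the question being asked of the merged data.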
