## Scraping Twitch Chat Data

#### Using Twitch Sockets from https://www.learndatasci.com/tutorials/how-stream-text-data-twitch-sockets-python/

In [10]:
import os

server = 'irc.chat.twitch.tv'
port = 6667
nickname = 'gCapstone1'
token = os.environ.get('TWITCH_OAUTH_TOKEN')


#specify the channel and log file name
channel = '#bmkibler' #hearthstone
log_file_name = 'bmkibler_chat.log'
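
`os.environ.get` returns `None` when the variable is unset, and Twitch's IRC `PASS` command expects the value in the form `oauth:<token>`. A minimal sketch of a check that could run before connecting is shown here; the helper name `load_token` is hypothetical and not part of the original tutorial.

```python
import os

def load_token(env_var='TWITCH_OAUTH_TOKEN'):
    """Return the token from the environment, prefixed with 'oauth:'."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early instead of sending "PASS None" to the server
        raise RuntimeError(f"{env_var} is not set")
    token = token.strip()
    # Twitch's IRC PASS command expects the form 'oauth:<token>'
    if not token.startswith('oauth:'):
        token = 'oauth:' + token
    return token
```

With this in place, `token = load_token()` would either give a usable value or stop with a clear error.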

In [11]:
#To establish a connection to Twitch IRC we'll be using Python's socket library. First we need to instantiate a socket:

import socket

sock = socket.socket()
sock.connect((server, port))

In [12]:
#Once connected, we need to send our token and nickname for authentication, and the channel to connect to over the socket.

# With sockets, we need to send() these parameters as encoded strings:

sock.send(f"PASS {token}\n".encode('utf-8'))
sock.send(f"NICK {nickname}\n".encode('utf-8'))
sock.send(f"JOIN {channel}\n".encode('utf-8'))

15
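
The `15` above is the return value of the last `send()` call: the number of bytes written. `send()` may write fewer bytes than requested, and the IRC protocol terminates commands with `\r\n` (the server evidently accepted a bare `\n` here). The sketch below wraps both points in a hypothetical helper, `send_command`, and demonstrates it on a local socket pair rather than a Twitch connection.

```python
import socket

def send_command(sock, command):
    """Send one IRC command terminated with CRLF; return the byte count."""
    data = (command + '\r\n').encode('utf-8')
    # sendall() keeps writing until the whole buffer is sent, whereas
    # send() may write only part of it and returns the count written
    sock.sendall(data)
    return len(data)

# Local demonstration on a connected socket pair (no network needed)
left, right = socket.socketpair()
sent = send_command(left, 'JOIN #bmkibler')
received = right.recv(2048)
left.close()
right.close()
```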

In [13]:
# Now we have successfully connected and can receive responses from the channel we subscribed to. To get a single response we can call .recv() and then decode the message from bytes:

resp = sock.recv(2048).decode('utf-8')
print(resp)

:tmi.twitch.tv 001 1337mik3 :Welcome, GLHF!
:tmi.twitch.tv 002 1337mik3 :Your host is tmi.twitch.tv
:tmi.twitch.tv 003 1337mik3 :This server is rather new
:tmi.twitch.tv 004 1337mik3 :-
:tmi.twitch.tv 375 1337mik3 :-
:tmi.twitch.tv 372 1337mik3 :You are in a maze of twisty passages, all alike.
:tmi.twitch.tv 376 1337mik3 :>
:1337mik3!1337mik3@1337mik3.tmi.twitch.tv JOIN #bmkibler
:1337mik3.tmi.twitch.tv 353 1337mik3 = #bmkibler :1337mik3
:1337mik3.tmi.twitch.tv 366 1337mik3 #bmkibler :End of /NAMES list
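
The response above holds several IRC lines in a single read, and a read can also end partway through a line. One common way to handle this is to keep a buffer and only process complete lines. This is a sketch, assuming lines end with `\r\n` as in standard IRC; `split_lines` is a hypothetical helper, and the sample text uses a made-up nickname.

```python
def split_lines(buffer, chunk):
    """Append chunk to buffer; return (complete_lines, leftover_text)."""
    buffer += chunk
    # Everything before the last separator is complete; the final piece
    # is either '' or the start of a line that has not fully arrived yet
    *lines, rest = buffer.split('\r\n')
    return lines, rest

buffer = ''
first, buffer = split_lines(
    buffer, ':tmi.twitch.tv 001 demo :Welcome, GLHF!\r\n:tmi.twitch.tv 002 de')
second, buffer = split_lines(buffer, 'mo :Your host is tmi.twitch.tv\r\n')
```

In a receive loop, each decoded `recv()` result would be passed in as `chunk`, and the leftover text carried into the next iteration.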



In [14]:
# Note: running this the first time will show a welcome message from Twitch. Run it again to show the first message from the channel.

# The 2048 is the buffer size in bytes: the maximum amount of data to receive in one call. The convention is to use small powers of 2, so 1024, 2048, 4096, etc. Rerunning the above receives the next data pushed to the socket, which may contain more than one message.

resp = sock.recv(2048).decode('utf-8')
print(resp)

:capozmichael!capozmichael@capozmichael.tmi.twitch.tv PRIVMSG #bmkibler :but what's the plan of this deck vs control? maybe landing, but not enough
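
A chat line like the one above has a fixed shape: `:user!user@user.tmi.twitch.tv PRIVMSG #channel :text`. The following sketch pulls out the three useful fields with a regular expression. The pattern and the name `parse_privmsg` are illustrative assumptions based on the output shown here, not an official Twitch parser.

```python
import re

# user is everything between the leading ':' and the first '!'
PRIVMSG_PATTERN = re.compile(
    r'^:(?P<user>[^!]+)![^ ]+ PRIVMSG (?P<channel>#[^ ]+) :(?P<text>.*)$'
)

def parse_privmsg(line):
    """Return a dict with user, channel and text, or None if not a PRIVMSG."""
    match = PRIVMSG_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.groupdict()

sample = (":capozmichael!capozmichael@capozmichael.tmi.twitch.tv PRIVMSG "
          "#bmkibler :but what's the plan of this deck vs control? "
          "maybe landing, but not enough")
parsed = parse_privmsg(sample)
```

Server lines such as the welcome messages do not match the pattern, so `parse_privmsg` returns `None` for them.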



In [15]:
# Writing messages to a file
# Right now, our socket is being inundated with responses from Twitch, but we have two problems:

# 1. We need to continuously check for new messages
# 2. We want to log the messages as they come in

# To fix this, we'll use a loop to check for new messages while the socket is open and use Python's logging library to log messages to a file.

# First, let's set up a basic logger in Python that will write messages to a file:

import logging

# saves to the specified log file name
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s — %(message)s',
                    datefmt='%Y-%m-%d_%H:%M:%S',
                    handlers=[logging.FileHandler(log_file_name, encoding='utf-8')])

In [16]:
logging.info(resp)
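
Given the `format` and `datefmt` passed to `basicConfig`, each record written to the log starts with a timestamp, followed by the dash separator and the raw message. The sketch below shows one way to read such a line back into a `datetime` and a message string. `parse_log_line` is a hypothetical helper, the sample line is made up, and the code assumes the line begins with a timestamp.

```python
from datetime import datetime

SEPARATOR = ' \u2014 '             # the dash from the format string, with spaces
TIME_FORMAT = '%Y-%m-%d_%H:%M:%S'  # same pattern as datefmt

def parse_log_line(line):
    """Split one log line into (datetime logged, raw IRC message)."""
    timestamp, _, message = line.partition(SEPARATOR)
    return datetime.strptime(timestamp, TIME_FORMAT), message.rstrip('\r\n')

# Made-up sample in the same shape as a logged chat message
sample = ('2018-12-10_11:26:40 \u2014 '
          ':demo!demo@demo.tmi.twitch.tv PRIVMSG #demo :hello chat\r\n')
logged_at, message = parse_log_line(sample)
```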

In [17]:
# Opening the log file (bmkibler_chat.log here) shows logged messages in this format:

# 2018-12-10_11:26:40 — :spappygram!spappygram@spappygram.tmi.twitch.tv PRIVMSG #ninja :Chat, let Ninja play solos if he wants. His friends can get in contact with him.

# So we have the time the message was logged at the beginning, a dash separator, and then the message. This corresponds to the format argument we used in basicConfig.

# Later, we'll parse each message and use the time as a piece of data to explore.

# Continuous message writing
# Now on to continuously checking for new messages in a loop.

# When we're connected to the socket, Twitch (like other IRC servers) will periodically send a keyword, "PING", to check if you're still using the connection. We want to check for this keyword and send an appropriate response, "PONG".

# One other thing we'll do is parse emojis so they can be written to a file. To do this, we'll use the emoji library that will provide a mapping from emojis to their meaning in words. For example, if a 👍 shows up in a message it'll be converted to :thumbs_up:.

# The following is a while loop that will continuously check for new messages from the socket, send a PONG if necessary, and log messages with parsed emojis:

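from emoji import demojize

counter = 1

while True:
    # errors='replace' avoids a crash if a multi-byte character is split across two reads
    resp = sock.recv(2048).decode('utf-8', errors='replace')

    # an empty read means the server closed the connection, so stop looping
    if not resp:
        break

    print(resp)

    if resp.startswith('PING'):
        # Twitch sends "PING :tmi.twitch.tv" and expects the same payload back
        sock.send("PONG :tmi.twitch.tv\n".encode('utf-8'))
    else:
        logging.info(demojize(resp))

    print(counter)
    counter += 1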
        
# This will keep running until you stop it. To see the messages in real time, open a new terminal, navigate to the log's location, and run tail -f bmkibler_chat.log.

:analogrebellion!analogrebellion@analogrebellion.tmi.twitch.tv PRIVMSG #bmkibler :LUL
:tbs77340!tbs77340@tbs77340.tmi.twitch.tv PRIVMSG #bmkibler :i never knew magic rules, i don't know how to play this game :((
:baconality!baconality@baconality.tmi.twitch.tv PRIVMSG #bmkibler :could you mouse over the leylines he has?

1
:medal_2!medal_2@medal_2.tmi.twitch.tv PRIVMSG #bmkibler :@tbs77340 try it out! It has a pretty decent tutorial that shows you the ropes

2
:anduin_wrinn!anduin_wrinn@anduin_wrinn.tmi.twitch.tv PRIVMSG #bmkibler :@bmkibler sry for offtopic, reaction of hs annoucement, already gone? bmkGalv

3
:keksz04!keksz04@keksz04.tmi.twitch.tv PRIVMSG #bmkibler :@tbs77340 - same here :d

4
:therealaurumserenity!therealaurumserenity@therealaurumserenity.tmi.twitch.tv PRIVMSG #bmkibler :@hoo_hoo_kachoo my friends are going to hate you forever and they don’t even know you. Thank you for making this deck. Lol

5
:moobot!moobot@moobot.tmi.twitch.tv PRIVMSG #bmkibler :Support the stream

KeyboardInterrupt: 
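
The loop above tests only the start of each read, so a `PING` that arrives in the same read as a chat message would be logged instead of answered. A minimal sketch of a per-line check follows, assuming lines end with `\r\n`. The names `pong_for` and `process_chunk` are hypothetical, and the sample chunk is made up.

```python
def pong_for(line):
    """Return the PONG reply for a PING line, or None for any other line."""
    if line.startswith('PING'):
        # Echo back whatever followed PING, e.g. ' :tmi.twitch.tv'
        return 'PONG' + line[len('PING'):].rstrip('\r\n') + '\r\n'
    return None

def process_chunk(chunk):
    """Split a decoded chunk into (replies_to_send, chat_lines_to_log)."""
    replies, chat_lines = [], []
    for line in chunk.split('\r\n'):
        if not line:
            continue  # skip the empty piece after the final separator
        reply = pong_for(line)
        if reply is not None:
            replies.append(reply)
        else:
            chat_lines.append(line)
    return replies, chat_lines

chunk = ':a!a@a.tmi.twitch.tv PRIVMSG #demo :hi\r\nPING :tmi.twitch.tv\r\n'
replies, chat_lines = process_chunk(chunk)
```

Inside the loop, each entry in `replies` would be encoded and sent over the socket, and each entry in `chat_lines` passed to `logging.info`.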

In [18]:
# Close the socket when you are done collecting messages

sock.close()
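
Because the logging loop is stopped with a keyboard interrupt, it is easy to forget the separate `sock.close()` cell. One option is to wrap the loop so the socket is always closed on the way out. This is a sketch under that assumption: `run_until_interrupted` and `fake_handler` are hypothetical names, and the demonstration uses a local socket pair instead of a Twitch connection.

```python
import socket

def run_until_interrupted(sock, handle):
    """Run handle(sock) until Ctrl+C, then close the socket.

    Returns True if the run ended because of a KeyboardInterrupt.
    """
    interrupted = False
    try:
        handle(sock)
    except KeyboardInterrupt:
        interrupted = True
    finally:
        # Runs whether handle() returned, was interrupted, or raised
        sock.close()
    return interrupted

def fake_handler(sock):
    # Stands in for the logging loop being stopped by the user
    raise KeyboardInterrupt

left, right = socket.socketpair()
stopped = run_until_interrupted(left, fake_handler)
right.close()
```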