#Social Data Mining, Week 3 Assignments

These assignments will help you learn to invoke social API calls from Python, find effective client wrapper modules, and read social API documentation.

This week's assignments require disparate software systems to talk to each other. Many things can go wrong (installation problems, network errors, etc.), and the error messages you see may seem confusing. **If you get stuck, please post to the forum. We're here to help you!**

#Assignment 1: Continue exploring the GitHub module.

**Background:** This question will give you experience navigating the documentation of a social API such as GitHub's, and help you learn to write "native" Python code that connects to a social media API.

**Point total:** 10 points.

**Time estimate:** 60 minutes.

In the lesson we learned how to retrieve recent GitHub changes. Study the GitHub API to write a function called print_commit_messages. The function should output the time and message of the 10 most recent commits by a particular user.

In [59]:
import urllib
import json
import pprint


def print_commit_messages(username):
    """
        Prints the timestamp and commit messages for the most recent 10 commits by the user.
    """
    # open an http connection to the url and return a file for it
    eventQuery = 'https://api.github.com/users/' + username + '/events'
    url = urllib.urlopen(eventQuery)

    # read the http response into a string and parse it as JSON.
    response = url.read()
    responseData = json.loads(response)
    commit_count = 0
    print "Github Commits for user = ", username, "\n"
    for responseRec in responseData:
        if responseRec["type"]=='PushEvent':
            print responseRec["type"], " created at ", responseRec["created_at"]

            for commit in responseRec["payload"]["commits"]:
                commit_count+=1
                if commit_count<=10:
                    print "commit message[" + str(commit_count) + "] = \n"      
                    print "+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\n"
                    print commit["message"]
                    print "+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\n"

                else:
                    break
        # Stop once 10 commits are printed, so no extra event header appears.
        if commit_count>=10:
            break
#print_commit_messages('polsztyn')
print_commit_messages('torvalds')


Github Commits for user =  torvalds 

PushEvent  created at  2017-11-13T03:11:04Z
commit message[1] = 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

stop using '%pK' for /proc/kallsyms pointer values

Not only is it annoying to have one single flag for all pointers, as if
that was a global choice and all kernel pointers are the same, but %pK
can't get the 'access' vs 'open' time check right anyway.

So make the /proc/kallsyms pointer value code use logic specific to that
particular file.  We do continue to honor kptr_restrict, but the default
(which is unrestricted) is changed to instead take expected users into
account, and restrict access by default.

Right now the only actual expected user is kernel profiling, which has a
separate sysctl flag for kernel profile access.  There may be others.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

***Hint:*** https://developer.github.com/v3/activity/events/#list-events-performed-by-a-user
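
If you are working in Python 3, where `urllib.urlopen` has moved to `urllib.request.urlopen` and `print` is a function, the same idea might be sketched as follows. This is only a sketch, not a reference solution. The helper names `fetch_user_events` and `recent_commit_messages` are made up for illustration, and the code assumes, as the cell above does, that each `PushEvent` record carries its commits under `payload` and `commits`. The parsing step is kept separate from the download so you can try it on hand-written data without touching the network.

```python
import json
import urllib.request


def fetch_user_events(username):
    """Download the public events feed for a GitHub user (needs network access)."""
    url = 'https://api.github.com/users/' + username + '/events'
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))


def recent_commit_messages(events, limit=10):
    """Return up to `limit` (timestamp, message) pairs from PushEvent records."""
    results = []
    for event in events:
        # Only push events are assumed to carry commits in their payload.
        if event.get('type') != 'PushEvent':
            continue
        for commit in event.get('payload', {}).get('commits', []):
            if len(results) == limit:
                return results
            results.append((event['created_at'], commit['message']))
    return results


# A tiny hand-written sample in the shape of the events feed (values are
# illustrative), so the parsing step can be tried offline.
sample_events = [
    {'type': 'WatchEvent', 'created_at': '2017-11-13T04:00:00Z', 'payload': {}},
    {'type': 'PushEvent', 'created_at': '2017-11-13T03:11:04Z',
     'payload': {'commits': [{'message': 'first'}, {'message': 'second'}]}},
]
for timestamp, message in recent_commit_messages(sample_events, limit=10):
    print(timestamp, message)
```

To run it against the live API, pass the result of `fetch_user_events('torvalds')` to `recent_commit_messages` instead of the sample list.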

#Assignment 2: Learn the Reddit API

**Background:** This assignment will continue to improve your native Python social media API coding skills. It will also give you practice choosing and installing wrapper client modules, and triangulating the documentation for a wrapper library with the documentation for a social media API.

**Point total:** 10 points.

**Time estimate:** 90 minutes

Next, you'll practice using both native Python and wrapper-based API calls to the Reddit API. As part of this question, you'll need to

1. Explore the [Reddit API documentation](http://www.reddit.com/dev/api), 
2. Compare alternate Reddit Python client wrapper modules, and
3. Install one of the modules. 

You will need to complete these steps to access social media APIs out in the world. However, in this class you have lots of help. Ask if you get stuck!

Print the top posts in the [Reddit python forum](http://www.reddit.com/r/python). For each post, print the title, URL, submitter, and number of votes. Do this in two ways:

1. Using "native Python" code with urllib and the json module.
2. Using a reasonable Python client wrapper module.

***Hint:*** You can just add ".json" to any Reddit webpage URL to get the API call results.
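
As a small illustration of the hint, the sketch below turns a Reddit page URL into its JSON counterpart. It is written for Python 3, and the function name `reddit_json_url` is hypothetical. Instead of gluing ".json" onto the end of the whole string, it appends it to the path, so a URL that already has a query string still comes out well-formed.

```python
from urllib.parse import urlsplit, urlunsplit


def reddit_json_url(page_url):
    """Append '.json' to the path of a Reddit page URL, keeping any query string."""
    parts = urlsplit(page_url)
    path = parts.path
    if not path.endswith('.json'):
        # Drop a trailing slash so '/r/python/' becomes '/r/python.json'.
        path = path.rstrip('/') + '.json'
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(reddit_json_url('http://www.reddit.com/r/python'))
print(reddit_json_url('https://www.reddit.com/r/Python/top/?t=week'))
```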

In [60]:
# Using the praw reddit interface.
def install_module(package_name):
    try:
        __import__(package_name)
        print('module ' + package_name + ' already installed')
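    except ImportError:
        print('installing module ' + package_name)
        # pip.main() was removed in pip 10; run pip as a subprocess instead.
        import subprocess
        import sys
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', package_name])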

install_module('praw')
import praw

module praw already installed
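
With praw installed, the wrapper-based half of the assignment might look roughly like the sketch below. Treat it as a starting point under stated assumptions rather than a finished answer: it assumes praw 4 or later, it assumes you have registered a Reddit app to obtain a client id and secret, and the names `format_submission` and `print_top_posts_with_praw` as well as the user agent string are placeholders chosen for illustration. Check the praw documentation for the version you installed. The code is written for Python 3.

```python
from types import SimpleNamespace


def format_submission(submission):
    """Build one output line from a praw-style submission object."""
    return 'title={0} | url={1} | submitter={2} | votes={3}'.format(
        submission.title, submission.url, submission.author, submission.score)


def print_top_posts_with_praw(client_id, client_secret, limit=10):
    """Print the top posts in r/python through praw (needs credentials and network)."""
    import praw  # imported here so the formatting helper works without praw
    reddit = praw.Reddit(client_id=client_id,
                         client_secret=client_secret,
                         user_agent='week3-assignment-sketch')  # placeholder value
    for submission in reddit.subreddit('python').top(limit=limit):
        print(format_submission(submission))


# Stand-in object with made-up values, used only to show the output format.
fake_post = SimpleNamespace(title='Example title', url='https://example.com',
                            author='example_user', score=42)
print(format_submission(fake_post))
```

Keeping the formatting in its own function means you can check the output layout with a stand-in object before you have working credentials.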


In [62]:
import urllib
import json
import pprint

def print_reddit_python_messages():
    """
        Prints title, URL, submitter, and number of votes of top Reddit Python posts.
    """
    # open an http connection to the url and return a file for it
    redditQuery = 'https://www.reddit.com/r/Python/top/.json'
    url = urllib.urlopen(redditQuery)
    response = url.read()
    responseData = json.loads(response)
    print "type = ", type(responseData)
    print responseData.keys()
    
    
    for article in responseData['data']['children']:
        articleTitle =     article['data']['title']
        articleURL =       article['data']['permalink'] 
        articleSubmitter = article['data']['author']    
        articleVoteCount = article['data']['score']  
    
        print '\n'
        print 'articleTitle = ', articleTitle
        print 'articleURL = ', articleURL    
        print 'articleSubmitter = ', articleSubmitter     
        print 'articleVoteCount = ', str(articleVoteCount)  
        print '\n'

print_reddit_python_messages()


type =  <type 'dict'>
[u'message', u'error']


KeyError: 'data'
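
The output above shows a dict whose only keys are `message` and `error`, so Reddit answered with an error object rather than a listing, and indexing it with `['data']` fails. One plausible cause, which the output shown does not confirm, is that Reddit throttles requests sent with the default urllib User-Agent. The Python 3 sketch below sends an explicit User-Agent and checks the shape of the response before reading it. The function names `fetch_listing` and `extract_posts` and the user agent string are illustrative assumptions.

```python
import json
import urllib.error
import urllib.request


def fetch_listing(url, user_agent='week3-assignment-sketch'):
    """Fetch a Reddit JSON listing, sending an explicit User-Agent header."""
    request = urllib.request.Request(url, headers={'User-Agent': user_agent})
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode('utf-8'))
    except urllib.error.HTTPError as err:
        # Python 3 raises on HTTP error statuses; return them as a dict instead.
        return {'error': err.code, 'message': err.reason}


def extract_posts(listing):
    """Return (title, permalink, author, score) tuples, or raise a clear error."""
    if 'data' not in listing:
        raise ValueError('Reddit returned an error instead of a listing: %r' % (listing,))
    posts = []
    for child in listing['data']['children']:
        post = child['data']
        posts.append((post['title'], post['permalink'], post['author'], post['score']))
    return posts


# Hand-written samples in the two shapes discussed above (values are made up).
ok_listing = {'data': {'children': [
    {'data': {'title': 'Example', 'permalink': '/r/Python/comments/x/',
              'author': 'example_user', 'score': 7}}]}}
error_listing = {'message': 'example message', 'error': 0}

print(extract_posts(ok_listing))
try:
    extract_posts(error_listing)
except ValueError as exc:
    print(exc)
```

For the live feed, call `extract_posts(fetch_listing('https://www.reddit.com/r/Python/top/.json'))`.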

#Assignment 3: Bieber fever OR Bieber fury?

**Background:** After completing this question, you will understand how to connect two disparate social media APIs to complete a complex task.

**Point total:** 15 points.

**Time estimate:** 3 hours

For this assignment, you will write a Python program that finds 10 tweets about Justin Bieber (or any other person you'd like) as they happen, and classifies them as positive or negative in [sentiment](http://en.wikipedia.org/wiki/Sentiment_analysis). To do this, you'll need to use BOTH the Twitter API and AlchemyAPI.


You already toured the streaming Twitter API in the lesson. You just need to adapt the code to look for Bieber.

For the last step, you need to estimate the sentiment associated with each tweet. We'll use the AlchemyAPI. Begin by [registering for the free AlchemyAPI license](http://www.alchemyapi.com/api/register.html). This license entitles you to 1000 API calls per day. You'll receive the license key via email.

Once you have the license, you can install the module by running the following cell, **substituting your license key**.


In [7]:
# You will receive this in an email after registering for a free license at http://www.alchemyapi.com/api/register.html
LICENSE_KEY = 'XXXXXXXXXXXXXX'

def install_alchemy(key):
    import urllib
    import os

    # Write the license key to api_key.txt unless the file already exists.
    if not os.path.isfile('api_key.txt'):
        with open('api_key.txt', 'w') as f:
            f.write(key)

    # Download the alchemyapi module unless it is already present.
    if not os.path.isfile('alchemyapi.py'):
        f = urllib.urlopen('https://raw.githubusercontent.com/AlchemyAPI/alchemyapi_python/master/alchemyapi.py')
        s = f.read()
        f.close()
        with open('alchemyapi.py', 'w') as f:
            f.write(s)

    # Check that the downloaded module can be imported.
    import alchemyapi
    

install_alchemy(LICENSE_KEY)

All that remains is to figure out how to ask the alchemyapi module for the sentiment of each Bieber-related tweet.

***Hint:*** The [alchemyapi python tutorial](http://www.alchemyapi.com/developers/getting-started-guide/using-alchemyapi-with-python/) provides a fantastic overview of the module. Since you already installed the module using the above code, you can skip down to the "Next Steps" section.
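
To give you a feel for the last step, here is a rough sketch of how a tweet's text might be turned into a sentiment label. It rests on assumptions you should verify against the tutorial: that the downloaded module exposes an `AlchemyAPI` class with a `sentiment(flavor, data)` method, and that a successful response is a dict with a `status` of `'OK'` and a `docSentiment` entry holding a `type`. The helper names `sentiment_label` and `tweet_sentiment` are invented for this example.

```python
def sentiment_label(response):
    """Map an AlchemyAPI-style sentiment response to a label, or None on failure."""
    # Assumed response shape: {'status': 'OK', 'docSentiment': {'type': ...}}
    if response.get('status') != 'OK':
        return None
    return response['docSentiment']['type']


def tweet_sentiment(text):
    """Ask AlchemyAPI for the sentiment of one tweet (needs alchemyapi.py and a key)."""
    from alchemyapi import AlchemyAPI  # the module downloaded by install_alchemy
    client = AlchemyAPI()
    return sentiment_label(client.sentiment('text', text))


# Hand-written responses in the assumed shape; no API call is made here.
print(sentiment_label({'status': 'OK', 'docSentiment': {'type': 'positive', 'score': '0.5'}}))
print(sentiment_label({'status': 'ERROR', 'statusInfo': 'example failure'}))
```

Returning `None` for a failed call lets the tweet loop skip that tweet without stopping, which matters when you have a limited number of API calls per day.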