# GitHub comments sentiment analysis

An example of how to combine the public (free) APIs from GitHub and Google to analyse the sentiment (positive or negative) of comments on pull requests.

Requirements:
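- Google Cloud Natural Language client: pip install google-cloud-language (the umbrella google-cloud package is deprecated)
- Python 3.6 or later (the code uses f-strings)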


## Used packages

In [None]:
from google.cloud import language
import json
import os
import pprint
import urllib.request
from IPython.display import display, Markdown  # Used to get some fancier looking output

## Configuration settings

In [None]:
url = "https://api.github.com/repos/" # GitHub API entry point
owner = "astropy"                     # Repository owner
repo = "astropy"                      # Repository name
number = "7712"                       # Pull request number
req_type = "comments"                 # Either "comments" or "reviews", see:
# https://api.github.com/repos/astropy/astropy/pulls/7712/reviews

# Set the path to your Google API credentials in the environment, see: 
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "<path to credentials.json>"

## Actual code

### Load some comments from GitHub using the REST API

In [None]:
url_full = f"{url}{owner}/{repo}/pulls/{number}/{req_type}" # Build the request URL
contents = urllib.request.urlopen(url_full).read()
github_data = json.loads(contents)
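
The request above is sent without authentication, and GitHub limits unauthenticated clients to 60 requests per hour. The following is a minimal sketch of a variant that sets explicit headers, a timeout and an optional access token. `build_request`, `fetch_json` and the `token` parameter are hypothetical names that are not part of the original notebook.

```python
import json
import urllib.request


def build_request(owner, repo, number, req_type="comments", token=None):
    """Build a Request for a pull request endpoint (hypothetical helper)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}/{req_type}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Assumption: a personal access token is available to raise the rate limit
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With these helpers, `github_data = fetch_json(build_request(owner, repo, number, req_type))` could stand in for the `urlopen` and `json.loads` lines above.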

### Convert the comments into Documents that will be sent to the Google API

In [None]:
documents = []
for comment in github_data:
    #print(comment['user']['login'])  # The username of the person who posted the comment
    #TODO strip the comment from special characters
    documents.append(language.types.Document(content=comment['body'], type='PLAIN_TEXT',))
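
The TODO above mentions stripping special characters from the comment. GitHub comments are written in Markdown, so one possible approach is a rough regex-based clean-up before the text is wrapped in a Document. This is only a sketch: `clean_comment` is a hypothetical helper, the patterns are assumptions rather than a full Markdown parser, and they will occasionally remove legitimate characters such as `>` in a comparison.

```python
import re


def clean_comment(body):
    """Roughly strip Markdown and HTML markup from a comment (hypothetical helper)."""
    text = re.sub(r"`{3}.*?`{3}", " ", body, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"`[^`]*`", " ", text)                        # inline code
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)      # links and images: keep the label
    text = re.sub(r"<[^>]+>", " ", text)                        # HTML tags
    text = re.sub(r"[*#>]+", " ", text)                         # emphasis, headings, quote markers
    return re.sub(r"\s+", " ", text).strip()                    # collapse whitespace
```

The loop above could then pass `content=clean_comment(comment['body'])` instead of the raw body.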

### Query the Google API for the sentiment of every comment

In [None]:
google_client = language.LanguageServiceClient()
results = []
for doc in documents:
    results.append(google_client.analyze_sentiment(document=doc, encoding_type='UTF32'))

### Print the results 

**score** of the sentiment ranges between -1.0 (negative) and 1.0 (positive) and corresponds to the overall emotional leaning of the text.

**magnitude** indicates the overall strength of emotion (both positive and negative) within the given text, between 0.0 and +inf. Unlike score, magnitude is not normalized; each expression of emotion within the text (both positive and negative) contributes to the text's magnitude (so longer text blocks may have greater magnitudes).
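
To make the interplay between the two values concrete, the sketch below maps a (score, magnitude) pair to a coarse label. `describe_sentiment` is a hypothetical helper and both thresholds are arbitrary assumptions chosen for illustration. They are not values defined by the Google API and would need tuning for real data.

```python
def describe_sentiment(score, magnitude, score_threshold=0.25, magnitude_threshold=1.0):
    """Map a score/magnitude pair to a coarse label (thresholds are assumptions)."""
    if score >= score_threshold:
        return "positive"
    if score <= -score_threshold:
        return "negative"
    # A score near zero with a high magnitude suggests that positive and
    # negative statements cancel out; with a low magnitude there is little emotion.
    return "mixed" if magnitude >= magnitude_threshold else "neutral"
```

For a result `r`, the label could be obtained with `describe_sentiment(r.document_sentiment.score, r.document_sentiment.magnitude)`.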

In [None]:
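for r in results:
    # Show the first sentence of the comment, if the API returned any sentences
    first_sentence = r.sentences[0].text.content if r.sentences else ""
    score = round(r.document_sentiment.score, 2)
    magnitude = round(r.document_sentiment.magnitude, 2)
    # Use an f-string so that braces in the comment text cannot break str.format()
    text = f"---\n{first_sentence}\n\n**Score:** {score} **Magnitude:** {magnitude}\n\n"
    display(Markdown(text))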