<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# NLP & APIs: Predicting Positive & Negative Sentiments in Ubisoft's Reviews

--- 
# Part 1


---

### Contents:
- [Obtaining Data](#Data-Import)
- [Exporting csv](#Export-csv)

In [24]:
# Importing all libraries used: 

import requests
import pandas as pd      
import re
import os

## Data Import

All data is retrieved from Steam user reviews of Assassin's Creed IV: Black Flag. The API documentation can be found [here](https://partner.steamgames.com/doc/store/getreviews).

In [25]:
# Using the Steam API to retrieve reviews of Assassin's Creed IV: Black Flag (app ID 242050).
ass_creed_res = requests.get(url='https://store.steampowered.com/appreviews/242050?json=1').json()

In [None]:
# Identifying the keys of the dictionary. 
ass_creed_res.keys()

In [None]:
# Accessing the query summary. Note that this response contains only the first 20 reviews.
ass_creed_res['query_summary']

The query summary reports 22,056 total reviews for the game on Steam. This is the most reviews we will be able to work with to classify our data.
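
The total also gives a rough idea of how many requests a full download takes, since the Steam documentation linked above caps `num_per_page` at 100 (the default is 20). A small sketch, assuming a `query_summary` shaped like the one returned above; the sample dict and the helper name `pages_needed` are illustrative and not part of the API:

```python
import math

# Illustrative stand-in for ass_creed_res['query_summary'];
# total_reviews is the figure quoted in the text above.
sample_summary = {"num_reviews": 20, "total_reviews": 22056}

def pages_needed(total_reviews, per_page=100):
    """Return how many requests are needed to page through all reviews."""
    return math.ceil(total_reviews / per_page)

print(pages_needed(sample_summary["total_reviews"]))      # at 100 reviews per request
print(pages_needed(sample_summary["total_reviews"], 20))  # at the default of 20 per request
```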

In [None]:
# The cursor will be needed to access the next 20 reviews.
ass_creed_res['cursor']

In [None]:
# Trying to access the next 20 reviews by passing the cursor from the previous response.
requests.get(url='https://store.steampowered.com/appreviews/242050', params={'json': 1, 'cursor': ass_creed_res['cursor']}).json()
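
Cursor values are opaque tokens, and the Steam documentation notes that they may contain characters that must be URL-encoded before they go into a query string. The sketch below shows what that encoding looks like using only the standard library. The helper name `build_review_url` and the second cursor value are made up for illustration:

```python
from urllib.parse import urlencode

def build_review_url(appid, cursor="*"):
    """Build an appreviews URL with the cursor safely URL-encoded."""
    base = "https://store.steampowered.com/appreviews/"
    query = urlencode({"json": 1, "cursor": cursor})
    return f"{base}{appid}?{query}"

# A cursor made only of letters and digits passes through unchanged.
print(build_review_url("242050", "AoIIPxCyEH66sZwE"))

# A made-up cursor containing '+', '/' and '=' is percent-encoded.
print(build_review_url("242050", "a+b/c="))
```

Passing the cursor through the `params` argument of `requests.get` performs the same encoding automatically.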

Next, we automate this process with a function.

In [None]:
# Defining functions to automate this process

def get_reviews(appid, params=None):
    """Request one page of reviews for `appid` and return the parsed JSON."""
    if params is None:
        params = {'json': 1}
    url = 'https://store.steampowered.com/appreviews/'
    response = requests.get(url=url+appid, params=params)
    return response.json()

def get_n_reviews(appid, n=100):
    """Collect up to `n` reviews by following the cursor from page to page."""
    reviews = []
    cursor = '*'  # '*' requests the first page
    params = {
            'json' : 1,
            'filter' : 'all',
            'language' : 'english',
            'day_range' : 9223372036854775807,
            'review_type' : 'all',
            'purchase_type' : 'all'
            }

    while n > 0:
        params['cursor'] = cursor  # requests URL-encodes the value
        params['num_per_page'] = min(100, n)  # at most 100 reviews per request
        n -= 100

        response = get_reviews(appid, params)
        cursor = response['cursor']
        reviews += response['reviews']

        # A short page means there are no more reviews to fetch.
        if len(response['reviews']) < 100: break

    return reviews
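
The paging logic can be checked without touching the network by swapping the HTTP call for a stand-in. The following is a minimal sketch rather than the notebook's own function: `collect_reviews` and `make_fake_fetch` are hypothetical names, and the fake fetcher simply uses the next index as its cursor, which is an assumption made for testing and not how Steam cursors work.

```python
def collect_reviews(fetch_page, n=100, page_size=100):
    """Follow cursors until n reviews are collected or the source runs dry."""
    reviews = []
    cursor = "*"  # '*' asks for the first page
    while len(reviews) < n:
        response = fetch_page(cursor, min(page_size, n - len(reviews)))
        batch = response["reviews"]
        if not batch:  # an empty page means nothing is left
            break
        reviews.extend(batch)
        if response["cursor"] == cursor:  # guard against a cursor that never advances
            break
        cursor = response["cursor"]
    return reviews

def make_fake_fetch(total):
    """Return a fake fetcher serving `total` reviews; its cursor is the next index."""
    def fetch(cursor, num_per_page):
        start = 0 if cursor == "*" else int(cursor)
        end = min(start + num_per_page, total)
        batch = [{"recommendationid": str(i)} for i in range(start, end)]
        return {"cursor": str(end), "reviews": batch}
    return fetch

fake = make_fake_fetch(250)
print(len(collect_reviews(fake, n=1000)))  # asks for more than exist
print(len(collect_reviews(fake, n=120)))   # asks for fewer than exist
```

Injecting the fetch function keeps the loop separate from the request code, so the stopping conditions can be exercised on small, predictable data.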

In [None]:
# Converting results of all reviews into a dataframe
reviews = pd.DataFrame(get_n_reviews('242050', n=22056))

In [None]:
reviews.head()

In [None]:
# Turning the author column (one dict per review) into a dictionary keyed by row number
auth_dict = dict(enumerate(reviews['author'], 1))

auth_dict

In [None]:
# Turning the author dictionary into a dataframe
author = pd.DataFrame.from_dict(auth_dict, orient = 'index').reset_index(drop=True)
author.head()

In [None]:
# Joining the two dataframes together
reviews = author.join(reviews)
reviews.head()
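
The dictionary and join steps above can also be collapsed by flattening each raw review before the dataframe is built. This is a sketch of that alternative, not the approach used in this notebook. The helper name `flatten_author`, the `author_` prefix, and the sample records are all made up for illustration:

```python
def flatten_author(review, prefix="author_"):
    """Return a copy of `review` with the nested author dict lifted to top-level keys."""
    flat = {key: value for key, value in review.items() if key != "author"}
    for key, value in review.get("author", {}).items():
        flat[prefix + key] = value
    return flat

# Made-up records shaped like entries of response['reviews'].
sample_reviews = [
    {"recommendationid": "1", "author": {"steamid": "a", "num_reviews": 3}, "voted_up": True},
    {"recommendationid": "2", "author": {"steamid": "b", "num_reviews": 1}, "voted_up": False},
]

flat_reviews = [flatten_author(review) for review in sample_reviews]
print(flat_reviews[0])
```

The resulting list of flat dicts can be passed straight to `pd.DataFrame`, and the prefix keeps author fields distinguishable from review fields.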

In [None]:
# Dropping the following columns because they are repeated or unnecessary.
reviews.drop(columns = ['author', 'timestamp_dev_responded', 'developer_response', 'steam_china_location'], inplace = True)
reviews.head()

In [None]:
# Checking for duplicated rows
reviews.duplicated().sum()

## Export csv

In [None]:
# Saving the dataframe to a csv file, creating the data folder if it does not exist
os.makedirs('data', exist_ok=True)
reviews.to_csv('data/assassins_creed_reviews.csv', index=False)