# Task 1: Setting up Twitter API

Here is what we want to accomplish:

1. Set up a "Twitter" account
2. Get the access tokens
3. Set up the access tokens


- Visit the [Twitter sign-up page](https://twitter.com/signup?lang=en) and fill in the details to create an account. (_Note: No need to do this if you are already a Twitterati!_)

- Next, go to the Twitter Application Management page: [Twitter Application Management](https://apps.twitter.com/)

- Sign in with the credentials that you used to create your Twitter ID.

- Below is the homepage that you'll see. It won't have any applications, so click on _Create New Apps_.

<br>
<div align="center"><img src="./images/twitter01.png"/></div>

- This takes you to the _Create an application_ page. (_See below_)

<br>
<div align="center"><img src="./images/twitter02.png"/></div>

- Fill in all the details and click on _Create Application_. You'll be redirected to the app management page. (_See below_)

<br>
<div align="center"><img src="./images/twitter03.png"/></div>


- Click on the _Keys and Access Tokens_ tab. You will see two sections:
    * Application Settings
    * Your Access Token


- _Application Settings_ has the keys that the application uses to authenticate its API calls.

<br>
<div align="center">
<img src="./images/twitter04.png"/>
</div>

- _Access Tokens_ need to be created for a freshly minted Twitter app, so go ahead and click _Create my access token_.

<br>
<div align="center">
<img src="./images/twitter05.png"/>
</div>

- You are now set to use it for exciting applications!
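
Step 3 ("Set up the access tokens") is not spelled out above. One common approach, sketched here as an assumption rather than as the tutorial's own method, is to export the four values from the _Keys and Access Tokens_ tab as environment variables and read them in Python instead of pasting them into the notebook. The variable names and the `load_twitter_credentials` helper are hypothetical.

```python
import os

# Hypothetical variable names; use whatever names you exported in your shell.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=None):
    """Return the four Twitter credentials from the environment as a dict."""
    env = os.environ if env is None else env
    # Collect every missing name so the error message lists them all at once.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Example with a stand-in environment (placeholder values, not real tokens).
fake_env = {key: "placeholder-" + str(i) for i, key in enumerate(REQUIRED_KEYS)}
credentials = load_twitter_credentials(fake_env)
print(sorted(credentials))
```

Calling `load_twitter_credentials()` with no argument reads from the real environment. Keeping the tokens out of the notebook means they are not committed by accident.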

# Task 2: Setting up GitHub API

Here is what we want to accomplish:

1. Set up a GitHub account
2. Get the tokens
3. Set up the tokens

- Visit the website: [GitHub](https://github.com/)
- Click on _Sign up_ and fill out all the details in __Step 1: Set up a personal account__ to get yourself a cool GitHub ID! (_Note: You can continue with your existing ID if you already have one._)

- In __Step 2: Choose your plan__, keep the default, _Unlimited public repositories for free_, and click on _Continue_.

- In __Step 3: Tailor your experience__, fill in the details (_or you could skip this_).


<br>
<div align="center">
<img src="./images/github01.png"/>
</div>

- Click on _Settings_ and scroll down to find _Developer settings_.

<br>
<div align="center">
<img src="./images/github02.png"/>
</div>

- Register a new application. This takes you to the page below:

<br>
<div align="center">
<img src="./images/github03.png"/>
</div>

- Fill in the details, just as you did when creating the Twitter app, and register your application.

- Your `Client ID` and `Client Secret` are now created for the demo application. (_see below_)

<br>
<div align="center">
<img src="./images/github04.png"/>
</div>
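
The `Client ID` and `Client Secret` can be sent with API requests as HTTP Basic credentials, which GitHub's REST API documentation describes as a way to get a higher rate limit for otherwise unauthenticated calls. The sketch below only builds the request and sends nothing. The `build_github_request` helper and the credential values are placeholders, not part of the original notebook.

```python
import base64
import urllib.request


def build_github_request(url, client_id, client_secret):
    """Build a GET request that carries the app credentials as HTTP Basic auth."""
    raw = "{}:{}".format(client_id, client_secret).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder credentials; nothing is sent until urllib.request.urlopen(req) is called.
req = build_github_request(
    "https://api.github.com/users/karpathy", "my-client-id", "my-client-secret"
)
print(req.get_header("Authorization"))
```

With the `requests` library used in the rest of this notebook, the equivalent call is `requests.get(url, auth=(client_id, client_secret))`.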


# Task 3: GitHub Recommender System

In this task, we'll build a very simple recommender system for GitHub. We'll look at the GitHub profile of a given user and use that information to generate repository recommendations (other GitHub repositories that the user may be interested in).

Task: [For this GH user](https://api.github.com/users/karpathy/repos), answer the following questions:

## a. How many GH repos does the user have?


In [1]:
import requests

url = "https://api.github.com/users/karpathy"
request = requests.get(url)

In [2]:
request.status_code

200

In [3]:
request.json()

{u'avatar_url': u'https://avatars3.githubusercontent.com/u/241138?v=4',
 u'bio': None,
 u'blog': u'twitter.com/karpathy',
 u'company': None,
 u'created_at': u'2010-04-10T17:55:32Z',
 u'email': None,
 u'events_url': u'https://api.github.com/users/karpathy/events{/privacy}',
 u'followers': 12264,
 u'followers_url': u'https://api.github.com/users/karpathy/followers',
 u'following': 5,
 u'following_url': u'https://api.github.com/users/karpathy/following{/other_user}',
 u'gists_url': u'https://api.github.com/users/karpathy/gists{/gist_id}',
 u'gravatar_id': u'',
 u'hireable': None,
 u'html_url': u'https://github.com/karpathy',
 u'id': 241138,
 u'location': u'Stanford',
 u'login': u'karpathy',
 u'name': u'Andrej',
 u'organizations_url': u'https://api.github.com/users/karpathy/orgs',
 u'public_gists': 7,
 u'public_repos': 29,
 u'received_events_url': u'https://api.github.com/users/karpathy/received_events',
 u'repos_url': u'https://api.github.com/users/karpathy/repos',
 u'site_admin': False,


In [4]:
repositories = request.json()

In [5]:
type(repositories)

dict

In [6]:
repositories.get('public_repos')

29

## b. How many GH repos has the user liked?


In [7]:
import requests

url = u'https://api.github.com/users/karpathy/starred'
request = requests.get(url)

In [8]:
Starred_Request = request.json()
Starred_Request

[{u'archive_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/{archive_format}{/ref}',
  u'assignees_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/assignees{/user}',
  u'blobs_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/git/blobs{/sha}',
  u'branches_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/branches{/branch}',
  u'clone_url': u'https://github.com/albanie/SIGBOVIK17-GUNs.git',
  u'collaborators_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/collaborators{/collaborator}',
  u'comments_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/comments{/number}',
  u'commits_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/commits{/sha}',
  u'compare_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/compare/{base}...{head}',
  u'contents_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-GUNs/contents/{+path}',
  u'contributors_url': u'https://api.github.com/repos/albanie/SIGBOVIK17-

In [9]:
len(Starred_Request)

30
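
Note that GitHub list endpoints return 30 items per page by default, so a length of exactly 30 may be the page size rather than the true total. The sketch below shows one way to walk through all pages. The `fetch_all_pages` helper and the fake data are illustrative assumptions; a stand-in function replaces the real API so the example runs offline.

```python
def fetch_all_pages(get_page, per_page=100):
    """Collect items from a paginated listing.

    get_page(page, per_page) must return the list of items on that page.
    Stops at the first page that comes back shorter than per_page.
    """
    items = []
    page = 1
    while True:
        batch = get_page(page, per_page)
        items.extend(batch)
        if len(batch) < per_page:
            break
        page += 1
    return items


# Stand-in for the API: 75 fake starred repos served 30 at a time.
fake_starred = [{"name": "repo-%d" % n} for n in range(75)]


def fake_get_page(page, per_page):
    start = (page - 1) * per_page
    return fake_starred[start:start + per_page]


all_starred = fetch_all_pages(fake_get_page, per_page=30)
print(len(all_starred))  # 75
```

Against the real API, `get_page` could be `lambda page, per_page: requests.get(url, params={"page": page, "per_page": per_page}).json()`, where `page` and `per_page` are the query parameters GitHub accepts (with `per_page` capped at 100).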

## c. List the repos (name+urls) liked by the user

In [10]:
list1 = []
for i in Starred_Request:
    list1.append("Name:%s ;URL:%s"% (i.get("name"),i.get("html_url")))
list1

[u'Name:SIGBOVIK17-GUNs ;URL:https://github.com/albanie/SIGBOVIK17-GUNs',
 u'Name:thyme ;URL:https://github.com/sourcegraph/thyme',
 u'Name:mac-dev-setup ;URL:https://github.com/nicolashery/mac-dev-setup',
 u'Name:densecap ;URL:https://github.com/jcjohnson/densecap',
 u'Name:adnn ;URL:https://github.com/dritchie/adnn',
 u'Name:Stochastic_Depth ;URL:https://github.com/yueatsprograms/Stochastic_Depth',
 u'Name:torch-rnn ;URL:https://github.com/jcjohnson/torch-rnn',
 u'Name:fb.resnet.torch ;URL:https://github.com/facebook/fb.resnet.torch',
 u'Name:tensorflow_tutorials ;URL:https://github.com/pkmital/tensorflow_tutorials',
 u'Name:deep-residual-networks ;URL:https://github.com/KaimingHe/deep-residual-networks',
 u'Name:weblas ;URL:https://github.com/waylonflinn/weblas',
 u'Name:tf-adversarial ;URL:https://github.com/siemanko/tf-adversarial',
 u'Name:scholar.py ;URL:https://github.com/ckreibich/scholar.py',
 u'Name:char-rnn-tensorflow ;URL:https://github.com/sherjilozair/char-rnn-tensorflow

## d. List the users (name+url) who own the repos liked by the user

In [11]:
list1 = []
for i in Starred_Request:
    list1.append("Name:%s ;URL:%s"% (i.get('owner').get('login'),i.get('owner').get('html_url')))
list1

[u'Name:albanie ;URL:https://github.com/albanie',
 u'Name:sourcegraph ;URL:https://github.com/sourcegraph',
 u'Name:nicolashery ;URL:https://github.com/nicolashery',
 u'Name:jcjohnson ;URL:https://github.com/jcjohnson',
 u'Name:dritchie ;URL:https://github.com/dritchie',
 u'Name:yueatsprograms ;URL:https://github.com/yueatsprograms',
 u'Name:jcjohnson ;URL:https://github.com/jcjohnson',
 u'Name:facebook ;URL:https://github.com/facebook',
 u'Name:pkmital ;URL:https://github.com/pkmital',
 u'Name:KaimingHe ;URL:https://github.com/KaimingHe',
 u'Name:waylonflinn ;URL:https://github.com/waylonflinn',
 u'Name:siemanko ;URL:https://github.com/siemanko',
 u'Name:ckreibich ;URL:https://github.com/ckreibich',
 u'Name:sherjilozair ;URL:https://github.com/sherjilozair',
 u'Name:nlintz ;URL:https://github.com/nlintz',
 u'Name:PrincetonVision ;URL:https://github.com/PrincetonVision',
 u'Name:tensorflow ;URL:https://github.com/tensorflow',
 u'Name:ShaoqingRen ;URL:https://github.com/ShaoqingRen',
 u
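
The owner list above contains repeats (for example, `jcjohnson` appears twice). A minimal sketch of removing duplicates while keeping the original order follows. The `unique_owners` helper is hypothetical, and the sample is a hand-built stand-in shaped like the API response.

```python
def unique_owners(starred_repos):
    """Return (login, html_url) pairs for repo owners, first occurrence only."""
    seen = set()
    owners = []
    for repo in starred_repos:
        owner = repo.get("owner", {})
        login = owner.get("login")
        if login is not None and login not in seen:
            seen.add(login)
            owners.append((login, owner.get("html_url")))
    return owners


# Small sample shaped like the API response; values taken from the output above.
sample = [
    {"owner": {"login": "albanie", "html_url": "https://github.com/albanie"}},
    {"owner": {"login": "jcjohnson", "html_url": "https://github.com/jcjohnson"}},
    {"owner": {"login": "jcjohnson", "html_url": "https://github.com/jcjohnson"}},
]
print(unique_owners(sample))
```

Deduplicating owners first also avoids repeated API calls when their starred repos are fetched one owner at a time.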

## e. List the repos (name+url) liked by the users who own the repos liked by the user

In [12]:
list1 = []
for i in Starred_Request:
    url = i.get('owner').get('starred_url')
    url = url.replace("{/owner}{/repo}", "")
    x = requests.get(url)
    x = x.json()
    for u in x:
        list1.append("Name:%s ;URL:%s"% (u.get("name"),u.get("html_url")))
list1
    

[u'Name:vlb ;URL:https://github.com/lenck/vlb',
 u'Name:Batch_Normalized_Maxout_NIN ;URL:https://github.com/JiaRenChang/Batch_Normalized_Maxout_NIN',
 u'Name:neural-style-matconvnet ;URL:https://github.com/aravindhm/neural-style-matconvnet',
 u'Name:matconvnet-contrib ;URL:https://github.com/vlfeat/matconvnet-contrib',
 u'Name:trust ;URL:https://github.com/ncase/trust',
 u'Name:refinenet ;URL:https://github.com/guosheng/refinenet',
 u'Name:mcnDCGAN ;URL:https://github.com/hbilen/mcnDCGAN',
 u'Name:schedule ;URL:https://github.com/jotaf98/schedule',
 u'Name:dagnn_caffe_deploy ;URL:https://github.com/ecoto/dagnn_caffe_deploy',
 u'Name:convert_torch_to_pytorch ;URL:https://github.com/clcarwin/convert_torch_to_pytorch',
 u'Name:siamese-mnist ;URL:https://github.com/lenck/siamese-mnist',
 u'Name:mcn-example-module ;URL:https://github.com/lenck/mcn-example-module',
 u'Name:Befungell ;URL:https://github.com/zwade/Befungell',
 u'Name:autonn ;URL:https://github.com/vlfeat/autonn',
 u'Name:fer-c

## f. Sort the list obtained in the last task by frequency (highest frequency first) and return the top 5 repos (name+url)
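
The notebook has no cell for this question. One possible approach, sketched here with hypothetical sample data and a hypothetical `top_repos` helper, is to count the (name, URL) pairs with `collections.Counter` and take the five most common.

```python
from collections import Counter


def top_repos(repos, n=5):
    """Return the n most frequent (name, html_url) pairs with their counts."""
    counts = Counter((repo.get("name"), repo.get("html_url")) for repo in repos)
    return counts.most_common(n)


# Hypothetical sample data shaped like entries of the starred-repos response.
sample = [
    {"name": "a", "html_url": "https://example.com/a"},
    {"name": "b", "html_url": "https://example.com/b"},
    {"name": "a", "html_url": "https://example.com/a"},
]
print(top_repos(sample, n=2))
```

Because `list1` from the previous question already holds one formatted string per liked repo, `Counter(list1).most_common(5)` would give the same kind of ranking directly.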

# Task 4: Parsing YAML

In this task, we'll convert NoSQL-style data (represented as a YAML file) to a rectangular format (dataframe) and then answer a few questions about the dataset.

Task: The file `data/335982.yaml` has a ball-by-ball summary of a cricket match. Produce a simple version of the match scorecard from it by answering the following:

## a. How many runs were scored by each batsman?

In [13]:
import yaml

In [15]:
with open("data/335982.yaml", 'r') as stream:
    try:
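        # safe_load parses plain data only; yaml.load without a Loader is unsafe and fails on newer PyYAML
        ipl_yaml = yaml.safe_load(stream)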
    except yaml.YAMLError as exc:
        print(exc)

In [16]:
# ! pip install tqdm

In [17]:
class FlatDict:        
    def flatDict(self, dictObj=None):
        '''Flatten a given dict
        '''
        #print('Arg received: ', dictObj)
        for key, value in dictObj.items():
            #print('Now iterating through: ', {key:value})
            if isinstance(value, dict):
                #print('Value: ', value, ', Is value a dictionary? ', isinstance(value, dict))
                for key2, value2 in value.items():
                    self.flatDict({'_'.join([key, key2]) : value2})
            elif isinstance(value, list) and value and isinstance(value[0], str):  # non-empty list of strings
                value = ', '.join(value)
                #print('The pair to be updated: ', {key:value})
                self.flatteneddict.update({key:value})
            else:
                #print('The pair to be updated: ', {key:value})
                self.flatteneddict.update({key:value})
        
    
    def __init__(self, dictObj=None):
        self.flatteneddict = {}
        if not isinstance(dictObj, dict):
            raise ValueError('Expected a dictionary object as input!')
        self.flatDict(dictObj)
    
    
    def __repr__(self):
        return(str(self.flatteneddict))

In [18]:
import pandas as pd
pd.set_option("display.max_columns", 101)

class CricSheet(FlatDict):
    
    def __init__(self, dictObj=None):
        # Call the base initializer explicitly: a bare super() only works on Python 3
        # and raises "TypeError: super() takes at least 1 argument (0 given)" on Python 2.
        FlatDict.__init__(self, dictObj)
        self.info = dictObj["info"]
        self.ballsDF = pd.DataFrame()

    def get_ballsDF(self):
        '''Build one row per delivery, combining match info with ball details
        '''
        for idx, inningsObj in enumerate(self.flatteneddict["innings"]):  # idx = 0, 1; inningsObj = {'1st innings': dict}
            inningsDict = list(inningsObj.values())[0]                    # inningsDict = {'team': val, 'deliveries': list}
            for ball in inningsDict['deliveries']:                        # ball = {ball_no: ball_details}
                self.flatteneddict = {}                                   # clear out details of last delivery
                self.flatteneddict.update({'innings': idx + 1})
                self.flatteneddict.update({'batting_team': inningsDict['team']})
                self.flatDict(self.info)                                  # repeat the match-level info on every row
                
                for ball_no, ball_details in ball.items():
                    self.flatDict(ball_details)
                    idx_df = int(1000*(idx+1) + 10*ball_no)               # row index built from innings and ball number
                    newDF = pd.DataFrame(self.flatteneddict, index=[idx_df])
                    self.ballsDF = pd.concat([self.ballsDF, newDF])
                    
        cols = ['competition', 'gender', 'match_type', 'dates','city', 'umpires', 'venue', 'teams',
                 'toss_winner', 'toss_decision', 'outcome_by_runs', 'outcome_winner', 'player_of_match', 
                 'innings', 'batting_team', 'batsman', 'non_striker', 'bowler', 'overs', 
                 'runs_batsman', 'runs_extras', 'extras_byes', 'extras_legbyes', 
                 'extras_wides', 'runs_total', 'wicket_fielders', 'wicket_kind', 'wicket_player_out']
        
        # Keep only the scorecard columns, in a fixed order
        self.ballsDF = self.ballsDF[cols]

In [19]:
ipl_df = CricSheet(ipl_yaml)
ipl_df.get_ballsDF()
ipl_df.ballsDF.head()
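
The cells above build the dataframe but never total the runs per batsman. A minimal sketch of that aggregation follows, using a small hypothetical dataframe with the same column names as `ballsDF` (the teams, players and runs are made up for illustration).

```python
import pandas as pd

# Hypothetical deliveries with the same column names as ballsDF.
balls = pd.DataFrame({
    "batting_team": ["Team A", "Team A", "Team A", "Team B"],
    "batsman": ["P1", "P1", "P2", "P3"],
    "runs_batsman": [4, 1, 0, 6],
})

# Sum the runs credited to the batsman on each delivery he faced.
batsmen_runs = (
    balls.groupby(["batting_team", "batsman"], as_index=False)["runs_batsman"].sum()
)
batsmen_runs.columns = ["batting_team", "batsman", "total_runs"]
print(batsmen_runs)
```

Once `ipl_df.ballsDF` is available, the same `groupby` can be applied to it in place of the sample `balls` frame.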

## b. How many balls were faced by each batsman?

In [None]:
batsmen_balls = ipl_df.ballsDF.groupby(["batting_team", "batsman"], as_index = False)["city"].count()
batsmen_balls.columns = ["batting_team", "batsman", "total_balls"]
batsmen_balls

## c. How many balls were bowled by each bowler?

In [None]:
bowler_balls = ipl_df.ballsDF.groupby(["batting_team", "bowler"], as_index = False)["runs_batsman"].count()
bowler_balls.columns = ["batting_team", "bowler", "total_balls"]
bowler_balls

## d. How many runs were conceded by each bowler?

In [None]:
bowler_runs = ipl_df.ballsDF.groupby(["batting_team", "bowler"], as_index = False)["runs_batsman"].sum()
bowler_runs.columns = ["batting_team", "bowler", "runs_conceded"]
bowler_runs
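
Summing `runs_batsman` leaves out extras. In conventional cricket scoring, wides and no-balls are charged to the bowler while byes and leg byes are not, so a stricter figure is `runs_total` minus byes and leg byes. The sketch below illustrates this on hypothetical data with the same column names. It is an alternative under that scoring assumption, not the notebook's own result.

```python
import pandas as pd

nan = float("nan")

# Hypothetical deliveries; NaN marks deliveries without that kind of extra.
balls = pd.DataFrame({
    "bowler": ["B1", "B1", "B1", "B2"],
    "runs_total": [4, 1, 1, 2],
    "extras_byes": [nan, nan, 1, nan],
    "extras_legbyes": [nan, nan, nan, 2],
})

# Byes and leg byes are not charged to the bowler, so subtract them from the total.
not_charged = balls[["extras_byes", "extras_legbyes"]].fillna(0).sum(axis=1)
balls["runs_conceded"] = balls["runs_total"] - not_charged

bowler_runs = balls.groupby("bowler", as_index=False)["runs_conceded"].sum()
print(bowler_runs)
```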

## e. Name of the teams

In [None]:
ipl_df.ballsDF.teams.iloc[0]

## f. Who batted first?

In [None]:
ipl_df.ballsDF.loc[ipl_df.ballsDF["innings"]==1, "batting_team"].iloc[0]

## g. Who won?

In [None]:
ipl_df.ballsDF.outcome_winner.iloc[0]