![Ironhack logo](https://i.imgur.com/1QgrNNw.png)

# Lab | API Scavenger Game

## Introduction

In the lesson, you learned how to make Python requests to APIs and parse the JSON responses to extract the information you need. In this lab, you will practice these skills by playing an API scavenger hunt. In a scavenger hunt, players collect a list of items, using clues to guide them. Here, you will seek secrets hidden inside the large volume of data returned by the API. Your data analytics skills will make you a cool API detective.


## Goals

### Challenge 1: Fork Languages

You will find out how many programming languages are used among all the forks created from the main lab repo of your bootcamp. Assuming the main lab repo is `ironhack-datalabs/madrid-oct-2018`, you will:

1. Obtain the full list of forks created from the main lab repo via the GitHub API.

1. Loop through the JSON response to read the `language` attribute of each fork. Use an array to store the `language` attributes.
    * *Hint: Each language should appear only once in your array.*

1. Print the language array. It should be something like:

  ```["Python", "Jupyter Notebook", "HTML"]```

Again, the GitHub API documentation is [here](https://developer.github.com/v3/).

In [5]:
import pandas as pd
import json
import requests
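# json_normalize lives in the top-level pandas namespace since pandas 1.0;
# the older pandas.io.json import path is deprecated
from pandas import json_normalize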
import base64

In [6]:
url = 'https://api.github.com/repos/ironhack-datalabs/madrid-oct-2018/forks'

response = requests.get(url)
results = response.json()
results = json_normalize(results)

In [7]:
results.columns


Index(['id', 'node_id', 'name', 'full_name', 'private', 'html_url',
       'description', 'fork', 'url', 'forks_url', 'keys_url',
       'collaborators_url', 'teams_url', 'hooks_url', 'issue_events_url',
       'events_url', 'assignees_url', 'branches_url', 'tags_url', 'blobs_url',
       'git_tags_url', 'git_refs_url', 'trees_url', 'statuses_url',
       'languages_url', 'stargazers_url', 'contributors_url',
       'subscribers_url', 'subscription_url', 'commits_url', 'git_commits_url',
       'comments_url', 'issue_comment_url', 'contents_url', 'compare_url',
       'merges_url', 'archive_url', 'downloads_url', 'issues_url', 'pulls_url',
       'milestones_url', 'notifications_url', 'labels_url', 'releases_url',
       'deployments_url', 'created_at', 'updated_at', 'pushed_at', 'git_url',
       'ssh_url', 'clone_url', 'svn_url', 'homepage', 'size',
       'stargazers_count', 'watchers_count', 'language', 'has_issues',
       'has_projects', 'has_downloads', 'has_wiki', 'has_pages', 

In [8]:
results['language']

0                 None
1     Jupyter Notebook
2     Jupyter Notebook
3     Jupyter Notebook
4     Jupyter Notebook
5     Jupyter Notebook
6                 HTML
7     Jupyter Notebook
8     Jupyter Notebook
9               Python
10    Jupyter Notebook
11    Jupyter Notebook
12    Jupyter Notebook
13    Jupyter Notebook
14    Jupyter Notebook
15    Jupyter Notebook
Name: language, dtype: object

In [9]:
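# Drop missing values (forks with no detected language) so the list holds only real languages,
# each appearing once
lst_languages = results['language'].dropna().unique().tolist()

lst_languages

['Jupyter Notebook', 'HTML', 'Python']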
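
The forks endpoint returns results in pages (30 items per page by default), so a single request is not guaranteed to be the full list. A minimal sketch of collecting every page follows. The helper `fetch_all_pages` and the caller-supplied `get_page` function are hypothetical names, not part of the lab, and the stub below stands in for the real HTTP call so the example runs offline.

```python
def fetch_all_pages(get_page, per_page=100):
    """Collect items from a paginated endpoint until a short or empty page arrives.

    get_page(page, per_page) is a hypothetical callable returning the parsed
    JSON list for one page, e.g. a thin wrapper around requests.get.
    """
    items = []
    page = 1
    while True:
        batch = get_page(page, per_page)
        items.extend(batch)
        if len(batch) < per_page:  # a short page means there is nothing left
            break
        page += 1
    return items


# Offline stand-in for the API: five invented forks served two per page.
fake_forks = [
    {"language": lang}
    for lang in ["Python", None, "HTML", "Python", "Jupyter Notebook"]
]


def fake_get_page(page, per_page):
    start = (page - 1) * per_page
    return fake_forks[start:start + per_page]


all_forks = fetch_all_pages(fake_get_page, per_page=2)
print(len(all_forks))
```

Against the real API, `get_page` could be something like `lambda page, per_page: requests.get(url, params={'page': page, 'per_page': per_page}).json()`. `page` and `per_page` are documented GitHub query parameters, and `per_page` is capped at 100.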

### Challenge 2: Count Commits

Count how many commits were made in the past week.

1. Obtain all the commits made in the past week via API, which is a JSON array that contains multiple commit objects.

1. Count how many commit objects are contained in the array.
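
The commits endpoint accepts a `since` query parameter (an ISO 8601 timestamp), which is one way to restrict the response to the past week. A minimal sketch of building that timestamp, and of filtering an already-downloaded list locally by author date, is shown here. The helper names and the sample data are illustrative assumptions, and comparing on the author date (rather than the committer date) is a choice made for this example.

```python
from datetime import datetime, timedelta, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp style used in GitHub responses


def one_week_before(now):
    """Return the ISO 8601 UTC timestamp for seven days before `now`."""
    return (now - timedelta(days=7)).strftime(ISO_FORMAT)


def count_commits_since(commits, since_iso):
    """Count commit objects whose author date is at or after `since_iso`."""
    cutoff = datetime.strptime(since_iso, ISO_FORMAT)
    return sum(
        datetime.strptime(c["commit"]["author"]["date"], ISO_FORMAT) >= cutoff
        for c in commits
    )


# Sample objects in the same shape as the API response (only the date is kept).
sample = [
    {"commit": {"author": {"date": "2019-08-19T11:11:36Z"}}},
    {"commit": {"author": {"date": "2019-03-07T15:49:16Z"}}},
]
now = datetime(2019, 8, 20, 12, 0, 0, tzinfo=timezone.utc)  # fixed "now" for reproducibility
since = one_week_before(now)
print(since)
print(count_commits_since(sample, since))
```

The resulting string can be sent with the request as `requests.get(url, params={'since': since})`, using `datetime.now(timezone.utc)` in place of the fixed date.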

In [10]:
url = 'https://api.github.com/repos/ironhack-datalabs/madrid-oct-2018/commits'

response = requests.get(url)
results = response.json()
df = json_normalize(results)

In [11]:
df.head()

Unnamed: 0,sha,node_id,url,html_url,comments_url,parents,commit.author.name,commit.author.email,commit.author.date,commit.committer.name,...,committer.following_url,committer.gists_url,committer.starred_url,committer.subscriptions_url,committer.organizations_url,committer.repos_url,committer.events_url,committer.received_events_url,committer.type,committer.site_admin
0,1638e5506e6947b77bfe78761d345476ae80d017,MDY6Q29tbWl0MTUzNzIwODA0OjE2MzhlNTUwNmU2OTQ3Yj...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/mad-oct-2...,https://api.github.com/repos/ironhack-datalabs...,[{'sha': 'f332b5e6fcea965dc80f62220d7ee1457b04...,Marc Pomar,marc@faable.com,2019-08-19T11:11:36Z,Marc Pomar,...,https://api.github.com/users/boyander/followin...,https://api.github.com/users/boyander/gists{/g...,https://api.github.com/users/boyander/starred{...,https://api.github.com/users/boyander/subscrip...,https://api.github.com/users/boyander/orgs,https://api.github.com/users/boyander/repos,https://api.github.com/users/boyander/events{/...,https://api.github.com/users/boyander/received...,User,False
1,f332b5e6fcea965dc80f62220d7ee1457b04b90d,MDY6Q29tbWl0MTUzNzIwODA0OmYzMzJiNWU2ZmNlYTk2NW...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/mad-oct-2...,https://api.github.com/repos/ironhack-datalabs...,[{'sha': '4c048c3efc18cf9d50e34c76919c8049ee7f...,ta-data-bcn,47005065+ta-data-bcn@users.noreply.github.com,2019-03-07T15:49:16Z,GitHub,...,https://api.github.com/users/web-flow/followin...,https://api.github.com/users/web-flow/gists{/g...,https://api.github.com/users/web-flow/starred{...,https://api.github.com/users/web-flow/subscrip...,https://api.github.com/users/web-flow/orgs,https://api.github.com/users/web-flow/repos,https://api.github.com/users/web-flow/events{/...,https://api.github.com/users/web-flow/received...,User,False
2,4c048c3efc18cf9d50e34c76919c8049ee7f2dbd,MDY6Q29tbWl0MTUzNzIwODA0OjRjMDQ4YzNlZmMxOGNmOW...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/mad-oct-2...,https://api.github.com/repos/ironhack-datalabs...,[{'sha': '41e09d4fbf64bacb2fdfe7d90cdd0cd71bd2...,ta-data-bcn,47005065+ta-data-bcn@users.noreply.github.com,2019-03-07T15:38:52Z,GitHub,...,https://api.github.com/users/web-flow/followin...,https://api.github.com/users/web-flow/gists{/g...,https://api.github.com/users/web-flow/starred{...,https://api.github.com/users/web-flow/subscrip...,https://api.github.com/users/web-flow/orgs,https://api.github.com/users/web-flow/repos,https://api.github.com/users/web-flow/events{/...,https://api.github.com/users/web-flow/received...,User,False
3,41e09d4fbf64bacb2fdfe7d90cdd0cd71bd24310,MDY6Q29tbWl0MTUzNzIwODA0OjQxZTA5ZDRmYmY2NGJhY2...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/mad-oct-2...,https://api.github.com/repos/ironhack-datalabs...,[{'sha': '750eb91e4535c14ec4bf03f60cb5756a1c93...,Tony Ojeda,tojeda@districtdatalabs.com,2019-01-18T18:28:33Z,Tony Ojeda,...,https://api.github.com/users/ojedatony1616/fol...,https://api.github.com/users/ojedatony1616/gis...,https://api.github.com/users/ojedatony1616/sta...,https://api.github.com/users/ojedatony1616/sub...,https://api.github.com/users/ojedatony1616/orgs,https://api.github.com/users/ojedatony1616/repos,https://api.github.com/users/ojedatony1616/eve...,https://api.github.com/users/ojedatony1616/rec...,User,False
4,750eb91e4535c14ec4bf03f60cb5756a1c93f68f,MDY6Q29tbWl0MTUzNzIwODA0Ojc1MGViOTFlNDUzNWMxNG...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/mad-oct-2...,https://api.github.com/repos/ironhack-datalabs...,[{'sha': '4a8cc3cb64821cb58182a3a94be38f17d27a...,Michal Monselise,michal.monselise@gmail.com,2019-01-17T21:59:35Z,Michal Monselise,...,https://api.github.com/users/michalmonselise/f...,https://api.github.com/users/michalmonselise/g...,https://api.github.com/users/michalmonselise/s...,https://api.github.com/users/michalmonselise/s...,https://api.github.com/users/michalmonselise/orgs,https://api.github.com/users/michalmonselise/r...,https://api.github.com/users/michalmonselise/e...,https://api.github.com/users/michalmonselise/r...,User,False


In [12]:
results

[{'sha': '1638e5506e6947b77bfe78761d345476ae80d017',
  'node_id': 'MDY6Q29tbWl0MTUzNzIwODA0OjE2MzhlNTUwNmU2OTQ3Yjc3YmZlNzg3NjFkMzQ1NDc2YWU4MGQwMTc=',
  'commit': {'author': {'name': 'Marc Pomar',
    'email': 'marc@faable.com',
    'date': '2019-08-19T11:11:36Z'},
   'committer': {'name': 'Marc Pomar',
    'email': 'marc@faable.com',
    'date': '2019-08-19T11:11:36Z'},
   'message': 'Cambiar los autores',
   'tree': {'sha': '51812285dc70ac37103735ee7ed672befc582b5f',
    'url': 'https://api.github.com/repos/ironhack-datalabs/mad-oct-2018/git/trees/51812285dc70ac37103735ee7ed672befc582b5f'},
   'url': 'https://api.github.com/repos/ironhack-datalabs/mad-oct-2018/git/commits/1638e5506e6947b77bfe78761d345476ae80d017',
   'comment_count': 0,
   'verification': {'verified': False,
    'reason': 'unsigned',
    'signature': None,
    'payload': None}},
  'url': 'https://api.github.com/repos/ironhack-datalabs/mad-oct-2018/commits/1638e5506e6947b77bfe78761d345476ae80d017',
  'html_url': 'https

In [13]:
results[0]

{'sha': '1638e5506e6947b77bfe78761d345476ae80d017',
 'node_id': 'MDY6Q29tbWl0MTUzNzIwODA0OjE2MzhlNTUwNmU2OTQ3Yjc3YmZlNzg3NjFkMzQ1NDc2YWU4MGQwMTc=',
 'commit': {'author': {'name': 'Marc Pomar',
   'email': 'marc@faable.com',
   'date': '2019-08-19T11:11:36Z'},
  'committer': {'name': 'Marc Pomar',
   'email': 'marc@faable.com',
   'date': '2019-08-19T11:11:36Z'},
  'message': 'Cambiar los autores',
  'tree': {'sha': '51812285dc70ac37103735ee7ed672befc582b5f',
   'url': 'https://api.github.com/repos/ironhack-datalabs/mad-oct-2018/git/trees/51812285dc70ac37103735ee7ed672befc582b5f'},
  'url': 'https://api.github.com/repos/ironhack-datalabs/mad-oct-2018/git/commits/1638e5506e6947b77bfe78761d345476ae80d017',
  'comment_count': 0,
  'verification': {'verified': False,
   'reason': 'unsigned',
   'signature': None,
   'payload': None}},
 'url': 'https://api.github.com/repos/ironhack-datalabs/mad-oct-2018/commits/1638e5506e6947b77bfe78761d345476ae80d017',
 'html_url': 'https://github.com/ironh

In [14]:
results[0]['author']

{'login': 'boyander',
 'id': 568638,
 'node_id': 'MDQ6VXNlcjU2ODYzOA==',
 'avatar_url': 'https://avatars1.githubusercontent.com/u/568638?v=4',
 'gravatar_id': '',
 'url': 'https://api.github.com/users/boyander',
 'html_url': 'https://github.com/boyander',
 'followers_url': 'https://api.github.com/users/boyander/followers',
 'following_url': 'https://api.github.com/users/boyander/following{/other_user}',
 'gists_url': 'https://api.github.com/users/boyander/gists{/gist_id}',
 'starred_url': 'https://api.github.com/users/boyander/starred{/owner}{/repo}',
 'subscriptions_url': 'https://api.github.com/users/boyander/subscriptions',
 'organizations_url': 'https://api.github.com/users/boyander/orgs',
 'repos_url': 'https://api.github.com/users/boyander/repos',
 'events_url': 'https://api.github.com/users/boyander/events{/privacy}',
 'received_events_url': 'https://api.github.com/users/boyander/received_events',
 'type': 'User',
 'site_admin': False}

In [15]:
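# Note: this counts every commit on the first page of results (GitHub returns 30 per page
# by default); it does not restrict the count to the past week
df['commit.author.date'].count()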

30


### Challenge 3: Hidden Cold Joke

Using Python, call the GitHub API to find the cold joke hidden in the 24 secret files in the following repo:

https://github.com/ironhack-datalabs/scavenger

The secret files are named from `.0001.scavengerhunt` to `.0024.scavengerhunt` and are scattered randomly across different directories of this repo. You need to **search for these files by calling the GitHub API**, not by searching the local files on your computer.
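
One way to search every directory in a single call is the Git Trees endpoint with `recursive=1`, which returns a flat `tree` list whose entries carry `path` and `type` fields. A minimal sketch of filtering such a response for the secret files follows. The function name is hypothetical, the sample response is hand-built for illustration, and the branch name `master` in the comment is an assumption.

```python
def find_secret_paths(tree_response, marker=".scavengerhunt"):
    """Return paths of files (type 'blob') containing `marker`, sorted by filename."""
    paths = [
        entry["path"]
        for entry in tree_response["tree"]
        if entry["type"] == "blob" and marker in entry["path"]
    ]
    # Sort on the filename rather than the directory so .0008 comes before .0012
    return sorted(paths, key=lambda p: p.rsplit("/", 1)[-1])


# Hand-built response shaped like the result of
# GET https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/master?recursive=1
sample_response = {
    "tree": [
        {"path": "15534", "type": "tree"},
        {"path": "15534/.0012.scavengerhunt", "type": "blob"},
        {"path": "15534/.0008.scavengerhunt", "type": "blob"},
        {"path": "15534/40", "type": "blob"},
    ]
}
print(find_secret_paths(sample_response))
```

Walking the directories one by one with the Contents endpoint also works, at the cost of one request per directory.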

Notes:

* The GitHub API documentation can be found [here](https://developer.github.com/v3/).

* You will need to study the GitHub API documentation to decide which API endpoint to call and which parameters to use to obtain the information you need. Unless you already know the GitHub API well, expect some trial and error: be prepared to go back and forth between reading the documentation, testing, and revising until you obtain what you need.

* After receiving the JSON data object, you need to inspect its structure and decide how to parse the data.

* When you test your requests against the GitHub API, you may occasionally be blocked with an error message that reads:

  > You have triggered an abuse detection mechanism and have been temporarily blocked from content creation. Please retry your request again later.

  Don't worry. Check the parameters in your request and wait for a minute or two before you make additional requests.
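
GitHub reports the remaining request quota in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers (the latter is a UTC epoch timestamp). A minimal sketch of turning those headers into a wait time is shown here. `seconds_until_reset` is a hypothetical helper, and the header values in the example are invented.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before the next request, or 0 if quota remains."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0, reset_at - int(now))


# Invented header values for illustration.
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000060"}
healthy = {"X-RateLimit-Remaining": "42", "X-RateLimit-Reset": "1000060"}
print(seconds_until_reset(exhausted, now=1000000))
print(seconds_until_reset(healthy, now=1000000))
```

With `requests`, the function can be given `response.headers` directly, and the result passed to `time.sleep` before retrying.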

**After you find the secret files:**

1. Sort the filenames in ascending order.

1. Read the content of each secret file into an array of strings.

1. Concatenate the strings in the array, separating consecutive strings with a single space.

1. Print out the joke.
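
The four steps above can be sketched as follows, assuming the file contents have already been collected into a dictionary keyed by filename. The function name is hypothetical and the placeholder strings are invented; they are not the real file contents.

```python
def assemble_joke(contents_by_name):
    """Sort filenames ascending, then join the matching contents with single spaces."""
    ordered_names = sorted(contents_by_name)
    strings = [contents_by_name[name].strip() for name in ordered_names]
    return " ".join(strings)


# Placeholder contents, deliberately inserted out of order.
placeholder = {
    ".0003.scavengerhunt": "three\n",
    ".0001.scavengerhunt": "one\n",
    ".0002.scavengerhunt": "two\n",
}
print(assemble_joke(placeholder))
```

Plain string sorting is enough here because the numbers in the filenames are zero-padded to the same width.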

In [16]:
url = 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents'

In [17]:
response = requests.get(url)
results = response.json()
df = json_normalize(results)

In [18]:
df.head()

Unnamed: 0,name,path,sha,size,url,html_url,git_url,download_url,type,_links.self,_links.git,_links.html
0,.gitignore,.gitignore,e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf,10,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...,https://api.github.com/repos/ironhack-datalabs...,https://raw.githubusercontent.com/ironhack-dat...,file,https://api.github.com/repos/ironhack-datalabs...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...
1,15024,15024,2945e51c87ad5da893c954afcf092f06343bbb7d,0,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...,https://api.github.com/repos/ironhack-datalabs...,,dir,https://api.github.com/repos/ironhack-datalabs...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...
2,15534,15534,5af6f2a7287e4191f39e55693fc1e9c8918d1d3a,0,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...,https://api.github.com/repos/ironhack-datalabs...,,dir,https://api.github.com/repos/ironhack-datalabs...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...
3,17020,17020,9c49f920aa4d9433fa99a5824128f0e6b90ec5f2,0,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...,https://api.github.com/repos/ironhack-datalabs...,,dir,https://api.github.com/repos/ironhack-datalabs...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...
4,30351,30351,c488d7f64088c852e22067d48fdc64ee3670f3ba,0,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...,https://api.github.com/repos/ironhack-datalabs...,,dir,https://api.github.com/repos/ironhack-datalabs...,https://api.github.com/repos/ironhack-datalabs...,https://github.com/ironhack-datalabs/scavenger...


In [19]:
links = df.url.tolist()

# The first link points to the .gitignore file, which we're not interested in
links = links[1:]
links

['https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/17020?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/30351?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/40303?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/44639?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/45525?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47830?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/49418?ref=master',
 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/50896?ref=master',
 'https://api.github.com/repos/ironhack-dat

In [21]:
# DOWNLOAD URL:

link = links[1]

response = requests.get(link)
results2 = response.json()
down_link = json_normalize(results2).download_url
down_link

0    https://raw.githubusercontent.com/ironhack-dat...
1    https://raw.githubusercontent.com/ironhack-dat...
2    https://raw.githubusercontent.com/ironhack-dat...
3    https://raw.githubusercontent.com/ironhack-dat...
Name: download_url, dtype: object

In [23]:
# TESTING CONDITION
link = down_link.values.tolist()

for string in link:
    if 'scavengerhunt' in string:
        print(string)

https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/15534/.0008.scavengerhunt
https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/15534/.0012.scavengerhunt


In [32]:
# OPEN FILE TEST

url = link[1]

print(url)

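response = requests.get(url)
# .json() succeeds here only because this file's content ("20") happens to be valid JSON;
# the raw download URL serves plain text, so response.text is the safer choice in general
content = response.json()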

content

https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/15534/.0012.scavengerhunt


20

In [29]:
# Getting the data:

secret_files = {}  # filename -> file content

for folder_url in links:
    # List the entries of each top-level folder (one API request per folder;
    # unauthenticated clients are limited to 60 requests per hour)
    response = requests.get(folder_url)
    entries = json_normalize(response.json())
    for name, down_url in zip(entries['name'], entries['download_url']):
        # Match on the filename: download_url is empty for sub-folders
        if 'scavengerhunt' in name:
            # download_url serves the raw file as plain text, not JSON or base64,
            # so read response.text; response.json() raises JSONDecodeError on it
            secret_files[name] = requests.get(down_url).text.strip()

# Sort the filenames in ascending order and join the contents with a single space
joke = ' '.join(secret_files[name] for name in sorted(secret_files))
print(joke)
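
As an alternative to `download_url`, the Contents API returns a single file's body in the `content` field, base64-encoded, with `encoding` set to `"base64"`. A minimal sketch of decoding it with the standard library is shown here. `decode_file_content` is a hypothetical helper, and the sample entry is constructed locally rather than fetched.

```python
import base64


def decode_file_content(file_entry):
    """Decode the base64 `content` field of a Contents API file object into text."""
    if file_entry.get("encoding") != "base64":
        raise ValueError("unexpected encoding: %r" % file_entry.get("encoding"))
    # b64decode discards the newlines that appear inside long base64 strings
    return base64.b64decode(file_entry["content"]).decode("utf-8").strip()


# Entry built locally for illustration; a real one would come from response.json().
sample_entry = {
    "name": ".0012.scavengerhunt",
    "encoding": "base64",
    "content": base64.b64encode(b"20\n").decode("ascii") + "\n",
}
print(decode_file_content(sample_entry))
```

This route costs one API request per file, so reading the plain text behind `download_url` is usually the lighter option.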