# API Scavenger Game

## Challenge 1: Fork Languages

You will find out how many programming languages are used among all the forks created from the main lab repo of your bootcamp.

In [12]:
# import libraries here
import requests
import pandas as pd

Assuming the main lab repo is ironhack-datalabs/madrid-oct-2018, you will:

#### 1. Obtain the full list of forks created from the main lab repo via the GitHub API.

To list forks, send a GET request to the endpoint given in the GitHub API documentation: `GET /repos/:owner/:repo/forks`.

In [13]:
# your code here
# Read the access token; strip the trailing newline that readline() keeps
with open('../token/token.txt') as f:
    token = 'token ' + f.readline().strip()
base_url = 'https://api.github.com/'
ih_dl_repo = 'repos/ironhack-datalabs/mad-oct-2018'
forks = '/forks'
commits = '/commits'
response = requests.get(base_url + ih_dl_repo + forks, headers={'Authorization':token})
# Fail loudly on an error status instead of leaving `result` undefined
response.raise_for_status()
result = response.json()
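
GitHub list endpoints such as the forks endpoint are paginated (30 items per page by default, up to 100 with `per_page`), so a single request may not return every fork. Below is a minimal sketch of following the `Link` header. It assumes the HTTP function is passed in as `get` (for example `requests.get`); the helper name `fetch_all_pages` is hypothetical.

```python
def fetch_all_pages(url, get, headers=None, per_page=100):
    """Collect the JSON items from every page of a paginated GitHub listing.

    `get` is any callable with the same interface as `requests.get`
    (an assumption, so the sketch can be exercised without network access).
    """
    items = []
    params = {'per_page': per_page}
    while url:
        response = get(url, headers=headers, params=params)
        response.raise_for_status()
        items.extend(response.json())
        # requests parses the Link header into `response.links`;
        # the 'next' entry is absent on the last page
        url = response.links.get('next', {}).get('url')
        # the 'next' URL already carries its own query string
        params = None
    return items
```

With the variables defined above, a call could look like `fetch_all_pages(base_url + ih_dl_repo + forks, requests.get, headers={'Authorization': token})`.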

#### 2. Loop the JSON response to find out the language attribute of each fork. Use an array to store the language attributes of each fork.
Hint: Each language should appear only once in your array.
Print the language array. It should be something like: ["Python", "Jupyter Notebook", "HTML"]

In [14]:
# your code here
df = pd.DataFrame(result)
lang_urls = df['languages_url']
# Sum the bytes of code per language across all forks;
# the dictionary keys are the unique languages
languages = {}
for lang_url in lang_urls:
    lan_resp = requests.get(lang_url, headers={'Authorization':token})
    if lan_resp.status_code == 200:
        for k, v in lan_resp.json().items():
            languages[k] = languages.get(k, 0) + v
languages

{'Jupyter Notebook': 22617301,
 'HTML': 6976424,
 'Python': 308847,
 'Shell': 1051}
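
The dictionary above sums bytes of code per language through each fork's `languages_url`. The task itself only asks for the `language` attribute, which is already part of every fork object. Below is a minimal sketch, assuming `forks` is the list of fork dictionaries returned by the API; the helper name `unique_languages` is hypothetical.

```python
def unique_languages(forks):
    """Return each fork's primary language once, in order of first appearance."""
    seen = []
    for fork in forks:
        language = fork.get('language')
        # `language` is null (None) when GitHub detects no language
        if language is not None and language not in seen:
            seen.append(language)
    return seen
```

Calling `unique_languages(result)` on the forks fetched earlier would give a list in the shape the exercise describes.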

## Challenge 2: Count Commits
Count how many commits were made in October 2018.
#### 1. Obtain all the commits made in October 2018 via the API. The response is a JSON array of commit objects.

In [15]:
# your code here
# Request up to 100 commits per page and stop at the first page without commit data
page = 1
next_page = True
commit_list = []
while next_page:
    parameters = {
        'since': '2018-10-01',
        'until': '2018-10-31',
        'per_page': '100',
        'page': page
    }
    response = requests.get(base_url + ih_dl_repo + commits, params=parameters, headers={'Authorization':token})
    df = pd.DataFrame(response.json())
    if 'commit' in df:
        commit_list.append(df.commit)
        page = page + 1
    else:
        next_page = False
commit_list

[0     {'author': {'name': 'Gobinde43', 'email': 'bel...
 1     {'author': {'name': 'Gobinde43', 'email': 'bel...
 2     {'author': {'name': 'Gobinde43', 'email': 'bel...
 3     {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 4     {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 5     {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 6     {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 7     {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 8     {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 9     {'author': {'name': 'Tony Ojeda', 'email': 'to...
 10    {'author': {'name': 'Tony Ojeda', 'email': 'to...
 11    {'author': {'name': 'Gobinde43', 'email': 'bel...
 12    {'author': {'name': 'Tony Ojeda', 'email': 'to...
 13    {'author': {'name': 'Tony Ojeda', 'email': 'to...
 14    {'author': {'name': 'Tony Ojeda', 'email': 'to...
 15    {'author': {'name': 'Tony Ojeda', 'email': 'to...
 16    {'author': {'name': 'Zhou Zhou', 'email': 'zho...
 17    {'author': {'name': 'Zho
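
The GitHub API documents `since` and `until` as ISO 8601 timestamps in the form `YYYY-MM-DDTHH:MM:SSZ`. A date-only `until` such as `2018-10-31` may be read as the start of that day, which would leave out commits made on 31 October; this is an assumption worth checking against the response. Below is a sketch that builds explicit bounds for a whole month. The helper name `month_bounds` is hypothetical.

```python
import calendar

def month_bounds(year, month):
    """Return `since`/`until` timestamps (UTC) covering one calendar month."""
    # monthrange returns (weekday of the first day, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return {
        'since': f'{year:04d}-{month:02d}-01T00:00:00Z',
        'until': f'{year:04d}-{month:02d}-{last_day:02d}T23:59:59Z',
    }
```

The returned dictionary can be merged into the request parameters, for example `parameters.update(month_bounds(2018, 10))`.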

In [16]:

# Flatten the per-page Series into one list of committer records
c_list = [c['committer'] for cs in commit_list for c in cs]
c_list

[{'name': 'Gobinde43',
  'email': 'belenlinacero@gmail.com',
  'date': '2018-10-30T22:51:52Z'},
 {'name': 'Gobinde43',
  'email': 'belenlinacero@gmail.com',
  'date': '2018-10-30T22:45:48Z'},
 {'name': 'Gobinde43',
  'email': 'belenlinacero@gmail.com',
  'date': '2018-10-30T22:25:52Z'},
 {'name': 'Zhou Zhou',
  'email': 'zhou.eye8@gmail.com',
  'date': '2018-10-30T05:06:04Z'},
 {'name': 'Zhou Zhou',
  'email': 'zhou.eye8@gmail.com',
  'date': '2018-10-30T04:54:28Z'},
 {'name': 'Zhou Zhou',
  'email': 'zhou.eye8@gmail.com',
  'date': '2018-10-30T02:19:11Z'},
 {'name': 'Zhou Zhou',
  'email': 'zhou.eye8@gmail.com',
  'date': '2018-10-30T02:15:53Z'},
 {'name': 'Zhou Zhou',
  'email': 'zhou.eye8@gmail.com',
  'date': '2018-10-30T02:15:49Z'},
 {'name': 'Zhou Zhou',
  'email': 'zhou.eye8@gmail.com',
  'date': '2018-10-30T02:15:02Z'},
 {'name': 'Tony Ojeda',
  'email': 'tojeda@districtdatalabs.com',
  'date': '2018-10-29T21:57:34Z'},
 {'name': 'Tony Ojeda',
  'email': 'tojeda@districtdatalabs

#### 2. Count how many commit objects are contained in the array.

In [17]:
# your code here
len(c_list)

59
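
As a sanity check on the count, the committer dates can be parsed and tested against the target month. Below is a minimal sketch, assuming each entry carries a `date` in the format shown in the output above; the helper name `count_in_month` is hypothetical.

```python
from datetime import datetime

def count_in_month(committers, year, month):
    """Count the records whose `date` falls in the given year and month (UTC)."""
    count = 0
    for committer in committers:
        stamp = datetime.strptime(committer['date'], '%Y-%m-%dT%H:%M:%SZ')
        if stamp.year == year and stamp.month == month:
            count += 1
    return count
```

If every returned commit falls in October 2018, `count_in_month(c_list, 2018, 10)` should equal `len(c_list)`.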

## Challenge 3: Hidden Cold Joke

Using Python, call the GitHub API to find the cold joke hidden in the 24 secret files in the following repo:

https://github.com/ironhack-datalabs/scavenger

The secret files are named .0001.scavengerhunt through .0024.scavengerhunt and are scattered randomly across the directories of this repo. Search for them by calling the GitHub API, not by searching local files on your computer.

#### 1. Find the secret files.

In [40]:
# your code here
# List the top-level entries of the scavenger repo
contents = 'repos/ironhack-datalabs/scavenger/contents/'
response = requests.get(base_url + contents, headers={'Authorization':token})
result = response.json()


In [41]:
result

[{'name': '.gitignore',
  'path': '.gitignore',
  'sha': 'e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',
  'size': 10,
  'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/.gitignore?ref=master',
  'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/master/.gitignore',
  'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',
  'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/.gitignore',
  'type': 'file',
  '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/.gitignore?ref=master',
   'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',
   'html': 'https://github.com/ironhack-datalabs/scavenger/blob/master/.gitignore'}},
 {'name': '15024',
  'path': '15024',
  'sha': '2945e51c87ad5da893c954afcf092f06343bbb7d',
  'size': 0,
  'url': 'https://api.github.com

In [47]:
# Look inside each top-level directory and collect the paths of the secret files
content_lst = []
for content in result:
    if content['type'] == 'dir':
        resp = requests.get(base_url + contents + content['path'], headers={'Authorization':token}).json()
        for elem in resp:
            if '.scavengerhunt' in elem['path']:
                content_lst.append(elem['path'])

content_lst

['15024/.0006.scavengerhunt',
 '15534/.0008.scavengerhunt',
 '15534/.0012.scavengerhunt',
 '17020/.0007.scavengerhunt',
 '30351/.0021.scavengerhunt',
 '40303/.0022.scavengerhunt',
 '44639/.0005.scavengerhunt',
 '45525/.0018.scavengerhunt',
 '47222/.0016.scavengerhunt',
 '47222/.0024.scavengerhunt',
 '47830/.0010.scavengerhunt',
 '49418/.0014.scavengerhunt',
 '50896/.0011.scavengerhunt',
 '55417/.0023.scavengerhunt',
 '55685/.0020.scavengerhunt',
 '60224/.0003.scavengerhunt',
 '68848/.0004.scavengerhunt',
 '70751/.0019.scavengerhunt',
 '70985/.0017.scavengerhunt',
 '88596/.0002.scavengerhunt',
 '89338/.0013.scavengerhunt',
 '91701/.0015.scavengerhunt',
 '97881/.0009.scavengerhunt',
 '98750/.0001.scavengerhunt']
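
The loop above issues one request per top-level directory and looks only one level deep. An alternative is the Git Trees endpoint, `GET /repos/:owner/:repo/git/trees/:tree_sha?recursive=1`, which returns every path in a single response (GitHub may truncate very large trees). Below is a sketch of filtering such a response, assuming the branch is `master` as in the URLs shown earlier; the helper name `secret_paths` is hypothetical.

```python
def secret_paths(tree_response, suffix='.scavengerhunt'):
    """Return the paths of all files (blobs) in a tree response that end with `suffix`."""
    return [
        entry['path']
        for entry in tree_response.get('tree', [])
        if entry['type'] == 'blob' and entry['path'].endswith(suffix)
    ]

# Possible usage with the variables defined earlier (not executed here):
# tree = requests.get(base_url + 'repos/ironhack-datalabs/scavenger/git/trees/master',
#                     params={'recursive': '1'},
#                     headers={'Authorization': token}).json()
# content_lst = secret_paths(tree)
```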

#### 2. Sort the filenames in ascending order.

In [48]:
# your code here
# The numbers are zero-padded, so the filename part of each path
# sorts correctly as a plain string
sorted_lst = sorted(content_lst, key=lambda path: path.split('/')[-1])
sorted_lst

['98750/.0001.scavengerhunt',
 '88596/.0002.scavengerhunt',
 '60224/.0003.scavengerhunt',
 '68848/.0004.scavengerhunt',
 '44639/.0005.scavengerhunt',
 '15024/.0006.scavengerhunt',
 '17020/.0007.scavengerhunt',
 '15534/.0008.scavengerhunt',
 '97881/.0009.scavengerhunt',
 '47830/.0010.scavengerhunt',
 '50896/.0011.scavengerhunt',
 '15534/.0012.scavengerhunt',
 '89338/.0013.scavengerhunt',
 '49418/.0014.scavengerhunt',
 '91701/.0015.scavengerhunt',
 '47222/.0016.scavengerhunt',
 '70985/.0017.scavengerhunt',
 '45525/.0018.scavengerhunt',
 '70751/.0019.scavengerhunt',
 '55685/.0020.scavengerhunt',
 '30351/.0021.scavengerhunt',
 '40303/.0022.scavengerhunt',
 '55417/.0023.scavengerhunt',
 '47222/.0024.scavengerhunt']
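
To confirm that no file is missing, the numeric index can be extracted from each filename and compared with the expected range of 1 to 24. Below is a minimal sketch; the helper name `missing_indices` is hypothetical.

```python
import re

def missing_indices(paths, expected=range(1, 25)):
    """Return the expected file numbers that do not appear in `paths`."""
    found = set()
    for path in paths:
        # Capture the four digits between the leading dot and the extension
        match = re.search(r'\.(\d{4})\.scavengerhunt$', path)
        if match:
            found.add(int(match.group(1)))
    return sorted(set(expected) - found)
```

An empty list from `missing_indices(sorted_lst)` would indicate that all 24 files were found.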

#### 3. Read the content of each secret file into an array of strings.
Since the response is encoded by default, send the following header with your request to receive the raw file content:
````python
headers = {'Accept': 'application/vnd.github.v3.raw'}
````

In [50]:
# your code here
out = ''
for c in sorted_lst:
    resp = requests.get(base_url + contents + c, headers={'Authorization':token,'Accept': 'application/vnd.github.v3.raw'})
    out = out + resp.text
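
Without the raw `Accept` header, the contents endpoint returns JSON in which the file body sits in a base64-encoded `content` field alongside `"encoding": "base64"`. Below is a sketch of decoding that form instead; the helper name `decode_content` is hypothetical.

```python
import base64

def decode_content(payload):
    """Decode the base64 `content` field of a contents-API file object."""
    if payload.get('encoding') != 'base64':
        raise ValueError('unexpected encoding: %r' % payload.get('encoding'))
    # b64decode skips the newlines that appear inside the encoded text
    return base64.b64decode(payload['content']).decode('utf-8')
```

Either approach should yield the same text; the raw header simply saves the decoding step.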

#### 4. Concatenate the strings in the array, separating them with a single space.

In [61]:
# your code here
out = out.replace('\n', ' ').strip()
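
If the file contents are kept as a list, as the task describes, an equivalent result follows from `str.join`. Below is a minimal sketch, assuming `parts` holds one string per file in sorted order and each file contains a single fragment; the helper name `join_parts` is hypothetical.

```python
def join_parts(parts):
    """Join text fragments with single spaces, dropping surrounding whitespace."""
    return ' '.join(part.strip() for part in parts)
```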

#### 5. Print out the joke.

In [62]:
# your code here
out

'In data science, 80 percent of time spent is preparing data, 20 percent of time is spent complaining about the need to prepare data.'