## Challenge 3: Hidden Cold Joke

Using Python, call the GitHub API to find the cold joke hidden in the 24 secret files in the following repo:

https://github.com/ironhack-datalabs/scavenger

The secret files are named .0001.scavengerhunt through .0024.scavengerhunt and are scattered randomly across different directories of this repo. Search for them by calling the GitHub API, not by searching the local files on your computer.

Notes:

Github API documentation can be found here.

You will need to study the GitHub API documentation to decide which endpoint to call and which parameters to use in order to obtain the information you need. Unless you already know the GitHub API well, expect some trial and error: be prepared to go back and forth between studying the documentation, testing, and revising until you obtain what you need.

After receiving the JSON data object, you need to inspect its structure and decide how to parse the data.
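One way to inspect an unfamiliar response is to list its keys and value types before writing any parsing code. The sketch below is only an illustration: `summarize_structure` is a hypothetical helper, and `sample` is a hand-written, trimmed stand-in for a code search response rather than real API output.

```python
import json


def summarize_structure(obj, indent=0, max_depth=3):
    """Return lines describing the keys and value types of parsed JSON."""
    lines = []
    pad = "  " * indent
    if indent >= max_depth:
        return lines
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            lines.extend(summarize_structure(value, indent + 1, max_depth))
    elif isinstance(obj, list) and obj:
        # Only the first element is inspected; items in an API list usually share a shape
        lines.append(f"{pad}[0]: {type(obj[0]).__name__} (of {len(obj)})")
        lines.extend(summarize_structure(obj[0], indent + 1, max_depth))
    return lines


# Hand-written sample (assumption): a heavily trimmed search response
sample = json.loads("""
{
  "total_count": 2,
  "incomplete_results": false,
  "items": [
    {"name": ".0001.scavengerhunt", "path": "98750/.0001.scavengerhunt"},
    {"name": ".0008.scavengerhunt", "path": "15534/.0008.scavengerhunt"}
  ]
}
""")

structure_lines = summarize_structure(sample)
print("\n".join(structure_lines))
```

Running the same helper on the real response shows which keys hold the file names and URLs you need.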

When you test your requests against the GitHub API, GitHub may occasionally block you with an error message that reads:

You have triggered an abuse detection mechanism and have been temporarily blocked from content creation. Please retry your request again later.

Don't worry. Check the parameters in your request and wait for a minute or two before you make additional requests.
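The wait-and-retry advice can be sketched as a small wrapper. This is a minimal illustration under stated assumptions, not part of the exercise: `fetch_with_retry` is a hypothetical helper, the status codes 403 and 429 are assumed to signal a temporary block, and the demo uses a fake `fetch` callable so it runs without network access.

```python
import time

# Assumption: these status codes indicate a temporary block or rate limit
RATE_LIMIT_STATUSES = {403, 429}


def fetch_with_retry(fetch, max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call fetch() until it stops reporting a rate-limit status.

    fetch must return a (status_code, payload) tuple. The sleep function
    is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(1, max_attempts + 1):
        status_code, payload = fetch()
        if status_code not in RATE_LIMIT_STATUSES:
            return status_code, payload
        if attempt < max_attempts:
            # Wait before the next attempt, as the note above suggests
            sleep(wait_seconds)
    # Every attempt was blocked; hand back the last response
    return status_code, payload


# Demo with a fake fetch: blocked once, then successful
fake_responses = iter([(403, {"message": "blocked"}), (200, {"items": []})])
waits = []
status, body = fetch_with_retry(lambda: next(fake_responses), sleep=waits.append)
print(status, waits)
```

In real use, `fetch` would wrap a `requests.get` call and return `(res.status_code, res.json())`.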

After you find the secret files:

Sort the filenames in ascending order.

Read the content of each secret file into an array of strings.

Concatenate the strings in the array, separating each pair with a whitespace.

Print out the joke.
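The four steps above can be sketched on in-memory data before touching the API. The `files` dictionary is a hand-written sample (an assumption for illustration) that maps filenames to file contents, deliberately out of order.

```python
# Hand-written sample (assumption): filename -> raw file content, unsorted
files = {
    ".0003.scavengerhunt": "science,\n",
    ".0001.scavengerhunt": "In\n",
    ".0002.scavengerhunt": "data\n",
}

# 1. Sort the filenames in ascending order (zero-padded numbers sort correctly as text)
sorted_names = sorted(files)

# 2. Read each file's content into a list of strings, dropping trailing newlines
words = [files[name].strip() for name in sorted_names]

# 3. Concatenate with a single whitespace between every pair of strings
joke = " ".join(words)

# 4. Print the result
print(joke)
```

`str.join` places the separator between every pair of items, so no index arithmetic is needed.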

In [1]:
import os
from dotenv import load_dotenv
import requests

load_dotenv()
print("setup complete")

setup complete


In [2]:
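def get_gh_data(endpoint, apiKey=None, query_params=None):
    """Send a GET request to the GitHub API and return the parsed JSON body."""
    # Read the token at call time; a default of os.getenv(...) in the signature is evaluated only once
    apiKey = apiKey or os.getenv("GITHUB_APIKEY")
    baseUrl = "https://api.github.com"
    url = f"{baseUrl}{endpoint}"
    # Send the Authorization header only when a token is available
    headers = {"Authorization": f"Bearer {apiKey}"} if apiKey else {}
    res = requests.get(url, params=query_params or {}, headers=headers)
    print(f"Request data to {res.url} status_code:{res.status_code}")

    return res.json()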

In [6]:
# https://api.github.com/search/code?q=addClass+repo:jquery/jquery+filename:classes.js
# https://github.com/ironhack-datalabs/scavenger
# .0001.scavengerhunt to .0024.scavengerhunt
# https://docs.github.com/en/github/searching-for-information-on-github/searching-code#search-by-the-file-contents-or-file-path


params = {"q": "filename:.scavengerhunt repo:ironhack-datalabs/scavenger"}

data = get_gh_data("/search/code", query_params=params)
print(len(data["items"]))

data_resume = {file["name"]:file["url"] for file in data["items"]}
#print(data_resume)

data_resume_items = data_resume.items()
data_ordenado = sorted(data_resume_items)
#print(data_ordenado)


words_list = []



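# Send the token with each contents request; unauthenticated calls hit the rate limit quickly
auth_headers = {"Authorization": f"Bearer {os.getenv('GITHUB_APIKEY')}"}

for name, api_url in data_ordenado:
    res = requests.get(api_url, headers=auth_headers)
    print(res.json())
    # download_url points at the raw file content
    text_file = requests.get(res.json()["download_url"])
    texto = text_file.text
    words_list.append(texto)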

#print(words_list)

"""
{'message': "API rate limit exceeded for 5.224.116.191. (But here's the good news: 
Authenticated requests get a higher rate limit. Check out the documentation for more details.)", 
'documentation_url': 'https://developer.github.com/v3/#rate-limiting'}

"""

Request data to https://api.github.com/search/code?q=filename%3A.scavengerhunt+repo%3Aironhack-datalabs%2Fscavenger status_code:200
24
{'name': '.0001.scavengerhunt', 'path': '98750/.0001.scavengerhunt', 'sha': '2add7632f1323136324efbf38ec66db1838b6173', 'size': 3, 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/98750/.0001.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/9308ccc8a4c34c5e3a991ee815222a9691c32476/98750/.0001.scavengerhunt', 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/2add7632f1323136324efbf38ec66db1838b6173', 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/9308ccc8a4c34c5e3a991ee815222a9691c32476/98750/.0001.scavengerhunt', 'type': 'file', 'content': 'SW4K\n', 'encoding': 'base64', '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/98750/.0001.scavengerhunt?ref=9308ccc8

{'name': '.0008.scavengerhunt', 'path': '15534/.0008.scavengerhunt', 'sha': 'e351fb73264581ce26504b97ef07daea35116f32', 'size': 6, 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534/.0008.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/9308ccc8a4c34c5e3a991ee815222a9691c32476/15534/.0008.scavengerhunt', 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e351fb73264581ce26504b97ef07daea35116f32', 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/9308ccc8a4c34c5e3a991ee815222a9691c32476/15534/.0008.scavengerhunt', 'type': 'file', 'content': 'c3BlbnQK\n', 'encoding': 'base64', '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534/.0008.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e351fb73264581ce26504

{'name': '.0016.scavengerhunt', 'path': '47222/.0016.scavengerhunt', 'sha': 'f5cb13223fdc1b11f4cfbbe1694f533b3c579fa0', 'size': 3, 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222/.0016.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/9308ccc8a4c34c5e3a991ee815222a9691c32476/47222/.0016.scavengerhunt', 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/f5cb13223fdc1b11f4cfbbe1694f533b3c579fa0', 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/9308ccc8a4c34c5e3a991ee815222a9691c32476/47222/.0016.scavengerhunt', 'type': 'file', 'content': 'aXMK\n', 'encoding': 'base64', '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222/.0016.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/f5cb13223fdc1b11f4cfbbe16

{'name': '.0024.scavengerhunt', 'path': '47222/.0024.scavengerhunt', 'sha': '47eb4306e5fec9e051dacabc7039348109784b94', 'size': 6, 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222/.0024.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/9308ccc8a4c34c5e3a991ee815222a9691c32476/47222/.0024.scavengerhunt', 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/47eb4306e5fec9e051dacabc7039348109784b94', 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/9308ccc8a4c34c5e3a991ee815222a9691c32476/47222/.0024.scavengerhunt', 'type': 'file', 'content': 'ZGF0YS4K\n', 'encoding': 'base64', '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222/.0024.scavengerhunt?ref=9308ccc8a4c34c5e3a991ee815222a9691c32476', 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/47eb4306e5fec9e051dac

'\n{\'message\': "API rate limit exceeded for 5.224.116.191. (But here\'s the good news: \nAuthenticated requests get a higher rate limit. Check out the documentation for more details.)", \n\'documentation_url\': \'https://developer.github.com/v3/#rate-limiting\'}\n\n'
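The responses printed above already carry each file's text in the `content` field, with `encoding` set to `base64`, so the extra request to `download_url` could in principle be skipped. The sketch below is one possible alternative, not the approach used in this notebook: `decode_content` is a hypothetical helper, and the sample dictionaries keep only the three relevant fields copied from the output above.

```python
import base64


def decode_content(file_json):
    """Decode the base64 'content' field of a contents-API response."""
    if file_json.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {file_json.get('encoding')}")
    # b64decode ignores the embedded newline by default; strip() drops the file's trailing newline
    return base64.b64decode(file_json["content"]).decode("utf-8").strip()


# Fields copied from the printed responses above, trimmed to what is needed
samples = [
    {"name": ".0001.scavengerhunt", "content": "SW4K\n", "encoding": "base64"},
    {"name": ".0008.scavengerhunt", "content": "c3BlbnQK\n", "encoding": "base64"},
    {"name": ".0016.scavengerhunt", "content": "aXMK\n", "encoding": "base64"},
    {"name": ".0024.scavengerhunt", "content": "ZGF0YS4K\n", "encoding": "base64"},
]

decoded = [decode_content(sample) for sample in samples]
print(decoded)
```

Decoding locally halves the number of HTTP requests, which also makes the rate limit less likely to be reached.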

In [16]:
#print(words_list)

# Strip the trailing newline from each file's content
words_list = [s.strip() for s in words_list]

# Concatenate the strings in the array, separating each pair with a whitespace
result = " ".join(words_list)

print(result)
# The result of the exercise is as follows

In data science, 80 percent of time spent is preparing data, 20 percent of time is spent complaining about the need to prepare data.
