# GitHub Commit Monitor - API call example in Python

## Project overview
This project was made as a **self-assignment/learning tool**. It consists of an **example** of a **REST API call** written in **Python**, accompanied by an introduction to some of the concepts surrounding the topic.

In particular, the code presented in this notebook calls the **GitHub REST API** (https://docs.github.com/en/rest?apiVersion=2022-11-28), which we use here to retrieve some of our **GitHub account information**, namely:
- API key expiration date
- Last commit information for the last five (different) projects receiving a commit

One might need this kind of information if they host several projects that receive commits from one or more other people. This monitoring script saves the user the time it takes to open their profile and check each project for new commits.

I tried to keep the code simple by choosing just these two outputs, as others would have had an implementation similar to one or the other of the two I chose. More robust code would probably retrieve more information, or use it as input to other functions instead of just listing it as done here.

Below is an essential **glossary** of the concepts introduced in this lesson.

The last part of the notebook contains a **fully commented** version of the **code**, which anyone can copy, ideally citing this GitHub project as a reference (https://github.com/RAGilardi/GitHub-Commit-Monitor_API-call-example-in-Python).

An interesting **exercise** for readers who want to learn how to implement API calls would be to use this code as a starting point and add more information to its output, using one of the many other parameters the GitHub REST API gives access to. To do this, I strongly suggest reading the documentation linked above before asking ChatGPT to write a mockup that we don't completely understand.
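
As a starting point for that exercise, here is a minimal sketch of how extra fields could be pulled from a repository object. It assumes the shape of the objects returned by the `/users/{username}/repos` endpoint; `summarize_repo` and the sample values are hypothetical and only serve as an illustration, so the field names should be double-checked against the documentation.

```python
# Hypothetical sample of one repository object, trimmed to a few fields.
# The field names follow the GitHub REST API repository objects; the values are made up.
sample_repo = {
    "name": "example-repo",
    "language": "Python",
    "stargazers_count": 3,
    "open_issues_count": 1,
    "pushed_at": "2024-04-28T15:21:58Z",
}


def summarize_repo(repo):
    """Return a one-line summary built from a repository JSON object."""
    # "language" can be null in the JSON (None in Python), hence the fallback
    language = repo.get("language") or "unknown"
    # .get() with a default keeps the sketch from failing if a field is missing
    stars = repo.get("stargazers_count", 0)
    issues = repo.get("open_issues_count", 0)
    return f"{repo['name']} [{language}] stars: {stars}, open issues: {issues}"


print(summarize_repo(sample_repo))
```

In the full script, a function like this could be applied to each element of the repository list before printing.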

## Glossary
    
    
   - **API**: Application Programming Interfaces are pieces of software designed to facilitate communication between different machines or software. They often consist of various components that can be used individually as tools or services made available to programmers.
   
   APIs come in various forms and implementations, but on the modern web they are frequently used to grant external programs access to the resources of a webpage or web application.
   
   The utilization of an API is facilitated by the source app or machine owner, who defines the API specifications. By adhering to these specifications, external users can access some of the API's routines through so-called API calls. Typically, users are also required to have a personal and secret API key for authentication.

   - **API vs Scraping**: Continuing with the discussion on APIs used to access server or webpage information, an alternative method for end-users to access public data is through **web scraping**. This involves techniques in which a web page's source text is parsed, and information is extracted using hardcoded software.
   
   While web scraping provides end-users with complete control over the information they can retrieve, it also heavily relies on their ability to interpret the webpage's source code and develop code to extract the desired data. Additionally, web scraping is often tightly coupled with the page layout, meaning that even minor HTML modifications can render scraping software ineffective until appropriately adjusted.
   
   In contrast, APIs are typically developed and maintained by the source owner, allowing callers to easily access specified information, even if it's not explicitly public and scrapeable. 
   
   The **advantage of using an API** over web scraping is that, as long as the service owner maintains it, API calls may not require updates from the end-user. Furthermore, public APIs, accompanied by comprehensive documentation, facilitate easier sharing and discussion of API call codes and functionality.
   
   The main **disadvantage of APIs** compared to web scraping is that a service owner may restrict access to certain information via APIs, even if it's scrapeable. Therefore, users may resort to scraping the desired data against the owner's wishes. It's also essential to note that many websites employ anti-scraping techniques to prevent data extraction from their HTML.

   - **REST vs SOAP**: These are two of the primary ways in which APIs are implemented, based on different technologies and requirements.
   
   **REST APIs** receive calls through HTTP verbs (such as GET or POST) and typically return responses in JSON format. However, they are flexible and can return data in various formats, optionally even executable code. REST APIs follow a client-server architecture, possibly with intermediate layers. They are cacheable, stateless, and expose a uniform interface.
   
   **SOAP APIs** adhere to more stringent rules, accepting calls as XML messages wrapped in "SOAP envelopes" and exclusively returning XML responses. They are often considered more secure, as they tightly control each step of the data exchange process.

   For more details, please refer to this comparison by AWS (https://aws.amazon.com/compare/the-difference-between-soap-rest/?nc1=h_ls).
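
To make the difference in payloads concrete, the sketch below reads the same piece of data from a JSON body (as a REST API would typically return it) and from an XML envelope (as a SOAP API would). Both payloads are hypothetical: the `GetRepoResponse` element and the `http://example.com/repos` namespace are invented for illustration and do not belong to any real service.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical REST-style response body: plain JSON
rest_body = '{"name": "example-repo", "private": false}'
rest_data = json.loads(rest_body)

# Hypothetical SOAP-style response: the same data wrapped in an XML envelope.
# "GetRepoResponse" and its namespace are made up for this example.
soap_body = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <GetRepoResponse xmlns="http://example.com/repos">
      <name>example-repo</name>
      <private>false</private>
    </GetRepoResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefixes used in the search path below, mapped to the namespaces in the XML
namespaces = {
    "soap": "http://www.w3.org/2003/05/soap-envelope",
    "r": "http://example.com/repos",
}
root = ET.fromstring(soap_body)
soap_name = root.find("soap:Body/r:GetRepoResponse/r:name", namespaces).text

print(rest_data["name"])
print(soap_name)
```

The JSON body maps directly onto Python dictionaries, while the SOAP message requires navigating the envelope and its namespaces first.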

## The Code: a simple API call
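
Before the full script, the anatomy of an authenticated REST call described in the glossary can be sketched as follows. This is only an illustration: it builds the request without sending it, and it uses the standard library's `urllib.request` so that it runs without a network connection or extra packages, whereas the script itself relies on `requests`. The `build_repos_request` helper and the placeholder token are hypothetical.

```python
import urllib.request


def build_repos_request(username, token):
    """Build (but do not send) a GET request listing a user's repositories."""
    url = f"https://api.github.com/users/{username}/repos"
    headers = {
        # the personal, secret API key travels in the Authorization header
        "Authorization": f"Bearer {token}",
        # ask the GitHub API for its JSON representation
        "Accept": "application/vnd.github+json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# "octocat" and the token below are placeholders, not real credentials
req = build_repos_request("octocat", "PLACEHOLDER_TOKEN")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

The three ingredients visible here (HTTP verb, resource URL, authentication header) are the same ones the script passes to `requests.get`.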

In [20]:
#needed for REST API HTTP calls
import requests

#needed to calculate the expiration date on the API key
from datetime import datetime, timedelta

#added to make password invisible while typing it
import getpass

#function that tells us when our API key will expire
def getAPIkeyExpiration(token):
    
    #prepare the input for the requests.post function
    #these are based on gitHub API documentation
    url = "https://api.github.com/graphql"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "application/vnd.github.merge-info-preview+json"
    }
    query = "{__schema { directives { description name}}}"
    
    #the timeout is expressed in seconds: increase it if needed
    #(heavier queries might need more time before the gitHub server answers)
    response = requests.post(url, headers=headers, json={"query": query}, timeout=10)

    #raise an error if the server responds with anything other than 200 (ok), showing the actual response status code
    if response.status_code != 200:
        raise ValueError(f"Error querying the GitHub API: {response.status_code}")
    
    #save the expiration date in a variable (it stays None if the header is missing,
    #e.g. for tokens without an expiration date)
    expirationDate = None
    if 'GitHub-Authentication-Token-Expiration' in response.headers:
        expirationDateStr = response.headers['GitHub-Authentication-Token-Expiration']
        expirationDate = datetime.strptime(expirationDateStr, "%Y-%m-%d %H:%M:%S %Z")
    
    return expirationDate

def getLastCommitsPerProject(username, token):
    
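    #prepare the input for the requests.get function
    #these are based on gitHub API documentation (the endpoint listing the user's repositories)
    #sorting by "pushed" asks for the most recently pushed repositories first (the default order is by name)
    url = f"https://api.github.com/users/{username}/repos"
    headers = {"Authorization": f"token {token}"}
    response = requests.get(url, headers=headers, params={"sort": "pushed"}, timeout=10)  # timeout is in seconds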
    
    #check the status_code response before continuing
    if response.status_code != 200:
        raise ValueError(f"Error retrieving repositories: {response.status_code}")
    
    repos = response.json()
    lastCommits = []
    
    #reads the data from the JSON
    #note that the committer name (reported here as author) will be GitHub if the commit was made from the website by the owner account
    # Limiting to the first 5 repositories returned by the API, for demonstration purposes
    for repo in repos[:5]:  
        commitsUrl = f"https://api.github.com/repos/{username}/{repo['name']}/commits"
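        commitsResponse = requests.get(commitsUrl, headers=headers, timeout=10)  # timeout is in seconds
        #skip repositories whose commits cannot be listed (e.g. empty ones, which do not answer with 200)
        if commitsResponse.status_code != 200:
            continue
        commits = commitsResponse.json()
        if commits: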
            lastCommit = commits[0]['commit']['message']
            lastCommitDate = commits[0]['commit']['committer']['date']
            lastAuthor = commits[0]['commit']['committer']['name']
            lastCommits.append((repo['name'], lastCommit, lastCommitDate, lastAuthor))
    
    return lastCommits

#main function for calling other functions and giving outputs
def main():
    
    #input username (visible) and token (hidden) with empty input check
    #if the token/user are wrong, getLastCommitsPerProject will raise an error
    username = input("Enter your GitHub username: ")
    
    if not username.strip():
        raise ValueError("Username cannot be empty.")
    
    token = getpass.getpass('Enter your GitHub API token: ')

    if not token.strip():
        raise ValueError("Token cannot be empty.")
        
    #might also hardcode user/key if more comfortable (still, it's a security red flag)
    #username = 'user'
    #token = 'APIkey'
    
    #call the function which returns data on last commits
    #called first, even though it takes longer than the API key check, because it tells us whether the user/token are right
    lastCommits = getLastCommitsPerProject(username, token)
    lastCommits = sorted(lastCommits, key=lambda x: x[2], reverse=True)
    
    #calls the function retrieving info on the current API key
    expirationDate = getAPIkeyExpiration(token)
    
    #if the function returns a proper result, we print it
    if expirationDate:
        print(f"\nAPI Key Expiration Date: {expirationDate} UTC")
    else:
        print("\nAPI Key Expiration Date: Not available")
    
    #printing the info on the last commits
    print("\nLast commits per project:")
    for i, (project, commitMessage, commitDate, author) in enumerate(lastCommits[:10], start=1):
        print(f"{i}) {project}: {commitMessage} (Date: {commitDate}, Author: {author})")

#entry point: runs main() when the file is executed as a script (in a notebook, simply calling main() works too)
if __name__ == "__main__":
    main()

Enter your GitHub username: RAGilardi
Enter your GitHub API token: ········

API Key Expiration Date: 2024-05-08 14:17:41 UTC

Last commits per project:
1) GitHub-Commit-Monitor_API-call-example-in-Python: Update README.md (Date: 2024-04-28T15:21:58Z, Author: GitHub)
2) PoweBI_eCommerce_Dashboard_Example: Completed Project

First draft of the project with data files, project description and graphics (Date: 2024-04-28T12:31:09Z, Author: GitHub)
3) Arcade_Mania_Reboot: Update index.html (Date: 2023-08-21T16:08:17Z, Author: GitHub)
4) AthenaWFI_Sixte_Pipeline: Update README.md (Date: 2023-05-01T12:02:14Z, Author: GitHub)
5) Monty-Hall-Problem-Empirical-Solution: Add files via upload (Date: 2023-04-30T17:27:19Z, Author: GitHub)
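
A possible extension is to report how many days are left before the key expires. The sketch below assumes the `GitHub-Authentication-Token-Expiration` header keeps the `YYYY-MM-DD HH:MM:SS UTC` format shown in the output above; `parse_token_expiration` and `days_until` are hypothetical helpers, not part of the script, and the sample headers are a plain dictionary standing in for `response.headers`.

```python
from datetime import datetime, timezone


def parse_token_expiration(headers):
    """Return the expiration as a timezone-aware UTC datetime, or None if the header is absent."""
    # a plain dict is case-sensitive, unlike the headers object returned by requests
    raw = headers.get("GitHub-Authentication-Token-Expiration")
    if raw is None:
        return None
    # assumes the "YYYY-MM-DD HH:MM:SS UTC" format seen in the output above
    naive = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S %Z")
    return naive.replace(tzinfo=timezone.utc)


def days_until(expiration, now):
    """Full days from `now` to `expiration`, rounded down (negative once expired)."""
    # subtracting two datetimes yields a timedelta
    return (expiration - now).days


sample_headers = {"GitHub-Authentication-Token-Expiration": "2024-05-08 14:17:41 UTC"}
expiration = parse_token_expiration(sample_headers)

# fixed "now" so the result is reproducible; a real script would use datetime.now(timezone.utc)
now = datetime(2024, 4, 28, 15, 21, 58, tzinfo=timezone.utc)
print(f"Days until expiration: {days_until(expiration, now)}")
```

Returning `None` for a missing header mirrors the "Not available" branch in `main()`.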
