# GITHUB STATS INFO

[GitHub API endpoints](https://docs.github.com/en/rest/overview/endpoints-available-for-github-app-installation-access-tokens?apiVersion=2022-11-28#branches)

When going through the GitHub page of a potential tech employee, it's important to assess their coding skills, work habits, and overall suitability for your team. 

1. **Contribution History:** by analyzing the number of 
   - commits -> `commit frequency`
   - pull requests -> `pull request frequency`
   - contributions made over time -> `lines of code contributed`

2. **Quality of Code:** by using code analysis tools that generate metrics like
   - cyclomatic complexity
   - code duplication
   - code coverage.
   These tools can provide quantitative data on code maintainability, readability, and quality.

3. ~~**Problem-Solving Skills:** Assessing problem-solving skills statistically can be more challenging. One way is to evaluate the complexity and effectiveness of solutions they've implemented in their code, which can be measured using metrics like algorithmic efficiency.~~

4. **Collaboration:** Collaboration can be assessed by looking at the
   - number of pull requests
   - issues opened
   - contributions to other team members' projects.

5. **Variety of Projects:** You can assess this statistically by categorizing their repositories into different project types or technologies and calculating the distribution.

6. **Use of Technologies:** You can measure this by identifying the programming languages and technologies they use in their repositories and quantifying their usage. The result can be represented as a technology stack distribution.

7. **Project Completeness:** A bit more qualitative, but you can assess it by analyzing the commit and update history of their repositories.
_Frequent updates, bug fixes, and improvements might indicate a commitment to project completion._

8. **Open Source Contributions:** Measure this statistically by evaluating the number and significance of contributions made to well-known open-source projects. The `reputation` and `impact` of these contributions can be approximated, for example by the star count of the target project and the number of merged pull requests.

9. ~~**Issue Management:** You can assess their responsiveness statistically by analyzing the time taken to address issues and the percentage of closed issues in their repositories.~~

10. **Documentation:** Assess the quality of documentation statistically by looking at the `length, clarity, and completeness of README files`, and potentially analyzing user feedback regarding documentation.

11. ~~**Version Control:** Measure this statistically by analyzing their use of Git, such as commit frequency, the number of branches, and the structure of their commit history.~~

12. ~~**Learning and Growth:** This may be more challenging to measure statistically but can be inferred from the `update frequency of projects`, the adoption of new technologies, or the addition of new features.~~

13. ~~**Coding Style Consistency:** You can measure this statistically by using code analysis tools that check adherence to coding style guides and report inconsistencies.~~

14. ~~**Passion and Enthusiasm:** While difficult to measure directly, you can assess passion indirectly through the number and nature of contributions, as well as their engagement in relevant forums or communities.~~

15. ~~**Adaptability:** Measuring adaptability statistically might involve looking at the variety of programming languages and frameworks they've used and assessing their proficiency in each based on quantitative criteria.~~
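
The code-quality metrics in item 2 are normally produced by dedicated analysis tools. As a rough illustration of what cyclomatic complexity measures, the sketch below counts decision points in Python source with the standard `ast` module. The function name `approximate_complexity`, the set of counted node types, and the sample function are assumptions made for this example, not the output of any particular tool.

```python
import ast

# Node types counted as decision points (a simplified, illustrative choice).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                  ast.IfExp, ast.comprehension)


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths
            complexity += len(node.values) - 1
    return complexity


# Hypothetical function used only to exercise the counter.
sample = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''
print(approximate_complexity(sample))
```

A score computed this way is only comparable between files measured with the same rules, so it is better suited to ranking a candidate's repositories against each other than to absolute judgments.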


### IMPORTS

In [1]:
import requests
from pprint import pprint as pp
from tabulate import tabulate
from datetime import datetime

# USER

**PROFILE:**

| What to Measure                 | How to Measure It                   | Why It's Important                             |
|--------------------------------|-------------------------------------|-------------------------------------------------|
| GitHub Username                | Access the user's GitHub profile.  | Identifies the user on the platform.           |
| Name and Bio                   | Retrieve the user's name and bio from their profile. | Provides basic personal information.      |
| Location and Email             | Check the user's location and email information if provided. | Helps understand their geographic base and contact details. |
| Public Activity                | Examine the user's recent public activity, including starred repositories and followed users. | Shows current interests and engagements. |
| Follower and Following Count   | Count the number of followers and accounts they are following on GitHub. | Indicates their level of engagement with the community. |
| Organizations                  | Identify the organizations the user is a member of or has contributed to. | Shows affiliations and professional associations. |
| Public Repositories            | Count the number of public repositories owned by the user. | Indicates their coding projects and open-source contributions. |
| Contribution Graph             | Analyze the user's GitHub contribution graph to see their recent activity. | Provides an overview of their coding activity over time. |
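
Most rows of the table map directly onto keys of the user object returned by the API. The sketch below builds a small profile summary from such an object. The helper name `summarize_profile`, the output labels, and the fixed reference date are assumptions for illustration; the sample values are copied from the response shown in this notebook.

```python
from datetime import datetime, timezone

# Trimmed copy of the user response shown in this notebook.
sample_user = {
    'login': 'esmond-adjei',
    'name': 'Esmond Adjei',
    'location': 'Ghana',
    'email': None,
    'followers': 6,
    'following': 12,
    'created_at': '2021-03-23T09:14:11Z',
}


def summarize_profile(user, now):
    """Pick the profile fields of interest and derive the account age."""
    created = datetime.strptime(user['created_at'], '%Y-%m-%dT%H:%M:%SZ')
    created = created.replace(tzinfo=timezone.utc)
    return {
        'Username': user['login'],
        'Name': user.get('name') or 'N/A',
        'Location': user.get('location') or 'N/A',
        'Email': user.get('email') or 'N/A',
        'Followers': user.get('followers', 0),
        'Following': user.get('following', 0),
        'Account Age (days)': (now - created).days,
    }


# A fixed reference date keeps the example reproducible.
reference = datetime(2023, 10, 23, tzinfo=timezone.utc)
summary = summarize_profile(sample_user, reference)
for label, value in summary.items():
    print(f'{label}: {value}')
```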


In [17]:
USER_BASE_URL = 'https://api.github.com/users'
headers = {'Authorization': 'Bearer KEY'}

In [3]:
# 'Yantiomene'
username = 'esmond-adjei'
user_response = requests.get(f'{USER_BASE_URL}/{username}', headers=headers)

In [4]:
pp(user_response.json())

{'avatar_url': 'https://avatars.githubusercontent.com/u/81225469?v=4',
 'bio': 'A student at KNUST, Kumasi and ALX. Interested in Python, Software '
        'Engineering, Data Analysis, Algorithms and Machine Learning. Learner '
        'at ALX',
 'blog': 'esmond.vercel.app',
 'collaborators': 7,
 'company': None,
 'created_at': '2021-03-23T09:14:11Z',
 'disk_usage': 118593,
 'email': None,
 'events_url': 'https://api.github.com/users/esmond-adjei/events{/privacy}',
 'followers': 6,
 'followers_url': 'https://api.github.com/users/esmond-adjei/followers',
 'following': 12,
 'following_url': 'https://api.github.com/users/esmond-adjei/following{/other_user}',
 'gists_url': 'https://api.github.com/users/esmond-adjei/gists{/gist_id}',
 'gravatar_id': '',
 'hireable': None,
 'html_url': 'https://github.com/esmond-adjei',
 'id': 81225469,
 'location': 'Ghana',
 'login': 'esmond-adjei',
 'name': 'Esmond Adjei',
 'node_id': 'MDQ6VXNlcjgxMjI1NDY5',
 'organizations_url': 'https://api.github.com/u

# ACTIVITIES

Analysis of the user's activities provides insight into their behavior in terms of the following:
- frequency of commits
- frequency of issues
- frequency of pull requests

- community involvement
- project engagement

| What to Measure                 | How to Measure It                   | Why It's Important                             |
|--------------------------------|-------------------------------------|-------------------------------------------------|
| Contribution History           | Analyze the number of commits, pull requests, issues created, and issues commented on. Measure activity over time. | Assesses their ongoing contributions and commitment to coding. |
| Collaboration                  | Count the number of pull requests and issues opened. Analyze the number of contributions to other repositories. Measure response times in issues and pull requests. | Reflects their ability to collaborate and engage with the community. |
| Issue Management               | Measure time-to-close for issues. Calculate issue closure rates. | Evaluates their efficiency and responsiveness in issue handling. |
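
The issue-management row can be made concrete with two numbers: the share of issues that were closed and the typical time from opening to closing. The sketch below computes both from a list of issue dictionaries that carry `created_at` and `closed_at` timestamps. The helper names and the sample issues are hypothetical, and the median is one reasonable choice of summary statistic rather than a prescribed one.

```python
from datetime import datetime
from statistics import median


def parse_ts(timestamp):
    """Parse a timestamp such as '2023-10-23T22:15:04Z'."""
    return datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%SZ')


def issue_metrics(issues):
    """Return the closure rate and the median hours to close."""
    closed = [issue for issue in issues if issue.get('closed_at')]
    hours = [
        (parse_ts(i['closed_at']) - parse_ts(i['created_at'])).total_seconds() / 3600
        for i in closed
    ]
    return {
        'closure_rate': len(closed) / len(issues) if issues else 0.0,
        'median_hours_to_close': median(hours) if hours else None,
    }


# Hypothetical issues, for illustration only.
sample_issues = [
    {'created_at': '2023-10-01T08:00:00Z', 'closed_at': '2023-10-01T20:00:00Z'},
    {'created_at': '2023-10-02T08:00:00Z', 'closed_at': '2023-10-04T08:00:00Z'},
    {'created_at': '2023-10-05T08:00:00Z', 'closed_at': None},
    {'created_at': '2023-10-06T00:00:00Z', 'closed_at': '2023-10-07T00:00:00Z'},
]
metrics = issue_metrics(sample_issues)
print(metrics)
```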

In [5]:
user_activity = requests.get(f'{USER_BASE_URL}/{username}/events', headers=headers)

In [6]:
pp(len(user_activity.json()))

30


The response is a list of dictionaries with the following keys:
```python
for event in user_activity.json():
    print(event.keys())

# Each event has these keys (key :: value type):
#   'id'         :: numeric string
#   'type'       :: str
#   'actor'      :: dict
#   'repo'       :: dict
#   'payload'    :: dict
#   'public'     :: bool
#   'created_at' :: ISO 8601 date string
#   'org'        :: dict (not present on every event)
```

In [7]:
pp(user_activity.json()[0])

{'actor': {'avatar_url': 'https://avatars.githubusercontent.com/u/81225469?',
           'display_login': 'esmond-adjei',
           'gravatar_id': '',
           'id': 81225469,
           'login': 'esmond-adjei',
           'url': 'https://api.github.com/users/esmond-adjei'},
 'created_at': '2023-10-23T22:15:04Z',
 'id': '32786398193',
 'payload': {'action': 'started'},
 'public': True,
 'repo': {'id': 705795945,
          'name': 'esmond-adjei/forage-lyft-starter-repo',
          'url': 'https://api.github.com/repos/esmond-adjei/forage-lyft-starter-repo'},
 'type': 'WatchEvent'}
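
Because every event carries a `type` and a `created_at` field, the frequencies listed at the top of this section can be estimated by counting. The sketch below is a minimal version of that idea. The helper name `event_frequencies` is an assumption, and all sample events except the first are hypothetical.

```python
from collections import Counter

sample_events = [
    {'type': 'WatchEvent', 'created_at': '2023-10-23T22:15:04Z'},  # from the output above
    {'type': 'PushEvent', 'created_at': '2023-10-23T10:00:00Z'},   # hypothetical
    {'type': 'PushEvent', 'created_at': '2023-10-22T09:30:00Z'},   # hypothetical
    {'type': 'PullRequestEvent', 'created_at': '2023-10-22T11:45:00Z'},  # hypothetical
]


def event_frequencies(events):
    """Count events per type and per calendar day (UTC)."""
    by_type = Counter(event['type'] for event in events)
    # The first ten characters of the timestamp are the date, YYYY-MM-DD.
    by_day = Counter(event['created_at'][:10] for event in events)
    return by_type, by_day


by_type, by_day = event_frequencies(sample_events)
print(by_type.most_common())
print(sorted(by_day.items()))
```

With live data, `sample_events` would be replaced by `user_activity.json()`.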


# REPOSITORIES

This analysis provides insights into the repositories of a user, shedding light on the user's interests and skills.

**General**
- total number of repositories
- total number of forks
- total number of stars

**Specifics**
- programming language distribution
- analysis of project topics (e.g. web, desktop, mobile, software engineering, ai/ml, etc.)
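
The general figures and the language distribution can all be derived from the list of repository objects. The sketch below sums stars and forks and counts the primary language of each repository. The helper name `repo_overview` and the last sample entry are hypothetical; the first three entries reuse values from this user's repositories.

```python
from collections import Counter

sample_repos = [
    {'name': 'AirBnB_clone', 'language': 'Python', 'stargazers_count': 0, 'forks_count': 1},
    {'name': 'AirBnB_clone_v2', 'language': 'Python', 'stargazers_count': 0, 'forks_count': 0},
    {'name': 'AirBnB_clone_v3', 'language': 'Python', 'stargazers_count': 0, 'forks_count': 0},
    # Hypothetical entry showing a repository with no detected language.
    {'name': 'notes', 'language': None, 'stargazers_count': 2, 'forks_count': 0},
]


def repo_overview(repos):
    """Aggregate totals and the primary-language distribution."""
    return {
        'total_repos': len(repos),
        'total_stars': sum(repo['stargazers_count'] for repo in repos),
        'total_forks': sum(repo['forks_count'] for repo in repos),
        'languages': Counter(repo['language'] or 'Unknown' for repo in repos),
    }


overview = repo_overview(sample_repos)
print(overview)
```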

**PROJECTS:**

| What to Measure                 | How to Measure It                   | Why It's Important                             |
|--------------------------------|-------------------------------------|-------------------------------------------------|
| Quality of Code                | Use code analysis tools to assess code complexity, duplication, and code coverage. Analyze commit messages for meaningful content. | Ensures the quality and maintainability of their code. |
| Variety of Projects            | Categorize repositories by project type or technology. Quantify the number of different categories. | Shows their ability to work on diverse projects and technologies. |
| Project Completeness           | Analyze commit history for regular updates, bug fixes, and completion of projects. | Reflects their commitment to project completion and maintenance. |
| Documentation                  | Examine README files for length, clarity, and completeness. Perform a quantitative analysis of documentation quality. | Demonstrates their ability to provide clear project documentation. |
| Version Control                | Measure commit frequency, analyze branching strategy, and assess the cleanliness of the commit history. | Shows their ability to maintain version control and work in a team. |
| Coding Style Consistency       | Evaluate adherence to coding style guides using linters. Analyze code style deviations. | Ensures consistency in code style and adherence to coding standards. |
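
For the documentation row, a first quantitative pass over a README can count words, headings, code blocks, and links. These counts say nothing about clarity, so treat them as a screening signal only. The helper name `readme_metrics`, the chosen metrics, and the sample README below are assumptions for illustration.

```python
import re

FENCE = '`' * 3  # a Markdown code fence, built here to avoid nesting fences


def readme_metrics(text):
    """Return simple size and structure counts for a Markdown README."""
    lines = text.splitlines()
    fence_lines = sum(1 for line in lines if line.startswith(FENCE))
    return {
        'words': len(text.split()),
        'headings': sum(1 for line in lines if re.match(r'#{1,6}\s', line)),
        'code_blocks': fence_lines // 2,  # each block has an opening and a closing fence
        'links': len(re.findall(r'\[[^\]]+\]\([^)]+\)', text)),
    }


# Hypothetical README content.
sample_readme = '\n'.join([
    '# Demo Project',
    'A short description with a [link](https://example.com).',
    '## Install',
    FENCE + 'bash',
    'pip install demo',
    FENCE,
    '## Usage',
    'Run the demo.',
])
readme_stats = readme_metrics(sample_readme)
print(readme_stats)
```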


In [8]:
repo_response = requests.get(f'{USER_BASE_URL}/{username}/repos', headers=headers) # len = 30
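
The list endpoints are paginated: a request returns 30 items by default, which is why the length above is 30, and the `per_page` query parameter raises that to at most 100. Further pages are advertised in the `Link` response header. The sketch below extracts the `rel="next"` URL from such a header using only the standard library. The helper name `next_page_url` and the sample header value are illustrative assumptions.

```python
import re


def next_page_url(link_header):
    """Return the URL marked rel="next" in a Link header, or None."""
    if not link_header:
        return None
    for part in link_header.split(','):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="([^"]+)"', part)
        if match and match.group(2) == 'next':
            return match.group(1)
    return None


# Illustrative header value; real URLs and page numbers will differ.
sample_header = (
    '<https://api.github.com/users/esmond-adjei/repos?per_page=100&page=2>; rel="next", '
    '<https://api.github.com/users/esmond-adjei/repos?per_page=100&page=3>; rel="last"'
)
print(next_page_url(sample_header))
```

To collect every repository, one could request the first page, extend a list with the JSON body, and keep following `next_page_url(response.headers.get('Link'))` until it returns `None`.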

In [9]:
# get length and keys of repo data
repo_data_keys = list(repo_response.json()[0].keys()) # len = 79
pp(repo_data_keys)
print('Total number of keys:', len(repo_data_keys))

['id',
 'node_id',
 'name',
 'full_name',
 'private',
 'owner',
 'html_url',
 'description',
 'fork',
 'url',
 'forks_url',
 'keys_url',
 'collaborators_url',
 'teams_url',
 'hooks_url',
 'issue_events_url',
 'events_url',
 'assignees_url',
 'branches_url',
 'tags_url',
 'blobs_url',
 'git_tags_url',
 'git_refs_url',
 'trees_url',
 'statuses_url',
 'languages_url',
 'stargazers_url',
 'contributors_url',
 'subscribers_url',
 'subscription_url',
 'commits_url',
 'git_commits_url',
 'comments_url',
 'issue_comment_url',
 'contents_url',
 'compare_url',
 'merges_url',
 'archive_url',
 'downloads_url',
 'issues_url',
 'pulls_url',
 'milestones_url',
 'notifications_url',
 'labels_url',
 'releases_url',
 'deployments_url',
 'created_at',
 'updated_at',
 'pushed_at',
 'git_url',
 'ssh_url',
 'clone_url',
 'svn_url',
 'homepage',
 'size',
 'stargazers_count',
 'watchers_count',
 'language',
 'has_issues',
 'has_projects',
 'has_downloads',
 'has_wiki',
 'has_pages',
 'has_discussions',
 'fork

**Basic Information:**
`id`, `node_id`, `name`, `full_name`, `private`, `owner`, `html_url`, `description`, `fork`, `url`, `forks_url`, `keys_url`, `collaborators_url`, `teams_url`, `hooks_url`, `issue_events_url`, `events_url`, `assignees_url`, `branches_url`, `tags_url`, `blobs_url`, `git_tags_url`, `git_refs_url`, `trees_url`, `statuses_url`, `languages_url`

**Dates and Timestamps:**
`created_at`, `updated_at`, `pushed_at`

**Access URLs:**
`git_url`, `ssh_url`, `clone_url`, `svn_url`

**Branches and Tags:**
`default_branch`

**Collaboration and Contributors:**
`collaborators_url`, `teams_url`, `hooks_url`, `assignees_url`, `subscribers_url`, `contributors_url`

**Events and Activities:**
`issue_events_url`, `events_url`, `commits_url`, `git_commits_url`, `comments_url`, `issue_comment_url`

**Source Code and Contents:**
`blobs_url`, `contents_url`, `compare_url`, `merges_url`, `archive_url`, `downloads_url`

**Repository Statistics:**
`size`, `stargazers_count`, `watchers_count`, `forks_count`, `open_issues_count`

**Features and Settings:**
`has_issues`, `has_projects`, `has_downloads`, `has_wiki`, `has_pages`, `has_discussions`, `mirror_url`, `archived`, `disabled`, `allow_forking`, `is_template`, `web_commit_signoff_required`, `visibility`

**License Information:**
`license`

**Topics:**
`topics`


In [10]:
pp(repo_response.json()[21])

{'allow_forking': True,
 'archive_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/{archive_format}{/ref}',
 'archived': False,
 'assignees_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/assignees{/user}',
 'blobs_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/git/blobs{/sha}',
 'branches_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/branches{/branch}',
 'clone_url': 'https://github.com/esmond-adjei/mit-deep-learning-lex-fridman.git',
 'collaborators_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/collaborators{/collaborator}',
 'comments_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/comments{/number}',
 'commits_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/commits{/sha}',
 'compare_url': 'https://api.github.com/repos/esmond-adjei/mit-deep-learning-lex-fridman/comp

In [11]:
# project languages
repo_languages = requests.get(repo_response.json()[0]['languages_url'], headers=headers)
print(repo_languages.json())

{'Python': 60336, 'HTML': 19352, 'CSS': 8819}
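
The values in this response are byte counts per language, so they are easier to compare once converted to percentages. The sketch below does that conversion. The helper name `language_shares` and the rounding to one decimal place are choices made for this example.

```python
def language_shares(byte_counts):
    """Convert per-language byte counts into percentage shares."""
    total = sum(byte_counts.values())
    if total == 0:
        return {}
    return {
        language: round(100 * count / total, 1)
        for language, count in byte_counts.items()
    }


# Byte counts taken from the response above.
shares = language_shares({'Python': 60336, 'HTML': 19352, 'CSS': 8819})
for language, share in shares.items():
    print(f'{language}: {share}%')
```

Summing the byte counts across all repositories before converting gives a technology stack distribution for the whole account rather than for a single project.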


In [16]:
from urllib.parse import urlparse, parse_qs

REPO_BASE_URL = 'https://api.github.com/repos'

def get_total_commit(base_url, owner, repo_name, repo_branch):
    # count the commits on a branch: request one commit per page and read
    # the number of the last page from the Link response header
    commit_url = f'{base_url}/{owner}/{repo_name}/commits'
    params = {'sha': repo_branch, 'per_page': 1}
    commit_response = requests.get(commit_url, headers=headers, params=params)
    if commit_response.status_code != 200:
        return "N/A"
    last_page = commit_response.links.get('last')
    if last_page is None:
        # no further pages: the branch has at most one commit
        return len(commit_response.json())
    return int(parse_qs(urlparse(last_page['url']).query)['page'][0])

repo_name = repo_response.json()[0]['name']
repo_branch = repo_response.json()[0]['default_branch']
print(repo_name, repo_branch)
commit = get_total_commit(REPO_BASE_URL, username, repo_name, repo_branch)
print(commit)

AirBnB_clone main

In [13]:
def calculate_duration(timestamp1, timestamp2):
    # calculate duration between two ISO 8601 timestamps in days and hours
    # replace the trailing 'Z' so fromisoformat also works before Python 3.11
    dt1 = datetime.fromisoformat(timestamp1.replace('Z', '+00:00'))
    dt2 = datetime.fromisoformat(timestamp2.replace('Z', '+00:00'))
    duration = dt2 - dt1
    days = duration.days
    hours = duration.seconds // 3600
    formatted_duration = f"{days} days, {hours} hrs"

    return formatted_duration


def tabulate_repo_data(repos):
    # tabulate repo data
    repo_data = []
    for repo in repos:
        repo_info = {
            'Repo Name': repo['name'],
            'Owner': repo['owner']['login'],
            'Default Branch': repo['default_branch'],
            'Language': repo['language'],
            'Created At': repo['created_at'],
            'Updated At': repo['updated_at'],
            'Duration': calculate_duration(repo['created_at'], repo['updated_at']),
            'Stars': repo['stargazers_count'],
            'Forks': repo['forks_count']
        }
        repo_data.append(repo_info)

    table = tabulate(repo_data, headers='keys', tablefmt='pretty')
    print(table)


In [14]:
tabulate_repo_data(repo_response.json())

+-------------------------------+--------------+----------------+------------+----------------------+----------------------+------------------+-------+-------+
|           Repo Name           |    Owner     | Default Branch |  Language  |      Created At      |      Updated At      |     Duration     | Stars | Forks |
+-------------------------------+--------------+----------------+------------+----------------------+----------------------+------------------+-------+-------+
|         AirBnB_clone          | esmond-adjei |      main      |   Python   | 2023-05-09T17:15:55Z | 2023-05-09T17:51:05Z |  0 days, 0 hrs   |   0   |   1   |
|        AirBnB_clone_v2        | esmond-adjei |     master     |   Python   | 2023-09-11T02:24:45Z | 2023-09-11T02:27:35Z |  0 days, 0 hrs   |   0   |   0   |
|        AirBnB_clone_v3        | esmond-adjei |     master     |   Python   | 2023-09-28T17:10:47Z | 2023-09-28T17:20:57Z |  0 days, 0 hrs   |   0   |   0   |
|        AirBnB_clone_v4        | esmond-adjei

# INTERESTS

| What to Measure                 | How to Measure It                   | Why It's Important                             |
|--------------------------------|-------------------------------------|-------------------------------------------------|
| Problem-Solving Skills         | Assess the complexity and effectiveness of solutions implemented in their code. Analyze code changes related to problem-solving tasks. | Indicates their ability to solve coding challenges effectively. |
| Use of Technologies            | Analyze the programming languages and frameworks used in repositories. Quantify technology usage based on commits and file types. | Reflects their proficiency in different technologies and stacks. |
| Learning and Growth            | Analyze repositories for updates, technology changes, and the addition of new features or improvements. | Demonstrates their adaptability and commitment to growth. |
| Adaptability                   | Analyze the variety of programming languages and frameworks used in repositories. Quantify proficiency in each technology. | Reflects their ability to adapt to different tech stacks and learn new skills. |
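
One rough way to quantify the learning and adaptability rows is to record when each language first appears among a user's repositories. The sketch below builds that timeline from the `language` and `created_at` fields. The helper name `first_use_by_language` and the last sample entry are hypothetical, and a repository's creation date is only a proxy for when the person started using the language.

```python
def first_use_by_language(repos):
    """Map each primary language to the earliest repo creation timestamp."""
    first_seen = {}
    for repo in repos:
        language = repo.get('language')
        if not language:
            continue  # skip repos with no detected language
        created = repo['created_at']
        # Timestamps share one fixed format, so string comparison
        # orders them chronologically.
        if language not in first_seen or created < first_seen[language]:
            first_seen[language] = created
    return dict(sorted(first_seen.items(), key=lambda item: item[1]))


sample_repos_by_date = [
    {'language': 'Python', 'created_at': '2023-09-11T02:24:45Z'},
    {'language': 'Python', 'created_at': '2023-05-09T17:15:55Z'},
    # Hypothetical entry for a second language.
    {'language': 'JavaScript', 'created_at': '2023-07-01T12:00:00Z'},
    {'language': None, 'created_at': '2023-01-01T00:00:00Z'},
]
timeline = first_use_by_language(sample_repos_by_date)
for language, first_seen in timeline.items():
    print(language, first_seen)
```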