
# PyGithub examples

PyGithub is a Python library for accessing the GitHub API. It lets you manage GitHub resources such as repositories, user profiles, and organizations from your Python applications.

## Install and import

We install pandas alongside PyGithub to make it easier to work with the extracted data:

In [1]:
!pip install pandas
!pip install PyGithub



In [0]:
import pandas as pd
from github import Github

### First steps



First, create a `Github` instance using either user credentials or an access token:

In [0]:
# using username and password (GitHub's API no longer accepts password authentication, so prefer a token)
g = Github("user", "password")

In [0]:
# or using an access token
g = Github('access_token')
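
A token typed directly into a notebook cell is easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the token is kept in an environment variable: the variable name `GITHUB_TOKEN` and the helpers `load_token` and `make_client` are illustrative choices, not part of PyGithub.

```python
import os


def load_token(env_var="GITHUB_TOKEN"):
    """Return the token stored in env_var, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def make_client(token=None):
    """Build a Github client; without a token the client is unauthenticated."""
    # Imported here so load_token() stays usable even without PyGithub installed.
    from github import Github

    return Github(token) if token else Github()
```

With these helpers, `g = make_client(load_token())` would replace the hard-coded string above.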

To check the current API rate limit:

In [4]:
g.get_rate_limit()

RateLimit(core=Rate(reset=2020-04-29 05:45:25, remaining=4994, limit=5000))
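
The `remaining` and `reset` fields in the output above are enough to decide whether a script should pause before its next request. The helper below is a hypothetical sketch (its name and the `min_remaining` threshold are assumptions). It takes plain values rather than a PyGithub object, because how those fields are reached may differ between PyGithub versions.

```python
def seconds_to_wait(remaining, reset, now, min_remaining=10):
    """Return how many seconds to sleep before the next API call.

    remaining and reset correspond to the fields of the Rate output above.
    now must be a datetime of the same kind (naive or aware) as reset.
    """
    if remaining > min_remaining:
        # Enough calls left in the current window: no need to wait.
        return 0.0
    # Otherwise wait until the window resets (never a negative duration).
    return max(0.0, (reset - now).total_seconds())
```

In the version that produced the output above, the inputs would presumably come from `g.get_rate_limit().core`, and the result can be passed to `time.sleep`.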

To list the repositories of the authenticated user:

In [5]:
for repo in g.get_user().get_repos():
    print(repo.name)

projectL
myunimol-android
tesi
alexa-unit-test
android-bluetooth-testbed
androidtemplate
android_samples
anpr-github-metrics
ant-colony-tsp
api
app-docenti
app-docenti-server
app-frosinone-scuola-calcio
atticus-ecg
bankAccountExample
chart-example
chip2cheek
chip2cheek-api
chip2cheek-app
cordova-plugin-firebase
covid19-bot
crop-style-ase2018
crypto-trading-toolkit-android
cuda-first-test
cwb_multi_thread
diametro-pneumatici
docker-ionic
ecg-clustering
flask
gestione-ricevimenti
hands-on-ml-tutorials
heif
Iliad-Unofficial-API
ionic_electron
keystroke-dynamics
keystroke_dynamics_relazione
keystroke_login
Machine_Learning_Spring_Weka
material-design-icons
metodi-ottimizzazione
microsat
microsat-cuda
MIPHAS-App
miphas-app-sanitario
miphas-dss-echo-server
mirror-vxheaven.org
monero
myunimol-android
node-blog
nodejs-tsw-2017
p-median-python
pannello-frosinone-scuola-calcio
plouse
pso_travelling_salesman
react-native-android-toolbar-example
RepoTEST
ricevimenti-unimol-plugin
ringsat
SmartTabL
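
The listing above contains `myunimol-android` twice. One plausible reason is that the listing can include repositories from more than one owner, in which case printing `repo.full_name` ("owner/name") is less ambiguous than `repo.name`. A small sketch that groups such full names by owner (the helper `group_by_owner` is hypothetical):

```python
from collections import defaultdict


def group_by_owner(full_names):
    """Map each owner to the sorted list of repository names under it."""
    groups = defaultdict(list)
    for full_name in full_names:
        # "owner/name" -> ("owner", "/", "name")
        owner, _, name = full_name.partition("/")
        groups[owner].append(name)
    return {owner: sorted(names) for owner, names in groups.items()}
```

It could be fed with `group_by_owner(repo.full_name for repo in g.get_user().get_repos())`.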

### Search repositories by language
We store a list of `dict` objects so that we can build a pandas `DataFrame` later.
For each repository we collect:
- **full name**, for example "elastic/elasticsearch"
- total number of **commits**
- total number of **issues** that are currently open
- **stars count**, as a very rough measure to estimate how well known a repository is

In [0]:
rows = []
repos = g.search_repositories(query='language:java')
# keep only the first 50 search results
for repo in repos[:50]:
    rows.append({
        'full_name': repo.full_name,
        'commits': repo.get_commits().totalCount,
        'issues': repo.get_issues(state='open').totalCount,
        'stars': repo.stargazers_count,
    })
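
The row-building step can also be factored into small functions, which makes it testable without touching the network. This is a sketch under the assumption that each repository object exposes the same attributes used in the loop above. The names `repo_to_row` and `collect_rows` are hypothetical.

```python
from itertools import islice


def repo_to_row(repo):
    """Build one row from an object exposing the attributes used above."""
    return {
        "full_name": repo.full_name,
        # Each totalCount lookup typically costs an extra API request.
        "commits": repo.get_commits().totalCount,
        "issues": repo.get_issues(state="open").totalCount,
        "stars": repo.stargazers_count,
    }


def collect_rows(repos, limit=50):
    """Convert at most `limit` repositories from any iterable into rows."""
    return [repo_to_row(repo) for repo in islice(repos, limit)]
```

Because every row triggers several requests, keeping `limit` small helps stay within the rate limit shown earlier.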

Using `sort_values`, we sort the `DataFrame` by number of commits in descending order:

In [7]:
df = pd.DataFrame(rows)
df.sort_values(by=["commits"], ascending=False, inplace=True)
df.head()

| | full_name | commits | issues | stars |
|---|---|---|---|---|
| 3 | elastic/elasticsearch | 52105 | 2757 | 48522 |
| 31 | jenkinsci/jenkins | 29856 | 59 | 15364 |
| 4 | spring-projects/spring-boot | 26324 | 472 | 47289 |
| 32 | bazelbuild/bazel | 25743 | 2256 | 14573 |
| 46 | apache/flink | 21527 | 520 | 12752 |
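
Ranking by a different column follows the same idea. As a dependency-free sketch, the five rows shown above can be ranked with plain `sorted`. The helper `top_by` is hypothetical, and `sample_rows` simply copies the values from the output.

```python
sample_rows = [
    {"full_name": "elastic/elasticsearch", "commits": 52105, "issues": 2757, "stars": 48522},
    {"full_name": "jenkinsci/jenkins", "commits": 29856, "issues": 59, "stars": 15364},
    {"full_name": "spring-projects/spring-boot", "commits": 26324, "issues": 472, "stars": 47289},
    {"full_name": "bazelbuild/bazel", "commits": 25743, "issues": 2256, "stars": 14573},
    {"full_name": "apache/flink", "commits": 21527, "issues": 520, "stars": 12752},
]


def top_by(rows, key, n=3):
    """Return the n rows with the largest value for key, largest first."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:n]


# Rank by stars instead of commits and print the two best-known repositories.
for row in top_by(sample_rows, "stars", n=2):
    print(row["full_name"], row["stars"])
```

With a `DataFrame`, the equivalent would be a call such as `df.sort_values(by=["stars"], ascending=False).head(2)`.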


We take the name of the repository with the most commits:

In [9]:
first_repo = df["full_name"].iloc[0]

first_repo

'elastic/elasticsearch'

### Working with repos

In [10]:
repository = g.get_repo(first_repo)
dir(repository)

['CHECK_AFTER_INIT_FLAG',
 '_CompletableGithubObject__complete',
 '_CompletableGithubObject__completed',
 '_GithubObject__makeSimpleAttribute',
 '_GithubObject__makeSimpleListAttribute',
 '_GithubObject__makeTransformedAttribute',
 '_Repository__create_pull',
 '_Repository__create_pull_1',
 '_Repository__create_pull_2',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_allow_merge_commit',
 '_allow_rebase_merge',
 '_allow_squash_merge',
 '_archive_url',
 '_archived',
 '_assignees_url',
 '_blobs_url',
 '_branches_url',
 '_clone_url',
 '_collaborators_url',
 '_comments_url',
 '_commits_url',
 '_compare_url',
 '_completeIfNeeded',
 '_completeIfNotSet',
 '_contents_url',
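
Most of the `dir()` output consists of private and dunder names. A short sketch for listing only the public ones (the helper `public_attributes` is hypothetical and works on any Python object):

```python
def public_attributes(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]
```

Calling `public_attributes(repository)` would keep names such as `CHECK_AFTER_INIT_FLAG` and drop entries like `_archive_url`.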
 

To list the top-level contents of a repository (`get_contents("")` returns the entries of the root directory only):

In [11]:
for content_file in repository.get_contents(""):
  print(content_file)

ContentFile(path=".ci")
ContentFile(path=".dir-locals.el")
ContentFile(path=".editorconfig")
ContentFile(path=".gitattributes")
ContentFile(path=".github")
ContentFile(path=".gitignore")
ContentFile(path=".idea")
ContentFile(path="CONTRIBUTING.md")
ContentFile(path="LICENSE.txt")
ContentFile(path="NOTICE.txt")
ContentFile(path="README.asciidoc")
ContentFile(path="TESTING.asciidoc")
ContentFile(path="Vagrantfile")
ContentFile(path="benchmarks")
ContentFile(path="build.gradle")
ContentFile(path="buildSrc")
ContentFile(path="client")
ContentFile(path="dev-tools")
ContentFile(path="distribution")
ContentFile(path="docs")
ContentFile(path="gradle.properties")
ContentFile(path="gradle")
ContentFile(path="gradlew")
ContentFile(path="gradlew.bat")
ContentFile(path="libs")
ContentFile(path="licenses")
ContentFile(path="modules")
ContentFile(path="plugins")
ContentFile(path="qa")
ContentFile(path="rest-api-spec")
ContentFile(path="server")
ContentFile(path="settings.gradle")
ContentFile(path="te
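
Reaching files inside subdirectories requires calling `get_contents` again for each directory. The following is a minimal breadth-first sketch, assuming entries expose `type` and `path` as `ContentFile` objects do. The helper `walk_files` and its `max_files` cap are hypothetical, and the cap is there because every directory costs one API request.

```python
from collections import deque


def walk_files(repo, root="", max_files=100):
    """Return paths of up to max_files non-directory entries below root."""
    paths = []
    pending = deque([root])
    while pending and len(paths) < max_files:
        entries = repo.get_contents(pending.popleft())
        # A file path yields a single object, a directory yields a list.
        if not isinstance(entries, list):
            entries = [entries]
        for entry in entries:
            if entry.type == "dir":
                # Visit this directory later.
                pending.append(entry.path)
            elif len(paths) < max_files:
                paths.append(entry.path)
    return paths
```

For a large repository, a call such as `walk_files(repository, root="docs", max_files=20)` keeps the number of requests small.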

### Working with commits
To inspect commits, first take the most recent one from `get_commits()`, then fetch it by its commit id (SHA):

In [12]:
repository.get_commits()[0]

Commit(sha="c322f3f4d575db1c3eaaa73d1924ffdbfb81c435")

In [13]:
commit = repository.get_commit("c322f3f4d575db1c3eaaa73d1924ffdbfb81c435").commit

print(commit.message, commit.author.name, commit.author.date, commit.url, sep="\n")

json spec - add description for autoscaling (#55748)
Jake Landis
2020-04-28 22:36:20
https://api.github.com/repos/elastic/elasticsearch/git/commits/c322f3f4d575db1c3eaaa73d1924ffdbfb81c435
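
When many commits are processed, it can be convenient to reduce each one to a plain `dict`, for example to build a `DataFrame`. This sketch assumes an object with the same fields printed above (`message`, `author.name`, `author.date`, `url`). The helper `commit_summary` is hypothetical.

```python
def commit_summary(commit):
    """Summarize an object exposing message, author.name, author.date and url."""
    # Keep only the first line of the message (the subject line).
    subject = commit.message.splitlines()[0] if commit.message else ""
    return {
        "subject": subject,
        "author": commit.author.name,
        "date": commit.author.date.isoformat(),
        "url": commit.url,
    }
```

It would be called as `commit_summary(commit)` on the `commit` object fetched above.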


## Resources

- [PyGithub docs](https://pygithub.readthedocs.io/en/latest/introduction.html)

- [PyGithub classes](https://pygithub.readthedocs.io/en/latest/github_objects.html)

- [GitHub APIs](https://developer.github.com/v3/)