# Description

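This notebook tests the PyGithub package by reading a GitHub repository that contains a Manubot-based manuscript: it lists the files changed by a pull request and retrieves the content of one file before and after the change.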

# Modules

In [1]:
from github import Auth, Github
from IPython.display import display
from proj import conf

# Settings/paths

In [2]:
REPO = "pivlab/manubot-ai-editor-code-test-mutator-epistasis-manuscript"
# PR 2: gpt-3.5-turbo
# PR 3: gpt-4-0125-preview
PR = 3

# Get Repo

In [3]:
auth = Auth.Token(conf.github.API_TOKEN)

In [4]:
g = Github(auth=auth)

In [5]:
repo = g.get_repo(REPO)

# Get Pull Request

In [6]:
pr = repo.get_pull(PR)

In [7]:
list(pr.get_files())

[File(sha="5e9d3cccc415ea4275cda3896a06c942bb46cb9e", filename="content/01.abstract.md"),
 File(sha="dd47994278c9db83acb8446c7a27ba02d9d8ac98", filename="content/02.introduction.md"),
 File(sha="8ad1922ad244491463719dec68a8bb7636acbcba", filename="content/03.results.md"),
 File(sha="13e0461ce5fb268620cc4187f570e876b99b9a8b", filename="content/04.discussion.md"),
 File(sha="bd81f3473f5d9720a4c33eeebec55a8f31dc1732", filename="content/05.methods.md")]

In [8]:
pr_commits = list(pr.get_commits())

In [9]:
pr_commits[0].parents

[Commit(sha="5c98a032259b13a812dc490a7a13edfa78768ba9")]

In [10]:
# Parent of the first PR commit: the state of the repository before the PR
pr_prev = pr_commits[0].parents[0].sha
print(pr_prev)

5c98a032259b13a812dc490a7a13edfa78768ba9


In [11]:
# SHA of the first PR commit (this is also the PR head only when the PR has a single commit)
pr_curr = pr_commits[0].sha
print(pr_curr)

e72330a56ef55de0c390fb1e8a68310663d6c834
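
The cells above take both SHAs from the first commit of the pull request, which covers the whole PR only when it contains a single commit. A minimal sketch of a more general helper follows, assuming the commit list is ordered oldest to newest. The function name `pr_commit_range` and the `SimpleNamespace` stand-ins are hypothetical, used only so the example runs without network access. Real PyGithub commit objects expose the same `.sha` and `.parents` attributes used here.

```python
from types import SimpleNamespace


def pr_commit_range(commits):
    """Return (base_sha, head_sha) for an ordered list of PR commits.

    base_sha is the first parent of the oldest commit (state before the PR);
    head_sha is the newest commit (state after the PR).
    """
    if not commits:
        raise ValueError("the pull request has no commits")
    first, last = commits[0], commits[-1]
    if not first.parents:
        raise ValueError("the first commit has no parent")
    return first.parents[0].sha, last.sha


# Hypothetical stand-ins for commit objects (only .sha and .parents are used).
base = SimpleNamespace(sha="aaa", parents=[])
c1 = SimpleNamespace(sha="bbb", parents=[base])
c2 = SimpleNamespace(sha="ccc", parents=[c1])

print(pr_commit_range([c1, c2]))  # ('aaa', 'ccc')
```

In the notebook, the call would be `pr_commit_range(pr_commits)`. For a single-commit PR it returns the same two SHAs printed above.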


# Get file list

In [12]:
pr_files = [f for f in pr.get_files() if f.filename.endswith(".md")]
display(pr_files)

[File(sha="5e9d3cccc415ea4275cda3896a06c942bb46cb9e", filename="content/01.abstract.md"),
 File(sha="dd47994278c9db83acb8446c7a27ba02d9d8ac98", filename="content/02.introduction.md"),
 File(sha="8ad1922ad244491463719dec68a8bb7636acbcba", filename="content/03.results.md"),
 File(sha="13e0461ce5fb268620cc4187f570e876b99b9a8b", filename="content/04.discussion.md"),
 File(sha="bd81f3473f5d9720a4c33eeebec55a8f31dc1732", filename="content/05.methods.md")]

# Get file content

In [13]:
pr_filename = pr_files[1].filename
display(pr_filename)

'content/02.introduction.md'

In [14]:
print(repo.get_contents(pr_filename, ref=pr_prev).decoded_content.decode("utf-8"))

## Introduction

Germline mutation rates reflect the complex interplay between DNA proofreading and repair pathways, exogenous sources of DNA damage, and life-history traits. 
For example, parental age is an important determinant of mutation rate variability; in many mammalian species, the number of germline *de novo* mutations observed in offspring increases as a function of paternal and maternal age [@PMID:28959963;@PMID:31549960;@PMID:35771663;@PMID:32804933;@PMID:31492841].
Rates of germline mutation accumulation are also variable across human families [@PMID:26656846;@PMID:31549960], likely due to either genetic variation or differences in environmental exposures.
Although numerous protein-coding genes contribute to the maintenance of genome integrity, genetic variants that increase germline mutation rates, known as *mutator alleles*, have proven difficult to discover in mammals.

The dearth of observed germline mutators in mammalian genomes is not necessarily surprising, since al

In [15]:
print(repo.get_contents(pr_filename, ref=pr_curr).decoded_content.decode("utf-8"))

## Introduction

The rate at which mutations accumulate in the germline, the cells responsible for passing genetic material to offspring, is influenced by a complex mix of factors.
These include the mechanisms that correct DNA errors, repair damaged DNA, external factors that can harm DNA, and various life-history characteristics.
One significant factor affecting mutation rate variability is the age of the parents.
Studies have shown that the number of new mutations found in the offspring tends to increase with both the father's and mother's age across many mammal species (references).
Additionally, the rate at which mutations accumulate can differ significantly between human families (references), which may be attributed to genetic differences or varying environmental factors that individuals are exposed to.
While a large number of genes are known to play a role in protecting the integrity of our genetic information, identifying specific genetic variations, known as mutator alleles, t
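
The two revisions printed above can also be compared line by line instead of read side by side. Because the manuscript keeps one sentence per line, a line-based diff shows which sentences changed. Below is a minimal sketch using `difflib` from the Python standard library. The helper name `diff_revisions` and the short placeholder texts are assumptions for illustration, not the actual file contents.

```python
import difflib


def diff_revisions(old_text, new_text, filename):
    """Return a unified diff (as one string) between two revisions of a file."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{filename} (before)",
        tofile=f"{filename} (after)",
        lineterm="",
    )
    return "\n".join(diff)


# Placeholder texts; in the notebook these would be the two decoded file contents.
old_text = "## Introduction\n\nFirst sentence.\nSecond sentence.\n"
new_text = "## Introduction\n\nFirst sentence, revised.\nSecond sentence.\n"

print(diff_revisions(old_text, new_text, "content/02.introduction.md"))
```

In the notebook, `old_text` and `new_text` would be the strings decoded from `repo.get_contents(...)` at `pr_prev` and `pr_curr`. Identical inputs produce an empty string.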

# Close connections

In [16]:
g.close()