# Description

This notebook tests the PyGithub package by reading the files changed in a pull request of a GitHub repository that contains a Manubot-based manuscript.

# Modules

In [1]:
from github import Auth, Github
from IPython.display import display
from proj import conf

# Settings/paths

In [2]:
REPO = "pivlab/manubot-ai-editor-code-test-phenoplier-manuscript"
# Pull request number to read; the notes below record the model associated with each PR
# PR 2: gpt-3.5-turbo
# PR 3: gpt-4-0125-preview
PR = 3

# Get repository

In [3]:
auth = Auth.Token(conf.github.API_TOKEN)

In [4]:
g = Github(auth=auth)

In [5]:
repo = g.get_repo(REPO)

# Get pull request

In [6]:
pr = repo.get_pull(PR)

In [7]:
list(pr.get_files())

[File(sha="e77cc7a085cf8d2f05e5c8d5dd94fe149961e335", filename="content/01.abstract.md"),
 File(sha="f67a7e41a5081e7feadff4f53f3bcfc70db8e143", filename="content/02.introduction.md"),
 File(sha="c34e130a675bee917274480f27389cb0f3aa1c3c", filename="content/04.05.00.results_framework.md"),
 File(sha="9ad7ef6f5a4cc2288c86a8807f52b9630baff667", filename="content/05.discussion.md"),
 File(sha="bb7fbd5889a3ce1866a95f3bce2d92778779787b", filename="content/07.00.methods.md"),
 File(sha="726098813744ee5919c17a2db2b669dc2db1df73", filename="content/50.00.supplementary_material.md")]

In [8]:
pr_commits = list(pr.get_commits())

In [9]:
pr_commits[0].parents

[Commit(sha="523b22bc18a901c5583a5d321d3ce35a30a3b8de")]

In [10]:
# Parent of the first PR commit: the repository state just before the PR's changes
pr_prev = pr_commits[0].parents[0].sha
print(pr_prev)

523b22bc18a901c5583a5d321d3ce35a30a3b8de


In [11]:
# First commit of the PR (this equals the PR head only if the PR has a single commit)
pr_curr = pr_commits[0].sha
print(pr_curr)

9a735ab6bb05c3ce05a2a7ed6dd268489c9b9c29
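
Cells 10 and 11 both read from `pr_commits[0]`, which covers the whole pull request only when it contains a single commit. The sketch below shows one way to generalize this. It assumes that `list(pr.get_commits())` is ordered oldest to newest and that each commit exposes `.sha` and `.parents`, as in the outputs above. `pr_commit_range` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for PyGithub commits so the example runs without network access.

```python
from types import SimpleNamespace


def pr_commit_range(commits):
    """Return (sha before the PR, sha of the PR's last commit).

    `commits` is assumed to be ordered oldest to newest; each item
    needs a `.sha` string and a `.parents` list of commit-like objects.
    """
    if not commits:
        raise ValueError("the pull request has no commits")
    first, last = commits[0], commits[-1]
    if not first.parents:
        raise ValueError("the first commit has no parent")
    return first.parents[0].sha, last.sha


# Stand-in commits (placeholder SHAs, not taken from the repository).
base = SimpleNamespace(sha="aaa", parents=[])
c1 = SimpleNamespace(sha="bbb", parents=[base])
c2 = SimpleNamespace(sha="ccc", parents=[c1])

prev_sha, curr_sha = pr_commit_range([c1, c2])
print(prev_sha, curr_sha)
```

With the real objects, the call would be `pr_commit_range(pr_commits)`. For a single-commit pull request it returns the same pair as cells 10 and 11.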


# Get file list

In [12]:
# Keep only the Markdown files changed by the pull request
pr_files = [f for f in pr.get_files() if f.filename.endswith(".md")]
display(pr_files)

[File(sha="e77cc7a085cf8d2f05e5c8d5dd94fe149961e335", filename="content/01.abstract.md"),
 File(sha="f67a7e41a5081e7feadff4f53f3bcfc70db8e143", filename="content/02.introduction.md"),
 File(sha="c34e130a675bee917274480f27389cb0f3aa1c3c", filename="content/04.05.00.results_framework.md"),
 File(sha="9ad7ef6f5a4cc2288c86a8807f52b9630baff667", filename="content/05.discussion.md"),
 File(sha="bb7fbd5889a3ce1866a95f3bce2d92778779787b", filename="content/07.00.methods.md"),
 File(sha="726098813744ee5919c17a2db2b669dc2db1df73", filename="content/50.00.supplementary_material.md")]
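
The filter in cell 12 keeps every Markdown file, including any that the pull request deleted; a deleted file could not be read at the pull request's commit later on. The sketch below adds a status check as one possible safeguard. It assumes each file object has `.filename` and `.status` attributes, with `"removed"` marking deleted files. `select_manuscript_files` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for PyGithub files.

```python
from types import SimpleNamespace


def select_manuscript_files(files, suffix=".md", skip_statuses=("removed",)):
    """Keep files whose name ends with `suffix` and whose status is not skipped."""
    return [
        f
        for f in files
        if f.filename.endswith(suffix) and f.status not in skip_statuses
    ]


# Stand-in file objects; names other than the first are placeholders.
example_files = [
    SimpleNamespace(filename="content/01.abstract.md", status="modified"),
    SimpleNamespace(filename="content/images/figure.svg", status="modified"),
    SimpleNamespace(filename="content/99.old_section.md", status="removed"),
]

kept = select_manuscript_files(example_files)
print([f.filename for f in kept])
```

With the real objects, the call would be `select_manuscript_files(pr.get_files())`.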

# Get file content

In [13]:
pr_filename = pr_files[2].filename
display(pr_filename)

'content/04.05.00.results_framework.md'

In [14]:
# File content before the pull request
print(repo.get_contents(pr_filename, ref=pr_prev).decoded_content.decode("utf-8"))

### PhenoPLIER: an integration framework based on gene co-expression patterns

![
**Schematic of the PhenoPLIER framework.**
**a)** High-level schematic of PhenoPLIER (a gene module-based method) in the context of TWAS (single-gene) and GWAS (single-variant).
In GWAS, we identify variants associated with traits.
In TWAS, first, we identify variants that are associated with gene expression levels (eQTLs); then, prediction models based on eQTLs are used to impute gene expression, which is used to compute gene-trait associations.
Resources such as LINCS L1000 provide information about how a drug perturbs gene expression; at the bottom-right corner, we show how a drug downregulates two genes (A and C).
In PhenoPLIER, these data types are integrated using groups of genes co-expressed across one or more conditions (such as cell types) that we call gene modules or latent variables/LVs. Created with BioRender.com.
**b)** The integration process in PhenoPLIER uses low-dimensional representation

In [15]:
# File content at the pull request's commit
print(repo.get_contents(pr_filename, ref=pr_curr).decoded_content.decode("utf-8"))

### PhenoPLIER: an integration framework based on gene co-expression patterns

![
**Schematic of the PhenoPLIER framework.**
**a)** High-level schematic of PhenoPLIER (a gene module-based method) in the context of TWAS (single-gene) and GWAS (single-variant).
In GWAS, we identify variants associated with traits.
In TWAS, first, we identify variants that are associated with gene expression levels (eQTLs); then, prediction models based on eQTLs are used to impute gene expression, which is used to compute gene-trait associations.
Resources such as LINCS L1000 provide information about how a drug perturbs gene expression; at the bottom-right corner, we show how a drug downregulates two genes (A and C).
In PhenoPLIER, these data types are integrated using groups of genes co-expressed across one or more conditions (such as cell types) that we call gene modules or latent variables/LVs. Created with BioRender.com.
**b)** The integration process in PhenoPLIER uses low-dimensional representation
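
The two printed versions are long and hard to compare by eye. A minimal sketch of one way to surface only the changed lines uses the standard-library `difflib` module. `diff_versions` is a hypothetical helper, and the two example strings are placeholders, not text from the manuscript.

```python
import difflib


def diff_versions(before, after, filename="file"):
    """Return a unified diff of two text versions as a single string.

    The result is an empty string when both versions are identical.
    """
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{filename} (before)",
        tofile=f"{filename} (after)",
        lineterm="",
    )
    return "\n".join(lines)


# Placeholder texts standing in for the two decoded file contents.
old_text = "First sentence.\nSecond sentence, old wording."
new_text = "First sentence.\nSecond sentence, new wording."

print(diff_versions(old_text, new_text, "example.md"))
```

In the notebook, `before` and `after` would be the decoded strings from cells 14 and 15, and `filename` would be `pr_filename`.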

# Close connections

In [16]:
g.close()
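
If an earlier cell raises an error, the explicit `g.close()` call above never runs. One way to guarantee the cleanup is `contextlib.closing`, which works with any object that has a `close()` method. The sketch below is an illustration under that assumption: `run_with_client` is a hypothetical helper, and `FakeClient` is a stand-in so the example runs without a token or network access.

```python
from contextlib import closing


def run_with_client(make_client, action):
    """Create a client, run action(client), and always close the client."""
    with closing(make_client()) as client:
        return action(client)


class FakeClient:
    """Stand-in for a client object that exposes close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake = FakeClient()
result = run_with_client(lambda: fake, lambda client: "done")
print(result, fake.closed)
```

With PyGithub, the factory could be `lambda: Github(auth=auth)` and the action a function that performs the repository reads shown in this notebook.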