<a href="https://colab.research.google.com/github/agonist11/colabadmixtools/blob/main/ColabADMIXTOOLS_Quick_Start_V5_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **ColabADMIXTOOLS Quick-Start Notebook V5.1**
---
Special thanks to `Florio` and the community for contributions, testing, and documentation.

> Revised 08-03-2025



<a href="https://github.com/agonist11/colabadmixtools" target="_blank">**Check out the GitHub with other tools**</a>


---

## **What this notebook does**

- Mounts your Google Drive and creates Drive-backed folders for R libraries and AADR data `[Section A]`
- Installs or reinstalls ADMIXTOOLS 2 (the `admixtools` R package) into your Drive-backed R library  
- Downloads AADR v62 from Dataverse (or unzips your uploaded archive) into Drive  
- Provides a reusable cell to run **qpAdm** via `rpy2` `[Section B]`
- Persists everything on Drive so you can restart the Colab VM and pick up right where you left off  
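
To decide whether Section A can be skipped, a small check of the Drive folders can help. The sketch below is illustrative only: `setup_status` and `recommended_section` are hypothetical helper names, and it assumes the `R_libs` and `AADR` folder layout used by the cells in this notebook. It relies on the fact that an installed R package keeps a `DESCRIPTION` file inside its own folder.

```python
import os

def setup_status(root):
    """Report which parts of the Drive-backed setup already exist under `root`."""
    r_libs = os.path.join(root, "R_libs")
    aadr = os.path.join(root, "AADR")
    # An installed R package keeps a DESCRIPTION file in its own folder.
    has_pkg = os.path.isfile(os.path.join(r_libs, "admixtools", "DESCRIPTION"))
    # Assumption: any regular file inside AADR/ counts as an earlier download.
    has_data = os.path.isdir(aadr) and any(
        os.path.isfile(os.path.join(aadr, name)) for name in os.listdir(aadr)
    )
    return {"admixtools_installed": has_pkg, "aadr_present": has_data}

def recommended_section(status):
    """Return 'B' only when both the package and the data are already on Drive."""
    return "B" if all(status.values()) else "A"
```

After mounting Drive, `recommended_section(setup_status('/content/drive/MyDrive/colabadmixtools'))` returns `'A'` or `'B'`. It does not confirm that the download was complete, only that something is there.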

---

## **Customize your own analyses**

For all available admixtools2 functions and parameters, see the official reference and build your own workflows:  
<a href="https://uqrmaie1.github.io/admixtools/reference/index.html" target="_blank">**admixtools2 Reference Documentation**</a>

Create a copy of this Notebook to save your changes.

---

Feel free to fork or star the repo for your own datasets and tweaks:  
<a href="https://github.com/agonist11/colabadmixtools" target="_blank">https://github.com/agonist11/colabadmixtools</a>


## **[Section A] First-Time Users or Fresh Re-Installations Only**

In [None]:
#@title **1. Mount Google Drive & Prepare Folders**
from google.colab import drive
import os, shutil

drive.mount('/content/drive', force_remount=True)

ROOT   = '/content/drive/MyDrive/colabadmixtools'
R_LIBS = os.path.join(ROOT, 'R_libs')
AADR   = os.path.join(ROOT, 'AADR')

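os.makedirs(ROOT, exist_ok=True)
# Fresh install: wipe any previous R library so packages are rebuilt from scratch
if os.path.exists(R_LIBS):
    shutil.rmtree(R_LIBS)
os.makedirs(R_LIBS, exist_ok=True)
os.makedirs(AADR, exist_ok=True)  # the AADR folder is reported below, so create it here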

os.environ['R_LIBS_USER'] = R_LIBS
print(f"Prepared:\n • {R_LIBS}\n • {AADR}")


In [None]:
#@title **2. Install admixtools (v2) into Google Drive**
%%bash
# install remotes if missing
if ! Rscript -e "quit(status = if (!requireNamespace('remotes', quietly=TRUE)) 1 else 0)"; then
  Rscript -e "install.packages('remotes', repos='https://cloud.r-project.org')"
fi

# install the correct GitHub repo (it's 'uqrmaie1/admixtools', not 'admixtools2')
Rscript -e "remotes::install_github('uqrmaie1/admixtools',
                                   dependencies=TRUE,
                                   force=TRUE,
                                   lib=Sys.getenv('R_LIBS_USER'))"
echo "admixtools installed into $R_LIBS_USER"


In [None]:
#@title **3. Download AADR v62 from Dataverse into Google Drive**
import os, shutil, requests

ROOT = '/content/drive/MyDrive/colabadmixtools'
AADR = os.path.join(ROOT, 'AADR')

if os.path.exists(AADR): shutil.rmtree(AADR)
os.makedirs(AADR, exist_ok=True)

DATAVERSE = "https://dataverse.harvard.edu"
PID        = "doi:10.7910/DVN/FFIDCW"
meta_url   = f"{DATAVERSE}/api/datasets/:persistentId/?persistentId={PID}"

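resp  = requests.get(meta_url, timeout=60)
resp.raise_for_status()  # fail early on an HTTP error instead of parsing an error page
md    = resp.json()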
files = md['data']['latestVersion']['files']
print(f"→ {len(files)} file(s) found. Downloading…")

for f in files:
    fid   = f['dataFile']['id']
    name  = f.get('label', str(fid))
    outp  = os.path.join(AADR, name)
    dlurl = f"{DATAVERSE}/api/access/datafile/{fid}"
    print("Downloading", name)
    with requests.get(dlurl, stream=True) as r:
        r.raise_for_status()
        with open(outp, 'wb') as fp:
            for chunk in r.iter_content(chunk_size=1 << 20):  # 1 MiB chunks: fewer, larger writes for multi-gigabyte files
                fp.write(chunk)

print(f"AADR v62 downloaded to {AADR}")
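
Large downloads to Drive can be interrupted, so it may be worth comparing what landed on disk with the dataset metadata before moving on. The following is a minimal sketch, not part of the original notebook: `incomplete_downloads` is a hypothetical helper, and it assumes each `dataFile` entry in the Dataverse metadata carries a `filesize` field in bytes. Check the JSON returned for your dataset before relying on it.

```python
import os

def incomplete_downloads(files, out_dir):
    """Return names of files that are missing or whose size differs from the metadata."""
    problems = []
    for entry in files:
        data_file = entry["dataFile"]
        # Same naming rule as the download cell: label, else the numeric file id.
        name = entry.get("label", str(data_file["id"]))
        # ASSUMPTION: 'filesize' holds the size in bytes; skip the size check if absent.
        expected = data_file.get("filesize")
        path = os.path.join(out_dir, name)
        if not os.path.isfile(path):
            problems.append(name)
        elif expected is not None and os.path.getsize(path) != expected:
            problems.append(name)
    return problems
```

Run right after the download cell, `incomplete_downloads(files, AADR)` returns an empty list when every file is present with the expected size. Any names it returns are candidates for re-downloading.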


## **[Section B] Returning Users Only**

In [None]:
#@title **4. Mount Google Drive & set R library path**
from google.colab import drive
import os

# 1) Mount your Drive
drive.mount('/content/drive', force_remount=True)

# 2) Point R to your persistent library (do NOT delete this folder here)
R_LIBS = '/content/drive/MyDrive/colabadmixtools/R_libs'
os.makedirs(R_LIBS, exist_ok=True)

# 3) Export so R sees it
os.environ['R_LIBS_USER'] = R_LIBS

print("R_LIBS_USER →", R_LIBS)


In [None]:
#@title **5. Run qpAdm with your existing install**
prefix     = "/content/drive/MyDrive/colabadmixtools/AADR/v62.0_HO_public"  #@param {type:"string"}
target     = "MXL.DG"                                                     #@param {type:"string"}
left_pops  = "IBS.DG,Nahua.DG,Yoruba.DG"                                   #@param {type:"string"}
right_pops = "Mbuti.DG,Russia_UstIshim_IUP.DG,Russia_Kostenki14_UP.SG,Georgia_Dzudzuana_UP.SG,Russia_Sidelkino_HG.SG,Israel_Natufian.AG,Russia_MA1_UP.SG,Brazil_LocaDoSuin_Sambaqui_9100BP.AG,Switzerland_Bichon_Epipaleolithic.SG"  #@param {type:"string"}

# parse the comma-separated lists, dropping empty entries (e.g. from a trailing comma)
left  = [p.strip() for p in left_pops.split(',') if p.strip()]
right = [p.strip() for p in right_pops.split(',') if p.strip()]

# import rpy2
import rpy2.robjects as ro
from rpy2.robjects import StrVector

# 1) Tell R to look in your Drive-backed library first
ro.r('.libPaths(c(Sys.getenv("R_LIBS_USER"), .libPaths()))')

# 2) Load admixtools from that folder
ro.r('library(admixtools, quietly=TRUE)')

# 3) Push your Python variables into R
ro.globalenv['prefix'] = prefix
ro.globalenv['left']   = StrVector(left)
ro.globalenv['right']  = StrVector(right)
ro.globalenv['target'] = target

# 4) Run qpAdm
res = ro.r('qpadm(prefix, left, right, target=target, allsnps=TRUE, verbose=TRUE)')
print(res)
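
A misspelled population label is a common reason for a qpAdm run to fail, so it can help to check the labels against the `.ind` file first. The sketch below is an illustration under stated assumptions: `read_ind_populations` and `check_qpadm_inputs` are hypothetical helpers, the `.ind` file is assumed to be in EIGENSTRAT layout (sample ID, sex, population label, separated by whitespace), and the target, left and right sets are assumed to be disjoint, as in the usual qpAdm setup.

```python
def read_ind_populations(ind_path):
    """Collect the population labels (third column) from an EIGENSTRAT .ind file."""
    pops = set()
    with open(ind_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 3:  # skip blank or malformed lines
                pops.add(fields[2])
    return pops

def check_qpadm_inputs(target, left, right, available):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    requested = [target] + list(left) + list(right)
    missing = sorted({p for p in requested if p not in available})
    if missing:
        problems.append("not in .ind file: " + ", ".join(missing))
    overlap = sorted(set(left) & set(right))
    if overlap:
        problems.append("in both left and right: " + ", ".join(overlap))
    if target in left or target in right:
        problems.append("target also listed as a source or outgroup: " + target)
    return problems
```

With the variables from the cell above, `check_qpadm_inputs(target, left, right, read_ind_populations(prefix + ".ind"))` could be called before `qpadm`. The check only looks at labels and says nothing about sample counts or SNP coverage.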


## **Tools (In Progress)**

In [None]:
#@title **Tool 1: Upload ZIP via File Browser & Unzip to `/content/`**
#@markdown Slower: the archive is uploaded through the browser.
from google.colab import files
import zipfile, os

# 1. Prompt user to upload a ZIP file
uploaded = files.upload()

# 2. Find the first ZIP in the uploads
zip_files = [name for name in uploaded.keys() if name.lower().endswith('.zip')]
if not zip_files:
    print("No .zip file uploaded. Please upload a ZIP and re-run this cell.")
else:
    zip_path = zip_files[0]
    # 3. Unzip into /content/
    with zipfile.ZipFile(zip_path, 'r') as z:
        z.extractall('/content/')
    print(f"Unpacked {zip_path} → /content/")


In [None]:
#@title **Tool 2: Mount Google Drive & Unzip to `/content/`**
#@markdown Faster: the archive is read directly from Google Drive.
zip_file_path = "/content/drive/MyDrive/colabadmixtools/Florio_mergedHO.zip"  #@param {type:"string"}

from google.colab import drive
import zipfile, os

# 1. Mount your Drive
drive.mount('/content/drive', force_remount=True)

# 2. Validate & unzip into the Colab runtime root
if os.path.isfile(zip_file_path) and zip_file_path.lower().endswith('.zip'):
    with zipfile.ZipFile(zip_file_path, 'r') as z:
        z.extractall('/content/')
    print(f"Unpacked {os.path.basename(zip_file_path)} → /content/")
else:
    print(f"ZIP file not found at:\n  {zip_file_path}\nPlease check the path and try again.")
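
After unzipping, the next step is usually to work out which path to paste into the `prefix` field of the qpAdm cell. As a rough sketch (`find_dataset_prefixes` is a hypothetical helper, and it assumes the archive contains EIGENSTRAT-style `.geno`/`.snp`/`.ind` triplets), the extracted folder can be scanned for complete file sets:

```python
import os

def find_dataset_prefixes(root, extensions=(".geno", ".snp", ".ind")):
    """Return path prefixes under `root` for which every listed extension exists."""
    prefixes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        names = set(filenames)
        for name in names:
            if name.endswith(extensions[0]):
                stem = name[: -len(extensions[0])]
                # Keep the prefix only if the sibling files are all present.
                if all(stem + ext in names for ext in extensions):
                    prefixes.append(os.path.join(dirpath, stem))
    return sorted(prefixes)
```

Calling `find_dataset_prefixes('/content/')` lists the candidates. For PLINK data, passing `extensions=(".bed", ".bim", ".fam")` applies the same check to that file set.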
