DTSC 5301 and STAT 5000 Final Report

The final project for Data Science as a Field and STAT 5000.

Website link:

General workflow for the Quarto STAT 5000 website and PDF

  1. Install Quarto and, in R, run install.packages("rmarkdown") if you haven't already.
  2. Whenever you do git pull, always rebase: git pull --rebase.
  3. Make a branch for any major work that will affect the actual report files (i.e. the .qmd files), and start the branch name with your initials, e.g. BJ_work1.
  4. ALWAYS keep your local repo up to date: run git pull before committing, pushing, creating branches, etc.
  5. When pushing, do a force push.
  6. cd into the 5000-final folder.
  7. Run quarto render from the CLI/bash.
  8. Do a git pull, then commit and push (force push if needed).
  9. GitHub will automatically turn the file changes in the docs/ folder into the website.
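
For anyone who prefers to script the render-and-publish part of the steps above, here is a minimal sketch. It assumes it is run from the repository root, that Quarto and git are on the PATH, and that staging everything with git add --all is acceptable; the function names (build_steps, run_steps) and the commit message are hypothetical. It pulls before rendering and again after committing, since git pull --rebase refuses to run with uncommitted changes. It uses a plain git push; add a force push yourself if the steps above call for it.

```python
import subprocess

QUARTO_DIR = "5000-final"  # folder named in step 6


def build_steps(commit_message):
    """Return (command, working_directory) pairs in the order they should run."""
    return [
        (["git", "pull", "--rebase"], "."),        # step 4: update before doing anything
        (["quarto", "render"], QUARTO_DIR),        # steps 6-7: render inside 5000-final
        (["git", "add", "--all"], "."),            # assumption: stage everything, docs/ included
        (["git", "commit", "-m", commit_message], "."),
        (["git", "pull", "--rebase"], "."),        # step 8: pull again before pushing
        (["git", "push"], "."),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command, cwd in steps:
        print(f"[{cwd}] {' '.join(command)}")
        if not dry_run:
            # check=True stops at the first failing command
            subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints what would be executed.
    run_steps(build_steps("Re-render STAT 5000 report"))
```

Call run_steps(..., dry_run=False) to actually execute the commands. Note that git commit exits with an error when there is nothing to commit, which stops the run before the push.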

How the website generation works

There are two separate GitHub workflows in .github/workflows/:

  1. A workflow that auto-generates the static site files from the DTSC 5301 notebook, 5301_Final_Report.ipynb.
  2. A workflow that deploys the static content to GitHub Pages. It runs every time, and only when, the docs/ directory is updated.

Directory Structure

  • docs/: Where the GitHub site is hosted from, and where the HTML generated from the .ipynb notebook / Quarto project ends up. DO NOT change or touch it manually; it auto-updates on changes to 5301_Final_Report.ipynb or 5000-final/.
  • images/: Where images should go.
  • data/: Where the data we are using should go.
  • 5301_Final_Report.ipynb: The actual file that gets turned into the webpage. Use this file name only: when exporting from Deepnote, export under this exact name and replace the existing file.
  • 5000-final: Where the Quarto code for the STAT 5000 report should go.
  • requirements.txt: Update this with any Python packages you use in 5301_Final_Report.ipynb.

Deepnote to GitHub/website instructions:

  1. Right-click on the 5301_Final_Report file in the Deepnote sidebar and click on "Export as .ipynb".
  2. Re-upload the downloaded file to Deepnote under the GitHub repo folder in the sidebar, making sure you overwrite the previous 5301_Final_Report.ipynb.
  3. Click on Bhav's terminal #1 in the sidebar under the GitHub repo (or create a new terminal, either way).
  4. In that terminal, execute:
    1. cd DTSC-5301-Final-Report/
    2. git commit -am "YOUR MESSAGE HERE"
    3. git push
  5. Alternatively, repeat steps 2-4 locally if you have the repo checked out and know how to; the result is the same as committing and pushing from Deepnote itself.

Info on Files in /data

| Filename | Purpose | Recommendation |
| --- | --- | --- |
| Croped_ff_np.csv | Permutation evaluation (older version) for FairFace, no preprocessing on cropped images. Updated this file to look at the same files as the uncropped dataset. | Remove from GitHub data folder |
| MasterDataFrame.csv | Final master data file containing all input and output files | Keep as-is with no changes |
| crop_df_np.csv | Permutation evaluation for DeepFace, cropped images, no preprocessing | Retain; rename to PERM_DF_c_np.csv |
| crop_df_p_mtcnn.csv | Permutation evaluation for DeepFace, cropped images, preprocessed with MTCNN backend | Retain; rename to PERM_DF_c_p_mtcnn.csv |
| crop_df_p_opencv.csv | Permutation evaluation for DeepFace, cropped images, preprocessed with OpenCV backend | Retain; rename to PERM_DF_c_p_opencv.csv |
| cropped_UTK.csv | Permutation evaluation (older version), list of cropped files to perform evaluation | Remove from GitHub data folder |
| cropped_UTK_dataset.csv | Permutation evaluation (newest version), list of cropped files to perform evaluation | Retain with no changes |
| cropped_ff_p.csv | Permutation evaluation (older version), used older version of cropped images dataset | Remove from GitHub data folder |
| joined_permutations.csv | Permutation evaluation (newest version), joined all permutation outputs from DeepFace and FairFace into a single file | Retain with no changes |
| new_ff_c_np.csv | Permutation evaluation (newest version), FairFace outputs for cropped images with no preprocessing | Retain; rename to PERM_FF_c_np.csv |
| new_ff_c_p.csv | Permutation evaluation (newest version), FairFace outputs for cropped images with dlib preprocessing | Retain; rename to PERM_FF_c_p.csv |
| new_ff_uc_np.csv | Permutation evaluation (newest version), FairFace outputs for uncropped images with no preprocessing | Retain; rename to PERM_FF_uc_np.csv |
| new_ff_uc_p.csv | Permutation evaluation (newest version), FairFace outputs for uncropped images with dlib preprocessing | Retain; rename to PERM_FF_uc_p.csv |
| non_normalized_DeepFace_uncropped_DF_all.csv | Final dataset of DeepFace outputs (non-normalized) | Retain; rename to Master_DF_non_normalized.csv |
| non_normalized_FairFace_uncropped_FF_all.csv | Final dataset of FairFace outputs (non-normalized) | Retain; rename to Master_FF_non_normalized.csv |
| uncropped_DF_all.csv | Final normalized output for DeepFace - used to build MasterDataFrame.csv | Retain with no changes |
| uncropped_FF_all.csv | Final normalized output for FairFace - used to build MasterDataFrame.csv | Retain with no changes |
| uncropped_UTK.csv | Permutation evaluation (older version) - source data file for iteration script | Remove from GitHub data folder |
| uncropped_UTK_dataset.csv | Permutation evaluation (newest version) - source data file for uncropped images in iteration script | Retain with no changes |
| uncropped_df_np.csv | Permutation evaluation (newest version) - DeepFace uncropped images with no preprocessing | Retain; rename to PERM_DF_uc_np.csv |
| uncropped_df_p_mtcnn.csv | Permutation evaluation (newest version) - DeepFace uncropped images with MTCNN preprocessing | Retain; rename to PERM_DF_uc_p_mtcnn.csv |
| uncropped_df_p_opencv.csv | Permutation evaluation (newest version) - DeepFace uncropped images with OpenCV preprocessing | Retain; rename to PERM_DF_uc_p_opencv.csv |
| uncropped_ff_np.csv | Permutation evaluation (older version) - FairFace uncropped images with no preprocessing | Remove from GitHub data folder |
| uncropped_ff_p.csv | Permutation evaluation (older version) - FairFace uncropped images with dlib preprocessing | Remove from GitHub data folder |
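
If the recommendations above are adopted, the renames and removals could be applied in one pass. The sketch below is one possible way to do that, not something that exists in the repo: the names RENAMES, REMOVALS and apply_plan are hypothetical, the mappings are copied from the recommendations column, and it assumes the files live in a data/ folder relative to where the script is run. It defaults to a dry run that only lists the planned actions.

```python
from pathlib import Path

# Old name -> new name, copied from the "rename to" recommendations above.
RENAMES = {
    "crop_df_np.csv": "PERM_DF_c_np.csv",
    "crop_df_p_mtcnn.csv": "PERM_DF_c_p_mtcnn.csv",
    "crop_df_p_opencv.csv": "PERM_DF_c_p_opencv.csv",
    "new_ff_c_np.csv": "PERM_FF_c_np.csv",
    "new_ff_c_p.csv": "PERM_FF_c_p.csv",
    "new_ff_uc_np.csv": "PERM_FF_uc_np.csv",
    "new_ff_uc_p.csv": "PERM_FF_uc_p.csv",
    "non_normalized_DeepFace_uncropped_DF_all.csv": "Master_DF_non_normalized.csv",
    "non_normalized_FairFace_uncropped_FF_all.csv": "Master_FF_non_normalized.csv",
    "uncropped_df_np.csv": "PERM_DF_uc_np.csv",
    "uncropped_df_p_mtcnn.csv": "PERM_DF_uc_p_mtcnn.csv",
    "uncropped_df_p_opencv.csv": "PERM_DF_uc_p_opencv.csv",
}

# Files the table recommends removing from the data folder.
REMOVALS = [
    "Croped_ff_np.csv",
    "cropped_UTK.csv",
    "cropped_ff_p.csv",
    "uncropped_UTK.csv",
    "uncropped_ff_np.csv",
    "uncropped_ff_p.csv",
]


def apply_plan(data_dir, dry_run=True):
    """Return the planned actions; perform them only when dry_run is False."""
    data_dir = Path(data_dir)
    actions = []
    for old, new in RENAMES.items():
        src, dst = data_dir / old, data_dir / new
        # Skip files that are missing or whose target name is already taken.
        if src.exists() and not dst.exists():
            actions.append(f"rename {old} -> {new}")
            if not dry_run:
                src.rename(dst)
    for name in REMOVALS:
        path = data_dir / name
        if path.exists():
            actions.append(f"remove {name}")
            if not dry_run:
                path.unlink()
    return actions


if __name__ == "__main__":
    # Dry run: print what would change without touching any file.
    for action in apply_plan("data"):
        print(action)
```

Any notebook or Quarto code that reads these files by their old names would need to be updated to match before running this with dry_run=False.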