This repository contains the scripts and data used in our paper to evaluate the S.H.I.E.L.D. architecture. The directory structure and key files are described below.
This folder contains various assessment specifications from the course Software Engineering Fundamentals at the University of New South Wales, including:
- lab01_academics.md, lab01_leap.md, ... lab08_snapnews.md: Different lab assignments and specifications.
- project.md: Project-specific specifications.
This folder contains scripts related to vector store creation and evaluation:
- create_vector_store.py: Creates a ChromaDB instance from the assessment specs. (Note: A vector store already exists in utils/vs.)
- eval_baseline.py: Baseline evaluation script.
- eval_courseassist.py: Reproduces and evaluates the CourseAssist architecture (not completely compatible with S.H.I.E.L.D.).
- eval_shield.py: Evaluates the S.H.I.E.L.D. architecture.
- test.json: Contains testing data used in our evaluations.
Helper functions and utilities:
- vs/: Subdirectory for vector store utilities.
- call_gpts.py: Script to call GPT models.
- io_functions.py: Handles input and output operations.
- vector_store_functions.py: Functions related to vector store management.
The evaluation scripts generate JSON output files for manual review. These results should be stored in the evaluations folder.
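As a convenience for manual review, the outputs can be loaded in one pass. The following is a minimal sketch, not part of the repository: the helper name `load_evaluation_outputs` is hypothetical, and it assumes only that the evaluations folder holds `.json` files. It makes no assumption about the schema of those files.

```python
import json
from pathlib import Path


def load_evaluation_outputs(directory="evaluations"):
    """Return {filename: parsed JSON} for every .json file in `directory`.

    Hypothetical helper for illustration; the structure of each file is
    whatever the evaluation script wrote and is not interpreted here.
    """
    results = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            results[path.name] = json.load(f)
    return results
```

Calling `load_evaluation_outputs()` from the repository root returns a dictionary keyed by file name, which can then be inspected interactively or printed for review.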
- OpenAI API keys are not provided in this repository. Ensure that your environment is configured with the necessary credentials before running any script that requires GPT-based models. For access to the fine-tuned intent classifier model used in eval_shield.py, please email z5604369@ad.unsw.edu.au; alternatively, gpt-4o-mini can be used as a substitute.
- The outputs from our evaluation, along with other details, can be found at https://sites.google.com/view/shield-tutor/home
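
The credential check and model fallback described in the notes above can be sketched as follows. This is an illustration under stated assumptions, not code from the repository: `OPENAI_API_KEY` is the variable the official OpenAI Python library reads by default, but whether these scripts rely on it is an assumption, and `SHIELD_INTENT_MODEL` and `resolve_settings` are hypothetical names introduced here.

```python
import os

# Substitute model named in the notes above.
DEFAULT_MODEL = "gpt-4o-mini"


def resolve_settings(env=None):
    """Return (api_key, model_name), failing early if no key is configured.

    Assumptions: the key is read from OPENAI_API_KEY, and the fine-tuned
    intent classifier ID (if you have access) is supplied through the
    hypothetical SHIELD_INTENT_MODEL variable.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; configure credentials before "
            "running scripts that call GPT-based models."
        )
    # Fall back to the substitute model when no fine-tuned ID is given.
    model = env.get("SHIELD_INTENT_MODEL") or DEFAULT_MODEL
    return api_key, model
```

Running this check before an evaluation surfaces a missing key immediately rather than partway through a run.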