GPT-4 COSMIC Measurement Tool
This app requires Python 3.
The COSMET repository is organized into several directories:
| Directory | Description |
|---|---|
| [datasets] | Contains the datasets used for the empirical evaluation. |
| [FLAM] | Contains the source code used to fine-tune the hyperparameters of GPT-4 using the GridSearchCV algorithm. |
| [COSMET] | Contains the COSMET tool sources. |
| [Comparison Research] | Contains the results of the systematic search on Scopus to identify all articles proposing COSMIC automation tools and identify potential approaches to compare with COSMET. |
| [deepCOSMIC] | Contains the experiments with DEEP-COSMIC-UC. |
| [results] | Contains the COSMET empirical evaluation results. |
| [metrics] | Contains the source code to compute the COSMET metrics. |
The datasets directory contains the datasets used for the empirical evaluation of the COSMET approach. It contains the following subfolders:

- MIS, contains the ALBERGATE use case model and its manual and automatic measurements
- Real-time, contains the Automatic Line Switching and Rice Cooker use case models and their manual and automatic measurements
- IoT_Telemedicine, contains the FID-CPM, FID-TCT, and FID-MTC use case models and their manual and automatic measurements
- ML, contains the U-CURE use case model and its manual and automatic measurements
- RQ2-RQ3, contains the manual and automatic measurements for the user study
- test set, contains the test set used to refine the GPT-4 model for the Sentence Splitter and the COSMIC Analyzer components
The FLAM directory contains the source code used to fine-tune the GPT-4 model hyperparameters.
The testSplit.py script executes the GridSearchCV algorithm on the test set (described in the Project dataset section) to fine-tune the hyperparameters for the Sentence Splitter component. When executed, it produces the split.log file in the ./log folder.
The testAnalysis.py script executes the GridSearchCV algorithm on the test set (described in the Project dataset section) to fine-tune the hyperparameters for the COSMIC Analyzer component. When executed, it produces the analysis.log file in the ./log folder.
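A grid search exhaustively scores every combination of candidate hyperparameter values and keeps the best one. A minimal, self-contained sketch of the idea follows; the parameter names (`temperature`, `top_p`) and the scoring stub are illustrative assumptions, not the actual grid or scorer used by testSplit.py or testAnalysis.py.

```python
from itertools import product

def grid_search(score_fn, grid):
    """Score every hyperparameter combination and return the best one."""
    best_params, best_score = None, float("-inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical GPT-4 sampling hyperparameters; the real grid lives in the scripts.
grid = {"temperature": [0.0, 0.5, 1.0], "top_p": [0.9, 1.0]}

# Stub scorer standing in for evaluating the model against the test set.
score = lambda p: -abs(p["temperature"] - 0.5) - abs(p["top_p"] - 1.0)

best, _ = grid_search(score, grid)
print(best)  # {'temperature': 0.5, 'top_p': 1.0}
```

In the actual scripts, the scoring function would run the fine-tuned component on the test set and compare its output against the manual measurements.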
Please follow the SETUP instructions before running the scripts.
The COSMET directory contains the source code of the COSMET Web tool. The app.py script runs the web application.
Please follow the SETUP instructions before running the scripts.
The Comparison Research directory contains the results of the systematic search we performed on Scopus to identify all the articles proposing COSMIC automation tools and identify potential approaches to compare with COSMET. The results are reported in the cosmic_SLR.xlsx file.
The deepCOSMIC directory contains the source code of DEEP-COSMIC-UC (Ochodek, Mirosław, Kopczyńska, Sylwia, & Staron, Miroslaw. (2020). Deep learning model for end-to-end approximation of COSMIC functional size based on use-case names. Information and Software Technology, 123, 106310. Elsevier) and the results of the experiments using it with the COSMET datasets.
The results directory contains the results of the empirical evaluation of the COSMET approach.
The metrics directory contains the source code used to evaluate the 1-Rouge, BLEU, and BERTscore metrics for the analyzed use case models. Moreover, it contains the script to evaluate the MAE and MdAE metrics for COSMET and DEEP-COSMIC-UC. Finally, it contains the script to evaluate the kappa score for the ground truth creation.
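MAE and MdAE summarize how far the automatic COSMIC sizes deviate from the manual ones: MAE averages the absolute errors, while MdAE takes their median and is therefore less sensitive to a few badly measured use cases. A minimal sketch with hypothetical CFP values (not data from the repository):

```python
import statistics

def mae(actual, predicted):
    """Mean Absolute Error between manual and automatic measurements."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mdae(actual, predicted):
    """Median Absolute Error, robust to outlier use cases."""
    return statistics.median(abs(a - p) for a, p in zip(actual, predicted))

# Hypothetical COSMIC sizes (CFP): manual ground truth vs. tool estimates.
manual    = [10, 7, 12, 5]
automatic = [11, 7, 15, 4]
print(mae(manual, automatic), mdae(manual, automatic))  # 1.25 1.0
```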
Download the COSMET repository and go to the COSMET directory.
Install Poetry. Go to the folder of your interest (COSMET, FLAM, metrics) and run
poetry install
If you don’t already have an account with OpenAI, you can create one here. Once you have an account and are logged in, click on the name icon in the upper right and select View API keys.
You can press the button that says Create new secret key. Copy the secret key from the pop-up, and export the environment variable OPENAI_API_KEY:
export OPENAI_API_KEY=<paste-your-key-here>
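The application reads the key from the environment at runtime. A small sketch of a fail-fast check (the helper name `require_api_key` is illustrative, not part of the COSMET sources):

```python
import os

def require_api_key(env=os.environ):
    """Return the OpenAI key from the environment, failing fast if it is missing."""
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running COSMET.")
    return key
```

Checking the variable up front gives a clear error message instead of an opaque authentication failure on the first API call.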
If you’ve already used up your free OpenAI credits and have moved to a paid account, it’s a good idea to set a usage limit. Since we are deploying this app publicly, you want to make sure that you don’t accidentally spend more money than intended.
From the COSMET directory run:
poetry run streamlit run ./cosmet/app.py
In the 'metrics' folder you can find the Python files to calculate the COSMET metrics:
- RQ1_RQ2_metrics.py, to compute 1-rouge, BLEU, and BERTscore for RQ1 and RQ2
- mae.py, to compute the MAE and MdAE metrics for COSMET and DEEP-COSMIC-UC (RQ1)
- kappa.py, to compute the kappa score for the ground truth creation
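Cohen's kappa measures inter-rater agreement corrected for the agreement expected by chance, which is why it is used to validate the ground truth creation. A minimal sketch, assuming two annotators labeling COSMIC data movements (the labels and data below are hypothetical, not the repository's annotation data):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa between two raters over the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(ca[l] * cb[l] for l in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical data-movement labels (Entry/eXit/Read/Write) from two annotators.
a = ["E", "X", "R", "W", "E", "E"]
b = ["E", "X", "R", "R", "E", "X"]
print(round(cohens_kappa(a, b), 3))  # 0.538
```

A kappa of 1.0 means perfect agreement, 0 means agreement no better than chance; values around 0.5 indicate moderate agreement.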