Replay mode #9

Merged
merged 8 commits into from
May 24, 2024
12 changes: 10 additions & 2 deletions README.md
@@ -103,8 +103,16 @@ The `StandardStryker.yml` workflow runs the standard StrykerJS mutators.

### Reproducing results

The results of LLMorpheus are non-deterministic, so even if you run it from the
same package on the same machine multiple times, you will get different results.
The results of LLMorpheus are non-deterministic, so even if you run it from the same package on the same machine multiple times, it is likely that you will get different results.

That said, LLMorpheus has an option `--replay <dirName>` for replaying an execution using the prompts and completions observed during a previous execution. If you use this option, the model name, temperature, maximum number of tokens, and system prompt will be inferred from the previously observed execution. However, the `--mutate` and `--ignore` settings will be taken from the command-line arguments, which makes it possible to replay an execution of LLMorpheus partially, on a subset of the files that it was previously run on.
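
The way replay mode combines recorded and command-line settings can be sketched roughly as follows. This is a minimal illustration and not the actual LLMorpheus code: the `ReplaySettings` interface and the `resolveReplaySettings` function are hypothetical names, the interface covers only the settings named above, and all values are made up.

```typescript
// Hypothetical subset of the settings involved in a replay (illustration only).
interface ReplaySettings {
  modelName: string;
  temperature: number;
  maxTokens: number;
  systemPrompt: string;
  mutate: string;
  ignore: string;
}

// Start from the recorded settings, then let the command line
// override only the file-selection patterns.
function resolveReplaySettings(
  recorded: ReplaySettings,
  cli: { mutate: string; ignore: string }
): ReplaySettings {
  return { ...recorded, mutate: cli.mutate, ignore: cli.ignore };
}

// Made-up values for a previously recorded execution.
const recordedSettings: ReplaySettings = {
  modelName: "codellama-13b-instruct",
  temperature: 0,
  maxTokens: 100,
  systemPrompt: "example-system-prompt.txt",
  mutate: "src/*.ts",
  ignore: "",
};

const replaySettings = resolveReplaySettings(recordedSettings, {
  mutate: "lib/*.ts",
  ignore: "",
});
console.log(replaySettings.modelName); // taken from the recording
console.log(replaySettings.mutate); // taken from the command line
```

Because only `mutate` and `ignore` are overridden, a replay can narrow the set of files without changing which model settings the recorded completions belong to.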

For example, let's say that the zip-a-folder project is located in a directory `/Users/xxx/zip-a-folder/`, and that the results of a previous execution of LLMorpheus on zip-a-folder can be found in a directory `/Users/xxx/recording`. Further, let's assume that we want to mutate the files that match the pattern `"lib/*.ts"`. You can do that by running the following command:
```
node benchmark/createMutants.js --path /Users/xxx/zip-a-folder --mutate "lib/**.ts" --replay /Users/xxx/recording
```

A collection of recorded executions of LLMorpheus is made available at <https://github.com/neu-se/mutation-testing-data>. There, you can find recorded executions that were made using the codellama-13b-instruct, codellama-34b-instruct, and mixtral-8x7b-instruct LLMs, with various templates and temperature settings. The data is organized in subdirectories that reflect the model, template, and temperature used, e.g. `codellama-13b-instruct/template-full-0.0` reflects a set of experiments using the `codellama-13b-instruct` model, the `full` template, at temperature `0.0`. This directory contains 5 sets of experiments on 13 subject programs using these settings, contained in subdirectories named `run354`-`run359`. A file `zip/mutants.zip` within each of these subdirectories contains all data that is needed to replay these executions.
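
As a small sketch of the directory layout described above, the relative path of a recording's archive could be assembled like this. The helper name `recordingZipPath` is hypothetical and not part of LLMorpheus, and the temperature is passed as a string because the exact formatting of other temperature values in the data repository is not stated here.

```typescript
// Hypothetical helper (not part of LLMorpheus): builds the relative path of
// the archive for one recorded run, following the layout described above.
function recordingZipPath(
  model: string,
  template: string,
  temperature: string, // kept as a string, e.g. "0.0", to avoid guessing the number format
  run: string
): string {
  return [
    model,
    `template-${template}-${temperature}`,
    run,
    "zip",
    "mutants.zip",
  ].join("/");
}

console.log(recordingZipPath("codellama-13b-instruct", "full", "0.0", "run354"));
// prints "codellama-13b-instruct/template-full-0.0/run354/zip/mutants.zip"
```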


## License
106 changes: 62 additions & 44 deletions benchmark/createMutants.ts
@@ -2,9 +2,11 @@ import yargs from "yargs";
import { hideBin } from "yargs/helpers";
import { Model } from "../src/model/Model";
import { CachingModel } from "../src/model/CachingModel";
import { ReplayModel } from "../src/model/ReplayModel";
import { MutantGenerator } from "../src/generator/MutantGenerator";
import { MetaInfo } from "../src/generator/MetaInfo";
import path from "path";
import { IModel } from "../src/model/IModel";

if (require.main === module) {
(async () => {
@@ -96,59 +98,75 @@ if (require.main === module) {
demandOption: false,
description: "maximum number of prompts to generate",
},
replay: {
type: "string",
default: undefined,
demandOption: false,
description: "replay execution from specified directory",
},
});

const argv = await parser.argv;
const packagePath = argv.path.endsWith("/")
? argv.path
: path.join(argv.path, "/");
let model: IModel;
let metaInfo: MetaInfo;
if (argv.replay !== undefined) {
model = new ReplayModel(argv.replay);
metaInfo = (model as ReplayModel).getMetaInfo();
metaInfo.mutate = argv.mutate;
metaInfo.ignore = argv.ignore;
} else {
const supportedModels = [
"codellama-13b-instruct",
"codellama-34b-instruct",
"mistral-7b-instruct",
"mixtral-8x7b-instruct",
"mixtral-8x22b",
"llama-2-13b-chat",
"llama-2-70b-chat",
];

const supportedModels = [
"codellama-13b-instruct",
"codellama-34b-instruct",
"mistral-7b-instruct",
"mixtral-8x7b-instruct",
"mixtral-8x22b",
"llama-2-13b-chat",
"llama-2-70b-chat",
];
if (!supportedModels.includes(argv.model)) {
console.error(`Invalid model name: ${argv.model}`);
console.error(`Supported models are: ${supportedModels.join(", ")}`);
process.exit(1);
}

if (!supportedModels.includes(argv.model)) {
console.error(`Invalid model name: ${argv.model}`);
console.error(`Supported models are: ${supportedModels.join(", ")}`);
process.exit(1);
}
metaInfo = {
modelName: argv.model,
temperature: argv.temperature,
maxTokens: argv.maxTokens,
maxNrPrompts: argv.maxNrPrompts,
rateLimit: argv.rateLimit,
nrAttempts: argv.nrAttempts,
template: argv.template,
systemPrompt: argv.systemPrompt,
mutate: argv.mutate,
ignore: argv.ignore,
benchmark: argv.benchmark,
};

const metaInfo: MetaInfo = {
modelName: argv.model,
temperature: argv.temperature,
maxTokens: argv.maxTokens,
maxNrPrompts: argv.maxNrPrompts,
rateLimit: argv.rateLimit,
nrAttempts: argv.nrAttempts,
template: argv.template,
systemPrompt: argv.systemPrompt,
mutate: argv.mutate,
ignore: argv.ignore,
benchmark: argv.benchmark,
};
if (!supportedModels.includes(argv.model)) {
console.error(`Invalid model name: ${argv.model}`);
console.error(`Supported models are: ${supportedModels.join(", ")}`);
process.exit(1);
}

if (!supportedModels.includes(argv.model)) {
console.error(`Invalid model name: ${argv.model}`);
console.error(`Supported models are: ${supportedModels.join(", ")}`);
process.exit(1);
const baseModel = new Model(
argv.model,
{ temperature: argv.temperature, max_tokens: argv.maxTokens },
metaInfo
);
model = argv.caching
? new CachingModel(baseModel, argv.cacheDir)
: baseModel;
console.log(
`*** Generating mutants for ${argv.mutate} in ${packagePath}`
);
}

const baseModel = new Model(
argv.model,
{ temperature: argv.temperature, max_tokens: argv.maxTokens },
metaInfo
);
const model = argv.caching
? new CachingModel(baseModel, argv.cacheDir)
: baseModel;
const packagePath = argv.path.endsWith("/")
? argv.path
: path.join(argv.path, "/");
console.log(`*** Generating mutants for ${argv.mutate} in ${packagePath}`);

const mutantGenerator = new MutantGenerator(
model,
path.join(argv.path, "MUTATION_TESTING"),
20 changes: 20 additions & 0 deletions sorters/.editorconfig
@@ -0,0 +1,20 @@
root = true

# Unix-style newlines with a newline ending every file
[*]
end_of_line = lf
insert_final_newline = true


# Matches multiple files with brace expansion notation
# Set default charset
[*.{ts,tsx,js,jsx,html,sass}]
charset = utf-8
indent_style = space
indent_size = 2
trim_trailing_whitespace = true

# don't use {} for single extension. This won't work: [*.{css}]
[*.css]
indent_style = space
indent_size = 2
6 changes: 6 additions & 0 deletions sorters/.eslintignore
@@ -0,0 +1,6 @@
node_modules/
reports/
documentation/
dist/
build/
*.js