Incompatible with docker-desktop #20

Closed
twiecki opened this issue Apr 3, 2024 · 17 comments
Labels: ❔question (Further information is requested)

twiecki commented Apr 3, 2024

Installed (on arm64) successfully, then running:

>>python run.py --model_name gpt4   --data_path https://github.com/pymc-devs/pymc/issues/7223 --config_file config/default_from_url.yaml
Parsing command file: config/commands/defaults.sh
Parsing command file: config/commands/search.sh
Parsing command file: config/commands/edit_linting.sh
Parsing command file: config/commands/_split_string.py
Parsing command file: config/commands/defaults.sh
Parsing command file: config/commands/search.sh
Parsing command file: config/commands/edit_linting.sh
Parsing command file: config/commands/_split_string.py
INFO     📙 Arguments: agent:
           config:
             _commands:
             - arguments:
                 line_number:
                   description: the line number to move the window to (if not provided, the
                     window will start at the top of the file)
                   required: false
                   type: integer
                 path:
                   description: the path to the file to open
                   required: true
                   type: string
               code: 'open() {    if [ -z "$1" ]    then        echo "Usage: open <file>"        return    fi    #
                 Check if the second argument is provided    if [ -n "$2" ]; then        #
                 Check if the provided argument is a valid number        if ! [[ $2 =~ ^[0-9]+$
                 ]]; then            echo "Usage: open <file> [<line_number>]"            echo
                 "Error: <line_number> must be a number"            return  # Exit if the line
                 number is not valid        fi        local max_line=$(awk ''END {print NR}''
                 $1)        if [ $2 -gt $max_line ]; then            echo "Warning: <line_number>
                 ($2) is greater than the number of lines in the file ($max_line)"            echo
                 "Warning: Setting <line_number> to $max_line"            local line_number=$(jq
                 -n "$max_line")  # Set line number to max if greater than max        elif
                 [ $2 -lt 1 ]; then            echo "Warning: <line_number> ($2) is less than
                 1"            echo "Warning: Setting <line_number> to 1"            local
                 line_number=$(jq -n "1")  # Set line number to 1 if less than 1        else            local
                 OFFSET=$(jq -n "$WINDOW/6" | jq ''floor'')            local line_number=$(jq
                 -n "[$2 + $WINDOW/2 - $OFFSET, 1] | max | floor")        fi    else        local
                 line_number=$(jq -n "$WINDOW/2")  # Set default line number if not provided    fi    if
                 [ -f "$1" ]; then        export CURRENT_FILE=$(realpath $1)        export
                 CURRENT_LINE=$line_number        _constrain_line        _print    elif [ -d
                 "$1" ]; then        echo "Error: $1 is a directory. You can only open files.
                 Use cd or ls to navigate directories."    else        echo "File $1 not found"    fi}'
               docstring: opens the file at the given path in the editor. If line_number is
                 provided, the window will be move to include that line
               end_name: null
               name: open
               signature: open <path> [<line_number>]
             - arguments:
                 line_number:
                   description: the line number to move the window to
                   required: true
                   type: integer
               code: 'goto() {    if [ $# -gt 1 ]; then        echo "goto allows only one line
                 number at a time."        return    fi    if [ -z "$CURRENT_FILE" ]    then        echo
                 "No file open. Use the open command first."        return    fi    if [ -z
                 "$1" ]    then        echo "Usage: goto <line>"        return    fi    if
                 ! [[ $1 =~ ^[0-9]+$ ]]    then        echo "Usage: goto <line>"        echo
                 "Error: <line> must be a number"        return    fi    local max_line=$(awk
                 ''END {print NR}'' $CURRENT_FILE)    if [ $1 -gt $max_line ]    then        echo
                 "Error: <line> must be less than or equal to $max_line"        return    fi    local
                 OFFSET=$(jq -n "$WINDOW/6" | jq ''floor'')    export CURRENT_LINE=$(jq -n
                 "[$1 + $WINDOW/2 - $OFFSET, 1] | max | floor")    _constrain_line    _print}'
               docstring: moves the window to show <line_number>
               end_name: null
               name: goto
               signature: goto <line_number>
             - arguments: null
               code: scroll_down() {    if [ -z "$CURRENT_FILE" ]    then        echo "No file
                 open. Use the open command first."        return    fi    export CURRENT_LINE=$(jq
                 -n "$CURRENT_LINE + $WINDOW - $OVERLAP")    _constrain_line    _print}
               docstring: moves the window down {WINDOW} lines
               end_name: null
               name: scroll_down
               signature: scroll_down
             - arguments: null
               code: scroll_up() {    if [ -z "$CURRENT_FILE" ]    then        echo "No file
                 open. Use the open command first."        return    fi    export CURRENT_LINE=$(jq
                 -n "$CURRENT_LINE - $WINDOW + $OVERLAP")    _constrain_line    _print}
               docstring: moves the window down {WINDOW} lines
               end_name: null
               name: scroll_up
               signature: scroll_down
             - arguments:
                 filename:
                   description: the name of the file to create
                   required: true
                   type: string
               code: "create() {    if [ -z \"$1\" ]; then        echo \"Usage: create <filename>\"\
                 \        return    fi    # Check if the file already exists    if [ -e \"\
                 $1\" ]; then        echo \"Error: File '$1' already exists.\"\t\topen \"$1\"\
                 \        return    fi    # Create the file an empty new line    printf \"\\\
                 n\" > \"$1\"    # Use the existing open command to open the created file \
                 \   open \"$1\"}"
               docstring: creates and opens a new file with the given name
               end_name: null
               name: create
               signature: create <filename>
             - arguments: null
               code: 'submit() {    cd $ROOT    # Check if the patch file exists and is non-empty    if
                 [ -s "/root/test.patch" ]; then        # Apply the patch in reverse        git
                 apply -R < "/root/test.patch"    fi    git add -A    git diff --cached > model.patch    echo
                 "<<SUBMISSION||"    cat model.patch    echo "||SUBMISSION>>"}'
               docstring: submits your current code and terminates the session
               end_name: null
               name: submit
               signature: submit
             - arguments:
                 dir:
                   description: the directory to search in (if not provided, searches in the
                     current directory)
                   required: false
                   type: string
                 search_term:
                   description: the term to search for
                   required: true
                   type: string
               code: 'search_dir() {    if [ $# -eq 1 ]; then        local search_term="$1"        local
                 dir="./"    elif [ $# -eq 2 ]; then        local search_term="$1"        if
                 [ -d "$2" ]; then            local dir="$2"        else            echo "Directory
                 $2 not found"            return        fi    else        echo "Usage: search_dir
                 <search_term> [<dir>]"        return    fi    dir=$(realpath "$dir")    local
                 matches=$(find "$dir" -type f ! -path ''*/.*'' -exec grep -nIH "$search_term"
                 {} + | cut -d: -f1 | sort | uniq -c)    # if no matches, return    if [ -z
                 "$matches" ]; then        echo "No matches found for \"$search_term\" in $dir"        return    fi    #
                 Calculate total number of matches    local num_matches=$(echo "$matches" |
                 awk ''{sum+=$1} END {print sum}'')    # calculate total number of files matched    local
                 num_files=$(echo "$matches" | wc -l | awk ''{$1=$1; print $0}'')    # if num_files
                 is > 100, print an error    if [ $num_files -gt 100 ]; then        echo "More
                 than $num_files files matched for \"$search_term\" in $dir. Please narrow
                 your search."        return    fi        echo "Found $num_matches matches
                 for \"$search_term\" in $dir:"    echo "$matches" | awk ''{$2=$2; gsub(/^\.+\/+/,
                 "./", $2); print $2 " ("$1" matches)"}''    echo "End of matches for \"$search_term\"
                 in $dir"}'
               docstring: searches for search_term in all files in dir. If dir is not provided,
                 searches in the current directory
               end_name: null
               name: search_dir
               signature: search_dir <search_term> [<dir>]
             - arguments:
                 file:
                   description: the file to search in (if not provided, searches in the current
                     open file)
                   required: false
                   type: string
                 search_term:
                   description: the term to search for
                   required: true
                   type: string
               code: 'search_file() {    # Check if the first argument is provided    if [
                 -z "$1" ]; then        echo "Usage: search_file <search_term> [<file>]"        return    fi    #
                 Check if the second argument is provided    if [ -n "$2" ]; then        #
                 Check if the provided argument is a valid file        if [ -f "$2" ]; then            local
                 file="$2"  # Set file if valid        else            echo "Usage: search_file
                 <search_term> [<file>]"            echo "Error: File name $2 not found. Please
                 provide a valid file name."            return  # Exit if the file is not valid        fi    else        #
                 Check if a file is open        if [ -z "$CURRENT_FILE" ]; then            echo
                 "No file open. Use the open command first."            return  # Exit if no
                 file is open        fi        local file="$CURRENT_FILE"  # Set file to the
                 current open file    fi    local search_term="$1"    file=$(realpath "$file")    #
                 Use grep to directly get the desired formatted output    local matches=$(grep
                 -nH "$search_term" "$file")    # Check if no matches were found    if [ -z
                 "$matches" ]; then        echo "No matches found for \"$search_term\" in $file"        return    fi    #
                 Calculate total number of matches    local num_matches=$(echo "$matches" |
                 wc -l | awk ''{$1=$1; print $0}'')        # calculate total number of lines
                 matched    local num_lines=$(echo "$matches" | cut -d: -f1 | sort | uniq |
                 wc -l | awk ''{$1=$1; print $0}'')    # if num_lines is > 100, print an error    if
                 [ $num_lines -gt 100 ]; then        echo "More than $num_lines lines matched
                 for \"$search_term\" in $file. Please narrow your search."        return    fi    #
                 Print the total number of matches and the matches themselves    echo "Found
                 $num_matches matches for \"$search_term\" in $file:"    echo "$matches" |
                 cut -d: -f1-2 | sort -u -t: -k2,2n | while IFS=: read -r filename line_number;
                 do        echo "Line $line_number:$(sed -n "${line_number}p" "$file")"    done    echo
                 "End of matches for \"$search_term\" in $file"}'
               docstring: searches for search_term in file. If file is not provided, searches
                 in the current open file
               end_name: null
               name: search_file
               signature: search_file <search_term> [<file>]
             - arguments:
                 dir:
                   description: the directory to search in (if not provided, searches in the
                     current directory)
                   required: false
                   type: string
                 file_name:
                   description: the name of the file to search for
                   required: true
                   type: string
               code: 'find_file() {    if [ $# -eq 1 ]; then        local file_name="$1"        local
                 dir="./"    elif [ $# -eq 2 ]; then        local file_name="$1"        if
                 [ -d "$2" ]; then            local dir="$2"        else            echo "Directory
                 $2 not found"            return        fi    else        echo "Usage: find_file
                 <file_name> [<dir>]"        return    fi    dir=$(realpath "$dir")    local
                 matches=$(find "$dir" -type f -name "$file_name")    # if no matches, return    if
                 [ -z "$matches" ]; then        echo "No matches found for \"$file_name\" in
                 $dir"        return    fi    # Calculate total number of matches    local
                 num_matches=$(echo "$matches" | wc -l | awk ''{$1=$1; print $0}'')    echo
                 "Found $num_matches matches for \"$file_name\" in $dir:"    echo "$matches"
                 | awk ''{print $0}''}'
               docstring: finds all files with the given name in dir. If dir is not provided,
                 searches in the current directory
               end_name: null
               name: find_file
               signature: find_file <file_name> [<dir>]
             - arguments:
                 end_line:
                   description: the line number to end the edit at (inclusive)
                   required: true
                   type: integer
                 replacement_text:
                   description: the text to replace the current selection with
                   required: true
                   type: string
                 start_line:
                   description: the line number to start the edit at
                   required: true
                   type: integer
               code: 'edit() {    if [ -z "$CURRENT_FILE" ]    then        echo ''No file open.
                 Use the `open` command first.''        return    fi    local start_line="$(echo
                 $1: | cut -d: -f1)"    local end_line="$(echo $1: | cut -d: -f2)"    if [
                 -z "$start_line" ] || [ -z "$end_line" ]    then        echo "Usage: edit
                 <start_line>:<end_line>"        return    fi    local re=''^[0-9]+$''    if
                 ! [[ $start_line =~ $re ]]; then        echo "Usage: edit <start_line>:<end_line>"        echo
                 "Error: start_line must be a number"        return    fi    if ! [[ $end_line
                 =~ $re ]]; then        echo "Usage: edit <start_line>:<end_line>"        echo
                 "Error: end_line must be a number"        return    fi    # Bash array starts
                 at 0, so let''s adjust    local start_line=$((start_line - 1))    local end_line=$((end_line))    local
                 line_count=0    local replacement=()    while IFS= read -r line    do        replacement+=("$line")        ((line_count++))    done    #
                 Create a backup of the current file    cp "$CURRENT_FILE" "/root/$(basename
                 "$CURRENT_FILE")_backup"    # Read the file line by line into an array    mapfile
                 -t lines < "$CURRENT_FILE"    local new_lines=("${lines[@]:0:$start_line}"
                 "${replacement[@]}" "${lines[@]:$((end_line))}")    # Write the new stuff
                 directly back into the original file    printf "%s\n" "${new_lines[@]}" >|
                 "$CURRENT_FILE"        # Run linter    if [[ $CURRENT_FILE == *.py ]]; then        lint_output=$(flake8
                 --select=F821,F822,F831,E111,E112,E113,E999,E902 "$CURRENT_FILE" 2>&1)    else        #
                 do nothing        lint_output=""    fi    # if there is no output, then the
                 file is good    if [ -z "$lint_output" ]; then        export CURRENT_LINE=$start_line        _constrain_line        _print        echo
                 "File updated. Please review the changes and make sure they are correct (correct
                 indentation, no duplicate lines, etc). Edit the file again if necessary."    else        echo
                 "Your proposed edit has introduced new syntax error(s). Please understand
                 the fixes and retry your edit commmand."        echo ""        echo "ERRORS:"        _split_string
                 "$lint_output"        echo ""        # Save original values        original_current_line=$CURRENT_LINE        original_window=$WINDOW        #
                 Update values        export CURRENT_LINE=$(( (line_count / 2) + start_line
                 )) # Set to "center" of edit        export WINDOW=$((line_count + 10)) # Show
                 +/- 5 lines around edit        echo "This is how your edit would have looked
                 if applied"        echo "-------------------------------------------------"        _constrain_line        _print        echo
                 "-------------------------------------------------"        echo ""        #
                 Restoring CURRENT_FILE to original contents.        cp "/root/$(basename "$CURRENT_FILE")_backup"
                 "$CURRENT_FILE"        export CURRENT_LINE=$(( ((end_line - start_line + 1)
                 / 2) + start_line ))        export WINDOW=$((end_line - start_line + 10))        echo
                 "This is the original code before your edit"        echo "-------------------------------------------------"        _constrain_line        _print        echo
                 "-------------------------------------------------"        # Restore original
                 values        export CURRENT_LINE=$original_current_line        export WINDOW=$original_window        echo
                 "Your changes have NOT been applied. Please fix your edit command and try
                 again."        echo "You either need to 1) Specify the correct start/end line
                 arguments or 2) Correct your edit code."        echo "DO NOT re-run the same
                 failed edit command. Running it again will lead to the same error."    fi    #
                 Remove backup file    rm -f "/root/$(basename "$CURRENT_FILE")_backup"}'
               docstring: replaces lines <start_line> through <end_line> (inclusive) with the
                 given text in the open file. The replacement text is terminated by a line
                 with only end_of_edit on it. All of the <replacement text> will be entered,
                 so make sure your indentation is formatted properly. Python files will be
                 checked for syntax errors after the edit. If the system detects a syntax error,
                 the edit will not be executed. Simply try to edit the file again, but make
                 sure to read the error message and modify the edit command you issue accordingly.
                 Issuing the same command a second time will just lead to the same error message
                 again.
               end_name: end_of_edit
               name: edit
               signature: |-
                 edit <start_line>:<end_line>
                 <replacement_text>
                 end_of_edit
             _subroutines: {}
             blocklist:
             - vim
             - vi
             - emacs
             - nano
             - nohup
             - git
             blocklist_error_template: Interactive operation '{name}' is not supported by this
               environment
             blocklist_standalone:
             - python
             - python3
             - ipython
             - bash
             - sh
             - exit
             - /bin/bash
             - /bin/sh
             - nohup
             - vi
             - vim
             - emacs
             - nano
             command_docs: |+
               open:
                 docstring: opens the file at the given path in the editor. If line_number is provided, the window will be move to include that line
                 signature: open <path> [<line_number>]
                 arguments:
                   - path (string) [required]: the path to the file to open
                   - line_number (integer) [optional]: the line number to move the window to (if not provided, the window will start at the top of the file)

               goto:
                 docstring: moves the window to show <line_number>
                 signature: goto <line_number>
                 arguments:
                   - line_number (integer) [required]: the line number to move the window to

               scroll_down:
                 docstring: moves the window down {WINDOW} lines
                 signature: scroll_down

               scroll_up:
                 docstring: moves the window down {WINDOW} lines
                 signature: scroll_down

               create:
                 docstring: creates and opens a new file with the given name
                 signature: create <filename>
                 arguments:
                   - filename (string) [required]: the name of the file to create

               submit:
                 docstring: submits your current code and terminates the session
                 signature: submit

               search_dir:
                 docstring: searches for search_term in all files in dir. If dir is not provided, searches in the current directory
                 signature: search_dir <search_term> [<dir>]
                 arguments:
                   - search_term (string) [required]: the term to search for
                   - dir (string) [optional]: the directory to search in (if not provided, searches in the current directory)

               search_file:
                 docstring: searches for search_term in file. If file is not provided, searches in the current open file
                 signature: search_file <search_term> [<file>]
                 arguments:
                   - search_term (string) [required]: the term to search for
                   - file (string) [optional]: the file to search in (if not provided, searches in the current open file)

               find_file:
                 docstring: finds all files with the given name in dir. If dir is not provided, searches in the current directory
                 signature: find_file <file_name> [<dir>]
                 arguments:
                   - file_name (string) [required]: the name of the file to search for
                   - dir (string) [optional]: the directory to search in (if not provided, searches in the current directory)

               edit:
                 docstring: replaces lines <start_line> through <end_line> (inclusive) with the given text in the open file. The replacement text is terminated by a line with only end_of_edit on it. All
         of the <replacement text> will be entered, so make sure your indentation is formatted properly. Python files will be checked for syntax errors after the edit. If the system detects a syntax
         error, the edit will not be executed. Simply try to edit the file again, but make sure to read the error message and modify the edit command you issue accordingly. Issuing the same command a
         second time will just lead to the same error message again.
                 signature: edit <start_line>:<end_line>
               <replacement_text>
               end_of_edit
                 arguments:
                   - start_line (integer) [required]: the line number to start the edit at
                   - end_line (integer) [required]: the line number to end the edit at (inclusive)
                   - replacement_text (string) [required]: the text to replace the current selection with

             command_files:
             - config/commands/defaults.sh
             - config/commands/search.sh
             - config/commands/edit_linting.sh
             - config/commands/_split_string.py
             demonstration_template: |
               Here is a demonstration of how to correctly accomplish this task.
               It is included to show you how to correctly use the interface.
               You do not need to follow exactly what is done in the demonstration.
               --- DEMONSTRATION ---
               {demonstration}
               --- END OF DEMONSTRATION ---
             demonstrations:
             - trajectories/demonstrations/replay__marshmallow-code__marshmallow-1867__default__t-0.20__p-0.95__c-2.00__install-1___install_from_source/marshmallow-code__marshmallow-1867.traj
             env_variables:
               CURRENT_FILE: ''
               CURRENT_LINE: '0'
               OVERLAP: '2'
               SEARCH_FILES: ()
               SEARCH_INDEX: '0'
               SEARCH_RESULTS: ()
               WINDOW: '100'
             format_error_template: |
               Your output was not formatted correctly. You must always include one discussion and one command as part of your response. Make sure you do not have multiple discussion/command tags.
               Please make sure your output precisely matches the following format:
               DISCUSSION
               Discuss here with yourself about what your planning and what you're going to do in this step.

               ```
               command(s) that you're going to run
               ```
             history_processor: {}
             history_processor_args: {}
             instance_template: "We're currently solving the following issue within our repository.\
               \ Here's the issue text:\nISSUE:\n{issue}\n\nINSTRUCTIONS:\nNow, you're going\
               \ to solve this issue on your own. Your terminal session has started and you're\
               \ in the repository's root directory. You can use any bash commands or the special\
               \ interface to help you. Edit all the files you need to and run any checks or\
               \ tests that you want. \nRemember, YOU CAN ONLY ENTER ONE COMMAND AT A TIME.\
               \ You should always wait for feedback after every command. \nWhen you're satisfied\
               \ with all of the changes you've made, you can submit your changes to the code\
               \ base by simply running the submit command.\nNote however that you cannot use\
               \ any interactive session commands (e.g. python, vim) in this environment, but\
               \ you can write scripts and run them. E.g. you can write a python script and\
               \ then run it with `python <script_name>.py`.\n\nNOTE ABOUT THE EDIT COMMAND:\
               \ Indentation really matters! When editing a file, make sure to insert appropriate\
               \ indentation before each line! \n\nIMPORTANT TIPS:\n1. Always start by trying\
               \ to replicate the bug that the issues discusses. \n   If the issue includes\
               \ code for reproducing the bug, we recommend that you re-implement that in your\
               \ environment, and run it to make sure you can reproduce the bug.\n   Then start\
               \ trying to fix it.\n   When you think you've fixed the bug, re-run the bug\
               \ reproduction script to make sure that the bug has indeed been fixed.\n   \n\
               \   If the bug reproduction script does not print anything when it succesfully\
               \ runs, we recommend adding a print(\"Script completed successfully, no errors.\"\
               ) command at the end of the file,\n   so that you can be sure that the script\
               \ indeed ran fine all the way through. \n\n2. If you run a command and it doesn't\
               \ work, try running a different command. A command that did not work once will\
               \ not work the second time unless you modify it!\n\n3. If you open a file and\
               \ need to get to an area around a specific line that is not in the first 100\
               \ lines, say line 583, don't just use the scroll_down command multiple times.\
               \ Instead, use the goto 583 command. It's much quicker. \n   \n4. If the bug\
               \ reproduction script requires inputting/reading a specific file, such as buggy-input.png,\
               \ and you'd like to understand how to input that file, conduct a search in the\
               \ existing repo code, to see whether someone else has already done that. Do\
               \ this by running the command: find_file \"buggy-input.png\" If that doensn't\
               \ work, use the linux 'find' command. \n\n5. Always make sure to look at the\
               \ currently open file and the current working directory (which appears right\
               \ after the currently open file). The currently open file might be in a different\
               \ directory than the working directory! Note that some commands, such as 'create',\
               \ open files, so they might change the current  open file.\n\n6. When editing\
               \ files, it is easy to accidentally specify a wrong line number or to write\
               \ code with incorrect indentation. Always check the code after you issue an\
               \ edit to make sure that it reflects what you wanted to accomplish. If it didn't,\
               \ issue another command to fix it.\n\n7. It may be necessary to install the\
               \ repository from source before you can run code. Please think about how to\
               \ install the environment from the repository directory if you need to do so.\n\
               \   \n\n(Open file: {open_file})\n(Current directory: {working_dir})\nbash-$"
             next_step_no_output_template: |-
               Your command ran successfully and did not produce any output.
               (Open file: {open_file})
               (Current directory: {working_dir})
               bash-$
             next_step_template: |-
               {observation}
               (Open file: {open_file})
               (Current directory: {working_dir})
               bash-$
             parse_command: {}
             parse_function: {}
             put_demos_in_history: false
             state_command:
               arguments: null
               code: |
                 state() {
                   local working_dir="$PWD";
                   if [ -z $CURRENT_FILE ]; then
                       echo '{"open_file": "n/a", "working_dir": "'$working_dir'"}';
                   else
                       echo '{"open_file": "'$(realpath $CURRENT_FILE)'", "working_dir": "'$working_dir'"}';
                   fi
                 };
               docstring: null
               end_name: null
               name: state
               signature: null
             strategy_template: null
             submit_command: submit
             subroutine_types: []
             system_template: "SETTING: You are an autonomous programmer, and you're working\
               \ directly in the command line with a special interface.\n\nThe special interface\
               \ consists of a file editor that shows you {WINDOW} lines of a file at a time.\n\
               In addition to typical bash commands, you can also use the following commands\
               \ to help you navigate and edit files.\n\nCOMMANDS:\n{command_docs}\n\nPlease\
               \ note that THE EDIT COMMAND REQUIRES PROPER INDENTATION. \nIf you'd like to\
               \ add the line '        print(x)' you must fully write that out, with all those\
               \ spaces before the code! Indentation is important and code that is not indented\
               \ correctly will fail and require fixing before it can be run.\n\nRESPONSE FORMAT:\n\
               Your shell prompt is formatted as follows:\n(Open file: <path>) <cwd> $\n\n\
               You need to format your output using two fields; discussion and command.\nYour\
               \ output should always include _one_ discussion and _one_ command field EXACTLY\
               \ as in the following example:\nDISCUSSION\nFirst I'll start by using ls to\
               \ see what files are in the current directory. Then maybe we can look at some\
               \ relevant files to see what they look like.\n```\nls -a\n```\n\nYou should\
               \ only include a *SINGLE* command in the command section and then wait for a\
               \ response from the shell before continuing with more discussion and commands.\
               \ Everything you include in the DISCUSSION section will be saved for future\
               \ reference.\nIf you'd like to issue two commands at once, PLEASE DO NOT DO\
               \ THAT! Please instead first submit just the first command, and then after receiving\
               \ a response you'll be able to issue the second command. \nYou're free to use\
               \ any other bash commands you want (e.g. find, grep, cat, ls, cd) in addition\
               \ to the special commands listed above.\nHowever, the environment does NOT support\
               \ interactive session commands (e.g. python, vim), so please do not invoke them."
             util_functions:
             - arguments: null
               code: '_print() {    local total_lines=$(awk ''END {print NR}'' $CURRENT_FILE)    echo
                 "[File: $(realpath $CURRENT_FILE) ($total_lines lines total)]"    lines_above=$(jq
                 -n "$CURRENT_LINE - $WINDOW/2" | jq ''[0, .] | max | floor'')    lines_below=$(jq
                 -n "$total_lines - $CURRENT_LINE - $WINDOW/2" | jq ''[0, .] | max | round'')    if
                 [ $lines_above -gt 0 ]; then        echo "($lines_above more lines above)"    fi    cat
                 $CURRENT_FILE | grep -n $ | head -n $(jq -n "[$CURRENT_LINE + $WINDOW/2, $WINDOW/2]
                 | max | floor") | tail -n $(jq -n "$WINDOW")    if [ $lines_below -gt 0 ];
                 then        echo "($lines_below more lines below)"    fi}'
               docstring: null
               end_name: null
               name: _print
               signature: _print
             - arguments: null
               code: _constrain_line() {    if [ -z "$CURRENT_FILE" ]    then        echo "No
                 file open. Use the open command first."        return    fi    local max_line=$(awk
                 'END {print NR}' $CURRENT_FILE)    local half_window=$(jq -n "$WINDOW/2" |
                 jq 'floor')    export CURRENT_LINE=$(jq -n "[$CURRENT_LINE, $max_line - $half_window]
                 | min")    export CURRENT_LINE=$(jq -n "[$CURRENT_LINE, $half_window] | max")}
               docstring: null
               end_name: null
               name: _constrain_line
               signature: _constrain_line
           config_file: config/default_from_url.yaml
           model:
             host_url: localhost:11434
             model_name: gpt4
             per_instance_cost_limit: 2.0
             replay_path: null
             temperature: 0.2
             top_p: 0.95
             total_cost_limit: 0.0
         environment:
           base_commit: null
           container_name: null
           data_path: https://github.com/pymc-devs/pymc/issues/7223
           image_name: swe-agent
           install_environment: true
           no_mirror: false
           split: dev
           timeout: 35
           verbose: true
         instance_filter: .*
         skip_existing: true
         suffix: ''

INFO     💽 Loaded dataset from https://github.com/pymc-devs/pymc/issues/7223
Traceback (most recent call last):
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connection.py", line 400, in request
    self.endheaders()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/transport/unixconn.py", line 27, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/util/retry.py", line 470, in increment
    raise reraise(type(error), error, _stacktrace)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/urllib3/connection.py", line 400, in request
    self.endheaders()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/transport/unixconn.py", line 27, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 213, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/utils/decorators.py", line 44, in inner
    return f(self, *args, **kwargs)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 236, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/twiecki/projects/SWE-agent/run.py", line 223, in <module>
    main(args)
  File "/Users/twiecki/projects/SWE-agent/run.py", line 66, in main
    env = SWEEnv(args.environment)
  File "/Users/twiecki/projects/SWE-agent/sweagent/environment/swe_env.py", line 101, in __init__
    self._reset_container()
  File "/Users/twiecki/projects/SWE-agent/sweagent/environment/swe_env.py", line 349, in _reset_container
    self._init_container()
  File "/Users/twiecki/projects/SWE-agent/sweagent/environment/swe_env.py", line 371, in _init_container
    client = docker.from_env()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/client.py", line 94, in from_env
    return cls(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/client.py", line 45, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 220, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

It also didn't seem to find keys.cfg; I had to set GITHUB_TOKEN as an env variable.

@bishopZ

bishopZ commented Apr 3, 2024

I have the same result. I'm on an M1 Mac.

At first it complains that there is no GITHUB_TOKEN, even though it's in the keys.cfg file.

If I export GITHUB_TOKEN as an env variable, I get the error shown above.

@twiecki
Author

twiecki commented Apr 3, 2024

Then it's probably arm64 related.

@twiecki twiecki changed the title Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory')) arm64: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory')) Apr 3, 2024
@klieret
Member

klieret commented Apr 3, 2024

Regarding GITHUB_TOKEN, this should be fixed soon with #31

@klieret klieret added the 🐛 bug Something isn't working label Apr 3, 2024
@klieret
Member

klieret commented Apr 3, 2024

I'm on an M1 and cannot reproduce this. This is silly, but did you double-check that your Docker daemon is running?

@klieret klieret self-assigned this Apr 3, 2024
@klieret
Member

klieret commented Apr 3, 2024

I can reproduce this by killing docker and rerunning the command.

So the fix is simply: Make sure that docker is running.

I agree that the error handling could be improved. I'll open a PR for that.
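
As a rough illustration of the kind of pre-flight check that would turn the long traceback above into a one-line hint, here is a minimal sketch using only the Python standard library. It is not SWE-agent's actual error handling; the function names and the default socket path `/var/run/docker.sock` are assumptions made for this example.

```python
import socket

# Assumption: the conventional default location of the Docker daemon socket.
DEFAULT_SOCKET = "/var/run/docker.sock"


def docker_socket_reachable(path: str = DEFAULT_SOCKET, timeout: float = 2.0) -> bool:
    """Return True if something accepts connections on the Unix socket at `path`."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # Raises FileNotFoundError (an OSError) if the socket file is missing,
            # which is the root error shown in the traceback above.
            sock.connect(path)
        return True
    except OSError:
        return False


def check_docker_or_explain(path: str = DEFAULT_SOCKET) -> str:
    """Return a short, human-readable status message (hypothetical helper)."""
    if docker_socket_reachable(path):
        return f"Docker socket at {path} is reachable."
    return (
        f"Cannot reach a Docker daemon at {path}. "
        "Is Docker running, and is the default socket enabled?"
    )


if __name__ == "__main__":
    print(check_docker_or_explain())
```

A check like this only shows that something is listening on the socket, not that the daemon is healthy, so it is best treated as a quick first diagnostic.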

@klieret klieret added ❔question Further information is requested and removed 🐛 bug Something isn't working labels Apr 3, 2024
@timothycarambat

Ensure Docker is running.
In Docker Desktop, the option Settings > Advanced > "Allow the default Docker socket to be used (requires password)" needs to be enabled.

@klieret
Member

klieret commented Apr 3, 2024

Feel free to reopen if the issue persists.

@twiecki
Author

twiecki commented Apr 4, 2024

I'm positive that Docker is running (docker run works). So it must be something else. I can't reopen the issue.

@entuerem

entuerem commented Apr 10, 2024

> Ensure docker is running. Docker desktop > Settings > Advanced > Allow the default Docker socket to be used (requires password) needs to be enabled.

I'm seeing this issue with Docker Desktop (the recommended way of installing Docker according to the Docker docs) on Ubuntu 23. This setup uses a different location for the default Docker socket, which probably causes the FileNotFoundError thrown by the Docker Python library. Creating a symlink at the expected location with the following command fixes it:

sudo ln -s -f /home/<user>/.docker/desktop/docker.sock /var/run/docker.sock
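
An alternative to the symlink that avoids `sudo` is to point the client at the socket through the `DOCKER_HOST` environment variable, which `docker.from_env()` in the Docker Python library honors. The sketch below looks for the first existing socket among a few candidates and derives a `DOCKER_HOST` value from it. The helper names are hypothetical, and the candidate list is an assumption based on the default path and the Docker Desktop path mentioned above.

```python
import os
import stat
from pathlib import Path
from typing import Iterable, Optional


def find_docker_socket(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that exists and is a Unix socket, else None."""
    for path in candidates:
        try:
            # stat() follows symlinks, so a symlinked docker.sock also counts.
            if stat.S_ISSOCK(path.stat().st_mode):
                return path
        except OSError:
            # Missing file, broken symlink, or permission problem: try the next one.
            continue
    return None


def docker_host_for(path: Path) -> str:
    """Build a DOCKER_HOST value for a Unix socket path."""
    return f"unix://{path}"


if __name__ == "__main__":
    # Assumed candidates: the conventional default, then the Docker Desktop
    # location on Linux reported in this thread.
    candidates = [
        Path("/var/run/docker.sock"),
        Path.home() / ".docker" / "desktop" / "docker.sock",
    ]
    found = find_docker_socket(candidates)
    if found is None:
        print("No Docker socket found; is Docker running?")
    else:
        # Only set DOCKER_HOST if the user has not already chosen one.
        os.environ.setdefault("DOCKER_HOST", docker_host_for(found))
        print(f"Using DOCKER_HOST={os.environ['DOCKER_HOST']}")
```

Setting the variable inside the script only affects that process; exporting `DOCKER_HOST` in the shell before calling `run.py` would have the same effect for the whole session.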

@twiecki
Author

twiecki commented Apr 10, 2024

Thanks @entuerem, that solved that problem. Now I'm getting:

ERROR    Unexpected container setup output: Unable to find image 'swe-agent:latest' locally
         docker: Error response from daemon: pull access denied for swe-agent, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
         See 'docker run --help'.

Traceback (most recent call last):
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 265, in _raise_for_status
    response.raise_for_status()
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.43/containers/swe-agent-515b31641c/json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/twiecki/projects/SWE-agent/run.py", line 223, in <module>
    main(args)
  File "/Users/twiecki/projects/SWE-agent/run.py", line 66, in main
    env = SWEEnv(args.environment)
  File "/Users/twiecki/projects/SWE-agent/sweagent/environment/swe_env.py", line 101, in __init__
    self._reset_container()
  File "/Users/twiecki/projects/SWE-agent/sweagent/environment/swe_env.py", line 349, in _reset_container
    self._init_container()
  File "/Users/twiecki/projects/SWE-agent/sweagent/environment/swe_env.py", line 378, in _init_container
    self.container_obj = client.containers.get(self.container_name)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/models/containers.py", line 951, in get
    resp = self.client.api.inspect_container(container_id)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/container.py", line 792, in inspect_container
    return self._result(
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 271, in _result
    self._raise_for_status(response)
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 267, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
  File "/Users/twiecki/micromamba/envs/swe-agent/lib/python3.9/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.43/containers/swe-agent-515b31641c/json: Not Found ("No such container: swe-agent-515b31641c")

Running docker login succeeded, but it didn't fix the problem.

@twiecki twiecki changed the title arm64: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory')) Incompatible with docker-desktop Apr 10, 2024
@klieret
Member

klieret commented Apr 10, 2024

Hi @twiecki : Sorry for missing your comment last week! (and I didn't know that you couldn't re-open the issue!)

Thanks for following up with detailed logs.

@timothycarambat @entuerem This is only for using the fully containerized version of the software, right? (i.e., when you mount the docker socket)

@twiecki Could it be that you didn't build/pull the swe-agent container? (step 2 of the express setup instructions or running setup.sh with the conda instructions)
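
To check for a missing image before starting a run, one option is to ask the Docker CLI whether the image exists locally: `docker image inspect <name>` exits with a non-zero status when it does not. The following is a minimal sketch, not part of SWE-agent. The helper names are hypothetical, and the image name `swe-agent` is taken from the `image_name` setting in the log above, so it may differ in later versions.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command and returns its exit code; injectable for testing.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command quietly and return its exit code (127 if the CLI is missing)."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        return result.returncode
    except FileNotFoundError:
        return 127


def image_available(image: str, run: Runner = _run) -> bool:
    """Return True if `docker image inspect` succeeds for `image`."""
    return run(["docker", "image", "inspect", image]) == 0


def explain_missing_image(image: str, run: Runner = _run) -> str:
    """Return a short status message about the local image (hypothetical helper)."""
    if image_available(image, run):
        return f"Image '{image}' is available locally."
    return (
        f"Image '{image}' was not found locally. "
        "Build or pull it first, as described in the setup instructions."
    )


if __name__ == "__main__":
    print(explain_missing_image("swe-agent"))
```

Passing the runner as a parameter keeps the check testable without a Docker installation.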

@klieret
Member

klieret commented Apr 10, 2024

Also note that we have updated the images to support arm64 as well as amd64 :)

@entuerem

> Hi @twiecki : Sorry for missing your comment last week! (and I didn't know that you couldn't re-open the issue!)
>
> Thanks for following up with detailed logs.
>
> @timothycarambat @entuerem This is only for using the fully containerized version of the software, right? (i.e., when you mount the docker socket)
>
> @twiecki Could it be that you didn't pull the swe-agent container? (step 2 of the express setup instructions)

No. This happened when I followed the "Setup with conda (development version)" instructions. On Ubuntu 23 this is the only issue I encountered. After creating the symlink it worked. Thx for this. Currently analyzing how you guys wrote this thing :)

@Amon1412

Amon1412 commented Apr 11, 2024

I get the error below when executing commands with Docker Desktop on Windows. Is it the same problem? How can I solve it?

Traceback (most recent call last):
  File "/app/run.py", line 108, in main
    observation, info = env.reset(index)
  File "/app/sweagent/environment/swe_env.py", line 216, in reset
    self.communicate_with_handling(
  File "/app/sweagent/environment/swe_env.py", line 485, in communicate_with_handling
    logs = self.communicate(input, timeout_duration=timeout_duration)
  File "/app/sweagent/environment/swe_env.py", line 468, in communicate
    output = self._communicate(
  File "/app/sweagent/environment/swe_env.py", line 437, in _communicate
    raise e
  File "/app/sweagent/environment/swe_env.py", line 430, in _communicate
    buffer = read_with_timeout(self.container, self.get_pids, timeout_duration)
  File "/app/sweagent/environment/utils.py", line 128, in read_with_timeout
    raise TimeoutError("Timeout reached while reading from subprocess.\nCurrent buffer: {}\nRunning PIDs: {}".format(buffer.decode(), pids))
TimeoutError: Timeout reached while reading from subprocess.
Current buffer:
Running PIDs: [['1994', 'pip']]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/run.py", line 321, in <module>
    main(args)
  File "/app/run.py", line 157, in main
    env.reset_container()
  File "/app/sweagent/environment/swe_env.py", line 362, in reset_container
    self._reset_container()
  File "/app/sweagent/environment/swe_env.py", line 356, in _reset_container
    self._init_scripts()
  File "/app/sweagent/environment/swe_env.py", line 403, in _init_scripts
    self.communicate_with_handling(
  File "/app/sweagent/environment/swe_env.py", line 485, in communicate_with_handling
    logs = self.communicate(input, timeout_duration=timeout_duration)
  File "/app/sweagent/environment/swe_env.py", line 468, in communicate
    output = self._communicate(
  File "/app/sweagent/environment/swe_env.py", line 437, in _communicate
    raise e
  File "/app/sweagent/environment/swe_env.py", line 434, in _communicate
    exit_code = read_with_timeout(self.container, self.get_pids, 5).strip()
  File "/app/sweagent/environment/utils.py", line 128, in read_with_timeout
    raise TimeoutError("Timeout reached while reading from subprocess.\nCurrent buffer: {}\nRunning PIDs: {}".format(buffer.decode(), pids))
TimeoutError: Timeout reached while reading from subprocess.
Current buffer: 0

Running PIDs: []

@klieret
Member

klieret commented Apr 11, 2024

@Amon1412 Please open a separate bug report.

@klieret
Member

klieret commented Apr 12, 2024

Could it be that you didn't build/pull the swe-agent container? (step 2 of the express setup instructions or running setup.sh with the conda instructions)

@twiecki

@twiecki
Author

twiecki commented Apr 13, 2024

Yes, that was indeed the problem, and pulling the image fixed it. I now have a different problem, for which I'll open a new issue. Thanks!

@twiecki twiecki closed this as completed Apr 13, 2024
@PierrunoYT PierrunoYT mentioned this issue May 3, 2024
5 tasks
ethanabrooks pushed a commit to reflectionai/SWE-agent that referenced this issue Jun 21, 2024
Update README.md

Update make_demos README

Update make_demos README

Add demonstration trajectories

Add support for ollama models

Fix setup.py type

Fix "idented" typo

Update README.md

Added link to GitHub token explanation

Update README.md

Fix broken links in readme (princeton-nlp#6)

Typo fix readme (princeton-nlp#19)

immensly -> immensely

Add correspondence

fix: allow token from keys.cfg to get passed to ghapi (princeton-nlp#31)

Fix unbound variable in error handling (princeton-nlp#32)

More helpful error message if docker is not running (princeton-nlp#33)

See princeton-nlp#20

chore: remove gnureadline dependency (princeton-nlp#12)

Doc: add TOGETHER_API_KEY to keys.cfg section of README (princeton-nlp#34)

I noticed there is also a `TOGETHER_API_KEY` key that can be set in `keys.cfg`, but it wasn't mentioned in the README, so wanted to add it:

https://github.com/princeton-nlp/SWE-agent/blob/6c9ebf0ea8a263806b276da7ba3b1eda1f4a9475/sweagent/agent/models.py#L509-L511

Fix typo omitted (princeton-nlp#45)

ommitted -> omitted

Increase portability of setup.sh; abort on failure

In reference to princeton-nlp#42

config_file is a required arg in run_replay.sh (princeton-nlp#48)

Fixes princeton-nlp#46

Handle with missing prompt_eval_count in Ollama (princeton-nlp#49)

Closes princeton-nlp#44

feat(models): natively support claude haiku (princeton-nlp#9)

fixed typo in config/README (princeton-nlp#55)

Update README.md

Add very basic pre-commit config (princeton-nlp#62)

Open PR to repository

More conditions to open PR; better commit msg; refactor

Refactor: Move open PR code to env

Remove debug messages; print PR URL; open PR as draft

Skip PR creation if there are associated commits

Refactor open-PR config and add override to skip if referenced

Allow to specify separate URL to push to a fork

Remove left-over prototyping code

Add trajectory to PR

Only allow overriding skip_if_commits_reference_issue on your own repo

Update run.py

Remove type hint to avoid flake8 false positive

Fix: Unexpected keyword 'split' in load_dataset (princeton-nlp#76)

Closes princeton-nlp#70

Fix: Allow run_replay with github URLs as data_path (princeton-nlp#58)

Closes princeton-nlp#47

feat: add support for azure openai (princeton-nlp#16)

* feat: add support for azure openai

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

* fix: feedback

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

* fix: add api_version

Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>

* docs: add azure openai version to readme

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

* style: fix formatting

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

---------

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>

Add try/catch around PatchSet creation in evaluation

Clean up run_replay

Fix searching for flag-like strings, e.g., search_file "--flag"

Update README.md

Containerize application (princeton-nlp#81)

Fix: Using docker images from dockerhub (princeton-nlp#85)

Add release script for dockerhub (princeton-nlp#86)

Fix docker setup: updated image names (princeton-nlp#87)

Add run via docker instructions to readme (princeton-nlp#90)

* Add run via docker instructions to readme

* Add note about windows

* Add proper hint styling

Small refactor: Add quicksart section (princeton-nlp#56)

* Restructure readme: quickstart before eval

* Remove mention of PR creation

Small style fixes to readme

Add note about windows with conda installation

Update README.md

Mention --open_pr flag

Update README.md

Update run.sh

Update run_from_url.sh

Update run.py default model arguments

Update default model arguments - greedy decoding and 3.00 per instance cost

Shell script highlighting in readme

Update README.md

Fix: Update default image name (princeton-nlp#102)

Doc: Consolidate containerized run examples

Add issue template

fix: bad newline getting sent on windows (princeton-nlp#79)

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

Make sure that keys.cfg doesn't get copied to Docker

Add templates for issues, pr

Doc: Remove leftover "click to expand box"

Fix release script: latest tag can already exist on dockerhub

Add docs for how to write your own commands

Mount keys.cfg within container

Workaround for princeton-nlp#109

Doc: Missing backslash

Improve bug report template (princeton-nlp#113)

Add template workflow diagram

Change doc_improvement to question

Warning about containers being only for arm64 at the moment

Code quality: Improve inference of return type

Add flag to raise exceptions in run.py

Forward unparsed arguments in run_replay.py to run.py

Fix: Unbound local variable/name shadowing

This probably only ran because of name shadowing

Do not leave python when calling run.py

This helps with debugging run_replay

Separately save patch files + some typing cleanup (princeton-nlp#126)

Closes princeton-nlp#41

Allow to configure openapi base url (princeton-nlp#118)

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Remove azure override of model name (princeton-nlp#127)

Add pre-commit badge

Add markdown link checker (princeton-nlp#129)

* Add markdown link checker

* Fix & ignore broken markdown links

Add markdown link checker badge

Add run_replay integration test

Add CI with github actions

Add CI badge

Fix: Choosing TogetherAI models (princeton-nlp#130)

Closes 101

Revert "Remove azure override of model name (princeton-nlp#127)"

This reverts commit 311467c.

See discussion in princeton-nlp#127

Advertise experimental amd64 docker builds

Fix typo in server.py

seperately -> separately

Update README.md - move badges to bottom

Improve bug template

Improve bug report template

Improve bug report template

Improve bug report template

Better link for issue formatting

Upload coverage data to codecov (princeton-nlp#140)

Add codecov config and badge

chore: update pre-commit hooks (princeton-nlp#141)

updates:
- [github.com/pre-commit/pre-commit-hooks: v4.5.0 → v4.6.0](pre-commit/pre-commit-hooks@v4.5.0...v4.6.0)
- [github.com/pycqa/flake8: 4.0.1 → 7.0.0](PyCQA/flake8@4.0.1...7.0.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

multiplatform docker builds (princeton-nlp#131)

* Select the right conda path from within the container

* Build multiplatform images

Improve test coverage (princeton-nlp#142)

Doc: Remove architecture notice for docker

Update README.md - change LLM to LM :)

Update README.md

Add Ollama support section

Update README.md

Update README.md

Update README.md

Add ollama link

Increase coverage of swe-env tests (princeton-nlp#154)

Fix typo in README.md (princeton-nlp#155)

typo in `docker built -t sweagent/swe-agent-run:latest .` corrected to `build`

[skip-CI]

Doc style: Use GH markdown admonitions

Doc: More installation hints

Doc fix: Change wording (docker socket)

Issues: Add 'question' label to questions; distinguish from bug

Issue templates: 'question' label; disam from bugs

Improve error handling of docker issues (princeton-nlp#165)

Closes princeton-nlp#114
Closes princeton-nlp#123
Closes princeton-nlp#159

Fix: Correctly catch docker connection errors

Allow to supply installation commands when running on gh issues (princeton-nlp#153)

* Allow to supply installation commands when running on gh issue

* Add doc for env specification

Issue template: Two more checkboxes for dupes/version

CI: Test OpenAI model (princeton-nlp#166)

Minor improvements for models.py

* refactor: Simple refactoring for clean code

* change the fstring issue for flake8

* Fix up prefix matching issue

* resolve conflicts

* update the model list

Fix warnings about simple_parsing import paths (princeton-nlp#176)

Fix signature of ParseCommandDetailed (princeton-nlp#177)

Simple typing improvements

Use ruff and enable some more checks (princeton-nlp#174)

* Check for unused imports and variables

* Fix some issues

* Remove some more unneeded imports

* Switch to using ruff for checks

* Remove two more imports

Update evaluation to reflect swebench `get_model_report`

Remove left-over debug statements

Test creation of persistent container (princeton-nlp#184)

Typing fixes & improvements (princeton-nlp#187)

Make github token fully optional (princeton-nlp#189)

Closes princeton-nlp#152

Improve --help message option headers (princeton-nlp#192)

The docstrings of the argument dataclasses are also used in the --help
message. If they aren't set, the signature of the dataclass is shown
instead.

Update README

nit: typos (princeton-nlp#212)

Update README.md

No need to specify platform in docker pull (princeton-nlp#210)

Signed-off-by: 勇里 <yongli.zzp@antgroup.com>

No need to specify platform in docker command

Fix: undefined local var replay_task_instances_path

Make patch note more noticeable (princeton-nlp#214)

* WIP

* More noticeable message about patch file being produced

Closes princeton-nlp#206

test: add tests for parsing functions (princeton-nlp#218)

* test: add tests for parsing functions

* refactore: fix redundant arguments

chore(models): simplify conditions and fix return types (princeton-nlp#216)

* chore(models): simplify conditions and fix return types

* undo formatting

---------

Co-authored-by: pmprones <massimiliano.pronesti@amadeus.com>

Rename is_from_github_url and minor typing fixes

Add --problem_statement flag

Allow to run on local repository

Git apply patch if running locally

Test running on local repo

Use --data_path for local problem stmts and --repo_path for local repos

Various fixes and improved tests for swe-env

Make instance a dataclass

Care was taken to add any missing fields to not break with old
datafiles.

Revert "Make instance a dataclass"

This reverts commit 97bf5e3.

Do not introduce dataclass

Fix: Throw ValueError if local repo is dirty

Test replay of batch mode

Mention local run in readme

Bump version

Fix opening PR from fork (princeton-nlp#229)

Fix opening PR from fork

Add changelog

Tests to use fast experimental communication strategy (princeton-nlp#230)

chore: update pre-commit hooks (princeton-nlp#231)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.3.7](astral-sh/ruff-pre-commit@v0.3.5...v0.3.7)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Fix pypi package installation command

Update to evaluation logic

Doc: missing 'no' in error message about --open_pr

Better error handling for --open_pr (princeton-nlp#239)

Closes princeton-nlp#237

Speed up testing with persistent containers & remove them end of session (princeton-nlp#238)

Closes princeton-nlp#228
Closes princeton-nlp#201

Do not attempt to save patch with empty patch (princeton-nlp#242)

* Fixed a potential error

I've run into this error several times: it says model_patch can't be None and ends the entire program.

* Do not attempt to save patch with empty patch

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Readme: GH token is optional

Add usage doc to run.py (princeton-nlp#243)

Remove debug print statement with experimental communicate

Update authors

fix: TARGETARCH not set on some OS/docker setups (princeton-nlp#249)

Add GPT4-turbo model (princeton-nlp#252)

Update authors

Add isolated flag to flake8 linting

Add isolated flag to flake8 linting

Fix typo - "doensn't" in templates (princeton-nlp#254)

Fix typo - "succesfully" in templates (princeton-nlp#255)

Catch one more docker error if docker isn't running (princeton-nlp#257)

Refactor run.py main function into class with hook structure (princeton-nlp#253)

* WIP

* Refactor run.py into class with hook structure

Closes princeton-nlp#170

* Add some more unit tests

* Some more tests

Added support for Bedrock-provided Claude models

Refactored to AnthropicModel and BedrockModel to avoid code duplication; Added custom error messages

Added Claude 3 Opus
https://aws.amazon.com/blogs/aws/anthropics-claude-3-opus-model-on-amazon-bedrock/

Fixed model name logic and typing bugs; Added missing return statements

Fixed None submission bug

Fixed token-counting for older models with Bedrock
anthropics/anthropic-sdk-python#353

Added max_tokens_to_sample for older models to avoid Bedrock val errors; Changed anthropic_history_to_messages output type

Added missing rich_argparse pkg

Change from claude 2 to claude 2.0 (see anthropics/anthropic-sdk-python#255)

Changed alias name (claude --> claude-2) and target (claude-2.0 --> claude-2.1)

pkg: merge all packaging stuff into pyproject.toml (princeton-nlp#256)

* pkg: merge all packaging stuff into pyproject.toml

* Add trivial test for packaging

* Add Carlos' email to packaging

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Use legacy API for claude-2.1

Thanks to @mikanfactory for spotting this!

Add hooks to agent (princeton-nlp#258)

* Add hooks to agent

* Test hook & fix non-running other tests

Update defaults.sh - scroll_down was misnamed

Use a shorter timeout duration for tests (princeton-nlp#264)

Adding more hooks to env and agent (princeton-nlp#265)

Update defaults and add last_5_history configs

chore: update pre-commit hooks (princeton-nlp#268)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.3.7 → v0.4.1](astral-sh/ruff-pre-commit@v0.3.7...v0.4.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Pass Python version to get_environment_yml

This ensures that the `environment.yml` is correctly constructed with the
specific Python version required for the instance.

Update swe_env.py

Replicates installation behavior from SWE-bench at https://github.com/princeton-nlp/SWE-bench/blob/cfb20092bbbee9683176177b2f59b85f522e7f27/swebench/harness/context_manager.py#L354-L376

Minor condition changes

Update edit_linting.sh - fix grammar issue

Update cursors_edit_linting.sh - fix grammar issue

Fix Together model validation error (princeton-nlp#236)

* test: add unit test for Together model

* fix: deal with the new Together API

* chore: specify together version

* refactor: clean code

* change together model versioning from ">=~" to ">=" and write comment

* raise exception when together SDK version is below 1.1.0

* refactor: update unit test format

* specify max_tokens

chore: update pre-commit hooks (princeton-nlp#282)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.1 → v0.4.2](astral-sh/ruff-pre-commit@v0.4.1...v0.4.2)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

WIP: Create GH codespaces

Codespaces: Fix permissions for talking to docker daemon

Codespaces: Pull swe-agent image; conda init

Codespace: Automatically activate swe-agent env

Codespaces: Fix: don't overwrite bashrc (princeton-nlp#288)

[Skip-ci]

Update README.md

Codespaces: Run additional setup as onCreateCommand

Update devcontainer.json

Revert "Update devcontainer.json"

This reverts commit c8542e7.

Add helpful message about conda env activation (princeton-nlp#289)

Codespaces: Use pip install instead of creating new conda env (princeton-nlp#291)

Doc: Avoid invalid github token (princeton-nlp#292)

[skip-ci]

Improve codespace setup & documentation (princeton-nlp#293)

[skip-CI]

* Codespaces: Remove shell setting; fix extensions setting

[skip-ci]

* Codespaces: Copy sample keys.cfg

[skip-ci]

* Codespaces: Add codespace badge

[skip-CI]

Doc: Add codespace video

Codespace: Add startup message to terminal (princeton-nlp#294)

[skip-ci]

CI: Use pip for installation instead of conda (princeton-nlp#299)

* CI: Use pip for installation instead of conda

* Make sure that python is set up

docker ignore everything from gitignore

[skip-ci]

Setup: do not duplicate requirements (princeton-nlp#300)

* WIP

* Fix: Need to copy app first before pip install .

CI: Add GHA to test running setup.sh (princeton-nlp#302)

Fix readme badge links (princeton-nlp#303)

Enh: Allow to directly specify problem statement (princeton-nlp#308)

fix:typo

Fix: Include demonstrations in dockerignore (princeton-nlp#311)

[skip-ci]

Update README.md

chore: update pre-commit hooks (princeton-nlp#318)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.2 → v0.4.3](astral-sh/ruff-pre-commit@v0.4.2...v0.4.3)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

lint: use `typos` as precommit's hook (princeton-nlp#259)

* lint: use typos as precommit hook

* fixing typos

Doc: Recommend pip install instead of conda (princeton-nlp#304)

* Doc: Recommend pip install instead of conda

* Fix numbering

[skip-ci]

* Doc: Make installation with pip the default

Doc fix: Misleading comment about env vars with docker

Comment out all keys in sample keys.cfg by default

Update swe_env.py

fix typo

Doc: Fix links to installation issues section

[skip-ci]

Doc: Fix link to installation issues section

Web: Lay flask scaffolding

Do not use unix signal calls

Web: Can start runs from flask

Web: Split feed into two

Web: Use agent hooks

Web: Separate messages in feeds; markdown support

WIP

Web: Add prompts to feed

Web: Switch to using jquery

Web: Add step index and scroll to it

Web: Moved most of the interface to react

Web: Bring back highlighting

minor changes to server and client endpoints to better handle CORS

Web Fix: Every message to appear only once

Web feat: Restore scrolling behavior

Web feat: Kill running computation

Web: Rename folder web -> api

Web: Remove files from flask prototype

Web refactor: Split up server.py

Web feat: Display log messages (partially broken)

Unfortunately all threads share the same stdout, so it's not trivial at
all to redirect different threads to different stdouts
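
A minimal sketch of the problem described above, using only the standard library. The names `worker_a`, `worker_b`, and `captured` are hypothetical and not taken from the SWE-agent code; the point is only that `sys.stdout` is a process-wide object, so a redirect set up in one thread also catches output printed by another.

```python
import contextlib
import io
import threading

captured = io.StringIO()          # where thread A wants *its own* output to go
inside = threading.Event()        # set once thread A has redirected stdout
done = threading.Event()          # set once thread B has printed


def worker_a():
    # redirect_stdout swaps the global sys.stdout, not a per-thread one.
    with contextlib.redirect_stdout(captured):
        inside.set()
        done.wait()               # keep the redirect active while B prints


def worker_b():
    inside.wait()
    print("from thread B")        # lands in thread A's buffer
    done.set()


thread_a = threading.Thread(target=worker_a)
thread_b = threading.Thread(target=worker_b)
thread_a.start()
thread_b.start()
thread_a.join()
thread_b.join()

leaked = captured.getvalue()      # contains thread B's line
```

Separating the streams would presumably need something thread-aware (for example a proxy object installed as `sys.stdout` that dispatches on the current thread), which is why the commit calls this non-trivial.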

Web enh: Control button activity depending on run state

Web enh: Auto-scroll log messages

Web enh: Only scroll and highlight after computation is finished

Web enh: Make sure that killing thread succeeds

Web: Factor out Feed.js; fix highlighting of step == null

Web WIP: Started to integrate swe-agent/demo parts

Web WIP: Styling and refactoring

Web WIP: Split up message types

Web enh: Bring in some highlighting

Web feat: Include the rest of the demo code

minor refactor of the server to fix 403 code and also missing secret_key

Add requirements.txt: there are many version conflicts in the codebase, and it's hard to run the server without the correct versions. Adding the requirements standardizes future setup.

Web: Fix port of server for websocket

Web: Redirect all relevant stderr & handle errors in thread

Web: Rename feeds

Web: Add warning message if server is not connected

Web: Simple script to start web server

Codespace: Install npm

Web: Make sure that pm2 is found in cleanup method

Web: Factor out run control

Web: Allow different ways to specify PS; repo path; bootstrap

Web: Place controls in accordion

Web: Format test run checkbox as switch

Web fix: Reset highlighted step after running

Web: Add flask dependencies

disabled bubbles' scrolling and text color

Rearranged input elements

removed unnecessary elements

create copy function for log panel

change color for highlighted messages

Web: Replace accordion with tabs

Web: Various Styling improvements

Web fix: Checkbox default state not reflected

Web fix: Highlighting in terminal (restore linebreaks)

Web enh: Remove highlight if mouse leaves message

Web enh: Add timeout to highlight/scroll

Web enh: Run button layout; logo; remove header

Web: Add link to github readme

Web feat: Model selection

Web enh: Fix spacing of code blocks

Better messages for InstantEmptySubmitTestModel

Web: Remove "Thought" and fix info msg styling

Web enh: Add start message; style no connection error msg

Web style: Remove three dots; move logos into window bars

Web style: Descriptions for other text fields

Web ref: Move CSS to appropriate files

Web: Move swe-agent logo to top bar

Web: Font-size adjustments

Web: Minimize menu when run started

Web: Only show "Copy to clipboard" after run

Web: Show critical errors in top banner

Web: Show explicit support for local PS or repos

Web: Improve handling of container closing

Web: Assume compute has finished when 20s no update

Web: Always use experimental speedups

Web: Add note about successful pitch; real example by default

Web: Catch bug with empty observation

Web: Reformat code with prettier

Print helpful error message when flask isn't available

Close environment when raising exception

Web: Always raise exceptions

Web: Switch to silver logos

Web: Change title of agent feed

Web feat: Allow to specify python version & req pkgs

Web feat: Allow to specify path to shell script

Web: Temporarily disable timeout-based setIsComputing

Web feat: Set custom install command

Web style fix: Position of logo for narrow screens

Fix: Handling of long problem statements

Style: Black format api code

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Remove typo/comment

Fix: Handling gh issue URLs as problem statements

Doc: Add gif of web interface

[skip ci]

Doc: Add web UI instructions

[skip ci]

Fix typo

[skip ci]

Fix: Catch container not found and retry after wait

Fixes princeton-nlp#322

Doc: Add information of how to open correct browser window (princeton-nlp#324)

[skip ci]

Doc: Suggest starting web UI in GH codespaces

Update README.md - slight rewording of a header

Web: Fix script_path input (princeton-nlp#334)

Closes princeton-nlp#333

[skip ci]

Update README.md - updating bibtex

Update README.md

Update README.md

Readme: Fix links

[skip ci]

Improve handling of incorrect repo_path configs (princeton-nlp#340)

Always get base_commit hash (can be specified as tag/branch) (princeton-nlp#341)

Fix: Don't print patch msg for exit_cost patch (princeton-nlp#343)

Closes princeton-nlp#342

Add gpt-4o model (princeton-nlp#344)

Co-authored-by: Ray Myers <rmyers@indeed.com>

Fix: Do not request job control in bash (princeton-nlp#345)

Closes princeton-nlp#331

It's unlikely that job control was ever granted. Because we request it anyway, we're currently getting

ERROR    Unexpected container setup output: /bin/bash: cannot set terminal process group (-1): Inappropriate ioctl for device
         /bin/bash: no job control in this shell

Fix: --base_commit not used for gh urls (princeton-nlp#346)

chore: update pre-commit hooks (princeton-nlp#347)

updates:
- [github.com/crate-ci/typos: v1.20.7 → v1.21.0](crate-ci/typos@v1.20.7...v1.21.0)
- [github.com/astral-sh/ruff-pre-commit: v0.4.3 → v0.4.4](astral-sh/ruff-pre-commit@v0.4.3...v0.4.4)
- [github.com/pre-commit/mirrors-prettier:  → v4.0.0-alpha.8](pre-commit/mirrors-prettier@...v4.0.0-alpha.8)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Fix: Separate data path/traj dir cause exception (princeton-nlp#348)

Readme: Shorten ACI text

[skip ci]

Update README.md

Update README.md

Remove duplicated abstract method (princeton-nlp#355)

Web: Refactor state into one runConfig with use-immer (princeton-nlp#350)

Web: Allow to specify commit hash (princeton-nlp#358)

Closes princeton-nlp#336

CI: Use uv pip install (princeton-nlp#360)

* CI: Use uv pip install

* CI: Try with explicit virtual_env

Web: Shorten long error messages in banner (princeton-nlp#361)

Closes princeton-nlp#330

Wait longer if processes still running (princeton-nlp#364)

Closes princeton-nlp#363

Update default_sys-env_cursors_window100-detailed_cmd_format-full_history-1_demos.yaml - adding warning to experimental config

Update default_sys-env_cursors_window100-detailed_cmd_format-last_5_history-1_demos.yaml - adding warning to experimental config

Update xml_sys-env_cursors_window100-detailed_cmd_format-full_history-1_demos.yaml - adding warning to experimental config

Update xml_sys-env_cursors_window100-detailed_cmd_format-last_5_history-1_demos.yaml - adding warning to experimental config

Update README.md - clarify that traj arg has to be absolute path

Fix handling of not_generated/no_generation in inspector (princeton-nlp#332)

* Fix typo in inspector server.py

This leads to "Results format not recognized" error whenever viewing the eval report for a trajectory.

* Fix: Consistently handle no_generation vs not_generated

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Inspector: Better labels for roles (princeton-nlp#368)

Closes princeton-nlp#365

Change icons for trajectory viewer (princeton-nlp#370)

Closes princeton-nlp#365

Move documentation to mkdocs (princeton-nlp#371)

Docs: Add installation overview page (princeton-nlp#377)

Docs: Add github button; edit feature

Docs: Change color preferences

Docs: Add next prev/buttons

CI: Skip CI for PRs that only touch docs

Docs: Switch to documentation

Add default environment_setup config (princeton-nlp#351)

[skip ci]

Docs: Fix max-width tag of doc link

[skip ci]

Doc: Significantly expand CL tutorial

Doc: Restore docs on starting web UI on GH codespaces

Doc: Add copy button; highlight specific lines

Doc/CI: Speed up documentation build

Doc: Move config docs to mkdocs

CI: Set VIRTUAL_ENV for uv

Doc: Fix inclusion of image in config.md

Doc: Attempt to use relative image path

Doc: Add changelog

Closes princeton-nlp#335

Docs: Add more READMEs to mkdocs

Remind people not to use screenshots when reporting bugs

Remind people not to use screenshots for error messages

Upper bound request version to avoid docker-py bug (princeton-nlp#390)

Closes princeton-nlp#379

Doc: Replace symlinks with markdown files with links (princeton-nlp#392)

Closes princeton-nlp#388

Docs: Add search (princeton-nlp#393)

Closes princeton-nlp#387

Search is added by default but must be manually added if any other plugins are
configured

See https://github.com/squidfunk/mkdocs-material/blob/master/docs/setup/setting-up-site-search.md

Docs: Add code of conduct (princeton-nlp#394)

[skip ci]

Add nodejs to swe-agent-run container (princeton-nlp#396)

Docs: Note about old images from the hub (princeton-nlp#395)

Docs: Advice to update pip if unsuccessful (princeton-nlp#399)

Show error log if web server fails (princeton-nlp#400)

[skip ci]

CI: Fix passing python path to uv (princeton-nlp#401)

Docs: Detailed way to start the web server (princeton-nlp#402)

Docs: Use grids for prettier selections (princeton-nlp#403)

Doc: Avoid duplicate information

Docs: Add footer with links to report bugs (princeton-nlp#404)

Docs/CI: Install mkdocs-include-markdown-plugin

Improve question issue template

Update question issue template

Update question issue template

Update question issue template

Doc: Typo fix

Split between configuration and development (princeton-nlp#407)

Remove requests upper bound, add docker-py lower bound (princeton-nlp#406)

Closes princeton-nlp#391

deprecate action from get_submission (princeton-nlp#274)

Doc: Fix links to website pages (princeton-nlp#411)

Print trajectory path only at beginning/end (princeton-nlp#408)

Closes princeton-nlp#381

Fix: IndexError when replaying incomplete trajectories (princeton-nlp#410)

Closes princeton-nlp#124

Add dev dependencies (princeton-nlp#414)

Add dev notes (princeton-nlp#415)

Docs: Move contribution guide to root to help gh discover it

CI: Use github token during CI operations  (princeton-nlp#412)

Fixes princeton-nlp#405

Make use case for discord clearer

Enh: Suppress openai logging; improve formatting of stats (princeton-nlp#416)

Closes princeton-nlp#382

Tweaks to use swe-agent web UI from docker (princeton-nlp#423)

Speed up evaluation by caching task environments as docker images (princeton-nlp#317)

* cache task environment as docker images with separate tags

* save env vars inside the task image before docker commit, debug timing

* increase docker api timeout to afford long commits

* fix

* fix

* remove timing collection code

* some cleanup

* remove timings storage

* use close func to stop container

* address review comment, type hint

chore: update pre-commit hooks (princeton-nlp#424)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.4 → v0.4.5](astral-sh/ruff-pre-commit@v0.4.4...v0.4.5)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Add test for caching of task envs

Make cached image name depend only on relevant features

Document --cache_task_images

Doc: Port more content from readme to docs/ (princeton-nlp#427)

* Doc: Port more content from readme to docs/

* Fix links

Remove signal dependency (princeton-nlp#428)

Do not use select if running on Windows (princeton-nlp#429)

* Do not use select if running on Windows

* Test on windows

Ensure that uv is available in containers (princeton-nlp#431)

Use custom Config class to support env and keys.cfg (princeton-nlp#430)

* Use custom Config class to support env and keys.cfg

* Fix patching

* Doc: Document use of environment variables

* Doc: swap out env reference

Doc: Document running web server from docker container (princeton-nlp#426)

* Doc: Document running web server from docker container

* Fix link

Fix: Correct path to keys.cfg

Fix: Config doesn't take pathlib.Path (princeton-nlp#434)

Strip trailing whitespace & black formatting

Allow ruff to write fixes

[skip ci]

Sort imports

Code quality: Convert to make use of PEP 585 and PEP 604

CI: Add pyupgrade via ruff

Add more fixable ruff checks

Fix compatibility with main branch

Fix unittest by excluding test data from formatting

Doc: Add note about running tests (princeton-nlp#435)

Add flake8-errmsg to tests

Some more ruff checks

Format: Use trailing commas

CI: Add pytest rules

CI: Add flake8 simplify

Code qual: Some one-off fixes

Docs: Note about updates (princeton-nlp#438)

Remove direct imports in __init__.py; improve error handling of keys_config (princeton-nlp#436)

Doc: Add notes about merge-conflicts after formatting changes (princeton-nlp#439)

[skip CI]

Dev: Exclude format commits from showing up in git blame

[skip ci]

Bump version

[skip ci]

Doc: Update changelog (princeton-nlp#441)

CI: Release to dockerhub via github actions (princeton-nlp#440)

* CI: Release to dockerhub via github actions

* Checkout code

* Fix name

[skip ci]

* Run daily by midnight

* Doc: remove notice about later docker images

Doc: Add badge for container build

Doc: Document keywords of run.py (princeton-nlp#443)

Closes princeton-nlp#442

Doc: Fix links to paper

Doc: Fix broken formatting

Update README.md

Resolve relative paths to demonstrations and commands (princeton-nlp#444)

* Resolve relative paths to demonstrations

Closes princeton-nlp#225

* Resolve more paths relative to REPO_ROOT

* Allow to override config root

* Document

Docs: Links to good first issues/help wanted

Docs: Add more prominent note about formatting merge conflicts

Update citation

Doc: Add placeholder for updating forks

Docs: Add verbose notes about avoiding formatting merge conflicts (princeton-nlp#448)

* Docs: Add verbose notes about avoiding formatting merge conflicts

* Include report footer

Doc: Fix link to migration

Docs: Update link to fix formatting issues

Doc: Pull correct image for updating

Docs: Improve installation steps

Chore: Fix whitespace error

Update demonstrations.md

Update and rename faq.md to usage_faq.md

Improve landing page and add background section (princeton-nlp#458)

* Docs: Improve navigation from front page

* Docs: Improve landing page

* Fix link to changelog

Docs: Start to add API documentation (princeton-nlp#460)

Doc: Fix formatting and links

CI/Docs: Add mkdocstrings to dependencies

CI: Only run test build containers if changed (princeton-nlp#462)

Docs/CI: Fix docs build & run for PRs (princeton-nlp#461)

* CI: Always run mkdocs for testing

* Actually build

* Need to install complete dev

* Specify python root

* Fix link

Docs: Fix inclusion of code structure

Doc: Format fix

Ensure container_name is reset for non-persistent containers (princeton-nlp#463)

* Ensure container_name is reset for non-persistent containers

Might help with princeton-nlp#451

* Always draw new container name

Docs: Bring back some more ACI text

Fix: Raise unclassified exception; use from e (princeton-nlp#464)

* Fix: Raise unclassified exception; use from e

* Improve exception logging

Change run return_type default to "info_trajectory"; doc improvements (princeton-nlp#466)

* Change run return_type default to "info_trajectory"; doc improvements

* Doc: Ensure that all public methods have docstring stub

Otherwise not shown in docs

add swe env docstrings (princeton-nlp#468)

* Change run return_type default to "info_trajectory"; doc improvements

* Doc: Ensure that all public methods have docstring stub

Otherwise not shown in docs

* Doc: Add SWEEnv docstrings
ethanabrooks pushed a commit to reflectionai/SWE-agent that referenced this issue Jun 21, 2024
Add results + preview image

Fix website link

Update README.md

Update make_demos README

Update make_demos README

Add demonstration trajectories

Add support for ollama models

Fix setup.py type

Fix "idented" typo

Update README.md

Added link to GitHub token explanation

Update README.md

Fix broken links in readme (princeton-nlp#6)

Typo fix readme (princeton-nlp#19)

immensly -> immensely

Add correspondence

fix: allow token from keys.cfg to get passed to ghapi (princeton-nlp#31)

Fix unbound variable in error handling (princeton-nlp#32)

More helpful error message if docker is not running (princeton-nlp#33)

See princeton-nlp#20

chore: remove gnureadline dependency (princeton-nlp#12)

Doc: add TOGETHER_API_KEY to keys.cfg section of README (princeton-nlp#34)

I noticed there is also a `TOGETHER_API_KEY` key that can be set in `keys.cfg`, but it wasn't mentioned in the README, so wanted to add it:

https://github.com/princeton-nlp/SWE-agent/blob/6c9ebf0ea8a263806b276da7ba3b1eda1f4a9475/sweagent/agent/models.py#L509-L511

Fix typo omitted (princeton-nlp#45)

ommitted -> omitted

Increase portability of setup.sh; abort on failure

In reference to princeton-nlp#42

config_file is a required arg in run_replay.sh (princeton-nlp#48)

Fixes princeton-nlp#46

Handle missing prompt_eval_count in Ollama (princeton-nlp#49)

Closes princeton-nlp#44

feat(models): natively support claude haiku (princeton-nlp#9)

fixed typo in config/README (princeton-nlp#55)

Update README.md

Add very basic pre-commit config (princeton-nlp#62)

Open PR to repository

More conditions to open PR; better commit msg; refactor

Refactor: Move open PR code to env

Remove debug messages; print PR URL; open PR as draft

Skip PR creation if there are associated commits

Refactor open-PR config and add override to skip if referenced

Allow to specify separate URL to push to a fork

Remove left-over prototyping code

Add trajectory to PR

Only allow overriding skip_if_commits_reference_issue on your own repo

Update run.py

Remove type hint to avoid flake8 false positive

Fix: Unexpected keyword 'split' in load_dataset (princeton-nlp#76)

Closes princeton-nlp#70

Fix: Allow run_replay with github URLs as data_path (princeton-nlp#58)

Closes princeton-nlp#47

feat: add support for azure openai (princeton-nlp#16)

* feat: add support for azure openai

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

* fix: feedback

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

* fix: add api_version

Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>

* docs: add azure openai version to readme

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

* style: fix formatting

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

---------

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>

Add try/catch around PatchSet creation in evaluation

Clean up run_replay

Fix searching for flag-like strings, e.g., search_file "--flag"

Update README.md

Containerize application (princeton-nlp#81)

Fix: Using docker images from dockerhub (princeton-nlp#85)

Add release script for dockerhub (princeton-nlp#86)

Fix docker setup: updated image names (princeton-nlp#87)

Add run via docker instructions to readme (princeton-nlp#90)

* Add run via docker instructions to readme

* Add note about windows

* Add proper hint styling

Small refactor: Add quickstart section (princeton-nlp#56)

* Restructure readme: quickstart before eval

* Remove mention of PR creation

Small style fixes to readme

Add note about windows with conda installation

Update README.md

Mention --open_pr flag

Update README.md

Update run.sh

Update run_from_url.sh

Update run.py default model arguments

Update default model arguments - greedy decoding and 3.00 per instance cost

Shell script highlighting in readme

Update README.md

Fix: Update default image name (princeton-nlp#102)

Doc: Consolidate containerized run examples

Add issue template

fix: bad newline getting sent on windows (princeton-nlp#79)

Signed-off-by: Chapman Pendery <cpendery@microsoft.com>

Make sure that keys.cfg doesn't get copied to Docker

Add templates for issues, pr

Doc: Remove leftover "click to expand box"

Fix release script: latest tag can already exist on dockerhub

Add docs for how to write your own commands

Mount keys.cfg within container

Workaround for princeton-nlp#109

Doc: Missing backslash

Improve bug report template (princeton-nlp#113)

Add template workflow diagram

Change doc_improvement to question

Warning about containers being only for arm64 at the moment

Code quality: Improve inference of return type

Add flag to raise exceptions in run.py

Forward unparsed arguments in run_replay.py to run.py

Fix: Unbound local variable/name shadowing

This probably only ran because of name shadowing

Do not leave python when calling run.py

This helps with debugging run_replay

Separately save patch files + some typing cleanup (princeton-nlp#126)

Closes princeton-nlp#41

Allow to configure openapi base url (princeton-nlp#118)

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Remove azure override of model name (princeton-nlp#127)

Add pre-commit badge

Add markdown link checker (princeton-nlp#129)

* Add markdown link checker

* Fix & ignore broken markdown links

Add markdown link checker badge

Add run_replay integration test

Add CI with github actions

Add CI badge

Fix: Choosing TogetherAI models (princeton-nlp#130)

Closes 101

Revert "Remove azure override of model name (princeton-nlp#127)"

This reverts commit 311467c.

See discussion in princeton-nlp#127

Advertise experimental amd64 docker builds

Fix typo in server.py

seperately -> separately

Update README.md - move badges to bottom

Improve bug template

Improve bug report template

Improve bug report template

Improve bug report template

Better link for issue formatting

Upload coverage data to codecov (princeton-nlp#140)

Add codecov config and badge

chore: update pre-commit hooks (princeton-nlp#141)

updates:
- [github.com/pre-commit/pre-commit-hooks: v4.5.0 → v4.6.0](pre-commit/pre-commit-hooks@v4.5.0...v4.6.0)
- [github.com/pycqa/flake8: 4.0.1 → 7.0.0](PyCQA/flake8@4.0.1...7.0.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

multiplatform docker builds (princeton-nlp#131)

* Select the right conda path from within the container

* Build multiplatform images

Improve test coverage (princeton-nlp#142)

Doc: Remove architecture notice for docker

Update README.md - change LLM to LM :)

Update README.md

Add Ollama support section

Update README.md

Update README.md

Update README.md

Add ollama link

Increase coverage of swe-env tests (princeton-nlp#154)

Fix typo in README.md (princeton-nlp#155)

typo in `docker built -t sweagent/swe-agent-run:latest .` corrected to `build`

[skip-CI]

Doc style: Use GH markdown admonitions

Doc: More installation hints

Doc fix: Change wording (docker socket)

Issues: Add 'question' label to questions; distinguish from bug

Issue templates: 'question' label; disam from bugs

Improve error handling of docker issues (princeton-nlp#165)

Closes princeton-nlp#114
Closes princeton-nlp#123
Closes princeton-nlp#159

Fix: Correctly catch docker connection errors

Allow to supply installation commands when running on gh issues (princeton-nlp#153)

* Allow to supply installation commands when running on gh issue

* Add doc for env specification

Issue template: Two more checkboxes for dupes/version

CI: Test OpenAI model (princeton-nlp#166)

Minor improvements for models.py

* refactor: Simple refactoring for clean code

* change the fstring issue for flake8

* Fix up prefix matching issue

* resolve conflicts

* update the model list

Fix warnings about simple_parsing import paths (princeton-nlp#176)

Fix signature of ParseCommandDetailed (princeton-nlp#177)

Simple typing improvements

Use ruff and enable some more checks (princeton-nlp#174)

* Check for unused imports and variables

* Fix some issues

* Remove some more unneeded imports

* Switch to using ruff for checks

* Remove two more imports

Update evaluation to reflect swebench `get_model_report`

Remove left-over debug statements

Test creation of persistent container (princeton-nlp#184)

Typing fixes & improvements (princeton-nlp#187)

Make github token fully optional (princeton-nlp#189)

Closes princeton-nlp#152

Improve --help message option headers (princeton-nlp#192)

The docstrings of the argument dataclasses are also used in the --help
message. If they aren't set, the signature of the dataclass is shown
instead.
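
The fallback mentioned above comes from the standard library: when a dataclass has no docstring, `dataclasses` fills in `__doc__` with the class name and its generated signature. A minimal sketch, with hypothetical class names that are not from the SWE-agent code base:

```python
from dataclasses import dataclass


@dataclass
class DocumentedArgs:
    """Options that control the model."""

    model_name: str = "gpt4"


@dataclass
class UndocumentedArgs:
    # No docstring: dataclasses generates one from the signature.
    model_name: str = "gpt4"


documented_header = DocumentedArgs.__doc__
undocumented_header = UndocumentedArgs.__doc__

print(documented_header)
print(undocumented_header)
```

Any help generator that reads `__doc__` would therefore show the signature text for the second class, which is what the commit addresses by adding docstrings.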

Update README

nit: typos (princeton-nlp#212)

Update README.md

No need to specify platform in docker pull (princeton-nlp#210)

Signed-off-by: 勇里 <yongli.zzp@antgroup.com>

No need to specify platform in docker command

Fix: undefined local var replay_task_instances_path

Make patch note more noticeable (princeton-nlp#214)

* WIP

* More noticeable message about patch file being produced

Closes princeton-nlp#206

test: add tests for parsing functions (princeton-nlp#218)

* test: add tests for parsing functions

* refactor: fix redundant arguments

chore(models): simplify conditions and fix return types (princeton-nlp#216)

* chore(models): simplify conditions and fix return types

* undo formatting

---------

Co-authored-by: pmprones <massimiliano.pronesti@amadeus.com>

Rename is_from_github_url and minor typing fixes

Add --problem_statement flag

Allow to run on local repository

Git apply patch if running locally

Test running on local repo

Use --data_path for local problem stmts and --repo_path for local repos

Various fixes and improved tests for swe-env

Make instance a dataclass

Care was taken to add any missing fields to not break with old
datafiles.

Revert "Make instance a dataclass"

This reverts commit 97bf5e3.

Do not introduce dataclass

Fix: Throw ValueError if local repo is dirty

Test replay of batch mode

Mention local run in readme

Bump version

Fix opening PR from fork (princeton-nlp#229)

Fix opening PR from fork

Add changelog

Tests to use fast experimental communication strategy (princeton-nlp#230)

chore: update pre-commit hooks (princeton-nlp#231)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.3.7](astral-sh/ruff-pre-commit@v0.3.5...v0.3.7)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Fix pypi package installation command

Update to evaluation logic

Doc: missing 'no' in error message about --open_pr

Better error handling for --open_pr (princeton-nlp#239)

Closes princeton-nlp#237

Speed up testing with persistent containers & remove them at end of session (princeton-nlp#238)

Closes princeton-nlp#228
Closes princeton-nlp#201

Do not attempt to save patch with empty patch (princeton-nlp#242)

* Fixed a potential error

I've run into this error several times: it says model_patch can't be None and ends the entire program.

* Do not attempt to save patch with empty patch

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Readme: GH token is optional

Add usage doc to run.py (princeton-nlp#243)

Remove debug print statement with experimental communicate

Update authors

fix: TARGETARCH not set on some OS/docker setups (princeton-nlp#249)

Add GPT4-turbo model (princeton-nlp#252)

Update authors

Add isolated flag to flake8 linting

Fix typo - "doensn't" in templates (princeton-nlp#254)

Fix typo - "succesfully" in templates (princeton-nlp#255)

Catch one more docker error if docker isn't running (princeton-nlp#257)

Refactor run.py main function into class with hook structure (princeton-nlp#253)

* WIP

* Refactor run.py into class with hook structure

Closes princeton-nlp#170

* Add some more unit tests

* Some more tests

Added support for Bedrock-provided Claude models

Refactored to AnthropicModel and BedrockModel to avoid code duplication; Added custom error messages

Added Claude 3 Opus
https://aws.amazon.com/blogs/aws/anthropics-claude-3-opus-model-on-amazon-bedrock/

Fixed model name logic and typing bugs; Added missing return statements

Fixed None submission bug

Fixed token-counting for older models with Bedrock
anthropics/anthropic-sdk-python#353

Added max_tokens_to_sample for older models to avoid Bedrock val errors; Changed anthropic_history_to_messages output type

Added missing rich_argparse pkg

Change from claude 2 to claude 2.0 (see anthropics/anthropic-sdk-python#255)

Changed alias name (claude --> claude-2) and target (claude-2.0 --> claude-2.1)

pkg: merge all packaging stuff into pyproject.toml (princeton-nlp#256)

* pkg: merge all packaging stuff into pyproject.toml

* Add trivial test for packaging

* Add Carlos' email to packaging

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Use legacy API for claude-2.1

Thanks to @mikanfactory for spotting this!

Add hooks to agent (princeton-nlp#258)

* Add hooks to agent

* Test hook & fix non-running other tests

Update defaults.sh - scroll_down was misnamed

Use a shorter timeout duration for tests (princeton-nlp#264)

Adding more hooks to env and agent (princeton-nlp#265)

Update defaults and add last_5_history configs

chore: update pre-commit hooks (princeton-nlp#268)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.3.7 → v0.4.1](astral-sh/ruff-pre-commit@v0.3.7...v0.4.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Pass Python version to get_environment_yml

This ensures that the `environment.yml` is correctly constructed with the
specific Python version required for the instance.

Update swe_env.py

Replicates installation behavior from SWE-bench at https://github.com/princeton-nlp/SWE-bench/blob/cfb20092bbbee9683176177b2f59b85f522e7f27/swebench/harness/context_manager.py#L354-L376

Minor condition changes

Update edit_linting.sh - fix grammar issue

Update cursors_edit_linting.sh - fix grammar issue

Fix Together model validation error (princeton-nlp#236)

* test: add unit test for Together model

* fix: deal with the new Together API

* chore: specify together version

* refactor: clean code

* change together model versioning from ">=~" to ">=" and write comment

* raise exception when together SDK version is below 1.1.0

* refactor: update unit test format

* specify max_tokens

chore: update pre-commit hooks (princeton-nlp#282)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.1 → v0.4.2](astral-sh/ruff-pre-commit@v0.4.1...v0.4.2)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

WIP: Create GH codespaces

Codespaces: Fix permissions for talking to docker daemon

Codespaces: Pull swe-agent image; conda init

Codespace: Automatically activate swe-agent env

Codespaces: Fix: don't overwrite bashrc (princeton-nlp#288)

[Skip-ci]

Update README.md

Codespaces: Run additional setup as onCreateCommand

Update devcontainer.json

Revert "Update devcontainer.json"

This reverts commit c8542e7.

Add helpful message about conda env activation (princeton-nlp#289)

Codespaces: Use pip install instead of creating new conda env (princeton-nlp#291)

Doc: Avoid invalid github token (princeton-nlp#292)

[skip-ci]

Improve codespace setup & documentation (princeton-nlp#293)

[skip-CI]

* Codespaces: Remove shell setting; fix extensions setting

[skip-ci]

* Codespaces: Copy sample keys.cfg

[skip-ci]

* Codespaces: Add codespace badge

[skip-CI]

Doc: Add codespace video

Codespace: Add startup message to terminal (princeton-nlp#294)

[skip-ci]

CI: Use pip for installation instead of conda (princeton-nlp#299)

* CI: Use pip for installation instead of conda

* Make sure that python is set up

docker ignore everything from gitignore

[skip-ci]

Setup: do not duplicate requirements (princeton-nlp#300)

* WIP

* Fix: Need to copy app first before pip install .

CI: Add GHA to test running setup.sh (princeton-nlp#302)

Fix readme badge links (princeton-nlp#303)

Enh: Allow to directly specify problem statement (princeton-nlp#308)

fix:typo

Fix: Include demonstrations in dockerignore (princeton-nlp#311)

[skip-ci]

Update README.md

chore: update pre-commit hooks (princeton-nlp#318)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.2 → v0.4.3](astral-sh/ruff-pre-commit@v0.4.2...v0.4.3)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

lint: use `typos` as precommit's hook (princeton-nlp#259)

* lint: use typos as precommit hook

* fixing typos

Doc: Recommend pip install instead of conda (princeton-nlp#304)

* Doc: Recommend pip install instead of conda

* Fix numbering

[skip-ci]

* Doc: Make installation with pip the default

Doc fix: Misleading comment about env vars with docker

Comment out all keys in sample keys.cfg by default

Update swe_env.py

fix typo

Doc: Fix links to installation issues section

[skip-ci]

Doc: Fix link to installation issues section

Web: Lay flask scaffolding

Do not use unix signal calls

Web: Can start runs from flask

Web: Split feed into two

Web: Use agent hooks

Web: Separate messages in feeds; markdown support

WIP

Web: Add prompts to feed

Web: Switch to using jquery

Web: Add step index and scroll to it

Web: Moved most of the interface to react

Web: Bring back highlighting

minor changes to server and client endpoints to better handle CORS

Web Fix: Every message to appear only once

Web feat: Restore scrolling behavior

Web feat: Kill running computation

Web: Rename folder web -> api

Web: Remove files from flask prototype

Web refactor: Split up server.py

Web feat: Display log messages (partially broken)

Unfortunately all threads share the same stdout, so it's not trivial at
all to redirect different threads to different stdouts
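
The difficulty is that `sys.stdout` is a single process-wide object, so replacing it in one thread replaces it for all of them. One possible workaround, shown here only as a sketch and not as what the web server does, is a proxy that dispatches each write to a buffer owned by the calling thread. `ThreadAwareStdout` and `run_workers` are hypothetical names introduced for this illustration:

```python
import io
import sys
import threading


class ThreadAwareStdout:
    """Hypothetical proxy that routes writes to a per-thread buffer."""

    def __init__(self, fallback):
        self._fallback = fallback
        self._buffers = {}
        self._lock = threading.Lock()

    def register(self) -> io.StringIO:
        """Create and return a buffer for the calling thread."""
        buffer = io.StringIO()
        with self._lock:
            self._buffers[threading.get_ident()] = buffer
        return buffer

    def write(self, text: str) -> int:
        # Unregistered threads keep writing to the original stream.
        with self._lock:
            target = self._buffers.get(threading.get_ident(), self._fallback)
        return target.write(text)

    def flush(self) -> None:
        self._fallback.flush()


def run_workers(names):
    """Run one thread per name and return what each thread printed."""
    proxy = ThreadAwareStdout(sys.stdout)
    captured = {}

    def worker(name):
        captured[name] = proxy.register()
        print(f"hello from {name}")

    original = sys.stdout
    sys.stdout = proxy
    try:
        threads = [threading.Thread(target=worker, args=(n,)) for n in names]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    finally:
        # Always restore the real stdout, even if a worker fails.
        sys.stdout = original
    return {name: buffer.getvalue() for name, buffer in captured.items()}


if __name__ == "__main__":
    print(run_workers(["agent", "env"]))
```

This only captures Python-level writes through `sys.stdout`; output written directly to file descriptor 1 (for example by subprocesses) would bypass it.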

Web enh: Control button activity depending on run state

Web enh: Auto-scroll log messages

Web enh: Only scroll and highlight after computation is finished

Web enh: Make sure that killing thread succeeds

Web: Factor out Feed.js; fix highlighting of step == null

Web WIP: Started to integrate swe-agent/demo parts

Web WIP: Styling and refactoring

Web WIP: Split up message types

Web enh: Bring in some highlighting

Web feat: Include the rest of the demo code

minor refactor of the server to fix 403 code and also missing secret_key

Add requirements.txt. There are many version conflicts in the codebase, and it's hard to run the server without the correct versions. Adding the requirements standardizes future setup.

Web: Fix port of server for websocket

Web: Redirect all relevant stderr & handle errors in thread

Web: Rename feeds

Web: Add warning message if server is not connected

Web: Simple script to start web server

Codespace: Install npm

Web: Make sure that pm2 is found in cleanup method

Web: Factor out run control

Web: Allow different ways to specify PS; repo path; bootstrap

Web: Place controls in accordion

Web: Format test run checkbox as switch

Web fix: Reset highlighted step after running

Web: Add flask dependencies

disabled bubbles' scrolling and text color

Rearranged input elements

removed unnecessary elements

create copy function for log panel

change color for highlighted messages

Web: Replace accordion with tabs

Web: Various Styling improvements

Web fix: Checkbox default state not reflected

Web fix: Highlighting in terminal (restore linebreaks)

Web enh: Remove highlight if mouse leaves message

Web enh: Add timeout to highlight/scroll

Web enh: Run button layout; logo; remove header

Web: Add link to github readme

Web feat: Model selection

Web enh: Fix spacing of code blocks

Better messages for InstantEmptySubmitTestModel

Web: Remove "Thought" and fix info msg styling

Web enh: Add start message; style no connection error msg

Web style: Remove three dots; move logos into window bars

Web style: Descriptions for other text fields

Web ref: Move CSS to appropriate files

Web: Move swe-agent logo to top bar

Web: Font-size adjustments

Web: Minimize menu when run started

Web: Only show "Copy to clipboard" after run

Web: Show critical errors in top banner

Web: Show explicit support for local PS or repos

Web: Improve handling of container closing

Web: Assume compute has finished after 20s without an update

Web: Always use experimental speedups

Web: Add note about successful pitch; real example by default

Web: Catch bug with empty observation

Web: Reformat code with prettier

Print helpful error message when flask isn't available

Close environment when raising exception

Web: Always raise exceptions

Web: Switch to silver logos

Web: Change title of agent feed

Web feat: Allow to specify python version & req pkgs

Web feat: Allow to specify path to shell script

Web: Temporarily disable timeout-based setIsComputing

Web feat: Set custom install command

Web style fix: Position of logo for narrow screens

Fix: Handling of long problem statements

Style: Black format api code

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Remove typo/comment

Fix: Handling gh issue URLs as problem statements

Doc: Add gif of web interface

[skip ci]

Doc: Add web UI instructions

[skip ci]

Fix typo

[skip ci]

Fix: Catch container not found and retry after wait

Fixes princeton-nlp#322

Doc: Add information of how to open correct browser window (princeton-nlp#324)

[skip ci]

Doc: Suggest starting web UI in GH codespaces

Update README.md - slight rewording of a header

Web: Fix script_path input (princeton-nlp#334)

Closes princeton-nlp#333

[skip ci]

Update README.md - updating bibtex

Update README.md

Update README.md

Readme: Fix links

[skip ci]

Improve handling of incorrect repo_path configs (princeton-nlp#340)

Always get base_commit hash (can be specified as tag/branch) (princeton-nlp#341)

Fix: Don't print patch msg for exit_cost patch (princeton-nlp#343)

Closes princeton-nlp#342

Add gpt-4o model (princeton-nlp#344)

Co-authored-by: Ray Myers <rmyers@indeed.com>

Fix: Do not request job control in bash (princeton-nlp#345)

Closes princeton-nlp#331

It's unlikely that job control was ever granted. Because bash requests it anyway, we're currently getting

ERROR    Unexpected container setup output: /bin/bash: cannot set terminal process group (-1): Inappropriate ioctl for device
         /bin/bash: no job control in this shell

Fix: --base_commit not used for gh urls (princeton-nlp#346)

chore: update pre-commit hooks (princeton-nlp#347)

updates:
- [github.com/crate-ci/typos: v1.20.7 → v1.21.0](crate-ci/typos@v1.20.7...v1.21.0)
- [github.com/astral-sh/ruff-pre-commit: v0.4.3 → v0.4.4](astral-sh/ruff-pre-commit@v0.4.3...v0.4.4)
- [github.com/pre-commit/mirrors-prettier:  → v4.0.0-alpha.8](pre-commit/mirrors-prettier@...v4.0.0-alpha.8)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Fix: Separate data path/traj dir cause exception (princeton-nlp#348)

Readme: Shorten ACI text

[skip ci]

Update README.md

Update README.md

Remove duplicated abstract method (princeton-nlp#355)

Web: Refactor state into one runConfig with use-immer (princeton-nlp#350)

Web: Allow to specify commit hash (princeton-nlp#358)

Closes princeton-nlp#336

CI: Use uv pip install (princeton-nlp#360)

* CI: Use uv pip install

* CI: Try with explicit virtual_env

Web: Shorten long error messages in banner (princeton-nlp#361)

Closes princeton-nlp#330

Wait longer if processes still running (princeton-nlp#364)

Closes princeton-nlp#363

Update default_sys-env_cursors_window100-detailed_cmd_format-full_history-1_demos.yaml - adding warning to experimental config

Update default_sys-env_cursors_window100-detailed_cmd_format-last_5_history-1_demos.yaml - adding warning to experimental config

Update xml_sys-env_cursors_window100-detailed_cmd_format-full_history-1_demos.yaml - adding warning to experimental config

Update xml_sys-env_cursors_window100-detailed_cmd_format-last_5_history-1_demos.yaml - adding warning to experimental config

Update README.md - clarify that traj arg has to be absolute path

Fix handling of not_generated/no_generation in inspector (princeton-nlp#332)

* Fix typo in inspector server.py

This leads to a "Results format not recognized" error whenever viewing the eval report for a trajectory.

* Fix: Consistently handle no_generation vs not_generated

---------

Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

Inspector: Better labels for roles (princeton-nlp#368)

Closes princeton-nlp#365

Change icons for trajectory viewer (princeton-nlp#370)

Closes princeton-nlp#365

Move documentation to mkdocs (princeton-nlp#371)

Docs: Add installation overview page (princeton-nlp#377)

Docs: Add github button; edit feature

Docs: Change color preferences

Docs: Add next prev/buttons

CI: Skip CI for PRs that only touch docs

Docs: Switch to documentation

Add default environment_setup config (princeton-nlp#351)

[skip ci]

Docs: Fix max-width tag of doc link

[skip ci]

Doc: Significantly expand CL tutorial

Doc: Restore docs on starting web UI on GH codespaces

Doc: Add copy button; highlight specific lines

Doc/CI: Speed up documentation build

Doc: Move config docs to mkdocs

CI: Set VIRTUAL_ENV for uv

Doc: Fix inclusion of image in config.md

Doc: Attempt to use relative image path

Doc: Add changelog

Closes princeton-nlp#335

Docs: Add more READMEs to mkdocs

Remind people not to use screenshots when reporting bugs

Remind people not to use screenshots for error messages

Upper bound request version to avoid docker-py bug (princeton-nlp#390)

Closes princeton-nlp#379

Doc: Replace symlinks with markdown files with links (princeton-nlp#392)

Closes princeton-nlp#388

Docs: Add search (princeton-nlp#393)

Closes princeton-nlp#387

Search is enabled by default but must be added explicitly once any other
plugins are configured.

See https://github.com/squidfunk/mkdocs-material/blob/master/docs/setup/setting-up-site-search.md

Docs: Add code of conduct (princeton-nlp#394)

[skip ci]

Add nodejs to swe-agent-run container (princeton-nlp#396)

Docs: Note about old images from the hub (princeton-nlp#395)

Docs: Advice to update pip if unsuccessful (princeton-nlp#399)

Show error log if web server fails (princeton-nlp#400)

[skip ci]

CI: Fix passing python path to uv (princeton-nlp#401)

Docs: Detailed way to start the web server (princeton-nlp#402)

Docs: Use grids for prettier selections (princeton-nlp#403)

Doc: Avoid duplicate information

Docs: Add footer with links to report bugs (princeton-nlp#404)

Docs/CI: Install mkdocs-include-markdown-plugin

Improve question issue template

Update question issue template

Update question issue template

Update question issue template

Doc: Typo fix

Split between configuration and development (princeton-nlp#407)

Remove requests upper bound, add docker-py lower bound (princeton-nlp#406)

Closes princeton-nlp#391

deprecate action from get_submission (princeton-nlp#274)

Doc: Fix links to website pages (princeton-nlp#411)

Print trajectory path only at beginning/end (princeton-nlp#408)

Closes princeton-nlp#381

Fix: IndexError when replaying incomplete trajectories (princeton-nlp#410)

Closes princeton-nlp#124

Add dev dependencies (princeton-nlp#414)

Add dev notes (princeton-nlp#415)

Docs: Move contribution guide to root to help gh discover it

CI: Use github token during CI operations  (princeton-nlp#412)

Fixes princeton-nlp#405

Make use case for discord clearer

Enh: Suppress openai logging; improve formatting of stats (princeton-nlp#416)

Closes princeton-nlp#382

Tweaks to use swe-agent web UI from docker (princeton-nlp#423)

Speed up evaluation by caching task environments as docker images (princeton-nlp#317)

* cache task environment as docker images with separate tags

* save env vars inside the task image before docker commit, debug timing

* increase docker api timeout to afford long commits

* fix

* fix

* remove timing collection code

* some cleanup

* remove timings storage

* use close func to stop container

* address review comment, type hint

chore: update pre-commit hooks (princeton-nlp#424)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.4 → v0.4.5](astral-sh/ruff-pre-commit@v0.4.4...v0.4.5)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Add test for caching of task envs

Make cached image name depend only on relevant features
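
The idea of deriving the cached image name only from relevant features can be sketched as hashing a filtered, canonically ordered view of the instance. Everything below is hypothetical (the function name, the `task-env-` prefix, and the feature keys are assumptions for illustration, not the repository's implementation):

```python
import hashlib
import json


def cached_image_tag(features: dict, relevant_keys: tuple) -> str:
    """Derive a deterministic tag from the relevant features only."""
    # Keep only the keys that influence the environment, in a stable order.
    relevant = {key: features.get(key) for key in sorted(relevant_keys)}
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return "task-env-" + hashlib.sha256(payload).hexdigest()[:16]


if __name__ == "__main__":
    keys = ("repo", "base_commit")  # assumed to be the relevant features
    first = {"repo": "org/project", "base_commit": "abc123", "issue": 1}
    second = {"repo": "org/project", "base_commit": "abc123", "issue": 2}
    # Both instances map to the same tag because "issue" is ignored.
    print(cached_image_tag(first, keys) == cached_image_tag(second, keys))
```

With a scheme like this, two task instances that differ only in irrelevant fields can reuse the same cached image.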

Document --cache_task_images

Doc: Port more content from readme to docs/ (princeton-nlp#427)

* Doc: Port more content from readme to docs/

* Fix links

Remove signal dependency (princeton-nlp#428)

Do not use select if running on Windows (princeton-nlp#429)

* Do not use select if running on Windows

* Test on windows
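
Background for this change: on Windows, Python's `select.select` accepts only sockets, so it cannot be used to poll pipes or other file descriptors. A minimal sketch of a platform guard follows; `read_if_ready` is a hypothetical helper, and the blocking fallback is an assumption rather than the project's actual strategy:

```python
import os
import select
import sys


def read_if_ready(fd: int, timeout: float = 0.1, size: int = 1024) -> bytes:
    """Return available bytes from fd, or b"" if nothing arrives in time.

    On Windows, select() works only on sockets, so this sketch falls back
    to a plain blocking read there.
    """
    if sys.platform == "win32":
        return os.read(fd, size)
    ready, _, _ = select.select([fd], [], [], timeout)
    return os.read(fd, size) if ready else b""


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    os.write(write_end, b"ok")
    print(read_if_ready(read_end))
```

The trade-off is that the Windows branch loses the timeout, so callers would need another way to avoid waiting indefinitely.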

Ensure that uv is available in containers (princeton-nlp#431)

Use custom Config class to support env and keys.cfg (princeton-nlp#430)

* Use custom Config class to support env and keys.cfg

* Fix patching

* Doc: Document use of environment variables

* Doc: swap out env reference

Doc: Document running web server from docker container (princeton-nlp#426)

* Doc: Document running web server from docker container

* Fix link

Fix: Correct path to keys.cfg

Fix: Config doesn't take pathlib.Path (princeton-nlp#434)

Strip trailing whitespace & black formatting

Allow ruff to write fixes

[skip ci]

Sort imports

Code quality: Convert to make use of PEP 585 and PEP 604

CI: Add pyupgrade via ruff

Add more fixable ruff checks

Fix compatibility with main branch

Fix unittest by excluding test data from formatting

Doc: Add note about running tests (princeton-nlp#435)

Add flake8-errmsg to tests

Some more ruff checks

Format: Use trailing commas

CI: Add pytest rules

CI: Add flake8 simplify

Code qual: Some one-off fixes

Docs: Note about updates (princeton-nlp#438)

Remove direct imports in __init__.py; improve error handling of keys_config (princeton-nlp#436)

Doc: Add notes about merge-conflicts after formatting changes (princeton-nlp#439)

[skip CI]

Dev: Exclude format commits from showing up in git blame

[skip ci]

Bump version

[skip ci]

Doc: Update changelog (princeton-nlp#441)

CI: Release to dockerhub via github actions (princeton-nlp#440)

* CI: Release to dockerhub via github actions

* Checkout code

* Fix name

[skip ci]

* Run daily by midnight

* Doc: remove notice about later docker images

Doc: Add badge for container build

Doc: Document keywords of run.py (princeton-nlp#443)

Closes princeton-nlp#442

Doc: Fix links to paper

Doc: Fix broken formatting

Update README.md

Resolve relative paths to demonstrations and commands (princeton-nlp#444)

* Resolve relative paths to demonstrations

Closes princeton-nlp#225

* Resolve more paths relative to REPO_ROOT

* Allow to override config root

* Document

Docs: Links to good first issues/help wanted

Docs: Add more prominent note about formatting merge conflicts

Update citation

Doc: Add placeholder for updating forks

Docs: Add verbose notes about avoiding formatting merge conflicts (princeton-nlp#448)

* Docs: Add verbose notes about avoiding formatting merge conflicts

* Include report footer

Doc: Fix link to migration

Docs: Update link to fix formatting issues

Doc: Pull correct image for updating

Docs: Improve installation steps

Chore: Fix whitespace error

Update demonstrations.md

Update and rename faq.md to usage_faq.md

Improve landing page and add background section (princeton-nlp#458)

* Docs: Improve navigation from front page

* Docs: Improve landing page

* Fix link to changelog

Docs: Start to add API documentation (princeton-nlp#460)

Doc: Fix formatting and links

CI/Docs: Add mkdocstrings to dependencies

CI: Only run test build containers if changed (princeton-nlp#462)

Docs/CI: Fix docs build & run for PRs (princeton-nlp#461)

* CI: Always run mkdocs for testing

* Actually build

* Need to install complete dev

* Specify python root

* Fix link

Docs: Fix inclusion of code structure

Doc: Format fix

Ensure container_name is reset for non-persistent containers (princeton-nlp#463)

* Ensure container_name is reset for non-persistent containers

Might help with princeton-nlp#451

* Always draw new container name

Docs: Bring back some more ACI text

Fix: Raise unclassified exception; use from e (princeton-nlp#464)

* Fix: Raise unclassified exception; use from e

* Improve exception logging

Change run return_type default to "info_trajectory"; doc improvements (princeton-nlp#466)

* Change run return_type default to "info_trajectory"; doc improvements

* Doc: Ensure that all public methods have docstring stub

Otherwise not shown in docs

add swe env docstrings (princeton-nlp#468)

* Change run return_type default to "info_trajectory"; doc improvements

* Doc: Ensure that all public methods have docstring stub

Otherwise not shown in docs

* Doc: Add SWEEnv docstrings