Security: sanitize/replace shell invocations to avoid injection

Potential command injection risk in shell invocations that include user-controlled values without shell-escaping.

Examples
- `openml_OS/helpers/api_helper.php` `validate_arff()`: builds a `sed` command using `$name` and `$did` to prepend info to an ARFF file:
  ```php
  $info = '% Data set "'.$name.'". ... d/'.$did;
  $string = '1s/^/'.$info.'\n/';
  $command2 = "sed -i -e '$string' $newUrl";
  exec(CMD_PREFIX . $command2, $res, $code);
  ```
  `$name` originates from dataset metadata and may contain characters that break quoting. This should avoid shell entirely.

- `openml_OS/controllers/Api_splits.php`: multiple `system()`/`exec()` calls build Java commands with request-derived inputs. Some checks (`is_safe`, `is_numeric`) exist, but defense-in-depth suggests escaping args or using `proc_open` with argv arrays.

Proposed fix
- Replace the `sed` call with pure PHP file I/O to prepend the info line.
- When shelling out to Java, strictly validate and/or escape all args (e.g., via `escapeshellarg`) and prefer argv arrays.
- Add unit tests covering edge-case names.

Acceptance criteria
- No use of unescaped user-provided values in shell commands.
- Prepend operation implemented without shell invocation.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Security: sanitize/replace shell invocations to avoid injection #1265

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Security: sanitize/replace shell invocations to avoid injection #1265

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions