Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle context.newline correctly in tube.interactive() #2129

Merged
merged 7 commits into from
May 18, 2023

Conversation

peace-maker
Copy link
Member

The context.newline or self.newline variable isn't obeyed when typing in interactive mode. It is used when sending and receiving lines through tube.sendline though, causing a mismatch.

The receiving end of the tube.interactive() already has handling of newlines, but the sending side does not.

Example:

from pwn import *
io = process('cat')
io.newline = b'\r\n'
io.sendline(b'auto')
io.interactive()
$ python testinteractive.py DEBUG
[x] Starting local process '/usr/bin/cat' argv=[b'cat']
[+] Starting local process '/usr/bin/cat' argv=[b'cat'] : pid 19060
[DEBUG] Sent 0x6 bytes:
    b'auto\r\n'
[*] Switching to interactive mode
[DEBUG] Received 0x6 bytes:
    b'auto\r\n'
auto
$ test
[DEBUG] Sent 0x5 bytes:
    b'test\n'
[DEBUG] Received 0x5 bytes:
    b'test\n'
test

The expected outcome would be to send b'test\r\n'.

The same problem arises in non-term mode (PWNLIB_NOTERM=1), where stdin is read in binary mode causing the OS line seperators to come through. Correctly replacing them with the context.newline setting allows to use the interactive input on windows hosts as well, without constantly sending \r\n.

The `context.newline` or `self.newline` variable isn't obeyed when typing in interactive mode. It is used when sending and receiving lines through `tube.sendline` though, causing a mismatch.

The receiving end of the `tube.interactive()` already has handling of newlines, but the sending side does not.

Example:
```python
from pwn import *
io = process('cat')
io.newline = b'\r\n'
io.sendline(b'auto')
io.interactive()
```

```
$ python testinteractive.py DEBUG
[x] Starting local process '/usr/bin/cat' argv=[b'cat']
[+] Starting local process '/usr/bin/cat' argv=[b'cat'] : pid 19060
[DEBUG] Sent 0x6 bytes:
    b'auto\r\n'
[*] Switching to interactive mode
[DEBUG] Received 0x6 bytes:
    b'auto\r\n'
auto
$ test
[DEBUG] Sent 0x5 bytes:
    b'test\n'
[DEBUG] Received 0x5 bytes:
    b'test\n'
test
```

The expected outcome would be to send `b'test\r\n'.

The same problem arises in non-term mode (`PWNLIB_NOTERM=1`), where stdin is read in binary mode causing the OS line seperators to come through. Correctly replacing them with the `context.newline` setting allows to use the interactive input on windows hosts as well, without constantly sending `\r\n`.
@Arusekk
Copy link
Member

Arusekk commented Nov 1, 2022

I am not sure what is actually correct here: I believe we should send every byte as we get it regardless of OS when not in TERM mode (for example when the exploit is run non-interactively as python exploit.py < input.txt).

In TERM mode though, I agree we should send a context.newline on the Return keystroke. How to do it OS-agnostic is a mistery to me, though. I once thought that we should react to every LF and ignore every CR, but apparently vt100-compatible terminals only send a CR on a Return keystroke (this is further complicated on non-raw termios(3) settings by options like ICRNL/INLCR, IGNCR, not to mention OCRNL/ONLCR, ONOCR, ONLRET counterparts which are thankfully not relevant here). You say that cmd.exe sends CR+LF, and I bet there are some terminals that only send LF.

So I guess the solution we should pick is to ignore LFs sent directly after CRs?

@peace-maker
Copy link
Member Author

I'm not sure how commonly used those termios settings are. I'd expect data.replace(os.linesep, context.newline) to be enough for 99% of the cases and already better than the current behavior.

I agree with not messing with the input when not dealing with a tty.

Don't mess with the line endings when piping data through.
@peace-maker
Copy link
Member Author

So I guess the solution we should pick is to ignore LFs sent directly after CRs?

I don't understand. Do you want to always send CR if there's a CRLF?

@Arusekk
Copy link
Member

Arusekk commented Nov 4, 2022

I meant always send LF on LF and LF on CR, but if there is an LF directly after a CR, then ignore it. But if you have a better idea, then I am open to discuss it.

@peace-maker
Copy link
Member Author

The point of this PR was to send context.newline in interactive(), so always sending LF makes it a bit more consistent, but doesn't fix the initial issue? We could use your heuristic to detect newlines and replace them with context.newline. I've got a feeling that we're having communication problems here :D

@Arusekk
Copy link
Member

Arusekk commented Nov 7, 2022

Sure! I almost forgot the whole point, I never used the feature myself, because I really like to have control over the actual bytes in the protocol.

Anyway, the next thing I am afraid of is changing newlines on receiving from a tube vs changing newlines on sending to a tube.
As far as I am concerned, these should probably only be used in sendline/recvline functions.

Term handling is another thing, as I mentioned before. If an interactive tube is launched in TERM mode, then it should probably simply use sendline() every time it gets a logical line (i.e. confirmed by an Enter keystroke). Again, I am not sure what is the correct thing to do here. I will probably merge your changes when you say you are satisfied with them, but I am just pointing out some things I thought of.

@zachriggle
Copy link
Member

Maybe we can make this a parameter to e.g. .interactive(newline='\r\n')?

I don't want .recv(4096) to replace all \n with \r\n for example.

But yeah, .interactive should definitely append the correct value for self.newline

@yixinBC
Copy link

yixinBC commented Jan 30, 2023

I think I have the same issue but in another way.
Sometimes I use pwntools on my Windows,I write the script correctly to get the remote Linux shell,but when I send a line cat flagin my terminal(powershell backend),the response is that cat: 'flag'$'\r': No such file or directory(I must write .sendline(b"cat flag") before .interactive() in my script as substitute).That's really an annoyed thing.
If pwntools can automatically replace the input \r\n to \n, it will help me a lot.

@peace-maker
Copy link
Member Author

Sorry for the bit rot, I wasn't aware this was stuck on me! Reading through the messages again I feel like I don't understand some of the concerns since they seem to relate to other parts of the tube module that I'm not touching here. I'll try to sum up how I see the scope of this patch.

Anyway, the next thing I am afraid of is changing newlines on receiving from a tube vs changing newlines on sending to a tube. As far as I am concerned, these should probably only be used in sendline/recvline functions.

I expected the input in interactive to be keyboard-typeable characters only, always submitted by pressing enter. If stdin is not a tty, the newlines aren't changed, so piping input to your exploit won't be affected. I agree that the newline should only be replaced in the sendline/recvline functions. I'm assuming you don't enter non-printable characters through your keyboard in interactive() in term mode, so whenever the OS adds a newline, as a user I'd expect pwntools to use sendline to send that line and thus send the line ending configured in self.newline. This patch replaces the newline directly instead of stripping it and using sendline since you'd need to specialcase sending of non-tty stdin data to use normal send. The current implementation touches the minimal amount of lines.

Term handling is another thing, as I mentioned before. If an interactive tube is launched in TERM mode, then it should probably simply use sendline() every time it gets a logical line (i.e. confirmed by an Enter keystroke). Again, I am not sure what is the correct thing to do here. I will probably merge your changes when you say you are satisfied with them, but I am just pointing out some things I thought of.

I don't know of a good os-agnostic way to detect the Enter key. You can probably do that in the term module for unix somehow, but that's beyond me. The implementation uses term.readline.readline(), which returns the line including the line terminator. That might be possible to change, I didn't look at it. The other chunk of this patch is handling of newlines in tty input when term mode isn't supported or disabled - like when using a windows terminal which adds \r\n on enter. It looks for the os.linesep sequence and replaces it with the configured newline. I feel this is the simplest solution to handle different standard OS. The manual heuristic you suggested to detect the line seperator is only required if you configure your terminal to emit \r\n on e.g. linux, causing os.linesep to be wrong, but I don't know if that's worth the trouble yet?

I'm confident this won't break existing behavior and is a good enough first step to cover the basics of sending newlines in terminals. If there are problems with a terminal using some of the termios settings you mentioned earlier, those can be adressed with a test case in hand?

Maybe we can make this a parameter to e.g. .interactive(newline='\r\n')?

We could add @LocalContext to interactive to allow changing of context.newline as a parameter? Or rather newline = newline if newline else self.newline. Do you really need such a shorthand that often? I can't think of a scenario where you'd need to change tube.newline only for interactive() and change it back afterwards. I'd expect you to set it once and use it for all sendline/recvline calls in your script. I can add it regardless of course.

I don't want .recv(4096) to replace all \n with \r\n for example.

self.newline/tube.newline defaults to context.newline and is only used in sendline, recvline, and derivatives as well as the receiving side of interactive - at least before this patch. This PR only changes the newlines sent by interactive in term mode or when stdin is a tty to be the one specified in tube.newline as if you'd use sendline. So this change won't replace anything regarding recv.

@Arusekk
Copy link
Member

Arusekk commented May 9, 2023

Alright, I see it now. But I think this repeated read(1) might cause the thing to lag a bit, e.g. if you want to send some data without a newline; is it helpful in actual interactive mode? I don't know for sure, but I had several situations when I needed to send something confirming it with a Ctrl+D instead of a newline.
I think we can assume prefixes of os.linesep to be a newline read from a tty, we can also assume that when a newline is set at all, the input will always be line-based, but the default PWNLIB_NOTERM behavior should be IMO to leak the raw OS line separators through and feed every byte as soon as we read it; most line-based interactive things are tolerant against CR/LF/CRLF or even LFCR, so I guess leaving data as-is by default is sane. This disagrees a bit with context.newline being \n by default, so I am open to your suggestions there as well (maybe leave terminal without special treatment only if PWNLIB_NOTERM is set, context.newline is set to \n, and io.newline is not set to anything?).

@Arusekk
Copy link
Member

Arusekk commented May 9, 2023

Also, term should add context.newline on an Enter keystroke, not os.linesep nor \n (and maybe we can make it configurable, newline=self.newline like prompt='$ ').

@peace-maker
Copy link
Member Author

Alright, I see it now. But I think this repeated read(1) might cause the thing to lag a bit, e.g. if you want to send some data without a newline; is it helpful in actual interactive mode? I don't know for sure, but I had several situations when I needed to send something confirming it with a Ctrl+D instead of a newline.

The read(1) is present in the old code - did you experience lag previously?

if term.term_mode:
data = term.readline.readline(prompt = prompt, float = True)
else:
stdin = getattr(sys.stdin, 'buffer', sys.stdin)
data = stdin.read(1)

The new loop would exit on EOF/Ctrl+D as well, since stdin.read(1) would return an empty bytes object.

I think we can assume prefixes of os.linesep to be a newline read from a tty, we can also assume that when a newline is set at all, the input will always be line-based, but the default PWNLIB_NOTERM behavior should be IMO to leak the raw OS line separators through and feed every byte as soon as we read it; most line-based interactive things are tolerant against CR/LF/CRLF or even LFCR, so I guess leaving data as-is by default is sane. This disagrees a bit with context.newline being \n by default, so I am open to your suggestions there as well (maybe leave terminal without special treatment only if PWNLIB_NOTERM is set, context.newline is set to \n, and io.newline is not set to anything?).

True, the io.TextIOWrapper has complex normalization of newlines as well. Maybe we can mimic that behavior? Only leaking the raw line endings when NOTERM is set sounds weird to me. I'd expect the context.newline option to always work. Although this won't break the usecase of windows users running in a terminal which appends '\r\n', since NOTERM won't be set. I initially thought tube.interactive is separate from term.readline and NOTERM, but I guess it's more tightly coupled, since it's the only place where readline is used extensively. Why do you think NOTERM should leak the raw OS line separators? We're talking about the case when stdin is a tty, right?

Also, term should add context.newline on an Enter keystroke, not os.linesep nor \n (and maybe we can make it configurable, newline=self.newline like prompt='$ ').

Yes, appending context.newline sounds like a good change in term.readline. Then we don't need to change it in tube.interactive in term mode.

return force_to_bytes(buffer) + b'\n'

That's a bigger change than I can confidently reason about though. Should the newline parameter be added to the different readline wrappers like raw_input, str_input and eval_input as well? Python's default input() strips the newline before returning. The same is true for the real readline library. So pwntools returning a newline seems like a bug in the first place?

print(input("no pwntools: "))
from pwn import *
print(input("yes pwntools: "))
print(term.readline.readline(prompt="readline: "))
$ python test.py
no pwntools: wat
wat
yes pwntools: wat
wat

readline: wat
b'wat\n'

So maybe we should think about not returning a newline in term.readline instead of making it configurable instead? Thinking as I type this post 😆

@Arusekk
Copy link
Member

Arusekk commented May 10, 2023

Did you experience lag previously?

I do not mean a time lag, but a need for every piece to end with a newline. Suppose you type abc^D: it got sent before, now it waits for an extra newline or EOF; you now need to type abc^D^D, which was interpreted before as 'send abc and exit the interactive mode'. Now the same can only be done with abc^D^D^D.

Why do you think NOTERM should leak the raw OS line separators? We're talking about the case when stdin is a tty, right?

Even when stdin is a tty. I think we can leave the OS-default way of inputting control characters, like something^V^M^V^Jsomething else^V^Mand something^V^Jno terminator^D could send exactly the sequence, just like cat | PWNLIB_NOTERM=1 ./exploit.py being equivalent to PWNLIB_NOTERM=1 ./exploit.py maybe? Especially if the user does not change anything explicitly (like they have not set either context.newline nor tube.newline). If they have, I believe it should parse newlines anyway? I'd rather maybe not. Not sure.

I also think the readline() thing keeping its API consistent with stdlib would be great, so removing the + \n bit is a nice idea.

@peace-maker
Copy link
Member Author

Oh! Now I understand, I didn't use ^D to send input without newline before. It doesn't seem to work in term mode either.
Well, fuck, there's no good solution. Maybe some non-blocking read(1) and some timeout if we see a prefix of the os.linesep to wait for the rest of the linesep and replace it before sending?

stdin.readline() requires ^D^D as well to return the input without a newline btw, so this appears to be a quirk of this implementation explicitly relying on not handling newlines :D

@peace-maker
Copy link
Member Author

What do you think about this? Going by your #2129 (comment) + stopping readline from returning a \n.

        from pwnlib.args import term_mode
        try:
            os_linesep = os.linesep.encode()
            to_skip = b""
            while not go.isSet():
                if term.term_mode:
                    data = term.readline.readline(prompt=prompt, float=True)
                    if data:
                        data += self.newline
                else:
                    stdin = getattr(sys.stdin, "buffer", sys.stdin)
                    data = stdin.read(1)
                    # Keep OS's line separator if NOTERM is set and
                    # the user did not specify a custom newline
                    # even if stdin is a tty.
                    if sys.stdin.isatty() and (
                        term_mode
                        or context.newline != b"\n"
                        or self._newline is not None
                    ):
                        if to_skip:
                            if to_skip[:1] != data:
                                data = os_linesep[: -len(to_skip)] + data
                            else:
                                to_skip = to_skip[1:]
                                if to_skip:
                                    continue
                                data = self.newline
                        elif data and os_linesep.startswith(data):
                            if len(os_linesep) > 1:
                                to_skip = os_linesep[1:]
                                continue
                            data = self.newline

                if data:
                    try:
                        self.send(data)
                    except EOFError:
                        go.set()
                        self.info("Got EOF while sending in interactive")
                else:
                    go.set()
        except KeyboardInterrupt:
            self.info("Interrupted")
            go.set()

@Arusekk
Copy link
Member

Arusekk commented May 18, 2023

Great idea! I think I am happy enough with it, go ahead.

Only replace newlines in a tty if NOTERM wasn't explicitly set or
the `newline` setting was set.
The builtin `input()` doesn't include the newline.
@Arusekk Arusekk merged commit 58222cc into Gallopsled:dev May 18, 2023
@peace-maker peace-maker deleted the interactive_newline branch May 18, 2023 21:09
@peace-maker peace-maker mentioned this pull request Sep 4, 2023
gogo2464 pushed a commit to gogo2464/pwntools that referenced this pull request Sep 10, 2023
…d#2129)

* Handle `context.newline` correctly in `tube.interactive()`

The `context.newline` or `self.newline` variable isn't obeyed when typing in interactive mode. It is used when sending and receiving lines through `tube.sendline` though, causing a mismatch.

The receiving end of the `tube.interactive()` already has handling of newlines, but the sending side does not.

Example:
```python
from pwn import *
io = process('cat')
io.newline = b'\r\n'
io.sendline(b'auto')
io.interactive()
```

```
$ python testinteractive.py DEBUG
[x] Starting local process '/usr/bin/cat' argv=[b'cat']
[+] Starting local process '/usr/bin/cat' argv=[b'cat'] : pid 19060
[DEBUG] Sent 0x6 bytes:
    b'auto\r\n'
[*] Switching to interactive mode
[DEBUG] Received 0x6 bytes:
    b'auto\r\n'
auto
$ test
[DEBUG] Sent 0x5 bytes:
    b'test\n'
[DEBUG] Received 0x5 bytes:
    b'test\n'
test
```

The expected outcome would be to send `b'test\r\n'.

The same problem arises in non-term mode (`PWNLIB_NOTERM=1`), where stdin is read in binary mode causing the OS line seperators to come through. Correctly replacing them with the `context.newline` setting allows to use the interactive input on windows hosts as well, without constantly sending `\r\n`.

* Update CHANGELOG.md

* Only replace newlines in TTYs

Don't mess with the line endings when piping data through.

* Always send bytes immediately on receive

Only replace newlines in a tty if NOTERM wasn't explicitly set or
the `newline` setting was set.

* Don't return a newline after term.readline lines

The builtin `input()` doesn't include the newline.

* Remove bogous newline from CHANGELOG
peace-maker added a commit to peace-maker/pwntools that referenced this pull request Feb 14, 2024
This is a regression from Gallopsled#2129 which was based on a bogus test which mixed `input` and `readline` calls.

Only `input` (and `raw_input`) are documented to strip the newline, but `readline` itself should include it.

Fixes Gallopsled#2342
peace-maker added a commit that referenced this pull request Feb 16, 2024
* Fix readline omitting a trailing \n

This is a regression from #2129 which was based on a bogus test which mixed `input` and `readline` calls.

Only `input` (and `raw_input`) are documented to strip the newline, but `readline` itself should include it.

Fixes #2342

* Update CHANGELOG
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants