Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize upload file name #606

Merged
merged 6 commits into from
Mar 20, 2024
Merged

Conversation

rumanzo
Copy link
Contributor

@rumanzo rumanzo commented Mar 6, 2024

We found a problem, that caused due too simple input normalize.

~/projects/transfer.sh main* ❯ echo test >'%21adasd'
~/projects/transfer.sh main* ❯ curl -v --upload-file "./%0A%0D" "http://localhost:8080/%0A%0D"                                                                                                                                                                                                                       
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> PUT /%0A%0D HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.4.0
> Accept: */*
> Content-Length: 5
>
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Server: Transfer.sh HTTP Server
< X-Made-With: <3 by DutchCoders
< X-Served-By: Proudly served by DutchCoders
< X-Url-Delete: http://localhost:8080/8lzy2BjFp3/%0A%0D/3HDeD5tY446rktKZ88fw
< Date: Wed, 06 Mar 2024 10:06:28 GMT
< Content-Length: 39
<
* Connection #0 to host localhost left intact
http://localhost:8080/8lzy2BjFp3/%0A%0D
~/projects/transfer.sh main* ❯ ls temp/8lzy2BjFp3                                                                                                                                                                                                                                                                   
??          ??.metadata
~/projects/transfer.sh main* ❯ curl 'http://localhost:8080/8lzy2BjFp3/%0A%0D' -v                                                                                                                                                                                                                                     
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /8lzy2BjFp3/%0A%0D HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: no-store
< Connection: keep-alive
< Content-Disposition: attachment; filename="  "
< Content-Length: 5
< Content-Type:
< Server: Transfer.sh HTTP Server
< Vary: Range, Referer, X-Decrypt-Password
< X-Made-With: <3 by DutchCoders
< X-Remaining-Days: n/a
< X-Remaining-Downloads: n/a
< X-Served-By: Proudly served by DutchCoders
< Date: Wed, 06 Mar 2024 15:55:26 GMT
<
test
* Connection #0 to host localhost left intact
~/projects/transfer.sh main* ❯ curl 'http://localhost:8080/8lzy2BjFp3/%0A%0D'  -H 'Accept: text/html' -v                                                                                                                                                                                                             
*   Trying [::1]:8080...
* connect to ::1 port 8080 failed: Connection refused
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /8lzy2BjFp3/%0A%0D HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.4.0
> Accept: text/html
>
< HTTP/1.1 500 Internal Server Error
< Content-Type: text/plain; charset=utf-8
< Server: Transfer.sh HTTP Server
< Vary: Range, Referer, X-Decrypt-Password
< X-Content-Type-Options: nosniff
< X-Made-With: <3 by DutchCoders
< X-Served-By: Proudly served by DutchCoders
< Date: Wed, 06 Mar 2024 15:55:18 GMT
< Content-Length: 65
<
runtime error: invalid memory address or nil pointer dereference
* Connection #0 to host localhost left intact

If we look into https://github.com/dutchcoders/transfer.sh/blob/main/server/handlers.go#L253 filename variable when we use GET method with HEADERS, we will see "\n\r" in variable, and it's lead to runtime error.
I realized filename normalization and trimming all newlines in user input in sanitize function

@rumanzo rumanzo changed the title Filenamenormalize Normalize upload file name Mar 6, 2024
server/token.go Outdated Show resolved Hide resolved
func sanitize(fileName string) string {
return path.Base(fileName)
t := transform.Chain(norm.NFD, runes.Remove(runes.In(unicode.Mn)), norm.NFC)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about unicode.Mn: we should allow diactrics, and as far as I can see this will drop.

In general I'm little lost in unicode composition/decomposition and compatibility/canonical

can you please explain what the transformation will look like for the following:

  • straße
  • tëst

Also we should apply sanitize everytime only in input, to remove extraneous character upon saving or processing

Especially is important to not introduce a breaking change when users will try to donwload files alread affected by the sanitization issue

Copy link
Contributor Author

@rumanzo rumanzo Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I did too agressive normalization.
with deleting unicode.Mn characters we get straße, without deleting straße
with deleting unicode.Mn characters we get test, without deleting tëst
list of characters https://www.fileformat.info/info/unicode/category/Mn/list.htm

Copy link
Contributor Author

@rumanzo rumanzo Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we should apply sanitize everytime only in input, to remove extraneous character upon saving or processing

As I can see you do this only on input (put, post, zip, gzip, etc). So previously uploaded file even with '\n\r' downloading after this commit.
Filenames can cantain a lot of unusual codes, even utf-16 or cesu8, and I afraid that it can cause problem in some cases, may be not with transfersh itself, but with http servers (frontends). I'm a little bit paranoic
I will do as you say, but removing newline characters is most important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I found way better than use regexp

	t := transform.Chain(
		norm.NFD,
		runes.Remove(runes.In(unicode.Cc)),
		runes.Remove(runes.In(unicode.Cf)),
		runes.Remove(runes.In(unicode.Co)),
		runes.Remove(runes.In(unicode.Other)),
		runes.Remove(runes.In(unicode.Zl)),
		norm.NFC)

https://pkg.go.dev/unicode#pkg-variables

How about this lists?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's good.

what' about adding unicode.Cs and unicode.Zp?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can see you do this only on input (put, post, zip, gzip, etc).

please, let me double check, when I looked briefly I was not to able to catch every usage :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's good.

what' about adding unicode.Cs and unicode.Zp?

I agree, it's a good offer

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, fine :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's good.

what' about adding unicode.Cs and unicode.Zp?

Added this to transform chain
Double check usage of sanitize funciton. I see usage only in virustotal, clamav and postHandler, putHandler, zipHandler and tarGzHandler

@aspacca
Copy link
Collaborator

aspacca commented Mar 7, 2024

hi @rumanzo , thanks for the PR, please, see my comments

@aspacca aspacca merged commit 1eecc22 into dutchcoders:main Mar 20, 2024
6 checks passed
@rumanzo rumanzo deleted the filenamenormalize branch March 22, 2024 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants