This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

curl/wget (need help) #11

Open
131 opened this issue Jun 16, 2018 · 6 comments

Comments

@131

131 commented Jun 16, 2018

I'm trying to put together a PR with a working curl/wget replacement (using /dev/tcp):

set -ex

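function __curl() {
  # Split "http://host[:port]/path" on "/" into scheme, host[:port] and path.
  read -r proto server path <<< "${1//// }"
  DOC=/${path// //}
  HOST=${server//:*}
  PORT=${server//*:}
  # Without a ":port" suffix HOST and PORT expand to the same string: default to 80.
  [[ "${HOST}" == "${PORT}" ]] && PORT=80

  exec 3<>"/dev/tcp/${HOST}/${PORT}"
  printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "${DOC}" "${HOST}" >&3
  # Skip the response headers (up to the first blank line), then copy the body.
  (while read -r line; do
    [[ "$line" == $'\r' ]] && break
  done && cat) <&3
  exec 3>&-
}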

__curl http://www.google.com/favicon.ico > mine.ico
md5sum mine.ico

Yet I'm stuck on the && cat step that handles the (binary) file body.
I'm sure I could use a new file descriptor and echo, but my bash skills end here 😞
I could link to and use a pure-bash approach [1], yet I'm sure there is something more elegant to do here.

[1] https://unix.stackexchange.com/questions/83926/how-to-download-a-file-using-just-bash-and-nothing-else-no-curl-wget-perl-et
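As a side note on the URL handling in the script above, the host/port/path split can also be done with prefix and suffix parameter expansion alone. This is only a sketch: the function name parse_url and the URL_HOST, URL_PORT and URL_PATH variables are hypothetical, the default port of 80 assumes plain HTTP, and IPv6 literals or user:password@ prefixes are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a URL into host, port and path using only
# parameter expansion (no external commands, no subshell).
parse_url() {
    local url=$1 rest hostport
    rest=${url#*://}             # drop the scheme, e.g. "http://"
    hostport=${rest%%/*}         # everything before the first "/"
    if [[ $rest == */* ]]; then
        URL_PATH=/${rest#*/}     # everything after the first "/"
    else
        URL_PATH=/               # no path given: request the root
    fi
    URL_HOST=${hostport%%:*}     # strip an optional ":port" suffix
    if [[ $hostport == *:* ]]; then
        URL_PORT=${hostport##*:}
    else
        URL_PORT=80              # assumed default for plain HTTP
    fi
}

parse_url "http://www.google.com/favicon.ico"
printf '%s %s %s\n' "$URL_HOST" "$URL_PORT" "$URL_PATH"
# prints: www.google.com 80 /favicon.ico
```

Unlike replacing every "/" with a space, this keeps query strings and trailing slashes in the path intact.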

@dylanaraps
Owner

dylanaraps commented Jun 16, 2018

I got it working. It's a little slow, as it requires two while loops. I'm going to work on making it faster, but for now it serves as an example. Usage: script url > file.

Example script:

#!/usr/bin/env bash
#
# Download a file in pure bash.

download() {
    IFS=/ read -r _ _ host query <<< "$1"

    # Send the HTTP request.
    exec 3<>"/dev/tcp/${host}/80"; {
        printf '%s\r\n%s\r\n\r\n' \
               "GET /${query} HTTP/1.0" \
               "Host: $host"
    } >&3

    # Strip the HTTP headers.
    while IFS= read -r line; do
        [[ "$line" == $'\r' ]] && break
    done <&3

    # Output the file.
    nul='\0'
    while IFS= read -d '' -r line || { nul=""; [[ -n "$line" ]]; }; do
        printf "%s%b" "$line" "$nul"
    done <&3

    exec 3>&-
}

download "$1"
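The "output the file" loop above can be tried without a network connection. The sketch below is a simplified variant of the same idea, not the script's exact code: the name copy_stream is hypothetical, and od is used only to inspect the result. It relies on read -d '' splitting the input at NUL bytes, so each chunk is NUL-free and can be stored in a variable; the NUL is then written back by printf.

```bash
#!/usr/bin/env bash
# Copy stdin to stdout using only builtins, preserving NUL bytes.
copy_stream() {
    local chunk
    # Each successful read returns the bytes up to (not including) a NUL.
    while IFS= read -r -d '' chunk; do
        printf '%s\0' "$chunk"   # re-emit the chunk plus the NUL that ended it
    done
    # At EOF read fails but still stores the unterminated tail in chunk.
    printf '%s' "$chunk"
}

# Round-trip some bytes containing NULs and a newline, then dump them.
printf 'ab\0cd\n\0\0ef' | copy_stream | od -c
```

Every NUL byte costs one loop iteration, which is one reason this is much slower than cat on binary files.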

@131
Author

131 commented Jun 16, 2018

The first loop being slow is acceptable, since it only discards a handful of header lines. What I can't understand is how the second loop (a simple cat!) can be so complicated, and hence, I guess, slow.

@dylanaraps
Owner

Bash is slow at file I/O and doesn't handle binary data very well. I'm sure it can be optimized, but I doubt this will ever be faster than wget/curl.

@131
Author

131 commented Jun 17, 2018

According to the "bash bible" (yours :p), a simple cat alternative might be:

file_data="$(<"file")"

Yet I can't make this work with my design, and I don't understand why.
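One likely reason, sketched below on a local file rather than a socket: command substitution cannot carry NUL bytes into a variable and also strips trailing newlines, so binary content does not survive the round trip. The helper name slurp_len and the scratch file path are hypothetical.

```bash
#!/usr/bin/env bash
# Report how many characters survive when a file is slurped with $(<file).
slurp_len() {
    local data
    # Group the assignment so the "ignored null byte" warning that newer
    # bash versions print on stderr can be silenced.
    { data=$(<"$1"); } 2>/dev/null
    printf '%s\n' "${#data}"
}

tmp=${TMPDIR:-/tmp}/slurp_demo.$$   # hypothetical scratch file
printf 'a\0b\n\n' > "$tmp"          # 5 bytes on disk: a NUL b LF LF
slurp_len "$tmp"                    # prints 2: only "a" and "b" survive
rm -f "$tmp"
```

For text files without trailing blank lines the difference is invisible, which is why the idiom works as a cat replacement there but not for something like favicon.ico.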

@dylanaraps
Owner

cat handles binary data correctly iirc; bash doesn't. A larger problem is that bash handles binary data and null bytes differently depending on which version you're using (in 4.4+, null bytes are skipped and never reach the variable).

@darnir

darnir commented Aug 9, 2018

All the other examples here make sense and can often be faster than invoking another program. However, in the case of networking, I think it makes sense to depend on the binaries, both for usability and performance.

In the case of wget/curl replacements, all of these only work when you have an HTTP endpoint. This code is not going to work for HTTPS.
