Initial import.
VictorLowther committed Feb 17, 2008
0 parents commit 04a07ed
Showing 7 changed files with 712 additions and 0 deletions.
1 change: 1 addition & 0 deletions INSTALL
@@ -0,0 +1 @@
Just copy s3 and hmac into a directory in your $PATH.
339 changes: 339 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

35 changes: 35 additions & 0 deletions README
@@ -0,0 +1,35 @@
NOTE: THIS LOCATION IS UNMAINTAINED -- I HAVE SWITCHED TO USING GIT.

Current code @ git://fnordovax.org/another-s3-bash/ or
http://git.fnordovax.org/another-s3-bash/

Scripts for performing basic file manipulation on Amazon S3.

s3 -- basic S3 functionality. Implements the following commands:
put: put a file onto S3
get: copy a file from S3
ls: list the contents of an S3 bucket
test: check whether a file exists on S3 (uses a HEAD request)
buckets: list buckets
rm: remove a file from S3
rmrf: remove all files from a bucket on S3

All commands follow the same format:
s3 cmd bucket remote_name local_name (if different from remote_name)
You will need two environment variables set:
S3_ACCESS_KEY_ID: your S3 access key id.
S3_SECRET_ACCESS_KEY: the name of the file that contains your S3 secret key.

s3 only prints diagnostics if prerequisites are not met,
so check the shell return codes.
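
A minimal setup sketch. The key file location, the access key id, and the
secret below are placeholders made up for this example, not real
credentials. s3 refuses to run unless the secret key file is exactly
40 bytes long, so write it with printf rather than echo to avoid a
trailing newline.

```shell
# Placeholder values: substitute your own credentials.
keyfile="${TMPDIR:-/tmp}/s3-secret-key.example"      # hypothetical location
secret='0123456789abcdefghijklmnopqrstuvwxyzABCD'    # 40 placeholder characters

# printf '%s' writes no trailing newline, so the file is exactly 40 bytes.
# The umask keeps the file readable by its owner only.
( umask 077; printf '%s' "${secret}" > "${keyfile}" )

export S3_ACCESS_KEY_ID='AKIDEXAMPLE'                # placeholder
export S3_SECRET_ACCESS_KEY="${keyfile}"

# s3 prints little on failure, so test the return code explicitly.
# (Commented out because it needs real credentials and network access.)
# s3 put mybucket notes.txt || echo "put failed with status $?" >&2
```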


hmac -- calculate an hmac. Used by s3.
Call it like so:
hmac hash keyfile file

The hash will be printed in binary form to stdout.
If file is missing, hmac will hash whatever is input on stdin.
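
One way to sanity-check hmac, sketched under the assumption that your
openssl build supports the -hmac option of dgst: compute the same HMAC
with openssl directly and compare the two results. The key and message
are the widely published HMAC-SHA1 example; hex_hmac_sha1 and msg are
names invented for this sketch.

```shell
# hex_hmac_sha1 KEY: HMAC-SHA1 of stdin, printed as lowercase hex.
# Uses openssl's built-in HMAC rather than the hmac script.
hex_hmac_sha1() {
    openssl dgst -sha1 -hmac "${1}" -binary | od -An -v -t x1 | tr -d ' \n'
}

msg='The quick brown fox jumps over the lazy dog'
reference="$(printf '%s' "${msg}" | hex_hmac_sha1 key)"
echo "${reference}"    # de7c9b85b8b78aa6bc8a7a36f70a90701c9db4d9

# The same digest via the hmac script, assuming it is on your PATH and
# keyfile holds the three bytes "key" with no trailing newline:
# printf '%s' "${msg}" | hmac sha1 keyfile | od -An -v -t x1 | tr -d ' \n'
```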


There are several things which are incomplete, missing, or flat-out wrong.
Send patches to victor.lowther@gmail.com if you fix something!
24 changes: 24 additions & 0 deletions TODO
@@ -0,0 +1,24 @@
For hmac:
* Support hashing using keys larger than the base block size of the hash
* Trap more errors to make sure things die loudly.
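
A possible approach to the large-key item, as a sketch only: HMAC
(RFC 2104) replaces a key longer than the hash's block size with the
hash of that key. effective_key is a hypothetical helper, not part of
hmac; it writes the key that should then be padded to stdout.

```shell
# effective_key ALGO KEYFILE BLOCKLEN
# Print the key to pad: the key itself if it fits in one block,
# otherwise its binary digest.
effective_key() {
    local algo="${1}" keyfile="${2}" blocklen="${3}" keysize
    keysize="$(wc -c < "${keyfile}")" || return 1
    if [ "$((keysize))" -gt "${blocklen}" ]; then
        openssl dgst "-${algo}" -binary < "${keyfile}"
    else
        cat "${keyfile}"
    fi
}
```

make_hmac_pad insists on a regular file, so the output would have to be
written to a temporary file before being handed to it.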

For s3:
* Do some sort of "real" XML parsing instead of the ugly hackjobs currently
used.
* More xml encoding escapes.
* Better dependency checking. I rely on a lot of GNUisms, but there just
does not seem to be a really portable way of finding a file size. :(
This would involve matching revisions of utilities as well as their
presence.
* x-amz-whatever headers. Currently, signature generation supports including
them, and curl_headers can be used to portably generate them, but there
is no method of actually introducing them into the HTTP request.
* Support for just passing the S3 url instead of having the tool construct one.
I don't need it, and I think that writing a URI parser for a short shell
script is Too Much Work, but someone probably thinks it is interesting.
* HEAD variants of s3_list and s3_buckets. Because sometimes all we really
want are the headers.
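
On the file size item: one option that avoids stat entirely, offered as
a sketch, is to let wc count the bytes from stdin. wc -c is specified by
POSIX, and reading from a redirect keeps the file name out of the
output. file_size is a name made up for this example.

```shell
# file_size FILE: print the size of FILE in bytes.
file_size() {
    local size
    size="$(wc -c < "${1}")" || return 1
    # Some wc implementations pad the count with spaces;
    # arithmetic expansion normalizes it to a bare number.
    echo "$((size))"
}
```

The trade-off is that wc may read the whole file, so this can be slower
than stat on large files.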

For both:
* Figure out the earliest version of bash that offers all the bashisms we
rely on, and code the scripts to die loudly if we don't have it.
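
A sketch of the "die loudly" half of this item. The 3.0 cutoff in the
usage line is a guess for illustration only; the real minimum version
has not been determined. BASH_VERSINFO is the array bash itself provides
with its version components; require_bash is a hypothetical name.

```shell
# require_bash MAJOR MINOR: succeed only when running under bash >= MAJOR.MINOR
require_bash() {
    [ -n "${BASH_VERSION:-}" ] || { echo "bash is required." >&2; return 1; }
    if (( BASH_VERSINFO[0] < $1 || ( BASH_VERSINFO[0] == $1 && BASH_VERSINFO[1] < $2 ) )); then
        echo "bash $1.$2 or newer is required (found ${BASH_VERSION})." >&2
        return 1
    fi
}

# Usage near the top of each script (version numbers are placeholders):
# require_bash 3 0 || exit 1
```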
84 changes: 84 additions & 0 deletions hmac
@@ -0,0 +1,84 @@
#!/bin/bash
# Implement HMAC functionality on top of the OpenSSL digest functions.
# licensed under the terms of the GNU GPL v2
# Copyright 2007 Victor Lowther <victor.lowther@gmail.com>

die() {
echo "$*" >&2
exit 1
}

check_deps() {
local res=0
while [ $# -ne 0 ]; do
which "${1}" >& /dev/null || { res=1; echo "${1} not found."; }
shift
done
(( res == 0 )) || die "aborting."
}

# write a byte (passed as an integer value, 0-255) to stdout
write_byte() {
# $1 = byte to write
printf "\\x$(printf "%x" "${1}")"
}

# make an hmac pad out of a key.
# this is not the most secure way of doing it, but it is
# the most expedient.
make_hmac_pad() {
# using key in file $1 and byte in $2, create the appropriate hmac pad
# Pad keys out to $3 bytes
# $4 = hash name; reserved for pre-hashing keys longer than $3 bytes,
# which is not implemented yet (do_hmac rejects such keys).
local x y remainder
[[ -f ${1} ]] || die "${1} does not exist when making hmac pads."
(( remainder = ${3} ))
# od prints each key byte as an unsigned decimal; cut drops the offset column.
for x in $(od -v -t u1 < "${1}"|cut -b 9-);
do
write_byte $((${x} ^ ${2}))
(( remainder -= 1 ))
done
# fill the rest of the block with the bare pad byte (0 ^ pad == pad).
for ((y=0; y < remainder; y++)); do
write_byte $((0 ^ ${2}))
done
}

# utility functions for making hmac pads
hmac_ipad() {
make_hmac_pad "${1}" 0x36 ${2} "${3}"
}

hmac_opad() {
make_hmac_pad "${1}" 0x5c ${2} "${3}"
}

# hmac something
do_hmac() {
# $1 = algo to use. Must be one that openssl knows about
# $2 = keyfile to use
# $3 = file to hash. uses stdin if none is given.
# accepts input on stdin, leaves it on stdout.
# Output is binary, if you want something else pipe it accordingly.
local blocklen keysize x
case "${1}" in
sha) blocklen=64 ;;
sha1) blocklen=64 ;;
md5) blocklen=64 ;;
md4) blocklen=64 ;;
sha256) blocklen=64 ;;
sha512) blocklen=128 ;;
*) die "Unknown hash ${1} passed to hmac!" ;;
esac
# read the key from stdin so wc prints only the byte count, not the file name
keysize="$(wc -c < "${2}")"
(( keysize > blocklen )) && \
die "Prehashing large-size keys not implemented yet. Sorry."
cat <(hmac_ipad "${2}" ${blocklen} "${1}") "${3:--}" | openssl dgst "-${1}" -binary | \
cat <(hmac_opad "${2}" ${blocklen} "${1}") - | openssl dgst "-${1}" -binary
}

[[ ${1} ]] || die "Must pass the name of the hash function to use to ${0}."

[[ -f ${2} ]] || die "Must pass file containing the secret to $0"
check_deps od openssl
do_hmac "${@}"
8 changes: 8 additions & 0 deletions install.sh
@@ -0,0 +1,8 @@
#!/bin/bash
res=0
for x in curl grep openssl sed stat cat date od; do
which "${x}" >& /dev/null || { res=1; echo Missing ${x}. Please install it first.; }
done
((res == 0)) || exit 1
[[ -x /usr/bin/hmac ]] && echo hmac already exists! || cp hmac /usr/bin/hmac
[[ -x /usr/bin/s3 ]] && echo s3 already exists! || cp s3 /usr/bin/s3
221 changes: 221 additions & 0 deletions s3
@@ -0,0 +1,221 @@
#!/bin/bash
# basic amazon s3 operations
# Licensed under the terms of the GNU GPL v2
# Copyright 2007 Victor Lowther <victor.lowther@gmail.com>



# print a message and bail
die() {
echo "$*" >&2
exit 1
}

# check to see if the variable name passed exists and holds a value.
# Die if it does not.
check_or_die() {
[[ ${!1} ]] || die "Environment variable ${1} is not set."
}

# check to see if we have all the needed S3 variables defined.
# Bail if we do not.
check_s3() {
local sak x
for x in S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY; do
check_or_die ${x};
done
[[ -f ${S3_SECRET_ACCESS_KEY} ]] || die "S3_SECRET_ACCESS_KEY must point to a file!"
# read the key from stdin so wc prints only the byte count, not the file name
sak="$(wc -c < "${S3_SECRET_ACCESS_KEY}")"
(( sak == 40 )) || \
die "S3 Secret Access Key is not exactly 40 bytes long. Please fix it."
}
# check to see if our external dependencies exist
check_dep() {
local res=0
while [[ $# -ne 0 ]]; do
which "${1}" >& /dev/null || { res=1; echo "${1} not found."; }
shift
done
(( res == 0 )) || die "aborting."
}

check_deps() {
check_dep openssl date hmac cat grep curl sed wc
check_s3
}

urlenc() {
# $1 = string to url encode
# output is on stdout
# we don't urlencode everything, just enough stuff.
echo -n "${1}" |
sed 's/%/%25/g
s/ /%20/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/+/%2b/g
s/,/%2c/g
s/:/%3a/g
s/;/%3b/g
s/?/%3f/g
s/@/%40/g
s/\t/%09/g'
}

xmldec() {
# no parameters.
# accept input on stdin, put it on stdout.
# patches accepted to get more stuff
sed 's/\&quot;/\"/g
s/\&amp;/\&/g
s/\&lt;/</g
s/\&gt;/>/g'
}

## basic S3 functionality. x-amz-header functionality is not implemented.
# make an S3 signature string, which will be output on stdout.
s3_signature_string() {
# $1 = HTTP verb
# $2 = date string, must be in UTC
# $3 = bucket name, if any
# $4 = resource path, if any
# $5 = content md5, if any
# $6 = content MIME type, if any
# $7 = canonicalized headers, if any
# signature string will be output on stdout
local verr="Must pass a verb to s3_signature_string!"
local verb="${1:?${verr}}"
local bucket="${3}"
local resource="${4}"
local derr="Must pass a date to s3_signature_string!"
local date="${2:?${derr}}"
local mime="${6}"
local md5="${5}"
local headers="${7}"
printf "%s\n%s\n%s\n%s\n%s%s%s" \
"${verb}" "${md5}" "${mime}" "${date}" \
"${headers}" "${bucket}" "${resource}" | \
hmac sha1 "${S3_SECRET_ACCESS_KEY}" | openssl base64 -e -a
}

# cheesy, but it is the best way to have multiple headers.
curl_headers() {
# each arg passed will be output on its own line, in curl config-file syntax
while (( $# )); do
echo "header = \"${1}\""
shift
done
}

s3_curl() {
# invoke curl to do all the heavy HTTP lifting
# $1 = method (one of GET, PUT, DELETE, or HEAD)
# $2 = remote bucket.
# $3 = remote name
# $4 = local name.
local bucket remote date sig md5 arg inout headers stdopts
# header handling is kinda fugly, but it works.
bucket="${2:+/${2}}/" # slashify the bucket
remote="$(urlenc "${3}")" # if you don't, strange things may happen.
stdopts="--connect-timeout 10 --fail --silent"
[[ $CURL_S3_DEBUG == true ]] && stdopts="${stdopts} --show-error"
case "${1}" in
GET) arg="-o" inout="${4:--}" # stdout if no $4
;;
PUT) [[ ${2} ]] || die "PUT can has bucket?"
if [[ ! ${3} ]]; then
arg="-X PUT"
headers[${#headers[@]}]="Content-Length: 0"
elif [[ -f ${4} ]]; then
md5="$(openssl dgst -md5 -binary "${4}"|openssl base64 -e -a)"
arg="-T" inout="${4}"
headers[${#headers[@]}]="Expect: 100-continue"
else
die "Cannot write non-existing file ${4}"
fi
;;
DELETE) arg="-X DELETE"
;;
HEAD) arg="-I" ;;
*) die "Unknown verb ${1}. It probably would not have worked anyways." ;;
esac
date="$(TZ=UTC date '+%a, %e %b %Y %H:%M:%S %z')"
sig=$(s3_signature_string "${1}" "${date}" "${bucket}" "${remote}" "${md5}")

headers[${#headers[@]}]="Authorization: AWS ${S3_ACCESS_KEY_ID}:${sig}"
headers[${#headers[@]}]="Date: ${date}"
[[ ${md5} ]] && headers[${#headers[@]}]="Content-MD5: ${md5}"
# pass the file argument only for verbs that set one (GET and file PUT)
curl ${arg} ${inout:+"${inout}"} ${stdopts} -K <(curl_headers "${headers[@]}") \
"http://s3.amazonaws.com${bucket}${remote}"
return $?
}

s3_put() {
# $1 = remote bucket to put it into
# $2 = remote name to put
# $3 = local file to put. Defaults to $2 if omitted.
s3_curl PUT "${1}" "${2}" "${3:-${2}}"
return $?
}

s3_get() {
# $1 = bucket to get file from
# $2 = remote file to get
# $3 = local file to get into. Will be overwritten if it exists.
# If this contains a path, that path must exist before calling this.
s3_curl GET "${1}" "${2}" "${3:-${2}}"
return $?
}

s3_test() {
# same args as s3_get, but uses the HEAD verb instead of the GET verb.
s3_curl HEAD "${1}" "${2}" >/dev/null
return $?
}

# Hideously ugly, but it works well enough.
s3_buckets() {
s3_get |grep -o '<Name>[^>]*</Name>' |sed 's/<[^>]*>//g' |xmldec
return $?
}

# this will only return the first thousand entries, alas
# Maybe some kind soul can fix this without writing an XML parser in bash?
# Also need to add xml entity handling.
s3_list() {
# $1 = bucket to list
[ "x${1}" == "x" ] && return 1
s3_get "${1}" |grep -o '<Key>[^>]*</Key>' |sed 's/<[^>]*>//g'| xmldec
return $?
}

s3_delete() {
# $1 = bucket to delete from
# $2 = item to delete
s3_curl DELETE "${1}" "${2}"
return $?
}

# because this uses s3_list, it suffers from the same flaws.
s3_rmrf() {
# $1 = bucket to delete everything from
s3_list "${1}" | while IFS= read -r f; do
s3_delete "${1}" "${f}";
done
}

check_deps
case $1 in
put) shift; s3_put "$@" ;;
get) shift; s3_get "$@" ;;
rm) shift; s3_delete "$@" ;;
ls) shift; s3_list "$@" ;;
test) shift; s3_test "$@" ;;
buckets) s3_buckets ;;
rmrf) shift; s3_rmrf "$@" ;;
*) die "Unknown command ${1}."
;;
esac
