Another assorted collection of fun in Golang.
If you have any feedback drop me a line at p.staszewski@gmail.com or look for dRbiG on FreeNode.
Grabber is a concurrent declarative web scraper and downloader.
This is the rewrite of grabber.go.
Camtagger is a simple command-line mass-tagger for Camlistore.
If you're not interested in Camlistore (but you should be) this will be of no use to you.
AdHole is a simple transparent advertisement and tracking blocker intended for personal use. Simple, minimal, low-level.
Extract links (or actually anything) based on regexps from websites. One major feature is support for going through 'all pages' by following a 'Next' or similar link (again, matched by regexp). You can have more than one extracting regexp per URL, and they are processed in parallel. I guess you can do similar stuff with httrack or maybe even curl/wget, but I personally prefer to be able to extract all the links first, so I can run wget on the right machine at the right time. The code is not commented, but it should be reasonably easy to understand anyway.
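A minimal sketch of the core loop described above, simplified to a single extracting regexp and sequential pages; the patterns, start URL, and assumption of absolute links are all illustrative, not the tool's actual configuration:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"regexp"
)

var (
	extract = regexp.MustCompile(`href="([^"]+\.jpg)"`)  // illustrative pattern
	next    = regexp.MustCompile(`href="([^"]+)">Next<`) // illustrative pattern
)

func main() {
	url := "http://example.com/page/1" // illustrative start URL
	for url != "" {
		resp, err := http.Get(url)
		if err != nil {
			panic(err)
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			panic(err)
		}
		// print every match of the extracting regexp
		for _, m := range extract.FindAllSubmatch(body, -1) {
			fmt.Println(string(m[1]))
		}
		url = "" // stop unless a 'Next' link is found
		if m := next.FindSubmatch(body); m != nil {
			url = string(m[1]) // assumes the link is absolute
		}
	}
}
```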
A 'webapp' that discovers IPs via the ARP cache and then ping-checks them - to bring you a succinct report of what is online and what is offline in your neighbourhood. It concerns itself only with the last state change (keeping both the 'since' timestamp and the time elapsed). It can clean old entries based on a regexp match. The latest addition is the ability to read standard zone files, so that a hostname can be displayed alongside the IP address. Tested on FreeBSD and Linux, works like a charm.
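A rough sketch of the discovery half on Linux: parse /proc/net/arp for IP/MAC pairs. The real tool also ping-checks each entry and tracks state changes; none of that is shown here.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/net/arp")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	s.Scan() // skip the header line
	for s.Scan() {
		fields := strings.Fields(s.Text())
		if len(fields) >= 4 {
			// field 0 is the IP address, field 3 the hardware address
			fmt.Printf("%s at %s\n", fields[0], fields[3])
		}
	}
}
```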
Thanks to goproxy this is basically a 'one-liner' - it's a simple HTTP proxy that will dump the URI of each request that goes through it to stdout.
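Roughly what that one-liner amounts to with goproxy: print every request URI and pass the request through untouched. The port is arbitrary.

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/elazarl/goproxy"
)

func main() {
	proxy := goproxy.NewProxyHttpServer()
	proxy.OnRequest().DoFunc(
		func(r *http.Request, ctx *goproxy.ProxyCtx) (*http.Request, *http.Response) {
			fmt.Println(r.URL.String())
			return r, nil // nil response means: forward the request as usual
		})
	log.Fatal(http.ListenAndServe(":8080", proxy))
}
```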
Simple implementation of Wake-on-LAN. This is part of another 'project' of mine where I implemented this functionality in a number of languages (currently C, Go, OCaml, Ruby and Scheme). Before running you should edit the broadcast IP and the static MAC table. Each argument will first be looked up in the table; if that fails it will be treated as a MAC address. In both cases, if the parsing goes OK, a magic packet will be sent three times (without delay). There is only rudimentary error checking. Tested only on Linux (where it works).
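The wire format is simple enough to show in full. A sketch, with a placeholder MAC and broadcast address: a magic packet is 6 bytes of 0xFF followed by the target MAC repeated 16 times, sent over UDP broadcast.

```go
package main

import (
	"bytes"
	"net"
)

func main() {
	mac, err := net.ParseMAC("00:11:22:33:44:55") // illustrative MAC
	if err != nil {
		panic(err)
	}
	// 6 x 0xFF, then the MAC 16 times
	packet := bytes.Repeat([]byte{0xFF}, 6)
	for i := 0; i < 16; i++ {
		packet = append(packet, mac...)
	}
	conn, err := net.Dial("udp", "192.168.1.255:9") // edit for your network
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	for i := 0; i < 3; i++ { // sent three times, as described above
		if _, err := conn.Write(packet); err != nil {
			panic(err)
		}
	}
}
```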
Rather simple uploader for pomf.se. Simple example of using mime/multipart to post a file and encoding/json to parse the results. Without arguments it will try to upload whatever comes on stdin. Otherwise you can specify filenames as arguments, and it will try to upload each file. Exits successfully only if all uploads were successful and dies very quickly on almost any error.
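A condensed sketch of the stdin path: build a multipart body with mime/multipart and decode the JSON reply with encoding/json. The endpoint, field name and response shape follow the pomf API as I understand it; treat them as assumptions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

type reply struct {
	Success bool `json:"success"`
	Files   []struct {
		URL string `json:"url"`
	} `json:"files"`
}

func main() {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	part, err := w.CreateFormFile("files[]", "stdin") // assumed field name
	if err != nil {
		panic(err)
	}
	if _, err := io.Copy(part, os.Stdin); err != nil {
		panic(err)
	}
	w.Close()
	// assumed endpoint
	resp, err := http.Post("https://pomf.se/upload.php", w.FormDataContentType(), &buf)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	var r reply
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil || !r.Success {
		os.Exit(1) // die quickly on almost any error
	}
	for _, f := range r.Files {
		fmt.Println(f.URL)
	}
}
```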
Updates:
No longer maintained here. Please find the rewritten version here.
Added basic support for header injection. To use it, add headers to your target definition (the usual hash). Note that this also gives you the ability to do basic cookie injection. Warning: if you have multiple targets defined in a single file, be aware that headers are currently read from a global variable, so there might be problems 'between the targets' (i.e. an edge case where a link has been pushed to the queue from target 1 but gets processed when target 2 is set as the source of headers). The obvious work-around is to use one target per file.
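A hedged sketch of what header injection boils down to: every pair from the target's headers hash gets set on the outgoing request before it is sent. The map contents here are purely illustrative; note that cookie injection is just a Cookie header.

```go
package main

import (
	"fmt"
	"net/http"
)

// fetch issues a GET with the given headers applied.
func fetch(url string, headers map[string]string) (*http.Response, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	return http.DefaultClient.Do(req)
}

func main() {
	resp, err := fetch("http://example.com", map[string]string{
		"User-Agent": "grabber",          // illustrative values
		"Cookie":     "session=abc123",   // basic cookie injection
	})
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```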
Added a new mode, single. Useful when results are paginated in chronological order - as you don't know the last page's link, you can use single to go to the 'last page' and then follow every 'previous page' link.
Added basic downloading statistics (number of files, total size and average download speed).
General:
Purely declarative web scraper with parallel downloading. Intended to automate the common pattern found in websites: archive page pagination - post links - resource links. You should open the unixporn.json example to see the structure of a target definition. A non-URL data extraction example is in golang.json.
There are two modes: follow - used for pagination - and every - used for link extraction (plus single, described in the updates above). Then there are four actions that can be taken: print - will print the link to stdout; log - will log the link (to either stdout or stderr, depending on the state of the -log flag); download - will send the link to the built-in parallel downloader (which will download only if the file doesn't already exist); and finally raw - will print to stdout whatever was matched, without trying to resolve it into a full URL (this way you can use this software to extract pretty much anything).
The do actions can be chained in an arbitrary manner and form a command path that will be executed linearly. You can define multiple targets in a single file. It will work with SSL connections, however it will not do any verification.
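A stripped-down sketch of the download action as described: a fixed pool of workers pulling links from a channel, skipping files that already exist. The worker count, naming scheme and sample link are illustrative.

```go
package main

import (
	"io"
	"net/http"
	"os"
	"path"
	"sync"
)

// download fetches url into the current directory, unless the file
// already exists.
func download(url string) error {
	name := path.Base(url)
	if _, err := os.Stat(name); err == nil {
		return nil // already there, skip
	}
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	f, err := os.Create(name)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	links := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // illustrative worker count
		wg.Add(1)
		go func() {
			defer wg.Done()
			for url := range links {
				download(url) // errors dropped for brevity
			}
		}()
	}
	links <- "http://example.com/file.jpg" // illustrative link
	close(links)
	wg.Wait()
}
```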
This is by far the most involved piece of software that I have written in Golang, and although it's far from beautiful, I had a lot of fun writing it.
Depends on gokogiri. Tested under Linux and FreeBSD.
Check the status of your deliveries from the command line (and fast). This one uses either an XPath or a plain old regexp extractor, so you can deal easily with both HTML and JSON. It also has support for a custom download function, for those idiotic services that think PHPSESSID is a security feature. Provide your tracking numbers as arguments. Depends on gokogiri and sanitize.
Prototype of a stand-alone image-based CAPTCHA generator and server. The two main goals are performance and simplicity, with the intent to experiment with different forms of CAPTCHAs (as the mutilated-text images suck). The API consists of /gen, which returns either a 500 error or a simple JSON response (the token 't' as an int and the answer 'a' as a string), and /image?id=TOKEN.
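A hedged client-side sketch of that API: ask /gen for a token and answer, then fetch the matching image. The host is a placeholder; the JSON key names 't' and 'a' follow the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type gen struct {
	Token  int    `json:"t"` // token, per the API description
	Answer string `json:"a"` // answer, per the API description
}

func main() {
	resp, err := http.Get("http://localhost:8080/gen") // illustrative host
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		panic(resp.Status) // /gen may answer with a 500 error
	}
	var g gen
	if err := json.NewDecoder(resp.Body).Decode(&g); err != nil {
		panic(err)
	}
	fmt.Printf("token %d answers %q\n", g.Token, g.Answer)
	img, err := http.Get(fmt.Sprintf("http://localhost:8080/image?id=%d", g.Token))
	if err != nil {
		panic(err)
	}
	img.Body.Close()
}
```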
Right now it will generate random string images without any 'effects'. It has some command-line options, such as how long the images should remain downloadable. Being a prototype, it lacks proper performance testing and some additional security (like protection of /gen with a passkey or an IP policy).
The only dependency is freetype-go. There are no runtime dependencies. The sample.ttf font comes from the Ruby/SDL repository (if I remember correctly) and I claim no copyright for it.
Open the firewall by issuing a GET request. Rough but tested. Important config is only in the code. Not that you should trust binaries for such stuff anyway. Needs SSL before I'd actually recommend it.
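A purely hypothetical sketch of the idea: an HTTP handler that, on a GET with the right passphrase, shells out to open the firewall for the caller's address. The path, key, port and iptables rule are all made up here; the real config lives in the code, as noted above.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"os/exec"
)

func main() {
	http.HandleFunc("/open", func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Query().Get("key") != "secret" { // illustrative passphrase
			http.Error(w, "no", http.StatusForbidden)
			return
		}
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			http.Error(w, "bad address", http.StatusInternalServerError)
			return
		}
		// illustrative rule: allow the caller's IP through the firewall
		cmd := exec.Command("iptables", "-I", "INPUT", "-s", ip, "-j", "ACCEPT")
		if err := cmd.Run(); err != nil {
			http.Error(w, "failed", http.StatusInternalServerError)
			return
		}
		fmt.Fprintln(w, "ok")
	})
	http.ListenAndServe(":8000", nil)
}
```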
Copyright (c) 2012 - 2017 Piotr S. Staszewski.
Usual 2-clause BSD license for everything. See LICENSE.txt for details.