This is a mirror repository. Hack Wget2 at https://gitlab.com/gnuwget/wget2


GNU Wget2 - Introduction

GNU Wget2 is the successor of GNU Wget, a file and recursive website downloader.

Designed and written from scratch, it wraps around libwget, which provides the basic functions needed by a web client.

Wget2 is multi-threaded and uses many features to allow fast operation.

In many cases Wget2 downloads much faster than Wget1.x due to HTTP zlib compression, parallel connections and use of the If-Modified-Since HTTP header.

GNU Wget2 is licensed under GPLv3+.

Libwget is licensed under LGPLv3+.

Features

A non-exhaustive list of features

  • Support for the HTTP/1.1 and HTTP/2 protocols
  • brotli decompression support (Accept-Encoding: br)
  • HPKP - HTTP Public Key Pinning (RFC7469) with persistent database
  • TCP Fast Open for plain text and for HTTPS
  • TLS Session Resumption including persistent session data cache
  • TLS False Start (with GnuTLS >= 3.5.0)
  • HTTP2 support via nghttp2 and GnuTLS ALPN including streaming/pipelining
  • OCSP stapling + OCSP server querying as a fallback (experimental, needs GnuTLS >= 3.3.11)
  • Use libpsl for cookie domain checking (using Public Suffix List)
  • Support link conversion (-k/--convert-links and -K/--backup-converted)
  • Support for RFC 6266 compliant Content-Disposition
  • RFC 6797 HSTS (HTTP Strict Transport Security)
  • Support for bzip2 Content-Encoding / Accept-Encoding compression type
  • New Year 2014 gimmick: added support for XZ Content-Encoding / Accept-Encoding compression type
  • Character encoding of input files may be specified independently of the local and remote encoding (--input-encoding)
  • Support scanning RSS 2.0 feeds from local files (--force-rss -i <filename>)
  • Support scanning RSS 2.0 feeds.
  • Support scanning Atom 1.0 feeds from local files (--force-atom -i <filename>)
  • Support scanning Atom 1.0 feeds.
  • Support scanning URLs from local Sitemap XML file (--force-sitemap -i <filename>)
  • Support scanning sitemap files given in robots.txt (Sitemap XML, gzipped Sitemap XML, plain text) including sitemap index files.
  • Support arbitrary number of proxies for parallel downloads
  • Multithreaded download of single files (option --chunk-size)
  • Internationalized Domain Names in Applications (compile-selectable IDNA2008 or IDNA2003)
  • ICEcast / SHOUTcast support via library (see examples/getstream.c)
  • respect /robots.txt "Robot Exclusion Standard" and <META name="robots" ...>
  • new option --secure-protocol=PFS to have TLS only plus forcing Perfect Forward Secrecy (PFS)
  • IDN support for international domains
  • autotools support
  • proxy support
  • cookies (session/non-session), detection of supercookies via Mozilla Public Suffix List (use the new option --cookie-suffixes <filename>, better: put it into ~/.wgetrc)
  • recursive download of websites with or without spanning hosts
  • download of single web pages / resources
  • zlib/gzip compressed HTTP/HTTPS downloads (gzip, deflate)
  • number of parallel download threads is adjustable
  • include directive for config files (wildcards allowed)
  • support for keep-alive connections
  • included CSS, HTML, XML parser needed for recursive downloads
  • gettext support
  • HTTPS via libgnutls
  • support for Metalink RFC 6249 (Metalink/HTTP: Mirrors and Hashes)
  • support for Metalink RFC 5854 (Metalink Download Description Format / .meta4 files)
  • support for Metalink 3
  • Metalink checksumming via libgnutls
  • DNS lookup cache
  • IPv4 and IPv6 support
  • built and tested on Linux, OSX, OpenBSD, FreeBSD, Solaris, Windows
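
A few of the options named in the list above could be used as in the following sketch. The URLs and file names are placeholders, `-r` is assumed to request a recursive download as in Wget 1.x, and the `1M` chunk size is an arbitrary illustrative value. The hypothetical `show` helper only prints each command line; set `RUN=1` in the environment to actually execute them.

```shell
#!/bin/sh
# Sketch only: example.com and the local file names are placeholders.
WGET2=${WGET2:-wget2}

# Print the command; execute it only when RUN=1 is set in the environment.
show() {
    echo "$*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    fi
}

# Recursive download with link conversion, keeping backups of converted files.
show "$WGET2" -r -k -K https://example.com/

# Download one large file in parallel chunks (the chunk size is an assumed value).
show "$WGET2" --chunk-size=1M https://example.com/big.iso

# Read URLs from a local RSS 2.0 feed.
show "$WGET2" --force-rss -i feed.xml

# Allow only TLS and force Perfect Forward Secrecy.
show "$WGET2" --secure-protocol=PFS https://example.com/
```

Printing first makes it easy to review the command lines before anything touches the network.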

Links

Online Docs

Mailing List

Bug Tracker

Development

Code Coverage

Fuzz Code Coverage

Build Requirements

The following packages are needed to build the software

  • autotools (autoconf, autogen, automake, autopoint, libtool)
  • python (recommended for faster bootstrap)
  • rsync
  • tar
  • makeinfo (part of texinfo)
  • pkg-config >= 0.28 (recommended)
  • doxygen (for creating the documentation)
  • pandoc (for creating the wget2 man page)
  • gettext >= 0.18.2
  • libiconv (needed for IRI and IDN support)
  • libz >= 1.2.3 (the distribution may call the package zlib*, e.g. zlib1g on Debian)
  • liblzma >= 5.1.1alpha (optional, if you want HTTP lzma decompression)
  • libbz2 >= 1.0.6 (optional, if you want HTTP bzip2 decompression)
  • libbrotlidec >= 1.0.0 (optional, if you want HTTP brotli decompression)
  • libgnutls (3.3, 3.5 or 3.6)
  • libidn2 >= 0.9 + libunistring >= 0.9.3 (libidn >= 1.25 if you don't have libidn2)
  • flex >= 2.5.35
  • libpsl >= 0.5.0
  • libnghttp2 >= 1.3.0 (optional, if you want HTTP/2 support)
  • libmicrohttpd >= 0.9.51 (optional, if you want to run the test suite)
  • lzip (optional, if you want to build distribution tarballs)
  • lcov (optional, for coverage reports)
  • libgpgme >= 0.4.2 (optional, for automatic signature verification)

The versions listed are recommended, but older versions may also work.
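
Before running `./bootstrap`, it can help to check which prerequisites are already present. The following is a minimal sketch, not part of the Wget2 build system: `check_tools` and `check_libs` are hypothetical helpers, and the command names (for example `libtoolize` for libtool) and pkg-config module names are assumptions that may differ on your distribution.

```shell
#!/bin/sh
# Print "missing: NAME" for every command that is not on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Print a line for every MODULE:VERSION pair that pkg-config cannot satisfy.
# Returns non-zero if at least one module is absent or too old.
check_libs() {
    outdated=0
    for spec in "$@"; do
        module=${spec%%:*}
        version=${spec#*:}
        if ! pkg-config --atleast-version="$version" "$module" 2>/dev/null; then
            echo "too old or missing: $module (want >= $version)"
            outdated=1
        fi
    done
    return "$outdated"
}

# Command and module names are assumptions; adjust them for your distribution.
check_tools autoconf automake autopoint libtoolize rsync tar makeinfo flex pkg-config \
    || echo "install the tools listed above before running ./bootstrap"
if command -v pkg-config >/dev/null 2>&1; then
    check_libs zlib:1.2.3 libpsl:0.5.0 libidn2:0.9 \
        || echo "the listed versions are recommendations; older ones may still work"
fi
```

The script only reports; it never aborts, since an older library version may still be sufficient.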

Building from git

Download project and prepare sources with

	git clone https://gitlab.com/gnuwget/wget2.git
	cd wget2
	./bootstrap
	# on shell failure try 'bash ./bootstrap'

Build Wget2 with

	./configure
	make

On Haiku, build Wget2 with

    setarch x86
    ./configure --prefix=/boot/home/config/non-packaged
    rm /boot/home/config/non-packaged/wget2 && mv /boot/home/config/non-packaged/wget2_noinstall /boot/home/config/non-packaged/wget2

Test the functionality

	make check

Install Wget2 and libwget

	sudo make install
	# or: su -c "make install"

Valgrind Testing

To run the test suite with valgrind memcheck

	make check-valgrind

or if you want valgrind memcheck by default

	./configure --enable-valgrind-tests
	make check

To run single tests with valgrind (e.g. test-k)

	cd tests
	VALGRIND_TESTS=1 ./test-k

Why not use valgrind directly, as in 'valgrind --leak-check=full ./test-k'? Because you want to run valgrind on 'wget2' and not on the test program itself.
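
To run several tests under valgrind in one go, the single-test command above can be wrapped in a small loop. This is only a sketch: `run_valgrind_tests` is a hypothetical helper, the `.valgrind.log` file naming is an arbitrary choice, and the only test name taken from this README is `test-k`.

```shell
#!/bin/sh
# Run each named test program from the current directory with
# VALGRIND_TESTS=1 and print a PASS/FAIL line per test.
# Output of each test goes to NAME.valgrind.log (file name is an assumption).
# Returns non-zero if any test failed.
run_valgrind_tests() {
    failed=0
    for t in "$@"; do
        if VALGRIND_TESTS=1 "./$t" >"$t.valgrind.log" 2>&1; then
            echo "PASS $t"
        else
            echo "FAIL $t (see $t.valgrind.log)"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, from the top of the build tree:
#   cd tests && run_valgrind_tests test-k
```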

Coverage Report

To generate and view the test code coverage (works with gcc, not with clang)

	make check-coverage
	<browser> lcov/index.html

Control Flow Integrity with clang

To build with clang's CFI instrumentation:

	CC="clang-5.0" CFLAGS="-g -fsanitize=cfi -fno-sanitize-trap=all -fno-sanitize=cfi-icall -flto -fvisibility=hidden" NM=/usr/bin/llvm-nm-5.0 RANLIB=/usr/bin/llvm-ranlib-5.0 AR=/usr/bin/llvm-ar-5.0 LD=/usr/bin/gold ./configure
	make clean
	make check

With clang-5.0, -fsanitize=cfi-icall does not work as expected: our logger callback functions are typed correctly, but are still falsely reported as violations. This is why the configure line above disables it with -fno-sanitize=cfi-icall.