Proposed plan
This is the proposed plan for the wiki's contents. It aims to help contributors divide the editing work while limiting conflicts. Please strike out the items that have been processed by wrapping the line in ~~. This plan is not strict; it is only a proposal. Please discuss possible adjustments on the mailing list, and please don't start discussions if you don't intend to contribute. This page must not be referenced from the home page, and it will be deleted once enough content has been ported or once it has significantly diverged.
First intro line
Team
Places
- haproxy.org
- mailing list
- discourse
- github
Release cycle
- development
- stable
- LTS
Contributing code
- read CONTRIBUTING
- read coding-style
- read git log
Participating with no code
- read problem reports
- review / adjust patches
- help others
- contribute to the wiki
- test the code
- suggest use cases
- report issues, gdb traces
- bisect issues
Presentation
How a proxy works
Terminology
- client
- server
- frontend / service
- backend / farm
- active / backup
- connection, session, transaction, request, response
Topologies
- edge + short silos
- central LB + a bunch of servers, multiple layers
- service clusters (stacks of [haproxy + servers])
- sidecar
Setting up HA for haproxy
- keepalived / ucarp / pacemaker?
- LVS
- ECMP
- ELB
Common use cases
- as a basic proxy
  - IPv6 to IPv4 gatewaying
  - port filtering
  - TLS enforcement / cert validation
  - protocol inspection. E.g. HTTP+SSH, SMTP banner delay
  - authentication
  - transparent proxying
  - logging / anomaly detection / time measurement
  - DoS protection (stick tables, tarpit)
  - traffic aggregation (multiple interfaces attachment)
  - traffic limitation (maxconn)
- as an accelerating proxy
  - TLS offloading
  - traffic compression
  - response caching
- as a load balancer
  - classical stateless L7 LB
  - classical stateful L7 LB
  - when to use round robin -> short requests / web applications
  - when to use least conn -> long sessions
  - when to use first -> ephemeral VMs, fast scale-in/scale-out
  - when to use hashing -> affinity (e.g. caches)
  - consistent vs map-based hashing
  - persistence vs hashing
  - inbound vs outbound load balancing
  - backup server(s)
  - grouping traffic to a single server (active/backup for databases)
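
To give the "consistent vs map-based hashing" item something concrete to start from, here is a minimal sketch in Python. It is an illustration only and does not mirror HAProxy's internal implementation: the SHA-256 hash, the `HashRing` class, the 100 points per server and the server names are all assumptions made for the example. It compares how many keys change server when one server is removed from the farm.

```python
import hashlib
from bisect import bisect_right


def key_hash(value: str) -> int:
    """Stable 64-bit hash of a string (SHA-256 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def map_based_pick(key: str, servers: list[str]) -> str:
    """Map-based hashing: index the server list with hash modulo server count."""
    return servers[key_hash(key) % len(servers)]


class HashRing:
    """Hypothetical consistent-hash ring: each server owns several ring points."""

    def __init__(self, servers: list[str], points_per_server: int = 100):
        self._ring = sorted(
            (key_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in self._ring]

    def pick(self, key: str) -> str:
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect_right(self._hashes, key_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def compare_remap(servers: list[str], removed: str, keys: list[str]) -> tuple[float, float]:
    """Fraction of keys that change server when `removed` leaves the farm.

    Returns (map_based_fraction, consistent_fraction).
    """
    remaining = [s for s in servers if s != removed]
    before, after = HashRing(servers), HashRing(remaining)
    map_moved = sum(map_based_pick(k, servers) != map_based_pick(k, remaining) for k in keys)
    ring_moved = sum(before.pick(k) != after.pick(k) for k in keys)
    return map_moved / len(keys), ring_moved / len(keys)
```

With the ring, only the keys that were on the removed server move, which is why consistent hashing suits caches: the other servers keep their working set. Map-based hashing redistributes most keys whenever the server count changes.
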
Advanced use cases
- providing TLS to Varnish (in + out)
- caching clusters with consistent hashing and small object caching
- H2 in front of Nginx (max-reuse)
- using priorities to speed up critical parts of a site
- service discovery via DNS, CLI, Lua
- managing certificates at scale / Let's Encrypt
- tuning for extreme loads, and pitfalls
- accessing services inside Linux containers using namespaces
- multi-site abuser eviction (stick-tables + peers)
Scripting in Lua
On the fly management
- stats page
- CLI
- signals
- master-worker
- agent-check
- add-acl/del-acl
- DNS
Operating system specificities
- Linux >= 3.9 : SO_REUSEPORT
- Linux >= 4.2 : IP_BIND_ADDRESS_NO_PORT
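
As a starting point for the SO_REUSEPORT item, the sketch below shows in Python what the option permits on Linux: several listening sockets bound to the same address and port, with the kernel distributing incoming connections among them. The function name `open_shared_listeners`, the loopback address and the backlog value are assumptions made for the example; it says nothing about how HAProxy itself uses the option.

```python
import socket


def open_shared_listeners(count: int = 2, host: str = "127.0.0.1") -> list[socket.socket]:
    """Open `count` listening sockets that all share one TCP port.

    Assumes a platform where socket.SO_REUSEPORT is defined (e.g. Linux >= 3.9).
    """
    listeners: list[socket.socket] = []
    port = 0  # let the kernel choose a free port for the first socket
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Must be set before bind() on every socket that shares the port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        sock.listen(16)
        port = sock.getsockname()[1]  # later sockets reuse the same port
        listeners.append(sock)
    return listeners
```

Without the option, the second `bind()` on the same address and port would fail with "address already in use".
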
Performance considerations
- orders of magnitude for a few typical metrics
- cost of processing for various operations
- cost of traversal for various topologies
- optimizing for lowest latency
- optimizing for highest throughput
- optimizing for TCO
Principles
- what
- why
- when
- beware of audience
Conducting a benchmark
- define purpose
- define expected metrics
- define ideal conditions
- take note of real conditions
- ensure reproducibility / minimise noise
- problems are part of the process
- report
Archived results
- one per page : date, title, report