Distill text and semantic information from websites, cutting out noise and styling, using headless Chrome and the accessibility tree. Essentially a glorified website -> markdown converter for now (see wikipedia_sample.md for a demo).
# set up headless Chrome and uBlock Origin
go run cmd/setup/main.go
# run the distiller
cd cmd/distill-test
go build && ./distill-test
# check the results
cat out_en.wikipedia.org.md