In this project, we present Percival, a browser-embedded, lightweight, deep-learning-powered ad blocker. Percival is built into Blink, the Chromium rendering engine. It embeds itself into the image rendering pipeline, which makes it possible to intercept the rendering of iframes and of images created by complex JavaScript transformations, as well as GIFs and regular images. Percival inspects these image frames and blocks them based on deep learning image classification.
- Overall Architecture
- Install and Run Percival
- Browsing with Percival
- Google Image Search Results
- Detailed Architecture
- Acknowledgement
This code was tested with Chromium 74.0.3691.0 (64-bit) on Ubuntu 16.04 and macOS High Sierra 10.13.6.
* A 64-bit Intel machine with at least 8 GB of RAM; 16 GB is recommended.
* 100 GB of free disk space
* Python 2
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
export PATH="$PATH:/path/to/depot_tools"
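Before fetching the source, it can help to confirm that the depot_tools commands used in the following steps are reachable. This is a minimal sketch that only inspects `PATH`; the variable name `missing` is introduced here for illustration.

```shell
# Report any depot_tools command that is not reachable through PATH.
missing=0
for tool in fetch gclient gn autoninja; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
echo "$missing of 4 depot_tools commands are missing from PATH"
```

If any command is reported missing, re-check the path given in the `export PATH` line.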
mkdir chromium && cd chromium
fetch --nohooks chromium
cd src
./build/install-build-deps.sh
gclient runhooks
gn args out/fastbuild
This opens an editor. Add the following lines to the file:
is_debug = false
is_component_build = true
use_jumbo_build = true
symbol_level = 0
enable_nacl = false
After you save and exit the editor, gn generates the build files. For this project, we did not use the Icecc distributed compiler or ccache.
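If you would rather not use the interactive editor, the same arguments can be written to `out/fastbuild/args.gn` directly. This is a minimal sketch, assuming it is run from the `src` directory; `gn gen` is only invoked if `gn` is found on `PATH`.

```shell
# Write the build arguments listed above without opening an editor.
mkdir -p out/fastbuild
cat > out/fastbuild/args.gn <<'EOF'
is_debug = false
is_component_build = true
use_jumbo_build = true
symbol_level = 0
enable_nacl = false
EOF

# Regenerate the build files from args.gn (skipped if gn is not on PATH).
if command -v gn >/dev/null 2>&1; then
  gn gen out/fastbuild
else
  echo "gn not found; add depot_tools to PATH and run: gn gen out/fastbuild"
fi
```

This is convenient for scripted or repeated builds, since the arguments live in a file you can keep under version control.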
autoninja -C out/fastbuild chrome
Copy darknight.patch from /patches/darknight.patch to the src directory, then apply it:
git apply darknight.patch
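Because the patch was written against a specific Chromium revision, it may not apply cleanly to a different checkout. One cautious approach, sketched here, is to do a dry run with `git apply --check` first; the `patch_status` variable is introduced only for illustration.

```shell
# Dry run first: --check exits non-zero and changes nothing
# if the patch does not apply cleanly to the current checkout.
if git apply --check darknight.patch; then
  git apply darknight.patch
  patch_status=applied
else
  echo "darknight.patch does not apply cleanly to this checkout" >&2
  patch_status=rejected
fi
echo "patch status: $patch_status"
```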
In-browser model coming soon.
./out/fastbuild/chrome --no-sandbox
To evaluate the performance of our system, we loaded the top 5,000 URLs from Alexa in Chromium compiled on Ubuntu Linux 16.04, with and without Percival activated. We also tested Percival in Brave, a privacy-oriented Chromium-based browser that blocks ads using filter lists by default. For each experiment we measured render time. Our evaluation shows an increase in render time of 178.23 ms when Percival runs in the rendering critical path of Chromium, and of 281.85 ms when it runs inside Brave with its ad blocker and shields turned on.
This delay is a function of the number and complexity of the images on the page and of the time the classifier takes to classify each of them. We measured the rendering time impact with each image classified synchronously.
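As a rough way to read this, and only as a simplified model rather than something derived from the measurements, the added render time on a page with $N$ images under synchronous classification is approximately the sum of the per-image classification times. The symbols $\Delta T_{\text{render}}$, $N$, $I_i$, and $t_{\text{classify}}$ are introduced here for illustration.

$$
\Delta T_{\text{render}} \approx \sum_{i=1}^{N} t_{\text{classify}}(I_i)
$$

Here $t_{\text{classify}}(I_i)$ depends on the complexity of image $I_i$, so pages with more or larger images would be expected to see a larger delay.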
Render time evaluation in Chromium and the Brave browser.
Results from browsing the web with Percival. Percival can block ads and sponsored content from popular websites.
Browsing yahoo.com with Percival
Browsing cnn.com with Percival
We used Google Images to fetch images from distributions that have high or low ad intent. For example, we fetched the results for the search query "Advertisement".
Google Image search results for query "Obama"
Google Image search results for query "Liverpool"
Google Image search results for query "Pastry"
Google Image search results for query "Coffee"
Any web page can be thought of as a collection of HTML, CSS, and JavaScript code delivered over the network. The rendering engine parses this code, builds the DOM tree and the layout tree, and issues OpenGL calls via Skia (Google's graphics library).
Having built the DOM tree and processed the style sheets, the browser next calls the rendering engine to determine the visual geometry of all the elements, i.e. to compute the coordinates of the rectangles that these elements occupy on the screen. This is called the layout stage, and the resulting information is stored in the layout tree. Once the geometry of the objects is known, Blink paints these elements, i.e. it records paint operations in a list of display items (an abstraction for the objects the user will see in the content area).
This is followed by rasterization, which takes these display items and turns them into bitmaps. Rasterization issues OpenGL draw calls via the Skia library, which abstracts hardware operations.
We would like to thank Steven Kobes, Vladimir Levin, Aleksandar Stojiljkovic, and the entire Chromium Graphics Team for the extensive documentation, presentations, and Google Docs detailing various parts of the graphics pipeline.
We would also like to thank Tobias Hermann for Frugally Deep. Percival uses a fork of Frugally Deep.