Instrumentation Schema Documentation

HTTP Requests

Column Name Description
id unique record ID
crawl_id ID of the crawler instance (maps to crawl table)
visit_id ID of the page visit (maps to site_visits table)
url URL requested in the HTTP request
top_level_url The URL of the tab responsible for the request (the URL that appears in the address bar)
method HTTP method
referrer HTTP Referer header URL
headers JSON-stringified HTTP headers
is_XHR True if the request is an XHR
is_frame_load True if the request is for an iframe document
is_full_page True if the request is for a top-level (i.e. tab) document
is_third_party_channel True if the URL being requested is third-party to the window requesting it. See the IDL definition
is_third_party_window True if the window responsible for the request is third-party. See the IDL definition
triggering_origin The origin of the triggeringPrincipal for this request. The triggeringPrincipal is defined in this IDL
loading_origin The origin of the loadingPrincipal for this request. The loadingPrincipal is defined in this IDL
loading_href The URL of the document from which this request is loading. That is, the tab's URL if loading in the main context, or the iframe document's URL if loading in an iframe.
req_call_stack JS call stack that triggered the request (if triggered in JavaScript)
content_policy_type An integer identifier for the HTML tag that resulted in this request. See nsIContentPolicy.idl for a description of the types.
post_body The body content of the POST, if it exists.
channel_id A unique ID which identifies an HTTP channel. It can be used to link data across the http_requests, http_responses, and http_redirects tables. Although this ID should be globally unique, we recommend using both the channel_id and visit_id when linking rows across tables.
time_stamp Time stamp indicating when the request occurred.
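
The recommendation to link rows on both channel_id and visit_id can be sketched as follows. This is a minimal illustration and not the project's own tooling. It assumes the data lives in SQLite. The http_responses columns used here (in particular response_status) are hypothetical. The TEXT type for channel_id and the list-of-pairs shape of the decoded headers are also assumptions. The toy data deliberately reuses one channel_id across two visits to show why matching on both keys is the safer choice.

```python
import json
import sqlite3


def build_toy_db():
    """Create an in-memory database with simplified, assumed table layouts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE http_requests ("
        "id INTEGER PRIMARY KEY, visit_id INTEGER, channel_id TEXT, "
        "url TEXT, headers TEXT)"
    )
    # Hypothetical response table: only the linking keys plus one status column.
    conn.execute(
        "CREATE TABLE http_responses ("
        "id INTEGER PRIMARY KEY, visit_id INTEGER, channel_id TEXT, "
        "response_status INTEGER)"
    )
    # Assumed header encoding: a JSON list of [name, value] pairs.
    headers = json.dumps([["Accept", "*/*"]])
    conn.executemany(
        "INSERT INTO http_requests (visit_id, channel_id, url, headers) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, "c1", "https://example.com/a", headers),
            (2, "c1", "https://example.com/b", headers),  # same channel_id, other visit
        ],
    )
    conn.executemany(
        "INSERT INTO http_responses (visit_id, channel_id, response_status) "
        "VALUES (?, ?, ?)",
        [(1, "c1", 200), (2, "c1", 404)],
    )
    return conn


def link_requests_to_responses(conn):
    """Join requests to responses on (visit_id, channel_id) and decode headers."""
    query = """
        SELECT req.visit_id, req.channel_id, req.url, req.headers,
               resp.response_status
        FROM http_requests AS req
        JOIN http_responses AS resp
          ON req.visit_id = resp.visit_id
         AND req.channel_id = resp.channel_id
        ORDER BY req.visit_id, req.channel_id
    """
    linked = []
    for visit_id, channel_id, url, headers, status in conn.execute(query):
        linked.append(
            {
                "visit_id": visit_id,
                "channel_id": channel_id,
                "url": url,
                "headers": json.loads(headers),  # stored as a JSON string
                "response_status": status,
            }
        )
    return linked


if __name__ == "__main__":
    for row in link_requests_to_responses(build_toy_db()):
        print(row["visit_id"], row["url"], row["response_status"])
```

In this toy data, joining on channel_id alone would pair each request with both responses. Adding visit_id to the join condition keeps each request matched to the response from the same page visit.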
