
Google Search Console API


Google Search Console API

Overview

The Google Search Console API provides programmatic access to Search Analytics report data for sites verified in Search Console, such as clicks and CTR broken down by dimensions like date, query, and page.

More details can be found in Google's documentation.

Configuration

| Param | Description | Example |
| --- | --- | --- |
| `start_date` | Date phrase or date string for the start date of the query | `2022-01-01`, `today`, `yesterday`, `3_days_ago`, `5_months_ago`, `1_year_ago` |
| `end_date` | Date phrase or date string for the end date of the query | `2022-01-01`, `today`, `yesterday`, `3_days_ago`, `5_months_ago`, `1_year_ago` |
| `metrics` | List of metrics to batch-query from the API | `[ctr, clicks]` |
| `dimensions` | List of dimensions to batch-query from the API | `[date]` |
| `search_type` | GSC search type | `web`, `image`, `news`, etc. |
| `row_limit` | Maximum number of rows returned per response page | `25000` |
| `aggregation_type` | How values are aggregated over the given metrics and dimensions | `auto`, `byPage` or `byProperty` |
| `credentials` | Expects a `.pickle` file that you can generate with the feature described below | `creds.pickle` |
| `dimensionFilterGroups` | Filters that row values must match or exclude, depending on the operator | `[{"groupType":"string","filters":[{"dimension":"string","operator":"string","expression":"string"}]}]` |
| `data_state` | If `all` (case-insensitive), data will include fresh data. If `final` (case-insensitive) or omitted, only finalized data is returned | `all`, `final` |
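For example, a configuration that uses the relative date phrases to pull fresh data for the last three days could look like this (a minimal sketch based on the parameters above; `site_url`, `credentials` and the filter options are omitted for brevity):

```json
{
  "start_date": "3_days_ago",
  "end_date": "yesterday",
  "metrics": ["clicks", "ctr"],
  "dimensions": ["date"],
  "search_type": "web",
  "row_limit": 25000,
  "data_state": "all"
}
```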

Search Types

| Type | Description |
| --- | --- |
| `discover` | Discover results. |
| `googleNews` | Results from news.google.com and the Google News app on Android and iOS. Doesn't include results from the "News" tab in Google Search. |
| `news` | Search results from the "News" tab in Google Search. |
| `image` | Search results from the "Image" tab in Google Search. |
| `video` | Video search results. |
| `web` | [Default] Filter results to the combined ("All") tab in Google Search. Does not include Discover or Google News results. |

Dimension Filter Groups

| Operator | Description |
| --- | --- |
| `contains` | The row value must either contain or equal your expression (non-case-sensitive). |
| `equals` | [Default] Your expression must exactly equal the row value (case-sensitive for page and query dimensions). |
| `notContains` | The row value must not contain your expression, either as a substring or as a (non-case-sensitive) complete match. |
| `notEquals` | Your expression must not exactly equal the row value (case-sensitive for page and query dimensions). |
| `includingRegex` | An RE2-syntax regular expression that must be matched. |
| `excludingRegex` | An RE2-syntax regular expression that must NOT be matched. |

Aggregation Type

| Type | Description |
| --- | --- |
| `auto` | [Default] Let the service decide the appropriate aggregation type. |
| `byPage` | Aggregate values by URI. |
| `byProperty` | Aggregate values by property. Not supported for `type=discover` or `type=googleNews`. |

Example

An example of the configuration object could look like:

```json
{
  "start_date": "2022-02-01",
  "end_date": "2022-02-05",
  "dimensions": [
    "date"
  ],
  "metrics": [
    "clicks",
    "ctr"
  ],
  "search_type": "",
  "row_limit": 25000,
  "site_url": "https://www.mywebsite.com/",
  "aggregation_type": "auto"
}
```

Setting up a Python pipeline:

```python
from turbo_stream.google_search_console.reader import GoogleSearchConsoleReader


def google_search_console_pipeline():
    reader = GoogleSearchConsoleReader(
        configuration={
            "start_date": "2022-02-01",
            "end_date": "2022-02-05",
            "dimensions": ["date"],
            "metrics": ["clicks", "ctr"],
            "search_type": "",
            "row_limit": 25000,
            "site_url": "https://www.mywebsite.com/",
            "aggregation_type": "auto",
        },
        credentials="creds.pickle",
    )

    data = reader.run_query()  # run the query defined above
    print(data)  # option to return the response object as a flat JSON structure

    # writing to AWS S3 is supported; the key's file extension selects the format
    reader.write_data_to_s3(bucket="my-bucket", key="path/data.json")
    reader.write_data_to_s3(bucket="my-bucket", key="path/data.csv")
    reader.write_data_to_s3(bucket="my-bucket", key="path/data.parquet")

    # anything else will be written as a blob with its given extension

    # additional option to partition the data before writing to S3:
    # file names are grouped in the bucket by a given field (commonly a date
    # field), which allows re-writing to S3 without creating duplicates
    reader.write_partition_data_to_s3(bucket="my-bucket", path="my/path", partition="date", fmt="json")
    reader.write_partition_data_to_s3(bucket="my-bucket", path="my/path", partition="date", fmt="csv")
    reader.write_partition_data_to_s3(bucket="my-bucket", path="my/path", partition="date", fmt="parquet")
```
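With partitioning, each value of the partition field becomes part of the object key, so re-running the same date range overwrites the same objects rather than appending duplicates. A plausible resulting layout (the exact key naming below is an assumption, not confirmed behaviour of the library):

```
s3://my-bucket/my/path/2022-02-01.json
s3://my-bucket/my/path/2022-02-02.json
```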

Features

turbo-stream comes with detailed logging, so all pipelines can be tracked via their logs. The request object staggers requests to reduce rolling-quota issues and retries on common Google API timeout and HttpError errors, re-attempting the query up to 5 times before failing. This behaviour is common across all vendors.
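Conceptually, the retry behaviour is similar to the sketch below (illustrative only; the function name and delay values are assumptions, not the library's internals):

```python
import time

from googleapiclient.errors import HttpError


def run_with_retries(query_fn, attempts=5, base_delay=5):
    """Re-run query_fn on common Google API errors, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return query_fn()
        except (HttpError, TimeoutError):
            if attempt == attempts:
                raise  # give up after the final attempt
            # stagger the next request to ease rolling-quota pressure
            time.sleep(base_delay * attempt)
```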

A user-friendly method to generate a `.pickle` file for future authentication is available in the reader as the `generate_authentication()` method. The first time, you will need to log in with your web browser via the web authentication flow; after that, your credentials are saved in a pickle file. Every subsequent time you run the script, it uses the "pickled" credentials to build the connection to Search Console.

```python
from turbo_stream.google_search_console.reader import GoogleSearchConsoleReader


def main():
    reader = GoogleSearchConsoleReader(
        configuration={},
        credentials="google_search_console_creds.json",  # to generate, make use of the secrets.json
    )

    reader.generate_authentication(auth_file_location="google_search_console_creds.pickle")


if __name__ == "__main__":
    main()
```
