New Source: Google Search Console #2257
Comments
Thank you @JordanChoo! Are you sure the non-sampled data is available through their API? |
Yes, @michel-tricot. Within the UI you can only export 1K rows of data across a single dimension, while with the API you can pull 1M rows across multiple dimensions. To date I've been using a janky Supermetrics --> GSheet --> BigQuery workaround. |
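For anyone following along, here is a rough sketch of what "multiple dimensions" looks like against the Search Analytics query endpoint. It only builds the JSON request bodies and does not send them. The field names (`startDate`, `endDate`, `dimensions`, `rowLimit`, `startRow`) are the ones the Search Console API uses as far as I know; the 25,000 per-request cap, the helper name `build_query_bodies`, and the `total_rows` ceiling are assumptions made for illustration, not something confirmed in this thread.

```python
from typing import Dict, Iterator, List

# Assumption: the query endpoint returns at most 25,000 rows per request,
# so larger pulls have to be paged with startRow.
MAX_ROWS_PER_REQUEST = 25_000


def build_query_bodies(
    start_date: str,
    end_date: str,
    dimensions: List[str],
    total_rows: int,
    page_size: int = MAX_ROWS_PER_REQUEST,
) -> Iterator[Dict]:
    """Yield one request body per page until total_rows rows are covered.

    total_rows is a hypothetical ceiling chosen by the caller; a real client
    would more likely keep paging until a response comes back short.
    """
    if not 0 < page_size <= MAX_ROWS_PER_REQUEST:
        raise ValueError("page_size must be between 1 and 25,000")
    for start_row in range(0, total_rows, page_size):
        yield {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": list(dimensions),
            "rowLimit": min(page_size, total_rows - start_row),
            "startRow": start_row,
        }


bodies = list(
    build_query_bodies(
        "2021-04-01",
        "2021-04-30",
        ["date", "country", "device", "query"],
        total_rows=60_000,
    )
)
```

Each body would be POSTed to the site's `searchAnalytics/query` endpoint with an authorized client; that part is left out here on purpose.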
@JordanChoo would you be interested in contributing this as a connector to Airbyte? We can offer any support needed (code pairing, etc.), and the bonus is that maintenance can be shared between Airbyte/the community/you. |
@sherifnada, at this time I wouldn't be able to help out on this since I only know JS right now 😢 |
Q1: Is there anyone working on this? Q2: Is it any more complicated than wrapping the Singer tap? ( https://github.com/singer-io/tap-google-search-console ) |
@michael4tasman we're not currently working on this but plan to offer it some time in the next couple of months. The only friction is setting up the sandbox environment/CI to test that the connector is working on a recurring basis. We have already verified a domain for Google Search Console, so at this point we are ready to generate an API key and start querying it during CI.
Thanks for sharing the Singer tap -- it looks high quality and well maintained. So I think we can go with it! Will prioritize this shortly. Expect it sometime later this month or the first half of May. Does that work with your timeline? Alternatively you're more than welcome to open a PR which wraps the Singer tap if you'd like it sooner than that. |
I need it sooner than that, and I'm happy to work on wrapping the Singer tap, but would appreciate the offer of guidance/pairing made upthread. |
@michael4tasman glad to hear it. You can get started by using the module autogenerator as described here: https://docs.airbyte.io/contributing-to-airbyte/building-new-connector
Please feel free to book a pairing session with me at any time here: https://calendly.com/sherif-nada/code-pairing-session
If none of the times work for you, please reach out at sherif@airbyte.io to find a different time. |
Integration Vetting
Webhook-based? (no/partially/yes): no
Available authentication modes (API key/Oauth/other): Oauth2.0, but can use
Creating an account: Already created, but need to enable the Google Search Console API
How to populate the account with data? For working with Google Search Console, we have to create a
Available streams for sync
Integration supports incremental sync? Only for
Other information/blockers: Can use Singer Tap for Google Search Console. |
@michael4tasman good news! We'll start work on this sooner than scheduled. Expecting delivery this week or the next one. |
We can use the Singer Tap to implement the new source. Credentials need to be generated according to the instructions and stored in Lastpass. The Singer Tap supports full and incremental syncing. The account requires data populating: we can do this via the API for the Sites and Sitemaps streams, but not for the Performance Reports streams. The Singer Tap handles errors and rate limits using backoff.
Blockers
Task breakdown
|
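To make the "handles errors and rate limits using backoff" point concrete, here is a minimal stdlib-only sketch of retrying with exponential backoff. It is not the Singer tap's actual code: `RateLimitError`, `call_with_backoff`, and all the default values are hypothetical names and numbers picked for the example.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when the API reports a quota or rate limit."""


def call_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (RateLimitError,),
    max_tries: int = 5,
    base_delay: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given errors with exponentially growing waits."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(max_tries):
        try:
            return func()
        except retry_on:
            if attempt == max_tries - 1:
                raise  # out of attempts, surface the last error
            delay = base_delay * factor ** attempt
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the behaviour can be checked without actually waiting, for example by passing `delays.append` and inspecting the recorded delays.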
@vitaliizazmic can you verify if we need to populate any data manually if we already have a live website serving traffic linked to Google Search Console? |
@sherifnada - if you want to backfill data (GSC provides up to 16 months), that would have to be done manually. |
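As a rough illustration of what a manual backfill over that 16-month span could look like, the sketch below computes the earliest date and splits the range into fixed-size windows that could each be queried separately. The 16-month figure comes from the comment above; the helper names and the 30-day window size are assumptions chosen for the example.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`, clamping the day of month."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Find the last day of the target month to clamp dates like the 31st.
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    last_day = (first_of_next - timedelta(days=1)).day
    return date(year, month, min(day.day, last_day))


def date_windows(start: date, end: date, window_days: int = 30) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (window_start, window_end) pairs covering start..end."""
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=window_days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)


today = date(2021, 4, 30)  # fixed date so the example is reproducible
backfill_start = months_before(today, 16)
windows = list(date_windows(backfill_start, today))
```

Each window's start and end could then be used as the `startDate` and `endDate` of a performance report query.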
thanks for the heads up @JordanChoo ! |
@vitaliizazmic heads up -- the Singer tap doesn't currently support service-account based OAuth. See the client class here. We should fork and change this class to implement JWT OAuth as described here. I've added the service account credentials to Lastpass. They were generated using OAuth with domain-wide delegation. |
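For context on what the JWT flow involves, here is a sketch of the claim set a service account would sign when requesting an access token, including the `sub` claim used for domain-wide delegation. This is not the fork's code. It only builds the claims: signing with the service account's private key (RS256) and exchanging the signed JWT at the token endpoint are left out. The function name `build_jwt_claims` and the example email addresses are hypothetical.

```python
import time
from typing import Dict, Optional, Union

TOKEN_URI = "https://oauth2.googleapis.com/token"
# Read-only scope for Search Console data.
READONLY_SCOPE = "https://www.googleapis.com/auth/webmasters.readonly"


def build_jwt_claims(
    service_account_email: str,
    subject: Optional[str] = None,
    scope: str = READONLY_SCOPE,
    now: Optional[int] = None,
    lifetime_seconds: int = 3600,
) -> Dict[str, Union[str, int]]:
    """Build the unsigned JWT claim set for a service-account token request."""
    if not 0 < lifetime_seconds <= 3600:
        raise ValueError("lifetime_seconds must be between 1 and 3600")
    issued_at = int(time.time()) if now is None else now
    claims: Dict[str, Union[str, int]] = {
        "iss": service_account_email,
        "scope": scope,
        "aud": TOKEN_URI,
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
    }
    if subject is not None:
        # With domain-wide delegation, `sub` names the user being impersonated.
        claims["sub"] = subject
    return claims


claims = build_jwt_claims(
    "connector@example-project.iam.gserviceaccount.com",  # hypothetical account
    subject="admin@example.com",  # hypothetical delegated user
    now=1_600_000_000,
)
```

In practice a library such as `google-auth` takes care of building, signing, and exchanging this token, so a hand-rolled version is mostly useful for understanding what the fork needs to send.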
@sherifnada I've checked the service account credentials from Lastpass. Everything works fine: I fetched sites, sitemaps and the performance report. I think that will be enough and we don't need to populate data manually. |
* Google search console source #2257 - new source
* Google search console source #2257 - reformat
* Google search console source #2257 - adding gcc to docker container
* Google search console source #2257 - remove unused files, update acceptance test configs, change tap repo to airbyte
* Google search console source #2257 - updating acceptance tests configs
* Google search console source #2257 - updating acceptance cursor_paths
* Google search console source #2257 - temporary disable tests
* Google search console source #2257 - disable performance_report_date stream
* Google search console source #2257 - disable performance_report_date stream (update docs)
* Google search console source #2257 - disable performance_report_date stream for tests
* Google search console source #2257 - updating singer tap fork
Hey everyone - we just released Google Search Console. Add it by going to the Admin page in the UI and adding it as a new connector. Parameters to add it are:
big thanks to @vitaliizazmic for making it happen! |
* Jira source #1389 - adding schemas for streams
* Jira source #1389 - supporting streams
* Jira source #1389 - creating_project script
* Jira source #1389 - updating docs
* Jira source #1389 - fixing check method
* Jira source #1389 - uploading missing schemes
* Jira source #1389 - disabling JQL and Server info streams
* Jira source #1389 - fixing according to PR comments
* Jira source #1389 - fixing filter_sharing and screen_tab_fields streams
* Update airbyte-integrations/connectors/source-jira/source_jira/client.py
* Google search console source #2257 - improving configured catalog(adding sync_mode and destination_sync_mode to streams)
* Jira Source - incremental sync
* Jira source #1390 - issues incremental sync
* Jira source #1390 - issue worklogs incremental sync
* Source Jira #1390 - incremental sync improving
* Source Jira #1390 - migrating to airbyte-cdk, creating CHANGELOG.md
* Source Jira #1389 - reformat
* Jira Source HTTP CDK
* Source Jira #3453 - cleaning branch
* Source Jira #3453 - cleaning branch (fix)
* Source Jira #3453 - abstractmethod get_updated_state
* Jira dummy data #2100 #2101
* Jira source #2100 - data generator
* Jira source #2100 - issue related streams populating
* Jira source #2100 - project related streams populating
* Jira source #2101 - populating data for non issue or project related streams
* Source Jira #2100 - improving according to comments
* Source Jira #2100 - format
* Source Jira #1389 - bump version
* Source Jira #1389 - enabling base_read acceptance test divided by stream groups
* Source Jira #1389 - bump version
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Tell us about the new integration you’d like to have
Which source and which destination? Which frequency?
Google Search Console to BigQuery on a daily basis
Describe the context around this new integration
Which team in your company wants this integration, what for? This helps us understand the use case.
The Google Search Console UI samples the data and provides very limited export functionality. Saving GSC data programmatically through the API gives you all of the data without any sampling.
This information will allow us to gain deeper insight into how a website and/or URL is performing across countries, keywords and devices. Saving the data into BigQuery ensures that we can easily visualize it in Google Data Studio and in other data viz platforms that integrate with BigQuery.
Describe the alternative you are considering or using
What are you considering doing if you don’t have this integration through Airbyte?
Planning on building out the custom pipeline myself.