feat: introduce organic keywords for ahrefs client and top keyword in top pages #257

Merged
dzehnder merged 10 commits into main from ahrefs-organic-keywords on Jun 19, 2024

Conversation

@dzehnder dzehnder commented Jun 8, 2024

In the context of SITES-22601 (auto-fix broken backlinks), we need access to a page's keywords to determine the best possible alternate URL (as suggested by research).

Cost estimation for the API consumption:

  • Request per page:
    Worst Case:
    Each row in the query costs 12 Units (sum_traffic: 10, keyword: 1, best_position_url: 1)
    Using a limit of 200 rows per request: 2400 Units (12 x 200).
    For 200 broken backlinks: 480k Units (2400 x 200)

  • For import:
    How many keywords should be imported?
    For example, page with 20k keywords (with traffic):
    Getting all keywords: 240k Units (12 x 20 000)
    Getting top 2k keywords: 24k Units (12 x 2000)
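
The arithmetic above can be reproduced with a small sketch. The per-field unit costs are the ones quoted in this description, not values checked against Ahrefs pricing, and `estimateUnits` is a hypothetical helper that is not part of the client.

```javascript
// Per-row unit costs as quoted in the PR description (assumed, not verified).
const FIELD_COSTS = { sum_traffic: 10, keyword: 1, best_position_url: 1 };

// Hypothetical helper: units consumed by `requests` requests of `rows` rows each.
function estimateUnits(fields, rows, requests = 1) {
  const perRow = fields.reduce((sum, field) => {
    if (!Object.prototype.hasOwnProperty.call(FIELD_COSTS, field)) {
      throw new Error(`No unit cost known for field: ${field}`);
    }
    return sum + FIELD_COSTS[field];
  }, 0);
  return perRow * rows * requests;
}

const fields = ['sum_traffic', 'keyword', 'best_position_url'];
console.log(estimateUnits(fields, 200)); // 2400: one page, 200 rows
console.log(estimateUnits(fields, 200, 200)); // 480000: 200 broken backlinks
console.log(estimateUnits(fields, 20000)); // 240000: all 20k keywords
console.log(estimateUnits(fields, 2000)); // 24000: top 2k keywords
```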

@dzehnder dzehnder added the enhancement (New feature or request) label Jun 8, 2024
@dzehnder dzehnder self-assigned this Jun 8, 2024

github-actions bot commented Jun 8, 2024

This PR will trigger a minor release when merged.

@@ -174,4 +174,28 @@ export default class AhrefsAPIClient {

return this.sendRequest('/site-explorer/metrics-history', queryParams);
}

async getOrganicKeywords(url, country = 'us', keywordFilter = [], limit = 200) {
Member

missing input parameter validation
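
One possible shape for such validation is sketched below. The helper name `validateOrganicKeywordsParams` and the exact rules (for example, a two-letter lowercase country code, inferred only from the `'us'` default) are assumptions for illustration, not the checks that were actually merged.

```javascript
// Hypothetical validation helper; the rules below are illustrative assumptions.
function validateOrganicKeywordsParams(url, country, keywordFilter, limit) {
  if (typeof url !== 'string' || url.trim() === '') {
    throw new Error('url must be a non-empty string');
  }
  // Assumption: countries are two-letter lowercase codes such as 'us'.
  if (typeof country !== 'string' || !/^[a-z]{2}$/.test(country)) {
    throw new Error('country must be a two-letter lowercase code');
  }
  if (!Array.isArray(keywordFilter) || keywordFilter.some((k) => typeof k !== 'string')) {
    throw new Error('keywordFilter must be an array of strings');
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error('limit must be a positive integer');
  }
}
```

Calling a helper like this at the top of `getOrganicKeywords` would reject bad input before any API units are spent.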

Contributor

Do we need a country? What if this is a global site?
I wouldn't set the default to us, but rather to none.

Contributor Author

@dzehnder dzehnder Jun 11, 2024

Yes, the country is mandatory for this Ahrefs request because keywords are country-specific.
With the Google API this is not the case: there you can get keywords across all countries, which in some cases produces confusing results (they then need to be filtered by subdomain or path, which ultimately leads to the same issue).

I see two options:

  • We extract the country from either the URL or the hreflang tag.
  • We collect the keywords from the top pages (already suggested by @iuliag through other channels). This might have some disadvantages, as top pages differ from keyword rankings. On the other hand, it saves an additional Ahrefs API request. In this case we could also increase the limit and import, say, the top 2k pages for better accuracy.
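
A minimal sketch of the first option follows. All three helpers are hypothetical and deliberately simple: they only handle `language-country` patterns such as `de-CH` in an hreflang value or `/de-ch/` as the first URL path segment, and return `null` whenever the country cannot be inferred.

```javascript
// Hypothetical helper: country from an hreflang value such as "de-CH".
// Values without a region ("en", "x-default") yield null.
function countryFromHreflang(hreflang) {
  if (typeof hreflang !== 'string') return null;
  const match = /^[a-z]{2,3}-([a-z]{2})$/i.exec(hreflang.trim());
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: country from a first path segment like /de-ch/ or /fr_ca/.
// A bare two-letter segment is ignored because it could be a language code.
function countryFromUrl(url) {
  let pathname;
  try {
    ({ pathname } = new URL(url));
  } catch {
    return null; // not a parseable absolute URL
  }
  const first = pathname.split('/').filter(Boolean)[0] || '';
  const match = /^[a-z]{2}[-_]([a-z]{2})$/i.exec(first);
  return match ? match[1].toLowerCase() : null;
}

// Prefer hreflang, then the URL, then an explicit fallback.
function resolveCountry(url, hreflang, fallback = null) {
  return countryFromHreflang(hreflang) || countryFromUrl(url) || fallback;
}
```

Sites that encode the country elsewhere (subdomain, ccTLD, or not at all) would still need a fallback, which is part of why the second option remains attractive.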

packages/spacecat-shared-ahrefs-client/src/index.js (outdated, resolved)

iuliag commented Jun 11, 2024

I'm thinking we might also want to try just adding top_keyword to the top pages query (https://github.com/adobe/spacecat-shared/blob/main/packages/spacecat-shared-ahrefs-client/src/index.js#L122), as that would increase the current cost of top pages by only 1 unit per row (200 units for the top pages of the whole site).

@iuliag iuliag left a comment

👍🏻
With an observation to address below.

@@ -174,4 +174,28 @@ export default class AhrefsAPIClient {

return this.sendRequest('/site-explorer/metrics-history', queryParams);
}

async getOrganicKeywords(url, country = 'us', keywordFilter = [], limit = 200) {

Do we need a country? What if this is a global site?
I wouldn't set the default to us, but rather to none.

@dzehnder dzehnder changed the title from "feat: introduce organic keywords for ahrefs client" to "feat: introduce organic keywords for ahrefs client and top keyword in top pages" Jun 19, 2024
@dzehnder dzehnder merged commit 371f1c4 into main Jun 19, 2024
9 checks passed
@dzehnder dzehnder deleted the ahrefs-organic-keywords branch June 19, 2024 09:56
adobe-bot pushed a commit that referenced this pull request Jun 19, 2024
# [@adobe/spacecat-shared-ahrefs-client-v1.3.0](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-ahrefs-client-v1.2.6...@adobe/spacecat-shared-ahrefs-client-v1.3.0) (2024-06-19)

### Features

* introduce organic keywords for ahrefs client and top keyword in top pages ([#257](#257)) ([371f1c4](371f1c4))
@adobe-bot

🎉 This PR is included in version @adobe/spacecat-shared-ahrefs-client-v1.3.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

adobe-bot pushed a commit that referenced this pull request Jun 19, 2024
# [@adobe/spacecat-shared-data-access-v1.30.0](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-data-access-v1.29.2...@adobe/spacecat-shared-data-access-v1.30.0) (2024-06-19)

### Features

* introduce organic keywords for ahrefs client and top keyword in top pages ([#257](#257)) ([371f1c4](371f1c4))
@adobe-bot

🎉 This PR is included in version @adobe/spacecat-shared-data-access-v1.30.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Labels
enhancement (New feature or request), released
4 participants