<a href="https://colab.research.google.com/github/simodepth/how-to-define-search-intent-with-Python.md/blob/main/How_to_define_Search_Intent_to_GSC_Queries_with_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Apply Search Intent to each of your Google Search Console Queries


---

As the May Core Update rages on and semantic search turns the tide in SEO strategy, we have little choice but to work around **search intent**.

The following Python notebook is designed to help you automatically assign a search intent to each query you export from the Performance report in Google Search Console.


# Requirements and Assumptions

1. Provide your [Knowledge Graph API](https://developers.google.com/knowledge-graph/how-tos/authorizing) key every time you run the script
2. Export your queries from GSC as a UTF-8 encoded CSV file
3. Make sure the exported CSV has a column named **"Top queries"**
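
A quick check on the export can catch a missing or renamed column before the rest of the notebook runs. The sketch below is only an illustration: `load_gsc_queries` and `REQUIRED_COLUMNS` are hypothetical names that are not part of the original notebook, and the column list assumes the headers this notebook selects later on.

```python
import pandas as pd

# Hypothetical names for this sketch; the list assumes the headers used later in this notebook.
REQUIRED_COLUMNS = ["Top queries", "Clicks", "Impressions", "CTR", "Position"]

def load_gsc_queries(path):
    """Read a GSC query export and fail early if an expected column is missing."""
    df = pd.read_csv(path, encoding="utf-8")
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns in {path}: {missing}")
    return df
```

Calling `load_gsc_queries("/content/Queries.csv")` would then return the DataFrame, or raise a readable error that names the missing columns.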

In [None]:
#@title Run Import Packages
import pandas as pd
import requests
import json
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
from collections import Counter
%load_ext google.colab.data_table

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


In [None]:
#@title Get the Knowledge graph API and upload Queries.csv
apikey = "YOUR_KNOWLEDGE_GRAPH_API_KEY"  # paste your own key here; never publish a real key in a shared notebook
df = pd.read_csv("/content/Queries.csv") #@param {type:"string"}
total_queries = len(df.index)
query_list = df['Top queries'].tolist()
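
Rather than pasting the key into a cell, one option is to read it from an environment variable so it never ends up in a shared notebook. This is a minimal sketch under assumptions: `get_api_key` is a hypothetical helper and `KG_API_KEY` is a variable name chosen only for illustration.

```python
import os

# Hypothetical helper; KG_API_KEY is an assumed variable name, not a Google convention.
def get_api_key(env_var="KG_API_KEY"):
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Knowledge Graph API key")
    return key
```

With the variable set, `apikey = get_api_key()` could replace the hard-coded assignment.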


In [None]:
#@title Set up intent words 
informative = ['what','who','when','where','which','why','how', 'news', 'fixtures'] #@param {type:"string"}
transactional = ['buy','order','purchase','cheap','price','tickets','shop','sale','offer'] #@param {type:"string"}
commercial = ['best','top','review','comparison','compare','vs','versus','ultimate'] #@param {type:"string"}
navigational = ['liverpool', 'liverpool fc', 'lfc'] #@param {type:"string"}

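# Match each intent's keywords against the query text (na=False skips empty cells).
# .copy() avoids pandas' SettingWithCopyWarning when the Intent column is added below.
info_filter = df[df['Top queries'].str.contains('|'.join(informative), na=False)].copy()
trans_filter = df[df['Top queries'].str.contains('|'.join(transactional), na=False)].copy()
comm_filter = df[df['Top queries'].str.contains('|'.join(commercial), na=False)].copy()
navigational_filter = df[df['Top queries'].str.contains('|'.join(navigational), na=False)].copy()

info_filter['Intent'] = "Informational"
trans_filter['Intent'] = "Transactional"
comm_filter['Intent'] = "Commercial"
navigational_filter['Intent'] = "Navigational"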

info_count = len(info_filter)
trans_count = len(trans_filter)
comm_count = len(comm_filter)
navigational_count = len(navigational_filter)
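
The filters above use plain substring matching, so a keyword such as `top` also matches inside `laptop` or `stop`. One possible refinement is to wrap the keywords in word boundaries. The sketch below is an illustration rather than the notebook's method: `build_pattern` is a hypothetical helper and the sample queries are made up.

```python
import re
import pandas as pd

def build_pattern(words):
    """Build a regex string that matches any keyword as a whole word or phrase."""
    escaped = [re.escape(w) for w in words]
    return r"\b(?:" + "|".join(escaped) + r")\b"

commercial = ['best', 'top', 'review', 'comparison', 'compare', 'vs', 'versus', 'ultimate']
# Made-up sample queries for illustration only.
sample = pd.Series(["top laptops", "laptop stand", "stop motion app", "ps5 vs xbox"])

# Substring matching, as in the cell above: flags every sample query.
substring_mask = sample.str.contains('|'.join(commercial), na=False)

# Whole-word matching: flags only "top laptops" and "ps5 vs xbox".
whole_word_mask = sample.str.contains(build_pattern(commercial), case=False, na=False)
```

The trade-off is that whole-word matching no longer catches inflected forms such as `reviews`, so those would need to be listed explicitly if they matter.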

In [None]:
#@title Search Intent Breakdown per Queries
print("Total: " + str(total_queries))
print("Info: " + str(info_count) + " | " + str(round((info_count/total_queries)*100,1)) + "%")
print("Trans: " + str(trans_count) + " | " + str(round((trans_count/total_queries)*100,1)) + "%")
print("Comm: " + str(comm_count) + " | " + str(round((comm_count/total_queries)*100,1)) + "%")
print("Navigational: " + str(navigational_count) + " | " + str(round((navigational_count/total_queries)*100,1)) + "%")

In [None]:
#@title Get the Intent per each Query
# Combine all intent matches; a query that matches several intents appears once per match
df_intents = pd.concat([info_filter,trans_filter,comm_filter,navigational_filter]).sort_values('Clicks', ascending=False)
# Keep a single row per query
df_intents = df_intents.drop_duplicates(subset='Top queries', keep="first")
df_intents = df_intents[['Top queries', 'Clicks', 'Impressions', 'Intent', 'CTR', 'Position']]
df_intents
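
A query that matches several intent lists appears several times with identical clicks, and the default sort algorithm is not guaranteed to keep tied rows in their original order, so which intent survives `drop_duplicates` can be arbitrary. One way to make this predictable is a stable sort, so that the order of the frames passed to `pd.concat` acts as the precedence. The sketch below uses made-up data and a simplified set of filters.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "Top queries": ["best lfc tickets price", "lfc news"],
    "Clicks": [50, 120],
})

info = df[df["Top queries"].str.contains("news")].assign(Intent="Informational")
trans = df[df["Top queries"].str.contains("tickets|price")].assign(Intent="Transactional")
comm = df[df["Top queries"].str.contains("best")].assign(Intent="Commercial")

# Concatenation order defines precedence: Transactional is listed before Commercial.
combined = pd.concat([info, trans, comm])

# kind="stable" keeps tied rows in concatenation order, so keep="first" picks the
# highest-precedence intent for each query.
resolved = (
    combined.sort_values("Clicks", ascending=False, kind="stable")
    .drop_duplicates(subset="Top queries", keep="first")
    .reset_index(drop=True)
)
```

Here `best lfc tickets price` matches both the transactional and the commercial filter and is kept as Transactional, because that frame comes first in the concatenation.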