<a href="https://colab.research.google.com/github/simodepth/Keyword-Research/blob/main/How_to_define_Search_Intent_to_GSC_Queries_with_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Apply Search Intent to each of your Google Search Console Queries 


---

As the semantic search is prone to turn the tide in search marketing, **search intent** must pivot your SEO strategy.

The following Python framework is designed to help you automate search intent definition on each query you export from the Performance section of your Google Search Console.


#Requirements and Assumptions

1. Upload the [Knowledge Graph API](https://developers.google.com/knowledge-graph/how-tos/authorizing) every time you need to use the script
2. Export queries from GSC in CSV and UTF-8 format
3. Have a column in the original CSV named **"Top queries"**

In [1]:
#@title Run Import Packages
import pandas as pd
import requests
import json
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
from collections import Counter
%load_ext google.colab.data_table

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


In [2]:
#@title Get the Knowledge graph API and upload Queries.csv
apikey= "#_NLP_API#" #Paste the path directory in the brackets from the left bar once the API is uploaded
df = pd.read_csv("/content/MY1Y - Queries.csv") #@param {type:"string"}
total_queries = len(df.index)
query_list = df['Top queries'].tolist()

In [3]:
#@title Set up intent words 
informative = ['what','who','when','where','which','why','how', 'news', 'fixtures'] #@param {type:"string"}
transactional = ['buy','order','purchase','cheap','price','tickets','shop','sale','offer'] #@param {type:"string"}
commercial = ['best','top','review','comparison','compare','vs','versus','ultimate'] #@param {type:"string"}
navigational = "['my first year', 'my1st year']" #@param {type:"string"}

info_filter = df[df['Top queries'].str.contains('|'.join(informative))]
trans_filter = df[df['Top queries'].str.contains('|'.join(transactional))]
comm_filter = df[df['Top queries'].str.contains('|'.join(commercial))]
navigational_filter = df[df['Top queries'].str.contains('|'.join(navigational))]

info_filter['Intent'] = "Informational"
trans_filter['Intent'] = "Transactional"
comm_filter['Intent'] = "Commercial"
navigational_filter['Intent'] = "navigational"

info_count = len(info_filter)
trans_count = len(trans_filter)
comm_count = len(comm_filter)
navigational_count = len(navigational_filter)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  if sys.path[0] == '':
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  del sys.path[0]
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  


In [4]:
#@title What is the trending search intent in your query file?
print("Total: " + str(total_queries))
print("Info: " + str(info_count) + " | " + str(round((info_count/total_queries)*100,1)) + "%")
print("Trans: " + str(trans_count) + " | " + str(round((trans_count/total_queries)*100,1)) + "%")
print("Comm: " + str(comm_count) + " | " + str(round((comm_count/total_queries)*100,1)) + "%")
print("navigational: " + str(navigational_count) + " | " + str(round((navigational_count/total_queries)*100,1)) + "%")

Total: 1000
Info: 48 | 4.8%
Trans: 9 | 0.9%
Comm: 1 | 0.1%
navigational: 1000 | 100.0%


In [6]:
#@title Get the Intent per each Query
df_intents = pd.concat([info_filter,trans_filter,comm_filter,navigational_filter]).sort_values('Clicks', ascending=False)
df_intents = df_intents.drop_duplicates(subset='Top queries', keep="first")
df_intents = df_intents[ ['Top queries'] + ['Clicks'] + ['Impressions'] + ['Intent'] + ['CTR'] + ['Position'] ]
df_intents

Unnamed: 0,Top queries,Clicks,Impressions,Intent,CTR,Position
0,my first years,4943,27431,navigational,18.02%,1.00
1,my 1st years,2127,11363,navigational,18.72%,1.00
2,personalised baby gifts,609,13375,navigational,4.55%,2.52
3,my1styears,404,2082,navigational,19.40%,1.01
4,1st years,281,1631,navigational,17.23%,1.00
...,...,...,...,...,...,...
736,1st birthday party,2,63,navigational,3.17%,11.92
735,baby boy keepsake gifts,2,63,navigational,3.17%,8.65
997,baby rucksack,1,590,navigational,0.17%,9.30
998,baby activity cube,1,490,navigational,0.20%,9.01
