<a href="https://colab.research.google.com/github/lazarinastoy/mlfoseo/blob/MLinSEO-keyword-research-scripts/Google's_Autocomplete_APIs_and_Endpoints_Keyword_Research_for_Marketers_Use_Cases_Template_by_Lazarina_Stoy_for_MLforSEO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# About this script

In [None]:
# Copyright (c) 2024 ML Marketing Consulting, ltd. - MLforSEO and Lazarina Stoy.
# All rights reserved.

# This script is provided as-is, without warranty of any kind, express or implied,
# including but not limited to the warranties of merchantability, fitness for a particular purpose,
# or noninfringement. In no event shall the authors or copyright holders be liable for any claim,
# damages, or other liability, whether in an action of contract, tort, or otherwise, arising from,
# out of, or in connection with the software or the use or other dealings in the software.

# For any questions, contact: team@mlforseo.com or www.mlforseo.com

See the associated tutorial on using this Google Colab on the MLforSEO blog: [How to use Google Autocomplete API and Place API for Keyword Suggestions with Python](https://www.mlforseo.com/machine-learning-implementation-guides/keyword-research/how-to-use-google-autocomplete-apis-for-keyword-suggestions/)

If you're interested in ML-enabled semantic keyword research, [purchase the course](https://academy.mlforseo.com/product/semantic-ml-enabled-keyword-research-course/).

# Google Search Autocomplete



This script streamlines keyword research by using Google Autocomplete. Here is what it does and how it can be useful.

The goal is to generate a comprehensive list of keyword suggestions based on user-provided seed keywords and organize them into clusters. Each suggestion is linked back to its seed keyword for easy analysis.

## Key Features:
* User-Friendly Input: The user uploads a CSV file containing up to 100 seed keywords. Each keyword serves as a starting point for generating related suggestions.
* Automated Keyword Expansion: The script queries Google Suggest for each seed keyword, exploring multiple variations by appending a space followed by each letter (a-z), plus one query with no letter appended. This ensures a wide range of suggestions, capturing potential long-tail keywords.
* Stopword Filtering: A built-in list of common stopwords removes irrelevant terms, ensuring only meaningful words contribute to clustering.
* Keyword Clustering: Suggestions are grouped based on common terms. Each suggestion is tagged with its corresponding cluster (common word) and the originating seed keyword.
* Batch Processing: The script processes keywords in batches of 5 and pauses for 0.5 seconds between requests to reduce the risk of query throttling by Google.

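
As a minimal sketch of the keyword expansion step, the helper below builds the query strings that would be sent for one seed keyword: the seed, a space, and either nothing or a single letter. The name `build_query_variations` is hypothetical and does not appear in the script; it only mirrors how the script combines each seed with its letter list.

```python
import string

def build_query_variations(seed_keyword):
    """Return the query strings for one seed (hypothetical helper for illustration)."""
    # One empty suffix plus the letters a-z, as in the script's letter list
    suffixes = [""] + list(string.ascii_lowercase)
    # Each query is the seed, a space, and the suffix
    return [f"{seed_keyword} {suffix}" for suffix in suffixes]

variations = build_query_variations("running shoes")
print(len(variations))  # 27 queries per seed keyword
print(variations[:3])   # ['running shoes ', 'running shoes a', 'running shoes b']
```

With 27 queries per seed and a 0.5 second pause after each one, a file of 100 seed keywords takes at least about 22 minutes of waiting time alone.
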
## What you get:

The output is a CSV file containing:

* Keyword: The suggested search term.
* Cluster: The common word used for grouping.
* Seed Keyword: The original keyword that generated the suggestion.


This structured format helps users easily identify high-potential keywords and their thematic clusters.
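
The sketch below illustrates, under simplifying assumptions, how rows with these three columns can be produced. It splits on whitespace instead of using NLTK tokenization, matches whole words, and uses made-up suggestions rather than real Google results. The function name `cluster_suggestions` is hypothetical.

```python
from collections import Counter

def cluster_suggestions(seed_keyword, suggestions, stop_words, top_n=200):
    """Tag each suggestion with the common words it contains (illustrative sketch)."""
    seed_words = set(seed_keyword.lower().split())
    # Count words that are neither stopwords nor part of the seed keyword
    counts = Counter(
        word
        for suggestion in suggestions
        for word in suggestion.lower().split()
        if word not in stop_words and word not in seed_words and len(word) > 1
    )
    common_words = [word for word, _ in counts.most_common(top_n)]
    # One output row per (suggestion, cluster word) pair
    rows = []
    for word in common_words:
        for suggestion in suggestions:
            if word in suggestion.lower().split():
                rows.append((suggestion, word, seed_keyword))
    return rows

# Made-up suggestions, for illustration only
sample = [
    "running shoes for women",
    "running shoes for flat feet",
    "best running shoes women",
]
rows = cluster_suggestions("running shoes", sample, stop_words={"for", "the", "a"})
for keyword, cluster, seed in rows:
    print(f"{keyword} | {cluster} | {seed}")
```

A suggestion can appear in more than one row, because it is listed once for every cluster word it contains.
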

## Who Can Benefit?
* SEO Professionals: Build content strategies around relevant, high-traffic keywords.
* Marketers: Discover trending search terms to improve ad targeting.
* Content Creators: Identify long-tail keywords for blog posts, videos, and other media.

## How to Use the Script:

1. Prepare a CSV file with a column named Keywords containing your seed keywords.
2. Upload the file when prompted.
3. The script will process the keywords, generate suggestions, and download a clustered keyword file.

This tool simplifies the often time-consuming task of keyword suggestion using Google Autosuggest, providing actionable insights in a few simple steps.
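
If you do not have a seed file yet, one way to create it is sketched below. The file name `seed_keywords.csv` and the example keywords are placeholders; the only requirement taken from the script is that the header is exactly `Keywords`.

```python
import csv

seed_keywords = ["running shoes", "trail running", "marathon training"]  # example seeds

# Write a single-column CSV whose header is exactly "Keywords"
with open("seed_keywords.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Keywords"])
    for keyword in seed_keywords:
        writer.writerow([keyword])

# Read the file back to confirm the header and row count
with open("seed_keywords.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
print(len(rows), rows[0]["Keywords"])
```
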

In [None]:
## Step 1: Upload Keyword File

import pandas as pd
import requests
import json
import time
import string
from nltk.tokenize import word_tokenize
from collections import Counter
from google.colab import files
import nltk


# Download necessary NLTK data
nltk.download('punkt')
nltk.download('punkt_tab') # Download the punkt_tab resource

# Prompt user to upload a CSV file with keywords
uploaded = files.upload()
df = pd.read_csv(list(uploaded.keys())[0])  # Read the uploaded file
# Assume the column is named 'Keywords'; drop empty cells so later string operations do not fail on NaN
keywords = df['Keywords'].dropna().astype(str).tolist()

## Step 2: Define Helper Functions

# Basic stopwords list for English
basic_stopwords = {
    'i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you',
    'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself',
    'she', 'her', 'hers', 'herself', 'it', 'its', 'itself', 'they', 'them',
    'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this',
    'that', 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been',
    'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing',
    'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until',
    'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between',
    'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to',
    'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again',
    'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why',
    'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other',
    'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than',
    'too', 'very', 's', 't', 'can', 'will', 'just', 'don', 'should', 'now'
}

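def get_google_suggestions(keyword, lang_code, letterlist):
    """Fetch suggestions for a given keyword and language from Google Suggest."""
    suggestions = []
    headers = {'User-agent': 'Mozilla/5.0'}
    url = "https://suggestqueries.google.com/complete/search"
    for letter in letterlist:
        # Let requests URL-encode the query so spaces and special characters are sent safely
        params = {'client': 'firefox', 'hl': lang_code, 'q': f"{keyword} {letter}"}
        try:
            response = requests.get(url, params=params, headers=headers, timeout=10)
            response.raise_for_status()
            result = json.loads(response.text)
        except (requests.RequestException, ValueError) as e:
            # Skip this variation instead of stopping the whole run
            print(f"Error fetching suggestions for '{keyword} {letter}': {e}")
            result = None
        if result and len(result) > 1:
            suggestions.extend(result[1])
        time.sleep(0.5)  # Short pause between requests to reduce the risk of throttling
    return suggestions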

def clean_and_cluster_suggestions(all_suggestions, stop_words, seed_words):
    """Clean suggestions by removing stopwords and tokenize them for clustering."""
    wordlist = []
    for suggestion in all_suggestions:
        words = word_tokenize(str(suggestion).lower())
        for word in words:
            if word not in stop_words and word not in seed_words and len(word) > 1:
                wordlist.append(word)
    return [word for word, count in Counter(wordlist).most_common(200)]

## Step 3: Process Keywords in Batches

lang_code = "en"  # Language code
batch_size = 5
letterlist = [""] + list(string.ascii_lowercase)  # Include empty and alphabetical combinations
all_clusters = []

# Process keywords in batches
for i in range(0, len(keywords), batch_size):
    batch_keywords = keywords[i:i + batch_size]

    # Filter out empty keywords and tokenize seed words
    batch_keywords = list(filter(None, batch_keywords))
    seed_words = [word_tokenize(keyword.lower()) for keyword in batch_keywords]
    seed_words = [item for sublist in seed_words for item in sublist]  # Flatten the list

    # Get suggestions for each keyword in the batch
    for keyword in batch_keywords:
        suggestions = get_google_suggestions(keyword, lang_code, letterlist)
        most_common_words = clean_and_cluster_suggestions(suggestions, basic_stopwords, seed_words)

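        # Assign suggestions and common words to their seed keyword
        # Match on whole tokens so that, for example, 'art' does not match 'party'
        suggestion_tokens = [(s, set(word_tokenize(str(s).lower()))) for s in suggestions]
        for common_word in most_common_words:
            for suggestion, tokens in suggestion_tokens:
                if common_word in tokens:
                    all_clusters.append([suggestion, common_word, keyword])  # Include the seed keyword here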

## Step 4: Save and Download the Result

cluster_df = pd.DataFrame(all_clusters, columns=['Keyword', 'Cluster', 'Seed Keyword'])
cluster_df.to_csv("keywords_clustered.csv", index=False)
files.download("keywords_clustered.csv")
cluster_df




# YouTube Search Autocomplete



The process_keywords function, combined with get_youtube_suggestions, is designed to help users extract YouTube search autocomplete suggestions based on a list of seed keywords. These suggestions provide insights into popular search trends and related queries on YouTube.

This tool can be especially valuable for content creators, marketers, and researchers who want to identify relevant topics, optimize video content, or analyze user search behavior on YouTube.

## Key Features
* Bulk Keyword Processing: Upload a CSV file containing multiple keywords to fetch autocomplete suggestions for all of them.
* Automated Data Extraction: The function connects to YouTube's suggestion API and retrieves suggestions in real-time.
* Customizable Outputs: Generates a downloadable CSV file that pairs each seed keyword with its autocomplete suggestions.
* Error Handling: Gracefully handles errors like missing data, connectivity issues, or API response anomalies.
* Rate-Limiting Compliance: Includes delays between API calls to avoid hitting request limits.
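
The extraction step can be sketched on its own. With `client=youtube`, the endpoint returns the data wrapped in a JavaScript function call rather than as plain JSON, so the wrapper has to be stripped before parsing. The sample string below is a hand-written assumption about the response shape, modelled on how the script indexes the parsed data; it is not captured API output, and `parse_suggest_response` is a hypothetical name.

```python
import json

def parse_suggest_response(raw_text):
    """Extract (seed, suggestion) pairs from a JavaScript-wrapped response."""
    # Keep only the outermost [...] span, dropping the function-call wrapper
    start = raw_text.find("[")
    end = raw_text.rfind("]") + 1
    if start == -1 or end == 0:
        return []
    data = json.loads(raw_text[start:end])
    seed = data[0]
    # Each entry is a list whose first element is the suggestion text
    return [(seed, item[0]) for item in data[1]]

# Hypothetical sample payload, for illustration only
sample = (
    'window.google.ac.h(["python",'
    '[["python tutorial",0],["python for beginners",0]],'
    '{"q":"x"}])'
)
pairs = parse_suggest_response(sample)
print(pairs)
```

Returning an empty list when no brackets are found keeps a malformed response from stopping a bulk run.
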


## Who Can Benefit
* YouTube Content Creators: Discover trending topics and long-tail keywords to improve video SEO and reach a broader audience.
* Digital Marketers: Identify search terms to optimize ad campaigns or create targeted content strategies.
* SEO Professionals: Research user behavior and improve keyword targeting for better ranking on search engines.
* Academics & Researchers: Analyze search trends and user interests on YouTube for academic studies or reports.
* Businesses: Gain insights into customer preferences to tailor product or service-related content on YouTube.

## How to Use
1. Prepare Your Keywords File:
   * Create a CSV file with a single column named Keywords.
   * Populate this column with the seed keywords you want to analyze.
2. Upload the File:
   * Run the notebook cell containing files.upload() to upload your CSV file.
3. Run the Analysis:
   * The script will process the uploaded file, extract autocomplete suggestions for each keyword, and store the results in a DataFrame.
4. Download the Results:
   * The suggestions are saved in a CSV file named youtube_autosuggestions.csv, which can be downloaded for further use.
5. Explore the Data:
   * Review the suggestions to identify trends, optimize your content, or use them in your marketing strategies.


In [None]:

import requests
import pandas as pd
import json
import time
from google.colab import files

def get_youtube_suggestions(keyword):
    """
    Fetch YouTube autocomplete suggestions for a given keyword.

    Parameters:
    keyword (str): The search query string.

    Returns:
    list of tuples: Each tuple contains the seed keyword and its suggestion.
    """
    suggestions = []
    try:
        url = "https://suggestqueries.google.com/complete/search"
        params = {
            'client': 'youtube',
            'ds': 'yt',
            'q': keyword,
            'hl': 'en'
        }
        response = requests.get(url, params=params)
        response.raise_for_status()
        raw_data = response.text

        # Extract JSON-like content from the JavaScript response
        start = raw_data.find('[')
        end = raw_data.rfind(']') + 1
        json_data = json.loads(raw_data[start:end])

        # Process suggestions
        seed_keyword = json_data[0]
        for item in json_data[1]:
            suggestions.append((seed_keyword, item[0]))
    except Exception as e:
        print(f"Error fetching suggestions for '{keyword}': {e}")

    return suggestions

def process_keywords(file_path):
    """
    Process a list of keywords from an uploaded file and fetch YouTube suggestions.

    Parameters:
    file_path (str): Path to the uploaded CSV file containing a 'Keywords' column.

    Returns:
    DataFrame: A DataFrame containing the seed keywords and their suggestions.
    """
    df = pd.read_csv(file_path)
    print("Uploaded file columns:", df.columns)  # Debug: Print column names

    if 'Keywords' not in df.columns:
        raise ValueError("The uploaded file must contain a 'Keywords' column.")

    all_suggestions = []

    for keyword in df['Keywords'].dropna():
        suggestions = get_youtube_suggestions(keyword)
        all_suggestions.extend(suggestions)
        time.sleep(0.5)  # To prevent hitting rate limits

    result_df = pd.DataFrame(all_suggestions, columns=['Seed Keyword', 'Suggestion'])
    return result_df

# Step 1: Upload the keywords file
uploaded = files.upload()
file_path = next(iter(uploaded.keys()))

# Step 2: Process the keywords and fetch suggestions
try:
    suggestions_df = process_keywords(file_path)

    # Step 3: Save and download the results
    output_file = "youtube_autosuggestions.csv"
    suggestions_df.to_csv(output_file, index=False)
    files.download(output_file)

    # Display first few rows of the DataFrame
    # (a bare expression inside a try block is not rendered by the notebook, so print it)
    print(suggestions_df.head())
except Exception as e:
    print(f"Error: {e}")


# Query Autocomplete via Places API

Google’s Query Autocomplete service helps users find location-based suggestions as they type. By using the Places API, you can implement autocomplete suggestions for geographical searches: for example, typing "pizza near par" can return suggestions like "pizza near Paris" or "pizza near Disneyland Park."

## Key Features:
* Provides query predictions for geographical searches.
* Returns suggestions in real-time based on partial input.
* Can include geographic details, like addresses and locations.
* Supports language customization for better local results.

## How It Works:
A Query Autocomplete request is made via a URL, where users provide an input (search term) and, optionally, other parameters like language, location, and radius. The API returns location-based suggestions based on user input.

You can specify additional parameters, such as:

* language: Set the desired language for results.
* location: Bias results to a certain location.
* radius: Define the distance within which to return results.
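
To make the request format concrete, here is a small sketch that assembles such a URL with the standard library. The helper name `build_query_autocomplete_url` is hypothetical, `YOUR_API_KEY` is a placeholder, and the coordinates are example values; the endpoint and parameter names are the ones used by the script in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/place/queryautocomplete/json"

def build_query_autocomplete_url(text, api_key, language=None, location=None, radius=None):
    """Build a Query Autocomplete request URL (illustrative helper)."""
    params = {"input": text, "key": api_key}
    if language:
        params["language"] = language
    if location:
        # location is a (latitude, longitude) pair
        params["location"] = f"{location[0]},{location[1]}"
    if radius:
        params["radius"] = radius
    # urlencode escapes spaces and commas so the URL is safe to send
    return f"{BASE_URL}?{urlencode(params)}"

url = build_query_autocomplete_url(
    "pizza near par",
    "YOUR_API_KEY",
    language="en",
    location=(48.8566, 2.3522),
    radius=50000,
)
print(url)
```

In practice, passing the same dictionary to `requests.get(url, params=...)`, as the script does, performs the equivalent encoding.
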

## Key Considerations:

* API Key: You must enable the API in a Google Cloud project with enabled billing beforehand to get your API key.
* Quota Limits: The Query Autocomplete API shares quotas with other Google Maps services. Make sure you're within the API limits.
* Response Structure: The API response includes a list of predictions, each containing a human-readable description of the suggestion, matched substrings, and structured formatting for easy display.

## Example Response:
For an input like pizza near par, you might receive suggestions like:

* pizza near Paris, France
* pizza near Paris Beauvais Airport, France
* pizza near Disneyland Park, California, USA
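
The following sketch shows how a response of this kind could be read. The dictionary is hand-written for illustration and is not real API output; it assumes that each prediction carries a `description` and a `matched_substrings` list with `offset` and `length` fields, so check the field names against the official documentation before relying on them.

```python
# Hand-written sample, for illustration only (not real API output)
sample_response = {
    "status": "OK",
    "predictions": [
        {
            "description": "pizza near Paris, France",
            "matched_substrings": [
                {"offset": 0, "length": 5},
                {"offset": 11, "length": 3},
            ],
        },
        {
            "description": "pizza near Disneyland Park, California, USA",
            "matched_substrings": [{"offset": 0, "length": 5}],
        },
    ],
}

def extract_descriptions(response):
    """Return the human-readable descriptions, or an empty list on a non-OK status."""
    if response.get("status") != "OK":
        return []
    return [p["description"] for p in response.get("predictions", [])]

def matched_text(prediction):
    """Return the parts of the description that matched the user's input."""
    description = prediction["description"]
    return [
        description[m["offset"]: m["offset"] + m["length"]]
        for m in prediction.get("matched_substrings", [])
    ]

descriptions = extract_descriptions(sample_response)
first_matches = matched_text(sample_response["predictions"][0])
print(descriptions)
print(first_matches)
```
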

## Why It’s Useful:
* Search Suggestions: Ideal for applications where users need to type locations or businesses. It helps them find places based on their input.
* Local Optimization: Helps users find locations near them or within a defined area.
* Global Reach: Supports multiple languages and location-based filtering, making it ideal for international use.

## Key Features:
* User Input: The user is prompted to enter their Google API key, which is required for the Places API. Additionally, the user can specify the language code, location (latitude and longitude), and search radius (in meters) for the query.

* File Upload: The user is able to upload a CSV file containing a column named "Keywords," which will be used to fetch autocomplete suggestions from the Google Places API.

* Query Autocomplete: The script fetches autocomplete suggestions from the Google Places API for each keyword in the "Keywords" column of the uploaded CSV file. This is done by sending a request to the API for each keyword.

* Downloadable Output: After fetching the suggestions, the results are stored in a new CSV file. The user can then download this file, which contains the original seed keywords along with their corresponding suggestions.

## Notes:
* The user must enter their actual Google API key when the script prompts for it.
* The code expects the uploaded CSV file to have a "Keywords" column containing search terms. You can also specify other optional parameters like location and radius to bias the suggestions based on location.
* A delay of 0.5 seconds between requests is included to avoid hitting the API rate limits.
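
Because the location is typed in as free text, it can help to validate it before any request is sent. The helper below is a hypothetical sketch, not part of the script: it accepts a "latitude,longitude" string, returns `None` for blank input, and raises `ValueError` for malformed or out-of-range values.

```python
def parse_location(text):
    """Parse 'latitude,longitude' into a tuple of floats, or None if blank."""
    text = text.strip()
    if not text:
        return None
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError("Expected the format 'latitude,longitude'")
    # float() tolerates surrounding whitespace and raises ValueError on non-numeric text
    lat, lng = float(parts[0]), float(parts[1])
    if not (-90 <= lat <= 90 and -180 <= lng <= 180):
        raise ValueError("Latitude or longitude out of range")
    return lat, lng

print(parse_location("48.8566, 2.3522"))
print(parse_location(""))
```
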



In [None]:
import requests
import pandas as pd
import json
import time
from google.colab import files

def get_place_autocomplete_suggestions(input_keyword, api_key, language='en', location=None, radius=50000):
    """
    Fetch Google Places Query Autocomplete suggestions for a given input keyword.

    Parameters:
    input_keyword (str): The input text string for autocomplete.
    api_key (str): The Google API key.
    language (str): The language for the query results (default is 'en').
    location (tuple): Latitude and longitude to bias the search (optional).
    radius (int): Search radius in meters (default is 50,000 meters).

    Returns:
    list of tuples: Each tuple contains the input keyword and a predicted place description.
    """
    suggestions = []

    try:
        url = "https://maps.googleapis.com/maps/api/place/queryautocomplete/json"
        params = {
            'input': input_keyword,
            'key': api_key,
            'language': language,
            'radius': radius
        }

        # Add location bias if provided
        if location:
            params['location'] = f"{location[0]},{location[1]}"

        response = requests.get(url, params=params)
        response.raise_for_status()
        data = response.json()

        # Process the suggestions from the response
        if data.get('status') == 'OK':
            for prediction in data.get('predictions', []):
                suggestions.append((input_keyword, prediction['description']))
        else:
            print(f"Error fetching suggestions for '{input_keyword}': {data.get('error_message', 'No suggestions found')}")

    except Exception as e:
        print(f"Error: {e}")

    return suggestions

def process_keywords(file_path, api_key, language='en', location=None, radius=50000):
    """
    Process a list of keywords from an uploaded file and fetch Google Places query suggestions.

    Parameters:
    file_path (str): Path to the uploaded CSV file containing a 'Keywords' column.
    api_key (str): The Google API key.
    language (str): The language for the query results (default is 'en').
    location (tuple): Latitude and longitude to bias the search (optional).
    radius (int): Search radius in meters (default is 50,000 meters).

    Returns:
    DataFrame: A DataFrame containing the seed keywords and their suggestions.
    """
    df = pd.read_csv(file_path)
    print("Uploaded file columns:", df.columns)  # Debug: Print column names

    if 'Keywords' not in df.columns:
        raise ValueError("The uploaded file must contain a 'Keywords' column.")

    all_suggestions = []

    # Fetch suggestions for each keyword in the 'Keywords' column
    for keyword in df['Keywords'].dropna():
        suggestions = get_place_autocomplete_suggestions(keyword, api_key, language, location, radius)
        all_suggestions.extend(suggestions)
        time.sleep(0.5)  # To prevent hitting rate limits

    result_df = pd.DataFrame(all_suggestions, columns=['Seed Keyword', 'Suggestion'])
    return result_df

# Step 1: Request user input for API key and parameters
api_key = input("Please enter your Google API key: ")
language = input("Enter language code (default is 'en'): ") or 'en'
location_input = input("Enter location (latitude,longitude) or press Enter to skip: ")
location = tuple(map(float, location_input.split(','))) if location_input else None
radius = int(input("Enter search radius in meters (default is 50000): ") or 50000)

# Step 2: Upload the keywords file
uploaded = files.upload()
file_path = next(iter(uploaded.keys()))

# Step 3: Process the keywords and fetch suggestions
try:
    suggestions_df = process_keywords(file_path, api_key, language, location, radius)

    # Step 4: Save and download the results
    output_file = "place_autosuggestions.csv"
    suggestions_df.to_csv(output_file, index=False)
    files.download(output_file)

    # Display first few rows of the DataFrame
    # (a bare expression inside a try block is not rendered by the notebook, so print it)
    print(suggestions_df.head())
except Exception as e:
    print(f"Error: {e}")


# Place Autocomplete via Places API

In [None]:
import requests
import pandas as pd
import json
import time
from google.colab import files

def get_place_autocomplete_suggestions(input_text, api_key, **kwargs):
    """
    Fetch Place Autocomplete suggestions for a given input text.

    Parameters:
    input_text (str): The input string for place autocomplete.
    api_key (str): Your Google Places API key.
    kwargs (dict): Optional parameters for the Place Autocomplete API such as:
                   - location
                   - radius
                   - language
                   - components
                   - types

    Returns:
    list of tuples: Each tuple contains the input text and the suggested place description.
    """
    suggestions = []
    try:
        url = "https://maps.googleapis.com/maps/api/place/autocomplete/json"
        params = {
            'input': input_text,
            'key': api_key,
            **kwargs
        }

        response = requests.get(url, params=params)
        response.raise_for_status()

        data = response.json()

        if data.get("status") == "OK":
            for prediction in data.get("predictions", []):
                suggestions.append((input_text, prediction.get("description")))
        else:
            print(f"Error: {data.get('status')} - {data.get('error_message', 'No error message')}")
    except Exception as e:
        print(f"Error fetching suggestions for '{input_text}': {e}")

    return suggestions

def process_autocomplete_keywords(file_path, api_key, **kwargs):
    """
    Process a list of input texts from an uploaded file and fetch Place Autocomplete suggestions.

    Parameters:
    file_path (str): Path to the uploaded CSV file containing a 'Keywords' column.
    api_key (str): Your Google Places API key.
    kwargs (dict): Optional parameters for Place Autocomplete API.

    Returns:
    DataFrame: A DataFrame containing the input texts and their suggestions.
    """
    df = pd.read_csv(file_path)
    print("Uploaded file columns:", df.columns)  # Debug: Print column names

    if 'Keywords' not in df.columns:
        raise ValueError("The uploaded file must contain a 'Keywords' column.")

    all_suggestions = []

    for keyword in df['Keywords'].dropna():
        suggestions = get_place_autocomplete_suggestions(keyword, api_key, **kwargs)
        all_suggestions.extend(suggestions)
        time.sleep(0.5)  # To prevent hitting rate limits

    result_df = pd.DataFrame(all_suggestions, columns=['Input Text', 'Suggestion'])
    return result_df

# Step 1: Upload the keywords file
uploaded = files.upload()
file_path = next(iter(uploaded.keys()))

# Step 2: Get API key and optional parameters from the user
api_key = input("Enter your Google Places API key: ")
location = input("Enter location as 'latitude,longitude' (or leave blank): ").strip()
radius = input("Enter radius in meters (or leave blank): ").strip()
language = input("Enter language code (e.g., 'en', 'fr') (or leave blank): ").strip()
components = input("Enter components (e.g., 'country:us') (or leave blank): ").strip()
types = input("Enter types (e.g., 'geocode') (or leave blank): ").strip()

# Collect optional parameters
optional_params = {k: v for k, v in {
    'location': location,
    'radius': radius,
    'language': language,
    'components': components,
    'types': types
}.items() if v}

# Step 3: Process the keywords and fetch autocomplete suggestions
try:
    suggestions_df = process_autocomplete_keywords(file_path, api_key, **optional_params)

    # Step 4: Save and download the results
    output_file = "place_autocomplete_suggestions.csv"
    suggestions_df.to_csv(output_file, index=False)
    files.download(output_file)

    # Display first few rows of the DataFrame
    # (a bare expression inside a try block is not rendered by the notebook, so print it)
    print(suggestions_df.head())
except Exception as e:
    print(f"Error: {e}")
