# Collect current car data from Google search via RapidAPI

This notebook collects recent web search results to augment a chatbot with more up-to-date information.
The basic steps are:

1. Connect to the Google search API via RapidAPI
2. Run as many queries as needed
3. Combine the results into a dataframe, then export to a .csv


## Imports

In [11]:
import json

import pandas as pd
import requests

## 1) Function to connect to Google search and return results

In [None]:
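credentials = {}
try:
    with open('credentials.json') as file:
        credentials = json.load(file)
except FileNotFoundError:
    print("Error: file credentials.json was not found.")

# Do not print the credentials or the key: cell output is saved with the notebook.
rapid_api_key = credentials.get('RapidAPIKey')
if not rapid_api_key:
    raise KeyError("'RapidAPIKey' is missing from credentials.json")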
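
The key could also be read from an environment variable when `credentials.json` is absent, and shown only in masked form when a sanity check is needed. The following is a minimal sketch under those assumptions: the helper names `load_rapid_api_key` and `mask`, and the environment variable name `RAPIDAPI_KEY`, are hypothetical and not part of the original notebook.

```python
import json
import os


def load_rapid_api_key(path="credentials.json", env_var="RAPIDAPI_KEY"):
    """Return the RapidAPI key from a JSON file, else from an environment variable."""
    try:
        with open(path) as file:
            key = json.load(file).get("RapidAPIKey")
    except FileNotFoundError:
        key = None
    # Fall back to the (hypothetical) environment variable if the file gave nothing.
    key = key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No RapidAPI key found in {path} or ${env_var}")
    return key


def mask(secret, visible=4):
    """Hide all but the last `visible` characters of a secret for safe display."""
    return "*" * max(len(secret) - visible, 0) + secret[-visible:]
```

Calling `print(mask(load_rapid_api_key()))` confirms that a key was loaded without writing the full value into the notebook output.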

In [21]:
# Use the Real-Time Web Search API from RapidAPI
# Source: https://rapidapi.com/letscrape-6bRBa3QguO5/api/real-time-web-search

url = "https://real-time-web-search.p.rapidapi.com/search"

headers = {
    "x-rapidapi-key": rapid_api_key,
    "x-rapidapi-host": "real-time-web-search.p.rapidapi.com"
}

def collect_data(query, limit):
    """Return a list of "title. snippet" strings for the given search query."""
    querystring = {"q": query, "limit": limit}

    res = requests.get(url, headers=headers, params=querystring, timeout=30)
    res.raise_for_status()  # fail loudly on HTTP errors such as a bad key or rate limit
    results = res.json()["data"]

    # Combine the 'title' and 'snippet' of each result into one string.
    # The loop variable has its own name so it does not shadow the result list.
    selected_elements_text = []
    for item in results:
        selected_elements_text.append(item["title"] + ". " + item["snippet"])

    return selected_elements_text
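
Search results do not always carry both fields, so the parsing step can be separated from the HTTP call and made tolerant of missing values. This is a sketch only, assuming the same response shape that `collect_data` reads (a top-level `"data"` list of objects with `"title"` and `"snippet"`); the function name `extract_texts` is hypothetical.

```python
def extract_texts(payload):
    """Build "title. snippet" strings from a decoded JSON response.

    Entries with a missing or empty field keep whatever text is available;
    entries with neither field are skipped.
    """
    texts = []
    for item in payload.get("data") or []:
        title = (item.get("title") or "").strip()
        snippet = (item.get("snippet") or "").strip()
        parts = [part for part in (title, snippet) if part]
        if parts:
            texts.append(". ".join(parts))
    return texts
```

Because it takes a plain dictionary, this step can be checked with a hand-written payload and needs no API key.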


## 2) Run the queries and collect the results

In [16]:
# Search Google for the best-selling car in the world in 2023, restricted to
# pages published after 2023. Limit to 50 results.
query = "after:2023 what is the best selling car in the world in 2023"
best_selling_car_data = collect_data(query, 50)

# Search Google for the car with the lowest total cost of ownership in 2023,
# restricted to pages published after 2023. Limit to 50 results.
query = "after:2023 what was the car with the lowest total cost of ownership in 2023"
lowest_total_cost_car_data = collect_data(query, 50)
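
When the number of queries grows, keeping them in a dictionary avoids one variable per query. A minimal sketch, assuming a `collect` callable with the same `(query, limit)` signature as `collect_data`; `run_queries`, the labels, and the stub `fake_collect` are hypothetical names used so the example runs without network access.

```python
def run_queries(queries, collect, limit=50):
    """Run every query through `collect` and return {label: list of texts}."""
    return {label: collect(query, limit) for label, query in queries.items()}


# Query strings taken from the cell above; the labels are illustrative.
queries = {
    "best_selling": "after:2023 what is the best selling car in the world in 2023",
    "lowest_tco": "after:2023 what was the car with the lowest total cost of ownership in 2023",
}


def fake_collect(query, limit):
    """Stand-in for collect_data that returns one placeholder string per query."""
    return [f"stub result for: {query}"][:limit]


results = run_queries(queries, fake_collect)

# Flatten the per-query lists into one list, as the next section does with `+`.
all_texts = [text for texts in results.values() for text in texts]
print(len(all_texts))
```

In the notebook itself, `run_queries(queries, collect_data)` would replace the stub.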


## 3) Store all the results in one DataFrame and export to a .csv

In [17]:
print(len(best_selling_car_data))
print(len(lowest_total_cost_car_data))

45
49


In [18]:
# Combine the collected data above into one DataFrame
all_data = best_selling_car_data + lowest_total_cost_car_data

print(len(lowest_total_cost_car_data))
df = pd.DataFrame(all_data)

# Preview the rows in shuffled order with a fresh index.
# The result is not assigned, so df itself keeps its original order.
df.sample(frac=1).reset_index(drop=True)


49


Unnamed: 0,0
0,"NEW: EVs cost less to own than the most popular gas- .... Compact Sedans: Owning a Chevrolet Bolt EUV costs 25 percent less than a Toyota Corolla LE, translating to over $10,000 in savings over seven ..."
1,"Cost of Owning a Car for a Year in Every State. However, its average car insurance premium of $2,546 is the second highest after Florida. Maine. Total costs for one year of ownership: $25,312 ..."
2,"10 2024 Model EVs With The Lowest Cost Of Ownership .... Total Ownership Costs Over 5 Years: $46,916 ... The 2024 Nissan Leaf S has an attractive price, but sitting in first place isn't always the best."
3,"5 Best Low-Cost-to-Own Vehicles | Save Money Without .... With a starting price of around $27,000 and one of the best warranties in the industry, it's a standout in the low-cost ownership category. For ..."
4,World's Best-Selling Car Is the Tesla Model Y. For the first time in history the most popular new car across the planet is electric. Tesla sold more than 1.2 million Model Ys in 2023 to ...
...,...
89,"The True Cost of Owning a Car in California. Let's delve into the various costs that go into owning a car. Average Car Loan Interest Rates in California. As of the third quarter of 2023, ..."
90,"The Cheapest New Cars of 2024. 1. 2024 Nissan Versa: $16,680 · 2. 2024 Mitsubishi Mirage: $16,695 · 3. 2024 Hyundai Venue: $19,900 · 4. 2024 Kia Forte: $19,990 · 5. 2024 Kia Soul: ..."
91,Total cost of car ownership by state in 2024. Alaska is the cheapest state for car ownership due to low insurance rates and a very low (zero in some areas) sales tax. California replaced ...
92,"Was The Toyota Corolla Or The Tesla Model Y The World's .... Last week, we presented sales data that initially indicated the Tesla Model Y as the world's best-selling vehicle in 2023."
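
Related queries may return the same page more than once, so it can be worth removing duplicates before building the DataFrame. The sketch below assumes exact-match deduplication after whitespace normalization is sufficient; `dedupe_texts` is a hypothetical helper and the sample strings are made up for illustration.

```python
def dedupe_texts(texts):
    """Drop empty strings and exact duplicates while keeping first-seen order."""
    # Collapse runs of whitespace so trivially different copies compare equal.
    cleaned = (" ".join(text.split()) for text in texts)
    # dict.fromkeys preserves insertion order and keeps the first occurrence.
    return list(dict.fromkeys(text for text in cleaned if text))


# Illustrative input, not real search results.
sample = ["Model Y tops sales.", "Model Y  tops sales.", "", "Corolla still close."]
unique = dedupe_texts(sample)
print(unique)
```

Passing `dedupe_texts(all_data)` to `pd.DataFrame` would then yield one row per distinct text.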


In [19]:
# Check data
# Use None (not -1) to disable column-width truncation; -1 is deprecated in pandas.
pd.set_option('display.max_colwidth', None)

print(df.iloc[77])
print(df.iloc[90])


0    The Cheapest New Cars of 2024. 1. 2024 Nissan Versa: $16,680 · 2. 2024 Mitsubishi Mirage: $16,695 · 3. 2024 Hyundai Venue: $19,900 · 4. 2024 Kia Forte: $19,990 · 5. 2024 Kia Soul: ...
Name: 77, dtype: object
0    Discounts on year-old vehicles have more than doubled. Used vehicle prices fell more steeply than nearly any other category in April's consumer price index, falling 6.9% from April 2023 to last month ...
Name: 90, dtype: object



In [20]:
# Export to a csv
df.to_csv('recent_car_data.csv', header=["text"], index=False)
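
After exporting, reading the file back is a quick way to confirm that the `text` header and the quoted commas survived. A minimal sketch using the standard library `csv` module, assuming the file has a single `text` column as written above; `read_texts` is a hypothetical helper.

```python
import csv


def read_texts(path):
    """Read the 'text' column back from a CSV file that has a 'text' header."""
    # newline="" lets the csv module handle quoted fields that contain line breaks.
    with open(path, newline="", encoding="utf-8") as file:
        return [row["text"] for row in csv.DictReader(file)]
```

For example, `len(read_texts('recent_car_data.csv'))` should match `len(df)` if the export worked as intended.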