<a href="https://colab.research.google.com/github/A1exanderBates/IndeedAPI_to_CloudStorage_Pipeline/blob/main/Pushing_RapidAPI_to_GCP_Guide.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Guide to Using RapidAPI with Google Cloud

Step 1. Import Libraries

In [1]:
import requests
import pandas as pd
import json
from google.cloud import storage

Step 2. Copy/Paste the RapidAPI code

In [2]:
url = "https://indeed11.p.rapidapi.com/"


payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}",  # insert your RapidAPI key here
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

# Send the search as a POST request with the payload as the JSON body
response = requests.post(url, json=payload, headers=headers)
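
Hard-coding the key in the notebook makes it easy to leak. Here is a minimal sketch of building the same headers without embedding the key, assuming it is stored in an environment variable named `RAPIDAPI_KEY` (a hypothetical name chosen for this example, not something RapidAPI requires):

```python
import os

def build_headers(env_var="RAPIDAPI_KEY"):
    """Build the RapidAPI headers, reading the key from an environment variable.

    RAPIDAPI_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {
        "content-type": "application/json",
        "X-RapidAPI-Key": api_key,
        "X-RapidAPI-Host": "indeed11.p.rapidapi.com",
    }
```

The result can be passed as `headers=build_headers()` in the request above. In Cloud Functions, environment variables can be set when the function is deployed.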

Step 3. Parse the JSON response and assign it to a variable

In [3]:
raw_json_str = response.json()  # parsed Python object (dict or list), not a string despite the name

Step 4. Create a JSON file and store in temporary directory

In [4]:
# Creating a function that writes the parsed JSON above to a .json file
def writeToJSONFile(path, fileName, data):
    filePathNameWExt = path + '/' + fileName + '.json'
    with open(filePathNameWExt, 'w') as fp:
        json.dump(data, fp)

path = '/tmp'
fileName = 'Indeed_Data_JSON' # Can be renamed
data = raw_json_str

# Executing the function
writeToJSONFile(path, fileName, data)
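
To confirm that the file was written correctly, it can be read back and compared with the original object. This is a minimal sketch that uses a small stand-in record (made-up data, since the real API response is not shown here) and a temporary directory rather than `/tmp`. The helper name `write_json_file` is hypothetical:

```python
import json
import os
import tempfile

def write_json_file(directory, file_name, data):
    """Write `data` to <directory>/<file_name>.json and return the full path."""
    full_path = os.path.join(directory, file_name + ".json")
    with open(full_path, "w") as fp:
        json.dump(data, fp)
    return full_path

# Stand-in data for illustration only; not actual API output.
sample = [{"title": "Data Analyst", "location": "New York City, NY"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = write_json_file(tmp_dir, "Indeed_Data_JSON", sample)
    # Read the file back while the temporary directory still exists.
    with open(saved_path) as fp:
        round_trip = json.load(fp)

print(round_trip == sample)  # True when the file holds the same data
```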

Step 5. Convert to CSV

Note: The cell below will raise an error because no API key was included above. I left it out on purpose to avoid using up my monthly API request quota.

In [None]:
# Converting the JSON file to CSV format
dataframe = pd.read_json('/tmp/Indeed_Data_JSON.json')
dataframe.to_csv('/tmp/Indeed_Data_CSV.csv')
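
If the response turns out to be a list of flat records (an assumption, since the actual response shape is not shown here), the same conversion can also be sketched with the standard-library `csv` module instead of pandas. The helper name `records_to_csv` and the sample records are made up for illustration:

```python
import csv
import io

def records_to_csv(records):
    """Serialize a list of flat dicts to CSV text, one row per record."""
    # Collect every key that appears, preserving first-seen order.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return buffer.getvalue()

# Stand-in records for illustration only; not actual API output.
sample = [
    {"title": "Data Analyst", "location": "New York City, NY"},
    {"title": "BI Developer", "company": "Example Co"},
]
csv_text = records_to_csv(sample)
print(csv_text)
```

Unlike the pandas version, this writes no index column, and values containing commas are quoted automatically.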

Step 6. Push CSV file to your Cloud Storage Bucket

In [None]:
# Creating a function to push CSV file to Cloud Storage Bucket
def push_to_gcs(file, bucket):
    file_name = file.split('/')[-1]
    print(f"Pushing {file_name} to GCS...")
    blob = bucket.blob(file_name)
    blob.upload_from_filename(file)
    print(f"File pushed to {blob.id} successfully.")

file_name = 'Indeed_Data_CSV.csv' # This is your csv file created in step 5
file_path = '/tmp/' + file_name

# Move csv file to Cloud Storage
storage_client = storage.Client()
bucket_name = 'indeed_api_bucket' # This is your cloud storage bucket
bucket = storage_client.get_bucket(bucket_name)
push_to_gcs(file_path, bucket)
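
The upload logic can be tried out without GCP credentials by passing in a stand-in bucket. In this minimal sketch, `FakeBlob` and `FakeBucket` are hypothetical classes that only mimic the two methods the function calls. The function is repeated here, with a `return` added so the result can be inspected:

```python
class FakeBlob:
    """Stand-in for a Cloud Storage blob; records the upload instead of sending it."""
    def __init__(self, name, bucket_name):
        self.name = name
        self.id = f"{bucket_name}/{name}"  # simplified id, for illustration only
        self.uploaded_from = None

    def upload_from_filename(self, filename):
        self.uploaded_from = filename

class FakeBucket:
    """Stand-in for a bucket; keeps the blobs it creates in a dict."""
    def __init__(self, name):
        self.name = name
        self.blobs = {}

    def blob(self, blob_name):
        self.blobs[blob_name] = FakeBlob(blob_name, self.name)
        return self.blobs[blob_name]

def push_to_gcs(file, bucket):
    # Same logic as the function above, repeated so this sketch runs on its own.
    file_name = file.split('/')[-1]
    blob = bucket.blob(file_name)
    blob.upload_from_filename(file)
    return blob  # added for this sketch so the result can be checked

fake_bucket = FakeBucket("indeed_api_bucket")
blob = push_to_gcs("/tmp/Indeed_Data_CSV.csv", fake_bucket)
print(blob.name)           # Indeed_Data_CSV.csv
print(blob.uploaded_from)  # /tmp/Indeed_Data_CSV.csv
```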
    
    

Step 7. Add your dependencies to the requirements.txt file

In [None]:
google-cloud-storage
requests>=2.20.0
pandas>=1.4.3

**Putting it all together:** Copy and paste all of the code from steps 1-6 into the main.py tab of your Google Cloud Function. Set the **Entry Point** to push_to_gcs (without quotes). Lastly, add the dependencies from step 7 to the requirements.txt file.
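
One caveat, assuming the function uses an HTTP trigger: the platform calls the entry point with a single `request` argument, which does not match the two-argument `push_to_gcs(file, bucket)` signature. A possible workaround is to wrap the steps in a one-argument function and use that as the entry point. In this sketch, `main` and `run_pipeline` are hypothetical names, and the body of `run_pipeline` is only a placeholder for the real steps:

```python
def run_pipeline():
    """Placeholder for steps 2-6: request, JSON file, CSV file, upload.

    Hypothetical helper for this sketch; returns the uploaded file's name.
    """
    return "Indeed_Data_CSV.csv"

def main(request):
    """Hypothetical HTTP entry point; `request` is supplied by the platform."""
    uploaded_name = run_pipeline()
    # Returning a (body, status) pair tells the caller the run finished.
    return f"Uploaded {uploaded_name}", 200
```

With this layout, the Entry Point field would be set to main rather than push_to_gcs.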

**Success!** If you run into errors when testing or deploying the function, check the logs to find the cause. The traceback should show the line where the error occurred.