## Scraping a Job Listings Site's API Using Requests
**Target site:** techinasia.com

**Disclaimer:** This scraping is for learning and portfolio purposes only. None of the data is for sale.

### Library Initialization
**requests** to send a request to the API URL.

**pandas** to store the data in a DataFrame and then export it to .json/.csv.

**dotenv** to read environment variables.

In [38]:
import requests
import pandas as pd
import os
from dotenv import load_dotenv
load_dotenv()

True

Initialize the ``url``, ``payload`` and ``headers`` variables. I have deliberately left out the full URL and payload.

In [39]:
url = os.getenv("TECHINASIA_URL")

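# The payload is deliberately truncated; single quotes on the outside keep the inner double quotes valid Python.
payload = '{"requests": [{"indexName": "job_postings","params": "query=data&hitsPerPage=1000&maxValuesPerFacet=1000&page=0&facets=%5B%22*%22%2C%22city]}'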

headers = {
    'Connection': 'keep-alive',
    'accept': 'application/json',
    'content-type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Mobile Safari/537.36',
    'sec-ch-ua-platform': "Android",
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'id-ID,id;q=0.9,en-US;q=0.8,en;q=0.7'
}

The payload above contains the following parameters:

``query=`` - *The search keywords for the job. Leave it blank to retrieve all data.*

``hitsPerPage=1000&maxValuesPerFacet=1000`` - *These control how much data is returned per request.*

In this exploration I use the keyword "data" and request 1000 records at once. Because there is no IP blocking, a single large request is simpler than going back and forth with many small ones, and it returns everything that is available. The result is around 300 job vacancies.
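The way these parameters fit into the request body can be sketched as follows. This is a minimal illustration, not the exact payload used here: `build_payload` is a hypothetical helper, and the `facets` parameter is left out because the original payload is intentionally truncated.

```python
import json
from urllib.parse import urlencode


def build_payload(query="", hits_per_page=1000, page=0):
    """Return a JSON request body for one search (hypothetical helper)."""
    # Only the parameters visible in the payload above are set here;
    # facets is omitted because the original value is truncated.
    params = urlencode({
        "query": query,
        "hitsPerPage": hits_per_page,
        "maxValuesPerFacet": hits_per_page,
        "page": page,
    })
    body = {"requests": [{"indexName": "job_postings", "params": params}]}
    return json.dumps(body)


payload_sketch = build_payload(query="data")
```

Calling `build_payload()` with no arguments leaves `query=` empty, which matches the "leave it blank" case described above.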

In [40]:
response = requests.request("POST", url, headers=headers, data=payload)
data = response.json()
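Before reading anything out of ``data``, it can help to guard against an empty or unexpected response. The sketch below assumes the response shape used later in this notebook (``data['results'][0]['hits']``). `extract_hits` is a hypothetical helper, and the sample dictionaries in the usage line are made up.

```python
def extract_hits(data):
    """Return the hits of the first result set, or [] if they are absent."""
    results = data.get("results") or []
    if not results:
        return []
    return results[0].get("hits") or []


# Made-up response for illustration only.
hits = extract_hits({"results": [{"hits": [{"title": "Data Analyst"}]}]})
```

Calling ``response.raise_for_status()`` before ``response.json()`` is another common check, since it raises an exception on HTTP error codes.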

Initialize the ``jobs`` list to hold the values extracted from the response.

In [41]:
jobs = []

The response is JSON, so a loop goes through every hit and appends a record to the ``jobs`` list.

Here I will only take the job title, job type, company name, company industry, company location, salary, experience, and skills.

In [42]:
for item in data['results'][0]['hits']:
    skills = [i['name'] for i in item['job_skills']]

    job = {
        'Title': item['title'],
        'Job Type': item['job_type']['name'],
        'Company Name': item['company']['name'],
        'Company Industry': item['industries'][0]['name'],
        'Company City': item['city']['name'],
        'Company Country': item['city']['country_name'],
        'Salary Avg': item['salary_avg'],
        'Salary Min': item['salary_min'],
        'Salary Max': item['salary_max'],
        'Experience': item['experience'],
        'Min Experience': item['experience_min'],
        'Max Experience': item['experience_max'],
        'Skills': skills,
        'Published': item['published_at']
    }

    jobs.append(job)
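The loop above assumes every hit has all of these fields. If a hit had an empty ``industries`` list or a missing ``city``, the direct indexing would raise an error. A more tolerant variant might look like the sketch below, which covers only a few of the fields. `first_name` and `parse_job` are hypothetical helpers, and the sample hit is made up.

```python
def first_name(items):
    """Return the 'name' of the first entry in a list, or None if empty."""
    return items[0].get("name") if items else None


def parse_job(item):
    """Build a flat record from one hit, tolerating missing fields."""
    city = item.get("city") or {}
    return {
        "Title": item.get("title"),
        "Company Industry": first_name(item.get("industries")),
        "Company City": city.get("name"),
        "Company Country": city.get("country_name"),
        "Skills": [s.get("name") for s in item.get("job_skills") or []],
    }


# Made-up hit for illustration; real hits carry more fields.
sample = {"title": "Data Analyst", "industries": [], "city": None,
          "job_skills": [{"name": "SQL"}]}
record = parse_job(sample)
```

Missing values come back as ``None``, which pandas stores as empty cells rather than stopping the loop.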

Create a DataFrame from ``jobs`` and export it to .csv

In [43]:
df = pd.DataFrame(jobs)
df.to_csv('job_lists.csv')
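Because ``Skills`` is a Python list, it is written to the CSV as the list's string form, such as ``['Python', 'SQL']``. One option is to join each list into a plain string before building the DataFrame. The sketch below uses a hypothetical helper, `flatten_skills`, and made-up sample records.

```python
def flatten_skills(jobs, separator=", "):
    """Return copies of the job records with 'Skills' joined into one string."""
    return [{**job, "Skills": separator.join(job["Skills"])} for job in jobs]


# Made-up records for illustration only.
sample_jobs = [{"Title": "Data Engineer", "Skills": ["Python", "SQL"]}]
flat = flatten_skills(sample_jobs)
```

The result can be passed to ``pd.DataFrame`` in place of ``jobs``. Passing ``index=False`` to ``to_csv`` additionally leaves the row index out of the file.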