# **Collecting Job Data Using APIs**


#### Instructions


To run the lab, first open the [Jobs_API](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/labs/module%201/Accessing%20Data%20Using%20APIs/Jobs_API.ipynb) notebook. It contains the Flask code that serves the Jobs API data.


## Dataset Used in this Assignment

The dataset used in this lab comes from the following source: https://www.kaggle.com/promptcloud/jobs-on-naukricom under a **Public Domain license**.

> Note: This lab uses a modified subset of that dataset.

The original dataset is a CSV file. I have converted it to JSON so that the API can serve it.


In [None]:
import requests # need this module to make an API call
import pandas as pd

In [None]:
api_url = "http://api.open-notify.org/astros.json" # this url gives us the astronaut data

In [None]:
response = requests.get(api_url) # Call the API using the get method and store the
                                # output of the API call in a variable called response.

In [None]:
if response.ok:             # True when the HTTP status code is below 400
    data = response.json()  # parse the JSON body and store it in a variable called data
                            # the variable data is of type dictionary.

In [None]:
print(data)   # print the data just to check the output or for debugging

{'message': 'success', 'people': [{'name': 'Jasmin Moghbeli', 'craft': 'ISS'}, {'name': 'Andreas Mogensen', 'craft': 'ISS'}, {'name': 'Satoshi Furukawa', 'craft': 'ISS'}, {'name': 'Konstantin Borisov', 'craft': 'ISS'}, {'name': 'Oleg Kononenko', 'craft': 'ISS'}, {'name': 'Nikolai Chub', 'craft': 'ISS'}, {'name': "Loral O'Hara", 'craft': 'ISS'}], 'number': 7}


The number of astronauts currently on ISS.

In [None]:
print(data.get('number'))

7


The names of the astronauts currently on ISS.


In [None]:
astronauts = data.get('people')
print("There are {} astronauts on ISS".format(len(astronauts)))
print("And their names are :")
for astronaut in astronauts:
    print(astronaut.get('name'))

There are 7 astronauts on ISS
And their names are :
Jasmin Moghbeli
Andreas Mogensen
Satoshi Furukawa
Konstantin Borisov
Oleg Kononenko
Nikolai Chub
Loral O'Hara
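
The same dictionary can be summarised in other ways. As a small sketch, the snippet below counts the astronauts per craft. `sample` is a hand-copied excerpt of the `people` list printed above, standing in for a live API call, and `count_by_craft` is a hypothetical helper name.

```python
from collections import Counter

# Hand-copied excerpt of the 'people' list shown above.
# A live call may return different names.
sample = {
    'people': [
        {'name': 'Jasmin Moghbeli', 'craft': 'ISS'},
        {'name': 'Andreas Mogensen', 'craft': 'ISS'},
        {'name': "Loral O'Hara", 'craft': 'ISS'},
    ],
}

def count_by_craft(data):
    """Count how many people are aboard each craft (hypothetical helper)."""
    people = data.get('people', [])
    return Counter(person.get('craft', 'unknown') for person in people)

craft_counts = count_by_craft(sample)
for craft, count in craft_counts.items():
    print("{}: {}".format(craft, count))
```

Using `.get` with a default keeps the helper from failing if the response has no `people` key.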


Hope the warmup was helpful. Good luck with your next lab!


## Collecting Jobs Data using Jobs API


### Objective: Determining the number of jobs currently open for various technologies and for various locations


Collecting the number of job postings for the following locations using the API:

* Los Angeles
* New York
* San Francisco
* Washington DC
* Seattle
* Austin
* Detroit


In [None]:
# Required libraries
import requests  # used by the functions below to call the Jobs API
import pandas as pd
import json

In [None]:
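api_url = "http://127.0.0.1:5000/data/all"  # local Jobs API endpoint

def get_number_of_jobs_T(technology):
    """Return (technology, number of postings whose 'Key Skills' mention it)."""
    response_api = requests.get(api_url)

    number_of_jobs = 0
    jobs = []  # stays empty if the request does not succeed

    if response_api.ok:
        jobs = response_api.json()

    for job in jobs:
        key = job.get('Key Skills') or ''  # guard against a missing field

        # Substring match, so 'Java' also counts postings that list 'JavaScript'
        if technology in key:
            number_of_jobs = number_of_jobs + 1

    return technology, number_of_jobs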

Calling the function for Python and checking if it works.


In [None]:
get_number_of_jobs_T("Python")

ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /data/all (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7d75573ce260>: Failed to establish a new connection: [Errno 111] Connection refused'))
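
The `ConnectionError` above means nothing was listening on `127.0.0.1:5000` when the cell ran, which is what happens if the Flask server from the Jobs_API notebook has not been started. Below is a minimal sketch, assuming you would rather get an empty list than a traceback in that situation. `fetch_jobs` and its `timeout` default are hypothetical choices, not part of the original lab.

```python
import requests

def fetch_jobs(url, timeout=5):
    """Return the decoded JSON from url, or [] if the request fails (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx status codes into exceptions
        return response.json()
    except requests.exceptions.RequestException as error:
        # Covers refused connections, timeouts and bad status codes
        print("Could not reach the Jobs API:", error)
        return []

jobs = fetch_jobs("http://127.0.0.1:5000/data/all")
print(len(jobs), "job postings loaded")
```

If the message appears, start the Flask server and run the cell again.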

#### Function to find number of jobs in US for a location of your choice


In [None]:
def get_number_of_jobs_L(location):
    """Return (location, number of postings whose 'Location' mentions it)."""
    response_api = requests.get(api_url)

    number_of_jobs = 0
    jobs = []  # stays empty if the request does not succeed

    if response_api.ok:
        jobs = response_api.json()

    for job in jobs:
        loc = job.get('Location') or ''  # guard against a missing field

        if location in loc:
            number_of_jobs = number_of_jobs + 1

    return location, number_of_jobs

Calling the function for Los Angeles and checking that it works.




In [None]:
get_number_of_jobs_L("Los Angeles")

### Storing the results in an Excel file


Calling the API for each of the locations listed above and writing the results to an Excel spreadsheet.


In [None]:
locations = ['Los Angeles','New York','San Francisco','Washington DC','Seattle','Austin','Detroit']


Importing the library required to create an Excel spreadsheet.


In [None]:
from openpyxl import Workbook

Creating a workbook and selecting the active worksheet.


In [None]:
wb1 = Workbook()
ws1 = wb1.active

Finding the number of job postings for each location in the list above.
Writing the location name and the number of job postings into the Excel spreadsheet.


In [None]:
ws1.append(['Location','Number of Jobs'])

for location in locations:
    ws1.append(get_number_of_jobs_L(location))
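
Each call to `get_number_of_jobs_L` downloads the full dataset again, so the loop above makes one request per location. As a sketch of an alternative, assuming the postings are already in memory as a list of dictionaries with a 'Location' field, all locations can be counted in one pass. `count_jobs_by_location` and `sample_jobs` are hypothetical, and the sample records are made up for illustration rather than taken from the dataset.

```python
def count_jobs_by_location(jobs, locations):
    """Return [(location, count), ...] using the same substring test as get_number_of_jobs_L."""
    counts = {location: 0 for location in locations}
    for job in jobs:
        loc = job.get('Location') or ''  # treat a missing field as an empty string
        for location in locations:
            if location in loc:
                counts[location] += 1
    return [(location, counts[location]) for location in locations]

# Made-up records, for illustration only
sample_jobs = [
    {'Location': 'Los Angeles'},
    {'Location': 'New York'},
    {'Location': 'Los Angeles'},
    {},  # record with no 'Location' field
]

for row in count_jobs_by_location(sample_jobs, ['Los Angeles', 'New York', 'Seattle']):
    print(row)
```

Each returned tuple has the same shape as the return value of `get_number_of_jobs_L`, so it can be passed to `ws1.append` in the same way.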

Saving into an Excel spreadsheet named '2.a-job-postings (Collected from API).xlsx'.


In [None]:
wb1.save('2.a-job-postings (Collected from API).xlsx')
wb1.close()

#### In a similar way, we can collect the counts for the technologies listed below and store the results in an Excel sheet.


Collecting the number of job postings for the following languages using the API:

*   C
*   C#
*   C++
*   Java
*   JavaScript
*   Python
*   Scala
*   Oracle
*   SQL Server
*   MySQL Server
*   PostgreSQL
*   MongoDB


In [None]:
wb2 = Workbook()
ws2 = wb2.active

languages = ['C','C#','C++','Java','JavaScript','Python','Scala','Oracle','SQL Server','MySQL Server','PostgreSQL','MongoDB']

ws2.append(['Languages','Number of Jobs'])

for language in languages:
    ws2.append(get_number_of_jobs_T(language))

wb2.save('2.a-job-postings-languages (Collected from API).xlsx')
wb2.close()
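
One caveat with these counts: `get_number_of_jobs_T` uses a substring test, so 'C' matches any skills string containing a capital C, and 'Java' also matches 'JavaScript'. A possible refinement is to split the skills string into separate entries and compare whole entries. The sketch below assumes the skills are separated by '|' characters, which is an assumption about the data format and should be checked against the actual API output. It also compares case-insensitively, unlike the original function. `count_exact_skill` and the sample records are hypothetical.

```python
def count_exact_skill(jobs, technology, separator='|'):
    """Count postings whose skill list contains technology as a whole entry."""
    target = technology.strip().lower()
    count = 0
    for job in jobs:
        skills = job.get('Key Skills') or ''
        entries = [entry.strip().lower() for entry in skills.split(separator)]
        if target in entries:
            count += 1
    return count

# Made-up records, for illustration only
sample_jobs = [
    {'Key Skills': 'Java| Spring| SQL'},
    {'Key Skills': 'JavaScript| HTML| CSS'},
    {'Key Skills': 'C++| Python'},
]

substring_java = sum('Java' in job['Key Skills'] for job in sample_jobs)
print(substring_java)                          # 2: 'JavaScript' is counted too
print(count_exact_skill(sample_jobs, 'Java'))  # 1: only the whole entry 'Java'
print(count_exact_skill(sample_jobs, 'C'))     # 0: 'C++' and 'CSS' are not 'C'
```

Whole-entry matching has its own trade-off: an entry such as 'Core Java' would no longer count towards 'Java', so which test fits better depends on how the skills are written in the data.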