
To obtain the `api_key` and `access_token` required by the Upwork API, follow these steps:

### 1. **Apply for API Access**:
   - Go to the Upwork API [application page](https://www.upwork.com/developer/keys/apply).
   - Fill out the application form with details about your project, including how you plan to use the API.
   - Submit the form and wait for approval from Upwork.

### 2. **Obtain Your API Key**:
   - Once your application is approved, you'll receive an API key from Upwork.
   - This key is usually provided on the developer dashboard under your API applications. Make sure to store it securely.

### 3. **Generate an Access Token**:
   - Upwork’s API typically uses OAuth 2.0 for authorization.
   - You will need to implement an OAuth flow to get an `access_token`. This involves:
     - Redirecting users to Upwork's authorization server to log in and grant permissions to your app.
     - Handling the authorization callback to capture the authorization code.
     - Exchanging the authorization code for an `access_token` using a POST request to Upwork’s token endpoint.
  
### Example of OAuth Flow (Simplified):
   
```python
import requests

# Step 1: Direct the user to Upwork's OAuth authorization URL
# This is typically done in a web app where users log in

# Step 2: Capture the authorization code after the user logs in and approves your app

authorization_code = "AUTHORIZATION_CODE_RECEIVED"

# Step 3: Exchange the authorization code for an access token.
# Upwork's OAuth 2.0 token endpoint; confirm the current URL in the API documentation.
token_url = "https://www.upwork.com/api/v3/oauth2/token"
data = {
    'grant_type': 'authorization_code',
    'code': authorization_code,
    'redirect_uri': 'YOUR_REDIRECT_URI',
    'client_id': 'YOUR_CLIENT_ID',
    'client_secret': 'YOUR_CLIENT_SECRET'
}

response = requests.post(token_url, data=data, timeout=30)
response.raise_for_status()  # Stop on a 4xx/5xx response instead of parsing an error body
token_info = response.json()
access_token = token_info.get("access_token")
```
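
Steps 1 and 2 in the example above are left as comments. A minimal sketch of both, using only the standard library, might look like the following. The authorization URL, the use of a `state` parameter, and the helper names `build_authorization_url` and `extract_authorization_code` are assumptions made for illustration; check Upwork's API documentation for the actual authorization endpoint and required parameters.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed authorization endpoint; confirm it in Upwork's API documentation.
AUTHORIZE_URL = "https://www.upwork.com/ab/account-security/oauth2/authorize"


def build_authorization_url(client_id, redirect_uri, state):
    """Return the URL the user should visit to approve the app (Step 1)."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def extract_authorization_code(callback_url, expected_state):
    """Pull the authorization code out of the redirect URL (Step 2)."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: discard this callback")
    if "code" not in query:
        raise ValueError("callback URL carries no authorization code")
    return query["code"][0]


state = secrets.token_urlsafe(16)  # random value the server echoes back
auth_url = build_authorization_url("YOUR_CLIENT_ID", "https://example.com/callback", state)
print(auth_url)
```

Comparing the returned `state` with the one you generated guards against callbacks that did not originate from your own authorization request.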

### 4. **Using the API Key and Access Token**:
   - After obtaining both the `api_key` and `access_token`, send them as request headers to authenticate your API calls, as shown in Step 5 below.

Make sure to review Upwork's [API documentation](https://developers.upwork.com/) for specific details on how to implement the OAuth process and use their API securely and effectively.

To extract job postings using the Upwork API and fetch details like titles, descriptions, required skills, and client preferences, you can use a Python script that interacts with the Upwork GraphQL API. Here’s a basic example to help you get started:

### Step 1: Install Required Libraries
You’ll need to install the `requests` library for handling HTTP requests and `graphqlclient` for making GraphQL queries.

```bash
pip install requests graphqlclient
```

In [None]:
!pip install requests graphqlclient

Collecting graphqlclient
  Downloading graphqlclient-0.2.4.tar.gz (2.6 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: graphqlclient
  Building wheel for graphqlclient (setup.py) ... done
  Created wheel for graphqlclient: filename=graphqlclient-0.2.4-py3-none-any.whl size=3136 sha256=7d238a4c00fb966f022050e0e174e19dbf70f448ba06b05e5d0dbbc77b551f47
  Stored in directory: /root/.cache/pip/wheels/28/82/aa/a17f0155204dd9b0d3666ba074d763ffeb679811d3c74205f7
Successfully built graphqlclient
Installing collected packages: graphqlclient
Successfully installed graphqlclient-0.2.4


### Step 2: Import Required Libraries
  ```python
  import requests
  from graphqlclient import GraphQLClient
  ```
  * **requests**: This library is used for making HTTP requests to APIs. It simplifies the process of sending GET, POST, and other HTTP requests.
  * **graphqlclient**: This is a library designed to interact with GraphQL APIs. It allows you to send GraphQL queries to a specified endpoint.

In [None]:
import requests
from graphqlclient import GraphQLClient

### Step 3: Initialize the GraphQL Client
```python
client = GraphQLClient('https://www.upwork.com/ab/graphql')
```
  - **`GraphQLClient`**: This is a class provided by the `graphqlclient` Python library, which facilitates sending requests to a GraphQL API endpoint and receiving responses.
  - **`'https://www.upwork.com/ab/graphql'`**: This URL is the specific endpoint of the Upwork API that handles GraphQL queries. By passing this URL to the `GraphQLClient`, you configure the client to interact with Upwork's GraphQL API.

In [None]:
client = GraphQLClient('https://www.upwork.com/ab/graphql')

### Step 4: Set Up Your API Credentials
```python
api_key = 'YOUR_API_KEY'
access_token = 'YOUR_ACCESS_TOKEN'
```
* **api_key**: This is your unique key provided by Upwork after your API access request is approved.
* **access_token**: This token is obtained through the OAuth 2.0 authorization process and is used to authenticate your API requests.
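
Hardcoding credentials in a notebook makes them easy to leak when the file is shared. One possible alternative is to read them from environment variables. The variable names `UPWORK_API_KEY` and `UPWORK_ACCESS_TOKEN` and the helper `load_credentials` are hypothetical, chosen only for this sketch.

```python
import os


def load_credentials(env=None):
    """Return (api_key, access_token) read from environment variables.

    The variable names are illustrative, not something Upwork defines.
    """
    env = os.environ if env is None else env
    required = ("UPWORK_API_KEY", "UPWORK_ACCESS_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["UPWORK_API_KEY"], env["UPWORK_ACCESS_TOKEN"]
```

With both variables set, `api_key, access_token = load_credentials()` would replace the two placeholder assignments.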


In [None]:
api_key = 'YOUR_API_KEY'
access_token = 'YOUR_ACCESS_TOKEN'

### Step 5: Set Up the Headers for the API Request

```python
headers = {
    'Authorization': f'Bearer {access_token}',
    'Upwork-Api-Key': api_key,
    'Content-Type': 'application/json'
}
```

* **Authorization**: The `Bearer` keyword is used to pass the `access_token` in the HTTP request header, which tells the API that you're an authenticated user.
* **Upwork-Api-Key**: This header includes your API key, which is necessary for the Upwork API to validate your request.
* **Content-Type**: This specifies that the request body is JSON, the format GraphQL endpoints conventionally accept for POST requests.

In [None]:
headers = {
    'Authorization': f'Bearer {access_token}',
    'Upwork-Api-Key': api_key,
    'Content-Type': 'application/json'
}

### Step 6: Define Your GraphQL Query

```python
search_term = "Python Developer"
result_count = 10
query = f"""
query {{
  jobs(query: "{search_term}", first: {result_count}) {{
    nodes {{
      id
      title
      description
      skills {{
        name
      }}
      budget
      jobType
      client {{
        country
        rating
        feedbackScore
      }}
    }}
  }}
}}
"""
```
- **query**: This string defines the GraphQL query you want to send to the Upwork API. It asks for job postings that match the search term "Python Developer."
- **jobs(query: "Python Developer", first: 10)**: This part of the query specifies that you want the first 10 job postings related to "Python Developer."
- **nodes**: The list of matching job postings. For each posting, the query requests these fields:
  - **id**: The unique identifier for the job.
  - **title**: The job title.
  - **description**: The job description.
  - **skills**: A list of skills required for the job.
  - **budget**: This field will retrieve the budget allocated for the job posting.
  - **jobType**: This field will indicate whether the job is an hourly contract or a fixed-price job.
  - **client**: Information about the client: their country, rating, and feedback score.
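
Interpolating `search_term` directly into the query string breaks as soon as the term contains a double quote or a backslash. GraphQL supports variables for this case: the query text stays fixed and the values travel separately in the request body. The sketch below assumes the `jobs` field accepts `String` and `Int` arguments, which mirrors the query above but is not confirmed against Upwork's schema; `build_payload` is a hypothetical helper.

```python
import json

# A shortened version of the query above, with its arguments declared as
# GraphQL variables. The variable types (String!, Int!) are assumptions.
JOBS_QUERY = """
query SearchJobs($searchTerm: String!, $resultCount: Int!) {
  jobs(query: $searchTerm, first: $resultCount) {
    nodes {
      id
      title
    }
  }
}
"""


def build_payload(search_term, result_count):
    """Build the JSON-serializable body of a GraphQL POST request."""
    return {
        "query": JOBS_QUERY,
        "variables": {"searchTerm": search_term, "resultCount": result_count},
    }


# A search term with embedded quotes needs no manual escaping here.
payload = build_payload('Senior "Python" Developer', 10)
print(json.dumps(payload["variables"]))
```

The payload can be sent with `requests.post(url, json=payload, headers=headers)`. The `execute` method of `graphqlclient` also accepts a `variables` argument.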


In [None]:
search_term = "Python Developer"
result_count = 10
query = f"""
query {{
  jobs(query: "{search_term}", first: {result_count}) {{
    nodes {{
      id
      title
      description
      skills {{
        name
      }}
      budget
      jobType
      client {{
        country
        rating
        feedbackScore
      }}
    }}
  }}
}}
"""

### Step 7: Execute the Query

```python
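def fetch_job_postings():
    try:
        # GraphQLClient.execute() takes no headers argument, so POST with requests
        response = requests.post(
            client.endpoint, json={'query': query}, headers=headers, timeout=30
        )
        response.raise_for_status()
        return response.json()
    except Exception as e:
        print(f"An error occurred: {e}")
```
- **fetch_job_postings()**: This function sends the GraphQL query and the custom headers to the endpoint configured on the client (`client.endpoint`) using `requests.post`, because the `execute` method of `GraphQLClient` does not accept a `headers` argument.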
- **try-except block**: The function attempts to execute the query and return the response. If an error occurs (e.g., network issues, API errors), it catches the exception and prints an error message.
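
A GraphQL server can report a failed query inside a successful HTTP response, using a top-level `errors` list, so catching exceptions alone may miss some failures. A minimal sketch of a check on the response body follows. `unwrap_graphql_response` and `GraphQLQueryError` are hypothetical names, and the helper accepts either a JSON string (which is what `graphqlclient` returns) or an already-parsed dictionary.

```python
import json


class GraphQLQueryError(Exception):
    """Raised when the response body contains a top-level "errors" list."""


def unwrap_graphql_response(raw):
    """Return the "data" part of a GraphQL response, or raise on errors."""
    body = json.loads(raw) if isinstance(raw, (str, bytes)) else raw
    errors = body.get("errors")
    if errors:
        messages = "; ".join(err.get("message", "unknown error") for err in errors)
        raise GraphQLQueryError(messages)
    return body.get("data")
```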


In [None]:
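def fetch_job_postings():
    try:
        # GraphQLClient.execute() takes no headers argument, so POST with requests
        response = requests.post(
            client.endpoint, json={'query': query}, headers=headers, timeout=30
        )
        response.raise_for_status()
        return response.json()
    except Exception as e:
        print(f"An error occurred: {e}")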

### Step 8: Fetch and Print Job Postings

```python
job_postings = fetch_job_postings()
print(job_postings)
```
- **job_postings**: This variable stores the response returned by the `fetch_job_postings()` function.
- **print(job_postings)**: This line outputs the job postings to the console. The response is typically in JSON format, which includes the job details you requested.
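
Printing the raw response is hard to scan once it holds several postings. Assuming the parsed response has the shape implied by the query (`data.jobs.nodes`, each node carrying the requested fields), a sketch like this flattens it into one dictionary per job. The sample data is made up for illustration and `summarize_jobs` is a hypothetical helper.

```python
def summarize_jobs(response):
    """Flatten a parsed response into a list of simple per-job dictionaries."""
    data = response.get("data") or {}
    nodes = (data.get("jobs") or {}).get("nodes") or []
    rows = []
    for job in nodes:
        client = job.get("client") or {}
        rows.append({
            "id": job.get("id"),
            "title": job.get("title"),
            "skills": [skill["name"] for skill in job.get("skills") or []],
            "budget": job.get("budget"),
            "job_type": job.get("jobType"),
            "client_country": client.get("country"),
        })
    return rows


# Made-up sample in the shape the query implies; not real Upwork data.
sample = {
    "data": {
        "jobs": {
            "nodes": [
                {
                    "id": "1",
                    "title": "Build a scraper",
                    "skills": [{"name": "Python"}, {"name": "Scrapy"}],
                    "budget": 500,
                    "jobType": "FIXED",
                    "client": {"country": "US", "rating": 4.9, "feedbackScore": 4.8},
                }
            ]
        }
    }
}

for row in summarize_jobs(sample):
    print(row["title"], row["skills"])
```

If the response arrives as a JSON string, parse it with `json.loads` before passing it to `summarize_jobs`.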




In [None]:
job_postings = fetch_job_postings()
print(job_postings)

### Complete Code



In [None]:
import requests
from graphqlclient import GraphQLClient

# Initialize the GraphQL client
client = GraphQLClient('https://www.upwork.com/ab/graphql')

# Replace these with your actual API keys and access token
api_key = 'YOUR_API_KEY'
access_token = 'YOUR_ACCESS_TOKEN'

# Set up the headers for the API request
headers = {
    'Authorization': f'Bearer {access_token}',
    'Upwork-Api-Key': api_key,
    'Content-Type': 'application/json'
}

# Define your GraphQL query
search_term = "Python Developer"  # customize your search
result_count = 10

query = f"""
query {{
  jobs(query: "{search_term}", first: {result_count}) {{
    nodes {{
      id
      title
      description
      skills {{
        name
      }}
      budget
      jobType
      client {{
        country
        rating
        feedbackScore
      }}
    }}
  }}
}}
"""

# Execute the query
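def fetch_job_postings():
    try:
        # GraphQLClient.execute() takes no headers argument, so POST with requests
        response = requests.post(
            client.endpoint, json={'query': query}, headers=headers, timeout=30
        )
        response.raise_for_status()
        return response.json()
    except Exception as e:
        print(f"An error occurred: {e}")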

# Fetch the job postings
job_postings = fetch_job_postings()
print(job_postings)