# Upwork Market Data Analysis for Optimized Job Proposals

## A Little Story
I recently started on Upwork and invested over $200 in connects, but I wasn't seeing any success.

Determined to figure out why, I realized that being new to the platform and lacking badges was a factor, but there wasn't much I could do about that, since I'm just starting. I then thought my proposals might be the issue and tried to improve them, but still had no luck.

That’s when I decided to create a client account to get an insider’s perspective. By posting a job, I noticed something crucial: the first two lines of a proposal are vital. Clients see a list of proposals with only the first two lines visible, so these lines need to be a compelling hook.

With this insight, I wondered if perhaps my profile wasn’t aligned with current market demands. This led me to undertake a market data analysis on Upwork. I began by gathering months' worth of RSS files containing job listings to better understand what clients are looking for today.

## Objective
The primary objective of this project is to analyze Upwork job listings to identify current market needs and optimize job proposals to increase success rates. By extracting structured fields from the listings with LLM-based tools and visualizing them in Power BI, the project aims to surface job market trends, in-demand skills, and other factors that influence hiring decisions on Upwork.

## Phases of the Project

### 1. Data Collection
- **Frequency**: Collect RSS feeds of job listings every two days, when possible.
- **Automation**: Develop a script to automate the download of RSS feeds and convert them into JSON format for easier handling.
- **Storage**: Save the JSON files in a structured folder system.

### 2. Data Transformation (ETL Process)
- **Extraction**: Extract relevant information from the job listings using advanced language models and tools such as ChatGPT API, Kor, and LangChain.
- **Transformation**: Structure the unstructured data into a consistent format that includes key job details such as title, responsibilities, skills, qualifications, hourly rate, posting date, category, country, and additional skills.
- **Loading**: Load the transformed data into a database or a structured file format suitable for analysis.

### 3. Data Analysis and Visualization
- **Tool**: Use Power BI to create interactive dashboards.
- **Metrics and Insights**:
  - **Job Title**: Categorize and analyze the most common job titles.
  - **Job Responsibilities**: Identify frequently listed tasks and duties.
  - **Required Skills**: Determine the most in-demand skills and tools.
  - **Preferred Qualifications**: Highlight advantageous qualifications and experiences.
  - **Hourly Range**: Analyze the offered hourly rate ranges.
  - **Posted Date**: Track the volume of job postings over time.
  - **Category**: Examine job categories and their distribution.
  - **Country**: Map job listings by location.
  - **Additional Skills**: Identify additional skills that enhance job proposals.

## Methodology


### 1. Data Collection
- Implement a Python script to automate the fetching of RSS feeds from Upwork every two days.
- Convert the RSS feeds into JSON format for structured data handling.
- Save the JSON files in an organized directory for subsequent processing.
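
The collection steps above can be sketched as follows. This is a minimal illustration using only the standard library, not the script actually used for this project: the function names, the `feed_url` parameter, and the namespace handling are assumptions, and the output layout simply mirrors the `rss` → `channel` → `item` structure and the `YYYYMMDD_RSS_<topic>.json` file names seen later in this notebook.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from datetime import date
from pathlib import Path

# Namespace used by the RSS "content" module (for <content:encoded>)
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"


def fetch_feed(feed_url, timeout=30):
    """Download the raw RSS XML (feed_url is whatever feed you subscribe to)."""
    with urllib.request.urlopen(feed_url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def rss_to_dict(xml_text):
    """Convert RSS XML into a dict shaped like rss -> channel -> item."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        record = {}
        for child in item:
            tag = child.tag
            if tag.startswith("{"):
                # ElementTree expands prefixes to "{uri}local"; restore a prefix
                uri, local = tag[1:].split("}", 1)
                prefix = "content" if uri == CONTENT_NS else uri
                tag = f"{prefix}:{local}"
            record[tag] = (child.text or "").strip()
        items.append(record)
    return {"rss": {"channel": {"item": items}}}


def save_feed(xml_text, out_dir, topic, day=None):
    """Write the converted feed to <out_dir>/<YYYYMMDD>_RSS_<topic>.json."""
    day = day or date.today()
    out_path = Path(out_dir) / f"{day:%Y%m%d}_RSS_{topic}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(rss_to_dict(xml_text), indent=2), encoding="utf-8")
    return out_path
```

A scheduled task (for example, one that runs every two days) could then call `save_feed(fetch_feed(feed_url), out_dir, "PowerBI")`.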

### 2. Data Transformation
- Use natural language processing (NLP) techniques and large language models (LLMs) like ChatGPT API to parse and extract detailed information from job descriptions.
- Utilize Kor and LangChain for efficient data extraction and transformation.
- Ensure the extracted data includes:
  - Job Title
  - Link to Job Listing
  - Job Responsibilities
  - Required Skills
  - Preferred Qualifications
  - Hourly Range
  - Posted Date
  - Category
  - Country
  - Additional Skills
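
Some of these fields appear in the feed as labeled HTML fragments (the sample items later in this notebook show `<b>Posted On</b>`, `<b>Category</b>`, `<b>Skills</b>`, and `<b>Hourly Range</b>`). As a complement to the LLM-based extraction, which is still needed for free-text fields such as responsibilities and qualifications, those labeled fields could be parsed deterministically. The sketch below is an assumption about how that might look, not the project's actual pipeline; all function names are hypothetical.

```python
import html
import re

# Matches "<b>Label</b>: value" up to the next <br />, the next <b>, or the end
LABELED_FIELD = re.compile(
    r"<b>(?P<label>[^<]+)</b>\s*:\s*(?P<value>.*?)(?=<br\s*/?>|<b>|$)",
    re.S,
)


def parse_labeled_fields(description):
    """Return a {label: value} dict for the labeled fields in a description."""
    fields = {}
    for match in LABELED_FIELD.finditer(description):
        label = match.group("label").strip()
        value = html.unescape(match.group("value"))
        value = re.sub(r"\s+", " ", value).strip()
        # Keep the first occurrence; some items repeat a label (e.g. Skills)
        fields.setdefault(label, value)
    return fields


def split_skills(value):
    """Split a comma-separated skills string into a clean list."""
    return [skill.strip() for skill in value.split(",") if skill.strip()]


def parse_hourly_range(value):
    """Turn '$17.00-$18.00' into (17.0, 18.0); return None if no amount is found."""
    amounts = [float(x) for x in re.findall(r"\$(\d+(?:\.\d+)?)", value)]
    if not amounts:
        return None
    return (min(amounts), max(amounts))
```

Parsing these fields with regular expressions keeps the LLM calls focused on the unstructured part of each listing, which may also reduce API cost.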

### 3. Data Loading
- Store the transformed data in a relational database or a structured data file (e.g., CSV, JSON) for analysis.
- Ensure data integrity and consistency throughout the ETL process.
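
One way to get both storage and a basic integrity guarantee is a SQLite table keyed on the job link, so that a listing collected on several days is stored only once. The schema and column names below are hypothetical and cover only a subset of the fields; this is a sketch under those assumptions rather than the project's actual loading step.

```python
import sqlite3

# Hypothetical column set; extend it to match the fields you extract
COLUMNS = ("link", "title", "category", "country", "posted_on", "hourly_min", "hourly_max")

SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    link       TEXT PRIMARY KEY,
    title      TEXT NOT NULL,
    category   TEXT,
    country    TEXT,
    posted_on  TEXT,
    hourly_min REAL,
    hourly_max REAL
)
"""


def load_jobs(conn, jobs):
    """Insert job dicts, skipping links already stored; return the row count."""
    conn.execute(SCHEMA)
    # Fill missing keys with None so every row binds all named parameters
    rows = [{col: job.get(col) for col in COLUMNS} for job in jobs]
    placeholders = ", ".join(f":{col}" for col in COLUMNS)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            f"INSERT OR IGNORE INTO jobs ({', '.join(COLUMNS)}) VALUES ({placeholders})",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
```

Power BI can read the resulting database through an ODBC driver, or the table can be exported to CSV if a flat file is preferred.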

### 4. Data Analysis and Visualization
- Import the transformed data into Power BI.
- Create interactive and visually appealing dashboards that provide insights into the Upwork job market.
- Develop visualizations that help identify trends and patterns in job listings, such as word clouds for job titles and skills, bar charts for job categories, and geographic maps for job locations.
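
Before importing into Power BI, it can help to pre-aggregate the data behind a visual. The sketch below counts how many jobs mention each skill and writes the result to a CSV that a bar chart or word cloud could use. It assumes each job is a dict with a `skills` list, which is a hypothetical shape for the transformed data; skill names are lowercased so that variants such as "SQL" and "sql" are counted together.

```python
import csv
from collections import Counter


def skill_counts(jobs):
    """Count the number of jobs that mention each skill (case-insensitive)."""
    counts = Counter()
    for job in jobs:
        # A set ensures a skill is counted once per job, even if listed twice
        unique = {skill.strip().lower() for skill in job.get("skills", []) if skill.strip()}
        counts.update(unique)
    return counts


def write_counts_csv(counts, path):
    """Write skill counts, most frequent first, to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["skill", "job_count"])
        writer.writerows(counts.most_common())
```

The same pattern applies to job categories or countries by swapping the field being counted.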

## Expected Outcomes
- **Comprehensive Dashboard**: A Power BI dashboard offering a detailed analysis of Upwork job listings, showcasing critical insights such as in-demand skills, job categories, and hourly rate ranges.
- **Optimized Proposals**: Enhanced understanding of market needs to tailor my profile more effectively, thereby increasing the chances of success on Upwork.
- **Market Trends**: Identification of emerging trends and shifts in the job market, enabling proactive adjustments to job search strategies.

## Tools and Technologies
- **Data Collection and Transformation**: Python, RSS Feeds, JSON, ChatGPT API, Kor, LangChain.
- **Data Visualization**: Power BI.
- **Storage**: Relational Database or Structured Data Files (e.g., CSV, JSON).

## Conclusion
By systematically analyzing Upwork job listings through automated data collection, advanced NLP techniques, and comprehensive data visualization, this project aims to provide actionable insights for optimizing job proposals. The resulting Power BI dashboard will serve as a powerful tool for understanding market demands and tailoring job search strategies to enhance success rates on Upwork.

In [14]:
#Imports:
import os
import json
import requests
import time
import re
#from bs4 import BeautifulSoup

from langchain.callbacks import get_openai_callback

def printOutput(output):
    # Pretty-print a JSON-serializable object with sorted keys
    print(json.dumps(output, sort_keys=True, indent=3))

In [2]:
#OpenAi Key:
openaikeyenv = "OpenAIKey1"
openaikey = os.getenv(openaikeyenv)  # To run this notebook, set the OpenAIKey1 environment variable or replace this line with your own API key

#Show your ApiKey:
#print(openaikey)
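
`os.getenv` returns `None` when a variable is not set, which only surfaces later as a less obvious error (for example, when the value is passed to `os.path.join` or to an API client). A small helper that fails early may make the notebook easier to run on another machine. The helper name `require_env` is hypothetical; this is a sketch, not part of the original notebook.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value
```

For example, `openaikey = require_env("OpenAIKey1")` would stop immediately with a readable message if the key is missing.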

In [5]:
pathenv = "DataScienceProjectsPath"
subpath = os.getenv(pathenv)
dubpath = r"20240517 UPWORK RSS Feed\1-Original Data"
dubpath2 = r"20240517 UPWORK RSS Feed\2-Prepared Data\RAwFiles"
print(subpath)

fullpath = os.path.join(subpath,dubpath)
directory = os.path.join(subpath,dubpath2)
print(fullpath)

C:\Users\saadl\OneDrive\Bureau\Loug\Data Science Projects
C:\Users\saadl\OneDrive\Bureau\Loug\Data Science Projects\20240517 UPWORK RSS Feed\1-Original Data


In [6]:
files=os.listdir(directory)

In [7]:
files

['20240513_RSS_PowerBI.json',
 '20240514_RSS_PowerBI.json',
 '20240516_RSS_PowerBI.json',
 '20240517_RSS_PowerBI.json',
 '20240520_RSS_PowerBI.json',
 '20240522_RSS_PowerBI.json',
 '20240524_RSS_PowerBI.json',
 '20240526_RSS_PowerBI.json',
 '20240528_RSS_PowerBI.json',
 '20240529_RSS_PowerBI.json',
 '20240530_RSS_PowerBI.json',
 '20240601_RSS_PowerBI.json',
 '20240603_RSS_PowerBI.json',
 '20240605_RSS_PowerBI.json',
 '20240607_RSS_PowerBI.json',
 '20240613_RSS_PowerBI.json',
 '20240615_RSS_PowerBI.json',
 '20240619_RSS_PowerBI.json',
 '20240621_RSS_PowerBI.json',
 '20240623_RSS_PowerBI.json',
 '20240625_RSS_PowerBI.json',
 '20240627_RSS_PowerBI.json',
 '20240629_RSS_PowerBI.json',
 '20240701_RSS_PowerBI.json',
 '20240703_RSS_PowerBI.json',
 '20240705_RSS_PowerBI.json',
 '20240707_RSS_PowerBI.json',
 '20240709_RSS_PowerBI.json',
 '20240711_RSS_PowerBI.json',
 '20240713_RSS_PowerBI.json',
 '20240715_RSS_PowerBI.json',
 '20240717_RSS_PowerBI.json',
 '20240718_RSS_PowerBI.json',
 '20240720
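
Because each file name starts with its collection date, the listing can be used to check how closely the collection followed the planned two-day cadence. The sketch below assumes the `YYYYMMDD_RSS_<topic>.json` naming seen above and ignores any file that does not match it; the function name is hypothetical.

```python
import re
from datetime import datetime

# File names are assumed to look like 20240513_RSS_PowerBI.json
FILENAME_PATTERN = re.compile(r"^(\d{8})_RSS_.+\.json$")


def collection_gaps(filenames):
    """Return the number of days between consecutive collection dates."""
    dates = sorted(
        datetime.strptime(match.group(1), "%Y%m%d").date()
        for match in map(FILENAME_PATTERN.match, filenames)
        if match
    )
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
```

Calling `collection_gaps(files)` returns one gap per pair of consecutive files, so larger values point to periods where collection was skipped.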

In [8]:
print(os.path.join(directory,files[0]))
file1_path=os.path.join(directory,files[0])

C:\Users\saadl\OneDrive\Bureau\Loug\Data Science Projects\20240517 UPWORK RSS Feed\2-Prepared Data\RAwFiles\20240513_RSS_PowerBI.json


In [10]:
with open(file=file1_path, mode="r") as file1:
    # Parse the JSON content directly from the file handle
    json_file1 = json.load(file1)

In [11]:
len(json_file1["rss"]['channel']['item'])

30
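
The same steps can be applied to every file in the directory to build one combined list of listings. Since feeds collected on nearby dates may contain the same job, the sketch below keeps only the first occurrence of each link. It assumes every file has the `rss` → `channel` → `item` layout shown above, and it defensively handles the case where `item` is a single dict rather than a list, which some XML-to-JSON converters produce for one-item feeds. The function name is hypothetical.

```python
import json
from pathlib import Path


def load_all_items(directory):
    """Load the items from every JSON feed file, de-duplicated by link."""
    seen = {}
    for path in sorted(Path(directory).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            feed = json.load(fh)
        items = feed["rss"]["channel"].get("item", [])
        if isinstance(items, dict):
            # A feed with a single item may be stored as a dict, not a list
            items = [items]
        for item in items:
            # First occurrence wins, so the earliest collected copy is kept
            seen.setdefault(item["link"], item)
    return list(seen.values())
```

Sorting the paths means files are read in date order, given the date-prefixed file names.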

In [12]:
items = json_file1['rss']['channel']['item']
items

[{'title': 'Professional dashboard built in MicroStrategy - Upwork',
  'link': 'https://www.upwork.com/jobs/Professional-dashboard-built-MicroStrategy_%7E01ec8934d454ff0ef3?source=rss',
  'description': 'Hi<br /><br />\n I am looking for a professional dashboard built in MicroStrategy, using advanced visualizations and automation. The dataset is small&nbsp;&nbsp;(about 107 rows)and straight forward. Also please indicate how long will it take for you to do the job?<br /><br />\nThanks<br />\nCharu<br /><br /><br /><br /><b>Posted On</b>: May 13, 2024 09:46 UTC<br /><b>Category</b>: Data Visualization<br /><b>Skills</b>:Microsoft Power BI Data Visualization,     Microsoft Power BI,     Dashboard,     Business Intelligence,     SQL,     Microsoft Power BI Development,     Database,     Microsoft Excel,     Data Mining,     BigQuery,     Data Visualization,     Analytics Dashboard,     Data Modeling,     Data Analytics    \n<br /><b>Skills</b>:        Microsoft Power BI Data Visualization,

In [None]:
items[0]['content:encoded']

'How will your workday look?<br />\n- You will create analytical tables;<br />\n- Conduct electricity accounting;<br />\n- Analyze data using various programs and be responsible for transferring data to a new analytics program;<br />\n- Create and manage tools for predictive maintenance;<br />\n- Automate the metrological verification system;<br />\n- Develop new tools (e.g., macro files) and improve existing analytical tools.<br /><br />\nThis position is perfect for you if:<br />\n- You have a higher education (a technical background is an advantage);<br />\n- You are proficient in using MS Excel;<br />\n- You have experience with Power BI and Qlik Sense analytics tools;<br />\n- You are skilled in using the English language.<br /><br />\nIn this position, the following will be useful:<br />\n- Experience in seeking new innovative solutions;<br />\n- Experience in implementing new analytical tools in a company.<br /><br /><b>Hourly Range</b>: $17.00-$18.00\n\n<br /><b>Posted On</b>: 
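
The raw field mixes HTML tags (`<br />`, `<b>`), entities such as `&nbsp;`, and irregular whitespace. Cleaning it before sending it to a language model may reduce token usage and noise. The function below is a minimal sketch of such a cleanup using the standard library; it is an assumption about a useful preprocessing step, not a step taken from the original notebook.

```python
import html
import re


def clean_description(raw):
    """Convert an HTML job description into plain text."""
    text = re.sub(r"<br\s*/?>", "\n", raw)      # line breaks become newlines
    text = re.sub(r"<[^>]+>", "", text)          # drop remaining tags such as <b>
    text = html.unescape(text).replace("\xa0", " ")  # decode entities, normalize nbsp
    text = re.sub(r"[ \t]+", " ", text)          # collapse runs of spaces and tabs
    text = "\n".join(line.strip() for line in text.splitlines())
    text = re.sub(r"\n{3,}", "\n\n", text)       # keep at most one blank line
    return text.strip()
```

For example, `clean_description(items[0]['content:encoded'])` would return the listing as plain text with labels such as "Hourly Range" still present but without markup.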

In [None]:
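# Drop the redundant 'description' field. Passing a default avoids the
# KeyError: 'description' raised when an item no longer has that key.
for i in items:
    i.pop('description', None)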

In [None]:
items

[{'title': 'Energy Resources Analyst - Upwork',
  'link': 'https://www.upwork.com/jobs/Energy-Resources-Analyst_%7E01ea1b338113e20cef?source=rss',
  'content:encoded': 'How will your workday look?<br />\n- You will create analytical tables;<br />\n- Conduct electricity accounting;<br />\n- Analyze data using various programs and be responsible for transferring data to a new analytics program;<br />\n- Create and manage tools for predictive maintenance;<br />\n- Automate the metrological verification system;<br />\n- Develop new tools (e.g., macro files) and improve existing analytical tools.<br /><br />\nThis position is perfect for you if:<br />\n- You have a higher education (a technical background is an advantage);<br />\n- You are proficient in using MS Excel;<br />\n- You have experience with Power BI and Qlik Sense analytics tools;<br />\n- You are skilled in using the English language.<br /><br />\nIn this position, the following will be useful:<br />\n- Experience in seeking ne