# Generate embeddings of the cleaned dataset

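Import the necessary libraries and load the API key from the environment.
> You need an OpenAI API key, which you can create at https://platform.openai.com/account/api-keys. The cell below reads it from the `OPENAI_KEY` environment variable, loaded from a `.env` file by `load_dotenv()`.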

In [None]:
# pyright: reportGeneralTypeIssues=false

import pandas as pd 
import numpy as np
from sentence_transformers import SentenceTransformer
import faiss
import json
from openai import OpenAI
from dotenv import load_dotenv
import os
from tqdm import tqdm
import math
import time

load_dotenv()

OPENAI_KEY = os.getenv("OPENAI_KEY")

Load the dataset of **green skills** provided by `ESCO`.

In [5]:
df_green_skills = pd.read_csv("../data/taxonomies/esco_green_skills_cleaned.csv")
df_green_skills.head()

| Unnamed: 0 | green_skill | alt_label | description |
|---|---|---|---|
| 0 | train staff to reduce food waste | teach students food waste reduction practices | Establish new trainings and staff development ... |
| 1 | train staff to reduce food waste | inform staff on food waste reduction practices | Establish new trainings and staff development ... |
| 2 | train staff to reduce food waste | educate workers on food recycling methods | Establish new trainings and staff development ... |
| 3 | train staff to reduce food waste | educate staff on food waste reduction | Establish new trainings and staff development ... |
| 4 | develop energy saving concepts | create concepts for energy saving | Use current research results and collaborate w... |


Concatenate the `green_skill` (main name), `alt_label` (alternative name) and `description` columns into a single text representation for each skill.
This combined text is used to generate richer embeddings.
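Plain `+` concatenation of the three columns raises a `TypeError` if any cell is missing, because pandas stores missing values as float `NaN`. A minimal sketch of a more defensive join follows; `build_skill_text` is a hypothetical helper, not part of the original notebook, and it assumes that skipping a missing field is acceptable.

```python
import math


def build_skill_text(green_skill, alt_label, description):
    """Join the three text fields into one string, skipping missing values."""
    parts = []
    for value in (green_skill, alt_label, description):
        # pandas represents missing cells as float NaN; skip those and None
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        value = str(value).strip()
        if value:
            parts.append(value)
    return ". ".join(parts)


example = build_skill_text(
    "train staff to reduce food waste",
    "teach students food waste reduction practices",
    float("nan"),  # simulated missing description
)
print(example)
```

In the notebook, the helper could be applied row by row, for instance inside the `iterrows()` loop, in place of the manual concatenation.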

In [33]:
for _, row in df_green_skills.head(10).iterrows():
    # Join the three text columns into one string per skill
    label = row["green_skill"] + ". " + row["alt_label"] + ". " + row["description"]
    print(label)

train staff to reduce food waste. teach students food waste reduction practices. Establish new trainings and staff development provisions to support staff knowledge in food waste prevention and food recycling practices. Ensure that staff understands methods of and tools for food recycling, e.g., separating waste.
train staff to reduce food waste. inform staff on food waste reduction practices. Establish new trainings and staff development provisions to support staff knowledge in food waste prevention and food recycling practices. Ensure that staff understands methods of and tools for food recycling, e.g., separating waste.
train staff to reduce food waste. educate workers on food recycling methods. Establish new trainings and staff development provisions to support staff knowledge in food waste prevention and food recycling practices. Ensure that staff understands methods of and tools for food recycling, e.g., separating waste.
train staff to reduce food waste. educate staff on food wa

## Embeddings of the green skills
This section generates embeddings for the green skills dataset using the `text-embedding-3-large` model from **OpenAI**. The embeddings are stored in a **FAISS** index for efficient similarity search, with **cosine similarity** as the metric (computed as an inner product, which matches cosine similarity when the vectors have unit length).
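Inner product and cosine similarity coincide only for unit-length vectors. OpenAI's documentation states that its embeddings are normalized to length 1, which is why an inner-product index can serve as a cosine-similarity index here. A small pure-Python check of that relationship, using toy 2-D vectors and illustrative helper names:

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine(a, b):
    """Cosine similarity, defined for non-zero vectors."""
    return dot(a, b) / (norm(a) * norm(b))


def normalize(a):
    """Scale a non-zero vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


a = [3.0, 4.0]
b = [4.0, 3.0]
ua, ub = normalize(a), normalize(b)

ip_unit = dot(ua, ub)   # inner product of the unit vectors
cos_raw = cosine(a, b)  # cosine similarity of the original vectors
print(round(ip_unit, 6), round(cos_raw, 6))

# A zero placeholder vector has inner product 0 with every query
zero = [0.0, 0.0]
print(dot(zero, ub))
```

One consequence is that a zero vector inserted as an error placeholder scores 0 against any query. It can still appear in results when the genuine matches have negative scores.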

In [None]:
client = OpenAI(api_key=OPENAI_KEY)

embeddings = []

for idx, row in tqdm(df_green_skills.iterrows(), total=df_green_skills.shape[0], desc="Generating embeddings", unit="embedding"):
    # Same text representation as above: main name, alternative label, description
    text = row["green_skill"] + ". " + row["alt_label"] + ". " + row["description"]

    try:
        response = client.embeddings.create(
            input=text,
            model="text-embedding-3-large"
        )
        embeddings.append(response.data[0].embedding)
    except Exception as e:
        # Keep row alignment with a zero vector (text-embedding-3-large returns 3072 dimensions)
        embeddings.append([0.0] * 3072)
        tqdm.write(f"Error generating embedding for row {idx}: {e}")
        time.sleep(5)  # Pause before the next request to avoid hitting rate limits

embeddings[:2]

Generating embeddings: 100%|██████████| 2539/2539 [12:54<00:00,  3.28embedding/s]  


[[-0.04066907614469528,
  0.018801070749759674,
  -0.015321778133511543,
  0.008852869272232056,
  -0.020282993093132973,
  -0.023414356634020805,
  0.049070924520492554,
  -0.013556358404457569,
  -0.016752153635025024,
  0.017074311152100563,
  -0.0025949731934815645,
  -0.027164261788129807,
  -0.014471284113824368,
  0.017357809469103813,
  0.016765039414167404,
  -0.040308259427547455,
  0.012705864384770393,
  0.01338883675634861,
  0.04711221158504486,
  -0.028272481635212898,
  0.045565858483314514,
  -0.05623569339513779,
  -0.017100082710385323,
  -0.009336104616522789,
  -0.008376076817512512,
  0.021700482815504074,
  -0.016790812835097313,
  0.0038626601453870535,
  -0.046416353434324265,
  0.022215932607650757,
  0.027293125167489052,
  0.039122723042964935,
  -0.025695227086544037,
  -0.06860651075839996,
  -0.033865123987197876,
  0.010811582207679749,
  0.042086564004421234,
  -0.00916858296841383,
  -0.01597897708415985,
  0.018504686653614044,
  -0.0388907715678215,
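The embedding loop above pauses after a failure but then moves on, leaving a zero vector in place of the failed row. If retrying the request is preferred, a helper along these lines could wrap the API call. This is a sketch under stated assumptions: `call_with_retries` and its parameters are hypothetical names, and the demo uses a stand-in function so that it runs without network access.

```python
import time


def call_with_retries(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn() and retry on any exception, doubling the wait each time.

    Returns fn()'s result, or re-raises the last exception once
    max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...


# Stand-in for the API call: fails twice, then succeeds
calls = {"n": 0}


def flaky_embedding_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated rate limit")
    return [0.1, 0.2, 0.3]


waits = []  # record the requested delays instead of actually sleeping
result = call_with_retries(flaky_embedding_call, sleep=waits.append)
print(result, waits)
```

In the notebook, `fn` could be a lambda wrapping `client.embeddings.create(input=text, model="text-embedding-3-large")`, with the zero-vector fallback kept only for rows that still fail after the last attempt.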


Creating the `FAISS` index and adding the embeddings to it

In [28]:
index = faiss.IndexFlatIP(len(embeddings[0]))  # Inner-product index with the embedding dimension
embeddings = np.array(embeddings).astype('float32')  # FAISS expects float32 arrays
index.add(embeddings)
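`IndexFlatIP` is an exact index: a search compares the query against every stored vector by inner product and returns the best `k`. The idea can be sketched in pure Python as below. This is only a conceptual illustration with toy vectors, not how FAISS is implemented, and `flat_ip_search` is a hypothetical name.

```python
import heapq


def flat_ip_search(vectors, query, k):
    """Return (scores, ids) of the k stored vectors with the largest inner product."""
    scored = (
        (sum(x * q for x, q in zip(vec, query)), i)
        for i, vec in enumerate(vectors)
    )
    top = heapq.nlargest(k, scored)  # sorted by score, highest first
    return [s for s, _ in top], [i for _, i in top]


stored = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.6, 0.8],
]
scores, ids = flat_ip_search(stored, query=[0.0, 1.0], k=2)
print(scores, ids)
```

The returned ids are positions in insertion order, which is why a separate id-to-text mapping is needed to recover the skill names.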

Next, build a **dictionary** that maps each integer id `i` to a tuple `(skill, text)`, where:
* `i` is the position of the vector in the index. **FAISS** stores only the *vector embeddings*, not the text, so this id is needed to map a search result back to its entry.
* `(skill, text)` is a tuple with two elements: `skill` is the official name of the green skill in the **ESCO taxonomy**, and `text` is the alternative label matched by the semantic search. Because one skill has several alternative labels, the same **green skill** can be retrieved for different, similar job descriptions.

In [18]:
id_to_skill = {i: (skill, text) for i, (skill, text) in enumerate(zip(df_green_skills["green_skill"], df_green_skills["alt_label"]))}
id_to_skill[0]

('train staff to reduce food waste',
 'teach students food waste reduction practices')

Testing the index

In [None]:
query = "Ayudar a la reforestación"  # Test query in Spanish: "Help with reforestation"

query_embedding = client.embeddings.create(
    input=query,
    model="text-embedding-3-large"
).data[0].embedding

query_vec = np.array([query_embedding], dtype=np.float32)

D, I = index.search(query_vec, k=2)

for i, score in zip(I[0], D[0]):
    print(f"Score: {score:.4f}\nSkill: {id_to_skill[i][0]}\nText: {id_to_skill[i][1]}\n")

Score: 0.4316
Skill: plant trees
Text: transplant trees

Score: 0.4180
Skill: care for the wildlife
Text: flora and fauna protecting



Saving the **index** and the **mapping**

In [30]:
faiss.write_index(index, "../data/embeddings/esco_green_skills_text-embedding-3-large.index")
with open("../data/mapping/id_to_skill.json", "w") as f:
    json.dump(id_to_skill, f, indent=4)
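One detail to keep in mind when reading the mapping back: JSON object keys are always strings, and tuples are written as arrays. After loading, the ids are therefore `"0"`, `"1"`, ... and the values are lists. A minimal sketch of the round trip with a two-entry demo mapping (the variable names are illustrative):

```python
import json

id_to_skill_demo = {
    0: ("train staff to reduce food waste", "teach students food waste reduction practices"),
    1: ("train staff to reduce food waste", "inform staff on food waste reduction practices"),
}

serialized = json.dumps(id_to_skill_demo, indent=4)
raw = json.loads(serialized)

# After the round trip the keys are strings and the values are lists
print(list(raw.keys()), type(raw["0"]).__name__)

# Restore integer keys and tuple values so FAISS ids can index the mapping directly
restored = {int(key): tuple(value) for key, value in raw.items()}
print(restored[0])
```

Without the conversion, looking up an integer id returned by `index.search` in the loaded dictionary would raise a `KeyError`.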


Check that the **index** and the **mapping** are consistent. Each vector in the index corresponds to one entry in the mapping, so both should have the same length as the dataframe.

In [None]:
print(len(id_to_skill)) # Length of the mapping
print(df_green_skills.shape) # Shape of the dataframe
print(index.ntotal) # Number of vectors in the index

# They should be the same
print(len(id_to_skill) == df_green_skills.shape[0] == index.ntotal)

2539
(2539, 3)
2539
True


## Embeddings of jobs from January 2025 to April 2025
This section generates embeddings for the `Skills` entries of the job postings from January 2025 to April 2025. They are used later to predict green skills in those job descriptions.

In [42]:
df_jobs = pd.read_csv("../data/jan_to_apr_2025_with_languages_cleaned.csv")
df_jobs.head()

| Unnamed: 0 | Title | Job_ID | source | Skills | month | detected_language |
|---|---|---|---|---|---|---|
| 0 | ventas flotillas | 977bbdb05a29b771 | indeed | experiencia en prospeccion de clientes | 1.0 | es |
| 1 | ventas flotillas | 977bbdb05a29b771 | indeed | experiencia en cierre de ventas | 1.0 | es |
| 2 | ventas flotillas | 977bbdb05a29b771 | indeed | experiencia en giro automotriz | 1.0 | es |
| 3 | ventas flotillas | 977bbdb05a29b771 | indeed | licencia de manejo indispensable | 1.0 | es |
| 4 | ventas flotillas | 977bbdb05a29b771 | indeed | habilidad para la prospeccion en empresas (tel... | 1.0 | es |


In [None]:
client = OpenAI(api_key=OPENAI_KEY)
skills = list(df_jobs["Skills"])

job_skills_embeddings = []
save_every = 200
embedding_dim = 3072   

for i, skill in enumerate(tqdm(skills, desc="Generating embeddings for job skills", unit="embedding")):
    try:
        response = client.embeddings.create(
            input=skill,
            model="text-embedding-3-large"
        )
        emb = response.data[0].embedding

        # Keep the vector only if every component is a finite number
        if all(math.isfinite(x) for x in emb):
            job_skills_embeddings.append(emb)
        else:
            tqdm.write(f"Invalid embedding at index {i}: inserting zeros")
            job_skills_embeddings.append([0.0] * embedding_dim)

    except Exception as e:
        # Log the failure, keep row alignment with a zero vector, and pause briefly
        tqdm.write(f"Error generating embedding for job skill at index {i}: {e}")
        job_skills_embeddings.append([0.0] * embedding_dim)
        time.sleep(2)

    if i > 0 and i % save_every == 0:
        np.save("../data/embeddings/job_skills_embeddings_partial.npy",
                np.array(job_skills_embeddings, dtype=np.float32))
        print(f"Saved {i} embeddings")

np.save("../data/embeddings/job_skills_embeddings.npy",
        np.array(job_skills_embeddings, dtype=np.float32))

print("Done:", len(job_skills_embeddings), "embeddings saved.")
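The loop above sends one request per skill, which for tens of thousands of rows takes hours. The embeddings endpoint also accepts a list of strings as `input`, so batching several skills per request is one way to reduce the number of round trips. The sketch below shows only the batching and bookkeeping logic. `chunked`, `embed_all`, and `fake_embed_batch` are hypothetical names, and the embedder is injected as a callable so the example runs offline.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(texts, embed_batch, batch_size=100, dim=3072):
    """Embed texts batch by batch; failed batches become zero vectors.

    `embed_batch` is any callable mapping a list of strings to a list of
    vectors in the same order. Returns (vectors, failed_indices).
    """
    vectors, failed = [], []
    offset = 0
    for batch in chunked(texts, batch_size):
        try:
            result = embed_batch(batch)
            if len(result) != len(batch):
                raise ValueError("unexpected number of embeddings")
            vectors.extend(result)
        except Exception:
            # Keep row alignment and remember which rows need another attempt
            vectors.extend([[0.0] * dim for _ in batch])
            failed.extend(range(offset, offset + len(batch)))
        offset += len(batch)
    return vectors, failed


# Stand-in embedder that fails on one batch on purpose
def fake_embed_batch(batch):
    if "boom" in batch:
        raise RuntimeError("simulated API error")
    return [[float(len(text)), 1.0] for text in batch]


texts = ["a", "bb", "boom", "dddd", "eeeee"]
vectors, failed = embed_all(texts, fake_embed_batch, batch_size=2, dim=2)
print(len(vectors), failed)
```

With the real client, `embed_batch` could wrap `client.embeddings.create(input=batch, model="text-embedding-3-large")` and return the `embedding` of each item in `response.data`, assuming the items are in input order (each item also carries an `index` field that can be used to sort them). The batch size must stay within the API's per-request limits. Recording `failed` indices makes it possible to re-embed only those rows later instead of leaving zero vectors in the saved array.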


Generating embeddings for job skills:   0%|          | 201/70318 [01:12<6:58:46,  2.79embedding/s] 

Saved 200 embeddings



Generating embeddings for job skills:  54%|█████▍    | 38201/70318 [3:25:16<11:30:55,  1.29s/embedding]

Saved 38200 embeddings


Generating embeddings for job skills:  55%|█████▍    | 38401/70318 [3:26:24<11:58:04,  1.35s/embedding]

Saved 38400 embeddings


Generating embeddings for job skills:  55%|█████▍    | 38601/70318 [3:27:29<11:05:15,  1.26s/embedding]

Saved 38600 embeddings


Generating embeddings for job skills:  55%|█████▌    | 38801/70318 [3:28:44<12:11:35,  1.39s/embedding]

Saved 38800 embeddings


Generating embeddings for job skills:  55%|█████▌    | 39001/70318 [3:29:53<11:04:29,  1.27s/embedding]

Saved 39000 embeddings


Generating embeddings for job skills:  56%|█████▌    | 39202/70318 [3:30:53<8:02:19,  1.08embedding/s] 

Saved 39200 embeddings


Generating embeddings for job skills:  56%|█████▌    | 39401/70318 [3:31:55<11:05:15,  1.29s/embedding]

Saved 39400 embeddings


Generating embeddings for job skills:  56%|█████▋    | 39601/70318 [3:32:54<10:28:20,  1.23s/embedding]

Saved 39600 embeddings


Generating embeddings for job skills:  57%|█████▋    | 39801/70318 [3:33:49<11:00:04,  1.30s/embedding]

Saved 39800 embeddings


Generating embeddings for job skills:  57%|█████▋    | 40002/70318 [3:34:48<7:57:51,  1.06embedding/s] 

Saved 40000 embeddings


Generating embeddings for job skills:  57%|█████▋    | 40201/70318 [3:35:53<10:39:44,  1.27s/embedding]

Saved 40200 embeddings


Generating embeddings for job skills:  57%|█████▋    | 40401/70318 [3:36:49<10:27:03,  1.26s/embedding]

Saved 40400 embeddings


Generating embeddings for job skills:  58%|█████▊    | 40601/70318 [3:37:46<11:08:53,  1.35s/embedding]

Saved 40600 embeddings


Generating embeddings for job skills:  58%|█████▊    | 40801/70318 [3:38:39<10:34:59,  1.29s/embedding]

Saved 40800 embeddings


Generating embeddings for job skills:  58%|█████▊    | 41001/70318 [3:39:57<10:54:36,  1.34s/embedding]

Saved 41000 embeddings


Generating embeddings for job skills:  59%|█████▊    | 41201/70318 [3:40:56<10:35:43,  1.31s/embedding]

Saved 41200 embeddings


Generating embeddings for job skills:  59%|█████▉    | 41401/70318 [3:41:53<10:12:14,  1.27s/embedding]

Saved 41400 embeddings


Generating embeddings for job skills:  59%|█████▉    | 41601/70318 [3:42:59<10:17:47,  1.29s/embedding]

Saved 41600 embeddings


Generating embeddings for job skills:  59%|█████▉    | 41801/70318 [3:43:53<10:15:34,  1.30s/embedding]

Saved 41800 embeddings


Generating embeddings for job skills:  60%|█████▉    | 42002/70318 [3:44:51<7:52:28,  1.00s/embedding] 

Saved 42000 embeddings


Generating embeddings for job skills:  60%|██████    | 42201/70318 [3:45:49<10:41:41,  1.37s/embedding]

Saved 42200 embeddings


Generating embeddings for job skills:  60%|██████    | 42401/70318 [3:46:48<10:55:31,  1.41s/embedding]

Saved 42400 embeddings


Generating embeddings for job skills:  61%|██████    | 42601/70318 [3:47:44<10:14:53,  1.33s/embedding]

Saved 42600 embeddings


Generating embeddings for job skills:  61%|██████    | 42801/70318 [3:48:47<10:30:41,  1.38s/embedding]

Saved 42800 embeddings


Generating embeddings for job skills:  61%|██████    | 43001/70318 [3:49:44<11:10:22,  1.47s/embedding]

Saved 43000 embeddings


Generating embeddings for job skills:  61%|██████▏   | 43201/70318 [3:50:50<10:38:13,  1.41s/embedding]

Saved 43200 embeddings


Generating embeddings for job skills:  62%|██████▏   | 43401/70318 [3:52:00<10:59:12,  1.47s/embedding]

Saved 43400 embeddings


Generating embeddings for job skills:  62%|██████▏   | 43601/70318 [3:53:06<10:31:21,  1.42s/embedding]

Saved 43600 embeddings


Generating embeddings for job skills:  62%|██████▏   | 43801/70318 [3:54:14<10:41:27,  1.45s/embedding]

Saved 43800 embeddings


Generating embeddings for job skills:  63%|██████▎   | 44001/70318 [3:55:20<10:11:53,  1.40s/embedding]

Saved 44000 embeddings


Generating embeddings for job skills:  63%|██████▎   | 44201/70318 [3:56:27<10:02:12,  1.38s/embedding]

Saved 44200 embeddings


Generating embeddings for job skills:  63%|██████▎   | 44401/70318 [3:57:32<10:17:09,  1.43s/embedding]

Saved 44400 embeddings


Generating embeddings for job skills:  63%|██████▎   | 44601/70318 [3:58:46<10:46:27,  1.51s/embedding]

Saved 44600 embeddings


Generating embeddings for job skills:  64%|██████▎   | 44801/70318 [3:59:52<10:27:31,  1.48s/embedding]

Saved 44800 embeddings


Generating embeddings for job skills:  64%|██████▍   | 45001/70318 [4:00:58<10:41:38,  1.52s/embedding]

Saved 45000 embeddings


Generating embeddings for job skills:  64%|██████▍   | 45201/70318 [4:02:06<11:06:15,  1.59s/embedding]

Saved 45200 embeddings


Generating embeddings for job skills:  65%|██████▍   | 45401/70318 [4:03:17<10:22:25,  1.50s/embedding]

Saved 45400 embeddings


Generating embeddings for job skills:  65%|██████▍   | 45601/70318 [4:04:28<9:56:46,  1.45s/embedding] 

Saved 45600 embeddings


Generating embeddings for job skills:  65%|██████▌   | 45801/70318 [4:05:36<9:45:18,  1.43s/embedding]

Saved 45800 embeddings


Generating embeddings for job skills:  65%|██████▌   | 46001/70318 [4:06:44<10:32:15,  1.56s/embedding]

Saved 46000 embeddings


Generating embeddings for job skills:  66%|██████▌   | 46201/70318 [4:07:54<10:22:05,  1.55s/embedding]

Saved 46200 embeddings


Generating embeddings for job skills:  66%|██████▌   | 46401/70318 [4:09:02<10:16:13,  1.55s/embedding]

Saved 46400 embeddings


Generating embeddings for job skills:  66%|██████▋   | 46601/70318 [4:10:12<9:37:23,  1.46s/embedding] 

Saved 46600 embeddings


Generating embeddings for job skills:  67%|██████▋   | 46801/70318 [4:11:22<10:05:49,  1.55s/embedding]

Saved 46800 embeddings


Generating embeddings for job skills:  67%|██████▋   | 47001/70318 [4:12:34<10:11:45,  1.57s/embedding]

Saved 47000 embeddings


Generating embeddings for job skills:  67%|██████▋   | 47201/70318 [4:13:44<10:32:56,  1.64s/embedding]

Saved 47200 embeddings


Generating embeddings for job skills:  67%|██████▋   | 47401/70318 [4:14:57<9:59:43,  1.57s/embedding] 

Saved 47400 embeddings


Generating embeddings for job skills:  68%|██████▊   | 47601/70318 [4:16:07<9:20:55,  1.48s/embedding]

Saved 47600 embeddings


Generating embeddings for job skills:  68%|██████▊   | 47801/70318 [4:17:18<9:54:42,  1.58s/embedding]

Saved 47800 embeddings


Generating embeddings for job skills:  68%|██████▊   | 48001/70318 [4:18:24<9:47:39,  1.58s/embedding]

Saved 48000 embeddings


Generating embeddings for job skills:  69%|██████▊   | 48201/70318 [4:19:35<9:58:32,  1.62s/embedding]

Saved 48200 embeddings


Generating embeddings for job skills:  69%|██████▉   | 48401/70318 [4:20:50<10:23:51,  1.71s/embedding]

Saved 48400 embeddings


Generating embeddings for job skills:  69%|██████▉   | 48601/70318 [4:22:01<10:23:44,  1.72s/embedding]

Saved 48600 embeddings


Generating embeddings for job skills:  69%|██████▉   | 48801/70318 [4:23:13<9:49:47,  1.64s/embedding] 

Saved 48800 embeddings


Generating embeddings for job skills:  70%|██████▉   | 49001/70318 [4:24:26<9:20:09,  1.58s/embedding]

Saved 49000 embeddings


Generating embeddings for job skills:  70%|██████▉   | 49201/70318 [4:25:24<10:49:05,  1.84s/embedding]

Saved 49200 embeddings


Generating embeddings for job skills:  70%|███████   | 49401/70318 [4:26:22<9:11:01,  1.58s/embedding] 

Saved 49400 embeddings


Generating embeddings for job skills:  71%|███████   | 49602/70318 [4:27:23<6:57:48,  1.21s/embedding]

Saved 49600 embeddings


Generating embeddings for job skills:  71%|███████   | 49801/70318 [4:28:25<9:23:12,  1.65s/embedding]

Saved 49800 embeddings


Generating embeddings for job skills:  71%|███████   | 50001/70318 [4:29:27<8:48:23,  1.56s/embedding]

Saved 50000 embeddings


Generating embeddings for job skills:  71%|███████▏  | 50201/70318 [4:30:31<10:06:53,  1.81s/embedding]

Saved 50200 embeddings


Generating embeddings for job skills:  72%|███████▏  | 50401/70318 [4:31:26<9:01:03,  1.63s/embedding] 

Saved 50400 embeddings


Generating embeddings for job skills:  72%|███████▏  | 50602/70318 [4:32:23<6:34:42,  1.20s/embedding]

Saved 50600 embeddings


Generating embeddings for job skills:  72%|███████▏  | 50801/70318 [4:33:18<8:42:02,  1.60s/embedding]

Saved 50800 embeddings


Generating embeddings for job skills:  73%|███████▎  | 51001/70318 [4:34:17<8:57:20,  1.67s/embedding]

Saved 51000 embeddings


Generating embeddings for job skills:  73%|███████▎  | 51201/70318 [4:35:27<8:59:34,  1.69s/embedding]

Saved 51200 embeddings


Generating embeddings for job skills:  73%|███████▎  | 51401/70318 [4:36:34<8:48:29,  1.68s/embedding]

Saved 51400 embeddings


Generating embeddings for job skills:  73%|███████▎  | 51601/70318 [4:37:46<8:54:51,  1.71s/embedding]

Saved 51600 embeddings


Generating embeddings for job skills:  74%|███████▎  | 51801/70318 [4:39:00<9:13:55,  1.79s/embedding]

Saved 51800 embeddings


Generating embeddings for job skills:  74%|███████▍  | 52001/70318 [4:40:08<9:12:45,  1.81s/embedding]

Saved 52000 embeddings


Generating embeddings for job skills:  74%|███████▍  | 52201/70318 [4:41:17<9:10:42,  1.82s/embedding]

Saved 52200 embeddings


Generating embeddings for job skills:  75%|███████▍  | 52401/70318 [4:42:29<8:59:38,  1.81s/embedding]

Saved 52400 embeddings


Generating embeddings for job skills:  75%|███████▍  | 52601/70318 [4:43:38<8:32:23,  1.74s/embedding]

Saved 52600 embeddings


Generating embeddings for job skills:  75%|███████▌  | 52801/70318 [4:44:49<8:43:23,  1.79s/embedding]

Saved 52800 embeddings


Generating embeddings for job skills:  75%|███████▌  | 53001/70318 [4:45:51<8:40:43,  1.80s/embedding]

Saved 53000 embeddings


Generating embeddings for job skills:  76%|███████▌  | 53201/70318 [4:47:01<8:45:54,  1.84s/embedding]

Saved 53200 embeddings


Generating embeddings for job skills:  76%|███████▌  | 53401/70318 [4:48:07<8:26:03,  1.79s/embedding]

Saved 53400 embeddings


Generating embeddings for job skills:  76%|███████▌  | 53601/70318 [4:49:16<8:55:50,  1.92s/embedding]

Saved 53600 embeddings


Generating embeddings for job skills:  77%|███████▋  | 53801/70318 [4:50:30<8:49:39,  1.92s/embedding] 

Saved 53800 embeddings


Generating embeddings for job skills:  77%|███████▋  | 54001/70318 [4:51:44<8:41:43,  1.92s/embedding]

Saved 54000 embeddings


Generating embeddings for job skills:  77%|███████▋  | 54201/70318 [4:52:43<9:02:44,  2.02s/embedding]

Saved 54200 embeddings


Generating embeddings for job skills:  77%|███████▋  | 54401/70318 [4:53:44<8:12:42,  1.86s/embedding]

Saved 54400 embeddings


Generating embeddings for job skills:  78%|███████▊  | 54601/70318 [4:54:44<7:07:00,  1.63s/embedding]

Saved 54600 embeddings


Generating embeddings for job skills:  78%|███████▊  | 54801/70318 [4:55:40<7:04:50,  1.64s/embedding]

Saved 54800 embeddings


Generating embeddings for job skills:  78%|███████▊  | 55001/70318 [4:56:35<7:23:22,  1.74s/embedding]

Saved 55000 embeddings


Generating embeddings for job skills:  79%|███████▊  | 55202/70318 [4:57:31<5:06:29,  1.22s/embedding]

Saved 55200 embeddings


Generating embeddings for job skills:  79%|███████▉  | 55401/70318 [4:58:30<7:02:11,  1.70s/embedding]

Saved 55400 embeddings


Generating embeddings for job skills:  79%|███████▉  | 55602/70318 [4:59:34<5:02:00,  1.23s/embedding]

Saved 55600 embeddings


Generating embeddings for job skills:  79%|███████▉  | 55801/70318 [5:00:45<7:22:36,  1.83s/embedding]

Saved 55800 embeddings


Generating embeddings for job skills:  80%|███████▉  | 56001/70318 [5:01:42<6:51:12,  1.72s/embedding]

Saved 56000 embeddings


Generating embeddings for job skills:  80%|███████▉  | 56201/70318 [5:02:42<6:42:30,  1.71s/embedding]

Saved 56200 embeddings


Generating embeddings for job skills:  80%|████████  | 56401/70318 [5:03:44<6:52:36,  1.78s/embedding]

Saved 56400 embeddings


Generating embeddings for job skills:  80%|████████  | 56601/70318 [5:04:56<7:08:02,  1.87s/embedding]

Saved 56600 embeddings


Generating embeddings for job skills:  81%|████████  | 56801/70318 [5:05:57<6:12:55,  1.66s/embedding]

Saved 56800 embeddings


Generating embeddings for job skills:  81%|████████  | 57001/70318 [5:06:57<6:46:41,  1.83s/embedding]

Saved 57000 embeddings


Generating embeddings for job skills:  81%|████████▏ | 57201/70318 [5:08:13<6:25:08,  1.76s/embedding]

Saved 57200 embeddings


Generating embeddings for job skills:  82%|████████▏ | 57401/70318 [5:09:23<7:10:26,  2.00s/embedding]

Saved 57400 embeddings


Generating embeddings for job skills:  82%|████████▏ | 57601/70318 [5:10:29<6:18:24,  1.79s/embedding]

Saved 57600 embeddings


Generating embeddings for job skills:  82%|████████▏ | 57801/70318 [5:11:31<6:08:59,  1.77s/embedding]

Saved 57800 embeddings


Generating embeddings for job skills:  82%|████████▏ | 58001/70318 [5:12:38<6:02:06,  1.76s/embedding]

Saved 58000 embeddings


Generating embeddings for job skills:  83%|████████▎ | 58201/70318 [5:13:49<6:11:23,  1.84s/embedding]

Saved 58200 embeddings


Generating embeddings for job skills:  83%|████████▎ | 58401/70318 [5:14:48<6:12:50,  1.88s/embedding]

Saved 58400 embeddings


Generating embeddings for job skills:  83%|████████▎ | 58601/70318 [5:15:59<5:53:53,  1.81s/embedding]

Saved 58600 embeddings


Generating embeddings for job skills:  84%|████████▎ | 58801/70318 [5:17:16<5:55:15,  1.85s/embedding]

Saved 58800 embeddings


Generating embeddings for job skills:  84%|████████▍ | 59001/70318 [5:18:33<5:46:32,  1.84s/embedding]

Saved 59000 embeddings


Generating embeddings for job skills:  84%|████████▍ | 59201/70318 [5:19:44<5:37:08,  1.82s/embedding]

Saved 59200 embeddings


Generating embeddings for job skills:  84%|████████▍ | 59401/70318 [5:21:03<5:53:04,  1.94s/embedding]

Saved 59400 embeddings


Generating embeddings for job skills:  85%|████████▍ | 59601/70318 [5:22:05<5:19:16,  1.79s/embedding]

Saved 59600 embeddings


Generating embeddings for job skills:  85%|████████▌ | 59801/70318 [5:23:04<5:56:50,  2.04s/embedding]

Saved 59800 embeddings


Generating embeddings for job skills:  85%|████████▌ | 60001/70318 [5:24:22<5:31:09,  1.93s/embedding]

Saved 60000 embeddings


Generating embeddings for job skills:  86%|████████▌ | 60201/70318 [5:25:33<5:04:29,  1.81s/embedding]

Saved 60200 embeddings


Generating embeddings for job skills:  86%|████████▌ | 60401/70318 [5:26:49<5:18:35,  1.93s/embedding]

Saved 60400 embeddings


Generating embeddings for job skills:  86%|████████▌ | 60601/70318 [5:28:02<5:35:20,  2.07s/embedding]

Saved 60600 embeddings


Generating embeddings for job skills:  86%|████████▋ | 60801/70318 [5:29:20<4:54:43,  1.86s/embedding]

Saved 60800 embeddings


Generating embeddings for job skills:  87%|████████▋ | 61001/70318 [5:30:14<4:40:44,  1.81s/embedding]

Saved 61000 embeddings


Generating embeddings for job skills:  87%|████████▋ | 61201/70318 [5:31:21<4:48:47,  1.90s/embedding]

Saved 61200 embeddings


Generating embeddings for job skills:  87%|████████▋ | 61401/70318 [5:32:20<4:44:20,  1.91s/embedding]

Saved 61400 embeddings


Generating embeddings for job skills:  88%|████████▊ | 61601/70318 [5:33:21<4:41:16,  1.94s/embedding]

Saved 61600 embeddings


Generating embeddings for job skills:  88%|████████▊ | 61801/70318 [5:34:27<4:35:14,  1.94s/embedding]

Saved 61800 embeddings


Generating embeddings for job skills:  88%|████████▊ | 62001/70318 [5:35:36<4:27:49,  1.93s/embedding]

Saved 62000 embeddings


Generating embeddings for job skills:  88%|████████▊ | 62201/70318 [5:36:39<4:10:53,  1.85s/embedding]

Saved 62200 embeddings


Generating embeddings for job skills:  89%|████████▊ | 62401/70318 [5:37:36<4:29:44,  2.04s/embedding]

Saved 62400 embeddings


Generating embeddings for job skills:  89%|████████▉ | 62601/70318 [5:38:49<4:08:38,  1.93s/embedding]

Saved 62600 embeddings


Generating embeddings for job skills:  89%|████████▉ | 62801/70318 [5:39:59<4:10:23,  2.00s/embedding]

Saved 62800 embeddings


Generating embeddings for job skills:  90%|████████▉ | 63001/70318 [5:41:16<4:01:11,  1.98s/embedding]

Saved 63000 embeddings


Generating embeddings for job skills:  90%|████████▉ | 63201/70318 [5:42:22<3:45:32,  1.90s/embedding]

Saved 63200 embeddings


Generating embeddings for job skills:  90%|█████████ | 63401/70318 [5:43:33<3:53:51,  2.03s/embedding]

Saved 63400 embeddings


Generating embeddings for job skills:  90%|█████████ | 63601/70318 [5:44:49<3:55:48,  2.11s/embedding]

Saved 63600 embeddings


Generating embeddings for job skills:  91%|█████████ | 63801/70318 [5:45:51<3:49:10,  2.11s/embedding]

Saved 63800 embeddings


Generating embeddings for job skills:  91%|█████████ | 64001/70318 [5:46:52<3:27:17,  1.97s/embedding]

Saved 64000 embeddings


Generating embeddings for job skills:  91%|█████████▏| 64201/70318 [5:47:53<3:20:28,  1.97s/embedding]

Saved 64200 embeddings


Generating embeddings for job skills:  92%|█████████▏| 64401/70318 [5:48:55<3:16:52,  2.00s/embedding]

Saved 64400 embeddings


Generating embeddings for job skills:  92%|█████████▏| 64601/70318 [5:50:07<3:06:43,  1.96s/embedding]

Saved 64600 embeddings


Generating embeddings for job skills:  92%|█████████▏| 64801/70318 [5:51:07<3:12:50,  2.10s/embedding]

Saved 64800 embeddings


Generating embeddings for job skills:  92%|█████████▏| 65001/70318 [5:52:17<2:59:30,  2.03s/embedding]

Saved 65000 embeddings


Generating embeddings for job skills:  93%|█████████▎| 65201/70318 [5:53:29<3:01:20,  2.13s/embedding]

Saved 65200 embeddings


Generating embeddings for job skills:  93%|█████████▎| 65401/70318 [5:55:04<2:50:34,  2.08s/embedding] 

Saved 65400 embeddings


Generating embeddings for job skills:  93%|█████████▎| 65601/70318 [5:56:04<2:32:57,  1.95s/embedding]

Saved 65600 embeddings


Generating embeddings for job skills:  94%|█████████▎| 65801/70318 [5:57:12<2:32:08,  2.02s/embedding]

Saved 65800 embeddings


Generating embeddings for job skills:  94%|█████████▍| 66001/70318 [5:58:30<2:30:48,  2.10s/embedding]

Saved 66000 embeddings


Generating embeddings for job skills:  94%|█████████▍| 66201/70318 [5:59:37<2:18:02,  2.01s/embedding]

Saved 66200 embeddings


Generating embeddings for job skills:  94%|█████████▍| 66401/70318 [6:00:49<2:11:22,  2.01s/embedding]

Saved 66400 embeddings


Generating embeddings for job skills:  95%|█████████▍| 66601/70318 [6:01:50<1:59:34,  1.93s/embedding]

Saved 66600 embeddings


Generating embeddings for job skills:  95%|█████████▍| 66801/70318 [6:02:49<2:02:08,  2.08s/embedding]

Saved 66800 embeddings


Generating embeddings for job skills:  95%|█████████▌| 67001/70318 [6:03:49<1:47:19,  1.94s/embedding]

Saved 67000 embeddings


Generating embeddings for job skills:  96%|█████████▌| 67201/70318 [6:04:58<1:47:51,  2.08s/embedding]

Saved 67200 embeddings


Generating embeddings for job skills:  96%|█████████▌| 67401/70318 [6:06:23<1:53:47,  2.34s/embedding]

Saved 67400 embeddings


Generating embeddings for job skills:  96%|█████████▌| 67601/70318 [6:07:20<1:31:03,  2.01s/embedding]

Saved 67600 embeddings


Generating embeddings for job skills:  96%|█████████▋| 67801/70318 [6:08:36<1:28:29,  2.11s/embedding]

Saved 67800 embeddings


Generating embeddings for job skills:  97%|█████████▋| 68001/70318 [6:09:38<1:28:28,  2.29s/embedding]

Saved 68000 embeddings


Generating embeddings for job skills:  97%|█████████▋| 68201/70318 [6:10:38<1:09:29,  1.97s/embedding]

Saved 68200 embeddings


Generating embeddings for job skills:  97%|█████████▋| 68401/70318 [6:11:47<1:05:47,  2.06s/embedding]

Saved 68400 embeddings


Generating embeddings for job skills:  98%|█████████▊| 68601/70318 [6:12:47<56:56,  1.99s/embedding]  

Saved 68600 embeddings


Generating embeddings for job skills:  98%|█████████▊| 68801/70318 [6:13:53<59:38,  2.36s/embedding]

Saved 68800 embeddings


Generating embeddings for job skills:  98%|█████████▊| 69001/70318 [6:15:07<46:06,  2.10s/embedding]

Saved 69000 embeddings


Generating embeddings for job skills:  98%|█████████▊| 69201/70318 [6:16:10<38:44,  2.08s/embedding]

Saved 69200 embeddings


Generating embeddings for job skills:  99%|█████████▊| 69401/70318 [6:17:15<31:31,  2.06s/embedding]

Saved 69400 embeddings


Generating embeddings for job skills:  99%|█████████▉| 69601/70318 [6:18:30<25:46,  2.16s/embedding]

Saved 69600 embeddings


Generating embeddings for job skills:  99%|█████████▉| 69801/70318 [6:19:35<17:25,  2.02s/embedding]

Saved 69800 embeddings


Generating embeddings for job skills: 100%|█████████▉| 70001/70318 [6:20:37<12:09,  2.30s/embedding]

Saved 70000 embeddings


Generating embeddings for job skills: 100%|█████████▉| 70201/70318 [6:21:45<04:05,  2.10s/embedding]

Saved 70200 embeddings


Generating embeddings for job skills: 100%|██████████| 70318/70318 [6:22:17<00:00,  3.07embedding/s]


Done: 70318 embeddings saved.


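Create the **FAISS** index, add the embeddings to it, and save the index to disk. `IndexFlatIP` performs an exact inner-product search, which equals cosine similarity when the vectors are L2-normalized.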

In [3]:
# Create the embedding index

# Load the saved embeddings; FAISS expects a contiguous float32 array of shape (n, d)
job_skills_embeddings = np.load("../data/embeddings/job_skills_embeddings.npy")
job_skills_embeddings = np.ascontiguousarray(job_skills_embeddings, dtype="float32")

# Exact (brute-force) inner-product index sized to the embedding dimension d
index_job_skills = faiss.IndexFlatIP(job_skills_embeddings.shape[1])
index_job_skills.add(job_skills_embeddings)

faiss.write_index(index_job_skills, "../data/embeddings/job_skills_embeddings.index")
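
To make the inner-product search concrete, here is a minimal NumPy-only sketch of what an exact inner-product lookup computes. It does not call FAISS: `top_k_inner_product` is a hypothetical helper written for illustration, and the toy vectors stand in for the real embeddings.

```python
import numpy as np


def top_k_inner_product(vectors, queries, k):
    """Return (scores, ids) of the k largest inner products for each query."""
    scores = queries @ vectors.T                  # shape (n_queries, n_vectors)
    ids = np.argsort(-scores, axis=1)[:, :k]      # highest score first
    top_scores = np.take_along_axis(scores, ids, axis=1)
    return top_scores, ids


# Toy data (assumption): 4 vectors of dimension 3 instead of the real embeddings
rng = np.random.default_rng(0)
vectors = rng.normal(size=(4, 3)).astype("float32")

# L2-normalize so that the inner product equals cosine similarity
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# Query with the first two stored vectors; each should retrieve itself first
queries = vectors[:2]
scores, ids = top_k_inner_product(vectors, queries, k=2)
print(ids)
print(scores)
```

If the stored embeddings are not already unit length, normalizing them before adding them to the index is what makes the returned scores interpretable as cosine similarities.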

Create a mapping from each index position to its skill, similar to the one created for the green skills. The mapping is a dictionary with entries of the form `i: (job_id, job_skill)`, where:
* `i` is the position of the vector in the index. **FAISS** stores only the *vector embeddings*, not the original text, so this key is needed to map a search result back to its entry.
* `(job_id, job_skill)` is a tuple with two elements: `job_id` is the unique identifier of the job posting, and `job_skill` is the skill extracted from that posting.

In [43]:
# Index for Job Skills
# Use `job_id` rather than `id` to avoid shadowing the built-in `id()`
id_to_job = {i: (job_id, skill) for i, (job_id, skill) in enumerate(zip(df_jobs["Job_ID"], df_jobs["Skills"]))}
id_to_job[0]

('977bbdb05a29b771', 'experiencia en prospeccion de clientes')

Save the mapping to a JSON file.

In [44]:
with open("../data/mapping/id_to_job.json", "w") as f:
    json.dump(id_to_job, f, indent=4)
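
One thing to keep in mind when loading this file later: JSON object keys are always strings and JSON has no tuple type, so the integer keys come back as strings and the tuples come back as lists. The sketch below shows one way to restore the original shape. It round-trips through a string instead of the file on disk, and the name `restored` is only illustrative.

```python
import json

# Single-entry stand-in for the full mapping built above
id_to_job = {0: ("977bbdb05a29b771", "experiencia en prospeccion de clientes")}

# Round-trip through JSON text (equivalent to json.dump followed by json.load)
serialized = json.dumps(id_to_job, indent=4)
loaded = json.loads(serialized)

# After loading, keys are strings and values are lists
print(list(loaded.keys()))
print(type(loaded["0"]))

# Convert keys back to int and values back to tuples
restored = {int(key): tuple(value) for key, value in loaded.items()}
print(restored == id_to_job)
```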