Module Description:
-------------------
Compare the performance of the current model (Gemma 2B-it) with the Llama 3.2 1B-it model (a recent lightweight model developed by Meta), and introduce the torch.compile setup for faster inference.

Ownership:
----------
Project: Leveraging Artificial intelligence for Skills Extraction and Research (LAiSER)

Owner:  

        George Washington University Institute of Public Policy
        Program on Skills, Credentials and Workforce Policy
        Media and Public Affairs Building
        805 21st Street NW
        Washington, DC 20052
        PSCWP@gwu.edu
        https://gwipp.gwu.edu/program-skills-credentials-workforce-policy-pscwp


License:
--------
Copyright 2024 George Washington University Institute of Public Policy

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files
(the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify,
merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Revision History:
-----------------
| Rev No. | Date | Author | Description |
|---|---|---|---|
| [1.0.0] | 08/13/2025 | Phanindra Kumar K. | Initial Version |

### Install the Python package from TestPyPI

In [7]:
!python3 -m pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple laiser[gpu]==0.2.43 -q


### Check package installations and versions

In [8]:
!pip show vllm

Name: vllm
Version: 0.10.0
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page: https://github.com/vllm-project/vllm
Author: vLLM Team
Author-email: 
License: 
Location: /usr/local/lib/python3.11/dist-packages
Requires: aiohttp, blake3, cachetools, cbor2, cloudpickle, compressed-tensors, depyf, diskcache, einops, fastapi, filelock, gguf, huggingface-hub, lark, llguidance, lm-format-enforcer, mistral_common, msgspec, ninja, numba, numpy, openai, opencv-python-headless, outlines_core, partial-json-parser, pillow, prometheus-fastapi-instrumentator, prometheus_client, protobuf, psutil, py-cpuinfo, pybase64, pydantic, python-json-logger, pyyaml, pyzmq, ray, regex, requests, scipy, sentencepiece, tiktoken, tokenizers, torch, torchaudio, torchvision, tqdm, transformers, typing_extensions, watchfiles, xformers, xgrammar
Required-by: 


In [9]:
!pip show laiser

Name: laiser
Version: 0.2.43
Summary: Leveraging Artificial Intelligence for Skills Extraction and Research
Home-page: https://github.com/LAiSER-Software/extract-module
Author: 
Author-email: LAiSER Team <PSCWP@gwu.edu>
License: 
Location: /usr/local/lib/python3.11/dist-packages
Requires: faiss-cpu, google-generativeai, ipython, numpy, pandas, psutil, requests, sentence-transformers, skillNer, spacy, torch, tqdm, transformers
Required-by: 
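The same check can be done from Python, which is handy when a later cell should fail fast on a version mismatch. The sketch below uses only the standard library (`importlib.metadata`); the helper names `installed_version` and `check_versions` are illustrative and not part of LAiSER, and the expected versions are simply the ones reported by `pip show` above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected):
    """Map each package name to a tuple (installed, expected, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (found, wanted, found == wanted)
    return report


if __name__ == "__main__":
    # Expected versions copied from the `pip show` output in this notebook.
    expected = {"vllm": "0.10.0", "laiser": "0.2.43"}
    for name, (found, wanted, ok) in check_versions(expected).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={found} expected={wanted} [{status}]")
```

A missing package is reported as `installed=None` rather than raising, so the check can run before the install cell as well.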


The following code block tests:
- Skills extraction using the legacy pipeline on a GPU.
- Skills extraction using the refactored pipeline.
- Skills extraction support for multiple LLMs, the Gemini API, and CPU-only extraction.

In [None]:
#!/usr/bin/env python3
"""
Test script to verify the refactored skill extractor works with different scenarios
"""

import torch
import pandas as pd
import sys
import os
import argparse

from laiser.skill_extractor import Skill_Extractor
# Import the refactored skill extractor
from laiser.skill_extractor_refactored import SkillExtractorRefactored

# test legacy skill extractor
def test_legacy_skill_extraction():
    """Test the legacy skill extraction functionality"""
    print("Testing legacy skill extraction...")
    # Use the GPU only when CUDA is actually available
    use_gpu = torch.cuda.is_available()

    # based on the above arguments, to run this script:
    # python main.py --HF_TOKEN <your_hf_token> --AI_MODEL_ID <your_model_id> --use_gpu True --batch_size 32

    print('\n\nInitializing the Skill Extractor...')
    se = Skill_Extractor(AI_MODEL_ID='TheBloke/Mistral-7B-Instruct-v0.1-AWQ', use_gpu=use_gpu)
    print('The Skill Extractor has been initialized successfully!\n')

    # Skill extraction from jobs data
    print('\n\nLoading a sample dataset of 50 jobs...')
    job_sample = pd.read_csv('https://raw.githubusercontent.com/LAiSER-Software/datasets/refs/heads/master/jobs-data/linkedin_jobs_sample_36rows.csv')
    print('The sample jobs dataset has been loaded successfully!\n')

    job_sample = job_sample[['description', 'job_id']]
    job_sample = job_sample[1:3]
    print('The sample dataset has been filtered successfully!\n')
    print('Head of the sample:\n', job_sample.head())

    output = se.extractor(job_sample, 'job_id', text_columns=['description'])
    print('(LEGACY) The skills have been extracted from jobs data successfully...\n')

    # Save the extracted skills to a CSV file
    print(output)
    file_name = f'extracted_skills_for_{len(job_sample)}Jobs.csv'
    output.to_csv(file_name, index=False)
    print('(LEGACY) The extracted skills have been saved to the file named:', file_name)

    # Skill extraction from syllabi data
    print('\n\nLoading a sample dataset of 50 syllabi...')
    syllabi_sample = pd.read_csv('https://raw.githubusercontent.com/LAiSER-Software/datasets/refs/heads/master/syllabi-data/preprocessed_50_opensyllabus_syllabi_data.csv')
    print('The sample syllabi dataset has been loaded successfully!\n')

    syllabi_sample = syllabi_sample[['id', 'description', 'learning_outcomes']]
    syllabi_sample = syllabi_sample[1:3]
    print('The sample dataset has been filtered successfully!\n')
    print('Head of the sample:\n', syllabi_sample.head())

    output = se.extractor(syllabi_sample, 'id', text_columns=['description', 'learning_outcomes'], input_type='syllabus')
    print('The skills have been extracted from syllabi data successfully...\n')

    # Save the extracted skills to a CSV file
    print(output)
    file_name = f'extracted_skills_for_{len(syllabi_sample)}Syllabi.csv'
    output.to_csv(file_name, index=False)
    print('The extracted skills have been saved to the file named:', file_name)

    # Report success so the caller prints a meaningful result instead of None
    return True

def test_skill_extraction():
    """Test the refactored skill extraction functionality"""
    print("Testing refactored skill extraction...")

    try:

        # Test with the Gemini API (runs without a local GPU)
        print("\n1. Testing with GEMINI API...")
        extractor = SkillExtractorRefactored(
            model_id="gemini",
            api_key="<YOUR_GEMINI_API_KEY>",
            use_gpu=False
        )

        # Load job sample data
        job_sample = pd.read_csv('https://raw.githubusercontent.com/LAiSER-Software/datasets/refs/heads/master/jobs-data/linkedin_jobs_sample_36rows.csv')
        sample_data = job_sample[['description', 'job_id']].head(3)

        # Test basic extraction with single row
        print("\n2. Testing basic skill extraction with single row...")
        try:
            single_row_data = sample_data.iloc[0]  # Get first row as Series
            basic_skills = extractor.extract_skills(single_row_data, method="basic", input_type='job_desc', id_column="job_id")
            print(f"   Basic skills result: {basic_skills}")
            print(f"   Type: {type(basic_skills)}")

            # Save basic skills to CSV
            if basic_skills:
                basic_df = pd.DataFrame({
                    'job_id': [single_row_data['job_id']] * len(basic_skills),
                    'skill': basic_skills
                })
                basic_file_name = 'refactored_basic_skills_single_job.csv'
                basic_df.to_csv(basic_file_name, index=False)
                print(f"   (REFACTORED) Basic skills saved to: {basic_file_name}")
            else:
                print("   No basic skills to save")

        except Exception as e:
            print(f"   Basic extraction failed: {e}")

        # Test KSA extraction with single row
        print("\n3. Testing KSA skill extraction with single row...")
        try:
            single_row_data = sample_data.iloc[0]  # Get first row as Series
            ksa_skills = extractor.extract_skills(single_row_data, method="ksa", input_type='job_desc', id_column="job_id")
            print(f"   KSA skills result: {ksa_skills}")
            print(f"   Type: {type(ksa_skills)}")
            if isinstance(ksa_skills, list):
                print(f"   Number of skills: {len(ksa_skills)}")

                # Save KSA skills to CSV
                if ksa_skills:
                    ksa_df = pd.DataFrame(ksa_skills)
                    ksa_file_name = 'refactored_ksa_skills_single_job.csv'
                    ksa_df.to_csv(ksa_file_name, index=False)
                    print(f"   (REFACTORED) KSA skills saved to: {ksa_file_name}")
                else:
                    print("   No KSA skills to save")

        except Exception as e:
            print(f"   KSA extraction failed: {e}")

        # Test extract_and_align for multiple rows (correct method for DataFrames)
        print("\n4. Testing extract_and_align for multiple rows...")
        try:
            aligned_skills = extractor.extract_and_align(
                sample_data,
                id_column="job_id",
                text_columns=["description"],
                input_type='job_desc'
            )
            print(f"   Aligned skills result shape: {aligned_skills.shape}")
            print(f"   Aligned skills columns: {aligned_skills.columns.tolist()}")
            print(f"   Type: {type(aligned_skills)}")

            # Save the aligned skills to CSV file
            if not aligned_skills.empty:
                aligned_file_name = f'refactored_aligned_skills_for_{len(sample_data)}Jobs.csv'
                aligned_skills.to_csv(aligned_file_name, index=False)
                print(f"   (REFACTORED) Aligned skills saved to: {aligned_file_name}")
            else:
                print("   No aligned skills to save")

        except Exception as e:
            print(f"   Extract and align failed: {e}")

        # Test with pandas Series (your original use case)
        print("\n5. Testing with pandas Series...")
        try:
            series_data = sample_data.iloc[0]  # Get single row as Series
            print(f"   Series data type: {type(series_data)}")
            print(f"   Series keys: {series_data.index.tolist()}")

            series_skills = extractor.extract_skills(series_data, method="ksa", input_type='job_desc', id_column="job_id")
            print(f"   Series KSA skills result: {series_skills}")
            print(f"   Type: {type(series_skills)}")
        except Exception as e:
            print(f"   Series extraction failed: {e}")

        # Test with Gemini API (if API key available)
        print("\n6. Testing with Gemini API...")
        try:
            gemini_extractor = SkillExtractorRefactored(
                model_id="gemini",
                api_key="<YOUR_GEMINI_API_KEY>",
                use_gpu=False
            )
            print("   Gemini extractor initialized")
        except Exception as e:
            print(f"   Gemini initialization: {e}")

        # Test comprehensive extraction and save like legacy pipeline
        print("\n7. Testing comprehensive extraction (like legacy pipeline)...")
        try:
            # Process all sample data with extract_and_align
            comprehensive_skills = extractor.extract_and_align(
                sample_data,
                id_column="job_id",
                text_columns=["description"],
                input_type='job_desc'
            )

            print("   Comprehensive extraction completed")
            print(f"   Shape: {comprehensive_skills.shape}")

            # Save comprehensive results to CSV (similar to legacy format)
            if not comprehensive_skills.empty:
                comprehensive_file_name = f'refactored_extracted_skills_for_{len(sample_data)}Jobs.csv'
                comprehensive_skills.to_csv(comprehensive_file_name, index=False)
                print(f"   (REFACTORED) The extracted skills have been saved to: {comprehensive_file_name}")
                print(f"   Columns in output: {comprehensive_skills.columns.tolist()}")
            else:
                print("   No comprehensive skills extracted")

        except Exception as e:
            print(f"   Comprehensive extraction failed: {e}")

        print("\nAll tests completed!")

    except Exception as e:
        print(f"\n❌ Test failed with error: {e}")
        import traceback
        traceback.print_exc()
        return False

    return True

if __name__ == "__main__":
    legacy_check = test_legacy_skill_extraction()
    print("Legacy skill extraction test :", legacy_check)
    success = test_skill_extraction()
    sys.exit(0 if success else 1)


INFO 08-13 15:36:17 [__init__.py:235] Automatically detected platform cuda.




Testing legacy skill extraction...


Initializing the Skill Extractor...
Loading ESCO skill taxonomy data...
Loading FAISS index for ESCO skills...
FAISS index for ESCO skills loaded successfully.
Found 'en_core_web_lg' model. Loading...
GPU is available. Using GPU for Large Language model initialization...


config.json:   0%|          | 0.00/757 [00:00<?, ?B/s]

INFO 08-13 15:37:06 [config.py:1604] Using max model len 32768
INFO 08-13 15:37:08 [llm_engine.py:228] Initializing a V0 LLM engine (v0.10.0) with config: model='TheBloke/Mistral-7B-Instruct-v0.1-AWQ', speculative_config=None, tokenizer='TheBloke/Mistral-7B-Instruct-v0.1-AWQ', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config={}, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=32768, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=awq, enforce_eager=False, kv_cache_dtype=auto,  device_config=cuda, decoding_config=DecodingConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=TheBlo

tokenizer_config.json:   0%|          | 0.00/962 [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

INFO 08-13 15:37:13 [cuda.py:346] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
INFO 08-13 15:37:13 [cuda.py:395] Using XFormers backend.
INFO 08-13 15:37:14 [parallel_state.py:1102] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
INFO 08-13 15:37:14 [model_runner.py:1083] Starting to load model TheBloke/Mistral-7B-Instruct-v0.1-AWQ...
INFO 08-13 15:37:15 [weight_utils.py:296] Using model weights format ['*.safetensors']


model.safetensors:   0%|          | 0.00/4.15G [00:00<?, ?B/s]

INFO 08-13 15:38:54 [weight_utils.py:312] Time spent downloading weights for TheBloke/Mistral-7B-Instruct-v0.1-AWQ: 98.797442 seconds
INFO 08-13 15:38:55 [weight_utils.py:349] No model.safetensors.index.json found in remote.


Loading safetensors checkpoint shards:   0% Completed | 0/1 [00:00<?, ?it/s]


INFO 08-13 15:39:07 [default_loader.py:262] Loading weights took 12.13 seconds
INFO 08-13 15:39:08 [model_runner.py:1115] Model loading took 3.8812 GiB and 112.126870 seconds
INFO 08-13 15:39:33 [worker.py:295] Memory profiling takes 24.61 seconds
INFO 08-13 15:39:33 [worker.py:295] the current vLLM instance can use total_gpu_memory (14.74GiB) x gpu_memory_utilization (0.90) = 13.27GiB
INFO 08-13 15:39:33 [worker.py:295] model weights take 3.88GiB; non_torch_memory takes 0.05GiB; PyTorch activation peak memory takes 3.38GiB; the rest of the memory reserved for KV Cache is 5.96GiB.
INFO 08-13 15:39:35 [executor_base.py:113] # cuda blocks: 3049, # CPU blocks: 2048
INFO 08-13 15:39:35 [executor_base.py:118] Maximum concurrency for 32768 tokens per request: 1.49x
INFO 08-13 15:39:40 [model_runner.py:1385] Capturing cudagraphs for decoding. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' i

Capturing CUDA graph shapes:   0%|          | 0/35 [00:00<?, ?it/s]

INFO 08-13 15:40:49 [model_runner.py:1537] Graph capturing finished in 69 secs, took 0.48 GiB
INFO 08-13 15:40:49 [llm_engine.py:424] init engine (profile, create kv cache, warmup model) took 100.70 seconds
[INFO] Successfully loaded vLLM model: TheBloke/Mistral-7B-Instruct-v0.1-AWQ with dtype: float16 with awq quantization
The Skill Extractor has been initialized successfully!
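
The memory figures in the vLLM log above can be reproduced with simple arithmetic. The sketch below is not vLLM code: the function names are illustrative, and the KV-cache block size of 16 tokens is an assumption (the log does not print it, but it is consistent with the reported `1.49x`).

```python
def kv_cache_budget_gib(total_gib, utilization, weights_gib, non_torch_gib, activation_peak_gib):
    """Return (usable GiB, GiB left for the KV cache) from the logged memory figures."""
    usable = total_gib * utilization
    kv_cache = usable - weights_gib - non_torch_gib - activation_peak_gib
    return usable, kv_cache


def max_concurrency(num_gpu_blocks, max_model_len, block_size=16):
    """How many full-length requests fit in the KV cache (block_size is assumed)."""
    return num_gpu_blocks * block_size / max_model_len


# Figures copied from the log lines above.
usable, kv_cache = kv_cache_budget_gib(14.74, 0.90, 3.88, 0.05, 3.38)
print(f"usable: {usable:.2f} GiB, KV cache: {kv_cache:.2f} GiB")    # usable: 13.27 GiB, KV cache: 5.96 GiB
print(f"max concurrency: {max_concurrency(3049, 32768):.2f}x")      # max concurrency: 1.49x
```

With a 32768-token context, the cache holds roughly one and a half full-length requests, so lowering the maximum model length is the most direct way to raise concurrency on this GPU.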



Loading a sample dataset of 50 jobs...
The sample jobs dataset has been loaded successfully!

The sample dataset has been filtered successfully!

Head of the sample:
                                          description  job_id
1  \nJob description\nAre you interested in worki...       2
2  \nJob description\nWeb Developer (Programmer)\...       3


Adding requests:   0%|          | 0/1 [00:00<?, ?it/s]

Processed prompts:   0%|          | 0/1 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]

[get_ksa_details] No JSON match found in response for skill 'cognitive computing'
[get_ksa_details] No JSON match found in response for skill 'deep learning'
[get_ksa_details] No JSON match found in response for skill 'machine learning'
[get_ksa_details] Generation/parsing error for skill 'work with e-services available to citizens': Extra data: line 13 column 1 (char 362)
[get_ksa_details] No JSON match found in response for skill 'automate cloud tasks'
[get_ksa_details] No JSON match found in response for skill 'write job descriptions'
[get_ksa_details] No JSON match found in response for skill 'cloud technologies'
[get_ksa_details] Generation/parsing error for skill 'DevOps': Extra data: line 13 column 1 (char 362)
[get_ksa_details] No JSON match found in response for skill 'principles of artificial intelligence'
[get_ksa_details] Generation/parsing error for skill 'web programming': Expecting property name enclosed in double quotes: line 4 column 3 (char 186)
[get_ksa_details] No JSON match found in response for skill 'web services'
[get_ksa_details] Generation/parsing error for skill 'World Wide Web Consortium standards': Expecting ',' delimiter: line 5 column 52 (char 154)
[get_ksa_details] Generation/parsing error for skill 'computer programming': Expecting ',' delimiter: line 9 column 5 (char 162)
[get_ksa_details] Generation/parsing error for skill 'IBM WebSphere': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] No JSON match found in response for skill 'write job descriptions'
[get_ksa_details] Generation/parsing error for skill 'data engineering': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'provide petroleum engineering support': Extra data: line 13 column 1 (char 197)
[get_ksa_details] No JSON match found in response for skill 'marine engineering'
[get_ksa_details] Generation/parsing error for skill 'computer engineering': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'apply basic programming skills': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'instruct on technical shore-based operations': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'web analytics': Expecting ',' delimiter: line 9 column 5 (char 131)
[get_ksa_details] Generation/parsing error for skill 'DevOps': Expecting ',' delimiter: line 5 column 52 (char 154)
[get_ksa_details] Generation/parsing error for skill 'computer science': Expecting ',' delimiter: line 9 column 5 (char 149)
[get_ksa_details] No JSON match found in response for skill 'work with e-services available to citizens'
[get_ksa_details] No JSON match found in response for skill 'provide quality assurance for meteorological services+H40'
[get_ksa_details] Generation/parsing error for skill 'verify qualifications of water transport crew': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'teach computer science': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'provide technical expertise': Expecting ',' delimiter: line 9 column 5 (char 158)
[get_ksa_details] Generation/parsing error for skill 'provide assistance with job search': Expecting ',' delimiter: line 9 column 5 (char 158)

(LEGACY) The skills have been extracted from jobs data successfully...
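
Several of the `[get_ksa_details]` messages above report `Extra data`, which is the error `json.loads` raises when valid JSON is followed by more text, and others report that no JSON was found at all. The implementation of `get_ksa_details` is not shown in this notebook, so the following is only a sketch of one tolerant parsing strategy, not LAiSER's code: the hypothetical helper `extract_first_json_object` scans for the first decodable JSON object with `json.JSONDecoder.raw_decode` and ignores anything after it.

```python
import json


def extract_first_json_object(text):
    """Return the first decodable JSON object (dict) found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # tolerates trailing text, unlike json.loads.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Try the next opening brace in the response.
        start = text.find("{", start + 1)
    return None


if __name__ == "__main__":
    # Made-up model response: prose, a JSON object, then extra trailing output.
    response = 'Here is the result:\n{"skill": "DevOps", "level": 3}\n\n{"note": "extra"}'
    print(extract_first_json_object(response))
```

This only addresses trailing or leading text around a well-formed object. Responses that are themselves malformed, such as the `Expecting ',' delimiter` cases, would still return `None` and need a retry or a stricter prompt.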

    Research ID                                        Description  \
0             2  \nJob description\nAre you interested in worki...   
1             2  \nJob description\nAre you interested in worki...   
2             2  \nJob description\nAre you interested in worki...   
3             2  \nJob description\nAre you interested in worki...   
4             2  \nJob description\nAre you interested in worki...   
5             2  \nJob description\nAre you interested in worki...   
6             2  \nJob description\nAre you interested in worki...   
7             2  \nJob description\nAre you interested in worki...   
8             2  \nJob description\nAre you interested in worki...   
9             2  \nJob description\nAre you interested in worki...   
10            2  \nJob description\nAre you interested in worki...   
11            2  \nJob description\nAre you interested in worki...   
12            2  \

Adding requests:   0%|          | 0/1 [00:00<?, ?it/s]

Processed prompts:   0%|          | 0/1 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]

[get_ksa_details] Generation/parsing error for skill 'statistical quality control': Extra data: line 14 column 1 (char 159)

[get_ksa_details] Generation/parsing error for skill 'conduct quality control analysis': Extra data: line 14 column 1 (char 159)

[get_ksa_details] Generation/parsing error for skill 'oversee quality control': Extra data: line 14 column 1 (char 159)

[get_ksa_details] Generation/parsing error for skill 'record production data for quality control': Extra data: line 14 column 1 (char 191)

[get_ksa_details] Generation/parsing error for skill 'total quality control': Extra data: line 14 column 1 (char 159)

[get_ksa_details] Generation/parsing error for skill 'define manufacturing quality criteria': Extra data: line 10 column 1 (char 228)

[get_ksa_details] Generation/parsing error for skill 'revise quality control systems documentation': Expecting ',' delimiter: line 10 column 1 (char 152)

[get_ksa_details] Generation/parsing error for skill 'exert quality control to processing food': Extra data: line 14 column 1 (char 172)

[get_ksa_details] Generation/parsing error for skill 'carry out quality control in microbiology laboratories': Extra data: line 14 column 1 (char 172)

[get_ksa_details] Generation/parsing error for skill 'check quality of products in textile production line': Extra data: line 14 column 1 (char 172)

[get_ksa_details] Generation/parsing error for skill 'perform failure analysis of production process': Extra data: line 14 column 1 (char 156)

[get_ksa_details] Generation/parsing error for skill 'manage quality': Extra data: line 14 column 1 (char 159)

[get_ksa_details] No JSON match found in response for skill 'data ethics'

[get_ksa_details] No JSON match found in response for skill 'business analytics'

[get_ksa_details] Generation/parsing error for skill 'use analytics for commercial purposes': Extra data: line 6 column 1 (char 140)

[get_ksa_details] No JSON match found in response for skill 'business intelligence'

[get_ksa_details] No JSON match found in response for skill 'data analytics'

[get_ksa_details] No JSON match found in response for skill 'analyse big data'

[get_ksa_details] No JSON match found in response for skill 'marketing analytics'

[get_ksa_details] No JSON match found in response for skill 'perform data analysis'

[get_ksa_details] No JSON match found in response for skill 'data engineering'

[get_ksa_details] Generation/parsing error for skill 'GDPR': Extra data: line 1 column 111 (char 110)

[get_ksa_details] No JSON match found in response for skill 'data warehouse'

[get_ksa_details] No JSON match found in response for skill 'web analytics'

[get_ksa_details] No JSON match found in response for skill 'healthcare analytics'

[get_ksa_details] No JSON match found in response for skill 'collect customer data'

[get_ksa_details] No JSON match found in response for skill 'develop data processing applications'

[get_ksa_details] No JSON match found in response for skill 'collect financial data'

[get_ksa_details] No JSON match found in response for skill 'gather data'
The skills have been extracted from syllabi data successfully...

      Research ID                                        Description  \
0    661424964946  Course Description: you will be provided with ...   
1    661424964946  Course Description: you will be provided with ...   
2    661424964946  Course Description: you will be provided with ...   
3    661424964946  Course Description: you will be provided with ...   
4    661424964946  Course Description: you will be provided with ...   
5    661424964946  Course Description: you will be provided with ...   
6    661424964946  Course Description: you will be provided with ...   
7    661424964946  Course Description: you will be provided with ...   
8    661424964946  Course Description: you will be provided with ...   
9    661424964946  Course Description: you will be provided with ...   
10   661424964946  Course Description: you will be provided with ...

SystemExit: 0

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
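
A note on the `Extra data` messages logged by `get_ksa_details` in the output above: Python's `json.loads` raises this error when a valid JSON value is followed by further non-whitespace text, which is a common pattern when a model appends commentary or a second object after its answer. The sketch below shows one possible way to tolerate such trailing text using the standard library's `json.JSONDecoder.raw_decode`. It is only an illustration: the helper name `extract_first_json` and the sample response string are hypothetical, and nothing here reflects how `laiser` parses responses internally.

```python
import json


def extract_first_json(text):
    """Return the first JSON object or array found in `text`, or None.

    Hypothetical helper for illustration; not part of the laiser package.
    """
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON from this position; try the next candidate.
            continue
        return value
    return None


# Made-up response text (not actual model output): valid JSON plus trailing prose.
raw = '{"skill": "GDPR", "knowledge": ["data protection law"]}\nSome extra commentary.'

try:
    json.loads(raw)
except json.JSONDecodeError as err:
    print("json.loads failed:", err.msg)

print(extract_first_json(raw))
```

A helper like this returns `None` when the text contains no parseable JSON at all, so the caller can still log that case separately, as the `No JSON match found` messages above do.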
