## Ingest Data

We start by ingesting the ESCO skills data. The preferred labels and descriptions are embedded with the `all-MiniLM-L6-v2` sentence embedding model and stored in a local `chroma` database.

In [1]:
!./scripts/load_esco.py


>> from langchain.vectorstores import Chroma

with new imports of:

>> from langchain_community.vectorstores import Chroma
You can use the langchain cli to **automatically** upgrade many imports. Please see documentation here <https://python.langchain.com/docs/versions/v0_2/>
  from langchain.vectorstores import Chroma

>> from langchain.embeddings import HuggingFaceEmbeddings

with new imports of:

>> from langchain_community.embeddings import HuggingFaceEmbeddings
You can use the langchain cli to **automatically** upgrade many imports. Please see documentation here <https://python.langchain.com/docs/versions/v0_2/>
  from langchain.embeddings import HuggingFaceEmbeddings
  embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", show_progress=True)
Batches: 100%|████████████████████████████████| 171/171 [00:50<00:00,  3.41it/s]
Batches: 100%|████████████████████████████████| 171/171 [00:52<00:00,  3.28it/s]
Batches: 100%|██████████████████████████████████| 95/95 [00:29
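
The ingestion script itself is not shown here. As a rough sketch of the idea, assuming the ESCO export is a CSV with `conceptUri`, `preferredLabel`, and `description` columns (the keys that appear in the saved profile later on), each skill could become one text to embed plus a metadata record. The file name `skills_en.csv` and both function names are hypothetical, and the actual `load_esco.py` may work differently.

```python
import csv
from typing import Iterable


def build_esco_documents(rows: Iterable[dict]) -> tuple[list[str], list[dict]]:
    """Turn ESCO skill rows into (texts, metadatas) ready for embedding."""
    texts, metadatas = [], []
    for row in rows:
        label = (row.get("preferredLabel") or "").strip()
        if not label:
            continue  # skip rows without a usable label
        description = (row.get("description") or "").strip()
        # Embed label and description together so both contribute to the vector
        texts.append(f"{label}: {description}" if description else label)
        metadatas.append({
            "conceptUri": row.get("conceptUri") or "",
            "preferredLabel": label,
            "description": description,
        })
    return texts, metadatas


def store_esco(csv_path: str = "skills_en.csv",  # hypothetical file name
               persist_directory: str = "data/esco_chroma") -> None:
    """Embed the skills and persist them in a local Chroma store."""
    # Imported lazily so the helper above works without these packages
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import Chroma

    with open(csv_path, newline="", encoding="utf-8") as f:
        texts, metadatas = build_esco_documents(csv.DictReader(f))

    embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    Chroma.from_texts(
        texts,
        embedding_model,
        metadatas=metadatas,
        persist_directory=persist_directory,
    )
```

Keeping the row-to-text step separate from the embedding step makes it possible to inspect what will be embedded before paying the cost of running the model.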

The next step is to scrape course data and match it to ESCO skills. The scraper is not implemented yet (open to collaboration!), so for now I asked an LLM to generate a sample course catalog, which you can find under `data/course_catalog_esco.json`.

With the catalog in place, the course information is embedded and stored in a second `chroma` database.

In [6]:
!./scripts/load_courses.py


Processing courses: 100%|████████████████████| 30/30 [00:00<00:00, 94608.36it/s]
Batches: 100%|████████████████████████████████████| 1/1 [00:00<00:00,  7.09it/s]
  vectorstore.persist()
✅ Stored 30 courses to ChromaDB at 'da
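
Both stores serve the same purpose: finding the items whose embeddings lie closest to a query embedding. The toy example below illustrates that idea with hand-written 3-dimensional vectors and cosine similarity. It is only an illustration: the vectors and course names are made up, real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and Chroma's own index and distance metric may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], items: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k items most similar to the query vector."""
    ranked = sorted(
        items,
        key=lambda name: cosine_similarity(query, items[name]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors standing in for course embeddings
courses = {
    "Intro to React": [0.9, 0.1, 0.0],
    "Cloud Architecture Basics": [0.1, 0.9, 0.2],
    "Team Leadership": [0.0, 0.2, 0.9],
}
query = [0.2, 0.8, 0.1]  # stands in for an embedded "cloud services" query
print(top_k(query, courses, k=2))
```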

## Build Your Profile

Now we're ready to build your profile! There is a script that does this from the CLI, but here we'll do it programmatically.

In [9]:
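from coachable_course_agent.agent_runner import create_profile_building_agent
# Import from langchain_community, as the deprecation notices above recommend
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

# Use the same embedding model that was used to ingest the ESCO skills
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Open the persisted ESCO skills store created during ingestion
vectorstore = Chroma(
    persist_directory="data/esco_chroma",
    embedding_function=embedding_model,
)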
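
As a quick sanity check that the store loaded correctly, it can be queried directly. The sketch below assumes the ingested documents carry `preferredLabel` and `conceptUri` in their metadata, which is an assumption about `load_esco.py` rather than something shown here; `format_matches` and `preview_matches` are hypothetical helpers.

```python
def format_matches(docs) -> list[str]:
    """Render search hits as 'label <uri>' lines.

    Expects objects with a `metadata` dict and a `page_content` string,
    like the documents returned by a LangChain vector store.
    """
    lines = []
    for doc in docs:
        meta = doc.metadata or {}
        # Fall back to the raw text if the assumed metadata key is absent
        label = meta.get("preferredLabel") or doc.page_content
        uri = meta.get("conceptUri", "unknown URI")
        lines.append(f"{label} <{uri}>")
    return lines


def preview_matches(vectorstore, query: str, k: int = 3) -> None:
    """Print the k skills closest to a free-text query."""
    for line in format_matches(vectorstore.similarity_search(query, k=k)):
        print(line)


# Usage in the notebook, once `vectorstore` exists:
# preview_matches(vectorstore, "building web applications with React")
```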


In [25]:
# Get LinkedIn-style bio

user_id = "test_user"
linkedin_blurb = """
I am a software engineer with a passion for developing innovative programs that expedite the efficiency and effectiveness of organizational success. I have a strong background in Python, JavaScript, and web development, and I am always eager to learn new technologies and improve my skills. I thrive in collaborative environments and enjoy working on challenging projects that require creative problem-solving.
I have experience in full-stack development, including front-end frameworks like React and back-end technologies such as Node.js and Express. I am also familiar with database management systems like PostgreSQL and MongoDB
and have worked with cloud platforms such as AWS and Azure.
I am looking for opportunities to become a team lead or a senior developer where I can contribute to impactful projects and mentor junior developers. I am particularly interested in roles that involve building scalable web applications and improving user experiences.
"""

In [26]:
# Format prompt
prompt = f"My user ID is {user_id}. Here is my bio: {linkedin_blurb}"

In [27]:
# Create and run the agent
agent = create_profile_building_agent(vectorstore, user_id)
result = agent.invoke({"input": prompt})

In [28]:
result_text = result["output"]
print(f"Generated profile text: {result_text}")

Generated profile text: Your user profile has been successfully saved. Your career headline is "Software Engineer with a passion for innovative program development", and your skills include Python, JavaScript, Web Development, React, Node.js, Express, PostgreSQL, MongoDB, AWS, and Azure. Your goal is to become a team lead or senior developer and contribute to impactful projects, mentoring junior developers and building scalable web applications. Additionally, it has been inferred that you may be missing the skill "develop with cloud services".


For reference, the user profile is saved under `data/memory/{user_id}.json`.
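
To work with the saved profile in code rather than reading raw JSON, a small helper can pull out just the skill labels. This is a sketch that assumes the profile holds `known_skills` and `missing_skills` lists whose entries carry a `preferredLabel`, as in the profile printed in this section; `summarize_profile` and `load_profile` are hypothetical names, not part of `coachable_course_agent`.

```python
import json
from pathlib import Path


def summarize_profile(profile: dict) -> dict:
    """Reduce a profile to its goal and the labels of known/missing skills."""
    def labels(key: str) -> list[str]:
        return [skill.get("preferredLabel", "?") for skill in profile.get(key, [])]

    return {
        "goal": profile.get("goal", ""),
        "known": labels("known_skills"),
        "missing": labels("missing_skills"),
    }


def load_profile(user_id: str, memory_dir: str = "data/memory") -> dict:
    """Read a saved profile from `<memory_dir>/<user_id>.json`."""
    path = Path(memory_dir) / f"{user_id}.json"
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


# Usage in the notebook:
# summarize_profile(load_profile("test_user"))
```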

In [41]:
import json

# Load the saved profile and pretty-print it with sorted keys
with open(f"data/memory/{user_id}.json", "r") as f:
    data = json.load(f)

print(json.dumps(data, indent=4, sort_keys=True))


{
    "feedback_log": [],
    "goal": "Become a team lead or senior developer and contribute to impactful projects, mentoring junior developers and building scalable web applications",
    "known_skills": [
        {
            "conceptUri": "http://data.europa.eu/esco/skill/ccd0a1d9-afda-43d9-b901-96344886e14d",
            "description": "N/A",
            "preferredLabel": "Python (computer programming)"
        },
        {
            "conceptUri": "http://data.europa.eu/esco/skill/9b9de2a4-d8af-4a7b-933a-a8334ae60067",
            "description": "N/A",
            "preferredLabel": "JavaScript Framework"
        },
        {
            "conceptUri": "http://data.europa.eu/esco/skill/11430d93-c835-48ed-8e70-285fa69c9ae6",
            "description": "N/A",
            "preferredLabel": "design cloud architecture"
        }
    ],
    "missing_skills": [
        {
            "conceptUri": "http://data.europa.eu/esco/skill/6b643893-0a1f-4f6c-83a1-e7eef75849b9",
            "descri