In [1]:
job_position = """Senior Data Scientist (Machine Learning)

Company: TechNova Solutions
Location: San Francisco, CA

Job Description:

TechNova Solutions is seeking a Senior Data Scientist specializing in Machine Learning to join our dynamic analytics team. The ideal candidate will have a strong background in developing and deploying machine learning models, working with large datasets, and translating complex data into actionable insights.

Responsibilities:

Develop and implement advanced machine learning algorithms and models.
Analyze large, complex datasets to extract meaningful patterns and insights.
Collaborate with cross-functional teams to integrate predictive models into products.
Stay updated with the latest advancements in machine learning and data science.
Mentor junior data scientists and provide technical guidance.
Qualifications:

Master’s or Ph.D. in Data Science, Computer Science, Statistics, or a related field.
5+ years of experience in data science and machine learning.
Proficient in Python, R, and SQL.
Experience with deep learning frameworks (e.g., TensorFlow, PyTorch).
Strong problem-solving skills and attention to detail.
"""

In [3]:
job_1 = """
CV 1: Relevant
Name: Dr. Emily Carter
Contact Information:

Email: emily.carter@example.com
Phone: (555) 123-4567
Summary:

Senior Data Scientist with over 8 years of experience in machine learning and predictive analytics. Expertise in developing advanced algorithms and deploying scalable models in production environments.

Education:

Ph.D. in Computer Science, Stanford University (2014)
B.S. in Mathematics, University of California, Berkeley (2010)
Experience:

Senior Data Scientist, InnovateAI Labs (2016 – Present)
Led a team in developing machine learning models for natural language processing applications.
Implemented deep learning algorithms that improved prediction accuracy by 25%.
Collaborated with cross-functional teams to integrate models into cloud-based platforms.
Data Scientist, DataWave Analytics (2014 – 2016)
Developed predictive models for customer segmentation and churn analysis.
Analyzed large datasets using Hadoop and Spark frameworks.
Skills:

Programming Languages: Python, R, SQL
Machine Learning: TensorFlow, Keras, Scikit-Learn
Big Data Technologies: Hadoop, Spark
Data Visualization: Tableau, Matplotlib
"""

In [4]:
job_2 = """
CV 2: Relevant
Name: Michael Rodriguez
Contact Information:

Email: michael.rodriguez@example.com
Phone: (555) 234-5678
Summary:

Data Scientist with a strong background in machine learning and statistical modeling. Skilled in handling large datasets and translating data into actionable business insights.

Education:

M.S. in Data Science, Carnegie Mellon University (2013)
B.S. in Computer Science, University of Michigan (2011)
Experience:

Senior Data Scientist, Alpha Analytics (2017 – Present)
Developed machine learning models to optimize marketing strategies.
Reduced customer acquisition cost by 15% through predictive modeling.
Data Scientist, TechInsights (2013 – 2017)
Analyzed user behavior data to improve product features.
Implemented A/B testing frameworks to evaluate product changes.
Skills:

Programming Languages: Python, Java, SQL
Machine Learning: Scikit-Learn, XGBoost
Data Visualization: Seaborn, Plotly
Databases: MySQL, MongoDB
"""

In [5]:
job_3 = """
CV 3: Relevant
Name: Sarah Nguyen
Contact Information:

Email: sarah.nguyen@example.com
Phone: (555) 345-6789
Summary:

Data Scientist specializing in machine learning with 6 years of experience. Passionate about leveraging data to drive business solutions and improve product performance.

Education:

M.S. in Statistics, University of Washington (2014)
B.S. in Applied Mathematics, University of Texas at Austin (2012)
Experience:

Data Scientist, QuantumTech (2016 – Present)
Designed and implemented machine learning algorithms for financial forecasting.
Improved model efficiency by 20% through algorithm optimization.
Junior Data Scientist, DataCore Solutions (2014 – 2016)
Assisted in developing predictive models for supply chain optimization.
Conducted data cleaning and preprocessing on large datasets.
Skills:

Programming Languages: Python, R
Machine Learning Frameworks: PyTorch, Scikit-Learn
Statistical Analysis: SAS, SPSS
Cloud Platforms: AWS, Azure
"""

In [6]:
job_4 = """
CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual design and branding. Proficient in Adobe Creative Suite and passionate about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
"""

In [7]:
job_5 = """
CV 5: Not Relevant
Name: Jessica Miller
Contact Information:

Email: jessica.miller@example.com
Phone: (555) 567-8901
Summary:

Experienced Sales Manager with a strong track record in driving sales growth and building high-performing teams. Excellent communication and leadership skills.

Education:

B.A. in Business Administration, University of Southern California (2010)
Experience:

Sales Manager, Global Enterprises (2015 – Present)
Managed a sales team of 15 members, achieving a 20% increase in annual revenue.
Developed sales strategies that expanded customer base by 25%.
Sales Representative, Market Leaders Inc. (2010 – 2015)
Consistently exceeded sales targets and received the 'Top Salesperson' award in 2013.
Skills:

Sales Strategy and Planning
Team Leadership and Development
CRM Software: Salesforce, Zoho
Negotiation and Relationship Building
"""

In [8]:
import os
import dotenv

dotenv.load_dotenv()

# Setting environment variables

if "LLM_API_KEY" not in os.environ:
    os.environ["LLM_API_KEY"] = "YOUR KEY"

# Graph database provider: "kuzu" (used here), "neo4j" or "networkx"
os.environ["GRAPH_DATABASE_PROVIDER"] = "kuzu"
# Not needed if using kuzu
# os.environ["GRAPH_DATABASE_URL"]=""
# os.environ["GRAPH_DATABASE_USERNAME"]=""
# os.environ["GRAPH_DATABASE_PASSWORD"]=""

# "pgvector", "qdrant", "weaviate" or "lancedb"
os.environ["VECTOR_DB_PROVIDER"] = "lancedb"
# Not needed if using "lancedb" or "pgvector"
# os.environ["VECTOR_DB_URL"]=""
# os.environ["VECTOR_DB_KEY"]=""

# Relational Database provider "sqlite" or "postgres"
os.environ["DB_PROVIDER"] = "sqlite"

# Database name
os.environ["DB_NAME"] = "cognee_db"

# Postgres specific parameters (Only if Postgres or PGVector is used)
# os.environ["DB_HOST"]="127.0.0.1"
# os.environ["DB_PORT"]="5432"
# os.environ["DB_USERNAME"]="cognee"
# os.environ["DB_PASSWORD"]="cognee"
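The cell above falls back to the placeholder "YOUR KEY" when no key is set, which only fails later when the LLM is first called. A minimal sketch of an early check follows; the helper name `missing_settings` and the demo dictionary are hypothetical and not part of cognee.

```python
import os

PLACEHOLDER = "YOUR KEY"  # the fallback value used in the cell above


def missing_settings(required, environ=os.environ):
    """Return the names in `required` that are unset, empty, or still the placeholder."""
    return [
        name
        for name in required
        if not environ.get(name) or environ.get(name) == PLACEHOLDER
    ]


# Demo on a plain dict so the check runs without touching the real environment.
demo_env = {
    "LLM_API_KEY": "YOUR KEY",
    "GRAPH_DATABASE_PROVIDER": "kuzu",
    "VECTOR_DB_PROVIDER": "lancedb",
    "DB_PROVIDER": "sqlite",
    "DB_NAME": "cognee_db",
}
required = ["LLM_API_KEY", "GRAPH_DATABASE_PROVIDER", "VECTOR_DB_PROVIDER", "DB_PROVIDER", "DB_NAME"]
problems = missing_settings(required, demo_env)
print(problems)
```

In the notebook itself, calling `missing_settings(required)` with no second argument checks `os.environ` directly.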

In [9]:
# Reset the cognee system with the following command:

import cognee

await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
print(cognee.__version__)



2026-01-23T12:14:14.137227 [info] Log file created at: /Users/jpv/Bharath_J_P_V/Learning/Gen_ai/.venv/lib/python3.13/site-packages/logs/2026-01-23_17-44-13.log [cognee.shared.logging_utils] log_file=/Users/jpv/Bharath_J_P_V/Learning/Gen_ai/.venv/lib/python3.13/site-packages/logs/2026-01-23_17-44-13.log

2026-01-23T12:14:14.137623 [info] Logging initialized [cognee.shared.logging_utils] cognee_version=0.5.1 database_path=/Users/jpv/Bharath_J_P_V/Learning/Gen_ai/.venv/lib/python3.13/site-packages/cognee/.cognee_system/databases graph_database_name= os_info='Darwin 25.2.0 (Darwin Kernel Version 25.2.0: Tue Nov 18 21:09:40 PST 2025; root:xnu-12377.61.12~1/RELEASE_ARM64_T6000)' python_version=3.13.5 relational_config=cognee_db structlog_vers

0.5.1


In [10]:
import cognee

await cognee.add([job_1, job_2, job_3, job_4, job_5, job_position], "example")

User 0b926174-ad65-4ac6-9030-b007aa129322 has registered.



2026-01-23T12:14:25.114245 [info] Pipeline run started: `aabbc0e3-599a-5b55-b3bd-01e26773c1da` [run_tasks_with_telemetry()]
2026-01-23T12:14:25.558836 [info] Coroutine task started: `resolve_data_directories` [run_tasks_base]
2026-01-23T12:14:25.884400 [info] Coroutine task started: `ingest_data` [run_tasks_base]
2026-01-23T12:14:26.221970 [info] Pipeline run started: `aabbc0e3-599a-5b55-b3bd-01e26773c1da` [run_tasks_with_telemetry()]
2026-01-23T12:14:26.606551 [info] Coroutine task started: `resolve_data_directories` [run_tasks_base]
2026-01-23T12:14:26.912987 [info] Coroutine task started: `ingest_data` [run_tasks_base]
2026-01-23T12:14:27.240659

PipelineRunCompleted(status='PipelineRunCompleted', pipeline_run_id=UUID('9cdca79b-9822-51e5-b07f-a9b3a51e1e98'), dataset_id=UUID('a0c12278-fe16-5799-a21f-66be45593150'), dataset_name='example', payload=None, data_ingestion_info=[{'run_info': PipelineRunCompleted(status='PipelineRunCompleted', pipeline_run_id=UUID('9cdca79b-9822-51e5-b07f-a9b3a51e1e98'), dataset_id=UUID('a0c12278-fe16-5799-a21f-66be45593150'), dataset_name='example', payload=None, data_ingestion_info=None), 'data_id': UUID('81c7d9a5-f0f8-5735-a2d8-1ea381ffcd18')}, {'run_info': PipelineRunCompleted(status='PipelineRunCompleted', pipeline_run_id=UUID('9cdca79b-9822-51e5-b07f-a9b3a51e1e98'), dataset_id=UUID('a0c12278-fe16-5799-a21f-66be45593150'), dataset_name='example', payload=None, data_ingestion_info=None), 'data_id': UUID('0910b840-44be-5070-a052-50b4d3924a08')}, {'run_info': PipelineRunCompleted(status='PipelineRunCompleted', pipeline_run_id=UUID('9cdca79b-9822-51e5-b07f-a9b3a51e1e98'), dataset_id=UUID('a0c12278-fe1
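The CV strings passed to `cognee.add` all share the same "Name:" / "Email:" / "Phone:" layout, so the contact fields can be pulled out directly and kept as a reference when checking what the entity search returns. The sketch below is only an illustration: `extract_contact_fields` is a hypothetical helper, and `sample_cv` is a shortened copy of the third CV.

```python
import re

# Shortened stand-in for one of the CV strings (same field layout).
sample_cv = """
CV 3: Relevant
Name: Sarah Nguyen
Contact Information:

Email: sarah.nguyen@example.com
Phone: (555) 345-6789
"""


def extract_contact_fields(cv_text: str) -> dict:
    """Pull the 'Name', 'Email' and 'Phone' values out of a CV string."""
    fields = {}
    for key in ("Name", "Email", "Phone"):
        # Match "<key>: <value>" at the start of a line.
        match = re.search(rf"^{key}:\s*(.+)$", cv_text, flags=re.MULTILINE)
        fields[key.lower()] = match.group(1).strip() if match else None
    return fields


contact = extract_contact_fields(sample_cv)
print(contact)
```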

In [11]:
from cognee.shared.data_models import KnowledgeGraph
from cognee.modules.data.models import Dataset, Data
from cognee.modules.data.methods.get_dataset_data import get_dataset_data
from cognee.modules.cognify.config import get_cognify_config
from cognee.modules.pipelines.tasks.task import Task
from cognee.modules.pipelines import run_tasks
from cognee.modules.users.models import User
from cognee.tasks.documents import (
    classify_documents,
    extract_chunks_from_documents,
)
from cognee.infrastructure.llm import get_max_chunk_tokens
from cognee.tasks.graph import extract_graph_from_data
from cognee.tasks.storage import add_data_points
from cognee.tasks.summarization import summarize_text


async def run_cognify_pipeline(dataset: Dataset, user: User | None = None):
    """Run the cognify tasks over every document stored in `dataset`."""
    data_documents: list[Data] = await get_dataset_data(dataset_id=dataset.id)
    cognee_config = get_cognify_config()

    tasks = [
        Task(classify_documents),
        Task(
            extract_chunks_from_documents, max_chunk_size=get_max_chunk_tokens()
        ),  # Extract text chunks based on the document type.
        Task(
            extract_graph_from_data,
            graph_model=KnowledgeGraph,
            task_config={"batch_size": 10},
        ),  # Generate knowledge graphs from the document chunks.
        Task(
            summarize_text,
            summarization_model=cognee_config.summarization_model,
            task_config={"batch_size": 10},
        ),  # Summarize the extracted chunks.
        Task(add_data_points, task_config={"batch_size": 10}),  # Store the results.
    ]

    pipeline_run = run_tasks(
        tasks, dataset.id, data_documents, user, "cognify_pipeline", context={"dataset": dataset}
    )

    # The run is consumed with `async for`; iterate to the end so every task executes.
    # Any exception raised by a task propagates to the caller unchanged.
    async for _run_status in pipeline_run:
        pass
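The task list above is a chain in which each stage consumes the output of the previous one. The toy version below illustrates that idea with plain `asyncio`. It is not cognee's `run_tasks` implementation; the stage functions `classify` and `chunk` and the runner `run_stages` are made up for the example.

```python
import asyncio


async def classify(docs):
    # Stage 1: wrap each raw string in a record with a document type.
    return [{"type": "text", "body": doc} for doc in docs]


async def chunk(docs, max_chunk_size):
    # Stage 2: split each body into pieces of at most `max_chunk_size` words.
    chunks = []
    for doc in docs:
        words = doc["body"].split()
        for start in range(0, len(words), max_chunk_size):
            chunks.append(" ".join(words[start:start + max_chunk_size]))
    return chunks


async def run_stages(stages, data):
    # Feed the output of each stage into the next one, in order.
    for func, kwargs in stages:
        data = await func(data, **kwargs)
    return data


stages = [(classify, {}), (chunk, {"max_chunk_size": 3})]
chunks = asyncio.run(run_stages(stages, ["one two three four five"]))
print(chunks)
```

Inside a notebook cell, where an event loop is already running, use `await run_stages(stages, [...])` instead of `asyncio.run(...)`.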

In [12]:
from cognee.modules.users.methods import get_default_user
from cognee.modules.data.methods import get_datasets_by_name
from cognee.modules.users.methods import get_user

default_user = await get_default_user()

user = await get_user(default_user.id)

datasets = await get_datasets_by_name(["example"], user.id)

await run_cognify_pipeline(datasets[0], user)


2026-01-23T12:15:09.419479 [info] Pipeline run started: `77708808-5b06-5f2d-ab9e-779ba4f9df55` [run_tasks_with_telemetry()]
2026-01-23T12:15:09.733362 [info] Coroutine task started: `classify_documents` [run_tasks_base]
2026-01-23T12:15:10.150873 [info] Async Generator task started: `extract_chunks_from_documents` [run_tasks_base]
2026-01-23T12:15:10.836774 [info] Pipeline run started: `77708808-5b06-5f2d-ab9e-779ba4f9df55` [run_tasks_with_telemetry()]
2026-01-23T12:15:11.217300 [info] Coroutine task started: `classify_documents` [run_tasks_base]
2026-01-23T12:15:11.524985 [info] Async Generator task started: `extract_chunks_from_documents` [run_tasks_base]

In [13]:
import pathlib
from cognee.api.v1.visualize import visualize_graph

# Use the current working directory instead of __file__:
notebook_dir = pathlib.Path.cwd()

graph_file_path = (notebook_dir / ".artifacts" / "graph_visualization.html").resolve()

# Make sure to convert to string if visualize_graph expects a string
b = await visualize_graph(str(graph_file_path))


2026-01-23T12:17:02.355578 [info] Retrieved 157 nodes and 412 edges in 0.01 seconds [cognee.shared.logging_utils]
2026-01-23T12:17:02.825256 [info] Graph visualization saved as /Users/jpv/Bharath_J_P_V/Learning/Gen_ai/Memory_handler/.artifacts/graph_visualization.html [cognee.shared.logging_utils]
2026-01-23T12:17:02.826693 [info] The HTML file has been stored at path: /Users/jpv/Bharath_J_P_V/Learning/Gen_ai/Memory_handler/.artifacts/graph_visualization.html [cognee.shared.logging_utils]


In [16]:
async def search(
    vector_engine,
    collection_name: str,
    query_text: str,
):
    """Return the 10 nearest rows in `collection_name` to `query_text`."""
    # Embed the query text; embed_text takes a list and returns one vector per item.
    query_vector = (await vector_engine.embedding_engine.embed_text([query_text]))[0]

    connection = await vector_engine.get_connection()
    collection = await connection.open_table(collection_name)

    results = await collection.vector_search(query_vector).limit(10).to_pandas()

    result_values = list(results.to_dict("index").values())

    return [
        dict(
            id=str(result["id"]),
            payload=result["payload"],
            score=result["_distance"],  # a distance: lower means closer to the query
        )
        for result in result_values
    ]


from cognee.infrastructure.databases.vector import get_vector_engine

vector_engine = get_vector_engine()
results = await search(vector_engine, "Entity_name", "sarah.nguyen@example.com")
for result in results:
    print(result)

{'id': '4d8dda57-2681-5264-a2bd-e2ddfe66a785', 'payload': {'id': '4d8dda57-2681-5264-a2bd-e2ddfe66a785', 'created_at': 1769170600522, 'updated_at': 1769170600522, 'ontology_valid': False, 'version': 1, 'topological_rank': 0, 'type': 'IndexSchema', 'text': 'sarah nguyen'}, 'score': 0.5709534883499146}
{'id': 'b6365021-70ae-53ea-83e7-885714b56092', 'payload': {'id': 'b6365021-70ae-53ea-83e7-885714b56092', 'created_at': 1769170583867, 'updated_at': 1769170583867, 'ontology_valid': False, 'version': 1, 'topological_rank': 0, 'type': 'IndexSchema', 'text': 'jessica.miller@example.com'}, 'score': 0.7203113436698914}
{'id': 'ca14dbae-558b-5284-b057-dc2b1e965693', 'payload': {'id': 'ca14dbae-558b-5284-b057-dc2b1e965693', 'created_at': 1769170608033, 'updated_at': 1769170608033, 'ontology_valid': False, 'version': 1, 'topological_rank': 0, 'type': 'IndexSchema', 'text': 'michael.rodriguez@example.com'}, 'score': 0.7379818558692932}
{'id': '282e7e92-e3d5-5a27-96fd-f145d6ebf7ec', 'payload': {'id'
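The `score` field in these results is the `_distance` column, so a smaller value means a closer match: the "sarah nguyen" entity has the lowest value in the output above. A small sketch of filtering on that distance follows. The helper `closest` and the 0.6 cutoff are illustrative assumptions, and the hit list is a trimmed copy of the printed results.

```python
# Text and distance copied from the first three results printed above.
hits = [
    {"text": "sarah nguyen", "score": 0.5709534883499146},
    {"text": "jessica.miller@example.com", "score": 0.7203113436698914},
    {"text": "michael.rodriguez@example.com", "score": 0.7379818558692932},
]


def closest(hits, max_distance):
    """Keep hits whose distance is at most `max_distance`, nearest first."""
    kept = [hit for hit in hits if hit["score"] <= max_distance]
    return sorted(kept, key=lambda hit: hit["score"])


# 0.6 is an arbitrary cutoff chosen for this example, not a recommended value.
best = closest(hits, max_distance=0.6)
print([hit["text"] for hit in best])
```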

In [17]:
from cognee.api.v1.search import SearchType

node = (await vector_engine.search("Entity_name", "sarah.nguyen@example.com"))[0]
node_name = node.payload["text"]

search_results = await cognee.search(query_type=SearchType.SUMMARIES, query_text=node_name)
print("\n\nExtracted summaries are:\n")
for result in search_results:
    print(f"{result}\n")


2026-01-23T12:27:14.208665 [info] Starting summary retrieval for query: 'sarah nguyen' [SummariesRetriever]
2026-01-23T12:27:15.120356 [info] Found 6 summaries from vector search [SummariesRetriever]
2026-01-23T12:27:15.120976 [info] Returning 6 summary payloads [SummariesRetriever]
2026-01-23T12:27:15.121498 [info] Starting completion generation for query: 'sarah nguyen' [SummariesRetriever]
2026-01-23T12:27:15.122084 [info] Returning context with 6 item(s) [SummariesRetriever]




Extracted summaries are:

{'id': 'ea178f20-a297-5401-9ef5-7f55a2d06245', 'created_at': 1769170613935, 'updated_at': 1769170613935, 'ontology_valid': False, 'version': 1, 'topological_rank': 0, 'type': 'IndexSchema', 'text': 'Senior Data Scientist with 8+ years’ experience in machine learning and predictive analytics; Ph.D. in Computer Science (Stanford). Specializes in NLP, deep learning, and deploying scalable production models. Proficient in Python, R, SQL, TensorFlow, Keras, Scikit‑Learn, Hadoop and Spark.'}

{'id': 'c2fc420d-f56a-50f7-9029-9d0df736e7d1', 'created_at': 1769170608032, 'updated_at': 1769170608032, 'ontology_valid': False, 'version': 1, 'topological_rank': 0, 'type': 'IndexSchema', 'text': 'Data Scientist with expertise in machine learning and statistical modeling, experienced in handling large datasets and turning data into actionable business outcomes; notable achievement: cut customer acquisition cost by 15%.'}

{'id': '82dbd9c6-9fcd-5dc5-a533-16bc8bf9c07c', 'crea

In [19]:
search_results = await cognee.search(query_type=SearchType.CHUNKS, query_text=node_name)
print("\n\nExtracted chunks are:\n")
for result in search_results:
    print(f"{result}\n")


2026-01-23T12:33:03.913573 [info] Starting chunk retrieval for query: 'sarah nguyen' [ChunksRetriever]
2026-01-23T12:33:04.806392 [info] Found 6 chunks from vector search [ChunksRetriever]
2026-01-23T12:33:04.807081 [info] Returning 6 chunk payloads [ChunksRetriever]
2026-01-23T12:33:04.807565 [info] Starting completion generation for query: 'sarah nguyen' [ChunksRetriever]
2026-01-23T12:33:04.808244 [info] Returning context with 6 item(s) [ChunksRetriever]




Extracted chunks are:

{'id': '74ce30f5-feb6-54e7-ae82-bb737740b009', 'created_at': 1769170600522, 'updated_at': 1769170600522, 'ontology_valid': False, 'version': 1, 'topological_rank': 0, 'type': 'IndexSchema', 'text': '\nCV 3: Relevant\nName: Sarah Nguyen\nContact Information:\n\nEmail: sarah.nguyen@example.com\nPhone: (555) 345-6789\nSummary:\n\nData Scientist specializing in machine learning with 6 years of experience. Passionate about leveraging data to drive business solutions and improve product performance.\n\nEducation:\n\nM.S. in Statistics, University of Washington (2014)\nB.S. in Applied Mathematics, University of Texas at Austin (2012)\nExperience:\n\nData Scientist, QuantumTech (2016 – Present)\nDesigned and implemented machine learning algorithms for financial forecasting.\nImproved model efficiency by 20% through algorithm optimization.\nJunior Data Scientist, DataCore Solutions (2014 – 2016)\nAssisted in developing predictive models for supply chain optimization.\nC
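The `created_at` and `updated_at` values in these payloads look like Unix epoch timestamps in milliseconds. Assuming that reading is right, they can be converted to readable UTC datetimes as sketched below; `from_epoch_millis` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def from_epoch_millis(ms: int) -> datetime:
    """Convert integer epoch milliseconds to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


created_at = 1769170600522  # value copied from the chunk payload above
created = from_epoch_millis(created_at)
print(created.isoformat())
```

For this value the result is 12:16:40 UTC on 2026-01-23, which falls between the start of the cognify run and the visualization step in the logs, so the millisecond interpretation is consistent with the rest of the output.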

In [20]:
search_results = await cognee.search(query_type=SearchType.GRAPH_COMPLETION, query_text=node_name)
print("\n\nExtracted sentences are:\n")
for result in search_results:
    print(f"{result}\n")


2026-01-23T12:33:14.519174 [info] Vector collection retrieval completed: Retrieved distances from 6 collections in 0.19s [cognee.shared.logging_utils]
2026-01-23T12:33:14.520038 [info] Retrieving ID-filtered graph from database. [CogneeGraph]
2026-01-23T12:33:14.536953 [info] ID-filtered retrieval: 157 nodes and 412 edges in 0.02s [cognee.shared.logging_utils]
2026-01-23T12:33:14.539894 [info] Graph projection completed: 157 nodes, 412 edges in 0.00s [CogneeGraph]




Extracted sentences are:

Sarah Nguyen — Data Scientist (machine learning), 6 years’ experience. Email: sarah.nguyen@example.com, Phone: (555) 345-6789. Education: M.S. in Statistics (Univ. of Washington, 2014); B.S. in Applied Mathematics (Univ. of Texas at Austin, 2012). Current: Data Scientist at QuantumTech (since 2016) — designed ML algorithms for financial forecasting, improved model efficiency by 20%. Skills: Python, R, PyTorch, Scikit‑Learn, SAS, SPSS, AWS, Azure.

