# QuestionGeneration [🔗](https://gpt-index.readthedocs.io/en/latest/examples/evaluation/QuestionGeneration.html)
This notebook walks through the process of generating a list of questions that could be asked about your data. This is useful for setting up an evaluation pipeline using the FaithfulnessEvaluator and RelevancyEvaluator evaluation tools.

In [1]:
import nest_asyncio
nest_asyncio.apply()

In [2]:
import logging
import sys
import pandas as pd

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [3]:
from llama_index.evaluation import DatasetGenerator, RelevancyEvaluator
from llama_index import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    ServiceContext,
    LLMPredictor,
    Response,
)
from llama_index.llms import OpenAI

In [4]:
reader = SimpleDirectoryReader("../data/") 
documents = reader.load_data()

In [5]:
documents

[Document(id_='01dfa701-d5c3-4eb2-8076-6df151f81403', embedding=None, metadata={'page_label': '1', 'file_name': 'Christos2023Q4Goals+Values.pdf'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='2673340e6a6f92d2c5299e5ba02a3e5c41fea9fee830b5c439006fd30f9148d8', text="Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\ndo\nI\nwant\nto\naccomplish?\nI\nwant\nto\nmake\na\npositive\nimpact\non\nsociety\nby\naccelerating\ntechnological\nadvancement\nusing\nArtificial\nIntelligence\nand\nMachine\nLearning\ntools\nto\nsolve\nreal-world\nproblems.\nI\nam\nvery\npassionate\nabout\nit\nand\nenjoying\nlearning\nas\nmuch\nas\nI\ncan\non\nthe\ntopic.\nI\nlove\nhelping\npeople\nand\nlearning\nabout\ntheir\nexperiences.\nI\nam\nlooking\nfor\na\nrole\nin\nan\nenvironment\nthat\nencourages\nlearning,\ngrowth\n(subject-matter\nexpertise),\na\nteam\nthat\nis\nalso\npassionate\nabout\nwhat\nthey\ndo\nand\nideally\non\na\nmission\nto\nmake\na\npositive\nimpact\nI\nalign\n

In [6]:
data_generator = DatasetGenerator.from_documents(documents)

chunk_size_limit is deprecated, please specify chunk_size instead


In [23]:
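# First run: uncomment the next line to generate questions from the documents.
# eval_questions = data_generator.generate_questions_from_nodes()
# Later runs: reload the questions saved in the dataset CSV instead.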
df1 = pd.read_csv("../output/cam-qa-dataset.csv", index_col=0)
eval_questions = df1["query"]
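
The "generate once, reload thereafter" pattern in the cell above can be wrapped in a small helper so nothing has to be commented in and out between runs. This is a minimal sketch, not part of the original notebook: `load_or_generate_questions` and its `generate_fn` parameter are hypothetical names, and it assumes the cached CSV has a `query` column like the dataset saved later in this notebook.

```python
import csv
import os
from typing import Callable, List


def load_or_generate_questions(
    csv_path: str, generate_fn: Callable[[], List[str]]
) -> List[str]:
    """Return cached questions from csv_path if it exists, else generate them."""
    if os.path.exists(csv_path):
        with open(csv_path, newline="", encoding="utf-8") as f:
            # Assumption: the CSV has a "query" column, as in the saved dataset.
            return [row["query"] for row in csv.DictReader(f)]
    # First run: no cached dataset yet, so call the (potentially costly) generator.
    return list(generate_fn())
```

In this notebook the call might look like `load_or_generate_questions("../output/cam-qa-dataset.csv", data_generator.generate_questions_from_nodes)`. The helper only reads the cache; the CSV itself is still written by the `df.to_csv(...)` cell further down.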

In [24]:
# gpt-4
gpt4 = OpenAI(temperature=0, model="gpt-4")
service_context_gpt4 = ServiceContext.from_defaults(llm=gpt4)

In [25]:
evaluator_gpt4 = RelevancyEvaluator(service_context=service_context_gpt4)

In [26]:
# create vector index
vector_index = VectorStoreIndex.from_documents(
    documents, service_context=service_context_gpt4
)

In [27]:
# define jupyter display function
def display_eval_df(query: str, response: Response, eval_result: str) -> None:
    eval_df = pd.DataFrame(
        {
            "Query": query,
            "Response": str(response),
            "Source": response.source_nodes[0].node.get_content()[:1000] + "...",
            "Evaluation Result": eval_result,
        },
        index=[0],
    )
    eval_df = eval_df.style.set_properties(
        **{
            "inline-size": "600px",
            "overflow-wrap": "break-word",
        },
        subset=["Response", "Source"]
    )
    display(eval_df)

In [28]:
query_engine = vector_index.as_query_engine()
response_vector = query_engine.query(eval_questions[1])
eval_result = evaluator_gpt4.evaluate_response(
    query=eval_questions[1], response=response_vector
)

INFO:openai:message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=539 request_id=e01a8a19913b2693889bbda11c11af74 response_code=200
message='OpenAI API response' path=https://api.openai.com/v1/chat/completions processing_ms=539 request_id=e01a8a19913b2693889bbda11c11af74 response_code=200


In [29]:
eval_questions[1]

'How does Christos plan to make a positive impact on society?'

In [36]:
response_vector.__dict__

{'response': 'Christos plans to make a positive impact on society by accelerating technological advancement using Artificial Intelligence and Machine Learning tools to solve real-world problems. He is passionate about this and enjoys learning as much as he can on the topic. He is also looking for a role in an environment that encourages learning and growth, with a team that is passionate about what they do. Ideally, he wants to find a company that is focused on solving real-world problems with AI using technology and/or scientific research. His goal is to leverage his experience and strengths to add value to the team and gain a breadth and depth of industry experience in AI. This will enable him to contribute to a greater good and potentially build something of his own that solves a real-world problem through AI.',
 'source_nodes': [NodeWithScore(node=TextNode(id_='677c415c-50a2-47fc-a11b-3489be31f7e4', embedding=None, metadata={'page_label': '1', 'file_name': 'Christos2023Q4Goals+Valu

In [35]:
eval_result.__dict__

{'query': 'How does Christos plan to make a positive impact on society?',
 'contexts': None,
 'response': 'Christos plans to make a positive impact on society by accelerating technological advancement using Artificial Intelligence and Machine Learning tools to solve real-world problems. He is passionate about this and enjoys learning as much as he can on the topic. He is also looking for a role in an environment that encourages learning and growth, with a team that is passionate about what they do. Ideally, he wants to find a company that is focused on solving real-world problems with AI using technology and/or scientific research. His goal is to leverage his experience and strengths to add value to the team and gain a breadth and depth of industry experience in AI. This will enable him to contribute to a greater good and potentially build something of his own that solves a real-world problem through AI.',
 'passing': True,
 'feedback': 'YES',
 'score': 1.0}

In [37]:
display_eval_df(eval_questions[1], response_vector, eval_result.feedback)

Unnamed: 0,Query,Response,Source,Evaluation Result
0,How does Christos plan to make a positive impact on society?,"Christos plans to make a positive impact on society by accelerating technological advancement using Artificial Intelligence and Machine Learning tools to solve real-world problems. He is passionate about this and enjoys learning as much as he can on the topic. He is also looking for a role in an environment that encourages learning and growth, with a team that is passionate about what they do. Ideally, he wants to find a company that is focused on solving real-world problems with AI using technology and/or scientific research. His goal is to leverage his experience and strengths to add value to the team and gain a breadth and depth of industry experience in AI. This will enable him to contribute to a greater good and potentially build something of his own that solves a real-world problem through AI.","Christos' 2023 Q4 Goals and Values What do I want to accomplish? I want to make a positive impact on society by accelerating technological advancement using Artificial Intelligence and Machine Learning tools to solve real-world problems. I am very passionate about it and enjoying learning as much as I can on the topic. I love helping people and learning about their experiences. I am looking for a role in an environment that encourages learning, growth (subject-matter expertise), a team that is also passionate about what they do and ideally on a mission to make a positive impact I align with. I want to find a company that is focused on solving real world problems with AI using technology and/or scientific research. Ideally this company matches my values, enables my working style and encourages me to grow and learn. I want to be in a role that leverages my experience and strengths to add value to the team. I hope that the role will also allow me to gain a breadth and depth of industry ex...",YES


In [56]:
# collect query, response, source, and eval_result in a dict
def get_source_with_query_response_eval(
    query: str, response: Response, eval_result: str
) -> dict:
    return {
        "query": query,
        "response": str(response),
        "source": f"{response.source_nodes[0].node.get_content()[:1000]}...",
        "eval_result": eval_result,
    }

In [57]:
get_source_with_query_response_eval(eval_questions[1], response_vector, eval_result.feedback)

{'query': 'How does Christos plan to make a positive impact on society?',
 'response': 'Christos plans to make a positive impact on society by accelerating technological advancement using Artificial Intelligence and Machine Learning tools to solve real-world problems. He is passionate about this and enjoys learning as much as he can on the topic. He is also looking for a role in an environment that encourages learning and growth, with a team that is passionate about what they do. Ideally, he wants to find a company that is focused on solving real-world problems with AI using technology and/or scientific research. His goal is to leverage his experience and strengths to add value to the team and gain a breadth and depth of industry experience in AI. This will enable him to contribute to a greater good and potentially build something of his own that solves a real-world problem through AI.',
 'source': "Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\ndo\nI\nwant\nto\naccomplish?\nI\nwant\nt

In [None]:
df = None
for i in range(len(eval_questions)):
    query_i = eval_questions[i]
    response_vector_i = query_engine.query(eval_questions[i])
    eval_result_i = evaluator_gpt4.evaluate_response(query=query_i, response=response_vector_i)
    tmp_df = pd.DataFrame([get_source_with_query_response_eval(query_i, response_vector_i, eval_result_i.feedback)])
    df = tmp_df if df is None else pd.concat([df,tmp_df], ignore_index=True)
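
The loop above grows the DataFrame with one `pd.concat` per question. An alternative is to collect plain dicts in a list and build the DataFrame once at the end. The sketch below shows only the collection step with the standard library. `collect_eval_rows`, `query_fn`, and `evaluate_fn` are hypothetical names introduced for illustration; they stand in for `query_engine.query` and a wrapper around `evaluator_gpt4.evaluate_response(...).feedback`.

```python
from typing import Any, Callable, Dict, Iterable, List


def collect_eval_rows(
    questions: Iterable[str],
    query_fn: Callable[[str], Any],
    evaluate_fn: Callable[[str, Any], str],
) -> List[Dict[str, str]]:
    """Query and evaluate each question, returning one dict per question."""
    rows = []
    for question in questions:
        response = query_fn(question)  # e.g. query_engine.query
        feedback = evaluate_fn(question, response)  # e.g. "YES" or "NO"
        rows.append(
            {
                "query": question,
                "response": str(response),
                "eval_result": feedback,
            }
        )
    return rows
```

The resulting list can be passed to `pd.DataFrame(rows)` in a single call. The `source` column is omitted here to keep the sketch independent of the `Response` object; it could be added the same way as in `get_source_with_query_response_eval`.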

In [73]:
df.to_csv("../output/cam-qa-dataset.csv", escapechar='\\')

In [77]:
pd.read_csv("../output/cam-qa-dataset.csv", index_col=0).head()

Unnamed: 0,query,response,source,eval_result
0,What is Christos' main goal for the fourth qua...,Christos' main goal for the fourth quarter of ...,Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\...,YES
1,How does Christos plan to make a positive impa...,Christos plans to make a positive impact on so...,Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\...,YES
2,What is Christos passionate about and enjoying...,Christos is passionate about making a positive...,Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\...,YES
3,What kind of environment is Christos looking f...,Christos is seeking a role in an environment t...,Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\...,YES
4,What type of company is Christos hoping to wor...,Christos is hoping to work for a company that ...,Christos'\n2023\nQ4\nGoals\nand\nValues\nWhat\...,YES
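
With the results saved, a single summary number is often handy. The sketch below computes the fraction of rows whose `eval_result` equals the passing label. `pass_rate` is a hypothetical helper, not a llama_index function, and treating `"YES"` as the passing label is an assumption based on the `feedback` values shown above.

```python
from typing import Iterable, Mapping


def pass_rate(rows: Iterable[Mapping[str, str]], passing_label: str = "YES") -> float:
    """Return the fraction of rows whose eval_result matches passing_label."""
    # Normalize whitespace and case so "yes " and "YES" count the same.
    results = [str(row["eval_result"]).strip().upper() for row in rows]
    if not results:
        return 0.0  # no rows evaluated; avoid division by zero
    return sum(r == passing_label for r in results) / len(results)
```

For the DataFrame in this notebook, the rows could be supplied as `df.to_dict("records")`.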


In [78]:
# AI Engineer @ Cohere
sample_job_description = """
AI Engineer
SAN FRANCISCO /TECH – FORWARD DEPLOYED ENGINEERING /FULL-TIME/ HYBRID
Who are we?
We’re a team of engineers, thinkers, and champions whose aim is to give technology language. Every day our team is breaking new ground, as we build transformational AI technology and products for enterprise and developers that wish to harness the power of Large Language Models.

We're driven by ambition, as we firmly believe that our technology has the potential to revolutionise the way industries engage with natural language. Our strong technical foundation speaks for itself, with our team composed of world-class experts who have collectively accumulated hundreds of thousands of citations in academia.

The Cohere team is a collective of college dropouts, PhDs, alumni of big tech and scrappy start-ups, new grads and career pivots, who believe a diverse team is the key to a safer, more responsible technology. At Cohere, work isn’t the opposite of play, as we build the future of language AI with team members on almost every continent in the world, working from high rises, cabins, tour buses, and dog-friendly offices.

There’s no better time to herald the next step with us as we shape the future of Generative AI.

Why this role?

As an AI Engineer (a Member of Technical Staff role on our Forward Deployed team) you will work directly with customers to quickly understand their greatest problems and design and implement solutions using Large Language Models.

You’ll apply your problem-solving ability, creativity, and technical skills to close the last-mile gap in Enterprise AI adoption. You’ll be able to deliver products like early startup CTOs/CEOs do and disrupt some of the most important industries and institutions globally!

In this role, you will:
- Own and build large new areas within our product.
- Work across backend, frontend, and customize Large Language Models.
- Experiment at a high velocity and level of quality to engage our customers and eventually deliver solutions that exceed their expectations.
- Work across the entire product lifecycle from conceptualization through production.

This career opportunity may be a good match for you if you have:
- 3+ years of model training, deployment, and maintenance in a production environment.
- Strong skills in NLP and deep learning.
- Experience scaling products at hyper-growth startup.
- Strong written and verbal communication skills.
- Ability and interest to travel up to 25%, as needed to client sites, but flexible based on personal preferences.

Nice to have skills/experiences:
- Experience improving LLM performance for custom domains via fine tuning or RLHF.
- Experience in Information Retrieval systems for document question answering.
- Experience in day-to-day NLP for industry using Python and related toolchains (SpaCy, HuggingFace, NLTK, etc.).
- Published research in areas of machine learning at major conferences and/or journals.
If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you consider yourself a thoughtful worker, a lifelong learner, and a kind and playful team member, Cohere is the place for you.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants of all kinds and are committed to providing an equal opportunity process. Cohere provides accessibility accommodations during the recruitment process. Should you require any accommodation, please let us know and we will work with you to meet your needs.

Our Perks:
🤝 An open and inclusive culture and work environment 
🧑‍💻 Work closely with a team on the cutting edge of AI research 
🍽 Free daily lunch 
🦷 Full health and dental benefits, including a separate budget to take care of your mental health 
🐣 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, offices in Toronto, Palo Alto, San-Francisco and London and co-working stipend
✈️ 6 weeks of vacation
"""

In [82]:
sample_response = query_engine.query(
    f"""
    [QUERY]
    Which areas from the provided Job Description below does the candidate match? Give examples of matches and misses with scores and explanations.
    [CONTEXT]
    {sample_job_description}
    """)

In [83]:
print(sample_response)

The candidate, Christos Magganas, matches several areas of the job description for the AI Engineer role at Cohere. 

Matches:
1. Experience with Large Language Models (LLM): Christos has extensive experience with LLMs, as evidenced by his work at Learvo and AI Makerspace where he utilized tools like LangChain and LlamaIndex for AI applications. Score: 9/10
2. Generative AI: Christos has worked on Generative AI applications at Learvo and AI Makerspace, aligning with the company's focus on shaping the future of Generative AI. Score: 8/10
3. Strong skills in NLP and deep learning: While the resume does not explicitly mention NLP or deep learning, his experience with LLMs and AI/ML suggests a strong background in these areas. Score: 7/10
4. Strong written and verbal communication skills: Christos' ability to lead cross-functional teams and his emphasis on empathy, humility, and integrity suggest strong communication skills. Score: 8/10
5. Experience scaling products at hyper-growth startup

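This process reveals both where the candidate matches the role and where the portfolio has gaps. A gap may be a genuine area for improvement, or it may be a project you have already completed but not yet documented, in which case you can update the portfolio to include it.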