# Building a test dataset
## Introduction
- To run tests, we need a test dataset.
- We can use an LLM to generate a synthetic dataset.
- Here we'll use RAGAS to generate one.
## Installation

https://docs.ragas.io/en/stable/getstarted/rag_testset_generation/#load-documents

In [1]:
%pip install -q ragas langchain_openai langchain pypdf


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [2]:
import sys
sys.path.append('../..')

## Set up embeddings and LLM

In [9]:
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Directly create instances of ChatOpenAI and OpenAIEmbeddings
llm = ChatOpenAI(model="gpt-4o")
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

generator_llm = LangchainLLMWrapper(llm)
generator_embeddings = LangchainEmbeddingsWrapper(embeddings)
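
`ChatOpenAI` and `OpenAIEmbeddings` read the API key from the `OPENAI_API_KEY` environment variable by default, so a missing key only surfaces when the first request is made. A minimal sketch of an early check follows; the helper name `missing_env_vars` is hypothetical and not part of RAGAS or LangChain.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Hypothetical pre-flight check: report a missing key before any client is used.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Set these environment variables before creating the clients:", missing)
```

Running this at the top of the notebook tends to give a clearer message than an authentication error raised in the middle of dataset generation.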


## Load and index the documents

- Here we use `WebBaseLoader` to load documents from a website.

In [None]:
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load text from a web page
web_url = "https://docs.ragas.io/en/stable/"

loader = WebBaseLoader(web_url, show_progress=True)
data = loader.load()

# Split the text into chunks of at most 500 characters, with no overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
documents = text_splitter.split_documents(data)
print(len(documents))

25
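
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-size chunking. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs, lines, and spaces); the function `chunk_text` is a hypothetical illustration of the two parameters only.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=0):
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
print([len(c) for c in chunk_text(sample, chunk_size=500, chunk_overlap=0)])
print([len(c) for c in chunk_text(sample, chunk_size=500, chunk_overlap=100)])
```

With `chunk_overlap=0`, as in the cell above, no text is repeated between chunks; a non-zero overlap repeats the tail of one chunk at the head of the next, which can help when an answer straddles a chunk boundary.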


- To handle PDF documents, we use `PyPDFLoader`.

In [5]:
from langchain_community.document_loaders import PyPDFLoader

pdf_url = "data/papers/microsoft_annual_report_2022.pdf"
loader = PyPDFLoader(pdf_url)

# Start from an empty list, so the web chunks loaded earlier are not included
documents = []
pages = []
# Top-level `async for` works in a notebook cell; a plain script needs an event loop
async for page in loader.alazy_load():
    pages.append(page)

print("PDF pages:", len(pages))

# Add PDF pages to the documents
documents.extend(pages)
print("Documents:", len(documents))

PDF pages: 93
Documents: 93


- Internally, RAGAS uses `rapidfuzz` to calculate string distances.

In [6]:
%pip install -q rapidfuzz


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.
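
As an illustration of what a string distance measures, the sketch below implements the Levenshtein edit distance in plain Python. This is an assumption for illustration only: the notebook does not say which `rapidfuzz` metric RAGAS calls, and `rapidfuzz` itself provides much faster compiled implementations.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a, b):
    """Map the distance to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))  # 3
print(normalized_similarity("section", "sektion"))
```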


In [10]:
from ragas.testset import TestsetGenerator # type: ignore

# Number of questions to generate
testset_size = 10

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
dataset = generator.generate_with_langchain_docs(documents, testset_size=testset_size)

Applying HeadlineSplitter:   0%|          | 0/93 [00:00<?, ?it/s]
unable to apply transformation: 'headlines' property not found in this node

## Show the generated dataset
- Each row has a `user_input` (the question) and a `reference` (the expected answer).
- The `reference_contexts` column lists the source passages that were used to generate the answer.
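
In the generated rows, some `reference_contexts` entries begin with markers such as `<1-hop>` and `<2-hop>`. The sketch below separates such a marker from the passage text and flags rows that carry one. The helper names are hypothetical, and reading the markers as a sign of multi-hop questions is an assumption drawn from the output in this notebook, not from a documented RAGAS API.

```python
import re

# Matches a leading marker such as "<2-hop>" plus any whitespace after it.
HOP_MARKER = re.compile(r"^<(\d+)-hop>\s*")


def split_hop_marker(context):
    """Return (hop_number, text) for a marked context, or (None, context)."""
    match = HOP_MARKER.match(context)
    if match is None:
        return None, context
    return int(match.group(1)), context[match.end():]


def is_multi_hop(reference_contexts):
    """True if any context in the row starts with an <N-hop> marker."""
    return any(split_hop_marker(c)[0] is not None for c in reference_contexts)


row_contexts = ["<1-hop>\n\nfirst passage", "<2-hop>\n\nsecond passage"]
print(is_multi_hop(row_contexts))
print([split_hop_marker(c) for c in row_contexts])
```

Applied to the `reference_contexts` column, a check like this could be used to review single-passage and multi-passage questions separately.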

In [11]:
# Keep the user_input, reference, and reference_contexts columns,
# and widen the display so long text stays readable
import pandas as pd
pd.set_option('display.expand_frame_repr', False)  # Prevents line breaks in DataFrame display
pd.set_option('display.max_colwidth', 300)

df = dataset.to_pandas()[["user_input", "reference", "reference_contexts"]]
df = df.style.set_properties(**{'text-align': 'left'})  # Align text to the left
df

Unnamed: 0,user_input,reference,reference_contexts
0,What happening in Ukraine and how it affect Microsoft?,"The war in Ukraine is ongoing, and it is part of the historic economic, societal, and geopolitical changes affecting the world. Microsoft is positioned at a historic intersection of opportunity and responsibility, with a mission to empower every person and organization to achieve more, especially during these uncertain times.","['1 Dear shareholders, colleagues, customers, and partners: We are living through a period of historic economic, societal, and geopolitical change. The world in 2022 looks nothing like the world in 2019. As I write this, inflation is at a 40 -year high, supply chains are stretched, and the war in Ukraine is ongoing. At the same time, we are entering a technological era with the potential to power awesome advancements across every sector of our economy and society. As the world’s largest software company, this places us at a historic intersection of opportunity and responsibility to the world around us. Our mission to empower every person and every organization on the planet to achieve more has never been more urgent or more necessary. For all the uncertainty in the world, one thing is clear: People and organizations in every industry are increasingly looking to digital technology to overcome today’s challenges and emerge stronger. And no company is better positioned to help them than Microsoft. Every day this past fiscal year I have had the privilege to witness our customers use our platforms and tools to connect what technology can do with what the world needs it to do. Here are just a few examples: • Ferrovial, which builds and manages some of the world’s busiest airports and highways, is using our cloud infrastructure to build safer roads as it prepares for a future of autonomous transportation. 
• Peace Parks Foundation, a nonprofit helping protect natural ecosystems in Southern Africa, is using Microsoft Dynamics 365 and Power BI to secure essential funding, as well as our Azure AI and IoT solutions to help rangers scale their park maintenance and wildlife crime prevention work. • One of the world’s largest robotics companies, Kawasaki Heavy Industries, is using the breadth of our tools — from Azure IoT and HoloLens—to create an industrial metaverse solution that brings its distributed workforce together with its network of connected equipment to improve productivity and keep employees safe. • Globo, the biggest media and TV company in Brazil, is using Power Platform to empower its employees to build their own solutions for everything from booking sets to setting schedules. • And Ørsted, which produces a quarter of the world’s wind energy, is using the Microsoft Intelligent Data Platform to turn data from its offshore turbines into insights for predictive maintenance. Amid this dynamic environment, we delivered record results in fiscal year 2022: We reported $198 billion in revenue and $83 billion in operating income. And the Microsoft Cloud surpassed $100 billion in annualized revenue for the first time. OUR RESPONSIBILITY As a corporation, our purpose and actions must be aligned with addressing the world’s problems, not creating new ones. At our very core, we need to deliver innovation that helps drive broad economic growth. We, as a company, will do well when the world around us does well. That’s what I believe will lead to widespread human progress and ultimately improve the lives of everyone. There is no more powerful input than digital technology to drive the world’s economic output. This is the core thesis for our being as a company, but it’s not enough. As we drive global economic growth, we must also commit to creating a more inclusive, equitable, sustainable, and trusted future.']"
1,What Impact Summary say about Microsoft progress?,"Our annual Impact Summary shares more about our progress and learnings across our commitments to responsibly develop and use technologies like AI, and it provides detailed reports on our environmental data, political activities, workforce demographics, human rights work, and more.","['4 Our commitment to responsibly develop and use technologies like AI is core to who we are. We put our commitment into practice, not only within Microsoft but by empowering our customers and partners to do the same and by advocating for policy change. We released our Responsible AI Standard, which outlines 17 goals aligned to our six AI principles and includes tools and practices to support them. And we share our open -source tools, including the new Responsible AI Dashboard, to help developers building AI technologies identify and mitigate issues before deployment. Finally, we provide clear reporting and information on how we run our business and how we work with customers and partners, delivering the transparency that is central to trust. Our annual Impact Summary shares more about our progress and learnings across these four commitments, and our Reports Hub provides detailed reports on our environmental data, our political activities, our workforce demographics, our human rights work, and more. We should all be proud of this work —and I am. But it’s easy to talk about what we’re doing well. As we look to the next year and beyond, we’ll continue to reflect on where the world needs us to do better. OUR OPPORTUNITY Now, let me turn to how we are positioned to capture the massive opportunities ahead. Over the past few years, I’ve written extensively about digital transformation, but now we need to go beyond that to deliver on what I call the “digital imperative.” Technology is a deflationary force in an inflationary economy. 
Every organization in every industry will need to infuse technology into every business process and function so they can do more with less. It’s what I believe will make the difference between organizations that thrive and those that get left behind. In the coming years, technology as a percentage of GDP will double from 5% to 10% and beyond, as technology moves from a back -office cost center to a defining feature of every product and service. But even more important will be technology’s influence on the other 90% of the world’s economy. From communications and commerce, to logistics, financial services, energy, healthcare, and entertainment, digital technology will power the entire global economy as every company becomes a software company in its own right. Across our customer solution areas, we are delivering powerful platforms, tools, and services that expand our opportunity to help every organization in every industry deliver on the digital imperative —with a business model that is trusted and always aligned with their success. Apps and infrastructure We are building Azure as the world’s computer, with more than 60 datacenter regions —more than any other provider — delivering faster access to cloud services while addressing critical data residency requirements. With Azure Arc, we’re bringing Azure anywhere, meeting customers where they are and enabling them to run apps across on-premises, edge, or multicloud environments. And we’re extending our infrastructure to the 5G network edge with Azure for Operators, introducing new solutions to help telecom operators deliver ultra-low-latency services closer to end users. As the digital and physical worlds come together, we’re also leading in the industrial metaverse. From smart factories, to smart buildings, to smart cities, we’re helping organizations use Azure IoT, Azure Digital Twins, and Microsoft Mesh to digitize people, places, and things, in order to visualize, simulate, and analyze any business process.']"
2,What Microsoft do with data?,"Microsoft provides a comprehensive data stack that includes best-in-class databases and analytics, as well as data governance, to help organizations turn their data into predictive and analytical power.","['Data and AI From best-in-class databases and analytics to data governance, we have the most comprehensive data stack to help every organization turn its data into predictive and analytical power. With our new Microsoft Intelligent Data']"
3,Cud yu pleese explane the role of Sektion 21E of the Sekurities Exchange Act of 1934 in the context of forward-looking statements?,"Section 21E of the Securities Exchange Act of 1934 is referenced in the context of forward-looking statements, which include estimates, projections, and statements relating to business plans, objectives, and expected operating results. These statements are identified by terms like 'believe,' 'project,' 'expect,' and similar expressions, and are based on current expectations and assumptions subject to risks and uncertainties that may cause actual results to differ materially.","['11 Note About Forward-Looking Statements This report includes estimates, projections, statements relating to our business plans, objectives, and expected operating results that are “forward -looking statements” within the meaning of the Private Securities Litigation Reform Act of 1995, Section 27A of the Securities Act of 1933, and Section 21E of the Securities Exchange Act of 1934. Forward -looking statements may appear throughout this report, including the following sections: “Business” in our fiscal year 2022 Form 10-K and “Management’s Discussion and Analysis of Financial Condition and Results of Operations” in our fiscal year 2022 Form 10 -K. These forward -looking statements generally are identified by the words “believe,” “project,” “expect,” “anticipate,” “estimate,” “intend,” “strategy,” “future,” “opportunity,” “plan,” “may,” “should,” “will,” “would,” “will be,” “will continue,” “will likely result,” and similar expressions. Forward -looking statements are based on current expectations and assumptions that are subject to risks and uncertainties that may cause actual results to differ materially. 
We describe risks and uncertainties that could cause actual results and events to differ materially in “Risk Factors,” “Management’s Discussion and Analysis of Financial Condition and Results of Operations,” and “Quantitative and Qualitative Disclosures about Market Risk"" in our fiscal year 2022 Form 10 -K. Readers are cautioned not to place undue reliance on forward - looking statements, which speak only as of the date they are made. We undertake no obligation to update or revise publicly any forward-looking statements, whether because of new information, future events, or otherwise. BUSINESS GENERAL Embracing Our Future Microsoft is a technology company whose mission is to empower every person and every organization on the planet to achieve more. We strive to create local opportunity, growth, and impact in every country around the world. Our platforms and tools help drive small business productivity, large business competitiveness, and public -sector efficiency. We are creating the tools and platforms that deliver better, faster, and more effective solutions to support new startups, improve educational and health outcomes, and empower human ingenuity. Microsoft is innovating and expanding our entire portfolio to help people and organizations overcome today’s challenges and emerge stronger. We bring technology and products together into experiences and solutions that unlock value for our customers. In a dynamic environment, digital technology is the key input that powers the world’s economic output. Our ecosystem of customers and partners have learned that while hybrid work is complex, embracing flexibility, different work styles, and a culture of trust can help navigate the challenges the world faces today. Organizations of all sizes have digitized business - critical functions, redefining what they can expect from their business applications. Customers are looking to unlock value while simplifying security and management. 
From infrastructure and data, to business applications and collaboration, we provide unique, differentiated value to customers. We are building a distributed computing fabric – across cloud and the edge – to help every organization build, run, and manage mission-critical workloads anywhere. In the next phase of innovation, artificial intelligence (“AI”) capabilities are rapidly advancing, fueled by data and knowledge of the world. We are enabling metaverse experiences at all layers of our stack, so customers can more effectively model, automate, simulate, and predict changes within their industrial environments, feel a greater sense of presence in the new world of hybrid work, and create custom immersive worlds to enable new opportunities for connection and experimentation.']"
4,How does Microsoft integrate AI principles and standards to drive economic change and ensure responsible development?,"Microsoft integrates AI principles and standards to drive economic change and ensure responsible development by releasing its Responsible AI Standard, which outlines 17 goals aligned with six AI principles. This includes tools and practices to support these principles, such as the Responsible AI Dashboard, which helps developers identify and mitigate issues before deployment. Additionally, Microsoft empowers its customers and partners to adopt these standards and advocates for policy change, ensuring that technology acts as a deflationary force in an inflationary economy. This approach aligns with Microsoft's mission to drive broad economic growth while committing to a more inclusive, equitable, sustainable, and trusted future.","['<1-hop>\n\n1 Dear shareholders, colleagues, customers, and partners: We are living through a period of historic economic, societal, and geopolitical change. The world in 2022 looks nothing like the world in 2019. As I write this, inflation is at a 40 -year high, supply chains are stretched, and the war in Ukraine is ongoing. At the same time, we are entering a technological era with the potential to power awesome advancements across every sector of our economy and society. As the world’s largest software company, this places us at a historic intersection of opportunity and responsibility to the world around us. Our mission to empower every person and every organization on the planet to achieve more has never been more urgent or more necessary. For all the uncertainty in the world, one thing is clear: People and organizations in every industry are increasingly looking to digital technology to overcome today’s challenges and emerge stronger. And no company is better positioned to help them than Microsoft. 
Every day this past fiscal year I have had the privilege to witness our customers use our platforms and tools to connect what technology can do with what the world needs it to do. Here are just a few examples: • Ferrovial, which builds and manages some of the world’s busiest airports and highways, is using our cloud infrastructure to build safer roads as it prepares for a future of autonomous transportation. • Peace Parks Foundation, a nonprofit helping protect natural ecosystems in Southern Africa, is using Microsoft Dynamics 365 and Power BI to secure essential funding, as well as our Azure AI and IoT solutions to help rangers scale their park maintenance and wildlife crime prevention work. • One of the world’s largest robotics companies, Kawasaki Heavy Industries, is using the breadth of our tools — from Azure IoT and HoloLens—to create an industrial metaverse solution that brings its distributed workforce together with its network of connected equipment to improve productivity and keep employees safe. • Globo, the biggest media and TV company in Brazil, is using Power Platform to empower its employees to build their own solutions for everything from booking sets to setting schedules. • And Ørsted, which produces a quarter of the world’s wind energy, is using the Microsoft Intelligent Data Platform to turn data from its offshore turbines into insights for predictive maintenance. Amid this dynamic environment, we delivered record results in fiscal year 2022: We reported $198 billion in revenue and $83 billion in operating income. And the Microsoft Cloud surpassed $100 billion in annualized revenue for the first time. OUR RESPONSIBILITY As a corporation, our purpose and actions must be aligned with addressing the world’s problems, not creating new ones. At our very core, we need to deliver innovation that helps drive broad economic growth. We, as a company, will do well when the world around us does well. 
That’s what I believe will lead to widespread human progress and ultimately improve the lives of everyone. There is no more powerful input than digital technology to drive the world’s economic output. This is the core thesis for our being as a company, but it’s not enough. As we drive global economic growth, we must also commit to creating a more inclusive, equitable, sustainable, and trusted future.', '<2-hop>\n\n4 Our commitment to responsibly develop and use technologies like AI is core to who we are. We put our commitment into practice, not only within Microsoft but by empowering our customers and partners to do the same and by advocating for policy change. We released our Responsible AI Standard, which outlines 17 goals aligned to our six AI principles and includes tools and practices to support them. And we share our open -source tools, including the new Responsible AI Dashboard, to help developers building AI technologies identify and mitigate issues before deployment. Finally, we provide clear reporting and information on how we run our business and how we work with customers and partners, delivering the transparency that is central to trust. Our annual Impact Summary shares more about our progress and learnings across these four commitments, and our Reports Hub provides detailed reports on our environmental data, our political activities, our workforce demographics, our human rights work, and more. We should all be proud of this work —and I am. But it’s easy to talk about what we’re doing well. As we look to the next year and beyond, we’ll continue to reflect on where the world needs us to do better. OUR OPPORTUNITY Now, let me turn to how we are positioned to capture the massive opportunities ahead. Over the past few years, I’ve written extensively about digital transformation, but now we need to go beyond that to deliver on what I call the “digital imperative.” Technology is a deflationary force in an inflationary economy. 
Every organization in every industry will need to infuse technology into every business process and function so they can do more with less. It’s what I believe will make the difference between organizations that thrive and those that get left behind. In the coming years, technology as a percentage of GDP will double from 5% to 10% and beyond, as technology moves from a back -office cost center to a defining feature of every product and service. But even more important will be technology’s influence on the other 90% of the world’s economy. From communications and commerce, to logistics, financial services, energy, healthcare, and entertainment, digital technology will power the entire global economy as every company becomes a software company in its own right. Across our customer solution areas, we are delivering powerful platforms, tools, and services that expand our opportunity to help every organization in every industry deliver on the digital imperative —with a business model that is trusted and always aligned with their success. Apps and infrastructure We are building Azure as the world’s computer, with more than 60 datacenter regions —more than any other provider — delivering faster access to cloud services while addressing critical data residency requirements. With Azure Arc, we’re bringing Azure anywhere, meeting customers where they are and enabling them to run apps across on-premises, edge, or multicloud environments. And we’re extending our infrastructure to the 5G network edge with Azure for Operators, introducing new solutions to help telecom operators deliver ultra-low-latency services closer to end users. As the digital and physical worlds come together, we’re also leading in the industrial metaverse. From smart factories, to smart buildings, to smart cities, we’re helping organizations use Azure IoT, Azure Digital Twins, and Microsoft Mesh to digitize people, places, and things, in order to visualize, simulate, and analyze any business process.']"
5,How Microsoft use data and AI for responsible AI development and what tools they provide?,"Microsoft is committed to responsibly developing and using AI technologies, which is a core part of their identity. They have released the Responsible AI Standard, which outlines 17 goals aligned with their six AI principles, and includes tools and practices to support these goals. To aid developers in identifying and mitigating issues before deployment, Microsoft shares open-source tools, including the new Responsible AI Dashboard. Additionally, Microsoft provides comprehensive data solutions, from best-in-class databases and analytics to data governance, to help organizations turn their data into predictive and analytical power. This approach ensures that AI development is responsible and aligned with their commitment to transparency and trust.","['<1-hop>\n\n4 Our commitment to responsibly develop and use technologies like AI is core to who we are. We put our commitment into practice, not only within Microsoft but by empowering our customers and partners to do the same and by advocating for policy change. We released our Responsible AI Standard, which outlines 17 goals aligned to our six AI principles and includes tools and practices to support them. And we share our open -source tools, including the new Responsible AI Dashboard, to help developers building AI technologies identify and mitigate issues before deployment. Finally, we provide clear reporting and information on how we run our business and how we work with customers and partners, delivering the transparency that is central to trust. Our annual Impact Summary shares more about our progress and learnings across these four commitments, and our Reports Hub provides detailed reports on our environmental data, our political activities, our workforce demographics, our human rights work, and more. We should all be proud of this work —and I am. But it’s easy to talk about what we’re doing well. 
As we look to the next year and beyond, we’ll continue to reflect on where the world needs us to do better. OUR OPPORTUNITY Now, let me turn to how we are positioned to capture the massive opportunities ahead. Over the past few years, I’ve written extensively about digital transformation, but now we need to go beyond that to deliver on what I call the “digital imperative.” Technology is a deflationary force in an inflationary economy. Every organization in every industry will need to infuse technology into every business process and function so they can do more with less. It’s what I believe will make the difference between organizations that thrive and those that get left behind. In the coming years, technology as a percentage of GDP will double from 5% to 10% and beyond, as technology moves from a back -office cost center to a defining feature of every product and service. But even more important will be technology’s influence on the other 90% of the world’s economy. From communications and commerce, to logistics, financial services, energy, healthcare, and entertainment, digital technology will power the entire global economy as every company becomes a software company in its own right. Across our customer solution areas, we are delivering powerful platforms, tools, and services that expand our opportunity to help every organization in every industry deliver on the digital imperative —with a business model that is trusted and always aligned with their success. Apps and infrastructure We are building Azure as the world’s computer, with more than 60 datacenter regions —more than any other provider — delivering faster access to cloud services while addressing critical data residency requirements. With Azure Arc, we’re bringing Azure anywhere, meeting customers where they are and enabling them to run apps across on-premises, edge, or multicloud environments. 
And we’re extending our infrastructure to the 5G network edge with Azure for Operators, introducing new solutions to help telecom operators deliver ultra-low-latency services closer to end users. As the digital and physical worlds come together, we’re also leading in the industrial metaverse. From smart factories, to smart buildings, to smart cities, we’re helping organizations use Azure IoT, Azure Digital Twins, and Microsoft Mesh to digitize people, places, and things, in order to visualize, simulate, and analyze any business process.', '<2-hop>\n\nData and AI From best-in-class databases and analytics to data governance, we have the most comprehensive data stack to help every organization turn its data into predictive and analytical power. With our new Microsoft Intelligent Data']"
6,"How is Microsoft promoting diversity and inclusion through its investment strategies and community initiatives, and what impact does this have on representation and inclusion within the company and beyond?","Microsoft is promoting diversity and inclusion by making a $150 million investment to strengthen inclusion and double the number of Black, African American, Hispanic, and Latinx leaders in the United States by 2025. The company collaborates with partners and communities to launch and scale projects such as the Justice Reform Initiative, expanding access to affordable broadband and devices, and increasing technology support for nonprofits serving Black and African American communities. Microsoft has made significant progress, reaching 90 percent of its goal to double the number of Black and African American leaders and 50 percent for Hispanic and Latinx leaders in the U.S. Additionally, Microsoft has increased its transaction volumes with Black- and African American-owned financial institutions and enriched its supplier pipeline, nearing its goal to spend $500 million with Black and African American-owned suppliers. These efforts are part of a broader commitment to leverage resources to accelerate diversity and inclusion across its ecosystem, holding the company accountable for driving change both within Microsoft and in the wider community.","['<1-hop>\n\n18 Total Rewards We develop dynamic, sustainable, market-driven, and strategic programs with the goal of providing a highly differentiated portfolio to attract, reward, and retain top talent and enable our employees to thrive. These programs reinforce our culture and values such as collaboration and growth mindset. Managers evaluate and recommend rewards based on, for example, how well we leverage the work of others and contribute to the success of our colleagues. We monitor pay equity and career progress across multiple dimensions. 
As part of our effort to promote a One Microsoft and inclusive culture, in fiscal year 2021 we expanded stock eligibility to all Microsoft employees as part of our annual rewards process. This includes all non -exempt and exempt employees and equivalents across the globe including business support professionals and datacenter and retail employees. In response to the Great Reshuffle, in fiscal year 2022 we announced a sizable investment in annual merit and annual stock award opportunity for all employees below senior executive levels. We also invested in base salary adjustments for our datacenter and retail hourly employees and hourly equivalents outside the U.S. These investments have supported retention and help to ensure that Microsoft remains an employer of choice. Pay Equity In our 2021 Diversity and Inclusion Report, we reported that all racial and ethnic minority employees in the U.S. combined earn $1.006 for every $1.000 earned by their white counterparts, that women in the U.S. earn $1.002 for every $1.000 earned by their counterparts in the U.S. who are men, and women in the U.S. plus our twelve other largest employee geographies representing 86.6% of our global population (Australia, Canada, China, France, Germany, India, Ireland, Israel, Japan, Romania, Singapore, and the United Kingdom) combined earn $1.001 for every $1.000 by men in these countries. Our intended result is a global performance and development approach that fosters our culture, and competitive compensation that ensures equitable pay by role while supporting pay for performance. Wellness and Safety Microsoft is committed to supporting our employees’ well -being and safety while they are at work and in their personal lives. We took a wide variety of measures to protect the health and well -being of our employees, suppliers, and customers during the COVID -19 pandemic and are now supporting employees in shifting to return to office and/or hybrid arrangements. 
We developed hybrid guidelines for managers and employees to support the transition and continue to identify ways we can support hybrid work scenarios through our employee listening systems. We have invested significantly in holistic wellbeing, and offer a differentiated benefits package which includes many physical, emotional, and financial wellness programs including counseling through the Microsoft CARES Employee Assistance Program, mental wellbeing support, flexible fitness benefits, savings and investment tools, adoption assistance, and back-up care for children and elders. Finally, our Occupational Health and Safety program helps ensure employees can stay safe while they are working. We continue to strive to support our Ukrainian employees and their dependents during the Ukraine crisis with emergency relocation assistance, emergency leave, and other benefits.', '<2-hop>\n\n16 • Increasing representation and strengthening inclusion: build on our momentum, adding a $150 million investment to strengthen inclusion and double the number of Black, African American, Hispanic, and Latinx leaders in the United States by 2025. Over the last year, we collaborated with partners and worked within neighborhoods and communities to launch and scale a number of projects and programs, including: working with 70 organizations in 145 communities on the Justice Reform Initiative, expanding access to affordable broadband and devices for Black and African American communities and key institutions that support them in major urban centers, expanding access to skills and education to support Black and African American students and adults to succeed in the digital economy, and increasing technology support for nonprofits that provide critical services to Black and African American communities. We have made meaningful progress on representation and inclusion at Microsoft. 
We are 90 percent of the way to our 2025 commitment to double the number of Black and African American people managers, senior individual contributors, and senior leaders in the U.S., and 50 percent of the way for Hispanic and Latinx people managers, senior individual contributors, and senior leaders in the U.S. We exceeded our goal on increasing the percentage of transaction volumes with Black - and African American -owned financial institutions and increased our deposits with Black - and African American-owned minority depository institutions, enabling increased funds into local communities. Additionally, we enriched our supplier pipeline, reaching more than 90 percent of our goal to spend $500 million with double the number of Black and African American -owned suppliers. We also increased the number of identified partners in the Black Partner Growth Initiative and continue to invest in the partner community through the Black Channel Partner Alliance by supporting events focused on business growth, accelerators, and mentorship. Progress does not undo the egregious injustices of the past or diminish those who continue to live with inequity. We are committed to leveraging our resources to help accelerate diversity and inclusion across our ecosystem and to hold ourselves accountable to accelerate change – for Microsoft, and beyond. Investing in Digital Skills The COVID-19 pandemic led to record unemployment, disrupting livelihoods of people around the world. After helping over 30 million people in 249 countries and territories with our global skills initiative, we introduced a new initiative to support a more skills -based labor market, with greater flexibility and accessible learning paths to develop the right skills needed for the most in-demand jobs. 
Our skills initiative brings together learning resources, certification opportunities, and job-seeker tools from LinkedIn, GitHub, and Microsoft Learn, and is built on data insights drawn from LinkedIn’s Economic Graph. We previously invested $20 million in key non -profit partnerships through Microsoft Philanthropies to help people from underserved communities that are often excluded by the digital economy. We also launched a national campaign with U.S. community colleges to help skill and recruit into the cybersecurity workforce 250,000 people by 2025, representing half of the country’s workforce shortage. To that end, we are making curriculum available free of charge to all of the nation’s public community colleges, providing training for new and existing faculty at 150 community colleges, and providing scholarships and supplemental resources to 25,000 students.']"
7,How does Microsoft integrate responsible AI development with data management to support business strategies?,"Microsoft integrates responsible AI development with data management by releasing their Responsible AI Standard, which outlines goals aligned with AI principles and includes tools and practices to support them. They provide open-source tools like the Responsible AI Dashboard to help developers identify and mitigate issues before deployment. Additionally, Microsoft offers a comprehensive data stack, including best-in-class databases and analytics, to help organizations turn data into predictive and analytical power, thereby supporting business strategies.","['<1-hop>\n\n4 Our commitment to responsibly develop and use technologies like AI is core to who we are. We put our commitment into practice, not only within Microsoft but by empowering our customers and partners to do the same and by advocating for policy change. We released our Responsible AI Standard, which outlines 17 goals aligned to our six AI principles and includes tools and practices to support them. And we share our open -source tools, including the new Responsible AI Dashboard, to help developers building AI technologies identify and mitigate issues before deployment. Finally, we provide clear reporting and information on how we run our business and how we work with customers and partners, delivering the transparency that is central to trust. Our annual Impact Summary shares more about our progress and learnings across these four commitments, and our Reports Hub provides detailed reports on our environmental data, our political activities, our workforce demographics, our human rights work, and more. We should all be proud of this work —and I am. But it’s easy to talk about what we’re doing well. As we look to the next year and beyond, we’ll continue to reflect on where the world needs us to do better. 
OUR OPPORTUNITY Now, let me turn to how we are positioned to capture the massive opportunities ahead. Over the past few years, I’ve written extensively about digital transformation, but now we need to go beyond that to deliver on what I call the “digital imperative.” Technology is a deflationary force in an inflationary economy. Every organization in every industry will need to infuse technology into every business process and function so they can do more with less. It’s what I believe will make the difference between organizations that thrive and those that get left behind. In the coming years, technology as a percentage of GDP will double from 5% to 10% and beyond, as technology moves from a back -office cost center to a defining feature of every product and service. But even more important will be technology’s influence on the other 90% of the world’s economy. From communications and commerce, to logistics, financial services, energy, healthcare, and entertainment, digital technology will power the entire global economy as every company becomes a software company in its own right. Across our customer solution areas, we are delivering powerful platforms, tools, and services that expand our opportunity to help every organization in every industry deliver on the digital imperative —with a business model that is trusted and always aligned with their success. Apps and infrastructure We are building Azure as the world’s computer, with more than 60 datacenter regions —more than any other provider — delivering faster access to cloud services while addressing critical data residency requirements. With Azure Arc, we’re bringing Azure anywhere, meeting customers where they are and enabling them to run apps across on-premises, edge, or multicloud environments. And we’re extending our infrastructure to the 5G network edge with Azure for Operators, introducing new solutions to help telecom operators deliver ultra-low-latency services closer to end users. 
As the digital and physical worlds come together, we’re also leading in the industrial metaverse. From smart factories, to smart buildings, to smart cities, we’re helping organizations use Azure IoT, Azure Digital Twins, and Microsoft Mesh to digitize people, places, and things, in order to visualize, simulate, and analyze any business process.', '<2-hop>\n\nData and AI From best-in-class databases and analytics to data governance, we have the most comprehensive data stack to help every organization turn its data into predictive and analytical power. With our new Microsoft Intelligent Data']"
8,"How revenue from SA and cloud services like Office 365 is recognized, and what role does judgment play in determining SSP for these services?","Revenue from Software Assurance (SA) is generally recognized ratably over the contract period as customers simultaneously consume and receive benefits, given that SA comprises distinct performance obligations that are satisfied over time. For cloud services like Office 365, which depend on a significant level of integration, interdependency, and interrelation between desktop applications and cloud services, revenue is recognized ratably over the period in which the cloud services are provided. Judgment plays a crucial role in determining the Standalone Selling Price (SSP) for each distinct performance obligation. This includes using a single amount to estimate SSP for items not sold separately, such as on-premises licenses sold with SA, and using a range of amounts to estimate SSP when products and services are sold separately. Judgment is also required to assess the pattern of delivery and the exercise pattern of certain benefits across the customer portfolio.","['<1-hop>\n\n53 Service and other revenue includes sales from cloud -based solutions that provide customers with software, services, platforms, and content such as Office 365, Azure, Dynamics 365, and Xbox; solution support; and consulting services. Service and other revenue also includes sales from online advertising and LinkedIn. Revenue Recognition Revenue is recognized upon transfer of control of promised products or services to customers in an amount that reflects the consideration we expect to receive in exchange for those products or services. We enter into contracts that can include various combinations of products and services, which are generally capable of being distinct and accounted for as separate performance obligations. 
Revenue is recognized net of allowances for returns and any taxes collected from customers, which are subsequently remitted to governmental authorities. Nature of Products and Services Licenses for on-premises software provide the customer with a right to use the software as it exists when made available to the customer. Customers may purchase perpetual licenses or subscribe to licenses, which provide customers with the same functionality and differ mainly in the duration over which the customer benefits from the software. Revenue from distinct on -premises licenses is recognized upfront at the point in time when the software is made available to the customer. In cases where we allocate revenue to software updates, primarily because the updates are provided at no additional charge, revenue is recognized as the updates are provided, which is generally ratably over the estimated life of the related device or license. Certain volume licensing programs, including Enterprise Agreements, include on -premises licenses combined with Software Assurance (“SA”). SA conveys rights to new software and upgrades released over the contract period and provides support, tools, and training to help customers deploy and use products more efficiently. On -premises licenses are considered distinct performance obligations when sold with SA. Revenue allocated to SA is generally recognized ratably over the contract period as customers simultaneously consume and receive benefits, given that SA comprises distinct performance obligations that are satisfied over time. Cloud services, which allow customers to use hosted software over the contract period without taking possession of the software, are provided on either a subscription or consumption basis. Revenue related to cloud services provided on a subscription basis is recognized ratably over the contract period. 
Revenue related to cloud services provided on a consumption basis, such as the amount of storage used in a period, is recognized based on the customer utilization of such resources. When cloud services require a significant level of integration and interdependency with software and the individual components are not considered distinct, all revenue is recognized over the period in which the cloud services are provided. Revenue from search advertising is recognized when the advertisement appears in the search results or when the action necessary to earn the revenue has been completed. Revenue from consulting services is recognized as services are provided. Our hardware is generally highly dependent on, and interrelated with, the underlying operating system and cannot function without the operating system. In these cases, the hardware and software license are accounted for as a single performance obligation and revenue is recognized at the point in time when ownership is transferred to resellers or directly to end customers through retail stores and online marketplaces. Refer to Note 19 – Segment Information and Geographic Data for further information, including revenue by significant product and service offering.', '<2-hop>\n\n54 accounted for separately versus together may require significant judgment. When a cloud-based service includes both on- premises software licenses and cloud services, judgment is required to determine whether the software license is considered distinct and accounted for separately, or not distinct and accounted for together with the cloud service and recognized over time. Certain cloud services, primarily Office 365, depend on a significant level of integration, interdependency, and interrelation between the desktop applications and cloud services, and are accounted for together as one performance obligation. Revenue from Office 365 is recognized ratably over the period in which the cloud services are provided. 
Judgment is required to determine the SSP for each distinct performance obligation. We use a single amount to estimate SSP for items that are not sold separately, including on -premises licenses sold with SA or software updates provided at no additional charge. We use a range of amounts to estimate SSP when we sell each of the products and services separately and need to determine whether there is a discount to be allocated based on the relative SSP of the various products and services. In instances where SSP is not directly observable, such as when we do not sell the product or service separately, we determine the SSP using information that may include market conditions and other observable inputs. We typically have more than one SSP for individual products and services due to the stratification of those products and services by customers and circumstances. In these instances, we may use information such as the size of the customer and geographic region in determining the SSP. Due to the various benefits from and the nature of our SA program, judgment is required to assess the pattern of delivery, including the exercise pattern of certain benefits across our portfolio of customers. Our products are generally sold with a right of return, we may provide other credits or incentives, and in certain instances we estimate customer usage of our products and services, which are accounted for as variable consideration when determining the amount of revenue to recognize. Returns and credits are estimated at contract inception and updated at the end of each reporting period if additional information becomes available. Changes to our estimated variable consideration were not material for the periods presented. Contract Balances and Other Receivables Timing of revenue recognition may differ from the timing of invoicing to customers. We record a receivable when revenue is recognized prior to invoicing, or unearned revenue when revenue is recognized subsequent to invoicing. 
For multi -year agreements, we generally invoice customers annually at the beginning of each annual coverage period. We record a receivable related to revenue recognized for multi -year on-premises licenses as we have an unconditional right to invoice and receive payment in the future related to those licenses. Unearned revenue comprises mainly unearned revenue related to volume licensing programs, which may include SA and cloud services. Unearned revenue is generally invoiced annually at the beginning of each contract period for multi -year agreements and recognized ratably over the coverage period. Unearned revenue also includes payments for consulting services to be performed in the future, LinkedIn subscriptions, Office 365 subscriptions, Xbox subscriptions, Windows post-delivery support, Dynamics business solutions, and other offerings for which we have been paid in advance and earn the revenue when we transfer control of the product or service. Refer to Note 13 –']"
9,"How does the integration of on-premises software licenses with cloud services, such as Office 365, affect revenue recognition and the determination of stand-alone selling price (SSP) in Microsoft's SA program?","The integration of on-premises software licenses with cloud services, such as Office 365, affects revenue recognition by requiring significant judgment to determine whether the software license is distinct and accounted for separately or not distinct and accounted for together with the cloud service. Office 365, which depends on a high level of integration, interdependency, and interrelation between desktop applications and cloud services, is accounted for as one performance obligation, with revenue recognized ratably over the period the cloud services are provided. The determination of the stand-alone selling price (SSP) for each distinct performance obligation also requires judgment. Microsoft uses a single amount to estimate SSP for items not sold separately, including on-premises licenses sold with SA or software updates provided at no additional charge. A range of amounts is used to estimate SSP when products and services are sold separately, and discounts are allocated based on the relative SSP of the various products and services. In cases where SSP is not directly observable, market conditions and other observable inputs are used to determine SSP. The SA program's benefits and nature require judgment to assess the delivery pattern, including the exercise pattern of certain benefits across the customer portfolio.","['<1-hop>\n\n54 accounted for separately versus together may require significant judgment. When a cloud-based service includes both on- premises software licenses and cloud services, judgment is required to determine whether the software license is considered distinct and accounted for separately, or not distinct and accounted for together with the cloud service and recognized over time. 
Certain cloud services, primarily Office 365, depend on a significant level of integration, interdependency, and interrelation between the desktop applications and cloud services, and are accounted for together as one performance obligation. Revenue from Office 365 is recognized ratably over the period in which the cloud services are provided. Judgment is required to determine the SSP for each distinct performance obligation. We use a single amount to estimate SSP for items that are not sold separately, including on -premises licenses sold with SA or software updates provided at no additional charge. We use a range of amounts to estimate SSP when we sell each of the products and services separately and need to determine whether there is a discount to be allocated based on the relative SSP of the various products and services. In instances where SSP is not directly observable, such as when we do not sell the product or service separately, we determine the SSP using information that may include market conditions and other observable inputs. We typically have more than one SSP for individual products and services due to the stratification of those products and services by customers and circumstances. In these instances, we may use information such as the size of the customer and geographic region in determining the SSP. Due to the various benefits from and the nature of our SA program, judgment is required to assess the pattern of delivery, including the exercise pattern of certain benefits across our portfolio of customers. Our products are generally sold with a right of return, we may provide other credits or incentives, and in certain instances we estimate customer usage of our products and services, which are accounted for as variable consideration when determining the amount of revenue to recognize. Returns and credits are estimated at contract inception and updated at the end of each reporting period if additional information becomes available. 
Changes to our estimated variable consideration were not material for the periods presented. Contract Balances and Other Receivables Timing of revenue recognition may differ from the timing of invoicing to customers. We record a receivable when revenue is recognized prior to invoicing, or unearned revenue when revenue is recognized subsequent to invoicing. For multi -year agreements, we generally invoice customers annually at the beginning of each annual coverage period. We record a receivable related to revenue recognized for multi -year on-premises licenses as we have an unconditional right to invoice and receive payment in the future related to those licenses. Unearned revenue comprises mainly unearned revenue related to volume licensing programs, which may include SA and cloud services. Unearned revenue is generally invoiced annually at the beginning of each contract period for multi -year agreements and recognized ratably over the coverage period. Unearned revenue also includes payments for consulting services to be performed in the future, LinkedIn subscriptions, Office 365 subscriptions, Xbox subscriptions, Windows post-delivery support, Dynamics business solutions, and other offerings for which we have been paid in advance and earn the revenue when we transfer control of the product or service. Refer to Note 13 –', '<2-hop>\n\n42 RECENT ACCOUNTING GUIDANCE Refer to Note 1 – Accounting Policies of the Notes to Financial Statements in our fiscal year 2022 Form 10 -K for further discussion. CRITICAL ACCOUNTING ESTIMATES Our consolidated financial statements and accompanying notes are prepared in accordance with GAAP. Preparing consolidated financial statements requires management to make estimates and assumptions that affect the reported amounts of assets, liabilities, revenue, and expenses. 
Critical accounting estimates are those estimates that involve a significant level of estimation uncertainty and could have a material impact on our financial condition or results of operations. We have critical accounting estimates in the areas of revenue recognition, impairment of investment securities, goodwill, research and development costs, legal and other contingencies, income taxes, and inventories. Revenue Recognition Our contracts with customers often include promises to transfer multiple products and services to a customer. Determining whether products and services are considered distinct performance obligations that should be accounted for separately versus together may require significant judgment. When a cloud -based service includes both on -premises software licenses and cloud services, judgment is required to determine whether the software license is considered distinct and accounted for separately, or not distinct and accounted for together with the cloud service and recognized over time. Certain cloud services, primarily Office 365, depend on a significant level of integration, interdependency, and interrelation between the desktop applications and cloud services, and are accounted for together as one performance obligation. Revenue from Office 365 is recognized ratably over the period in which the cloud services are provided. Judgment is required to determine the stand -alone selling price (“SSP”) for each distinct performance obligation. We use a single amount to estimate SSP for items that are not sold separately, including on -premises licenses sold with SA or software updates provided at no additional charge. We use a range of amounts to estimate SSP when we sell each of the products and services separately and need to determine whether there is a discount to be allocated based on the relative SSP of the various products and services. 
In instances where SSP is not directly observable, such as when we do not sell the product or service separately, we determine the SSP using information that may include market conditions and other observable inputs. We typically have more than one SSP for individual products and services due to the stratification of those products and services by customers and circumstances. In these instances, we may use information such as the size of the customer and geographic region in determining the SSP. Due to the various benefits from and the nature of our SA program, judgment is required to assess the pattern of delivery, including the exercise pattern of certain benefits across our portfolio of customers. Our products are generally sold with a right of return, we may provide other credits or incentives, and in certain instances we estimate customer usage of our products and services, which are accounted for as variable consideration when determining the amount of revenue to recognize. Returns and credits are estimated at contract inception and updated at the end of each reporting period if additional information becomes available. Changes to our estimated variable consideration were not material for the periods presented.']"


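- The `reference_contexts` column shows the passages that were used to generate each reference answer.
- These contexts would benefit from better cleaning: they still carry PDF extraction artifacts such as leading page numbers (`18`, `54`) and detached hyphens (`non -exempt`, `multi -year`).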
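
- A minimal sketch of what such cleaning could look like is shown here. The helper name `clean_context` and its three regex rules are assumptions for illustration, not part of RAGAS. The rules target only the artifacts visible in the output above: the `<N-hop>` prefix, a page number at the start of a chunk, and hyphens detached by PDF extraction. The page-number rule is a heuristic and would also strip a genuine leading number, so check it against your own documents.

```python
import re

# Hypothetical cleaning rules, derived from the artifacts seen in the output above.
HOP_MARKER = re.compile(r"^\s*<\d+-hop>\s*")           # "<1-hop>\n\n" prefix
LEADING_PAGE_NUMBER = re.compile(r"^\d{1,3}\s+(?=\D)")  # "18 Total Rewards ..."
DETACHED_HYPHEN = re.compile(r"(?<=\w) -(?=\w)")        # "non -exempt", "COVID -19"


def clean_context(text: str) -> str:
    """Return one reference context with common PDF extraction debris removed."""
    text = HOP_MARKER.sub("", text)
    text = LEADING_PAGE_NUMBER.sub("", text)
    text = DETACHED_HYPHEN.sub("-", text)
    # Collapse line breaks and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()


sample = (
    "<1-hop>\n\n18 Total Rewards This includes all non -exempt and exempt "
    "employees during the COVID -19 pandemic. "
)
print(clean_context(sample))
```

- To clean a whole row, apply the helper to each string in its `reference_contexts` list, for example `[clean_context(c) for c in contexts]`. Whether to drop the `<N-hop>` marker depends on whether later steps rely on it.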