### Project Objective

This project develops an **Intelligent Knowledge Chatbot** that can answer user queries by combining internal knowledge with external information from Wikipedia. It applies **semantic search using embeddings**, integrates **summarization techniques** (both extractive and abstractive), and provides **text-to-speech (TTS)** responses for interactive use.  

The chatbot is designed to:  
- Preprocess and normalize input text for accurate retrieval  
- Retrieve semantically relevant answers from a growing knowledge base  
- Expand its knowledge by dynamically adding Wikipedia pages or custom text  
- Summarize long texts into concise, human-readable answers  
- Convert generated responses into **audio output** for an engaging experience  

#### Outline
- Step 1: Data preprocessing & text normalization  
- Step 2: Embedding knowledge corpus with Sentence-Transformers  
- Step 3: Semantic similarity search for user queries  
- Step 4: External knowledge integration using Wikipedia API  
- Step 5: Summarization (Sumy LexRank & Hugging Face BART)  
- Step 6: Text-to-Speech conversion using gTTS  
- Step 7: Interactive chatbot response system  

In [21]:
# pip install nltk wikipedia sentence-transformers scikit-learn wikipedia-api

In [23]:
# nltk.download('punkt_tab')

In [None]:
# Import the Natural Language Toolkit (NLTK) for text preprocessing tasks
import nltk

# Import NumPy for numerical computations and array manipulations
import numpy as np

# Import SentenceTransformer from the sentence-transformers library
# This is used to generate dense vector embeddings of sentences for semantic similarity tasks
from sentence_transformers import SentenceTransformer

# Import cosine_similarity from scikit-learn to measure similarity between embeddings
from sklearn.metrics.pairwise import cosine_similarity

# Import wikipediaapi for fetching and parsing content from Wikipedia pages
import wikipediaapi

# Import WordNetLemmatizer from NLTK for reducing words to their base/root form (lemmatization)
from nltk.stem import WordNetLemmatizer

# Import stopwords list from NLTK to filter out common words (e.g., 'the', 'is', 'and')
from nltk.corpus import stopwords

# Import re (regular expressions) for pattern matching and text cleaning
import re

This block imports all the core libraries required for building the chatbot:  

- **NLTK (Natural Language Toolkit):** Provides tools for preprocessing text, such as stopwords and lemmatization.  
- **NumPy:** Handles numerical operations, especially for managing embeddings and similarity calculations.  
- **SentenceTransformer:** Generates dense vector embeddings of sentences, enabling semantic search.  
- **cosine_similarity (scikit-learn):** Measures how similar two embeddings are, which is crucial for matching queries with knowledge base entries.  
- **wikipediaapi:** Fetches and parses live content from Wikipedia to enrich the chatbot’s knowledge base.  
- **WordNetLemmatizer & stopwords (NLTK):** Used to clean and normalize text, ensuring better retrieval accuracy.  
- **re (Regular Expressions):** Helps clean raw text (e.g., removing links, punctuation, or unwanted characters).  

👉 In short: This block sets up all the external tools that power **text preprocessing, semantic search, and knowledge expansion** for the chatbot.  

In [None]:
# Initialize the WordNet Lemmatizer from NLTK
# This will be used to reduce words to their base form (e.g., "running" -> "run")
lemmatizer = WordNetLemmatizer()

# Load the pre-trained SentenceTransformer model
# 'all-MiniLM-L6-v2' is a lightweight SBERT model (384-dimensional embeddings)
# It is efficient and provides a good balance between speed and accuracy
model = SentenceTransformer('all-MiniLM-L6-v2')

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


### Lemmatizer & Embedding Model Setup  

- **WordNetLemmatizer:**  
  - This tool reduces words to their **base/root form** (e.g., “running” → “run”).  
  - It ensures consistency in the text corpus and queries, improving retrieval accuracy.  

- **SentenceTransformer Model (`all-MiniLM-L6-v2`):**  
  - A pre-trained **Sentence-BERT** model that produces **dense 384-dimensional embeddings**.  
  - Chosen for being **lightweight** and **fast**, while still offering strong performance in capturing semantic meaning.  
  - These embeddings allow the chatbot to **measure similarity** between user queries and knowledge base texts using cosine similarity.  

👉 This block sets up the **text normalization tool** (lemmatizer) and the **core semantic engine** (SBERT model) that powers the chatbot’s understanding of language.  

In [None]:
# Default Knowledge to the chatbot
# This list represents the initial "knowledge base" (corpus) of the chatbot.
# Each element is a text string (sentence) that the chatbot can use for retrieval.

corpus = [
    "Hi there! How can I help you?",  # General greeting
    "Hello! I am a chatbot designed to assist you.",  # Introduction
    "What is your name?",  # Sample user query
    "I am called LB-Bot.",  # Chatbot identity
    "I was created by LB Students as part of an NLP project.",  # Project attribution
    "I currently run on Python using the all-MiniLM-L6-v2 model for semantic understanding.",  # Technical details
    "What is NLP?",  # Knowledge query
    "NLP stands for Natural Language Processing, which focuses on enabling computers to understand, interpret, and generate human language.",  # NLP explanation
    "Goodbye!",  # Closing response
    "See you later. It was nice talking to you."  # Closing variation
]

### Initial Knowledge Base (Corpus)

- This block defines the **default knowledge** available to the chatbot before any external information (like Wikipedia) is added.  
- The `corpus` is a list of text strings, each representing a **sentence or response** that the chatbot can use.  

Key points about this corpus:  
- Includes **basic conversational responses**: greetings, introductions, and closing phrases.  
- Provides **identity and project context**: the chatbot introduces itself as *LB-Bot* built for an NLP project.  
- Contains **domain knowledge**: a definition of NLP for handling basic technical queries.  
- Acts as the **starting knowledge base**, which can later be expanded dynamically by adding custom text or Wikipedia content.  

👉 This ensures the chatbot is **functional right away**, even without external data, while leaving room for future enrichment.  

This initial corpus serves as a **starter knowledge base**, ensuring the chatbot can respond immediately with greetings, identity, and basic NLP explanations. Importantly, the design allows the knowledge base to **scale dynamically** by adding new text or Wikipedia content later. This highlights the chatbot’s adaptability for real-world applications across different domains.  


In [None]:
# Convert the default chatbot knowledge base (corpus) into embeddings
# The model.encode() method transforms each sentence into a dense vector
# Shape of output: (number_of_sentences, embedding_dimension)
# For 'all-MiniLM-L6-v2', embedding_dimension = 384
corpus_embeddings = model.encode(corpus)

### Embedding the Knowledge Base  

- This step converts the **text corpus** into **numerical embeddings** using the SentenceTransformer model.  
- `model.encode(corpus)` produces a dense vector representation for each sentence in the corpus.  
- Shape of the output:  
  - **Rows = number of sentences** in the corpus  
  - **Columns = embedding dimension (384 for all-MiniLM-L6-v2)**  
- These embeddings capture the **semantic meaning** of sentences, allowing the chatbot to later compare user queries with the corpus using **cosine similarity**.  

👉 By embedding the corpus now, the chatbot builds a semantic search index, enabling **intelligent retrieval of the most relevant responses** instead of keyword matching.  

In [None]:
# Create a Wikipedia API object to fetch Wikipedia pages
# Parameters:
# - language='en' → fetch English Wikipedia articles
# - user_agent='LB-Bot' → identifies the bot making requests (good practice to avoid blocking)
wiki_wiki = wikipediaapi.Wikipedia(language='en', user_agent='LB-Bot')

### Wikipedia API Setup  

- This block initializes a **Wikipedia API client** that lets the chatbot fetch and parse articles directly from Wikipedia.  
- Parameters used:  
  - **language='en'** → restricts results to English Wikipedia.  
  - **user_agent='LB-Bot'** → identifies the bot in requests (important for transparency and to avoid being blocked).  
- Once created, this object (`wiki_wiki`) can retrieve **page titles, summaries, and full content**, which can later be added to the chatbot’s knowledge base.  

👉 This enables the chatbot to **expand beyond its default corpus**, pulling in **real-time, authoritative knowledge** from Wikipedia.  

In [None]:
"""
This is a docstring-style pseudocode description of how the chatbot response works:

def chatbot_response(user_input):

  1. Encode the user input into an embedding vector
  2. Compare it with all corpus embeddings using cosine_similarity()
  3. Identify the index of the most similar corpus sentence using np.argmax()

  if similarity < 0.5:
    return("I'm sorry, I don't have info on that.")
  else:
    return corpus[index]
"""

# Actual implementation of semantic chatbot response
def chatbot_response_semantic(user_input):
    # Step 1: Encode the user input as an embedding vector (shape: (1, 384))
    user_embedding = model.encode([user_input])

    # Step 2: Compute cosine similarity between user embedding and all corpus embeddings
    # Result shape: (1, len(corpus)) → similarity score for each corpus entry
    similarity = cosine_similarity(user_embedding, corpus_embeddings)

    # Step 3: Get the index of the most similar sentence in the corpus
    index = np.argmax(similarity)

    # Step 4: Apply a confidence threshold
    # If the maximum similarity score is below 0.2, respond with fallback message
    if similarity[0][index] < 0.2:
        return("I'm sorry, I don't have info on that.")
    else:
        # Otherwise, return the most relevant sentence from the corpus
        return corpus[index]

### Semantic Chatbot Response Function  

This block defines the **core logic** of how the chatbot answers a user query:  

1. **Encode User Input**  
   - The query is converted into an embedding vector (`user_embedding`) using the same SentenceTransformer model.  
   - Ensures both the user input and the corpus are represented in the same semantic space.  

2. **Cosine Similarity Search**  
   - Compares the user embedding with all stored corpus embeddings.  
   - Produces a similarity score for each sentence in the corpus (range: 0 → no relation, 1 → identical).  

3. **Select Best Match**  
   - Identifies the index of the most similar sentence using `np.argmax()`.  

4. **Confidence Threshold**  
   - If the similarity score is **below 0.2**, the chatbot returns a fallback response:  
     *“I'm sorry, I don't have info on that.”*  
   - Otherwise, it retrieves and returns the most relevant response from the corpus.  

👉 This design ensures the chatbot avoids **random or irrelevant replies** when confidence is low, a practice that mirrors **real-world chatbot safety mechanisms**.  

In [None]:
# Function to add user-provided custom text into the chatbot's knowledge base
def add_custom_text(text):
    global corpus, corpus_embeddings  # Access and modify the global corpus and embeddings

    # Step 1: Break the input text into individual sentences
    new_sentences = nltk.sent_tokenize(text)

    # Step 2: Extend the existing corpus with these new sentences
    corpus.extend(new_sentences)

    # Step 3: Re-encode the updated corpus into embeddings
    # This ensures the chatbot can semantically match queries with the new content
    corpus_embeddings = model.encode(corpus)


### Expanding Knowledge Base with Custom Text  

This function allows the chatbot to **learn dynamically** by adding new information provided by the user:  

1. **Sentence Splitting**  
   - The input text is broken down into individual sentences using `nltk.sent_tokenize()`.  
   - This ensures finer granularity so each sentence can be retrieved independently.  

2. **Corpus Update**  
   - The new sentences are appended to the existing corpus, effectively expanding the chatbot’s knowledge base.  

3. **Re-embedding**  
   - The updated corpus is re-encoded into embeddings.  
   - This step is essential so the new content is integrated into future semantic searches.  

👉 With this feature, the chatbot becomes **incrementally trainable**, adapting to user input and evolving beyond its original dataset. This mirrors **real-world knowledge management systems**, where models continuously incorporate fresh information.  

In [None]:
# Function to add an entire Wikipedia page to the chatbot's knowledge base
def add_wikipedia_fullpage(topic):
    global corpus, corpus_embeddings  # Access the global corpus and embeddings

    # Step 1: Fetch the Wikipedia page object for the given topic
    page = wiki_wiki.page(topic)

    # Step 2: Extract the full text of the Wikipedia page
    full_text = page.text

    # Step 3: Split the full text into individual sentences
    new_sentences = nltk.sent_tokenize(full_text)

    # Step 4: Extend the chatbot's knowledge base with these sentences
    corpus.extend(new_sentences)

    # Step 5: Recompute embeddings for the entire updated corpus
    corpus_embeddings = model.encode(corpus)


### Expanding Knowledge Base with Wikipedia Pages  

This function allows the chatbot to **automatically ingest authoritative knowledge** from Wikipedia:  

1. **Fetch Page**  
   - Retrieves the Wikipedia page for the specified topic using the initialized API (`wiki_wiki.page(topic)`).  

2. **Extract Content**  
   - Grabs the **entire text** of the page, ensuring comprehensive coverage.  

3. **Sentence Splitting**  
   - Breaks the text into individual sentences with `nltk.sent_tokenize()`.  
   - This enables precise retrieval of relevant snippets rather than long paragraphs.  

4. **Update Corpus**  
   - The extracted sentences are added to the existing knowledge base.  

5. **Re-embedding**  
   - The expanded corpus is re-encoded so the new content can be included in semantic searches.  

👉 This step transforms the chatbot from a **static assistant** into a **dynamic, knowledge-augmented system**, capable of pulling in up-to-date and reliable information directly from Wikipedia.  

In [None]:
# Knowledge Ingestion

"""
High-level design notes (pseudocode / plan):

chat:

corpus
added text
added wikipedia page

extend:
to import pdf, word, text
build a front end (using steamlit, gradio etc)

text to speeech - convert text to speech
======

1. add custom text
2. add wiki topic
3. start chatting

if option 1: input("enter text"), add_custom_text(text)
if option 2: input("enter wiki article topic"), add_wikipedia_fullpage(topic)
if option 3: start chatting, type bye to exit


chat loop:

input:
response:
"""

# Initial greeting and menu rendering for the console interface
print("LB-Bot: Hello, I am LB-Bot. I'm here to help.")
print("=================================================================")
print("\nOptions: ")
print("1. Add custom text, simply paste it here.")
print("2. Add Wikipedia page and I'll fetch entire page on that topic")
print("3. Start chatting with existing corpus")

# Persistent menu loop; breaks only when the user selects option '3'
while True:
    option = input("Enter your choice (1/2/3): ")

    if option == '1':
        # Ingestion path 1: take arbitrary text and expand the knowledge base
        custom_text = input("Enter custom text to add: ")
        add_custom_text(custom_text)
        print("Custom text added successfully.")

    elif option == '2':
        # Ingestion path 2: fetch and ingest a full Wikipedia page by title
        topic = input("Enter Wikipedia topic to add: ")
        add_wikipedia_fullpage(topic)
        print("Wikipedia page added successfully.")

    elif option == '3':
        # Exit the ingestion menu and hand off control to the chat loop
        print("\nStarting chat....(type bye to exit)\n")
        break

    else:
        # Guard against invalid inputs
        print("Invalid choice. Please enter 1, 2, or 3.")

LB-Bot: Hello, I am LB-Bot. I'm here to help.

Options: 
1. Add custom text, simply paste it here.
2. Add Wikipedia page and I'll fetch entire page on that topic
3. Start chatting with existing corpus
Enter your choice (1/2/3): 2
Enter Wikipedia topic to add: Artificial Intelligence
Wikipedia page added successfully.
Enter your choice (1/2/3): Large Language Model
Invalid choice. Please enter 1, 2, or 3.
Enter your choice (1/2/3): LLM
Invalid choice. Please enter 1, 2, or 3.
Enter your choice (1/2/3): Generative AI
Invalid choice. Please enter 1, 2, or 3.
Enter your choice (1/2/3): 2
Enter Wikipedia topic to add: Large Language Model
Wikipedia page added successfully.
Enter your choice (1/2/3): bye
Invalid choice. Please enter 1, 2, or 3.
Enter your choice (1/2/3): 3

Starting chat....(type bye to exit)



### Console-Based Knowledge Ingestion Menu

- Renders a **simple CLI menu** to (1) add custom text, (2) ingest a full Wikipedia page, or (3) start chatting.
- Runs a **persistent input loop** until option `3` is chosen; invalid inputs are handled with a clear prompt.
- On selection:
  - **1:** calls `add_custom_text(custom_text)` to grow the corpus sentence-by-sentence.
  - **2:** calls `add_wikipedia_fullpage(topic)` to fetch, split, and embed a whole article.
  - **3:** exits the ingestion loop and transfers control to the chat phase.
- This structure keeps the workflow **modular**: ingestion first → interaction later.

This makes the project easy to **demo live**: you can quickly personalize the bot to a domain (paste docs or add a Wikipedia topic) and then start a conversation on the enriched knowledge.

In [23]:
#Chat loop

while True:
  user_input = input("You: ")
  if user_input.lower() == 'bye':
    print("LB-Bot: Goodbye! It was nice talking to you.")
    break

  response = chatbot_response_semantic(user_input)
  print("LB-Bot:", response)

You: Hi, whats your name?
LB-Bot: What is your name?
You: Define Artificial Intelligence.
LB-Bot: Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making.
You: What are the limitations of AI?
LB-Bot: AI agents operate within the constraints of their programming, available computational resources, and hardware limitations.
You: History of AI?
LB-Bot: History of AI
Other sources
Further reading
External links

"Artificial Intelligence".
You: Father of artificial intelligence?
LB-Bot: History of AI
Other sources
Further reading
External links

"Artificial Intelligence".
You: What are topics of Large Language Model?
LB-Bot: "Challenges and Applications of Large Language Models".
You: What is large language model?
LB-Bot: The tendency towards larger models is visible in the list of large language models.
You: bye
LB-Bot: Goodbye! It was

### Interactive Chat Loop  

- This block implements the **main dialogue system** between the user and LB-Bot.  
- The loop continues until the user explicitly types **"bye"** (case-insensitive).  
- On each turn:  
  1. **User Input:** Captured through `input("You: ")`.  
  2. **Exit Condition:** If the user says “bye”, the chatbot responds with a farewell and breaks the loop.  
  3. **Response Generation:** Otherwise, the query is passed to `chatbot_response_semantic()`, which retrieves the most semantically relevant answer.  
  4. **Output:** The response is printed back with the chatbot’s label (`LB-Bot`).  

👉 This creates a **continuous conversational flow** where the chatbot can leverage its evolving knowledge base, enriched through both custom text and Wikipedia ingestion.  

In [27]:
# pip install lxml_html_clean

Collecting lxml_html_clean
  Downloading lxml_html_clean-0.4.2-py3-none-any.whl.metadata (2.4 kB)
Downloading lxml_html_clean-0.4.2-py3-none-any.whl (14 kB)
Installing collected packages: lxml_html_clean
Successfully installed lxml_html_clean-0.4.2


In [34]:
text = """Artificial intelligence

Article
Talk
Read
View source
View history

Tools
Appearance hide
Text

Small

Standard

Large
Width

Standard

Wide
Color (beta)

Automatic

Light

Dark
Page semi-protected
From Wikipedia, the free encyclopedia
"AI" redirects here. For other uses, see AI (disambiguation) and Artificial intelligence (disambiguation).
Part of a series on
Artificial intelligence (AI)

Major goals
Approaches
Applications
Philosophy
History
Glossary
vte
Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals.[1]

High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., language models and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore."[2][3]

Various subfields of AI research are centered around particular goals and the use of particular tools. The traditional goals of AI research include learning, reasoning, knowledge representation, planning, natural language processing, perception, and support for robotics.[a] To reach these goals, AI researchers have adapted and integrated a wide range of techniques, including search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics.[b] AI also draws upon psychology, linguistics, philosophy, neuroscience, and other fields.[4] Some companies, such as OpenAI, Google DeepMind and Meta,[5] aim to create artificial general intelligence (AGI)—AI that can complete virtually any cognitive task at least as well as a human.

Artificial intelligence was founded as an academic discipline in 1956,[6] and the field went through multiple cycles of optimism throughout its history,[7][8] followed by periods of disappointment and loss of funding, known as AI winters.[9][10] Funding and interest vastly increased after 2012 when graphics processing units started being used to accelerate neural networks and deep learning outperformed previous AI techniques.[11] This growth accelerated further after 2017 with the transformer architecture.[12] In the 2020s, an ongoing period of rapid progress in advanced generative AI became known as the AI boom. Generative AI's ability to create and modify content has led to several unintended consequences and harms, which has raised ethical concerns about AI's long-term effects and potential existential risks, prompting discussions about regulatory policies to ensure the safety and benefits of the technology.

Goals
The general problem of simulating (or creating) intelligence has been broken into subproblems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention and cover the scope of AI research.[a]

Reasoning and problem-solving
Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions.[13] By the late 1980s and 1990s, methods were developed for dealing with uncertain or incomplete information, employing concepts from probability and economics.[14]

Many of these algorithms are insufficient for solving large reasoning problems because they experience a "combinatorial explosion": They become exponentially slower as the problems grow.[15] Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments.[16] Accurate and efficient reasoning is an unsolved problem.

Knowledge representation

An ontology represents knowledge as a set of concepts within a domain and the relationships between those concepts.
Knowledge representation and knowledge engineering[17] allow AI programs to answer questions intelligently and make deductions about real-world facts. Formal knowledge representations are used in content-based indexing and retrieval,[18] scene interpretation,[19] clinical decision support,[20] knowledge discovery (mining "interesting" and actionable inferences from large databases),[21] and other areas.[22]

A knowledge base is a body of knowledge represented in a form that can be used by a program. An ontology is the set of objects, relations, concepts, and properties used by a particular domain of knowledge.[23] Knowledge bases need to represent things such as objects, properties, categories, and relations between objects;[24] situations, events, states, and time;[25] causes and effects;[26] knowledge about knowledge (what we know about what other people know);[27] default reasoning (things that humans assume are true until they are told differently and will remain true even when other facts are changing);[28] and many other aspects and domains of knowledge.

Among the most difficult problems in knowledge representation are the breadth of commonsense knowledge (the set of atomic facts that the average person knows is enormous);[29] and the sub-symbolic form of most commonsense knowledge (much of what people know is not represented as "facts" or "statements" that they could express verbally).[16] There is also the difficulty of knowledge acquisition, the problem of obtaining knowledge for AI applications.[c]

Planning and decision-making
An "agent" is anything that perceives and takes actions in the world. A rational agent has goals or preferences and takes actions to make them happen.[d][32] In automated planning, the agent has a specific goal.[33] In automated decision-making, the agent has preferences—there are some situations it would prefer to be in, and some situations it is trying to avoid. The decision-making agent assigns a number to each situation (called the "utility") that measures how much the agent prefers it. For each possible action, it can calculate the "expected utility": the utility of all possible outcomes of the action, weighted by the probability that the outcome will occur. It can then choose the action with the maximum expected utility.[34]

In classical planning, the agent knows exactly what the effect of any action will be.[35] In most real-world problems, however, the agent may not be certain about the situation they are in (it is "unknown" or "unobservable") and it may not know for certain what will happen after each possible action (it is not "deterministic"). It must choose an action by making a probabilistic guess and then reassess the situation to see if the action worked.[36]

In some problems, the agent's preferences may be uncertain, especially if there are other agents or humans involved. These can be learned (e.g., with inverse reinforcement learning), or the agent can seek information to improve its preferences.[37] Information value theory can be used to weigh the value of exploratory or experimental actions.[38] The space of possible future actions and situations is typically intractably large, so the agents must take actions and evaluate situations while being uncertain of what the outcome will be.

A Markov decision process has a transition model that describes the probability that a particular action will change the state in a particular way and a reward function that supplies the utility of each state and the cost of each action. A policy associates a decision with each possible state. The policy could be calculated (e.g., by iteration), be heuristic, or it can be learned.[39]

Game theory describes the rational behavior of multiple interacting agents and is used in AI programs that make decisions that involve other agents.[40]

Learning
Machine learning is the study of programs that can improve their performance on a given task automatically.[41] It has been a part of AI from the beginning.[e]


In supervised learning, the training data is labelled with the expected answers, while in unsupervised learning, the model identifies patterns or structures in unlabelled data.
There are several kinds of machine learning. Unsupervised learning analyzes a stream of data and finds patterns and makes predictions without any other guidance.[44] Supervised learning requires labeling the training data with the expected answers, and comes in two main varieties: classification (where the program must learn to predict what category the input belongs in) and regression (where the program must deduce a numeric function based on numeric input).[45]

In reinforcement learning, the agent is rewarded for good responses and punished for bad ones. The agent learns to choose responses that are classified as "good".[46] Transfer learning is when the knowledge gained from one problem is applied to a new problem.[47] Deep learning is a type of machine learning that runs inputs through biologically inspired artificial neural networks for all of these types of learning.[48]

Computational learning theory can assess learners by computational complexity, by sample complexity (how much data is required), or by other notions of optimization.[49]

Natural language processing
Natural language processing (NLP) allows programs to read, write and communicate in human languages.[50] Specific problems include speech recognition, speech synthesis, machine translation, information extraction, information retrieval and question answering.[51]

Early work, based on Noam Chomsky's generative grammar and semantic networks, had difficulty with word-sense disambiguation[f] unless restricted to small domains called "micro-worlds" (due to the common sense knowledge problem[29]). Margaret Masterman believed that it was meaning and not grammar that was the key to understanding languages, and that thesauri and not dictionaries should be the basis of computational language structure.

Modern deep learning techniques for NLP include word embedding (representing words, typically as vectors encoding their meaning),[52] transformers (a deep learning architecture using an attention mechanism),[53] and others.[54] In 2019, generative pre-trained transformer (or "GPT") language models began to generate coherent text,[55][56] and by 2023, these models were able to get human-level scores on the bar exam, SAT test, GRE test, and many other real-world applications.[57]

Perception
Machine perception is the ability to use input from sensors (such as cameras, microphones, wireless signals, active lidar, sonar, radar, and tactile sensors) to deduce aspects of the world. Computer vision is the ability to analyze visual input.[58]

The field includes speech recognition,[59] image classification,[60] facial recognition, object recognition,[61] object tracking,[62] and robotic perception.[63]

Social intelligence

Kismet, a robot head which was made in the 1990s; it is a machine that can recognize and simulate emotions.[64]
Affective computing is a field that comprises systems that recognize, interpret, process, or simulate human feeling, emotion, and mood.[65] For example, some virtual assistants are programmed to speak conversationally or even to banter humorously; it makes them appear more sensitive to the emotional dynamics of human interaction, or to otherwise facilitate human–computer interaction.

However, this tends to give naïve users an unrealistic conception of the intelligence of existing computer agents.[66] Moderate successes related to affective computing include textual sentiment analysis and, more recently, multimodal sentiment analysis, wherein AI classifies the effects displayed by a videotaped subject.[67]

General intelligence
A machine with artificial general intelligence would be able to solve a wide variety of problems with breadth and versatility similar to human intelligence.[68]

Techniques
AI research uses a wide variety of techniques to accomplish the goals above.[b]

Search and optimization
AI can solve many problems by intelligently searching through many possible solutions.[69] There are two very different kinds of search used in AI: state space search and local search.

State space search
State space search searches through a tree of possible states to try to find a goal state.[70] For example, planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-ends analysis.[71]

Simple exhaustive searches[72] are rarely sufficient for most real-world problems: the search space (the number of places to search) quickly grows to astronomical numbers. The result is a search that is too slow or never completes.[15] "Heuristics" or "rules of thumb" can help prioritize choices that are more likely to reach a goal.[73]

Adversarial search is used for game-playing programs, such as chess or Go. It searches through a tree of possible moves and countermoves, looking for a winning position.[74]

Local search

Illustration of gradient descent for 3 different starting points; two parameters (represented by the plan coordinates) are adjusted in order to minimize the loss function (the height)
Local search uses mathematical optimization to find a solution to a problem. It begins with some form of guess and refines it incrementally.[75]

Gradient descent is a type of local search that optimizes a set of numerical parameters by incrementally adjusting them to minimize a loss function. Variants of gradient descent are commonly used to train neural networks,[76] through the backpropagation algorithm.

Another type of local search is evolutionary computation, which aims to iteratively improve a set of candidate solutions by "mutating" and "recombining" them, selecting only the fittest to survive each generation.[77]

Distributed search processes can coordinate via swarm intelligence algorithms. Two popular swarm algorithms used in search are particle swarm optimization (inspired by bird flocking) and ant colony optimization (inspired by ant trails).[78]

Logic
Formal logic is used for reasoning and knowledge representation.[79] Formal logic comes in two main forms: propositional logic (which operates on statements that are true or false and uses logical connectives such as "and", "or", "not" and "implies")[80] and predicate logic (which also operates on objects, predicates and relations and uses quantifiers such as "Every X is a Y" and "There are some Xs that are Ys").[81]

Deductive reasoning in logic is the process of proving a new statement (conclusion) from other statements that are given and assumed to be true (the premises).[82] Proofs can be structured as proof trees, in which nodes are labelled by sentences, and children nodes are connected to parent nodes by inference rules.

Given a problem and a set of premises, problem-solving reduces to searching for a proof tree whose root node is labelled by a solution of the problem and whose leaf nodes are labelled by premises or axioms. In the case of Horn clauses, problem-solving search can be performed by reasoning forwards from the premises or backwards from the problem.[83] In the more general case of the clausal form of first-order logic, resolution is a single, axiom-free rule of inference, in which a problem is solved by proving a contradiction from premises that include the negation of the problem to be solved.[84]

Inference in both Horn clause logic and first-order logic is undecidable, and therefore intractable. However, backward reasoning with Horn clauses, which underpins computation in the logic programming language Prolog, is Turing complete. Moreover, its efficiency is competitive with computation in other symbolic programming languages.[85]

Fuzzy logic assigns a "degree of truth" between 0 and 1. It can therefore handle propositions that are vague and partially true.[86]

Non-monotonic logics, including logic programming with negation as failure, are designed to handle default reasoning.[28] Other specialized versions of logic have been developed to describe many complex domains.

Probabilistic methods for uncertain reasoning

A simple Bayesian network, with the associated conditional probability tables
Many problems in AI (including reasoning, planning, learning, perception, and robotics) require the agent to operate with incomplete or uncertain information. AI researchers have devised a number of tools to solve these problems using methods from probability theory and economics.[87] Precise mathematical tools have been developed that analyze how an agent can make choices and plan, using decision theory, decision analysis,[88] and information value theory.[89] These tools include models such as Markov decision processes,[90] dynamic decision networks,[91] game theory and mechanism design.[92]

Bayesian networks[93] are a tool that can be used for reasoning (using the Bayesian inference algorithm),[g][95] learning (using the expectation–maximization algorithm),[h][97] planning (using decision networks)[98] and perception (using dynamic Bayesian networks).[91]

Probabilistic algorithms can also be used for filtering, prediction, smoothing, and finding explanations for streams of data, thus helping perception systems analyze processes that occur over time (e.g., hidden Markov models or Kalman filters).[91]


Expectation–maximization clustering of Old Faithful eruption data starts from a random guess but then successfully converges on an accurate clustering of the two physically distinct modes of eruption.
Classifiers and statistical learning methods
The simplest AI applications can be divided into two types: classifiers (e.g., "if shiny then diamond"), on one hand, and controllers (e.g., "if diamond then pick up"), on the other hand. Classifiers[99] are functions that use pattern matching to determine the closest match. They can be fine-tuned based on chosen examples using supervised learning. Each pattern (also called an "observation") is labeled with a certain predefined class. All the observations combined with their class labels are known as a data set. When a new observation is received, that observation is classified based on previous experience.[45]

There are many kinds of classifiers in use.[100] The decision tree is the simplest and most widely used symbolic machine learning algorithm.[101] K-nearest neighbor algorithm was the most widely used analogical AI until the mid-1990s, and Kernel methods such as the support vector machine (SVM) displaced k-nearest neighbor in the 1990s.[102] The naive Bayes classifier is reportedly the "most widely used learner"[103] at Google, due in part to its scalability.[104] Neural networks are also used as classifiers.[105]

Artificial neural networks

A neural network is an interconnected group of nodes, akin to the vast network of neurons in the human brain.
An artificial neural network is based on a collection of nodes also known as artificial neurons, which loosely model the neurons in a biological brain. It is trained to recognise patterns; once trained, it can recognise those patterns in fresh data. There is an input, at least one hidden layer of nodes and an output. Each node applies a function and once the weight crosses its specified threshold, the data is transmitted to the next layer. A network is typically called a deep neural network if it has at least 2 hidden layers.[105]

Learning algorithms for neural networks use local search to choose the weights that will get the right output for each input during training. The most common training technique is the backpropagation algorithm.[106] Neural networks learn to model complex relationships between inputs and outputs and find patterns in data. In theory, a neural network can learn any function.[107]

In feedforward neural networks the signal passes in only one direction.[108] The term perceptron typically refers to a single-layer neural network.[109] In contrast, deep learning uses many layers.[110] Recurrent neural networks (RNNs) feed the output signal back into the input, which allows short-term memories of previous input events. Long short-term memory networks (LSTMs) are recurrent neural networks that better preserve longterm dependencies and are less sensitive to the vanishing gradient problem.[111] Convolutional neural networks (CNNs) use layers of kernels to more efficiently process local patterns. This local processing is especially important in image processing, where the early CNN layers typically identify simple local patterns such as edges and curves, with subsequent layers detecting more complex patterns like textures, and eventually whole objects.[112]

Deep learning

Deep learning is a subset of machine learning, which is itself a subset of artificial intelligence.[113]
Deep learning uses several layers of neurons between the network's inputs and outputs.[110] The multiple layers can progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify the concepts relevant to a human such as digits, letters, or faces.[114]

Deep learning has profoundly improved the performance of programs in many important subfields of artificial intelligence, including computer vision, speech recognition, natural language processing, image classification,[115] and others. The reason that deep learning performs so well in so many applications is not known as of 2021.[116] The sudden success of deep learning in 2012–2015 did not occur because of some new discovery or theoretical breakthrough (deep neural networks and backpropagation had been described by many people, as far back as the 1950s)[i] but because of two factors: the incredible increase in computer power (including the hundred-fold increase in speed by switching to GPUs) and the availability of vast amounts of training data, especially the giant curated datasets used for benchmark testing, such as ImageNet.[j]

GPT
Generative pre-trained transformers (GPT) are large language models (LLMs) that generate text based on the semantic relationships between words in sentences. Text-based GPT models are pre-trained on a large corpus of text that can be from the Internet. The pretraining consists of predicting the next token (a token being usually a word, subword, or punctuation). Throughout this pretraining, GPT models accumulate knowledge about the world and can then generate human-like text by repeatedly predicting the next token. Typically, a subsequent training phase makes the model more truthful, useful, and harmless, usually with a technique called reinforcement learning from human feedback (RLHF). Current GPT models are prone to generating falsehoods called "hallucinations". These can be reduced with RLHF and quality data, but the problem has been getting worse for reasoning systems.[124] Such systems are used in chatbots, which allow people to ask a question or request a task in simple text.[125][126]

Current models and services include ChatGPT, Claude, Gemini, Copilot, and Meta AI.[127] Multimodal GPT models can process different types of data (modalities) such as images, videos, sound, and text.[128]

Hardware and software
Main articles: Programming languages for artificial intelligence and Hardware for artificial intelligence
In the late 2010s, graphics processing units (GPUs) that were increasingly designed with AI-specific enhancements and used with specialized TensorFlow software had replaced previously used central processing unit (CPUs) as the dominant means for large-scale (commercial and academic) machine learning models' training.[129] Specialized programming languages such as Prolog were used in early AI research,[130] but general-purpose programming languages like Python have become predominant.[131]

The transistor density in integrated circuits has been observed to roughly double every 18 months—a trend known as Moore's law, named after the Intel co-founder Gordon Moore, who first identified it. Improvements in GPUs have been even faster,[132] a trend sometimes called Huang's law,[133] named after Nvidia co-founder and CEO Jensen Huang.

Applications
Main article: Applications of artificial intelligence
AI and machine learning technology is used in most of the essential applications of the 2020s, including: search engines (such as Google Search), targeting online advertisements, recommendation systems (offered by Netflix, YouTube or Amazon), driving internet traffic, targeted advertising (AdSense, Facebook), virtual assistants (such as Siri or Alexa), autonomous vehicles (including drones, ADAS and self-driving cars), automatic language translation (Microsoft Translator, Google Translate), facial recognition (Apple's FaceID or Microsoft's DeepFace and Google's FaceNet) and image labeling (used by Facebook, Apple's Photos and TikTok). The deployment of AI may be overseen by a chief automation officer (CAO).

Health and medicine
Main article: Artificial intelligence in healthcare
The application of AI in medicine and medical research has the potential to increase patient care and quality of life.[134] Through the lens of the Hippocratic Oath, medical professionals are ethically compelled to use AI, if applications can more accurately diagnose and treat patients.[135][136]

For medical research, AI is an important tool for processing and integrating big data. This is particularly important for organoid and tissue engineering development which use microscopy imaging as a key technique in fabrication.[137] It has been suggested that AI can overcome discrepancies in funding allocated to different fields of research.[137][138] New AI tools can deepen the understanding of biomedically relevant pathways. For example, AlphaFold 2 (2021) demonstrated the ability to approximate, in hours rather than months, the 3D structure of a protein.[139] In 2023, it was reported that AI-guided drug discovery helped find a class of antibiotics capable of killing two different types of drug-resistant bacteria.[140] In 2024, researchers used machine learning to accelerate the search for Parkinson's disease drug treatments. Their aim was to identify compounds that block the clumping, or aggregation, of alpha-synuclein (the protein that characterises Parkinson's disease). They were able to speed up the initial screening process ten-fold and reduce the cost by a thousand-fold.[141][142]

Games
Main article: Artificial intelligence in video games
Game playing programs have been used since the 1950s to demonstrate and test AI's most advanced techniques.[143] Deep Blue became the first computer chess-playing system to beat a reigning world chess champion, Garry Kasparov, on 11 May 1997.[144] In 2011, in a Jeopardy! quiz show exhibition match, IBM's question answering system, Watson, defeated the two greatest Jeopardy! champions, Brad Rutter and Ken Jennings, by a significant margin.[145] In March 2016, AlphaGo won 4 out of 5 games of Go in a match with Go champion Lee Sedol, becoming the first computer Go-playing system to beat a professional Go player without handicaps. Then, in 2017, it defeated Ke Jie, who was the best Go player in the world.[146] Other programs handle imperfect-information games, such as the poker-playing program Pluribus.[147] DeepMind developed increasingly generalistic reinforcement learning models, such as with MuZero, which could be trained to play chess, Go, or Atari games.[148] In 2019, DeepMind's AlphaStar achieved grandmaster level in StarCraft II, a particularly challenging real-time strategy game that involves incomplete knowledge of what happens on the map.[149] In 2021, an AI agent competed in a PlayStation Gran Turismo competition, winning against four of the world's best Gran Turismo drivers using deep reinforcement learning.[150] In 2024, Google DeepMind introduced SIMA, a type of AI capable of autonomously playing nine previously unseen open-world video games by observing screen output, as well as executing short, specific tasks in response to natural language instructions.[151]

Mathematics
Large language models, such as GPT-4, Gemini, Claude, Llama or Mistral, are increasingly used in mathematics. These probabilistic models are versatile, but can also produce wrong answers in the form of hallucinations. They sometimes need a large database of mathematical problems to learn from, but also methods such as supervised fine-tuning[152] or trained classifiers with human-annotated data to improve answers for new problems and learn from corrections.[153] A February 2024 study showed that the performance of some language models for reasoning capabilities in solving math problems not included in their training data was low, even for problems with only minor deviations from trained data.[154] One technique to improve their performance involves training the models to produce correct reasoning steps, rather than just the correct result.[155] The Alibaba Group developed a version of its Qwen models called Qwen2-Math, that achieved state-of-the-art performance on several mathematical benchmarks, including 84% accuracy on the MATH dataset of competition mathematics problems.[156] In January 2025, Microsoft proposed the technique rStar-Math that leverages Monte Carlo tree search and step-by-step reasoning, enabling a relatively small language model like Qwen-7B to solve 53% of the AIME 2024 and 90% of the MATH benchmark problems.[157]

Alternatively, dedicated models for mathematical problem solving with higher precision for the outcome including proof of theorems have been developed such as AlphaTensor, AlphaGeometry, AlphaProof and AlphaEvolve[158] all from Google DeepMind,[159] Llemma from EleutherAI[160] or Julius.[161]

When natural language is used to describe mathematical problems, converters can transform such prompts into a formal language such as Lean to define mathematical tasks.

Some models have been developed to solve challenging problems and reach good results in benchmark tests, others to serve as educational tools in mathematics.[162]

Topological deep learning integrates various topological approaches.

Finance
Finance is one of the fastest growing sectors where applied AI tools are being deployed: from retail online banking to investment advice and insurance, where automated "robot advisers" have been in use for some years.[163]

According to Nicolas Firzli, director of the World Pensions & Investments Forum, it may be too early to see the emergence of highly innovative AI-informed financial products and services. He argues that "the deployment of AI tools will simply further automatise things: destroying tens of thousands of jobs in banking, financial planning, and pension advice in the process, but I'm not sure it will unleash a new wave of [e.g., sophisticated] pension innovation."[164]

Military
Main article: Military applications of artificial intelligence
Various countries are deploying AI military applications.[165] The main applications enhance command and control, communications, sensors, integration and interoperability.[166] Research is targeting intelligence collection and analysis, logistics, cyber operations, information operations, and semiautonomous and autonomous vehicles.[165] AI technologies enable coordination of sensors and effectors, threat detection and identification, marking of enemy positions, target acquisition, coordination and deconfliction of distributed Joint Fires between networked combat vehicles, both human-operated and autonomous.[166]

AI has been used in military operations in Iraq, Syria, Israel and Ukraine.[165][167][168][169]

Generative AI

Vincent van Gogh in watercolour created by generative AI software
These paragraphs are an excerpt from Generative artificial intelligence.[edit]
Generative artificial intelligence (Generative AI, GenAI,[170] or GAI) is a subfield of artificial intelligence that uses generative models to produce text, images, videos, or other forms of data.[171][172][173] These models learn the underlying patterns and structures of their training data and use them to produce new data[174][175] based on the input, which often comes in the form of natural language prompts.[176][177]

Generative AI tools have become more common since the AI boom in the 2020s. This boom was made possible by improvements in transformer-based deep neural networks, particularly large language models (LLMs). Major tools include chatbots such as ChatGPT, Copilot, Gemini, Claude, Grok, and DeepSeek; text-to-image models such as Stable Diffusion, Midjourney, and DALL-E; and text-to-video models such as Veo and Sora.[178][179][180][181] Technology companies developing generative AI include OpenAI, Anthropic, Meta AI, Microsoft, Google, DeepSeek, and Baidu.[176][182][183]

Generative AI has raised many ethical questions and governance challenges as it can be used for cybercrime, or to deceive or manipulate people through fake news or deepfakes.[184][185] Even if used ethically, it may lead to mass replacement of human jobs.[186] The tools themselves have been criticized as violating intellectual property laws, since they are trained on copyrighted works.[187]
Agents
Main article: Agentic AI
AI agents are software entities designed to perceive their environment, make decisions, and take actions autonomously to achieve specific goals. These agents can interact with users, their environment, or other agents. AI agents are used in various applications, including virtual assistants, chatbots, autonomous vehicles, game-playing systems, and industrial robotics. AI agents operate within the constraints of their programming, available computational resources, and hardware limitations. This means they are restricted to performing tasks within their defined scope and have finite memory and processing capabilities. In real-world applications, AI agents often face time constraints for decision-making and action execution. Many AI agents incorporate learning algorithms, enabling them to improve their performance over time through experience or training. Using machine learning, AI agents can adapt to new situations and optimise their behaviour for their designated tasks.[188][189][190]

Sexuality
Applications of AI in this domain include AI-enabled menstruation and fertility trackers that analyze user data to offer predictions,[191] AI-integrated sex toys (e.g., teledildonics),[192] AI-generated sexual education content,[193] and AI agents that simulate sexual and romantic partners (e.g., Replika).[194] AI is also used for the production of non-consensual deepfake pornography, raising significant ethical and legal concerns.[195]

AI technologies have also been used to attempt to identify online gender-based violence and online sexual grooming of minors.[196][197]

Other industry-specific tasks
There are also thousands of successful AI applications used to solve specific problems for specific industries or institutions. In a 2017 survey, one in five companies reported having incorporated "AI" in some offerings or processes.[198] A few examples are energy storage, medical diagnosis, military logistics, applications that predict the result of judicial decisions, foreign policy, or supply chain management.

AI applications for evacuation and disaster management are growing. AI has been used to investigate patterns in large-scale and small-scale evacuations using historical data from GPS, videos or social media. Furthermore, AI can provide real-time information on the evacuation conditions.[199][200][201]

In agriculture, AI has helped farmers to increase yield and identify areas that need irrigation, fertilization, pesticide treatments. Agronomists use AI to conduct research and development. AI has been used to predict the ripening time for crops such as tomatoes, monitor soil moisture, operate agricultural robots, conduct predictive analytics, classify livestock pig call emotions, automate greenhouses, detect diseases and pests, and save water.

Artificial intelligence is used in astronomy to analyze increasing amounts of available data and applications, mainly for "classification, regression, clustering, forecasting, generation, discovery, and the development of new scientific insights." For example, it is used for discovering exoplanets, forecasting solar activity, and distinguishing between signals and instrumental effects in gravitational wave astronomy. Additionally, it could be used for activities in space, such as space exploration, including the analysis of data from space missions, real-time science decisions of spacecraft, space debris avoidance, and more autonomous operation.

During the 2024 Indian elections, US$50 million was spent on authorized AI-generated content, notably by creating deepfakes of allied (including sometimes deceased) politicians to better engage with voters, and by translating speeches to various local languages.[202]

Ethics
Main article: Ethics of artificial intelligence

Street art in Tel Aviv[203][204]
AI has potential benefits and potential risks.[205] AI may be able to advance science and find solutions for serious problems: Demis Hassabis of DeepMind hopes to "solve intelligence, and then use that to solve everything else".[206] However, as the use of AI has become widespread, several unintended consequences and risks have been identified.[207] In-production systems can sometimes not factor ethics and bias into their AI training processes, especially when the AI algorithms are inherently unexplainable in deep learning.[208]

Risks and harm
Privacy and copyright
Further information: Information privacy and Artificial intelligence and copyright
Machine learning algorithms require large amounts of data. The techniques used to acquire this data have raised concerns about privacy, surveillance and copyright.

AI-powered devices and services, such as virtual assistants and IoT products, continuously collect personal information, raising concerns about intrusive data gathering and unauthorized access by third parties. The loss of privacy is further exacerbated by AI's ability to process and combine vast amounts of data, potentially leading to a surveillance society where individual activities are constantly monitored and analyzed without adequate safeguards or transparency.

Sensitive user data collected may include online activity records, geolocation data, video, or audio.[209] For example, in order to build speech recognition algorithms, Amazon has recorded millions of private conversations and allowed temporary workers to listen to and transcribe some of them.[210] Opinions about this widespread surveillance range from those who see it as a necessary evil to those for whom it is clearly unethical and a violation of the right to privacy.[211]

AI developers argue that this is the only way to deliver valuable applications and have developed several techniques that attempt to preserve privacy while still obtaining the data, such as data aggregation, de-identification and differential privacy.[212] Since 2016, some privacy experts, such as Cynthia Dwork, have begun to view privacy in terms of fairness. Brian Christian wrote that experts have pivoted "from the question of 'what they know' to the question of 'what they're doing with it'."[213]

Generative AI is often trained on unlicensed copyrighted works, including in domains such as images or computer code; the output is then used under the rationale of "fair use". Experts disagree about how well and under what circumstances this rationale will hold up in courts of law; relevant factors may include "the purpose and character of the use of the copyrighted work" and "the effect upon the potential market for the copyrighted work".[214][215] Website owners who do not wish to have their content scraped can indicate it in a "robots.txt" file.[216] In 2023, leading authors (including John Grisham and Jonathan Franzen) sued AI companies for using their work to train generative AI.[217][218] Another discussed approach is to envision a separate sui generis system of protection for creations generated by AI to ensure fair attribution and compensation for human authors.[219]

Dominance by tech giants
The commercial AI scene is dominated by Big Tech companies such as Alphabet Inc., Amazon, Apple Inc., Meta Platforms, and Microsoft.[220][221][222] Some of these players already own the vast majority of existing cloud infrastructure and computing power from data centers, allowing them to entrench further in the marketplace.[223][224]

Power needs and environmental impacts
See also: Environmental impacts of artificial intelligence
In January 2024, the International Energy Agency (IEA) released Electricity 2024, Analysis and Forecast to 2026, forecasting electric power use.[225] This is the first IEA report to make projections for data centers and power consumption for artificial intelligence and cryptocurrency. The report states that power demand for these uses might double by 2026, with additional electric power usage equal to electricity used by the whole Japanese nation.[226]

Prodigious power consumption by AI is responsible for the growth of fossil fuel use, and might delay closings of obsolete, carbon-emitting coal energy facilities. There is a feverish rise in the construction of data centers throughout the US, making large technology firms (e.g., Microsoft, Meta, Google, Amazon) into voracious consumers of electric power. Projected electric consumption is so immense that there is concern that it will be fulfilled no matter the source. A ChatGPT search involves the use of 10 times the electrical energy as a Google search. The large firms are in haste to find power sources – from nuclear energy to geothermal to fusion. The tech firms argue that – in the long view – AI will be eventually kinder to the environment, but they need the energy now. AI makes the power grid more efficient and "intelligent", will assist in the growth of nuclear power, and track overall carbon emissions, according to technology firms.[227]

A 2024 Goldman Sachs Research Paper, AI Data Centers and the Coming US Power Demand Surge, found "US power demand (is) likely to experience growth not seen in a generation...." and forecasts that, by 2030, US data centers will consume 8% of US power, as opposed to 3% in 2022, presaging growth for the electrical power generation industry by a variety of means.[228] Data centers' need for more and more electrical power is such that they might max out the electrical grid. The Big Tech companies counter that AI can be used to maximize the utilization of the grid by all.[229]

In 2024, the Wall Street Journal reported that big AI companies have begun negotiations with the US nuclear power providers to provide electricity to the data centers. In March 2024 Amazon purchased a Pennsylvania nuclear-powered data center for US$650 million.[230] Nvidia CEO Jensen Huang said nuclear power is a good option for the data centers.[231]

In September 2024, Microsoft announced an agreement with Constellation Energy to re-open the Three Mile Island nuclear power plant to provide Microsoft with 100% of all electric power produced by the plant for 20 years. Reopening the plant, which suffered a partial nuclear meltdown of its Unit 2 reactor in 1979, will require Constellation to get through strict regulatory processes which will include extensive safety scrutiny from the US Nuclear Regulatory Commission. If approved (this will be the first ever US re-commissioning of a nuclear plant), over 835 megawatts of power – enough for 800,000 homes – of energy will be produced. The cost for re-opening and upgrading is estimated at US$1.6 billion and is dependent on tax breaks for nuclear power contained in the 2022 US Inflation Reduction Act.[232] The US government and the state of Michigan are investing almost US$2 billion to reopen the Palisades Nuclear reactor on Lake Michigan. Closed since 2022, the plant is planned to be reopened in October 2025. The Three Mile Island facility will be renamed the Crane Clean Energy Center after Chris Crane, a nuclear proponent and former CEO of Exelon who was responsible for Exelon's spinoff of Constellation.[233]

After the last approval in September 2023, Taiwan suspended the approval of data centers north of Taoyuan with a capacity of more than 5 MW in 2024, due to power supply shortages.[234] Taiwan aims to phase out nuclear power by 2025.[234] On the other hand, Singapore imposed a ban on the opening of data centers in 2019 due to electric power, but in 2022, lifted this ban.[234]

Although most nuclear plants in Japan have been shut down after the 2011 Fukushima nuclear accident, according to an October 2024 Bloomberg article in Japanese, cloud gaming services company Ubitus, in which Nvidia has a stake, is looking for land in Japan near nuclear power plant for a new data center for generative AI.[235] Ubitus CEO Wesley Kuo said nuclear power plants are the most efficient, cheap and stable power for AI.[235]

On 1 November 2024, the Federal Energy Regulatory Commission (FERC) rejected an application submitted by Talen Energy for approval to supply some electricity from the nuclear power station Susquehanna to Amazon's data center.[236] According to the Commission Chairman Willie L. Phillips, it is a burden on the electricity grid as well as a significant cost shifting concern to households and other business sectors.[236]

In 2025, a report prepared by the International Energy Agency estimated the greenhouse gas emissions from the energy consumption of AI at 180 million tons. By 2035, these emissions could rise to 300–500 million tonnes depending on what measures will be taken. This is below 1.5% of the energy sector emissions. The emissions reduction potential of AI was estimated at 5% of the energy sector emissions, but rebound effects (for example if people switch from public transport to autonomous cars) can reduce it.[237]

Misinformation
See also: YouTube § Moderation and offensive content
YouTube, Facebook and others use recommender systems to guide users to more content. These AI programs were given the goal of maximizing user engagement (that is, the only goal was to keep people watching). The AI learned that users tended to choose misinformation, conspiracy theories, and extreme partisan content, and, to keep them watching, the AI recommended more of it. Users also tended to watch more content on the same subject, so the AI led people into filter bubbles where they received multiple versions of the same misinformation.[238] This convinced many users that the misinformation was true, and ultimately undermined trust in institutions, the media and the government.[239] The AI program had correctly learned to maximize its goal, but the result was harmful to society. After the U.S. election in 2016, major technology companies took some steps to mitigate the problem.[240]

In the early 2020s, generative AI began to create images, audio, and texts that are virtually indistinguishable from real photographs, recordings, or human writing,[241] while realistic AI-generated videos became feasible in the mid-2020s.[242][243][244] It is possible for bad actors to use this technology to create massive amounts of misinformation or propaganda;[245] one such potential malicious use is deepfakes for computational propaganda.[246] AI pioneer Geoffrey Hinton expressed concern about AI enabling "authoritarian leaders to manipulate their electorates" on a large scale, among other risks.[247]

AI researchers at Microsoft, OpenAI, universities and other organisations have suggested using "personhood credentials" as a way to overcome online deception enabled by AI models.[248]

Algorithmic bias and fairness
Main articles: Algorithmic bias and Fairness (machine learning)
Machine learning applications will be biased[k] if they learn from biased data.[250] The developers may not be aware that the bias exists.[251] Bias can be introduced by the way training data is selected and by the way a model is deployed.[252][250] If a biased algorithm is used to make decisions that can seriously harm people (as it can in medicine, finance, recruitment, housing or policing) then the algorithm may cause discrimination.[253] The field of fairness studies how to prevent harms from algorithmic biases.

On June 28, 2015, Google Photos's new image labeling feature mistakenly identified Jacky Alcine and a friend as "gorillas" because they were black. The system was trained on a dataset that contained very few images of black people,[254] a problem called "sample size disparity".[255] Google "fixed" this problem by preventing the system from labelling anything as a "gorilla". Eight years later, in 2023, Google Photos still could not identify a gorilla, and neither could similar products from Apple, Facebook, Microsoft and Amazon.[256]

COMPAS is a commercial program widely used by U.S. courts to assess the likelihood of a defendant becoming a recidivist. In 2016, Julia Angwin at ProPublica discovered that COMPAS exhibited racial bias, despite the fact that the program was not told the races of the defendants. Although the error rate for both whites and blacks was calibrated equal at exactly 61%, the errors for each race were different—the system consistently overestimated the chance that a black person would re-offend and would underestimate the chance that a white person would not re-offend.[257] In 2017, several researchers[l] showed that it was mathematically impossible for COMPAS to accommodate all possible measures of fairness when the base rates of re-offense were different for whites and blacks in the data.[259]

A program can make biased decisions even if the data does not explicitly mention a problematic feature (such as "race" or "gender"). The feature will correlate with other features (like "address", "shopping history" or "first name"), and the program will make the same decisions based on these features as it would on "race" or "gender".[260] Moritz Hardt said "the most robust fact in this research area is that fairness through blindness doesn't work."[261]

Criticism of COMPAS highlighted that machine learning models are designed to make "predictions" that are only valid if we assume that the future will resemble the past. If they are trained on data that includes the results of racist decisions in the past, machine learning models must predict that racist decisions will be made in the future. If an application then uses these predictions as recommendations, some of these "recommendations" will likely be racist.[262] Thus, machine learning is not well suited to help make decisions in areas where there is hope that the future will be better than the past. It is descriptive rather than prescriptive.[m]

Bias and unfairness may go undetected because the developers are overwhelmingly white and male: among AI engineers, about 4% are black and 20% are women.[255]

There are various conflicting definitions and mathematical models of fairness. These notions depend on ethical assumptions, and are influenced by beliefs about society. One broad category is distributive fairness, which focuses on the outcomes, often identifying groups and seeking to compensate for statistical disparities. Representational fairness tries to ensure that AI systems do not reinforce negative stereotypes or render certain groups invisible. Procedural fairness focuses on the decision process rather than the outcome. The most relevant notions of fairness may depend on the context, notably the type of AI application and the stakeholders. The subjectivity in the notions of bias and fairness makes it difficult for companies to operationalize them. Having access to sensitive attributes such as race or gender is also considered by many AI ethicists to be necessary in order to compensate for biases, but it may conflict with anti-discrimination laws.[249]

At its 2022 Conference on Fairness, Accountability, and Transparency (ACM FAccT 2022), the Association for Computing Machinery, in Seoul, South Korea, presented and published findings that recommend that until AI and robotics systems are demonstrated to be free of bias mistakes, they are unsafe, and the use of self-learning neural networks trained on vast, unregulated sources of flawed internet data should be curtailed.[dubious – discuss][264]

Lack of transparency
See also: Explainable AI, Algorithmic transparency, and Right to explanation
Many AI systems are so complex that their designers cannot explain how they reach their decisions.[265] Particularly with deep neural networks, in which there are many non-linear relationships between inputs and outputs. But some popular explainability techniques exist.[266]

It is impossible to be certain that a program is operating correctly if no one knows how exactly it works. There have been many cases where a machine learning program passed rigorous tests, but nevertheless learned something different than what the programmers intended. For example, a system that could identify skin diseases better than medical professionals was found to actually have a strong tendency to classify images with a ruler as "cancerous", because pictures of malignancies typically include a ruler to show the scale.[267] Another machine learning system designed to help effectively allocate medical resources was found to classify patients with asthma as being at "low risk" of dying from pneumonia. Having asthma is actually a severe risk factor, but since the patients having asthma would usually get much more medical care, they were relatively unlikely to die according to the training data. The correlation between asthma and low risk of dying from pneumonia was real, but misleading.[268]

People who have been harmed by an algorithm's decision have a right to an explanation.[269] Doctors, for example, are expected to clearly and completely explain to their colleagues the reasoning behind any decision they make. Early drafts of the European Union's General Data Protection Regulation in 2016 included an explicit statement that this right exists.[n] Industry experts noted that this is an unsolved problem with no solution in sight. Regulators argued that nevertheless the harm is real: if the problem has no solution, the tools should not be used.[270]

DARPA established the XAI ("Explainable Artificial Intelligence") program in 2014 to try to solve these problems.[271]

Several approaches aim to address the transparency problem. SHAP enables to visualise the contribution of each feature to the output.[272] LIME can locally approximate a model's outputs with a simpler, interpretable model.[273] Multitask learning provides a large number of outputs in addition to the target classification. These other outputs can help developers deduce what the network has learned.[274] Deconvolution, DeepDream and other generative methods can allow developers to see what different layers of a deep network for computer vision have learned, and produce output that can suggest what the network is learning.[275] For generative pre-trained transformers, Anthropic developed a technique based on dictionary learning that associates patterns of neuron activations with human-understandable concepts.[276]

Bad actors and weaponized AI
Main articles: Lethal autonomous weapon, Artificial intelligence arms race, and AI safety
Artificial intelligence provides a number of tools that are useful to bad actors, such as authoritarian governments, terrorists, criminals or rogue states.

A lethal autonomous weapon is a machine that locates, selects and engages human targets without human supervision.[o] Widely available AI tools can be used by bad actors to develop inexpensive autonomous weapons and, if produced at scale, they are potentially weapons of mass destruction.[278] Even when used in conventional warfare, they currently cannot reliably choose targets and could potentially kill an innocent person.[278] In 2014, 30 nations (including China) supported a ban on autonomous weapons under the United Nations' Convention on Certain Conventional Weapons, however the United States and others disagreed.[279] By 2015, over fifty countries were reported to be researching battlefield robots.[280]

AI tools make it easier for authoritarian governments to efficiently control their citizens in several ways. Face and voice recognition allow widespread surveillance. Machine learning, operating this data, can classify potential enemies of the state and prevent them from hiding. Recommendation systems can precisely target propaganda and misinformation for maximum effect. Deepfakes and generative AI aid in producing misinformation. Advanced AI can make authoritarian centralized decision-making more competitive than liberal and decentralized systems such as markets. It lowers the cost and difficulty of digital warfare and advanced spyware.[281] All these technologies have been available since 2020 or earlier—AI facial recognition systems are already being used for mass surveillance in China.[282][283]

There are many other ways in which AI is expected to help bad actors, some of which can not be foreseen. For example, machine-learning AI is able to design tens of thousands of toxic molecules in a matter of hours.[284]

Technological unemployment
Main articles: Workplace impact of artificial intelligence and Technological unemployment
Economists have frequently highlighted the risks of redundancies from AI, and speculated about unemployment if there is no adequate social policy for full employment.[285]

In the past, technology has tended to increase rather than reduce total employment, but economists acknowledge that "we're in uncharted territory" with AI.[286] A survey of economists showed disagreement about whether the increasing use of robots and AI will cause a substantial increase in long-term unemployment, but they generally agree that it could be a net benefit if productivity gains are redistributed.[287] Risk estimates vary; for example, in the 2010s, Michael Osborne and Carl Benedikt Frey estimated 47% of U.S. jobs are at "high risk" of potential automation, while an OECD report classified only 9% of U.S. jobs as "high risk".[p][289] The methodology of speculating about future employment levels has been criticised as lacking evidential foundation, and for implying that technology, rather than social policy, creates unemployment, as opposed to redundancies.[285] In April 2023, it was reported that 70% of the jobs for Chinese video game illustrators had been eliminated by generative artificial intelligence.[290][291]

Unlike previous waves of automation, many middle-class jobs may be eliminated by artificial intelligence; The Economist stated in 2015 that "the worry that AI could do to white-collar jobs what steam power did to blue-collar ones during the Industrial Revolution" is "worth taking seriously".[292] Jobs at extreme risk range from paralegals to fast food cooks, while job demand is likely to increase for care-related professions ranging from personal healthcare to the clergy.[293]

From the early days of the development of artificial intelligence, there have been arguments, for example, those put forward by Joseph Weizenbaum, about whether tasks that can be done by computers actually should be done by them, given the difference between computers and humans, and between quantitative calculation and qualitative, value-based judgement.[294]

Existential risk
Main article: Existential risk from artificial intelligence
It has been argued AI will become so powerful that humanity may irreversibly lose control of it. This could, as physicist Stephen Hawking stated, "spell the end of the human race".[295] This scenario has been common in science fiction, when a computer or robot suddenly develops a human-like "self-awareness" (or "sentience" or "consciousness") and becomes a malevolent character.[q] These sci-fi scenarios are misleading in several ways.

First, AI does not require human-like sentience to be an existential risk. Modern AI programs are given specific goals and use learning and intelligence to achieve them. Philosopher Nick Bostrom argued that if one gives almost any goal to a sufficiently powerful AI, it may choose to destroy humanity to achieve it (he used the example of a paperclip maximizer).[297] Stuart Russell gives the example of household robot that tries to find a way to kill its owner to prevent it from being unplugged, reasoning that "you can't fetch the coffee if you're dead."[298] In order to be safe for humanity, a superintelligence would have to be genuinely aligned with humanity's morality and values so that it is "fundamentally on our side".[299]

Second, Yuval Noah Harari argues that AI does not require a robot body or physical control to pose an existential risk. The essential parts of civilization are not physical. Things like ideologies, law, government, money and the economy are built on language; they exist because there are stories that billions of people believe. The current prevalence of misinformation suggests that an AI could use language to convince people to believe anything, even to take actions that are destructive.[300]

The opinions amongst experts and industry insiders are mixed, with sizable fractions both concerned and unconcerned by risk from eventual superintelligent AI.[301] Personalities such as Stephen Hawking, Bill Gates, and Elon Musk,[302] as well as AI pioneers such as Yoshua Bengio, Stuart Russell, Demis Hassabis, and Sam Altman, have expressed concerns about existential risk from AI.

In May 2023, Geoffrey Hinton announced his resignation from Google in order to be able to "freely speak out about the risks of AI" without "considering how this impacts Google".[303] He notably mentioned risks of an AI takeover,[304] and stressed that in order to avoid the worst outcomes, establishing safety guidelines will require cooperation among those competing in use of AI.[305]

In 2023, many leading AI experts endorsed the joint statement that "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war".[306]

Some other researchers were more optimistic. AI pioneer Jürgen Schmidhuber did not sign the joint statement, emphasising that in 95% of all cases, AI research is about making "human lives longer and healthier and easier."[307] While the tools that are now being used to improve lives can also be used by bad actors, "they can also be used against the bad actors."[308][309] Andrew Ng also argued that "it's a mistake to fall for the doomsday hype on AI—and that regulators who do will only benefit vested interests."[310] Yann LeCun "scoffs at his peers' dystopian scenarios of supercharged misinformation and even, eventually, human extinction."[311] In the early 2010s, experts argued that the risks are too distant in the future to warrant research or that humans will be valuable from the perspective of a superintelligent machine.[312] However, after 2016, the study of current and future risks and possible solutions became a serious area of research.[313]

Ethical machines and alignment
Main articles: Machine ethics, AI safety, Friendly artificial intelligence, Artificial moral agents, and Human Compatible
Friendly AI are machines that have been designed from the beginning to minimize risks and to make choices that benefit humans. Eliezer Yudkowsky, who coined the term, argues that developing friendly AI should be a higher research priority: it may require a large investment and it must be completed before AI becomes an existential risk.[314]

Machines with intelligence have the potential to use their intelligence to make ethical decisions. The field of machine ethics provides machines with ethical principles and procedures for resolving ethical dilemmas.[315] The field of machine ethics is also called computational morality,[315] and was founded at an AAAI symposium in 2005.[316]

Other approaches include Wendell Wallach's "artificial moral agents"[317] and Stuart J. Russell's three principles for developing provably beneficial machines.[318]

Open source
Active organizations in the AI open-source community include Hugging Face,[319] Google,[320] EleutherAI and Meta.[321] Various AI models, such as Llama 2, Mistral or Stable Diffusion, have been made open-weight,[322][323] meaning that their architecture and trained parameters (the "weights") are publicly available. Open-weight models can be freely fine-tuned, which allows companies to specialize them with their own data and for their own use-case.[324] Open-weight models are useful for research and innovation but can also be misused. Since they can be fine-tuned, any built-in security measure, such as objecting to harmful requests, can be trained away until it becomes ineffective. Some researchers warn that future AI models may develop dangerous capabilities (such as the potential to drastically facilitate bioterrorism) and that once released on the Internet, they cannot be deleted everywhere if needed. They recommend pre-release audits and cost-benefit analyses.[325]

Frameworks
Artificial intelligence projects can be guided by ethical considerations during the design, development, and implementation of an AI system. An AI framework such as the Care and Act Framework, developed by the Alan Turing Institute and based on the SUM values, outlines four main ethical dimensions, defined as follows:[326][327]

Respect the dignity of individual people
Connect with other people sincerely, openly, and inclusively
Care for the wellbeing of everyone
Protect social values, justice, and the public interest
Other developments in ethical frameworks include those decided upon during the Asilomar Conference, the Montreal Declaration for Responsible AI, and the IEEE's Ethics of Autonomous Systems initiative, among others;[328] however, these principles are not without criticism, especially regarding the people chosen to contribute to these frameworks.[329]

Promotion of the wellbeing of the people and communities that these technologies affect requires consideration of the social and ethical implications at all stages of AI system design, development and implementation, and collaboration between job roles such as data scientists, product managers, data engineers, domain experts, and delivery managers.[330]

The UK AI Safety Institute released in 2024 a testing toolset called 'Inspect' for AI safety evaluations available under an MIT open-source licence which is freely available on GitHub and can be improved with third-party packages. It can be used to evaluate AI models in a range of areas including core knowledge, ability to reason, and autonomous capabilities.[331]

Regulation
Main articles: Regulation of artificial intelligence, Regulation of algorithms, and AI safety
AI Safety Summit
The first global AI Safety Summit was held in the United Kingdom in November 2023 with a declaration calling for international cooperation.
The regulation of artificial intelligence is the development of public sector policies and laws for promoting and regulating AI; it is therefore related to the broader regulation of algorithms.[332] The regulatory and policy landscape for AI is an emerging issue in jurisdictions globally.[333] According to AI Index at Stanford, the annual number of AI-related laws passed in the 127 survey countries jumped from one passed in 2016 to 37 passed in 2022 alone.[334][335] Between 2016 and 2020, more than 30 countries adopted dedicated strategies for AI.[336] Most EU member states had released national AI strategies, as had Canada, China, India, Japan, Mauritius, the Russian Federation, Saudi Arabia, United Arab Emirates, U.S., and Vietnam. Others were in the process of elaborating their own AI strategy, including Bangladesh, Malaysia and Tunisia.[336] The Global Partnership on Artificial Intelligence was launched in June 2020, stating a need for AI to be developed in accordance with human rights and democratic values, to ensure public confidence and trust in the technology.[336] Henry Kissinger, Eric Schmidt, and Daniel Huttenlocher published a joint statement in November 2021 calling for a government commission to regulate AI.[337] In 2023, OpenAI leaders published recommendations for the governance of superintelligence, which they believe may happen in less than 10 years.[338] In 2023, the United Nations also launched an advisory body to provide recommendations on AI governance; the body comprises technology company executives, government officials and academics.[339] In 2024, the Council of Europe created the first international legally binding treaty on AI, called the "Framework Convention on Artificial Intelligence and Human Rights, Democracy and the Rule of Law". It was adopted by the European Union, the United States, the United Kingdom, and other signatories.[340]

In a 2022 Ipsos survey, attitudes towards AI varied greatly by country; 78% of Chinese citizens, but only 35% of Americans, agreed that "products and services using AI have more benefits than drawbacks".[334] A 2023 Reuters/Ipsos poll found that 61% of Americans agree, and 22% disagree, that AI poses risks to humanity.[341] In a 2023 Fox News poll, 35% of Americans thought it "very important", and an additional 41% thought it "somewhat important", for the federal government to regulate AI, versus 13% responding "not very important" and 8% responding "not at all important".[342][343]

In November 2023, the first global AI Safety Summit was held in Bletchley Park in the UK to discuss the near and far term risks of AI and the possibility of mandatory and voluntary regulatory frameworks.[344] 28 countries including the United States, China, and the European Union issued a declaration at the start of the summit, calling for international co-operation to manage the challenges and risks of artificial intelligence.[345][346] In May 2024 at the AI Seoul Summit, 16 global AI tech companies agreed to safety commitments on the development of AI.[347][348]

History
Main article: History of artificial intelligence
For a chronological guide, see Timeline of artificial intelligence.

In 2024, AI patents in China and the US numbered more than three-fourths of AI patents worldwide.[349] Though China had more AI patents, the US had 35% more patents per AI patent-applicant company than China.[349]
The study of mechanical or "formal" reasoning began with philosophers and mathematicians in antiquity. The study of logic led directly to Alan Turing's theory of computation, which suggested that a machine, by shuffling symbols as simple as "0" and "1", could simulate any conceivable form of mathematical reasoning.[350][351] This, along with concurrent discoveries in cybernetics, information theory and neurobiology, led researchers to consider the possibility of building an "electronic brain".[r] They developed several areas of research that would become part of AI,[353] such as McCulloch and Pitts design for "artificial neurons" in 1943,[117] and Turing's influential 1950 paper 'Computing Machinery and Intelligence', which introduced the Turing test and showed that "machine intelligence" was plausible.[354][351]

The field of AI research was founded at a workshop at Dartmouth College in 1956.[s][6] The attendees became the leaders of AI research in the 1960s.[t] They and their students produced programs that the press described as "astonishing":[u] computers were learning checkers strategies, solving word problems in algebra, proving logical theorems and speaking English.[v][7] Artificial intelligence laboratories were set up at a number of British and U.S. universities in the latter 1950s and early 1960s.[351]

Researchers in the 1960s and the 1970s were convinced that their methods would eventually succeed in creating a machine with general intelligence and considered this the goal of their field.[358] In 1965 Herbert Simon predicted, "machines will be capable, within twenty years, of doing any work a man can do".[359] In 1967 Marvin Minsky agreed, writing that "within a generation ... the problem of creating 'artificial intelligence' will substantially be solved".[360] They had, however, underestimated the difficulty of the problem.[w] In 1974, both the U.S. and British governments cut off exploratory research in response to the criticism of Sir James Lighthill[362] and ongoing pressure from the U.S. Congress to fund more productive projects.[363] Minsky and Papert's book Perceptrons was understood as proving that artificial neural networks would never be useful for solving real-world tasks, thus discrediting the approach altogether.[364] The "AI winter", a period when obtaining funding for AI projects was difficult, followed.[9]

In the early 1980s, AI research was revived by the commercial success of expert systems,[365] a form of AI program that simulated the knowledge and analytical skills of human experts. By 1985, the market for AI had reached over a billion dollars. At the same time, Japan's fifth generation computer project inspired the U.S. and British governments to restore funding for academic research.[8] However, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began.[10]

Up to this point, most of AI's funding had gone to projects that used high-level symbols to represent mental objects like plans, goals, beliefs, and known facts. In the 1980s, some researchers began to doubt that this approach would be able to imitate all the processes of human cognition, especially perception, robotics, learning and pattern recognition,[366] and began to look into "sub-symbolic" approaches.[367] Rodney Brooks rejected "representation" in general and focussed directly on engineering machines that move and survive.[x] Judea Pearl, Lotfi Zadeh, and others developed methods that handled incomplete and uncertain information by making reasonable guesses rather than precise logic.[87][372] But the most important development was the revival of "connectionism", including neural network research, by Geoffrey Hinton and others.[373] In 1990, Yann LeCun successfully showed that convolutional neural networks can recognize handwritten digits, the first of many successful applications of neural networks.[374]

AI gradually restored its reputation in the late 1990s and early 21st century by exploiting formal mathematical methods and by finding specific solutions to specific problems. This "narrow" and "formal" focus allowed researchers to produce verifiable results and collaborate with other fields (such as statistics, economics and mathematics).[375] By 2000, solutions developed by AI researchers were being widely used, although in the 1990s they were rarely described as "artificial intelligence" (a tendency known as the AI effect).[376] However, several academic researchers became concerned that AI was no longer pursuing its original goal of creating versatile, fully intelligent machines. Beginning around 2002, they founded the subfield of artificial general intelligence (or "AGI"), which had several well-funded institutions by the 2010s.[68]

Deep learning began to dominate industry benchmarks in 2012 and was adopted throughout the field.[11] For many specific tasks, other methods were abandoned.[y] Deep learning's success was based on both hardware improvements (faster computers,[378] graphics processing units, cloud computing[379]) and access to large amounts of data[380] (including curated datasets,[379] such as ImageNet). Deep learning's success led to an enormous increase in interest and funding in AI.[z] The amount of machine learning research (measured by total publications) increased by 50% in the years 2015–2019.[336]


The number of Google searches for the term "AI" accelerated in 2022.
In 2016, issues of fairness and the misuse of technology were catapulted into center stage at machine learning conferences, publications vastly increased, funding became available, and many researchers re-focussed their careers on these issues. The alignment problem became a serious field of academic study.[313]

In the late 2010s and early 2020s, AGI companies began to deliver programs that created enormous interest. In 2015, AlphaGo, developed by DeepMind, beat the world champion Go player. The program taught only the game's rules and developed a strategy by itself. GPT-3 is a large language model that was released in 2020 by OpenAI and is capable of generating high-quality human-like text.[381] ChatGPT, launched on November 30, 2022, became the fastest-growing consumer software application in history, gaining over 100 million users in two months.[382] It marked what is widely regarded as AI's breakout year, bringing it into the public consciousness.[383] These programs, and others, inspired an aggressive AI boom, where large companies began investing billions of dollars in AI research. According to AI Impacts, about US$50 billion annually was invested in "AI" around 2022 in the U.S. alone and about 20% of the new U.S. Computer Science PhD graduates have specialized in "AI".[384] About 800,000 "AI"-related U.S. job openings existed in 2022.[385] According to PitchBook research, 22% of newly funded startups in 2024 claimed to be AI companies.[386]

Philosophy
Main article: Philosophy of artificial intelligence
Philosophical debates have historically sought to determine the nature of intelligence and how to make intelligent machines.[387] Another major focus has been whether machines can be conscious, and the associated ethical implications.[388] Many other topics in philosophy are relevant to AI, such as epistemology and free will.[389] Rapid advancements have intensified public discussions on the philosophy and ethics of AI.[388]

Defining artificial intelligence
See also: Synthetic intelligence, Intelligent agent, Artificial mind, Virtual intelligence, and Dartmouth workshop
Alan Turing wrote in 1950 "I propose to consider the question 'can machines think'?"[390] He advised changing the question from whether a machine "thinks", to "whether or not it is possible for machinery to show intelligent behaviour".[390] He devised the Turing test, which measures the ability of a machine to simulate human conversation.[354] Since we can only observe the behavior of the machine, it does not matter if it is "actually" thinking or literally has a "mind". Turing notes that we can not determine these things about other people but "it is usual to have a polite convention that everyone thinks."[391]


The Turing test can provide some evidence of intelligence, but it penalizes non-human intelligent behavior.[392]
Russell and Norvig agree with Turing that intelligence must be defined in terms of external behavior, not internal structure.[1] However, they are critical that the test requires the machine to imitate humans. "Aeronautical engineering texts", they wrote, "do not define the goal of their field as making 'machines that fly so exactly like pigeons that they can fool other pigeons.'"[393] AI founder John McCarthy agreed, writing that "Artificial intelligence is not, by definition, simulation of human intelligence".[394]

McCarthy defines intelligence as "the computational part of the ability to achieve goals in the world".[395] Another AI founder, Marvin Minsky, similarly describes it as "the ability to solve hard problems".[396] The leading AI textbook defines it as the study of agents that perceive their environment and take actions that maximize their chances of achieving defined goals.[1] These definitions view intelligence in terms of well-defined problems with well-defined solutions, where both the difficulty of the problem and the performance of the program are direct measures of the "intelligence" of the machine—and no other philosophical discussion is required, or may not even be possible.

Another definition has been adopted by Google,[397] a major practitioner in the field of AI. This definition stipulates the ability of systems to synthesize information as the manifestation of intelligence, similar to the way it is defined in biological intelligence.

Some authors have suggested in practice, that the definition of AI is vague and difficult to define, with contention as to whether classical algorithms should be categorised as AI,[398] with many companies during the early 2020s AI boom using the term as a marketing buzzword, often even if they did "not actually use AI in a material way".[399]

There has been debate over whether large language models exhibit genuine intelligence or merely simulate it by imitating human text.[400]

Evaluating approaches to AI
No established unifying theory or paradigm has guided AI research for most of its history.[aa] The unprecedented success of statistical machine learning in the 2010s eclipsed all other approaches (so much so that some sources, especially in the business world, use the term "artificial intelligence" to mean "machine learning with neural networks"). This approach is mostly sub-symbolic, soft and narrow. Critics argue that these questions may have to be revisited by future generations of AI researchers.

Symbolic AI and its limits
Symbolic AI (or "GOFAI")[402] simulated the high-level conscious reasoning that people use when they solve puzzles, express legal reasoning and do mathematics. They were highly successful at "intelligent" tasks such as algebra or IQ tests. In the 1960s, Newell and Simon proposed the physical symbol systems hypothesis: "A physical symbol system has the necessary and sufficient means of general intelligent action."[403]

However, the symbolic approach failed on many tasks that humans solve easily, such as learning, recognizing an object or commonsense reasoning. Moravec's paradox is the discovery that high-level "intelligent" tasks were easy for AI, but low level "instinctive" tasks were extremely difficult.[404] Philosopher Hubert Dreyfus had argued since the 1960s that human expertise depends on unconscious instinct rather than conscious symbol manipulation, and on having a "feel" for the situation, rather than explicit symbolic knowledge.[405] Although his arguments had been ridiculed and ignored when they were first presented, eventually, AI research came to agree with him.[ab][16]

The issue is not resolved: sub-symbolic reasoning can make many of the same inscrutable mistakes that human intuition does, such as algorithmic bias. Critics such as Noam Chomsky argue continuing research into symbolic AI will still be necessary to attain general intelligence,[407][408] in part because sub-symbolic AI is a move away from explainable AI: it can be difficult or impossible to understand why a modern statistical AI program made a particular decision. The emerging field of neuro-symbolic artificial intelligence attempts to bridge the two approaches.

Neat vs. scruffy
Main article: Neats and scruffies
"Neats" hope that intelligent behavior is described using simple, elegant principles (such as logic, optimization, or neural networks). "Scruffies" expect that it necessarily requires solving a large number of unrelated problems. Neats defend their programs with theoretical rigor, scruffies rely mainly on incremental testing to see if they work. This issue was actively discussed in the 1970s and 1980s,[409] but eventually was seen as irrelevant. Modern AI has elements of both.

Soft vs. hard computing
Main article: Soft computing
Finding a provably correct or optimal solution is intractable for many important problems.[15] Soft computing is a set of techniques, including genetic algorithms, fuzzy logic and neural networks, that are tolerant of imprecision, uncertainty, partial truth and approximation. Soft computing was introduced in the late 1980s and most successful AI programs in the 21st century are examples of soft computing with neural networks.

Narrow vs. general AI
Main articles: Weak artificial intelligence and Artificial general intelligence
AI researchers are divided as to whether to pursue the goals of artificial general intelligence and superintelligence directly or to solve as many specific problems as possible (narrow AI) in hopes these solutions will lead indirectly to the field's long-term goals.[410][411] General intelligence is difficult to define and difficult to measure, and modern AI has had more verifiable successes by focusing on specific problems with specific solutions. The sub-field of artificial general intelligence studies this area exclusively.

Machine consciousness, sentience, and mind
Main articles: Philosophy of artificial intelligence and Artificial consciousness
There is no settled consensus in philosophy of mind on whether a machine can have a mind, consciousness and mental states in the same sense that human beings do. This issue considers the internal experiences of the machine, rather than its external behavior. Mainstream AI research considers this issue irrelevant because it does not affect the goals of the field: to build machines that can solve problems using intelligence. Russell and Norvig add that "[t]he additional project of making a machine conscious in exactly the way humans are is not one that we are equipped to take on."[412] However, the question has become central to the philosophy of mind. It is also typically the central question at issue in artificial intelligence in fiction.

Consciousness
Main articles: Hard problem of consciousness and Theory of mind
David Chalmers identified two problems in understanding the mind, which he named the "hard" and "easy" problems of consciousness.[413] The easy problem is understanding how the brain processes signals, makes plans and controls behavior. The hard problem is explaining how this feels or why it should feel like anything at all, assuming we are right in thinking that it truly does feel like something (Dennett's consciousness illusionism says this is an illusion). While human information processing is easy to explain, human subjective experience is difficult to explain. For example, it is easy to imagine a color-blind person who has learned to identify which objects in their field of view are red, but it is not clear what would be required for the person to know what red looks like.[414]

Computationalism and functionalism
Main articles: Computational theory of mind and Functionalism (philosophy of mind)
Computationalism is the position in the philosophy of mind that the human mind is an information processing system and that thinking is a form of computing. Computationalism argues that the relationship between mind and body is similar or identical to the relationship between software and hardware and thus may be a solution to the mind–body problem. This philosophical position was inspired by the work of AI researchers and cognitive scientists in the 1960s and was originally proposed by philosophers Jerry Fodor and Hilary Putnam.[415]

Philosopher John Searle characterized this position as "strong AI": "The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds."[ac] Searle challenges this claim with his Chinese room argument, which attempts to show that even a computer capable of perfectly simulating human behavior would not have a mind.[419]

AI welfare and rights
It is difficult or impossible to reliably evaluate whether an advanced AI is sentient (has the ability to feel), and if so, to what degree.[420] But if there is a significant chance that a given machine can feel and suffer, then it may be entitled to certain rights or welfare protection measures, similarly to animals.[421][422] Sapience (a set of capacities related to high intelligence, such as discernment or self-awareness) may provide another moral basis for AI rights.[421] Robot rights are also sometimes proposed as a practical way to integrate autonomous agents into society.[423]

In 2017, the European Union considered granting "electronic personhood" to some of the most capable AI systems. Similarly to the legal status of companies, it would have conferred rights but also responsibilities.[424] Critics argued in 2018 that granting rights to AI systems would downplay the importance of human rights, and that legislation should focus on user needs rather than speculative futuristic scenarios. They also noted that robots lacked the autonomy to take part in society on their own.[425][426]

Progress in AI increased interest in the topic. Proponents of AI welfare and rights often argue that AI sentience, if it emerges, would be particularly easy to deny. They warn that this may be a moral blind spot analogous to slavery or factory farming, which could lead to large-scale suffering if sentient AI is created and carelessly exploited.[422][421]

Future
Superintelligence and the singularity
A superintelligence is a hypothetical agent that would possess intelligence far surpassing that of the brightest and most gifted human mind.[411] If research into artificial general intelligence produced sufficiently intelligent software, it might be able to reprogram and improve itself. The improved software would be even better at improving itself, leading to what I. J. Good called an "intelligence explosion" and Vernor Vinge called a "singularity".[427]

However, technologies cannot improve exponentially indefinitely, and typically follow an S-shaped curve, slowing when they reach the physical limits of what the technology can do.[428]

Transhumanism
Main article: Transhumanism
Robot designer Hans Moravec, cyberneticist Kevin Warwick and inventor Ray Kurzweil have predicted that humans and machines may merge in the future into cyborgs that are more capable and powerful than either. This idea, called transhumanism, has roots in the writings of Aldous Huxley and Robert Ettinger.[429]

Edward Fredkin argues that "artificial intelligence is the next step in evolution", an idea first proposed by Samuel Butler's "Darwin among the Machines" as far back as 1863, and expanded upon by George Dyson in his 1998 book Darwin Among the Machines: The Evolution of Global Intelligence.[430]

In fiction
Main article: Artificial intelligence in fiction

The word "robot" itself was coined by Karel Čapek in his 1921 play R.U.R., the title standing for "Rossum's Universal Robots".
Thought-capable artificial beings have appeared as storytelling devices since antiquity,[431] and have been a persistent theme in science fiction.[432]

A common trope in these works began with Mary Shelley's Frankenstein, where a human creation becomes a threat to its masters. This includes such works as Arthur C. Clarke's and Stanley Kubrick's 2001: A Space Odyssey (both 1968), with HAL 9000, the murderous computer in charge of the Discovery One spaceship, as well as The Terminator (1984) and The Matrix (1999). In contrast, the rare loyal robots such as Gort from The Day the Earth Stood Still (1951) and Bishop from Aliens (1986) are less prominent in popular culture.[433]

Isaac Asimov introduced the Three Laws of Robotics in many stories, most notably with the "Multivac" super-intelligent computer. Asimov's laws are often brought up during lay discussions of machine ethics;[434] while almost all artificial intelligence researchers are familiar with Asimov's laws through popular culture, they generally consider the laws useless for many reasons, one of which is their ambiguity.[435]

Several works use AI to force us to confront the fundamental question of what makes us human, showing us artificial beings that have the ability to feel, and thus to suffer. This appears in Karel Čapek's R.U.R., the films A.I. Artificial Intelligence and Ex Machina, as well as the novel Do Androids Dream of Electric Sheep?, by Philip K. Dick. Dick considers the idea that our understanding of human subjectivity is altered by technology created with artificial intelligence.[436]

"""

In [None]:
# Import tools from the Sumy library for extractive summarization
from sumy.parsers.plaintext import PlaintextParser   # Parses raw text into a document object
from sumy.nlp.tokenizers import Tokenizer            # Tokenizes text into sentences and words
from sumy.summarizers.lex_rank import LexRankSummarizer  # LexRank algorithm for summarization


# Function to summarize a block of text using LexRank
def summarize_text(text, num_sentences=5):
    # Step 1: Parse the raw text into a structured document format using Sumy
    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    
    # Step 2: Initialize the LexRank summarizer
    summarizer = LexRankSummarizer()
    
    # Step 3: Apply LexRank to the parsed document to get top-ranked sentences
    # num_sentences controls how many sentences are returned
    summary = summarizer(parser.document, num_sentences)
    
    # Step 4: Convert the sentence objects back to strings and join them into one summary
    return " ".join([str(sentence) for sentence in summary])


# Example call: summarize a long piece of text down to 5 sentences
summarize_text(text, num_sentences=5)

'Artificial intelligence [a] A knowledge base is a body of knowledge represented in a form that can be used by a program. [375] By 2000, solutions developed by AI researchers were being widely used, although in the 1990s they were rarely described as "artificial intelligence" (a tendency known as the AI effect). Narrow vs. general AI Main articles: Weak artificial intelligence and Artificial general intelligence AI researchers are divided as to whether to pursue the goals of artificial general intelligence and superintelligence directly or to solve as many specific problems as possible (narrow AI) in hopes these solutions will lead indirectly to the field\'s long-term goals.'

### Extractive Summarization with LexRank (Sumy)

This block introduces **text summarization** to condense long passages into short, meaningful summaries:  

1. **Library Setup**  
   - `PlaintextParser`: Converts raw text into a document structure that Sumy can work with.  
   - `Tokenizer`: Splits the text into sentences and words.  
   - `LexRankSummarizer`: A graph-based algorithm that ranks sentences based on similarity and centrality, returning the most important ones.  

2. **Summarization Function (`summarize_text`)**  
   - **Input:** Long text (e.g., Wikipedia articles).  
   - **Process:**  
     - Parse the text into a structured format.  
     - Apply the LexRank algorithm to identify top-ranked sentences.  
   - **Output:** Joins the selected sentences into a coherent summary.  

3. **Usage Example**  
   - The provided example condenses a long Wikipedia-style article on *Artificial Intelligence* down to **5 sentences**.  


In [None]:
# Install the Hugging Face Transformers library and SentencePiece tokenizer
# - transformers: provides pre-trained models for NLP tasks (BERT, GPT, BART, etc.)
# - sentencepiece: a tokenizer library required by many seq2seq models (e.g., BART, T5, MarianMT)

!pip install transformers sentencepiece



Text SUmmarization:

1. Extractive Summarization - returns important sentencesw without generating any on its own.
2. Abstractive Summarization - rephrase and compress the text and gives the summary.

### Installing Libraries for Abstractive Summarization  

- **Transformers (Hugging Face):**  
  Provides state-of-the-art pre-trained models (BERT, GPT, BART, T5, etc.) for various NLP tasks, including summarization.  

- **SentencePiece:**  
  A tokenizer library commonly used with sequence-to-sequence (seq2seq) models like **BART** and **T5**. It helps handle subword tokenization, which is crucial for languages with large vocabularies.  

👉 This step ensures the environment is ready for **abstractive summarization**, where the model can **rephrase content in its own words** rather than simply extracting key sentences.  

In [None]:
# Import Hugging Face's high-level pipeline API
# The pipeline utility allows quick use of pre-trained models without explicitly handling tokenizer/model objects.
from transformers import pipeline

# Initialize an abstractive summarization pipeline
# - task: "summarization"
# - model: "facebook/bart-large-cnn" (a seq2seq model trained for summarization)
# This loads both the model and tokenizer automatically.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Define a function to summarize text using the BART summarizer
def summarize_text(text, max_length=50, min_length=20):
    # Generate a summary with constraints:
    # - max_length: upper bound on token length of the summary
    # - min_length: lower bound to avoid overly short outputs
    # - do_sample=False ensures deterministic output (greedy/beam search instead of random sampling)
    summary = summarizer(text, max_length=max_length, min_length=min_length, do_sample=False)

    # The pipeline returns a list of dicts [{'summary_text': "..."}]; extract the text
    return summary[0]['summary_text']

Device set to use cpu


### Abstractive Summarization with BART  

This block sets up **abstractive summarization** using Hugging Face’s high-level `pipeline` API:  

1. **Pipeline Import**  
   - `pipeline("summarization", model="facebook/bart-large-cnn")` loads both the tokenizer and the **BART model** trained specifically for summarization.  
   - Provides a quick, production-ready interface without needing low-level model/tokenizer handling.  

2. **Summarization Function**  
   - **Inputs:** raw text, plus parameters to control output length.  
   - **Parameters:**  
     - `max_length=50`: ensures the summary does not exceed 50 tokens.  
     - `min_length=20`: prevents very short, uninformative summaries.  
     - `do_sample=False`: uses deterministic decoding (beam/greedy search) for stable, repeatable results.  
   - **Output:** Extracts the generated summary text from the pipeline’s result format.  

👉 Unlike extractive methods (like LexRank), BART **paraphrases and compresses** the input, producing **human-like summaries** that are clearer and more natural for end users.  

In [None]:
# Summarize the first 1024 characters of the text corpus
# This call runs the abstractive summarizer (BART large CNN)
# and returns the summary as a string.
summary_output = summarize_text(text[:1024])
print(summary_output)

'Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment. High-'

### Running Abstractive Summarization  

- This block demonstrates how the **BART summarizer** is applied to real text:  
  1. **Input Restriction:** Only the first **1024 characters** of the text are passed, since many transformer models (including BART) have input length limits.  
  2. **Summarization Call:** The text is processed by `summarize_text()`, which uses the BART pipeline to generate a concise summary.  
  3. **Output:** The resulting summary is stored in `summary_output` and then printed for inspection.  

👉 This shows how the chatbot can take **long, unstructured content** (e.g., Wikipedia articles) and return a **clear, human-readable summary**

**Playground using Text to Voice**

In [None]:
# Import the Google Text-to-Speech library (gTTS) and the OS module
from gtts import gTTS
import os

# Text that will be converted into speech
text = "This is Learnbay NLP Class and we are learning how to create interactive chatbot!"

# Specify the language for speech synthesis ('en' = English)
language = 'en'

# Initialize the gTTS object
# - text: the input string to convert
# - lang: language code
# - slow=False: normal speed (if True, generates slower, clearer speech)
tts = gTTS(text=text, lang=language, slow=False)

# Save the generated speech as an MP3 file
tts.save("output.mp3")

### Text-to-Speech (TTS) with gTTS  

This block adds a **voice output feature** to the chatbot, converting text into spoken audio:  

1. **Library Imports**  
   - `gTTS`: Google Text-to-Speech for generating natural voice audio.  
   - `os`: For handling file operations if needed.  

2. **Input Text**  
   - A sample string (`"This is Learnbay NLP Class..."`) is chosen for demonstration.  

3. **TTS Initialization**  
   - Parameters:  
     - `text`: the content to be spoken.  
     - `lang='en'`: sets the language to English.  
     - `slow=False`: generates speech at normal speed.  

4. **Output File**  
   - The audio is saved locally as **`output.mp3`**.  
   - This file can be played to produce the chatbot’s spoken response.  

👉 This step transforms the chatbot into a **multimodal assistant** — not just returning text, but also **speaking answers aloud**, which improves user engagement and accessibility.  

In [9]:
# pip install playsound

Collecting playsound
  Downloading playsound-1.3.0.tar.gz (7.7 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Building wheels for collected packages: playsound
  Building wheel for playsound (setup.py) ... [?25l[?25hdone
  Created wheel for playsound: filename=playsound-1.3.0-py3-none-any.whl size=7020 sha256=7a10f997e78c6fc8d00c22d12d727391ca5bfe2c7f4ca1413f0c6137c0b0b6be
  Stored in directory: /root/.cache/pip/wheels/50/98/42/62753a9e1fb97579a0ce2f84f7db4c21c09d03bb2091e6cef4
Successfully built playsound
Installing collected packages: playsound
Successfully installed playsound-1.3.0


In [None]:
# Option 1 (commented out): Using playsound library
# from playsound import playsound
# playsound('/content/output.mp3')

# Option 2: Using IPython display utilities (works well inside Jupyter/Colab)
from IPython.display import Audio, display

# Path to the saved MP3 file
audio_file_path = '/content/output.mp3'

# Display an audio player in the notebook
# autoplay=True makes it start playing immediately
display(Audio(audio_file_path, autoplay=True))

### Playing the Generated Speech  

This block demonstrates two ways to **play back the chatbot’s speech**:  

1. **Option 1 – `playsound` (commented out):**  
   - A simple library for directly playing MP3 files.  
   - Suitable for local scripts but less reliable inside Jupyter/Colab environments.  

2. **Option 2 – `IPython.display.Audio`:**  
   - Displays an embedded audio player directly in the notebook.  
   - `autoplay=True` makes the file play automatically once rendered.   

👉 This final step closes the loop — the chatbot now supports **knowledge retrieval, summarization, and voice output**, making it feel much more interactive and practical.  