
## 📌 Problem Statement

We aim to simulate an autonomous agent that:
- Starts in the middle of a 5-state environment
- Learns by trial and error
- Reaches a goal state using Q-learning

## 1. Environment Setup

In [None]:
import random
import numpy as np
import matplotlib.pyplot as plt

# A simple linear environment with 5 states (0 to 4)
class SimpleEnvironment:
    def __init__(self):
        self.states = [0, 1, 2, 3, 4]
        self.current_state = 2
        self.goal_state = 4
        self.action_space = [0, 1]  # 0 = left, 1 = right

    def reset(self):
        self.current_state = 2
        return self.current_state

    def step(self, action):
        if action == 0:
            self.current_state = max(0, self.current_state - 1)
        elif action == 1:
            self.current_state = min(4, self.current_state + 1)

        reward = 1 if self.current_state == self.goal_state else -0.1
        done = self.current_state == self.goal_state
        return self.current_state, reward, done

## 2. Q-Learning Agent

In [None]:
class QLearningAgent:
    def __init__(self, state_size, action_size, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.q_table = np.zeros((state_size, action_size))
        self.alpha = alpha
        self.gamma = gamma
        self.epsilon = epsilon

    def choose_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if random.random() < self.epsilon:
            return random.randrange(self.q_table.shape[1])
        else:
            return int(np.argmax(self.q_table[state]))

    def learn(self, state, action, reward, next_state):
        # Move Q(state, action) toward the one-step bootstrapped target.
        predict = self.q_table[state][action]
        target = reward + self.gamma * np.max(self.q_table[next_state])
        self.q_table[state][action] += self.alpha * (target - predict)

## 🏋️ 3. Training the Agent

In [None]:
def train_agent(episodes=200):
    env = SimpleEnvironment()
    agent = QLearningAgent(state_size=5, action_size=2)
    rewards_per_episode = []

    for ep in range(episodes):
        state = env.reset()
        total_reward = 0
        done = False

        while not done:
            action = agent.choose_action(state)
            next_state, reward, done = env.step(action)
            agent.learn(state, action, reward, next_state)
            state = next_state
            total_reward += reward

        rewards_per_episode.append(total_reward)

    return agent, rewards_per_episode

## 4. Demo the Agent

In [None]:
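def demo_agent(agent, max_steps=20):
    env = SimpleEnvironment()
    state = env.reset()
    path = [state]
    done = False

    # Act greedily (no exploration) so the demo shows the learned policy;
    # cap the number of steps in case the policy never reaches the goal.
    while not done and len(path) <= max_steps:
        action = int(np.argmax(agent.q_table[state]))
        state, _, done = env.step(action)
        path.append(state)

    print("🚀 Agent path to goal:", path)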

## 5. Run Training and Visualize Learning

In [None]:
agent, rewards = train_agent(episodes=200)

# Show the path taken after training
demo_agent(agent)

# Plot the learning curve
plt.plot(rewards)
plt.xlabel("Episode")
plt.ylabel("Total Reward")
plt.title("Agent Learning Curve")  # emoji dropped: the default Matplotlib font may lack the glyph
plt.grid(True)
plt.show()

# Code Implementation

The Python file (`agentic_ai.py`) contains the commented Q-learning setup along with the training and demo logic.



# Model


A Q-learning model is used to help the agent learn the best action for each state.

The agent updates its Q-table over 200 training episodes, using the reward from each step to refine its action-value estimates.
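
For reference, the update performed by `learn` matches the standard tabular Q-learning rule, written here with the agent's default values $\alpha = 0.1$ and $\gamma = 0.9$ in mind:

$$
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
$$

As a worked step, assuming an all-zero Q-table: the first time the agent moves right from state 3 into the goal it receives $r = 1$, so $Q(3, \text{right}) \leftarrow 0 + 0.1 \, (1 + 0.9 \cdot 0 - 0) = 0.1$.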

# Demo


After training, the agent autonomously selects actions to reach the goal from the starting point (state 2).

The demo prints the sequence of states visited, and the notebook then plots a learning curve (total reward per episode).
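
A greedy rollout over a learned Q-table could look like the sketch below. It assumes the same 5-state layout (start 2, goal 4, action 0 = left, 1 = right). `greedy_path` and the hand-filled `q_table` are hypothetical names and values used only for illustration, not output from a real training run.

```python
import numpy as np


def greedy_path(q_table, start=2, goal=4, max_steps=20):
    """Follow the highest-valued action from `start` until `goal` or `max_steps`."""
    n_states = q_table.shape[0]
    state = start
    path = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        action = int(np.argmax(q_table[state]))  # 0 = left, 1 = right
        step = -1 if action == 0 else 1
        state = min(n_states - 1, max(0, state + step))  # stay inside the world
        path.append(state)
    return path


# Illustrative Q-table (made-up values) in which "right" always scores higher.
q_table = np.array([
    [0.0, 0.4],
    [0.3, 0.6],
    [0.5, 0.8],
    [0.7, 1.0],
    [0.0, 0.0],
])
example_path = greedy_path(q_table)
print(example_path)
```

The `max_steps` cap is a design choice for safety: an untrained (all-zero) table makes `np.argmax` pick action 0 every time, so without a cap the rollout would never terminate.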

# README (Brief Documentation)

# Agentic AI: Simple Decision-Making Agent

## 🧩 Problem Statement
Create a basic agent that can make decisions and learn from its environment using reinforcement learning. The goal is for the agent to reach a predefined target state.

## 🔧 Approach
- **Environment**: Linear 5-state world.
- **Goal**: Move from state 2 to state 4.
- **Agent**: Q-learning agent with:
  - Exploration (epsilon-greedy)
  - Reward-based learning
- **Training**: 200 episodes to learn optimal actions.
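
The epsilon-greedy rule mentioned above can be sketched in isolation as follows. This is a minimal illustration, not the notebook's exact code: `epsilon_greedy`, the seeded generator, and the Q-values in `q_row` are assumptions made for the example.

```python
import numpy as np


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability `epsilon`, else the best-valued one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))  # explore
    return int(np.argmax(q_row))  # exploit


rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
q_row = np.array([0.2, 0.7])  # made-up Q-values for one state: [left, right]
choices = [epsilon_greedy(q_row, epsilon=0.2, rng=rng) for _ in range(1000)]
right_fraction = sum(choices) / len(choices)
print(f"fraction of 'right' choices: {right_fraction:.2f}")
```

With epsilon = 0.2 and two actions, roughly 90% of the choices should be the greedy one: 80% from exploitation plus about half of the 20% random picks.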

## 🚀 How to Run

1. Install dependencies:
   ```bash
   pip install numpy matplotlib
   ```

2. Run the agent script:
   ```bash
   python agentic_ai.py
   ```


# Output:

- The agent's path to the goal
- A plotted learning curve showing how the total reward improves over episodes
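
Because exploration makes per-episode rewards noisy, the curve can be easier to read after smoothing. The sketch below shows one possible approach, a simple moving average. `moving_average` and `sample_rewards` are hypothetical and stand in for the `rewards` list returned by training.

```python
import numpy as np


def moving_average(values, window=10):
    """Average each run of `window` consecutive values (no padding at the edges)."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


# Made-up per-episode rewards, standing in for the `rewards` list from training.
sample_rewards = [0.1, 0.5, 0.9, 0.7, 0.9, 0.9]
smoothed = moving_average(sample_rewards, window=3)
print(smoothed)
```

The smoothed array is shorter than the input by `window - 1` entries, so it should be plotted against its own index rather than the original episode numbers.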

# Learning Strategy

- Reward of +1 for reaching the goal
- Small penalty of -0.1 for every other move
- Q-values are updated after each step so that actions leading toward the goal score higher
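
To make the reward scheme concrete, the sketch below totals the reward along a path of visited states. `episode_return` is a hypothetical helper written for this example. It assumes the goal is state 4 and uses the +1 and -0.1 values stated above.

```python
def episode_return(path, goal=4, goal_reward=1.0, step_penalty=-0.1):
    """Sum the rewards collected along a path of visited states."""
    total = 0.0
    for state in path[1:]:  # the first entry is the start state, not a move
        total += goal_reward if state == goal else step_penalty
    return total


direct = episode_return([2, 3, 4])        # one penalized move, then the goal
detour = episode_return([2, 1, 2, 3, 4])  # three penalized moves, then the goal
print(round(direct, 2), round(detour, 2))
```

The direct route earns about 0.9 and the detour about 0.7, so every unnecessary move lowers the episode total by 0.1.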