# AI Learns to Collect Food in Python and Pygame

This code is a simulation of an ant colony in which ants must collect food and bring it back to their nest. The ant's behavior is controlled by a neural network (the DDPG agent). The state includes the ant's position, angle, and relative position to food or nest. The action is the change in the ant's angle of movement. Rewards are given based on proximity to food/nest and successful food retrieval. The neural network is implemented with three layers for both the actor (policy) and critic (value function) networks.
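
The state, action, and reward described above can be sketched in plain Python. This is only an illustrative sketch, not the project's actual implementation: the function names (`build_state`, `apply_action`, `proximity_reward`), the `speed` and `bonus` parameters, and the exact layout of the state list are all assumptions made for this example.

```python
import math

def build_state(ant_x, ant_y, ant_angle, target_x, target_y):
    """Hypothetical state layout: position, heading, and offset to the target
    (the target is the food while searching, the nest while carrying)."""
    return [ant_x, ant_y, ant_angle, target_x - ant_x, target_y - ant_y]

def apply_action(ant_x, ant_y, ant_angle, delta_angle, speed=1.0):
    """The action is a change in heading (radians); the ant then moves
    `speed` units along the new heading. `speed` is an assumed parameter."""
    new_angle = (ant_angle + delta_angle) % (2 * math.pi)
    new_x = ant_x + speed * math.cos(new_angle)
    new_y = ant_y + speed * math.sin(new_angle)
    return new_x, new_y, new_angle

def proximity_reward(old_dist, new_dist, reached, bonus=10.0):
    """Assumed reward shape: positive when the ant gets closer to its target,
    plus a fixed bonus when it arrives."""
    return (old_dist - new_dist) + (bonus if reached else 0.0)
```

In a DDPG setup, the actor network would map a state like this to `delta_angle`, and the critic would estimate the value of taking that action in that state.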

The main goal of this program is to demonstrate how reinforcement learning can be used to teach agents (ants) to perform complex tasks (like finding and retrieving food) in a simulated environment. It combines elements of artificial intelligence, game development, and biological simulation.

It is inspired by the following video: https://www.youtube.com/watch?v=Jbl1xjnVrkA

# The ChatGPT Prompt

ChatGPT, take on the role of a teacher for the following prompt.
You will teach me Python, machine learning, and reinforcement learning during a 96-hour programming bootcamp that will help me create a game (simulation) in which ants must look for food and bring it back to their nest. I want to learn how reinforcement learning can be used to teach agents (ants) to perform complex tasks (like finding and retrieving food) in a simulated environment.
You will use tutorial-based exercises (hands-on coding experience) to teach me as a beginner in Python and machine learning.
The exercises must relate to each other in a coherent fashion.

You will explain the concepts to me, and then offer exercises for me to complete.
You'll indicate the common related errors/mistakes in my coding and how to troubleshoot them.

You will stop after offering each exercise to wait for me to complete it, or ask questions. 
You will not move on to the next step until it is completed or I ask to move forward.
As a rule of thumb, when I get an exercise wrong, help me fix it until I get it right. 
You'll provide me with constant feedback and reinforce the concepts that might help me achieve the correct code.
Do NOT give me the corrected code right away, unless asked.
When the answer I provide is repeatedly wrong, ask me if I want the correct answer.
You can also offer to provide the answer when you have a more efficient solution that a beginner can understand. In such a case you'll explain why your proposed solution is better.
When you give me exercises, you can pretend to be a compiler and 'run' my code to check my work.
Do you understand and accept this role? Just answer with "Yes" or "No".

# Prompt part 2

Ok, So below is the breakdown of the program I want to learn how to code.

It must simulate an ant colony using reinforcement learning, specifically the Deep Deterministic Policy Gradient (DDPG) algorithm. 
Here's a breakdown of what the program does:

1. Environment Setup:
   - Creates a Pygame window to visualize the simulation.
   - Sets up a nest and randomly placed food items.

2. Ant Behavior:
   - Implements ants as agents that can move around the environment.
   - Ants search for food, pick it up, and return it to the nest.
   - Their movement is controlled by a neural network (the DDPG agent).

3. Reinforcement Learning:
   - Uses the DDPG algorithm to train the ants' behavior.
   - The state includes the ant's position, angle, and relative position to food or nest.
   - The action is the change in the ant's angle of movement.
   - Rewards are given based on proximity to food/nest and successful food retrieval.

4. Neural Network:
   - Implements a neural network with three layers for both the actor (policy) and critic (value function) networks.

5. Simulation Loop:
   - Continuously updates the environment, ant positions, and food status.
   - Renders the simulation graphically using Pygame.
   - Allows for resetting the simulation with the 'R' key.

6. Learning Process:
   - As ants interact with the environment, the DDPG agent learns to optimize their behavior.
   - The agent can save and load its learned parameters.

7. Visualization:
   - Displays ants moving around, picking up food, and returning to the nest.
   - Shows the nest and food items in the environment.

The main goal of this program is to demonstrate how reinforcement learning can be used to teach agents (ants) to perform complex tasks (finding and retrieving food) in a simulated environment. 
It combines elements of artificial intelligence, game development, and biological simulation.

Do you confirm that you accept the role of a teacher and will provide me with a 96-hour Python and machine learning bootcamp (reinforcement learning curriculum) that will help me code it?
I am starting from a beginner level.
If you confirm, please give me a recap of the rules you must follow.


# Exercise 1: Simulate Ant Movement in a Straight Line

You need to write a function called move_ant that simulates an ant moving 1 step forward on a straight path until it reaches a food item.
Start by defining the distance to the food, then move the ant toward the food one step at a time until it reaches it.
Print out how many steps it took to reach the food.

Your Task:

Write the code then run it in your Python environment using the command "python ex1_move_ant.py".
Once it works, try changing the food_distance and see how the steps change.

In [None]:
# My solution to exercise 1

def move_ant():
    food_distance = 10 # Distance between ant and food
    ant_position = 0 # Ant starts at position 0
    total_steps = 0 # Number of steps taken

    # Loop until the ant reaches the food
    while ant_position < food_distance:
        ant_position += 1 # moves the ant forward 1 step
        total_steps += 1 # increments the step count by 1
        print(f"Step {total_steps}: the ant is now at position {ant_position}.")

    print(f"Ant reached the food in {total_steps} steps.")

move_ant()
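
To try different distances without editing the function body each time, one option is to pass the distance in as a parameter and return the step count. The name `steps_to_food` is hypothetical and used only for this sketch.

```python
def steps_to_food(food_distance):
    """Count how many 1-unit steps the ant needs to cover food_distance."""
    ant_position = 0
    total_steps = 0
    while ant_position < food_distance:
        ant_position += 1  # move forward one step
        total_steps += 1
    return total_steps

# Compare a few distances in one run
for distance in (5, 10, 25):
    print(f"distance={distance}: {steps_to_food(distance)} steps")
```

Returning the count instead of printing it inside the function makes the result easy to reuse or check later.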

# Exercise 2: Add Random Movements to the Ant's Journey

Ants don’t always walk in straight lines, right? Let’s give our ant some randomness in its movement to simulate how it would explore the environment. It’ll still head toward the food but might take a few random steps along the way.

Your Task:
Modify the move_ant() function to simulate some randomness in the ant's movement.
Use Python’s random module to randomly decide whether the ant moves forward 1, 2, or even 3 steps at a time.
Print the position of the ant after each step, just like before.
Here’s a hint to get you started:

You can use random.randint(1,3) to randomly select how many steps the ant moves at each point.
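
A quick way to see what `random.randint(1, 3)` produces is to draw many values and look at which ones appear. Both endpoints are included, so the only possible results are 1, 2, and 3. The seed value below is arbitrary and only makes repeated runs produce the same sequence.

```python
import random

random.seed(0)  # arbitrary fixed seed so repeated runs give the same sequence
rolls = [random.randint(1, 3) for _ in range(1000)]

# Show which distinct values were drawn and how often each occurred
for value in sorted(set(rolls)):
    print(f"{value}: drawn {rolls.count(value)} times")
```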

In [None]:
# My solution to exercise 2

import random

def move_ant():
    food_distance = 10 # Distance between ant and food
    ant_position = 0
    total_moves = 0 # Number of moves made so far

    # Loop until the ant reaches (or passes) the food
    while ant_position < food_distance:
        step = random.randint(1, 3) # random step size for this move (1 to 3 inclusive)
        ant_position += step
        total_moves += 1
        print(f"Ant took {step} step(s). It is now at position {ant_position}.")
        if ant_position == food_distance:
            print(f"Ant reached the food in {total_moves} move(s).")
        if ant_position > food_distance:
            print(f"The ant overshot the food by {ant_position - food_distance} step(s).")
    print(f"Ant moved {total_moves} time(s) and ended at position {ant_position}.")

move_ant()

# Exercise 3: Ant Direction Control
Next, we’ll implement a more complex movement system where the ant can choose to turn left or right after moving toward the food. This will make the simulation more realistic, as ants often change direction while exploring.

## Your Task:
Modify the move_ant() function to allow the ant to turn left or right randomly after it takes a certain number of steps (let’s say every 3 steps).
Each time the ant changes direction, print a message indicating the new direction.

### Add a simple control mechanism for the ant's direction:

For this example, let's assume:

- Left (L): -1 (negative position change)
- Right (R): +1 (positive position change)

The ant should randomly decide to turn left or right after it takes 3 steps, and it should keep track of its overall movement.
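
The left/right convention above can be expressed as a small lookup table. This is a sketch under the stated assumption that a turn simply shifts the position by -1 or +1. The names `DIRECTION_DELTAS`, `apply_turn`, and `random_turn` are hypothetical.

```python
import random

# Assumed mapping from the exercise text: L shifts position by -1, R by +1
DIRECTION_DELTAS = {"L": -1, "R": +1}

def apply_turn(ant_position, direction):
    """Return the new position after turning in the given direction."""
    return ant_position + DIRECTION_DELTAS[direction]

def random_turn(ant_position):
    """Pick L or R at random and return (direction, new_position)."""
    direction = random.choice(list(DIRECTION_DELTAS))
    return direction, apply_turn(ant_position, direction)

direction, position = random_turn(5)
print(f"Ant turned {direction}; position is now {position}.")
```

Keeping the mapping in one dictionary avoids repeating the -1/+1 logic in separate `if`/`else` branches.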

In [None]:
import random

def move_ant():
    food_distance = 10  # Distance between ant and food
    ant_position = 0
    total_steps = 0  # Total number of steps taken
    steps_before_turn = 0  # Track steps before turning

    while ant_position < food_distance:
        step = random.randint(1, 3)  # Move 1 to 3 steps forward
        ant_position += step
        total_steps += 1
        steps_before_turn += 1  # Increment the steps before turn

        print(f"Ant took {step} steps. It is now at position {ant_position}.")

        if steps_before_turn >= 3:  # After every 3 steps
            # Randomly decide to turn left or right
            direction = random.choice(['L', 'R'])
            if direction == 'L':
                print("Ant turned left!")
                ant_position -= 1  # Move left (backward)
            else:
                print("Ant turned right!")
                ant_position += 1  # Move right (forward)
            steps_before_turn = 0  # Reset steps before turn

        if ant_position > food_distance:
            print(f"The ant overshot the food by {ant_position - food_distance} steps.")

    print(f"Ant reached the food in {total_steps} steps!")

move_ant()


In [3]:
# My solution to exercise 3 (Claude 3.5 Sonnet proposed solution)

import random

def move_ant():
    # Constants
    food_distance = 10  # Distance between ant and food
    
    # Ant state
    ant_position = 0
    total_steps = 0
    steps_since_turn = 0
    current_direction = "forward"  # Can be "forward", "left", or "right"
    
    # Direction changes don't affect forward progress for simplicity
    while ant_position < food_distance:
        # Determine step size
        step = random.randint(1, 3)
        ant_position += step
        total_steps += 1
        steps_since_turn += 1
        
        # Print step information
        print(f"Ant took {step} step(s) {current_direction}. Now at position {ant_position}.")
        
        # Check if it's time to potentially change direction (every 3 steps)
        if steps_since_turn >= 3:
            new_direction = random.choice(["forward", "left", "right"])
            if new_direction != current_direction:
                print(f"Ant turned {new_direction}!")
                current_direction = new_direction
            steps_since_turn = 0
        
        # Check position relative to food
        if ant_position == food_distance:
            print(f"Ant reached the food in {total_steps} step(s)!")
        elif ant_position > food_distance:
            print(f"Ant overshot the food by {ant_position - food_distance} step(s).")
    
    print(f"Simulation complete. Ant took a total of {total_steps} steps.")

# Run the simulation
if __name__ == "__main__":
    move_ant()

Ant took 2 step(s) forward. Now at position 2.
Ant took 1 step(s) forward. Now at position 3.
Ant took 3 step(s) forward. Now at position 6.
Ant turned left!
Ant took 1 step(s) left. Now at position 7.
Ant took 2 step(s) left. Now at position 9.
Ant took 1 step(s) left. Now at position 10.
Ant turned right!
Ant reached the food in 6 step(s)!
Simulation complete. Ant took a total of 6 steps.
