# AgentGPT Cloud Training with Local Environment Integration

This notebook demonstrates how to integrate local environment simulators with cloud-based training on AWS SageMaker.

The training job runs on AWS SageMaker in the cloud, while the environment simulators run on local machines and are exposed through their IP addresses and ports. For the two sides to connect:

- **Network Accessibility:** The SageMaker instance must be able to reach each simulator's IP address and port. If the local machines sit behind NAT, configure port forwarding, or connect them through a VPN, so that the endpoints are reachable from the cloud.
- **Endpoint Configuration:** Each simulator endpoint (IP address and port) is passed to the training job in the hyperparameter configuration, which the job uses to connect to the simulators.

This setup lets you train on cloud compute while the locally hosted environments handle simulation and data collection.
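Before starting a training job, it can help to confirm that each endpoint accepts TCP connections. The sketch below is not part of AgentGPT; `endpoint_reachable` and `check_endpoints` are hypothetical helpers built on the Python standard library. It only proves reachability from the machine that runs it, so run it from a host outside the local network to approximate what the SageMaker instance will see.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"endpoint must include a host and a port: {endpoint!r}")
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_endpoints(endpoints, timeout: float = 3.0) -> dict:
    """Map each endpoint URL to its reachability result."""
    return {endpoint: endpoint_reachable(endpoint, timeout) for endpoint in endpoints}
```

For example, `check_endpoints(["http://9.67.82.216:56780", "http://9.67.82.216:56781"])` returns a dictionary with `True` or `False` for each endpoint. A `False` result usually points to a missing port-forwarding rule or a firewall.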


## Step 1. Launch Local Environment Simulators

The cells below launch two environment simulators on each of three local machines, identified by their respective IP addresses. On each machine, the simulators listen on consecutive ports starting at 56780.

In [None]:
# Launch environments on the first local machine
from src.env_host.launcher import EnvLauncher

first_local_machine_env_launchers = []
num_envs = 2
for i in range(num_envs):
    env_launcher = EnvLauncher.launch_on_local_with_ip(
        env_simulator='gym', 
        ip_address='http://9.67.82.216', 
        host='0.0.0.0', 
        port=56780 + i
    )
    first_local_machine_env_launchers.append(env_launcher)

print('First machine environments launched:', first_local_machine_env_launchers)

In [None]:
# Launch environments on the second local machine
from src.env_host.launcher import EnvLauncher

second_local_machine_env_launchers = []
num_envs = 2
for i in range(num_envs):
    env_launcher = EnvLauncher.launch_on_local_with_ip(
        env_simulator='gym', 
        ip_address='http://40.167.14.65', 
        host='0.0.0.0', 
        port=56780 + i
    )
    second_local_machine_env_launchers.append(env_launcher)

print('Second machine environments launched:', second_local_machine_env_launchers)

In [None]:
# Launch environments on the third local machine
from src.env_host.launcher import EnvLauncher

third_local_machine_env_launchers = []
num_envs = 2
for i in range(num_envs):
    env_launcher = EnvLauncher.launch_on_local_with_ip(
        env_simulator='gym', 
        ip_address='http://209.172.43.69', 
        host='0.0.0.0', 
        port=56780 + i
    )
    third_local_machine_env_launchers.append(env_launcher)

print('Third machine environments launched:', third_local_machine_env_launchers)

## Step 2. Configure AWS SageMaker Training

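This cell defines the SageMaker configuration for the cloud training job: the IAM role, the trainer image, the S3 output location, the region, the instance type, and a maximum runtime of 24 hours (`max_run`, expressed here in seconds).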

In [None]:
from src.config.aws_config import SageMakerConfig

role_arn = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"
image_uri = "agent-gpt-trainer.ccnets.org"
output_path = "s3://agent-gpt-ap-northeast-2"
instance_type = "ml.g5.4xlarge"

sagemaker_config = SageMakerConfig(
    output_path=output_path, 
    image_uri=image_uri, 
    region="ap-northeast-2",
    role_arn=role_arn, 
    instance_type=instance_type, 
    max_run=24 * 3600
)

print('SageMaker configuration set:', sagemaker_config)

## Step 3. Set Hyperparameters and Environment Hosts

The next cell initializes the RL model parameters and sets up the environment hosts with the local endpoints. These endpoints will be used by the cloud training job to connect to the local simulators.

In [None]:
from src.config.hyperparams import Hyperparameters, EnvHost, Exploration

num_hosts = 2
hyperparams = Hyperparameters(env_id='Humanoid-v5')
hyperparams.batch_size = 128
hyperparams.lr_init = 2e-4
hyperparams.lr_end = 1e-6
hyperparams.max_steps = 20_000_000
hyperparams.set_exploration('continuous', Exploration(type='gaussian_noise'))

# Register the endpoints of every machine; num_hosts is the number of simulators per machine.
# Host names run from local0 to local5, in the same order as the launch cells above.
machine_urls = ["http://9.67.82.216", "http://40.167.14.65", "http://209.172.43.69"]
for machine_index, machine_url in enumerate(machine_urls):
    for i in range(num_hosts):
        host_name = f"local{machine_index * num_hosts + i}"
        endpoint = f"{machine_url}:{56780 + i}"
        hyperparams.set_env_host(host_name, EnvHost(env_endpoint=endpoint, num_agents=128))

print('Hyperparameters and environment hosts configured:')
print(hyperparams)
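Because the host names and ports are computed from offsets, an off-by-one in the indexing can make two hosts share an endpoint without raising any error. The following is a minimal sketch of a sanity check, assuming the host table has the shape shown in the training log (a name mapped to `env_endpoint` and `num_agents`). `summarize_env_hosts` is a hypothetical helper, not an AgentGPT function, and the dictionary is rebuilt by hand here rather than read from `hyperparams`.

```python
from collections import Counter


def summarize_env_hosts(env_hosts: dict) -> dict:
    """Reject duplicate endpoints and report host and agent totals."""
    endpoints = [host["env_endpoint"] for host in env_hosts.values()]
    duplicates = sorted(e for e, count in Counter(endpoints).items() if count > 1)
    if duplicates:
        raise ValueError(f"duplicate endpoints: {duplicates}")
    return {
        "num_hosts": len(env_hosts),
        "total_agents": sum(host["num_agents"] for host in env_hosts.values()),
    }


# Hand-built copy of the layout configured above: 3 machines x 2 simulators each.
env_hosts = {
    f"local{m * 2 + i}": {"env_endpoint": f"http://{ip}:{56780 + i}", "num_agents": 128}
    for m, ip in enumerate(["9.67.82.216", "40.167.14.65", "209.172.43.69"])
    for i in range(2)
}

summary = summarize_env_hosts(env_hosts)
print(summary)  # {'num_hosts': 6, 'total_agents': 768}
```

With six hosts of 128 agents each, the training job collects experience from 768 agents in total.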

## Step 4. Launch Cloud Training

Finally, this cell launches the training job on AWS SageMaker. The training job will connect to the local environment endpoints specified in the hyperparameters.
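SageMaker hands hyperparameters to the training container as a flat map of strings (the log below shows them being read from `/opt/ml/input/config/hyperparameters.json`), so nested values such as the environment hosts have to be serialized on the way in and parsed on the way out. How AgentGPT does this internally is not shown in this notebook. The sketch below illustrates one plausible approach using JSON, with hypothetical helper names.

```python
import json


def encode_hyperparameters(params: dict) -> dict:
    """Turn every non-string value into a JSON string."""
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in params.items()
    }


def decode_hyperparameters(raw: dict) -> dict:
    """Parse JSON-encoded values; leave plain strings untouched."""
    decoded = {}
    for key, value in raw.items():
        try:
            decoded[key] = json.loads(value)
        except json.JSONDecodeError:
            decoded[key] = value  # not valid JSON, e.g. 'Humanoid-v5'
    return decoded


params = {
    "env_id": "Humanoid-v5",
    "batch_size": 128,
    "env_hosts": {
        "local0": {"env_endpoint": "http://9.67.82.216:56780", "num_agents": 128},
    },
}
raw = encode_hyperparameters(params)
restored = decode_hyperparameters(raw)
```

One limitation of this simple scheme is that a plain string that happens to be valid JSON, such as `"123"`, comes back as a number.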

In [None]:
from src.agent_gpt import AgentGPT

# Launch the training job on AWS SageMaker
AgentGPT.train_on_cloud(sagemaker_config, hyperparameters=hyperparams)

INFO:sagemaker:Creating training-job with name: agent-gpt-trainer-2025-02-12-07-29-14-875


2025-02-12 07:29:17 Starting - Starting the training job
2025-02-12 07:29:17 Pending - Training job waiting for capacity......
2025-02-12 07:30:00 Pending - Preparing the instances for training...
2025-02-12 07:30:39 Downloading - Downloading the training image...............
2025-02-12 07:33:05 Training - Training image download completed. Training in progress....
2025-02-12 07:33:52,349	INFO worker.py:1841 -- Started a local Ray instance.
[INFO] Loading hyperparameters from /opt/ml/input/config/hyperparameters.json
[INFO] Setting 'batch_size' to: 256
[INFO] Setting 'buffer_size' to: 1000000
[INFO] Setting 'd_model' to: 384
[INFO] Setting 'dropout' to: 0.1
[INFO] Setting 'env_hosts' to: {'local0': {'host_id': None, 'env_endpoint': 'http://9.67.82.216:56780', 'num_agents': 128}, 'local1': {'host_id': None, 'env_endpoint': 'http://9.67.82.216:56781', 'num_agents': 128}, 'local2': {'host_id': None, 'env_endpoint': 'http://40.167.14.65:56780', 'num_agents': 128}, 'local3': {'host_id': N