
# Random and age agents for Vampire and iProver


## Random agent for Vampire

We can make a prover environment like any other Gymnasium environment.
We always add a wrapper to extract formula labels.



In [None]:
import gymnasium as gym

from gym_saturation.wrappers import LabelsExtractor

env = LabelsExtractor(gym.make("Vampire-v0"))

Before using the environment, we should reset it:



In [None]:
observation, info = env.reset()

``gym-saturation`` environments don't return any ``info``:



In [None]:
print(info)

The ``observation`` entry is a tuple of CNF formulae.
By default, we are trying to prove a basic group theory lemma:
every idempotent element equals the identity.



In [None]:
print("Observation:")
print("\n".join(observation["observation"]))

The wrapper extracts formula labels for us:



In [None]:
labels = list(observation["labels"])
print(labels)
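
To make the idea of a label concrete, here is a minimal sketch of pulling clause names out of TPTP-style strings. It assumes each clause has the shape ``cnf(name, role, formula)``. The function ``extract_labels`` and the sample clauses are hypothetical illustrations, not the actual implementation of ``LabelsExtractor`` and not real prover output.

```python
import re

# Assumption: clauses are TPTP strings of the form ``cnf(name, role, formula)``.
LABEL_PATTERN = re.compile(r"^\s*cnf\(\s*([^,\s]+)\s*,")


def extract_labels(clauses):
    """Return the name of every clause that matches the assumed TPTP shape."""
    found = []
    for clause in clauses:
        match = LABEL_PATTERN.match(clause)
        if match:
            found.append(match.group(1))
    return found


# Hypothetical clauses, not actual prover output.
sample_clauses = (
    "cnf(c_1, axiom, mult(e, X) = X).",
    "cnf(c_2, negated_conjecture, a != e).",
)
sample_labels = extract_labels(sample_clauses)
print(sample_labels)
```

Strings that do not match the assumed shape are skipped rather than raising an error.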

Here is an example of an episode during which we play random actions.
We set the random seed for reproducibility.



In [None]:
import random

random.seed(0)

terminated, truncated = False, False
while not (terminated or truncated):
    action = random.choice(labels)
    observation, reward, terminated, truncated, info = env.step(action)
    print("Action:", action, "Observation:")
    print("\n".join(observation["observation"]))
    labels.remove(action)
    labels += list(observation["labels"])

env.close()
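
The selection logic of the random agent can also be studied without a prover. The sketch below uses only the standard library. The helpers ``pop_random`` and ``play_order`` and the labels are hypothetical. It uses a local ``random.Random`` instance instead of the global seed, which is a design choice of this sketch rather than something the environment requires.

```python
import random


def pop_random(pending, rng):
    """Remove and return a uniformly chosen label from ``pending``."""
    index = rng.randrange(len(pending))
    return pending.pop(index)


def play_order(initial_labels, seed):
    """Return the order in which a seeded random agent would pick labels."""
    rng = random.Random(seed)  # local generator, global state stays untouched
    pending = list(initial_labels)
    order = []
    while pending:
        order.append(pop_random(pending, rng))
    return order


# Hypothetical labels; a real episode would also append newly derived ones.
first_run = play_order(["c_1", "c_2", "c_3", "c_4"], seed=0)
second_run = play_order(["c_1", "c_2", "c_3", "c_4"], seed=0)
print(first_run == second_run)  # same seed, same order
```

Unlike the episode loop, this toy version stops when no labels are left, because there is no environment to report termination.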

The random-agent episode has terminated:



In [None]:
print(terminated, truncated)

This means we arrived at a contradiction (``$false``), which proves the lemma.



In [None]:
print(observation["observation"][-1])
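
The same check can be wrapped in a small helper. This is a sketch under the assumption that the contradiction always appears as the literal text ``$false`` in the most recent clause. The name ``is_refutation`` and the sample clauses are hypothetical.

```python
def is_refutation(clauses):
    """Check whether the most recent clause contains ``$false``."""
    return bool(clauses) and "$false" in clauses[-1]


# Hypothetical clauses, not actual prover output.
proof_found = is_refutation(
    ("cnf(c_1, axiom, mult(e, X) = X).", "cnf(c_9, plain, $false).")
)
no_proof_yet = is_refutation(("cnf(c_1, axiom, mult(e, X) = X).",))
print(proof_found, no_proof_yet)
```

An empty sequence of clauses is treated as "no proof yet".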

## Age agent for iProver

We initialise an iProver-based environment in the same way:



In [None]:
env = LabelsExtractor(gym.make("iProver-v0"))

When running in Jupyter, we additionally need ``nest_asyncio``:



In [None]:
import nest_asyncio

nest_asyncio.apply()
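
A likely reason for this step is that Jupyter already runs an asyncio event loop, and standard asyncio refuses to start a second loop inside a running one. That the iProver environment runs into this restriction is an assumption here, not something stated above. The standard-library sketch below reproduces the restriction when run as a plain Python script. The function names are hypothetical.

```python
import asyncio


async def answer():
    return 42


async def notebook_like_cell():
    """Try to start a second event loop while one is already running."""
    coroutine = answer()
    try:
        asyncio.run(coroutine)
    except RuntimeError as error:
        coroutine.close()  # avoid a "never awaited" warning
        return f"RuntimeError: {error}"
    return "nested loop allowed"


nested_result = asyncio.run(notebook_like_cell())
print(nested_result)
```

Without any patching, the inner ``asyncio.run`` call raises ``RuntimeError``. ``nest_asyncio.apply()`` patches asyncio so that such nested calls are permitted.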

Instead of a random agent, let's use an age agent, which selects actions in the
order they appear:



In [None]:
observation, info = env.reset()
print("Observation:")
print("\n".join(observation["observation"]))
labels = list(observation["labels"])
terminated, truncated = False, False
while not (terminated or truncated):
    action = labels.pop(0)
    observation, reward, terminated, truncated, info = env.step(action)
    print("Action:", action, "Observation:")
    print("\n".join(observation["observation"]))
    labels += list(observation["labels"])
env.close()

We still arrive at a contradiction:



In [None]:
print(terminated, truncated)
print(observation["observation"][-1])
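
The age agent used in this section is in effect a first-in, first-out queue over labels. The sketch below shows the same policy with ``collections.deque``, which pops from the left in constant time, whereas ``list.pop(0)`` shifts every remaining element. The function ``age_order`` and the toy derivation table are hypothetical stand-ins for ``env.step``.

```python
from collections import deque


def age_order(initial_labels, derive):
    """Pick labels first-in, first-out; ``derive`` returns new labels per pick."""
    queue = deque(initial_labels)
    order = []
    while queue:
        label = queue.popleft()  # oldest label first
        order.append(label)
        queue.extend(derive(label))  # newly derived labels go to the back
    return order


# Hypothetical derivation table standing in for env.step.
toy_children = {"c_1": ["c_3"], "c_2": ["c_4"], "c_3": [], "c_4": []}
age_run = age_order(["c_1", "c_2"], lambda label: toy_children[label])
print(age_run)  # ['c_1', 'c_2', 'c_3', 'c_4']
```

Initial labels are processed before anything derived from them, which is the order the age agent follows.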