AI In Ecology: Modeling a Predator-Prey Environment with Q-Learning

Capstone project by Yuki Kawahara and Ahmed Aldirderi Abdalla Ahmed

The intersection of artificial intelligence and ecology is understudied. The growing importance of ecology and the increasing accessibility and applicability of artificial intelligence open the door to interdisciplinary applications of the two fields. Our project investigates the possibility of simulating complex predator-prey interactions in a virtual ecosystem of artificially intelligent agents by implementing multi-agent reinforcement learning. We found Q-learning to be limited by its finite, discrete state representation, which prevents complex behaviors from emerging and caps the feasible scale of such an environment at the size of the state-action space. An implementation using Deep Q-Networks would improve on the results observed in these experiments.

About

Ecosystems on Earth often involve a complex community of individual agents interacting with one another, each making behavioral decisions based on its environment. These agents make up most of the biotic part of the interconnected web of activity in a given space, displaying communication, community, and reproduction within species, and competition between species for survival. Modeling and simulating these natural-world behaviors in a virtual environment could help us understand the complex web of interactions in an ecosystem and observe how particular environmental and behavioral scenarios correspond to real-world environmental changes. We seek to use multi-agent reinforcement learning to simulate predator-prey dynamics.

We were inspired to ask whether artificial intelligence could replicate these complex environmental dynamics well enough to observe and predict critical changes in an environment, and how those changes could impact the greater ecosystem. Advanced implementations of AI, such as deep reinforcement learning, would allow the simulation of sophisticated ecological systems with cognitive agents. However, agents in the real world pursue multiple goals and activities involving many kinds of interactions, and it would be unrealistic to try to recreate every aspect of a given species. We therefore focus solely on the predator-prey dynamics observed in real-life species: how agents learn from and adapt to their environment, both to hunt sources of food and to evade capture and survive longer.

To that end, we built an artificial life simulation from scratch that uses multi-agent reinforcement learning (MARL) through tabular Q-learning: a randomized virtual environment populated with blank-slate agents that learn behavioral tactics from their surroundings. Our initial approach was for each agent, of either a predator or a prey type, to learn to survive for as many time steps as possible by hunting or by avoiding being hunted. The investigation asks whether MARL can recreate or mimic established models of predator-prey interaction, such as the Lotka–Volterra population model (see the sketches below). Such a capability would strengthen the quantitative side of ecology, particularly in an era of rapid ecological change driven by environmental crises such as climate change.
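For reference, the classical Lotka–Volterra model couples a prey population $x$ and a predator population $y$ through two differential equations, with prey growth rate $\alpha$, predation rate $\beta$, predator reproduction rate $\delta$, and predator death rate $\gamma$:

$$\frac{dx}{dt} = \alpha x - \beta x y, \qquad \frac{dy}{dt} = \delta x y - \gamma y$$

The sketch below shows the kind of per-agent tabular Q-learning update such a simulation relies on. It is a minimal illustration, not the project's code: the state encodings, reward values, and hyperparameters are assumptions chosen for readability.

```python
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount, exploration

# One table per agent: maps a discretized state to a value for each action.
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose_action(state):
    """Epsilon-greedy: usually exploit the table, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(Q[state], key=Q[state].get)

def update(state, action, reward, next_state):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state].values())
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

# Hypothetical step for a prey agent whose state is the quantized direction
# of the nearest predator; surviving the step yields a small reward.
s = ("predator_north",)
a = choose_action(s)
update(s, a, reward=1.0, next_state=("predator_far",))
```

Because every distinct state needs its own row, the table grows multiplicatively with each sensory feature an agent tracks, which is the scaling problem discussed in the conclusions.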

Experiment Conclusions

Reinforcement learning offers an interesting approach to simulating the dynamics between animals and their ecological environments. Q-learning provides a relatively simple entry point to this simulation problem, particularly in its categorization of actions, states, and rewards. However, the simplicity of tabular Q-learning imposes a limitation: its need for discrete information restricts agents' ability to learn and predict complex dynamics and patterns, and to adapt quickly to changing world states. We identified that Deep Q-Networks allow a more flexible approach that can keep up with dynamic, changing environments. Moreover, Deep Q-Networks use memory far more efficiently, replacing the large tables that tabular Q-learning must maintain to hold a value for every state. We find that our predator-prey simulation is effective in static environments but lacks adequate sensory data for agents to adapt to changing states and to recognize individual spatial patterns. Tabular Q-learning offers no realistic path to handling increasingly complex state-action spaces, necessitating other model-free reinforcement learning methods with more capable infrastructure.
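To make the memory argument concrete, the sketch below contrasts how a Q-table grows with each added state variable against a small Deep Q-Network whose parameter count stays fixed. The grid size, observation features, and network shape are assumptions for illustration, not the project's actual configuration.

```python
import torch
import torch.nn as nn

GRID = 20       # hypothetical grid side length
N_ACTIONS = 4   # up, down, left, right

# Tabular: one row per discretized state. Tracking only the agent's own
# (x, y) position needs GRID**2 rows; also tracking the nearest predator's
# position multiplies the table by another GRID**2.
rows_self_only = GRID * GRID             # 400 rows
rows_with_neighbor = (GRID * GRID) ** 2  # 160,000 rows

# DQN alternative: a fixed-size network maps a continuous observation
# vector (e.g. distances and headings) straight to one Q-value per action,
# so no table is enumerated and similar states share what was learned.
class DQN(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

q_net = DQN(obs_dim=6, n_actions=N_ACTIONS)
print(sum(p.numel() for p in q_net.parameters()))  # ~4.9k weights, fixed
```

The network's capacity does not depend on how finely the environment is discretized, which is why a DQN could accept the richer sensory inputs our tabular agents lacked.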