# Cournot duopoly

**Context:** there are two companies producing the same identical good for the same group of customers. At each time instant, each company has to decide how much of that good to produce. Conversely, at each time instant the group of consumers has to decide whether to buy and, if so, from which of the two companies. Companies strive to maximize profit.

**Model:** the problem is approached from the active inference perspective. Companies are modeled as agents for which:
- at each time step the agent decides the quantity ($q_i$) of goods to produce
- production costs $C(q_i)$ are fixed
- customers set the price according to a price function $P(Q)$, with $Q=q_1+q_2$, which is unknown to the agents
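
To make the model concrete, here is a minimal sketch of a Cournot payoff. It assumes a linear price function $P(Q)=\max(0, a - bQ)$ (the same form the environment further down computes) and a linear cost $C(q_i)=c\,q_i$. The function names `price` and `profit` and all parameter values are illustrative assumptions, not part of the notebook.

```python
def price(total_quantity, max_price=10.0, sensitivity=1.0):
    """Linear price function, floored at zero (assumed form)."""
    return max(0.0, max_price - sensitivity * total_quantity)


def profit(q_own, q_other, unit_cost=2.0, max_price=10.0, sensitivity=1.0):
    """Profit of one company: revenue at the market price minus production cost."""
    p = price(q_own + q_other, max_price, sensitivity)
    return q_own * p - unit_cost * q_own


# Profit of company 1 for each production level, with company 2 fixed at 2 units
quantities = range(6)
profits = [profit(q, q_other=2) for q in quantities]
best_q = max(quantities, key=lambda q: profits[q])

print(profits)
print("best response to q_other=2:", best_q)
```

With these assumed numbers the profit is $q(6-q)$, so the best response on the grid is 3 units, which agrees with the continuous best response $(a - c - b\,q_{other})/(2b)$.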

**Further assumptions:** these are minor assumptions made in the code:
- the total number of customers is unknown to the companies and fixed over time [variant: $\sim N(\mu,\sigma)$ to give the agent some variance]
- the amount of goods sold by the first company is handled through a "reputation" parameter ($\in[0,1]$): the higher it is, the more goods the company sells. The second company gets the remaining (1-reputation) share of consumers.

**Active Inference perspective:** What follows is a brief recap of the agent and environment design. The first point concerns the multi-agent setting within the active inference paradigm. Not having found examples of this specific case, I imagined two agents acting through the same set of possible actions, in the same environment, at the same time steps. This is also the reason for choosing the "Cournot duopoly" example.
Furthermore, for every company:

- observations O={profit}; [currently the observations are sold_items... revenue is variable... think it over]
- states S={sold_item}; 
- actions U={how much do I produce};

- Likelihood matrix A assumes perfect perception;
- Transition matrix B assumes that sales are less than or equal to the amount produced, in an uninformative way; [maybe consider a bell-shaped distribution]
- Goal-oriented priors C assume the basic "the more I sell, the more I gain", assuming a selling price; [think about this seriously...]
- Prior over states D is uninformative: I expect to see the market.

Finally, the environment takes as input the amounts produced by the two companies, computes the market price [which is never used!!!] and assigns items to customers according to the reputation. The output of the environment is the sales data for both companies.



In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
import pymdp
from pymdp import utils

## Problem initialization

Dimensionalities of the hidden state factors, the control factors, and the observation modalities.

In [3]:
Sales_1 =  np.array([0, 1, 2, 3, 4, 5])
Sales_2 =  np.array([0, 1, 2, 3, 4, 5])

Production_1 =  np.array([0, 1, 2, 3, 4, 5])
Production_2 =  np.array([0, 1, 2, 3, 4, 5])

# number of states
n_states_1 = [len(Sales_1), len(Production_1)]
n_states_2 = [len(Sales_2), len(Production_2)]

# number of factors
n_factors_1 = len(n_states_1) # ? to be figured out
n_factors_2 = len(n_states_2)

# observations of sales 
obs_sales_1 = np.array([0, 1, 2, 3, 4, 5])
obs_sales_2 = np.array([0, 1, 2, 3, 4, 5])

# number of observations and modalities
num_obs_1 = len(obs_sales_1)
num_obs_2 = len(obs_sales_2)

num_modalities_1 = 1
num_modalities_2 = 1

# num_modalities_1 = len(num_obs_1) here I have only one modality
# num_modalities_2 = len(num_obs_2)

## Agents initialization

### Likelihood matrices

In [4]:
A1 = utils.obj_array(num_modalities_1)
A2 = utils.obj_array(num_modalities_2)

Modeling perfect perception of the sold items:

In [5]:
A1[0] = np.eye(n_states_1[0])
A2[0] = np.eye(n_states_2[0])
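
To illustrate what "perfect perception" implies, the sketch below performs a plain Bayes update with an identity likelihood. It uses only NumPy rather than pymdp's own inference routine, and the variable names are illustrative assumptions.

```python
import numpy as np

n_sales = 6                             # sales levels 0..5, as in the notebook
A = np.eye(n_sales)                     # A[o, s] = P(o | s): perfect perception
prior = np.full(n_sales, 1 / n_sales)   # uniform prior over the sales state

obs = 3                                 # the agent observes 3 sold items
posterior = A[obs, :] * prior           # likelihood of the observation times the prior
posterior /= posterior.sum()            # normalize to a probability distribution

print(posterior)
```

Because every column of the identity matrix has a single nonzero entry, a single observation removes all uncertainty about the sales state, whatever the prior.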

### Transition matrices

In [6]:
B1 = np.zeros((n_states_1[0], n_states_1[0], n_states_1[1]))
B2 = np.zeros((n_states_2[0], n_states_2[0], n_states_2[1]))

In [7]:
for production in range(n_states_1[1]):  # for every level of production
    for start_state in range(n_states_1[0]):  # initial state of sales
        # sales cannot exceed either the previous sales level or the production
        max_sales = min(start_state, production)
        # uninformative (uniform) distribution over 0..max_sales
        B1[:max_sales+1, start_state, production] = 1 / (max_sales + 1)

In [8]:
for production in range(n_states_2[1]):  # for every level of production
    for start_state in range(n_states_2[0]):  # initial state of sales
        # sales cannot exceed either the previous sales level or the production
        max_sales = min(start_state, production)
        # uninformative (uniform) distribution over 0..max_sales
        B2[:max_sales+1, start_state, production] = 1 / (max_sales + 1)
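
Since the two loops are identical, the construction can be sketched as a single helper, together with a check that every column is a valid probability distribution. The function name `build_sales_transition` is an illustrative assumption.

```python
import numpy as np


def build_sales_transition(n_sales, n_production):
    """Return B[next_sales, current_sales, production], uniform over 0..min(current_sales, production)."""
    B = np.zeros((n_sales, n_sales, n_production))
    for production in range(n_production):
        for current_sales in range(n_sales):
            max_sales = min(current_sales, production)
            B[: max_sales + 1, current_sales, production] = 1 / (max_sales + 1)
    return B


B = build_sales_transition(6, 6)

# Each column (fixed current_sales and production) must sum to one
assert np.allclose(B.sum(axis=0), 1.0)

# Current sales 4, production 2: uniform over next sales 0, 1, 2
print(B[:, 4, 2])
```

Note that under this rule the next sales level can never exceed the current one, whatever the production, because the upper bound is `min(current_sales, production)`.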

### Goal oriented priors 

In [9]:
production_price_1 = 50 # price of a single piece
discount_rate_1 = 1 # the more I produce the less I pay

# One preference value per level 0..5; the Agent reads C as preferences over the sales observations
C1 = production_price_1 * discount_rate_1 * Production_1

In [10]:
production_price_2 = 5 # price of a single piece
discount_rate_2 = 1 # the more I produce the less I pay

# One preference value per level 0..5; the Agent reads C as preferences over the sales observations
C2 = production_price_2 * discount_rate_2 * Production_2
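
To get a feeling for how strong these preferences are, the sketch below passes both vectors through a softmax. This assumes, as far as I understand pymdp, that C is treated as unnormalized log-preferences and normalized with a softmax internally; the `softmax` helper here is written by hand and is not the library's.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


levels = np.arange(6)            # sales levels 0..5
pref_1 = softmax(50 * levels)    # company 1: 50 per unit
pref_2 = softmax(5 * levels)     # company 2: 5 per unit

print(np.round(pref_1, 4))
print(np.round(pref_2, 4))
```

Under that assumption both agents put almost all preference mass on the highest sales level, and the difference between a unit value of 50 and one of 5 is barely visible after normalization.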

### States prior

In [11]:
# D = utils.obj_array_uniform(Sales_1)  # I would like to use this but it behaves strangely (it probably expects a list of shapes)
D1 = np.full(len(Sales_1), 1 / len(Sales_1))

In [12]:
D2 = np.full(len(Sales_2), 1 / len(Sales_2))

### Agents construction

In [13]:
from pymdp.agent import Agent

my_agent1 = Agent(A = A1, B = B1, C = C1, D = D1)
my_agent2 = Agent(A = A2, B = B2, C = C2, D = D2)

## Generative process

We now create an environment that processes the actions of the two agents and returns an observation for each.

Even though this is a first attempt and the agents only expect to see sales results, I'm already introducing the price function, which determines the market price of the good given the joint production of the two agents. It still has to be integrated into the model somewhere.

In [14]:
class CournotMarket(object):

    def __init__(self, reputation=0.5, total_demand=5, max_customer_price=10, sensibility=1):
        self.reputation = float(reputation)
        self.total_demand = float(total_demand)
        self.max_customer_price = float(max_customer_price)
        self.sensibility = float(sensibility)

    def step(self, action1, action2):
        # sample_action() returns a length-1 array: unwrap it to a plain integer
        action1 = int(np.asarray(action1).item())
        action2 = int(np.asarray(action2).item())

        # Demand served this step is capped by what was produced; the cap is local,
        # so self.total_demand stays fixed over time
        total_production = action1 + action2
        demand = min(self.total_demand, total_production)

        # Computed but not yet used in the observations
        market_price = max(0, self.max_customer_price - self.sensibility * total_production)

        # Company 1 sells at most its reputation share and never more than it produced
        actual_sales_1 = min(action1, int(demand * self.reputation))

        # Company 2 sells at most the remaining demand and never more than it produced
        actual_sales_2 = min(action2, int(demand - actual_sales_1))

        obs = [actual_sales_1, actual_sales_2]
        return obs
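
The allocation rule can be tried in isolation with the stateless sketch below, which also shows one possible way to fold the market price into the output as revenue. The functions `allocate_sales` and `revenues` are hypothetical helpers, and using revenue as an observation is an assumption, not something the notebook does.

```python
def allocate_sales(q1, q2, total_demand=6, reputation=0.5):
    """Split demand between two companies following the reputation rule."""
    demand = min(total_demand, q1 + q2)          # cannot sell more than was produced
    sales_1 = min(q1, int(demand * reputation))  # company 1 gets at most its reputation share
    sales_2 = min(q2, demand - sales_1)          # company 2 gets the remainder, up to its output
    return sales_1, sales_2


def revenues(q1, q2, max_price=10, sensitivity=1, **kwargs):
    """Hypothetical extension: turn sales into revenue using the linear market price."""
    market_price = max(0, max_price - sensitivity * (q1 + q2))
    s1, s2 = allocate_sales(q1, q2, **kwargs)
    return market_price * s1, market_price * s2


print(allocate_sales(1, 5))
print(allocate_sales(5, 5))
print(allocate_sales(5, 0))
print(revenues(2, 2))
```

One consequence of the rule worth noticing: company 1 is capped at its reputation share even when company 2 produces nothing, so `allocate_sales(5, 0)` leaves part of the demand unserved.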


## Active Inference Loop

Now let's write a function that takes:
- the agents
- the environment
- the time length

and runs the active inference loop.

In [19]:
def run_active_inference_loop(my_agent1, my_agent2, my_env, T = 5):
    """Run T steps of the perception-action loop for both agents and print the results."""
    # Initialize first observation
    obs = [1,5]

    for t in range(T):
        qs_1 = my_agent1.infer_states([obs[0]])
        qs_2 = my_agent2.infer_states([obs[1]])

        q_pi_1, efe_1 = my_agent1.infer_policies()
        q_pi_2, efe_2 = my_agent2.infer_policies()

        action_1 = my_agent1.sample_action()
        action_2 = my_agent2.sample_action()

        obs = my_env.step(action_1, action_2)

        print(f'Production industry 1 at time {t}: {action_1}')
        print(f'Production industry 2 at time {t}: {action_2}') 
        print(f'Sells of industry 1 at time {t}: {obs[0]}')
        print(f'Sells of industry 2 at time {t}: {obs[1]}')
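
If the trajectory is needed for later analysis or plotting, the same loop can return it instead of printing. The sketch below is self-contained: `FixedAgent` and `EchoMarket` are hypothetical stand-ins that only mimic the method names used above, so that the loop can run without pymdp.

```python
class FixedAgent:
    """Hypothetical stand-in exposing the three methods the loop calls; always produces `quantity`."""

    def __init__(self, quantity):
        self.quantity = quantity

    def infer_states(self, obs):
        return None

    def infer_policies(self):
        return None, None

    def sample_action(self):
        return [float(self.quantity)]  # length-1 sequence, like the printed actions


class EchoMarket:
    """Hypothetical environment in which everything produced is sold."""

    def step(self, action1, action2):
        return [action1, action2]


def run_and_record(agent1, agent2, env, T=5, first_obs=(1, 5)):
    """Same structure as the loop above, but returns the trajectory as a list of dicts."""
    obs = list(first_obs)
    history = []
    for t in range(T):
        agent1.infer_states([obs[0]])
        agent2.infer_states([obs[1]])
        agent1.infer_policies()
        agent2.infer_policies()
        a1 = int(agent1.sample_action()[0])
        a2 = int(agent2.sample_action()[0])
        obs = env.step(a1, a2)
        history.append({"t": t, "production": (a1, a2), "sales": tuple(obs)})
    return history


history = run_and_record(FixedAgent(1), FixedAgent(5), EchoMarket(), T=3)
print(history[-1])
```

Swapping the stand-ins for the pymdp agents and the `CournotMarket` environment should work as long as they expose the same three methods and `step` accepts plain integers.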


## Main

In [20]:
import warnings
warnings.filterwarnings('ignore')

In [21]:
reputation_env = 0.5
total_demand_env = 6
env = CournotMarket(reputation = reputation_env, total_demand = total_demand_env, max_customer_price = 0, sensibility = 1)

T = 5

run_active_inference_loop(my_agent1, my_agent2, env, T = T)

Production industry 1 at time 0: [1.]
Production industry 2 at time 0: [5.]
Sells of industry 1 at time 0: 1
Sells of industry 2 at time 0: 5
Production industry 1 at time 1: [1.]
Production industry 2 at time 1: [5.]
Sells of industry 1 at time 1: 1
Sells of industry 2 at time 1: 5
Production industry 1 at time 2: [1.]
Production industry 2 at time 2: [5.]
Sells of industry 1 at time 2: 1
Sells of industry 2 at time 2: 5
Production industry 1 at time 3: [1.]
Production industry 2 at time 3: [5.]
Sells of industry 1 at time 3: 1
Sells of industry 2 at time 3: 5
Production industry 1 at time 4: [1.]
Production industry 2 at time 4: [5.]
Sells of industry 1 at time 4: 1
Sells of industry 2 at time 4: 5
