# Application Lesson 2: Application to the control of a tandem  multi server system

In [1]:
import marmote.core as mco
import marmote.markovchain as mch
import marmote.mdp as md
import numpy as np

## The model

We consider the tandem multi server queue model (below) (credit of the picture  [Tournaire 2021]) presented also in [Ohno and  Ichiki, 1987], or [Tournaire 2023].

<img src="./tandemQueue.png">

**Parameters**

Size of the systems: B1 and B2 with B1=B2=B

Number of servers: K1 and K2 with K1=K2=K

Statistics:  
arrival  Poisson with rate lambda  
service homogeneous with rate mu1 and mu2  

Costs:  
Instantaneous costs:  
activation cost: *Ca*, deactivation cost: *Cd*, reject cost: *Cr*.   
Rates costs (also called accumulated cost):    
cost per time unit of using a VM: *Cs*, cost per time unit of holding a request in the system *Ch*.

**Numerical values**

B=10  
K=5  
lam=5  
mu1=1  
mu2=1   
Ca=1  
Cd=1  
Cr=10  
Cs=2  
Ch=2

Build the model with a python dictionary

In [2]:
model=dict()
model['B1']=3 #10
model['B2']=3 #10
model['K1']=2 #5
model['K2']=2 #5
model['lam']=5.0
model['mu1']=1.0
model['mu2']=1.0
model['Ca']=1  
model['Cd']=1 
model['Cr']=10 
model['Cs']=2 
model['Ch']=2
model['beta']=1

print(model)

{'B1': 3, 'B2': 3, 'K1': 2, 'K2': 2, 'lam': 5.0, 'mu1': 1.0, 'mu2': 1.0, 'Ca': 1, 'Cd': 1, 'Cr': 10, 'Cs': 2, 'Ch': 2, 'beta': 1}


## Build a discrete time discounted MDP

### Build the states

The state is *(m1,k1,m2,k2)* with m from 0 to B-1 and  k from 0 to K-1. 
An action is *(k1,k2)* with k from 0 to K-1.

In [3]:
dims=np.array([model['B1'],model['K1'],model['B2'],model['K2']])
print(dims) 
states= mco.MarmoteBox(dims)
#
actions=mco.MarmoteBox([model['K1'],model['K2']])

print("Number of states",states.Cardinal())
print(states)
print("Number of actions",actions.Cardinal())
print("actions",actions)

[3 2 3 2]
Number of states 36
Box( [ 0..2 ] x [ 0..1 ] x [ 0..2 ] x [ 0..1 ] )
Number of actions 4
actions Box( [ 0..1 ] x [ 0..1 ] )


### Build matrices

#### Transitions matrices

We begin by defining a function which computes the transition matrix associated with an action such that the action index is: index_action.

In a state, there is three types of event: arrival in the system, departure of the system, departure of the system 1 and arrival in system 2.

In [4]:
def fill_in_matrix(index_action,modele,ssp,asp):
    # retrieve the action asscoiated with index
    action_buf = asp.DecodeState(index_action)
    #*#print("index action",index_action,"action",action_buf)
    #define the states
    etat=np.array([0,0,0,0])
    afteraction=np.array([0,0,0,0])
    jump=np.array([0,0,0,0])
    # define transition matrix
    P=mco.SparseMatrix(ssp.Cardinal()) 
    #browsing state space
    ssp.FirstState(etat)
    for k in range(ssp.Cardinal()):
        # compute the index of the state
        indexL=ssp.Index(etat)
        # compute the state after the action
        afteraction[0]=etat[0]
        afteraction[1]=action_buf[0]
        afteraction[2]=etat[2]
        afteraction[3]=action_buf[1]
        #*# print("####index State=",k,"State",etat,"State after action",afteraction)
        # then detail all the possible transitions
        ## Arrival (increases the number of customer in first coordinate with rate lambda)
        if (afteraction[0]<modele['B1']-1) :
            jump[0]=afteraction[0]+1
            jump[1]=afteraction[1]
            jump[2]=afteraction[2]
            jump[3]=afteraction[3]
        else: 
            jump[0]=afteraction[0]
            jump[1]=afteraction[1]
            jump[2]=afteraction[2]
            jump[3]=afteraction[3]
        #compute the index of the jump
        indexC=ssp.Index(jump)
        #fill in the entry
        #*# print("*Event: Arrival. Index=",indexC,"Jump State=",jump,"rate=",modele['lam'])
        P.setEntry(indexL,indexC,modele['lam'])
        #
        ## departure of the first system entry in the second one
        if (afteraction[2]<modele['B2']-1) :
            jump[0]=max(0,afteraction[0]-1)
            jump[1]=afteraction[1]
            jump[2]=afteraction[2]+1
            jump[3]=afteraction[3]
        else: 
            jump[0]=max(0,afteraction[0]-1)
            jump[1]=afteraction[1]
            jump[2]=afteraction[2]
            jump[3]=afteraction[3]
        #index of the jump
        indexC=ssp.Index(jump)
        # rate of the transition
        rate=min(afteraction[1],afteraction[0])*modele['mu1']
        #fill in the entry
        #*# print("*Event: Departure s1 entry s2. Index=",indexC,"Jump State=",jump,"rate=",rate)
        P.setEntry(indexL,indexC,rate)
        #
        ##departure of the second  system
        jump[0]=afteraction[0]
        jump[1]=afteraction[1]
        jump[2]=max(0,afteraction[2]-1)
        jump[3]=afteraction[3]
        #compute the index of the jump
        indexC=ssp.Index(jump)
        # compute the rate
        rate=min(afteraction[2],afteraction[3])*modele['mu2']
        #fill in the entry
        #*# print("*Event: Departure s2. Index=",indexC,"Jump State=",jump,"rate=",rate)
        P.setEntry(indexL,indexC,rate)
        #change state
        ssp.NextState(etat)
    return P


#### Cost Matrix

We define now a function to fill in the cost matrix. 

Instantaneous Costs are:   
Costs of activations = *max(action1-k1,0) \* Ca + max(action2-k2,0) \* Ca*  
Costs of deactivations = *max(K1-action1,0) \* Cd + max(K2-action2,0) \* Cd*  
rejection cost= *Cr \* lambda/Lambda(s,a)* in states where *m1=B* added by  *Cr \* action1 mu/Lambda(s,a)* in states where *m2=B*.  
*Lambda(s,a)* is the total rate. It is equal to *lambda + action1 \* mu + action2 \* mu* .

Accumulated Costs are:  
(number of customers in the system)Ch = *(m1+m2)\*Ch*   
(number of activated VM) = *(action1+action2)\*Cs*  

In [5]:
def fill_in_cost(modele,ssp,asp):
    R= mco.FullMatrix(ssp.Cardinal(),asp.Cardinal())
    #define the states
    etat=np.array([0,0,0,0])
    #define the actions
    acb=asp.StateBuffer()
    ssp.FirstState(etat)
    for k in range(ssp.Cardinal()):
        # compute the index of the state
        indexL=ssp.Index(etat)
        #*#print("##State",etat)
        asp.FirstState(acb)
        for j in range(asp.Cardinal()):
            #*#print("---Action",acb,end='  ')
            action1=acb[0]
            action2=acb[1]
            totalrate=modele['lam']+action1*modele['mu1']+ action2*modele['mu2']
            activationcosts=modele['Ca']*(max(0,action1-etat[1])+max(0,action2-etat[3])) 
            deactivationcosts=modele['Cd']*(max(0,etat[1]-action1)+max(0,action2-etat[3]))
            rejectioncosts=0.0
            if ((modele['B1']-1)==etat[0]):
                rejectioncosts+=(modele['lam']*modele['Cr']) / totalrate
            if ((modele['B2']-1)==etat[2]):
                rejectioncosts+=( min(etat[0],action1)*modele['mu1']*modele['Cr']) / totalrate
            instantaneouscosts=activationcosts+deactivationcosts+rejectioncosts
            accumulatedcosts=(etat[0]+etat[2])*modele['Ch'] + (action1 +action2)*modele['Cs']
            accumulatedcosts/=(totalrate+model['beta'])
            #*#print("Instantaneous=",instantaneouscosts," Rejection=",rejectioncosts,end= ' ')
            #*#print("Accumulatedcosts=",accumulatedcosts)
            R.setEntry(indexL,j,accumulatedcosts+instantaneouscosts)
            asp.NextState(acb)
        ssp.NextState(etat)
    return R;

### Build the continuous time MDP

Build all the transition matrices

In [6]:
trans=list()

action_buf = actions.StateBuffer()
actions.FirstState(action_buf)
for k in range(actions.Cardinal()):
    trans.append(fill_in_matrix(k,model,states,actions))
    print("---Matrix kth=",k, "filled in")

---Matrix kth= 0 filled in
---Matrix kth= 1 filled in
---Matrix kth= 2 filled in
---Matrix kth= 3 filled in


Fill in the costs

In [7]:
print("Matrice of Costs")
Costs=fill_in_cost(model,states,actions)

Matrice of Costs


Build the MDP

In [8]:
print("Begining of Building MDP")
ctmdp=md.ContinuousTimeDiscountedMDP("min",states,actions,trans,Costs,model['beta'])
print(ctmdp)

Begining of Building MDP
#############################################
Model: Infinite Horizon Discounted MDP
MDP Criteria : infinite horizon discounted
Discount factor:1
#############################################
MDP type (discrete,continuous): continuous
MDP rule (min,max): min
State space size: 36
Action space size: 4
State  dimension: 4
Action dimension: 2
#############################################
Transition matrix per action:
action: 0
         0         12 5.000000e+00
         1         12 5.000000e+00
         2         14 5.000000e+00
         3         14 5.000000e+00
         4         16 5.000000e+00
         5         16 5.000000e+00
         6         12 5.000000e+00
         7         12 5.000000e+00
         8         14 5.000000e+00
         9         14 5.000000e+00
        10         16 5.000000e+00
        11         16 5.000000e+00
        12         24 5.000000e+00
        13         24 5.000000e+00
        14         26 5.000000e+00
        15         26 5

Uniformization of the MDP. After uniformization the MDP is a discrete time discounted MDP.

In [9]:
ctmdp.UniformizeMDP()
print("Rate of Uniformization",ctmdp.getMaximumRate())
#*# print(ctmdp)

Rate of Uniformization 7.0


### Solve the MDP

In [10]:
optimum=ctmdp.ValueIteration(0.01,75)
print(optimum)

#############################################
Solution of MDP problem
Size of the state space: 36
#############################################
Solution model: Feedback Stationary Policy
- column 1: index of the state
- column 2: Value function 
- column 3: Optimal action 

  0         0.0547792   0
  1         0.0547792   0
  2          0.241898   0
  3          0.241898   0
  4          0.414941   0
  5          0.414941   0
  6           0.15414   2
  7           0.15414   2
  8          0.352924   2
  9          0.352924   2
 10          0.555138   2
 11          0.555138   2
 12          0.600903   0
 13          0.600903   0
 14           1.01751   0
 15           1.01751   0
 16           1.28234   0
 17           1.26689   1
 18          0.660929   2
 19          0.660929   2
 20            1.0673   2
 21            1.0673   2
 22           1.87782   0
 23           1.85155   1
 24           4.89178   2
 25           4.38653   3
 26           6.71516   2
 27           5.97864  

## Structural Analysis

The structural analysis is mainly related to the policy handling. In what follow we :

1. Check property of the MDP by building a Markov Chain Associated with a policy
2. Check property of the value function 

### Check property using Markov Chain analysis

#### Check if the MDP is "multichain"

Actually the multichain property is useless for discounted criteria and is solely valid for average multichain criteria. This is presented here for an example purpose. 

To assess the property, we build a special policy

In [11]:
policy=md.FeedbackSolutionMDP(states.Cardinal())

Now we fill-in policy. The policy is defined as foolows: in any states such that the number of customer is less than 2 the server si activated and deactivated otherwise. 

In [12]:
etat=states.StateBuffer()
states.FirstState(etat)
for k in range(states.Cardinal()):
    if(etat[0]==(model['B1']-1) or etat[2]==(model['B2']-1) ):
        policy.setActionIndex(k,0)
    else :
        policy.setActionIndex(k,1)
    states.NextState(etat)
print(policy)

#############################################
Solution of MDP problem
Size of the state space: 36
#############################################
Solution model: Feedback Stationary Policy
- column 1: index of the state
- column 2: Value function 
- column 3: Optimal action 

  0                 0   1
  1                 0   1
  2                 0   1
  3                 0   1
  4                 0   0
  5                 0   0
  6                 0   1
  7                 0   1
  8                 0   1
  9                 0   1
 10                 0   0
 11                 0   0
 12                 0   1
 13                 0   1
 14                 0   1
 15                 0   1
 16                 0   0
 17                 0   0
 18                 0   1
 19                 0   1
 20                 0   1
 21                 0   1
 22                 0   0
 23                 0   0
 24                 0   0
 25                 0   0
 26                 0   0
 27                 0  

**Build a Markov Chain from a policy**

In [13]:
Mat=ctmdp.GetChain(optimum)
Mat.set_type(mco.DISCRETE)
#*# print(Mat)

initial = mco.UniformDiscreteDistribution(0,states.Cardinal()-1)

Making the chain

In [14]:
chaine = mch.MarkovChain( Mat )
chaine.set_init_distribution(initial)
chaine.set_model_name( "Chain issued from the MDP")

**Analysis of the transition matrix**

In [15]:
Mat.FullDiagnose()

# Generator general diagnostic:
Diagnostic for SparseMatrix structure:
- generator type:        discrete
- number of origin states:      36
- number of destination states: 36
- number of transitions: 83
- number of empty rows:  0
- maximum outdegree:     3
- minimum outdegree:     1
- maximum indegree:      5
- minimum indegree:      0
- maximum value:                    1
- minimum value:             0.142857
- maximum row sum:                  1
- minimum row sum:                  1
- row sum mismatch:                 0
# Communicating classes:
number = 33
list = ( [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 6 ] [ 7 ] [ 8 ] [ 9 ] [ 10 ] [ 11 ] [ 12 ] [ 13 ] [ 14 ] [ 15 ] [ 16 ] [ 17 ] [ 18 ] [ 19 ] [ 20 ] [ 21 ] [ 22 ] [ 23 27 29 33 ] [ 24 ] [ 25 ] [ 26 ] [ 28 ] [ 30 ] [ 31 ] [ 32 ] [ 34 ] [ 35 ] )
# connectivity:
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 1.00000000 0.00000000 0.00000000 0.00000000 0.

00 1.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 1.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 1.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00

**Evaluate the policy**

Now we can evaluate the policy by the way of the `PolicyCost` method

In [16]:
ctmdp.PolicyCost(policy,0.01,75)
print(policy)

#############################################
Solution of MDP problem
Size of the state space: 36
#############################################
Solution model: Feedback Stationary Policy
- column 1: index of the state
- column 2: Value function 
- column 3: Optimal action 

  0          0.817848   1
  1          0.161599   1
  2           1.23627   1
  3           0.35928   1
  4          0.414941   0
  5          0.414941   0
  6           1.14597   1
  7          0.489724   1
  8           1.67476   1
  9          0.797774   1
 10          0.861555   0
 11          0.861555   0
 12           1.63417   1
 13          0.740944   1
 14           2.33983   1
 15           1.17051   1
 16           1.28234   0
 17           1.28234   0
 18           2.08079   1
 19           1.18756   1
 20           2.92449   1
 21           1.75517   1
 22           1.87782   0
 23           1.87782   0
 24           5.24872   0
 25           5.24981   0
 26           7.21699   0
 27           7.21849  

### Check if the value function has structural property (convex,monotone)

This is done by building a specific object `PropertiesValue`.

In [17]:
checkValue =  md.PropertiesValue(states)
checkValue.avoidDetail()
monotonicity=checkValue.Monotonicity(optimum)
print("Printing monotonicity property of value function (1 if increasing -1 if decreasing 0 otherwise) : "\
      + str(monotonicity) )

print("Checking convexity")
convexity=checkValue.Convexity(optimum)
print("Printing convexity property of value function (1 if convex -1 concave 0 otherwise) : " + \
      str(convexity))

Printing monotonicity property of value function (1 if increasing -1 if decreasing 0 otherwise) : 0
Checking convexity
Printing convexity property of value function (1 if convex -1 concave 0 otherwise) : 0


The analysis can be made dimension by dimension. Now we check the monotonicty of the first dimension by letting vary the entries with index 0 and keeping the other dimensions fixed.

In [18]:
monotonicity=checkValue.MonotonicityByDim(optimum,0)
print("Following dimension 0 monotonicity is",str(monotonicity))

Following dimension 0 monotonicity is 1


### Check if the optimal policy has structural property

The structural analysis of a policy property is carried out using a `PropertiesValue` object.

In [19]:
print("Checking Structural Properties of value")
checkPolicy =  md.PropertiesPolicy(states)

monotonicity=checkPolicy.Monotonicity(optimum)
print("PropertiesPolicy::MonotonicityOptimalPolicy="+str(monotonicity))

Checking Structural Properties of value
PropertiesPolicy::MonotonicityOptimalPolicy=0


End