This project aims at developing different VM schedulers for a given IaaS cloud. Each scheduler will have meaningful properties for either the cloud customers or the cloud provider.
The implementation and the evaluation will be made over the IaaS cloud simulator CloudSim. The simulator will replay a workload extracted from Emulab, on a datacenter having realistic characteristics.
- CloudSim FAQ
- CloudSim API
- CloudSim source code
- CloudSim mailing-list
You must have a working Java 7 + Maven environment to develop and Git to manage the sources. No IDE is required, but feel free to use one.
- Clone this repository. The project directory is organized as follows:
$ tree
|- src # the source code
|- repository # external dependencies
|- planetlab # the workload to process
|- cloudsim-3.0.3-src.tar.gz # simulator sources
\- pom.xml # maven project descriptor
- Check everything is working by typing `mvn install` in the root directory.
- Integrate the project with your IDE if needed.
`fr.unice.vicc.Main` is the entry point. It can be launched from your IDE or using the command `mvn compile exec:java`.
Usage: Main scheduler [day]
- `scheduler` is the identifier of the scheduler to test, prefixed by `--`.
- `day` is optional. It is one of the workload days (see the folders in `planetlab`). When `all` is indicated, all the days are replayed sequentially.
By default, the output is written in a log file in the `logs` folder.
If you execute the program through `mvn exec:java`, then the arguments are provided using the `sched` and the `day` properties.
- To execute the simulator using the `naive` scheduler and all the days: `mvn compile exec:java -Dsched=naive -Dday=all`
- To replay only day `20110303`: `mvn compile exec:java -Dsched=naive -Dday=20110303`
For this project, you have to develop various VM schedulers.
To integrate your schedulers within the codebase, you will have to declare your schedulers inside the class `VmAllocationPolicyFactory`.
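As a rough illustration of what declaring a scheduler can look like, the sketch below maps a CLI flag to a scheduler instance. The `Policy` interface, the two policy classes and the `create` method are hypothetical stand-ins: the real `VmAllocationPolicyFactory` in `src` may be organized differently, so treat this as a pattern only.

```java
// Hypothetical stand-in for the factory pattern; not the project's actual code.
class FactorySketch {

    /** Stand-in for a scheduler type (the project uses VmAllocationPolicy). */
    interface Policy {
        String name();
    }

    static class NaivePolicy implements Policy {
        public String name() { return "naive"; }
    }

    static class AntiAffinityPolicy implements Policy {
        public String name() { return "antiAffinity"; }
    }

    /** Returns the scheduler registered under the given flag, or fails loudly. */
    static Policy create(String flag) {
        if ("naive".equals(flag)) {
            return new NaivePolicy();
        } else if ("antiAffinity".equals(flag)) {
            return new AntiAffinityPolicy();
        }
        throw new IllegalArgumentException("No such scheduler: " + flag);
    }

    public static void main(String[] args) {
        System.out.println(create("naive").name());
    }
}
```

Failing on an unknown flag, rather than silently falling back to a default scheduler, makes a typo on the command line visible immediately.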
For each implemented scheduler, provide inside the class header:
- the role
- the overall design and technical choices
- the worst-case time complexity
This first scheduler aims only at discovering the CloudSim API. It simply places each `Vm` on the first `Host` having enough free resources (CPU and memory).
- Create the new class handling the scheduling and integrate it into `VmAllocationPolicyFactory`. Your class must extend `VmAllocationPolicy`. The flag to call this scheduler from the command line interface (CLI) will be "naive". Test that the integration is correct. The code will crash in your class, but that is expected at this stage.
- Implement the easy part first, that is, indicating where a Vm runs. This is done by the `getHost(Vm)` and the `getHost(int, int)` methods.
- The two `allocateHostForVm` methods are the core of the Vm scheduler. One of the two will be executed directly by the simulator each time a Vm is submitted. In these methods, you are in charge of computing the most appropriate host for each Vm. Implementing `allocateHostForVm(Vm, Host)` is straightforward as the host is forced. To allocate the Vm on a host, look at the method `Host.vmCreate(Vm)`. It allocates and returns true iff the host has sufficient free resources. The method `getHostList` from `VmAllocationPolicy` returns the datacenter nodes. Keep track, in whatever way you want, of the host used to run each Vm.
- Implement `deallocateHostForVm`, the method that removes a running `Vm` from its hosting node. Find the host that is running your Vm and use `Host.vmDestroy(Vm)` to kill it.
- The scheduler is static: `optimizeAllocation` must return `null`.
- Now, implement `allocateHostForVm(Vm)`, which is the main method of this class. As we said, the scheduler is very simple: it just schedules the `Vm` on the first appropriate `Host`.
- Test your simulator on a single day. If the simulation terminates successfully, all the VMs have been scheduled, all the cloudlets ran, and the provider revenue is displayed.
- Test that the simulator runs successfully on all the days. For future comparisons, save the daily revenues and the global one. At this stage, it is OK to have penalties due to SLA violations.
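The steps above can be sketched outside CloudSim to make the first-fit logic and the bookkeeping concrete. This is a minimal sketch under stated assumptions: `Host` and `Vm` below are simplified stand-ins, not the CloudSim classes, and their `vmCreate`/`vmDestroy` only mimic the behaviour described above. The class header also shows one possible way to write the role, design and complexity notes requested for each scheduler.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Role: place each VM on the first host with enough free MIPS and RAM.
 * Design: linear scan of the host list; a map remembers where each VM runs.
 * Worst-case time complexity: O(h) per VM, h being the number of hosts.
 *
 * Host and Vm are simplified stand-ins, NOT the CloudSim classes.
 */
class NaiveSketch {

    static class Vm {
        final int id, mips, ram;
        Vm(int id, int mips, int ram) { this.id = id; this.mips = mips; this.ram = ram; }
    }

    static class Host {
        final int id;
        int freeMips, freeRam;
        Host(int id, int freeMips, int freeRam) {
            this.id = id; this.freeMips = freeMips; this.freeRam = freeRam;
        }

        /** Allocates the VM and returns true iff the host has enough free resources. */
        boolean vmCreate(Vm vm) {
            if (vm.mips > freeMips || vm.ram > freeRam) {
                return false;
            }
            freeMips -= vm.mips;
            freeRam -= vm.ram;
            return true;
        }

        /** Gives the resources of the VM back to the host. */
        void vmDestroy(Vm vm) {
            freeMips += vm.mips;
            freeRam += vm.ram;
        }
    }

    private final List<Host> hosts;
    /** VM id -> hosting node. */
    private final Map<Integer, Host> hosting = new HashMap<Integer, Host>();

    NaiveSketch(List<Host> hosts) { this.hosts = hosts; }

    /** First fit: the first host that accepts the VM wins. */
    boolean allocateHostForVm(Vm vm) {
        for (Host h : hosts) {
            if (h.vmCreate(vm)) {
                hosting.put(vm.id, h);
                return true;
            }
        }
        return false;
    }

    void deallocateHostForVm(Vm vm) {
        Host h = hosting.remove(vm.id);
        if (h != null) {
            h.vmDestroy(vm);
        }
    }

    Host getHost(Vm vm) { return hosting.get(vm.id); }

    public static void main(String[] args) {
        List<Host> hosts = new ArrayList<Host>();
        hosts.add(new Host(0, 1000, 1024));
        hosts.add(new Host(1, 2000, 4096));
        NaiveSketch scheduler = new NaiveSketch(hosts);
        Vm vm = new Vm(0, 800, 512);
        scheduler.allocateHostForVm(vm);
        System.out.println("VM 0 runs on host " + scheduler.getHost(vm).id);
    }
}
```

In the real class, the map would typically be keyed by whatever identifies a Vm uniquely in the simulator; the plain id is used here only to keep the sketch short.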
Let us assume the VMs run replicated applications. To make them tolerant to node failures, the customer expects the replicas to run on distinct hosts.
- Implement a new scheduler (`antiAffinity` flag) that places the Vms with regard to their affinity. In practice, all Vms with an id in [0-99] must be on distinct nodes, and the same holds for Vms with an id in [100-199], [200-299], and so on.
- What is the impact of such an algorithm on the cluster hosting capacity? Why?
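The grouping rule can be sketched as follows. This is a minimal sketch, not the expected solution: hosts are reduced to an array of free MIPS (a simplification, RAM is ignored), and the class and method names are hypothetical. Only the group computation, `id / 100`, comes directly from the statement above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model: a host is just an index and an amount of free MIPS.
class AntiAffinitySketch {

    private final int[] freeMips;
    /** Group number -> indexes of the hosts already running a VM of that group. */
    private final Map<Integer, Set<Integer>> hostsOfGroup =
            new HashMap<Integer, Set<Integer>>();

    AntiAffinitySketch(int[] freeMips) { this.freeMips = freeMips.clone(); }

    /** VMs 0-99 form group 0, VMs 100-199 group 1, and so on. */
    static int groupOf(int vmId) { return vmId / 100; }

    /** Returns the index of the chosen host, or -1 if no host is acceptable. */
    int place(int vmId, int mips) {
        int group = groupOf(vmId);
        Set<Integer> used = hostsOfGroup.get(group);
        if (used == null) {
            used = new HashSet<Integer>();
            hostsOfGroup.put(group, used);
        }
        for (int h = 0; h < freeMips.length; h++) {
            // The host must have room AND must not already run a VM of this group.
            if (freeMips[h] >= mips && !used.contains(h)) {
                freeMips[h] -= mips;
                used.add(h);
                return h;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        AntiAffinitySketch s = new AntiAffinitySketch(new int[] {1000, 1000});
        System.out.println(s.place(0, 100));
        System.out.println(s.place(1, 100));
    }
}
```

A complete scheduler would also remove the host from the group's set when the Vm is deallocated; that part is left out here.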
The previous scheduler ensures fault tolerance to some node failures. Switches can also fail, and in that case a lot of nodes become unavailable at once. Let us consider a hierarchical network. The Ml110G4 nodes are connected to one switch, the Ml110G5 nodes to another. Both switches are then interconnected.
- Write a scheduler (flag `dr`) that ensures fault tolerance to a single switch failure. Balance the replicas as evenly as possible to minimize the loss in case of failure.
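One possible balancing heuristic is sketched below: for each replica group, count the replicas behind each switch and send the next one to the less loaded side. The sketch rests on several assumptions that are not in the statement: replica groups are the same id ranges as for the anti-affinity scheduler, the `hostSwitch` array (0 or 1 per host) is a hypothetical way to encode which switch a host is attached to, and only MIPS are checked. The fallback to the other switch keeps VMs running when one side is full, at the price of a weaker balance.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model: hostSwitch[h] tells which of the two switches host h uses.
class SwitchBalanceSketch {

    private final int[] hostSwitch;
    private final int[] freeMips;
    /** Group number -> number of replicas behind switch 0 and switch 1. */
    private final Map<Integer, int[]> replicasPerSwitch = new HashMap<Integer, int[]>();

    SwitchBalanceSketch(int[] hostSwitch, int[] freeMips) {
        this.hostSwitch = hostSwitch.clone();
        this.freeMips = freeMips.clone();
    }

    /** Returns the index of the chosen host, or -1 if the VM fits nowhere. */
    int place(int vmId, int mips) {
        int group = vmId / 100; // assumed: same groups as the anti-affinity scheduler
        int[] count = replicasPerSwitch.get(group);
        if (count == null) {
            count = new int[2];
            replicasPerSwitch.put(group, count);
        }
        // Prefer the switch that currently hosts fewer replicas of the group.
        int preferred = count[0] <= count[1] ? 0 : 1;
        int host = firstFit(preferred, mips);
        if (host < 0) {
            host = firstFit(1 - preferred, mips); // fallback: other switch
        }
        if (host < 0) {
            return -1;
        }
        freeMips[host] -= mips;
        count[hostSwitch[host]]++;
        return host;
    }

    /** First host behind the given switch with enough free MIPS, or -1. */
    private int firstFit(int sw, int mips) {
        for (int h = 0; h < freeMips.length; h++) {
            if (hostSwitch[h] == sw && freeMips[h] >= mips) {
                return h;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        SwitchBalanceSketch s = new SwitchBalanceSketch(
                new int[] {0, 0, 1, 1}, new int[] {1000, 1000, 1000, 1000});
        for (int vm = 0; vm < 4; vm++) {
            System.out.println("VM " + vm + " -> host " + s.place(vm, 100));
        }
    }
}
```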
When a VM is not replicated (/e.g./ a remote desktop scenario), fault tolerance is obtained by ensuring that if the hosting node crashes, it is possible to restart the VM immediately on another suitable node. For example, this figure depicts a viable mapping: if N1 fails, VM1 can be restarted on N3; if N2 fails, VM2 can be restarted on N3 and VM3 on N1. Finally, if N3 fails, VM4 can be restarted on N1. This figure is not fully resilient: if N2 crashes, it is not possible to restart VM2 elsewhere.
- Implement a new scheduler (`ft` flag) that ensures fault tolerance to 1 node failure for all the VMs having an id that is a multiple of 10.
- How can we report the infrastructure load in that particular context?
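Whether a mapping tolerates one node failure can be checked with a small simulation: for each host in turn, pretend it crashes and try to restart its protected VMs on the spare capacity of the others. The sketch below is an assumption-laden illustration, not the expected scheduler: capacities are reduced to MIPS, the data layout (`free`, `demands`) is hypothetical, and the greedy first-fit re-placement is only a sufficient test (it may reject a mapping that a smarter packing would accept).

```java
import java.util.Arrays;

class FailoverSketch {

    /** Per the statement: only VMs whose id is a multiple of 10 are protected. */
    static boolean isProtected(int vmId) { return vmId % 10 == 0; }

    /**
     * free[h]    : spare MIPS on host h.
     * demands[h] : MIPS needed by each protected VM currently running on host h.
     * Returns true if every protected VM of the failed host finds room elsewhere
     * (greedy first fit).
     */
    static boolean canRestart(int failed, int[] free, int[][] demands) {
        int[] spare = Arrays.copyOf(free, free.length); // never alter the real state
        for (int d : demands[failed]) {
            boolean placed = false;
            for (int h = 0; h < spare.length && !placed; h++) {
                if (h != failed && spare[h] >= d) {
                    spare[h] -= d;
                    placed = true;
                }
            }
            if (!placed) {
                return false;
            }
        }
        return true;
    }

    /** A mapping is resilient if it survives the failure of any single host. */
    static boolean isResilient(int[] free, int[][] demands) {
        for (int h = 0; h < free.length; h++) {
            if (!canRestart(h, free, demands)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] demands = { {400}, {600}, {300} };
        System.out.println(isResilient(new int[] {500, 0, 1000}, demands));
        System.out.println(isResilient(new int[] {500, 0, 500}, demands));
    }
}
```

A scheduler built on this idea could accept a placement only if `isResilient` still holds afterwards, which costs one such check per candidate host.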
- Develop a scheduler that performs load balancing using a next fit algorithm (flag `nextFit`). You should observe fewer penalties than with the naive scheduler.
- Develop another algorithm based on a /worst fit algorithm/ (`worstFit` flag) that balances with regard to both RAM and MIPS. Justify the method you chose to consider the two dimensions, and your evaluation metric. It is OK to work in a pragmatic manner (try different approaches, keep the best) as long as you prove your statements.
- Which algorithm performs best in terms of reducing SLA violations? Why?
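Both strategies can be sketched on a simplified host model (arrays of total and free MIPS and RAM, hypothetical names). Two choices below are assumptions rather than requirements: the next fit shown is the round-robin flavour, where the scan resumes just after the last host used (the textbook variant resumes at the last host itself), and the worst fit combines the two dimensions by scoring a host with the smaller of its free MIPS and free RAM ratios, which is only one option among several.

```java
// Simplified model: hosts are indexes into capacity arrays; totals must be > 0.
class BalancingSketch {

    private final int[] totalMips, totalRam, freeMips, freeRam;
    /** Next fit state: index where the next scan starts. */
    private int cursor = 0;

    BalancingSketch(int[] totalMips, int[] totalRam) {
        this.totalMips = totalMips.clone();
        this.totalRam = totalRam.clone();
        this.freeMips = totalMips.clone();
        this.freeRam = totalRam.clone();
    }

    private boolean fits(int h, int mips, int ram) {
        return freeMips[h] >= mips && freeRam[h] >= ram;
    }

    private void take(int h, int mips, int ram) {
        freeMips[h] -= mips;
        freeRam[h] -= ram;
    }

    /** Round-robin next fit: scan circularly, starting after the last host used. */
    int nextFit(int mips, int ram) {
        int n = freeMips.length;
        for (int i = 0; i < n; i++) {
            int h = (cursor + i) % n;
            if (fits(h, mips, ram)) {
                take(h, mips, ram);
                cursor = (h + 1) % n;
                return h;
            }
        }
        return -1;
    }

    /** Worst fit: pick the fitting host whose scarcest resource is the most available. */
    int worstFit(int mips, int ram) {
        int best = -1;
        double bestScore = -1;
        for (int h = 0; h < freeMips.length; h++) {
            if (!fits(h, mips, ram)) {
                continue;
            }
            double score = Math.min((double) freeMips[h] / totalMips[h],
                                    (double) freeRam[h] / totalRam[h]);
            if (score > bestScore) {
                bestScore = score;
                best = h;
            }
        }
        if (best >= 0) {
            take(best, mips, ram);
        }
        return best;
    }

    public static void main(String[] args) {
        BalancingSketch s = new BalancingSketch(new int[] {1000, 2000}, new int[] {1000, 2000});
        System.out.println(s.worstFit(500, 500));
        System.out.println(s.worstFit(500, 500));
    }
}
```

Normalizing by the host capacity matters when the datacenter mixes host models: without it, the larger hosts would always look emptier.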
For a practical understanding of what an SLA violation is in this project, look at the `Revenue` class. Basically, there is an SLA violation when the associated Vm requires more MIPS than it can get on its host.
If the SLA is not met, the provider must pay penalties to the client. Violations are therefore undesirable for a provider that wants to attract customers and maximize its revenues.
- Implement a scheduler that ensures there can be no SLA violation (`noViolations` flag). Remember the nature of the hypervisor in terms of CPU allocation and the VM templates. The scheduler is effective when you can successfully simulate all the days with the `Revenue` class reporting no refunds due to SLA violations.
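One way to read the hint is "never overcommit the CPU": reserve for each Vm the peak MIPS of its template and refuse any placement that would exceed what a processing element can deliver. The sketch below illustrates that idea only. It assumes single-core VMs and a per-core reservation model, both of which should be checked against the simulator and the `Revenue` class; the `freePeMips` layout and the names are hypothetical.

```java
// Simplified model: freePeMips[h][p] is the MIPS not yet reserved on core p of host h.
class NoViolationSketch {

    private final int[][] freePeMips;

    NoViolationSketch(int[][] peMips) {
        freePeMips = new int[peMips.length][];
        for (int h = 0; h < peMips.length; h++) {
            freePeMips[h] = peMips[h].clone();
        }
    }

    /**
     * Reserves the peak demand of a single-core VM on the first core able to
     * guarantee it. Returns the host index, or -1 if no core can.
     */
    int place(int peakMips) {
        for (int h = 0; h < freePeMips.length; h++) {
            for (int p = 0; p < freePeMips[h].length; p++) {
                if (freePeMips[h][p] >= peakMips) {
                    freePeMips[h][p] -= peakMips;
                    return h;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        NoViolationSketch s = new NoViolationSketch(new int[][] { {1000, 1000}, {2000} });
        System.out.println(s.place(800));
        System.out.println(s.place(800));
        System.out.println(s.place(800));
    }
}
```

Reserving for the peak trades hosting capacity for guarantees: a host may look underused most of the time, but a Vm can never ask for MIPS that were promised to another one.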
Develop a scheduler (`energy` flag) that reduces the overall energy consumption without relying on VM migration. The resulting simulation must consume less energy than all the previous schedulers.
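A common starting point for saving energy without migration is consolidation: fill the hosts that already run VMs before waking up an idle one. The sketch below shows such a heuristic on a simplified model. It assumes that an idle host costs little or nothing, which may not match the power model used by the simulator, and it ignores RAM and the differences in efficiency between host models; all names are hypothetical.

```java
// Simplified model: a host is an index with free MIPS and a count of running VMs.
class ConsolidationSketch {

    private final int[] freeMips;
    private final int[] vmCount;

    ConsolidationSketch(int[] freeMips) {
        this.freeMips = freeMips.clone();
        this.vmCount = new int[freeMips.length];
    }

    /** Returns the index of the chosen host, or -1 if the VM fits nowhere. */
    int place(int mips) {
        int best = -1;
        for (int h = 0; h < freeMips.length; h++) {
            if (freeMips[h] < mips) {
                continue;
            }
            if (best < 0 || better(h, best)) {
                best = h;
            }
        }
        if (best >= 0) {
            freeMips[best] -= mips;
            vmCount[best]++;
        }
        return best;
    }

    /** A host already in use beats an idle one; ties go to the tighter fit. */
    private boolean better(int a, int b) {
        boolean aUsed = vmCount[a] > 0;
        boolean bUsed = vmCount[b] > 0;
        if (aUsed != bUsed) {
            return aUsed;
        }
        return freeMips[a] < freeMips[b];
    }

    /** Number of hosts running at least one VM. */
    int activeHosts() {
        int n = 0;
        for (int c : vmCount) {
            if (c > 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        ConsolidationSketch s = new ConsolidationSketch(new int[] {1000, 1000, 500});
        for (int i = 0; i < 4; i++) {
            System.out.println("host " + s.place(400) + ", active: " + s.activeHosts());
        }
    }
}
```

Packing tightly is likely to increase SLA penalties, so the gain in energy has to be weighed against the revenue reported by the simulator.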
Develop a scheduler that maximizes revenues. It is therefore important to provide a good trade-off between energy savings and penalties for SLA violations. Justify your choices and the theoretical complexity of the algorithm.