Fault Injection for Hosts' CPUs and Recovery mechanism which dynamically removes failed PEs from VMs and starts VM's snapshots when a Host or VM completely fails #81

RaysaOliveira · 2017-05-03T16:17:34Z

Feature

Implements a mechanism to inject random failures into Hosts' PEs (CPUs).

The HostFaultInjection class enables injecting random failures into Host PEs. It uses a given Pseudo Random Number Generator (PRNG), following some statistical distribution, to generate the times of failure. PRNGs such as the new PoissonDistr can be used for this purpose. Internally, another PRNG is created to define how many Host PEs will fail when a fault is generated.
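The two-generator idea above can be sketched in plain Java, without CloudSim Plus. This is a minimal illustration, not the HostFaultInjection implementation: the class name, the `schedule` method, the exponential inter-arrival model (a common way to produce Poisson arrivals) and the uniform choice of failed PEs are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; not the CloudSim Plus HostFaultInjection class. */
final class FaultScheduleSketch {
    /** One injected fault: when it happens and how many PEs of the Host fail. */
    record Fault(double time, int failedPes) {}

    /** Generates the faults of one Host up to the given time horizon (assumed model). */
    static List<Fault> schedule(double meanFaultsPerTimeUnit, int hostPes, double horizon, long seed) {
        Random arrivals = new Random(seed);       // PRNG for the fault times
        Random pesPicker = new Random(seed + 1);  // separate PRNG for the number of failed PEs
        List<Fault> faults = new ArrayList<>();
        double clock = 0;
        while (true) {
            // Exponential inter-arrival time via inverse transform: -ln(1 - U) / rate
            clock += -Math.log(1 - arrivals.nextDouble()) / meanFaultsPerTimeUnit;
            if (clock > horizon) break;
            // Assumption: between 1 and hostPes PEs fail, chosen uniformly
            faults.add(new Fault(clock, 1 + pesPicker.nextInt(hostPes)));
        }
        return faults;
    }

    public static void main(String[] args) {
        for (Fault f : schedule(0.5, 8, 24, 42L)) {
            System.out.printf("t=%.2f: %d PE(s) fail%n", f.time(), f.failedPes());
        }
    }
}
```

Using a fixed seed makes the fault schedule reproducible between runs, which helps when comparing recovery strategies.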

The HostFaultInjection class works as a fault injector for the Hosts of a given Datacenter. The mechanism considers the following situations.

Removal of failed PEs from VMs

If the number of working PEs is lower than the total required by all VMs, failed PEs are removed from running VMs using a round-robin algorithm, one PE per VM at a time. If all PEs are removed from a VM, that VM is destroyed.
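A minimal sketch of this round-robin removal, in plain Java, is shown here. The class and method names are hypothetical, VMs are reduced to a name and a PE count, and it is an assumption that just enough PEs are removed for the VMs to fit into the working PEs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of round-robin PE removal; not CloudSim Plus code. */
final class RoundRobinPeRemovalSketch {
    /**
     * Removes PEs from VMs, one PE per VM per round, until the VMs together
     * need no more than workingPes. Assumes every VM starts with at least 1 PE.
     * @param vmPes VM name -> PEs currently allocated (modified in place)
     * @return names of the VMs destroyed because they lost all their PEs
     */
    static List<String> removeFailedPes(Map<String, Integer> vmPes, int workingPes) {
        int required = vmPes.values().stream().mapToInt(Integer::intValue).sum();
        int toRemove = Math.max(0, required - workingPes);
        List<String> destroyed = new ArrayList<>();
        while (toRemove > 0 && !vmPes.isEmpty()) {
            // Iterate over a copy of the keys so destroyed VMs can be removed safely.
            for (String vm : new ArrayList<>(vmPes.keySet())) {
                if (toRemove == 0) break;
                int remaining = vmPes.get(vm) - 1;
                toRemove--;
                if (remaining == 0) {
                    vmPes.remove(vm);      // a VM without PEs is destroyed
                    destroyed.add(vm);
                } else {
                    vmPes.put(vm, remaining);
                }
            }
        }
        return destroyed;
    }

    public static void main(String[] args) {
        Map<String, Integer> vmPes = new LinkedHashMap<>();
        vmPes.put("vm0", 2);
        vmPes.put("vm1", 1);
        vmPes.put("vm2", 3);
        // The Host had 6 PEs and 4 failed, so only 2 are still working.
        List<String> destroyed = removeFailedPes(vmPes, 2);
        System.out.println(vmPes + " destroyed=" + destroyed); // prints {vm2=2} destroyed=[vm1, vm0]
    }
}
```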

Management of affected VMs

Generated failures may or may not affect running VMs. If the number of working PEs remaining in a Host is at least the total number of PEs required by all VMs, the failure causes no side effect.
If there are N free PEs in the Host and the number of failed PEs is less than or equal to N, no VM will be affected.

If no VM is affected by the failure, the failed Host PEs are just set to Pe.Status.FAILED and become unavailable. If new VMs are later placed into that Host, those PEs will not be available to them.
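The check described in this section boils down to comparing failed PEs with free PEs. A small plain-Java sketch follows; the class and method names are hypothetical and it assumes failures hit free PEs before PEs used by VMs.

```java
/** Hypothetical sketch of the "are VMs affected?" check; not CloudSim Plus code. */
final class FailureImpactSketch {
    /** Returns how many PEs must be taken away from VMs; 0 means no VM is affected. */
    static int pesToReclaim(int hostPes, int pesUsedByVms, int failedPes) {
        int freePes = hostPes - pesUsedByVms;
        // Failed PEs are absorbed by free PEs first (assumption).
        return Math.max(0, failedPes - freePes);
    }

    public static void main(String[] args) {
        System.out.println(pesToReclaim(8, 5, 3)); // 3 free PEs absorb 3 failures: prints 0
        System.out.println(pesToReclaim(8, 5, 5)); // 2 PEs must come from VMs: prints 2
    }
}
```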

Start a VM snapshot (clone) when all VMs from the same broker fail

If all PEs of a Host fail, all its VMs are immediately destroyed. When all VMs from a given broker fail (regardless of which Hosts they were running on), a clone of the last failed VM is created. This cloning process copies the Cloudlets that were executing or waiting in the failed VM to the cloned VM. Cloning a VM simulates starting a snapshot of that VM, as in a real cloud infrastructure.

Increase completion time for cloudlets affected by removed VM PEs

Consider a VM with N PEs. If some of its PEs fail while Cloudlets are using them, the Cloudlets continue executing but should take longer to finish.
- Example: the VM has 2 PEs and a Cloudlet is using both of them. If one PE fails, the Cloudlet will take twice as long to finish using just the remaining PE.
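The arithmetic behind the example can be sketched as follows. This is a simplified model, not the CloudSim Plus scheduler: it assumes the Cloudlet's remaining work (in million instructions, MI) is spread evenly over identical PEs, and the names are hypothetical.

```java
/** Hypothetical sketch of the completion-time effect of losing PEs. */
final class CompletionTimeSketch {
    /** Seconds needed to finish the remaining work on the PEs that still work. */
    static double remainingTime(double remainingLengthMi, double mipsPerPe, int workingPes) {
        if (workingPes <= 0) {
            throw new IllegalArgumentException("at least one working PE is required");
        }
        return remainingLengthMi / (mipsPerPe * workingPes);
    }

    public static void main(String[] args) {
        double before = remainingTime(20_000, 1_000, 2); // both PEs working
        double after = remainingTime(20_000, 1_000, 1);  // one PE failed
        System.out.println(before + " s -> " + after + " s"); // prints 10.0 s -> 20.0 s
    }
}
```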

VM Migration when Host is overloaded because of failures

If the failure of PEs in a Host increases the percentage of CPU usage, which may cause Host overload, using a VmAllocationPolicyMigration should cause VMs to be migrated to another Host.

Implementation Details

Using the HostFaultInjection.addVmCloner() method, a VmCloner object may be set to define how to clone a given VM when all PEs it was using fail. Setting a VmCloner enables simulating the creation of a snapshot for that VM. This way, HostFaultInjection will use this object to create a new VM when all VMs from a specific broker fail, recovering from the failure.

Since each broker represents a customer, you can simulate the execution of multiple VMs representing the same service, such as a Web Server. These multiple VMs may be used to simulate load balancing and fault tolerance of a hosted service. If you have, for instance, 3 VMs simulating the replication of the same service, this scenario has a 2-fault-tolerance level. That means your service will keep running if at most 2 failures happen.

In this scenario, using the VmCloner you get a 3-fault-tolerance level. That is, if all these 3 VMs are destroyed, then a snapshot of the last destroyed VM will be created. The snapshot will take some time to be started, which is randomly chosen internally, simulating the time to get the new VM up and running. Meanwhile, the service will experience some downtime.
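The fault-tolerance reasoning above can be illustrated with a small counter-based sketch. It is not the VmCloner API: the class, the `maxClones` parameter and the `vmFailed` method are hypothetical, and the clone's startup delay (the downtime mentioned above) is not modeled.

```java
/** Hypothetical sketch of broker-level recovery by cloning the last failed VM. */
final class BrokerRecoverySketch {
    private int runningVms;
    private int clonesLeft;

    BrokerRecoverySketch(int replicas, int maxClones) {
        this.runningVms = replicas;
        this.clonesLeft = maxClones;
    }

    /** Registers the failure of one VM; returns true if the service has a VM afterwards. */
    boolean vmFailed() {
        if (runningVms == 0) return false; // nothing left to fail
        runningVms--;
        if (runningVms == 0 && clonesLeft > 0) {
            clonesLeft--;
            runningVms = 1; // clone (snapshot) of the last failed VM
        }
        return runningVms > 0;
    }

    public static void main(String[] args) {
        BrokerRecoverySketch broker = new BrokerRecoverySketch(3, 1);
        for (int failure = 1; failure <= 4; failure++) {
            System.out.println("failure " + failure + ": service up = " + broker.vmFailed());
        }
    }
}
```

With 3 replicas and one allowed clone, the service survives the third failure and only goes down on the fourth.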

See #105 for more details.

Available Examples

HostFaultInjectionExample1.java

- Fixed some test failures.
- Added Vm Cloner and Cloudlets Cloner Functions to HostFaultInjection to allow creating a clone of a given VM and re-creating all its Cloudlets inside the clone, simulating the initialization of a VM snapshot in a different Host when the previous one fails.
- Removed the duplicated attribute schedulingInterval from PowerHost. This attribute is obtained from the Datacenter.
- Added a new PowerVm constructor that doesn't require an ID.
- Added a new submitCloudlets method to the DatacenterBroker that accepts a list of cloudlets and a VM to which those cloudlets will be bound.
- Added a getCloudletList method to the CloudletScheduler to get the list of all cloudlets that are executing or waiting inside a given VM.

- Moved the class PoissonProcess to the distribution package.
- Moved HostFaultInjection to org.cloudsimplus.faultinjection package.
- Created new basic example inside the examples module.

Created a VmCloner class to store the Vm and Cloudlets Cloner Functions. It also defines the maximum number of VM clones to be created using a VmCloner object. Now, the HostFaultInjection class accepts a VmCloner object, instead of setting Vm Cloner and Cloudlets Cloner individually.

- Changed the HostFaultInjection class documentation to indicate that the values generated by faultArrivalTimesGenerator are considered to be in hours (not minutes anymore).
- Included a FaultToleranceLevel inside the JSON SLA Contracts (see SlaContract class). Now, the number of VMs to create for each broker is based on this k-fault-tolerance level. The AWS EC2 Template used to create these k VMs is chosen based on the maximum price the customer is willing to pay hourly for all VMs, so the price of each VM template cannot be higher than maxPrice/k. If no template meets this limit, the cheapest VM is selected and k is recomputed to avoid violating the contract price. If even the cheapest VM is more expensive than the contract price, only one instance of it is created, violating the contract price but avoiding stopping the customer's services.
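The maxPrice/k rule can be sketched in plain Java as below. The class, record and method names are hypothetical, the template names and prices are invented for illustration, and picking the most expensive template that fits the per-VM budget is an assumption (the text does not say which affordable template is preferred).

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of choosing a VM template under a k-fault-tolerance SLA. */
final class TemplateSelectionSketch {
    record Template(String name, double pricePerHour) {}
    record Choice(Template template, int vms) {}

    static Choice choose(List<Template> templates, double maxPricePerHour, int k) {
        double budgetPerVm = maxPricePerHour / k;
        // Assumption: among templates that fit the per-VM budget, take the most expensive.
        Optional<Template> affordable = templates.stream()
            .filter(t -> t.pricePerHour() <= budgetPerVm)
            .max(Comparator.comparingDouble(Template::pricePerHour));
        if (affordable.isPresent()) {
            return new Choice(affordable.get(), k);
        }
        // No template fits: fall back to the cheapest one and recompute k.
        Template cheapest = templates.stream()
            .min(Comparator.comparingDouble(Template::pricePerHour))
            .orElseThrow(() -> new IllegalArgumentException("no templates given"));
        int recomputedK = (int) Math.floor(maxPricePerHour / cheapest.pricePerHour());
        // Always create at least one VM, even if that violates the contract price.
        return new Choice(cheapest, Math.max(1, recomputedK));
    }

    public static void main(String[] args) {
        List<Template> templates = List.of(
            new Template("small", 0.10), new Template("medium", 0.25), new Template("large", 0.50));
        System.out.println(choose(templates, 0.60, 3)); // fits: 3 x small
        System.out.println(choose(templates, 0.25, 3)); // k recomputed to 2
        System.out.println(choose(templates, 0.05, 3)); // single VM above the contract price
    }
}
```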


RaysaOliveira changed the title ~~Provide a Fault Injection mechanism for Hosts' PEs (CPUs)~~ Provide a Fault Injection and recovery mechanism for Hosts' PEs (CPUs) May 3, 2017

RaysaOliveira changed the title ~~Provide a Fault Injection and recovery mechanism for Hosts' PEs (CPUs)~~ Provide a Fault Injection and Recovery mechanism for Hosts' PEs (CPUs) May 3, 2017

manoelcampos assigned RaysaOliveira May 4, 2017

manoelcampos added the feature label May 4, 2017

manoelcampos added this to the CloudSim Plus 2.0 milestone May 4, 2017

manoelcampos changed the title ~~Provide a Fault Injection and Recovery mechanism for Hosts' PEs (CPUs)~~ Fault Injection for Hosts' CPUs and Recovery mechanisms which readjust VM's PEs number and start VM's snapshots when a Host or VM completely fails May 4, 2017

RaysaOliveira added a commit to RaysaOliveira/cloudsim-plus that referenced this issue May 4, 2017

Updates cloudsimplus#81

6a671d5

- Documentation updated.
- Several refactorings.
- Updated the HostFaultInjection to automatically set the broker of cloned VMs and Cloudlets if one is not set.

RaysaOliveira added a commit to RaysaOliveira/cloudsim-plus that referenced this issue May 4, 2017

Updates cloudsimplus#81

eb74f14

Refactored the HostFaultInjection to use the Poisson random number generator given by the developer.

RaysaOliveira mentioned this issue May 15, 2017

Closes #81 #84

Merged

manoelcampos closed this as completed in 8bd5436 May 15, 2017

manoelcampos mentioned this issue Jun 2, 2017

Enable a VM belonging to a broker to be destroyed after all its Cloudlets have finished, independently of the state of other running VMs and according to a given delay #99

Closed

RaysaOliveira mentioned this issue Jun 5, 2017

Updates #81 #105

Merged
