Merge branch 'master' of https://github.com/wrench-project/wrench
mesurajpandey committed Nov 28, 2017
2 parents 0a7e595 + 38f8a3b commit 7c68d51
Showing 8 changed files with 88 additions and 23 deletions.
26 changes: 22 additions & 4 deletions doc/getting_started.md
@@ -12,11 +12,11 @@ follow the instructions to [install it](@ref install).

# Running a First Example # {#getting-started-example}

The WRENCH distribution provides an example WMS implementation (`simple-wms`) available
The WRENCH distribution provides an example WMS implementation (`SimpleWMS`) available
in the `examples` folder. Note that a simple installation via `make && make install`
compiles all examples.

WRENCH provides two implementations for the `simple-wms` example: a cloud-based
WRENCH provides two implementations for the `SimpleWMS` example: a cloud-based
implementation `wrench-simple-wms-cloud`, and an implementation to run workflows
in a batch system (e.g., SLURM) `wrench-simple-wms-batch`.

@@ -34,7 +34,7 @@ wrench-simple-wms-batch <PATH-TO-WRENCH-SRC-FOLDER>/examples/two_hosts.xml <PATH

## Understanding the Simple-WMS Examples {#getting-started-example-simplewms}

The `simple-wms` example requires two arguments: (1) a [SimGrid virtual platform
The `SimpleWMS` example requires two arguments: (1) a [SimGrid virtual platform
description file](http://simgrid.gforge.inria.fr/simgrid/3.17/doc/platform.html); and
(2) a WRENCH workflow file.

@@ -48,7 +48,25 @@ A detailed description on how to create a platform description file can be found

**WRENCH workflow file:**
WRENCH provides native parsers for [DAX](http://workflowarchive.org) (DAG in XML)
and [JSON](http://workflowhub.org/traces/) workflow description file formats. Refer to their respective Web sites for detailed documentation.
and [JSON](http://workflowhub.org/traces/) workflow description file formats. Refer to
their respective Web sites for detailed documentation.

The `SimpleWMS` example implementations (either cloud or batch) are structured as follows:

- The first step is to read and parse the workflow and the SimGrid platform files, and
create a simulation object (wrench::Simulation).
- A storage service (wrench::SimpleStorageService) is created and deployed on a host.
- A cloud (wrench::CloudService) or a batch (wrench::BatchService) service is created and
deployed on a host. Both services are seen by the simulation engine as a compute service
(wrench::ComputeService) – jobs can then be scheduled to these resources.
- A WMS (wrench::WMS) is instantiated (in this case the `SimpleWMS`) with a reference to
the workflow object (wrench::Workflow) and a scheduler (wrench::Scheduler). For the
cloud example, a cloud scheduler is required to decide when to spawn VMs on hosts. The
batch service does not require a specific scheduler, since resources are fixed. In
such a case, a regular scheduler can be used.
- A file registry (wrench::FileRegistryService), i.e., a file replica catalog, keeps track
of files stored in different storage services.
- Workflow input files are staged into the storage service, and the simulation is launched.
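
The list above notes that both the cloud and the batch service are seen as a compute service. A minimal sketch of that idea, using stand-in types in a hypothetical `sketch` namespace (these are not the WRENCH classes, and the method names are assumptions made for illustration):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in types for illustration only; these are NOT the WRENCH classes.
namespace sketch {

    // Common interface that both service kinds expose.
    struct ComputeService {
        virtual ~ComputeService() = default;
        virtual std::string kind() const = 0;
    };

    struct CloudService : ComputeService {
        std::string kind() const override { return "cloud"; }
    };

    struct BatchService : ComputeService {
        std::string kind() const override { return "batch"; }
    };

    // A WMS that only depends on the common compute-service interface.
    class SimpleWMS {
    public:
        void addComputeService(std::unique_ptr<ComputeService> service) {
            assert(service != nullptr);
            services_.push_back(std::move(service));
        }

        // Lists the kinds of the registered services, in insertion order.
        std::vector<std::string> serviceKinds() const {
            std::vector<std::string> kinds;
            for (const auto &service : services_) {
                kinds.push_back(service->kind());
            }
            return kinds;
        }

    private:
        std::vector<std::unique_ptr<ComputeService>> services_;
    };

}
```

Because `SimpleWMS` here holds only `ComputeService` pointers, the same WMS code can be handed either service kind, which mirrors why the cloud and batch examples can share one WMS implementation.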


# Preparing the Environment # {#getting-started-prep}
8 changes: 0 additions & 8 deletions include/wrench/services/compute/batch/BatchServiceProperty.h
@@ -17,14 +17,6 @@ namespace wrench {
public:
/** @brief The overhead to start a thread execution, in seconds **/
DECLARE_PROPERTY_NAME(THREAD_STARTUP_OVERHEAD);
// /** @brief The number of bytes in the control message sent by the service to notify a submitter that a job has completed. **/
// DECLARE_PROPERTY_NAME(STANDARD_JOB_DONE_MESSAGE_PAYLOAD);
// /** @brief The number of bytes in the control message sent by the service to notify a submitter that a job has failed. **/
// DECLARE_PROPERTY_NAME(STANDARD_JOB_FAILED_MESSAGE_PAYLOAD);
// /** @brief The number of bytes in the control message sent to the service to submit a job. **/
// DECLARE_PROPERTY_NAME(SUBMIT_BATCH_JOB_ANSWER_MESSAGE_PAYLOAD);
// /** @brief The number of bytes in the control message sent by the service in answer to a job submission. **/
// DECLARE_PROPERTY_NAME(SUBMIT_BATCH_JOB_REQUEST_MESSAGE_PAYLOAD);
/** @brief The host selection algorithm. Can be:
* - FIRSTFIT
* - BESTFIT
2 changes: 2 additions & 0 deletions include/wrench/workflow/execution_events/FailureCause.h
@@ -26,9 +26,11 @@ namespace wrench {
/** \cond DEVELOPER */
/***********************/


/**
* @brief A top-level class to describe all simulation-valid failures that can occur during
* workflow execution
*
*/
class FailureCause {

@@ -141,6 +141,7 @@ namespace wrench {

// Kill the StandardJobExecutor
this->kill_actor();

}

/**
@@ -44,7 +44,7 @@ namespace wrench {
* @param num_cores: the number of cores available to the executor
* @param callback_mailbox: the callback mailbox to which the worker
* thread can send "work done" messages
* @param workunit: the workinit to perform
* @param workunit: the workunit to perform
* @param default_storage_service: the default storage service from which to read/write data (if any)
* @param thread_startup_overhead: the thread_startup overhead, in seconds
*/
17 changes: 13 additions & 4 deletions src/wrench/simgrid_S4U_util/S4U_Daemon.cpp
@@ -40,20 +40,29 @@ namespace wrench {
* @param process_name: the name of the simulated process/actor
*/
S4U_Daemon::S4U_Daemon(std::string process_name)
: process_name(process_name),
mailbox_name("") {
{
this->process_name = process_name; // TODO: Why does this leak?
this->mailbox_name="";
this->terminated = false;
}

S4U_Daemon::~S4U_Daemon() {
// WRENCH_INFO("In the Daemon Destructor");
}
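
The constructor hunk above replaces a member initializer list with assignments in the constructor body. For `std::string` members the two forms leave the object in the same observable state; the body form default-constructs each member first and then assigns to it. A minimal sketch with stand-in classes (the `sketch` namespace and class names are hypothetical, not part of WRENCH, and this does not attempt to explain the leak mentioned in the TODO):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in classes for illustration only; these are NOT wrench::S4U_Daemon.
namespace sketch {

    // Members are constructed directly from the initializer list.
    class DaemonWithInitList {
    public:
        explicit DaemonWithInitList(std::string name)
                : process_name(std::move(name)), mailbox_name(""), terminated(false) {}

        std::string process_name;
        std::string mailbox_name;
        bool terminated;
    };

    // Strings are default-constructed first, then assigned in the body.
    class DaemonWithBodyAssign {
    public:
        explicit DaemonWithBodyAssign(std::string name) {
            this->process_name = std::move(name);
            this->mailbox_name = "";
            this->terminated = false;
        }

        std::string process_name;
        std::string mailbox_name;
        bool terminated;
    };

    // True if both objects hold the same member values.
    inline bool sameState(const DaemonWithInitList &a, const DaemonWithBodyAssign &b) {
        return a.process_name == b.process_name &&
               a.mailbox_name == b.mailbox_name &&
               a.terminated == b.terminated;
    }

    // Small usage check: both construction styles agree.
    inline void demo() {
        DaemonWithInitList a("daemon");
        DaemonWithBodyAssign b("daemon");
        assert(sameState(a, b));
    }

}
```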

int daemonGoodbye(void *x, void*y) {

/**
* \cond
*/
static int daemon_goodbye(void *x, void *y) {
WRENCH_INFO("Terminating");
return 0;
}

/**
* \endcond
*/

/**
* @brief Start the daemon
*
@@ -79,7 +88,7 @@ namespace wrench {
if (daemonized)
this->s4u_actor->daemonize();

this->s4u_actor->onExit(daemonGoodbye, (void *)(this->process_name.c_str()));
this->s4u_actor->onExit(daemon_goodbye, (void *) (this->process_name.c_str()));


// Set the mailbox receiver
8 changes: 4 additions & 4 deletions src/wrench/simulation/Simulation.cpp
@@ -22,10 +22,7 @@ XBT_LOG_NEW_DEFAULT_CATEGORY(simulation, "Log category for Simulation");
namespace wrench {

/**
* @brief Exception handler to catch SIGABRT signals from SimGrid (which should
* probably throw exceptions at some point)
*
* @param signal: the signal number
* \cond
*/
void signal_handler(int signal) {
if (signal == SIGABRT) {
@@ -35,6 +32,9 @@ namespace wrench {
std::cerr << "Unexpected signal " << signal << " received\n";
}
}
/**
* \endcond
*/

/**
* @brief Constructor
47 changes: 45 additions & 2 deletions test/simulation/StandardJobExecutorTest.cpp
@@ -908,7 +908,13 @@ class OneMultiCoreTaskTestWMS : public wrench::WMS {

this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));


workflow->removeTask(task);

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

}

/** Case 2: Create a multicore task with 50% parallel efficiency that lasts one hour **/
@@ -974,6 +980,10 @@ class OneMultiCoreTaskTestWMS : public wrench::WMS {

this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

}

/** Case 3: Create a multicore task with 50% parallel efficiency and include thread startup overhead **/
@@ -1037,6 +1047,10 @@ class OneMultiCoreTaskTestWMS : public wrench::WMS {
workflow->removeTask(task);

this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));
//
// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

}

@@ -1229,6 +1243,10 @@ class TwoMultiCoreTasksTestWMS : public wrench::WMS {

this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

}


@@ -1311,6 +1329,10 @@ class TwoMultiCoreTasksTestWMS : public wrench::WMS {
workflow->removeTask(task2);
this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

}


@@ -1396,6 +1418,10 @@ class TwoMultiCoreTasksTestWMS : public wrench::WMS {
workflow->removeTask(task3);
this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

}

// Terminate everything
@@ -1584,6 +1610,10 @@ class MultiHostTestWMS : public wrench::WMS {
workflow->removeTask(task1);
workflow->removeTask(task2);
this->test->storage_service1->deleteFile(workflow->getFileById("output_file"));

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;
}

/** Case 2: Create 4 tasks that will run in best fit manner **/
@@ -1637,9 +1667,9 @@ class MultiHostTestWMS : public wrench::WMS {
wrench::StandardJobExecutorDoneMessage *msg = dynamic_cast<wrench::StandardJobExecutorDoneMessage *>(message.get());
if (!msg) {
wrench::StandardJobExecutorFailedMessage *msg = dynamic_cast<wrench::StandardJobExecutorFailedMessage *>(message.get());
std::cerr << "----> " << msg->cause->toString() << "\n";
// std::cerr << "----> " << msg->cause->toString() << "\n";

throw std::runtime_error("Unexpected '" + message->getName() + "' message");
throw std::runtime_error("Unexpected '" + msg->cause->toString() + "' error");

}
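
In the hunk above, the result of the second `dynamic_cast` is dereferenced without a null check, so a message that is neither a "done" nor a "failed" message would lead to undefined behavior. A minimal sketch of a checked variant, using stand-in message types in a hypothetical `sketch` namespace (the class names and the `checkMessage` helper are assumptions, not WRENCH API):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in message types for illustration only; these are NOT the WRENCH classes.
namespace sketch {

    struct Message {
        virtual ~Message() = default;
        virtual std::string getName() const { return "Message"; }
    };

    struct DoneMessage : Message {
        std::string getName() const override { return "Done"; }
    };

    struct FailedMessage : Message {
        explicit FailedMessage(std::string c) : cause(std::move(c)) {}
        std::string getName() const override { return "Failed"; }
        std::string cause;
    };

    // Returns an empty string for a "done" message, otherwise an error description.
    // Each dynamic_cast result is checked before it is dereferenced.
    inline std::string checkMessage(const Message *message) {
        assert(message != nullptr);
        if (dynamic_cast<const DoneMessage *>(message)) {
            return "";
        }
        if (const auto *failed = dynamic_cast<const FailedMessage *>(message)) {
            return "Unexpected '" + failed->cause + "' error";
        }
        // Neither type matched: fall back to the generic message name.
        return "Unexpected '" + message->getName() + "' message";
    }

}
```

A caller could throw `std::runtime_error` with the returned string whenever it is non-empty, which keeps the more informative failure cause while still handling unrelated message types.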

@@ -1667,6 +1697,10 @@ class MultiHostTestWMS : public wrench::WMS {
// throw std::runtime_error("Case 2: Unexpected task2 end date: " + std::to_string(task2->getEndDate()));
// }

// wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// // TODO: (this is a general design issue)
// delete executor;

workflow->removeTask(task1);
workflow->removeTask(task2);
workflow->removeTask(task3);
@@ -1837,6 +1871,10 @@ class JobTerminationTestDuringAComputationWMS : public wrench::WMS {
workflow->removeTask(task2);
workflow->removeTask(task3);
workflow->removeTask(task4);

wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// TODO: (this is a general design issue)
delete executor;
}


@@ -2002,6 +2040,11 @@ class JobTerminationTestDuringATransferWMS : public wrench::WMS {
workflow->removeTask(task2);
workflow->removeTask(task3);
workflow->removeTask(task4);

wrench::S4U_Simulation::sleep(0.00001); // TODO: This is needed to avoid a segfault from the delete
// TODO: (this is a general design issue)
delete executor;

}

