Process trace files from Google Cluster Data #149

Closed
manoelcampos opened this issue Jul 6, 2018 · 4 comments

manoelcampos commented Jul 6, 2018

Implement classes to read Google Trace files, enabling the creation of Hosts and Cloudlets from these files.
A script to download trace files is available at download-google-cluster-data.sh

Detailed information about how the feature should work

New classes should be introduced to implement the format of the different trace files. The existing WorkloadFileReader must be refactored to extract a superclass containing common methods for the other classes.
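
A minimal sketch of what such a refactoring could look like, assuming hypothetical names (`TraceReaderSketch`, `TraceReaderBase`, `SwfLikeReader`, `CsvLikeReader`) that are not part of CloudSim Plus. The shared line-reading logic sits in an abstract superclass, and each format only supplies its comment rule, delimiter and object creation:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: every name here is hypothetical, not the CloudSim Plus API. */
final class TraceReaderSketch {

    /** Holds what every trace format shares: reading lines and skipping noise. */
    abstract static class TraceReaderBase<T> {
        /** Tells whether a (trimmed) line is a comment in the concrete format. */
        protected abstract boolean isComment(String line);

        /** Regex that separates the fields in the concrete format. */
        protected abstract String delimiterRegex();

        /** Builds one object from the fields of a parsed line. */
        protected abstract T createObject(String[] fields);

        final List<T> readAll(Reader source) {
            final List<T> objects = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(source)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    final String trimmed = line.trim();
                    if (trimmed.isEmpty() || isComment(trimmed)) {
                        continue; // blank lines and comments carry no data
                    }
                    objects.add(createObject(trimmed.split(delimiterRegex())));
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return objects;
        }
    }

    /** Whitespace-delimited format with ';' comments; returns the 4th field as a number. */
    static final class SwfLikeReader extends TraceReaderBase<Double> {
        @Override protected boolean isComment(String line) { return line.startsWith(";"); }
        @Override protected String delimiterRegex() { return "\\s+"; }
        @Override protected Double createObject(String[] fields) { return Double.parseDouble(fields[3]); }
    }

    /** Comma-delimited format without comments; returns the 1st field as a timestamp. */
    static final class CsvLikeReader extends TraceReaderBase<Long> {
        @Override protected boolean isComment(String line) { return false; }
        @Override protected String delimiterRegex() { return ","; }
        @Override protected Long createObject(String[] fields) { return Long.parseLong(fields[0]); }
    }

    public static void main(String[] args) {
        System.out.println(new SwfLikeReader().readAll(new StringReader("; header\n1 0 5 120 4\n")));
        System.out.println(new CsvLikeReader().readAll(new StringReader("0,5,0\n600,6,1\n")));
    }
}
```

With this shape, supporting another trace format would only mean adding one more subclass.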

A brief explanation of why you think this feature is useful

It will enable creating more realistic simulations using real and extensive data from physical datacenters. This data can be assessed, for instance, to identify a possible correlation between different workloads, such as CPU and RAM requirements.
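
As an illustration of the kind of analysis meant here, the sketch below computes the Pearson correlation coefficient between two series. The class name and the sample values are made up for the example; they are not taken from any trace file:

```java
/** Illustrative only: hypothetical helper, with made-up sample data. */
final class WorkloadCorrelationSketch {

    /**
     * Pearson correlation coefficient between two equally sized series.
     * Returns NaN when either series has zero variance.
     */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("Series must have the same length (at least 2)");
        }
        final int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double covariance = 0, varianceX = 0, varianceY = 0;
        for (int i = 0; i < n; i++) {
            final double dx = x[i] - meanX;
            final double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Made-up CPU and RAM requests for five tasks
        final double[] cpu = {0.10, 0.25, 0.40, 0.55, 0.80};
        final double[] ram = {0.05, 0.20, 0.30, 0.60, 0.70};
        System.out.println("CPU/RAM correlation: " + pearson(cpu, ram));
    }
}
```

A value close to 1 or -1 would suggest a strong linear relation between the two requirements, while a value near 0 would suggest none.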

Examples

Related Issues

manoelcampos added the feature and in-progress labels Jul 6, 2018
manoelcampos added this to the CloudSim Plus 4.0 milestone Jul 6, 2018
manoelcampos self-assigned this Jul 6, 2018
manoelcampos changed the title from "Process Google Trace files" to "Process trace files from Google Cluster Data" Jul 6, 2018
manoelcampos added a commit that referenced this issue Aug 9, 2018
- Refactors WorkloadFileReader by extracting methods to a superclass to enable
  creating new subclasses for different trace formats
  (such as Google Cluster Data)
- Renames WorkloadFileReader to SwfWorkloadFileReader
  because the class is specific for the Standard Workload Format (*.swf)
  from The Hebrew University of Jerusalem.

- Introduces the GoogleMachineEventsTraceReader to create Hosts
  from Google Cloud trace files.

  * Adds startTime and shutdownTime attributes to Host
  * Enables requesting the creation of Hosts only at the time
    specified by the timestamp in trace file.
  * Enables processing of ADD and REMOVE event types.
  * Adds GoogleMachineEventsExample1.

- Introduces the GoogleTaskEventsTraceReader
  to read tasks events from Google Cloud traces.
  Jobs files are not read because they don't have useful information
  for simulations and are represented in CloudSim Plus just as an integer
  value for the jobId attribute of a Cloudlet.

  * Adds new Cloudlet status to conform to the Google Cluster Trace tasks events
  * Adds new CloudSimTags to enable a broker to receive Cloudlet status changes
    when creating Cloudlets from a trace file such as the Google Cluster Trace
  * Adds a jobId attribute to Cloudlet so that Cloudlets
    can be categorized as belonging to a fictitious job.
  * Adds DatacenterBroker.getCloudletSubmittedList

- FieldIndex enums in GoogleTraceReaderAbstract subclasses
  enable getting a value from the parsed trace line directly
  from a method inside the enum. This way, the value is returned
  in the correct type and after making unit conversions
  (if required). That was accomplished by making the enum
  implement the TraceField interface.

- Adds google-cluster-data-samples.xlsx
  as an easier way to analyse the structure of Google Trace files.
- Adds sample google trace files to the resource directory
  of the examples project.

- Fixes conversion issues (such as GB to MB) and updates documentation.
- Changes stream and "enhanced for" in CloudSim class to iterators
  and indexed loops to improve efficiency and avoid
  ConcurrentModificationException.
  Previously, defensive copies of the event and entity lists were created
  to avoid such an exception, which reduced performance when
  there is a large number of objects in such lists.

Signed-off-by: Manoel Campos <manoelcampos@gmail.com>
manoelcampos added a commit that referenced this issue Aug 9, 2018
- Refactors WorkloadFileReader by extracting methods to a superclass to enable
  creating new subclasses for different trace formats
  (such as Google Cluster Data)
- Renames WorkloadFileReader to SwfWorkloadFileReader
  because the class is specific for the Standard Workload Format (*.swf)
  from The Hebrew University of Jerusalem.

- Introduces the GoogleMachineEventsTraceReader to create Hosts
  from Google Cloud trace files.

  * Adds startTime and shutdownTime attributes to Host
  * Enables requesting the creation of Hosts only at the time
    specified by the timestamp in trace file.
  * Enables processing of ADD and REMOVE event types.
  * Adds GoogleMachineEventsExample1.

- Introduces the GoogleTaskEventsTraceReader
  to read tasks events from Google Cloud traces.
  Jobs files are not read because they don't have useful information
  for simulations and are represented in CloudSim Plus just as an integer
  value for the jobId attribute of a Cloudlet.

  * Adds new Cloudlet status to conform to the Google Cluster Trace tasks events
  * Adds new CloudSimTags to enable a broker to receive Cloudlet status changes
    when creating Cloudlets from a trace file such as the Google Cluster Trace
  * Adds a jobId attribute to Cloudlet so that Cloudlets
    can be categorized as belonging to a fictitious job.
  * Adds DatacenterBroker.getCloudletSubmittedList
  * Adds GoogleTaskEventsExample1.

- FieldIndex enums in GoogleTraceReaderAbstract subclasses
  enable getting a value from the parsed trace line directly
  from a method inside the enum. This way, the value is returned
  in the correct type and after making unit conversions
  (if required). That was accomplished by making the enum
  implement the TraceField interface.

- Adds google-cluster-data-samples.xlsx
  as an easier way to analyse the structure of Google Trace files.
- Adds sample google trace files to the resource directory
  of the examples project.

- Fixes conversion issues (such as GB to MB) and updates documentation.
- Changes stream and "enhanced for" in CloudSim class to iterators
  and indexed loops to improve efficiency and avoid
  ConcurrentModificationException.
  Previously, defensive copies of the event and entity lists were created
  to avoid such an exception, which reduced performance when
  there is a large number of objects in such lists.

Signed-off-by: Manoel Campos <manoelcampos@gmail.com>
manoelcampos added a commit that referenced this issue Aug 10, 2018
- Refactors WorkloadFileReader by extracting methods to a superclass to enable
  creating new subclasses for different trace formats
  (such as Google Cluster Data)
- Renames WorkloadFileReader to SwfWorkloadFileReader
  because the class is specific for the Standard Workload Format (*.swf)
  from The Hebrew University of Jerusalem.

- Introduces the GoogleMachineEventsTraceReader to create Hosts
  from Google Cloud trace files.

  * Adds startTime and shutdownTime attributes to Host
  * Enables requesting the creation of Hosts only at the time
    specified by the timestamp in trace file.
  * Enables processing of ADD and REMOVE event types.
  * Adds GoogleMachineEventsExample1.

- Introduces the GoogleTaskEventsTraceReader
  to read tasks events from Google Cloud traces.
  Jobs files are not read because they don't have useful information
  for simulations and are represented in CloudSim Plus just as an integer
  value for the jobId attribute of a Cloudlet.

  * Adds new Cloudlet status to conform to the Google Cluster Trace tasks events
  * Adds new CloudSimTags to enable a broker to receive Cloudlet status changes
    when creating Cloudlets from a trace file such as the Google Cluster Trace
  * Adds a jobId attribute to Cloudlet so that Cloudlets
    can be categorized as belonging to a fictitious job.
  * Adds DatacenterBroker.getCloudletSubmittedList
  * Adds GoogleTaskEventsExample1.

- FieldIndex enums in GoogleTraceReaderAbstract subclasses
  enable getting a value from the parsed trace line directly
  from a method inside the enum. This way, the value is returned
  in the correct type and after making unit conversions
  (if required). That was accomplished by making the enum
  implement the TraceField interface.

- Adds google-cluster-data-samples.xlsx
  as an easier way to analyse the structure of Google Trace files.
- Adds sample google trace files to the resource directory
  of the examples project.

- Fixes conversion issues (such as GB to MB) and updates documentation.
- Changes stream and "enhanced for" in CloudSim class to iterators
  and indexed loops to improve efficiency and avoid
  ConcurrentModificationException.
  Previously, defensive copies of the event and entity lists were created
  to avoid such an exception, which reduced performance when
  there is a large number of objects in such lists.

Signed-off-by: Manoel Campos <manoelcampos@gmail.com>
manoelcampos added a commit that referenced this issue Aug 17, 2018
- Refactors WorkloadFileReader by extracting methods to a superclass to enable
  creating new subclasses for different trace formats
  (such as Google Cluster Data)
- Renames WorkloadFileReader to SwfWorkloadFileReader
  because the class is specific for the Standard Workload Format (*.swf)
  from The Hebrew University of Jerusalem.

- Introduces the GoogleMachineEventsTraceReader to create Hosts
  from Google Cloud trace files.

  * Adds startTime and shutdownTime attributes to Host
  * Enables requesting the creation of Hosts only at the time
    specified by the timestamp in trace file.
  * Enables processing of ADD and REMOVE event types.
  * Adds GoogleMachineEventsExample1.

- Introduces the GoogleTaskEventsTraceReader
  to read tasks events from Google Cloud traces.
  Jobs files are not read because they don't have useful information
  for simulations and are represented in CloudSim Plus just as an integer
  value for the jobId attribute of a Cloudlet.

  * Adds new Cloudlet status to conform to the Google Cluster Trace tasks events
  * Adds new CloudSimTags to enable a broker to receive Cloudlet status changes
    when creating Cloudlets from a trace file such as the Google Cluster Trace
  * Adds a jobId attribute to Cloudlet so that Cloudlets
    can be categorized as belonging to a fictitious job.
  * Adds DatacenterBroker.getCloudletSubmittedList
  * Adds GoogleTaskEventsExample1.
  * Updates Cloudlets attributes during simulation time,
    according to the values read from the "task events" trace file.

- FieldIndex enums in GoogleTraceReaderAbstract subclasses
  enable getting a value from the parsed trace line directly
  from a method inside the enum. This way, the value is returned
  in the correct type and after making unit conversions
  (if required). That was accomplished by making the enum
  implement the TraceField interface.

- Adds google-cluster-data-samples.xlsx
  as an easier way to analyse the structure of Google Trace files.
- Adds sample google trace files to the resource directory
  of the examples project.

- Fixes conversion issues (such as GB to MB) and updates documentation.
- Changes stream and "enhanced for" in CloudSim class to iterators
  and indexed loops to improve efficiency and avoid
  ConcurrentModificationException.
  Previously, defensive copies of the event and entity lists were created
  to avoid such an exception, which reduced performance when
  there is a large number of objects in such lists.

Signed-off-by: Manoel Campos <manoelcampos@gmail.com>
manoelcampos removed the in-progress label Aug 28, 2018
manoelcampos added a commit that referenced this issue Aug 28, 2018
- Refactors WorkloadFileReader by extracting methods to a superclass to enable
  creating new subclasses for different trace formats
  (such as Google Cluster Data).
- Renames WorkloadFileReader to SwfWorkloadFileReader
  because the class is specific for the Standard Workload Format (*.swf)
  from The Hebrew University of Jerusalem.

- Introduces the GoogleMachineEventsTraceReader class to create Hosts
  from Google Cloud trace files:

  * Adds startTime and shutdownTime attributes to Host;
  * Enables requesting the creation of Hosts only at the time
    specified by the timestamp in trace file;
  * Enables processing of ADD and REMOVE event types;
  * Adds GoogleMachineEventsExample1.

- Introduces the GoogleTaskEventsTraceReader
  to read tasks events from Google Cloud traces.
  Jobs files are not read because they don't have useful information
  for simulations. They are represented in CloudSim Plus just as integer
  values for the jobId attribute of a Cloudlet:

  * Adds new Cloudlet status to conform to the Google Cluster Trace tasks events;
  * Adds new CloudSimTags to enable a broker to receive Cloudlet status changes
    when creating Cloudlets from a trace file;
  * Adds a jobId attribute to Cloudlet so that it
    can be categorized as belonging to a fictitious job;
  * Adds DatacenterBroker.getCloudletSubmittedList;
  * Adds GoogleTaskEventsExample1;
  * Updates Cloudlet's attributes during simulation time,
    according to the values read from the "task events" trace file
    (such as max number of CPU cores and max RAM usage).

- FieldIndex enums in GoogleTraceReaderAbstract subclasses
  enable getting a value from the parsed trace line directly
  from a method inside the enum. This way, the value is returned
  in the correct type and after making unit conversions
  (if required). That was accomplished by making the enum
  implement the TraceField interface.
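
A rough sketch of the enum idea described above, not the actual CloudSim Plus code. The `TraceField` interface shown here is a simplified, hypothetical version, and the column order and the microsecond timestamps are assumptions about the "machine events" file:

```java
import java.util.function.Function;

/** Illustrative only: simplified, hypothetical version of the enum-based field access. */
final class TraceFieldSketch {

    /** Hypothetical interface: a field knows how to extract its own typed value. */
    interface TraceField {
        <T> T getValue(String[] parsedLine);
    }

    /** Column order and units below are assumptions for this sketch. */
    enum MachineEventField implements TraceField {
        TIMESTAMP(raw -> Long.parseLong(raw) / 1_000_000.0), // microseconds -> seconds
        MACHINE_ID(Long::parseLong),
        EVENT_TYPE(Integer::parseInt),
        PLATFORM_ID(raw -> raw),
        CPU_CAPACITY(Double::parseDouble),
        RAM_CAPACITY(Double::parseDouble);

        private final Function<String, Object> converter;

        MachineEventField(Function<String, Object> converter) {
            this.converter = converter;
        }

        /** The constant's position in the enum is used as the column index. */
        @Override
        @SuppressWarnings("unchecked")
        public <T> T getValue(String[] parsedLine) {
            return (T) converter.apply(parsedLine[ordinal()]);
        }
    }

    public static void main(String[] args) {
        final String[] line = "1000000,5,0,platformA,0.5,0.25".split(",");
        final Double seconds = MachineEventField.TIMESTAMP.getValue(line);
        final Long machineId = MachineEventField.MACHINE_ID.getValue(line);
        System.out.println("Machine " + machineId + " event at " + seconds + " s");
    }
}
```

The caller gets an already converted value and never touches column indexes. The trade-off in this sketch is the unchecked cast: assigning a field to the wrong type only fails at runtime with a ClassCastException.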

- Adds google-cluster-data-samples.xlsx
  as an easier way to analyse the structure of Google Trace files.
- Adds sample google trace files to the resource directory
  of the examples project.

- Fixes conversion issues (such as GB to MB) and updates documentation.
- Changes stream and "enhanced for" in CloudSim class to iterators
  and indexed loops to improve efficiency and avoid
  ConcurrentModificationException.
  Previously, defensive copies of the events and entities lists were created
  to avoid such an exception, which reduced performance when
  there is a large number of objects in such lists.
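
The difference between the two loop styles can be reproduced with plain Java collections. The sketch below is not the CloudSim class code; it only shows, with a hypothetical list of entity names, why an "enhanced for" fails when the list grows during iteration while an indexed loop does not:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

/** Illustrative only: plain-Java demonstration, unrelated to the real CloudSim class. */
final class IterationSketch {

    /** An "enhanced for" uses a fail-fast iterator, so adding during iteration throws. */
    static boolean enhancedForFails() {
        final List<String> entities = new ArrayList<>(Arrays.asList("a", "b"));
        try {
            for (String entity : entities) {
                if (entity.equals("a")) {
                    entities.add("c"); // structural change while iterating
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** An indexed loop re-checks size() each round, so appended items are also visited. */
    static List<String> indexedLoopVisits() {
        final List<String> entities = new ArrayList<>(Arrays.asList("a", "b"));
        final List<String> visited = new ArrayList<>();
        for (int i = 0; i < entities.size(); i++) {
            final String entity = entities.get(i);
            visited.add(entity);
            if (entity.equals("a")) {
                entities.add("c"); // no defensive copy needed for appends
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println("Enhanced for threw: " + enhancedForFails());
        System.out.println("Indexed loop visited: " + indexedLoopVisits());
    }
}
```

The indexed loop is only safe here because elements are appended; removing elements during the loop would require adjusting the index.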

- Changes the type of the CloudSimEvent's data attribute to
  a Runnable when the event tag is CloudSimTags.CLOUDLET_UPDATE_ATTRIBUTES.
  This way, all the logic to update the Cloudlet's attributes
  can be customized by the researcher and encapsulated
  into a no-args and no-return function (the Runnable).
  This function is then executed by the broker when
  an event of such a type is received.
- Defines the correct way to interpret the "resource request" fields
  in the "task events" trace file:
  * The "resource request for CPU" is used to define the Cloudlet's PEs;
  * The "resource request for RAM" is used to define the max resource utilization
    for the RAM UtilizationModel of a Cloudlet.
    In this case, a UtilizationModelDynamic instance is required.
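
  The two points above (a Runnable as event data, and the mapping of the
  "resource request" fields) can be sketched together. Everything below is a
  hypothetical stand-in: the classes, the tag value and the rule that scales a
  normalized CPU request by a chosen core count are assumptions for
  illustration, not the CloudSim Plus implementation.

```java
class RunnableEventSketch {
    /** Hypothetical tag value standing in for CloudSimTags.CLOUDLET_UPDATE_ATTRIBUTES. */
    static final int CLOUDLET_UPDATE_ATTRIBUTES = 1;

    /** Minimal stand-in for a Cloudlet: only the two attributes discussed above. */
    static class CloudletSketch {
        long pes = 1;
        double maxRamUtilization = 0.0; // fraction of the available RAM, 0..1
    }

    /** Minimal stand-in for a simulation event carrying a tag and arbitrary data. */
    static class EventSketch {
        final int tag;
        final Object data;

        EventSketch(int tag, Object data) {
            this.tag = tag;
            this.data = data;
        }
    }

    /** What a broker might do on receiving the event: just run the attached function. */
    static void processEvent(EventSketch event) {
        if (event.tag == CLOUDLET_UPDATE_ATTRIBUTES && event.data instanceof Runnable) {
            ((Runnable) event.data).run();
        }
    }

    /** Builds the update event from the two "resource request" values of a trace line. */
    static EventSketch updateEvent(CloudletSketch cloudlet, double cpuRequest,
                                   double ramRequest, long maxCores) {
        Runnable update = () -> {
            // Assumption: the CPU request is a 0..1 fraction scaled by a chosen core count.
            cloudlet.pes = Math.max(1, Math.round(cpuRequest * maxCores));
            // The RAM request becomes the cap for the RAM utilization model.
            cloudlet.maxRamUtilization = ramRequest;
        };
        return new EventSketch(CLOUDLET_UPDATE_ATTRIBUTES, update);
    }

    public static void main(String[] args) {
        CloudletSketch cloudlet = new CloudletSketch();
        processEvent(updateEvent(cloudlet, 0.25, 0.1, 8));
        System.out.println(cloudlet.pes + " PEs, max RAM utilization " + cloudlet.maxRamUtilization);
    }
}
```

  Because the update logic travels inside the event, the broker stays generic
  and a researcher can swap in a different Runnable without changing it.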

- Introduces the GoogleTaskUsageTraceReader class to read "task usage" trace files.

Signed-off-by: Manoel Campos <manoelcampos@gmail.com>
@GitHubDiom

GitHubDiom commented Nov 12, 2020

Hey, I am new to CloudSim Plus, sorry to disturb you.
How do I run my simulation with the Google Trace v3 or the Alibaba Trace 2018?
Do I need to write my own reader class?
I would like to evaluate the final completion time of all tasks and the resource usage (or utilization) of the hosts during this period. Which example is the most appropriate?

@manoelcampos
Collaborator Author

If the v3 trace files have the same structure, you can just use the available examples.
If the structure has changed, you need to create new readers.
In that case, the available Google Trace reader classes will help.
There is no support for Alibaba traces and there are no current plans to implement it.

@GitHubDiom

@manoelcampos thanks for the reply.
I downloaded the Google Trace v2 (2011), and an error occurred while running the GoogleTaskEventsExample1 example.
I used task-events-part-00000-of-00500.csv (task-events file) and task-usage-part-00000-of-00500.csv (task-usage file) as my trace files.
I've done some filtering on these trace files.
Here are the sed commands:

jobID=5494054149
sed -e '/'$jobID'/!d' task-events-part-00000-of-00500.csv > task-events-$jobID.csv
sed -e '/'$jobID'/!d' task-usage-part-00000-of-00500.csv > task-usage-$jobID.csv
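
Note that these sed patterns match the job ID anywhere in a line, so a line whose machine ID or another field merely contains the same digits is kept too. A stricter filter could compare only the job ID column. The following is a minimal sketch, assuming the job ID is the third comma-separated field in both files (as in the 2011 trace schema); the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

class JobFilterSketch {
    /** Assumed zero-based position of the job ID column in both trace files. */
    static final int JOB_ID_COLUMN = 2;

    /** Keeps only the lines whose job ID column equals the given ID exactly. */
    static List<String> filterByJob(List<String> lines, String jobId) {
        return lines.stream()
                .filter(line -> {
                    String[] fields = line.split(",", -1); // -1 keeps empty trailing fields
                    return fields.length > JOB_ID_COLUMN && fields[JOB_ID_COLUMN].equals(jobId);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JobFilterSketch <input.csv> <output.csv> <jobId>
        if (args.length != 3) {
            System.err.println("Usage: JobFilterSketch <input.csv> <output.csv> <jobId>");
            return;
        }
        List<String> kept = filterByJob(Files.readAllLines(Paths.get(args[0])), args[2]);
        Files.write(Paths.get(args[1]), kept);
    }
}
```

The original line order is preserved, so the timestamps in the filtered file stay in the same sequence as in the source file.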

When I run the simulation, I get a "Past event detected" error. Could you please tell me what is wrong?
In this case, the number of VMs and the number of Hosts are both 300.

INFO
================== Starting CloudSim Plus 5.5.1 ==================
INFO 0.00: DatacenterSimple1 is starting...
INFO Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY= is starting...
INFO Entities started.
INFO 0.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: List of 1 datacenters(s) received.
INFO 0.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Trying to create Vm 0 in DatacenterSimple1
INFO 0.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Trying to create Vm 1 in DatacenterSimple1
...
INFO 0.00: VmAllocationPolicySimple: Vm 0 has been allocated to Host 0/DC 1
INFO 0.00: VmAllocationPolicySimple: Vm 1 has been allocated to Host 1/DC 1
INFO 0.00: VmAllocationPolicySimple: Vm 2 has been allocated to Host 2/DC 1
INFO 0.00: VmAllocationPolicySimple: Vm 3 has been allocated to Host 3/DC 1
...
INFO 0.10: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Sending Cloudlet 54940541490 to Vm 0 in Host 0/DC 1.
INFO 0.10: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Sending Cloudlet 54940541491 to Vm 1 in Host 1/DC 1.
INFO 0.10: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Sending Cloudlet 54940541492 to Vm 2 in Host 2/DC 1.
...
INFO 0.10: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: All waiting Cloudlets submitted to some VM.

Exception in thread "main" java.lang.IllegalArgumentException: Past event detected. Event time: 0.0 Simulation clock: 0.1

at org.cloudbus.cloudsim.core.CloudSim.processEvent(CloudSim.java:708)
at org.cloudbus.cloudsim.core.CloudSim.processFutureEventsHappeningAtSameTimeOfTheFirstOne(CloudSim.java:565)
at org.cloudbus.cloudsim.core.CloudSim.runClockTickAndProcessFutureEvents(CloudSim.java:519)
at org.cloudbus.cloudsim.core.CloudSim.processEvents(CloudSim.java:321)
at org.cloudbus.cloudsim.core.CloudSim.start(CloudSim.java:280)
at org.cloudsimplus.examples.traces.google.GoogleTaskEventsExample1.<init>(GoogleTaskEventsExample1.java:146)
at org.cloudsimplus.examples.traces.google.GoogleTaskEventsExample1.main(GoogleTaskEventsExample1.java:128)

Then I trimmed the task-events file down to its first line only (without changing the task-usage file) and got the following output (note the warning "WARN 5400.00: DatacenterSimple: Vm 0 destroyed on Host 0/DC 1 ..."). In this case, there is 1 VM and 1 Host.

INFO
================== Starting CloudSim Plus 5.5.1 ==================

INFO 0.00: DatacenterSimple1 is starting...
INFO Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY= is starting...
INFO Entities started.
INFO 0.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: List of 1 datacenters(s) received.
INFO 0.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Trying to create Vm 0 in DatacenterSimple1
INFO 0.00: VmAllocationPolicySimple: Vm 0 has been allocated to Host 0/DC 1
INFO 0.10: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Sending Cloudlet 54940541490 to Vm 0 in Host 0/DC 1.
INFO 0.10: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: All waiting Cloudlets submitted to some VM.
TRACE 600.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: CPU Utilization: 100.0 -> 0.0% | RAM Utilization: 0.0 -> 0.1% |
TRACE 900.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: RAM Utilization: 0.1 -> 0.1% |
TRACE 1200.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: RAM Utilization: 0.1 -> 0.1% |
TRACE 1500.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 1800.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: RAM Utilization: 0.1 -> 0.1% |
TRACE 2100.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: RAM Utilization: 0.1 -> 0.0% |
TRACE 2400.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: RAM Utilization: 0.0 -> 0.0% |
TRACE 2700.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 3000.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed: RAM Utilization: 0.0 -> 0.0% |
TRACE 3300.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 3600.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 3900.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 4200.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 4500.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 4800.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 5100.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
TRACE 5400.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Cloudlet 54940541490 resource usage changed:
INFO 5400.00: Processing last events before simulation shutdown.
TRACE 5400.00: Datacenter 1: Unknown event -1 received.
INFO 5400.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY= is shutting down...
INFO 5400.00: Broker_07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=: Requesting Vm 0 destruction.

WARN 5400.00: DatacenterSimple: Vm 0 destroyed on Host 0/DC 1. It had a total of 1 cloudlets (running + waiting). Some events may have been missed. You can try: (a) decreasing CloudSim's minTimeBetweenEvents and/or Datacenter's schedulingInterval attribute; (b) increasing broker's Vm destruction delay for idle VMs if you set it to zero; (c) defining Cloudlets with smaller length (your Datacenter's scheduling interval may be smaller than the time to finish some Cloudlets).

INFO Simulation: No more future events

INFO CloudInformationService0: Notify all CloudSim Plus entities to shutdown.

INFO 5400.00: DatacenterSimple1 is shutting down...
INFO
================== Simulation finished at time 5400.00 ==================

DEBUG DeferredQueue >> max size: 2 added to middle: 0 added to tail: 27

                Simulation results for Broker representing the username 07YMt+54AmVeMivAZ6AgIC+yR4U1ALbWtJbiJtFzxJY=

Job|Cloudlet|Status |DC|Host|Host PEs |VM|VM Size|Cloudlet Size|VM PEs   |Waiting Time|CloudletLen|CloudletPEs|StartTime|FinishTime|ExecTime
 ID|      ID|       |ID|  ID|CPU cores|ID|     MB|           MB|CPU cores|     Seconds|         MI|  CPU cores|  Seconds|   Seconds| Seconds
Simulation finished at 22:42:04.457513400. Execution time: 0.75 seconds

@manoelcampos
Collaborator Author

Please subscribe to the forum. The link is on the home page. Make sure you read the subscription page and provide ALL the required info or your application will be rejected.
