
added google workload to champsim trace converter #379

Open
wants to merge 1 commit into
base: master

hrishi-06

In reference to this conversation

## About Wrapper Script

### Google Workload Trace Format
Google Workload traces are stored as records, each record being of type `trace_entry_t` and size 12 bytes. Following is the structure of `trace_entry_t`:


For the conversion, we highly recommend using our libraries, which read the disk format into a record format suitable for analysis and simulation (the memref_t type). The disk format is subject to change, and we do not consider such changes compatibility breaks, as it is not meant to be a public, relied-upon interface. Furthermore, the disk format has numerous built-in complexities, such as including instruction encodings or physical address values only the first time they are observed, which the libraries abstract away. Directly converting the disk format would require duplicating the handling of those complexities. It is best to work at the recommended and documented higher level, both for simplicity and to avoid breaking changes.

Thus, we ask that you not take this approach and instead create an analysis tool in our framework that converts from the memref_t level. Numerous changes made at the disk level in the last year break this converter as-is: it won't work on today's traces.


Going further, for our user-mode, software-thread-oriented traces, we recommend building a dynamic converter from memref_t, rather than an offline static converter, for use with our dynamic scheduler, which re-schedules our traces onto the simulated virtual cores. This solves several issues: it allows mixing multiple workloads to model multi-tenant datacenters even though the traces are per-workload; it deflates the context-switch inflation caused by tracing overhead; and it (will) provide some functionality typically missing from trace-based simulation, such as speculative path support by pulling instructions from prior points in the trace or using other schemes.

Collaborator

What do you suggest we do instead? Would this fall under the category of analysis tools, as described here? Or is there a different way to get to a memref_t?


Yes, an analysis tool. If ChampSim files are per-software-thread, a parallel tool with software threads as the parallel shards would work. You could start with an offline conversion to ChampSim files in the analysis tool. The same analysis tool code could be run with the shards as virtual cores (possibly with a dynamic reschedule of the software threads) to produce per-hardware-thread trace files.
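As a rough illustration of the "software threads as shards" idea, here is a minimal sketch that groups records into one output stream per software thread. The `simple_record` and `per_thread_converter` types are hypothetical stand-ins, not DynamoRIO or ChampSim types; a real tool would presumably subclass the framework's analysis tool interface and receive `memref_t` records per shard, then write ChampSim records rather than bare program counters.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified stand-in for a trace record; the real memref_t
// carries far more information (record type, size, address, and so on).
struct simple_record {
    std::int64_t tid;  // software thread that executed the instruction
    std::uint64_t pc;  // instruction address
};

// Hypothetical converter: one output stream (shard) per software thread.
class per_thread_converter {
public:
    // Append the record to the stream belonging to its software thread.
    void process(const simple_record& rec) { shards_[rec.tid].push_back(rec.pc); }

    // Read-only view of the per-thread streams collected so far.
    const std::map<std::int64_t, std::vector<std::uint64_t>>& shards() const
    {
        return shards_;
    }

private:
    std::map<std::int64_t, std::vector<std::uint64_t>> shards_;
};
```

Each entry in `shards()` would correspond to one per-software-thread ChampSim file in an offline conversion. Running the same logic with virtual cores as the shard key instead of thread ids would give per-hardware-thread output.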


### Conversion

Converting non-branch instructions to the ChampSim format is straightforward: we copy the instruction pointer value and set `is_branch=0` and `branch_taken=0`.
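A minimal sketch of that non-branch case follows. The struct mirrors the `trace_instr_format` definition quoted in this PR; `make_non_branch` is a hypothetical helper name used only for illustration and is not taken from the converter itself.

```cpp
constexpr int NUM_INSTR_DESTINATIONS = 2;
constexpr int NUM_INSTR_SOURCES = 4;

// Same field layout as the trace_instr_format struct quoted in this PR.
struct trace_instr_format {
    unsigned long long ip;       // instruction pointer (program counter) value
    unsigned char is_branch;     // is this a branch
    unsigned char branch_taken;  // if so, is it taken
    unsigned char destination_registers[NUM_INSTR_DESTINATIONS];
    unsigned char source_registers[NUM_INSTR_SOURCES];
    unsigned long long destination_memory[NUM_INSTR_DESTINATIONS];
    unsigned long long source_memory[NUM_INSTR_SOURCES];
};

// Hypothetical helper: build the record for a non-branch instruction.
trace_instr_format make_non_branch(unsigned long long ip)
{
    trace_instr_format cs{};  // value-initialisation zeroes every field
    cs.ip = ip;
    cs.is_branch = 0;
    cs.branch_taken = 0;
    return cs;
}
```

Note that this leaves every register and memory field at zero, so the resulting record carries no dependency information.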


The drmemtrace records now have different types for taken or untaken conditional branches.
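With separate record types for taken and untaken conditional branches, the ChampSim `is_branch` and `branch_taken` flags could in principle be derived from the record type alone. The sketch below assumes a simplified `instr_kind` enumeration; it is a hypothetical stand-in, and the actual drmemtrace type names should be taken from the DynamoRIO headers.

```cpp
// Hypothetical stand-in for the instruction record types relevant here.
enum class instr_kind {
    non_branch,
    taken_conditional,    // conditional branch whose condition held
    untaken_conditional,  // conditional branch that fell through
    unconditional,        // jumps, calls, and returns always transfer control
};

// The two branch-related fields of a ChampSim trace record.
struct branch_flags {
    unsigned char is_branch;
    unsigned char branch_taken;
};

// Map a record type to the ChampSim branch flags.
branch_flags to_branch_flags(instr_kind kind)
{
    switch (kind) {
    case instr_kind::taken_conditional: return {1, 1};
    case instr_kind::untaken_conditional: return {1, 0};
    case instr_kind::unconditional: return {1, 1};
    case instr_kind::non_branch: break;
    }
    return {0, 0};
}
```

Treating unconditional transfers as always taken is a design choice of this sketch, not something stated in the review.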


`./a.out <trace>`

`<trace>` should be a google workload trace file in compressed form with `.gz` extension. Google workload traces can be obtained [here](https://dynamorio.org/google_workload_traces.html).


drmemtrace traces are now in .zip format. Sample traces are in the https://github.com/DynamoRIO/drmemtrace_samples repository.

Collaborator

@ngober left a comment

This probably still needs work before it can be merged in. I think Derek's comments are important, and I also have some concerns about the ChampSim trace that is produced.

If we want to, we can merge this into its own branch and create a series of PRs against that. It would allow us to iterate on this over time.

Comment on lines +10 to +35

#define REG_STACK_POINTER 6
#define REG_FLAGS 25
#define REG_INSTRUCTION_POINTER 26

using namespace std;

#define NUM_INSTR_DESTINATIONS 2
#define NUM_INSTR_SOURCES 4

#define READ_BUFF_SIZE 12 * 1024
#define WRITE_BUFF_SIZE 64 * 1024

typedef struct trace_instr_format
{
    unsigned long long int ip; // instruction pointer (program counter) value

    unsigned char is_branch; // is this branch
    unsigned char branch_taken; // if so, is this taken

    unsigned char destination_registers[NUM_INSTR_DESTINATIONS]; // output registers
    unsigned char source_registers[NUM_INSTR_SOURCES]; // input registers

    unsigned long long int destination_memory[NUM_INSTR_DESTINATIONS]; // output memory
    unsigned long long int source_memory[NUM_INSTR_SOURCES]; // input memory
} trace_instr_format_t;
Collaborator

We should reference the trace format in the codebase already:

Suggested change
- #define REG_STACK_POINTER 6
- #define REG_FLAGS 25
- #define REG_INSTRUCTION_POINTER 26
- using namespace std;
- #define NUM_INSTR_DESTINATIONS 2
- #define NUM_INSTR_SOURCES 4
- #define READ_BUFF_SIZE 12 * 1024
- #define WRITE_BUFF_SIZE 64 * 1024
- typedef struct trace_instr_format
- {
-     unsigned long long int ip; // instruction pointer (program counter) value
-     unsigned char is_branch; // is this branch
-     unsigned char branch_taken; // if so, is this taken
-     unsigned char destination_registers[NUM_INSTR_DESTINATIONS]; // output registers
-     unsigned char source_registers[NUM_INSTR_SOURCES]; // input registers
-     unsigned long long int destination_memory[NUM_INSTR_DESTINATIONS]; // output memory
-     unsigned long long int source_memory[NUM_INSTR_SOURCES]; // input memory
- } trace_instr_format_t;
+ #include "../inc/trace_instruction.h"
+ using namespace std;
+ #define READ_BUFF_SIZE 12 * 1024
+ #define WRITE_BUFF_SIZE 64 * 1024
+ using trace_instr_format_t = input_instr;

This reduces the possibility for bugs that come from the formats being slightly different. You might need to reference the special registers with champsim::REG_STACK_POINTER.


cs.source_registers[1] = REG_INSTRUCTION_POINTER; // reads_ip
cs.destination_registers[0] = REG_STACK_POINTER; // writes_sp
cs.destination_registers[1] = REG_INSTRUCTION_POINTER; // writes_ip
cs.source_registers[2] = 99; // reads_other
Collaborator

I'm concerned that instructions do not get any register information unless they're branches. It implies that there are no data dependencies in the trace, which is almost certainly not true. Furthermore, all indirect branches have data dependencies on each other (on register 99), which could lead to some strange results.


I guess this is a limitation of the Google traces: they do not contain this information.

Collaborator

If we do not have this information, I'm not sure we can move forward converting the Google traces to ChampSim traces. Memory timing is extremely important, and register information is the way that ChampSim generates timing.


Regular drmemtrace traces have instruction encodings with this information so a converter is feasible there. The version 1 release of the Google Workload traces had that information stripped out, and most core studies then had to perform limit studies assuming full deps vs zero deps. We are working on a version 2 release of the Google workloads where we hope to get approval to release information on dependencies, though likely not full encodings.
