Data Desk is a project utility that parses a simple C-like data description format. Input files in this data description format are parsed to create corresponding abstract syntax trees which represent the information extracted from the files. These abstract syntax trees are then sent to project-specific custom code that is written by the user. This custom code is simply a dynamic library with a few exported functions that are used as callbacks for the parser. Below is a list of the callbacks.
- DataDeskCustomInitCallback(void) is called when the parser starts.
- DataDeskCustomFileCallback(DataDeskASTNode *root, char *filename) is called when the parser finishes parsing a file.
- DataDeskCustomConstantCallback(DataDeskConstant constant_info, char *filename) is called for every constant definition that is parsed.
- DataDeskCustomStructCallback(DataDeskStruct struct_info, char *filename) is called for every structure that is parsed.
- DataDeskCustomDeclarationCallback(DataDeskDeclaration declaration, char *filename) is called for every declaration that is parsed.
- DataDeskCustomCleanUpCallback(void) is called before the parser shuts down.
The abstract syntax tree is formed completely by DataDeskASTNode structures. Data Desk also offers a number of utility functions for introspecting on the abstract syntax trees it passes to your custom code. Both the structure and a list of these functions can be found in the data_desk.h file, which can be included into your custom layer.
To use Data Desk, you'll need to do a few things:
- Get Data Desk
- Make or get some Data Desk format files (.ds)
- Make a project-specific custom layer
- Run Data Desk
Step 1: Get Data Desk
- Run the command git clone https://github.com/ryanfleury/data_desk
build on Windows or
The build.bat script on Windows expects to find cl (MSVC), so your environment needs to know where it is. The easiest way to ensure this is to use one of the Visual Studio command prompts (titled x64 Native Tools Command Prompt for VS<version> or x86 Native Tools Command Prompt for VS<version>). Otherwise, you can call vcvarsall.bat, which is packaged with Visual Studio, in your terminal environment.
Step 2: Make or get Data Desk format files (.ds)
Grab an example or make your own.
Step 3: Make a project-specific custom layer
An easy way to write the code for this is to start from the custom layer template. Fill out the functions in your custom layer code however you want to. There are some helper functions available in data_desk.h that might be useful for you here. This file can be dropped into your code and used.
To build a custom layer, you just need to build a DLL with the function callbacks you've written as the appropriate exported symbols. data_desk.h outlines what symbols are used for each callback.
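As a rough illustration of the shape such a custom layer might take, the sketch below defines the six callbacks and simply counts what the parser reports. Several things here are assumptions rather than facts about Data Desk: the four types are placeholder definitions so that the sketch compiles on its own (a real custom layer would include data_desk.h instead), the void return types are assumed, and the CUSTOM_LAYER_EXPORT macro and the counters are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Placeholder types (assumption): a real custom layer would include
   data_desk.h instead of defining these itself. */
typedef struct DataDeskASTNode { int placeholder; } DataDeskASTNode;
typedef struct DataDeskConstant { int placeholder; } DataDeskConstant;
typedef struct DataDeskStruct { int placeholder; } DataDeskStruct;
typedef struct DataDeskDeclaration { int placeholder; } DataDeskDeclaration;

/* Hypothetical export macro so the callbacks are visible in the built library. */
#if defined(_WIN32)
#define CUSTOM_LAYER_EXPORT __declspec(dllexport)
#else
#define CUSTOM_LAYER_EXPORT
#endif

/* Hypothetical bookkeeping: how many of each item the parser reported. */
static int file_count = 0;
static int constant_count = 0;
static int struct_count = 0;
static int declaration_count = 0;

CUSTOM_LAYER_EXPORT void DataDeskCustomInitCallback(void)
{
    file_count = constant_count = struct_count = declaration_count = 0;
}

CUSTOM_LAYER_EXPORT void DataDeskCustomFileCallback(DataDeskASTNode *root, char *filename)
{
    (void)root; /* a real layer would walk the tree here */
    ++file_count;
    printf("finished parsing %s\n", filename);
}

CUSTOM_LAYER_EXPORT void DataDeskCustomConstantCallback(DataDeskConstant constant_info, char *filename)
{
    (void)constant_info;
    (void)filename;
    ++constant_count;
}

CUSTOM_LAYER_EXPORT void DataDeskCustomStructCallback(DataDeskStruct struct_info, char *filename)
{
    (void)struct_info;
    (void)filename;
    ++struct_count;
}

CUSTOM_LAYER_EXPORT void DataDeskCustomDeclarationCallback(DataDeskDeclaration declaration, char *filename)
{
    (void)declaration;
    (void)filename;
    ++declaration_count;
}

CUSTOM_LAYER_EXPORT void DataDeskCustomCleanUpCallback(void)
{
    printf("files: %d, constants: %d, structs: %d, declarations: %d\n",
           file_count, constant_count, struct_count, declaration_count);
}
```

Compiled as a dynamic library, a file like this would then be passed to Data Desk as the custom layer; the exact symbol names to export should be checked against data_desk.h.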
Step 4: Run Data Desk
To run Data Desk with your custom layer, you can use the following command template:
data_desk --custom /path/to/custom/layer /file/to/parse/1 /file/to/parse/2 ...
Data Desk (.ds) File Documentation
A valid Data Desk file is defined as a set of zero or more Declarations, Structs, Enums, Flagss, and Consts. Each of the following sections defines these (and what they are composed of).
- Numeric Constants
- String Constants
- Character Constants
- Binary Operators
- Constant Expressions
Identifiers are defined as a sequence of characters that begins with either an underscore or an alphabetic character, and contains only numeric characters, alphabetic characters, or underscores (similar to C).
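The identifier rule can be sketched as a small scanning function. This is an illustration of the rule as worded here, not Data Desk's own tokenizer; the function name scan_identifier is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns how many leading characters of `text` form an identifier:
   the first must be an underscore or a letter, and the rest may be
   letters, digits, or underscores. Returns 0 if no identifier starts here. */
size_t scan_identifier(const char *text)
{
    size_t length = 0;
    unsigned char first = (unsigned char)text[0];

    if (first != '_' && !isalpha(first)) {
        return 0;
    }
    while (text[length] == '_' || isalnum((unsigned char)text[length])) {
        ++length;
    }
    return length;
}
```

For instance, scanning "_foo1 bar" stops at the space, while "9abc" is rejected because it starts with a digit.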
Numeric constants (Numbers) are defined as a sequence of characters that begins with a numeric character, and contains only numeric characters, periods, or alphabetic characters.
NOTE: Data Desk does not guarantee that your numeric constants are correct as defined by programming languages. For example, Data Desk will interpret 1.2.3.a.b.c as a numeric constant. Because Data Desk does not evaluate numeric constants, it does not enforce their validity.
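To make the permissiveness of this rule concrete, here is a minimal sketch of a scanner that follows the wording above. It is not Data Desk's actual implementation, and scan_numeric_constant is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns how many leading characters of `text` form a numeric constant:
   the first must be a digit, and the rest may be digits, periods, or
   letters. No attempt is made to check that the result is a valid number. */
size_t scan_numeric_constant(const char *text)
{
    size_t length = 0;

    if (!isdigit((unsigned char)text[0])) {
        return 0;
    }
    while (text[length] == '.' || isalnum((unsigned char)text[length])) {
        ++length;
    }
    return length;
}
```

Under this rule the whole of 1.2.3.a.b.c is consumed as one token, which is why validation is left to whatever consumes the constant later.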
String constants (Strings) can be single-line or multi-line.
A single-line string constant is defined similarly to those in C. It begins with a double-quote character, and ends with a non-escaped double-quote character. Double-quote characters can be escaped with a backslash.
A multi-line string constant is defined as beginning with three double-quote characters ("""), and ending with three double-quote characters (""").
Character constants (Chars) are defined almost identically to single-line string constants, but with single-quote beginning and ending characters instead of double-quote characters.
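The string constant rules can be sketched as follows. This is an illustrative reading of the description, not the parser's real code: scan_string_constant is a hypothetical name, a leading run of three double-quotes is assumed to always open a multi-line string, and escape sequences are assumed to matter only in single-line strings. A character constant scanner would look the same as the single-line branch with single quotes.

```c
#include <stddef.h>
#include <string.h>

/* Returns the length of the string constant at the start of `text`,
   including its delimiters, or 0 if none starts here or it never ends. */
size_t scan_string_constant(const char *text)
{
    size_t i;

    /* Multi-line: opens with three double-quotes, closes at the next three. */
    if (strncmp(text, "\"\"\"", 3) == 0) {
        const char *end = strstr(text + 3, "\"\"\"");
        return end ? (size_t)(end - text) + 3 : 0;
    }

    /* Single-line: opens with one double-quote, closes at the first
       double-quote that is not preceded by a backslash escape. */
    if (text[0] != '"') {
        return 0;
    }
    for (i = 1; text[i] != '\0'; ++i) {
        if (text[i] == '\\' && text[i + 1] != '\0') {
            ++i; /* skip the escaped character */
        } else if (text[i] == '"') {
            return i + 1;
        }
    }
    return 0;
}
```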
Data Desk defines a subset of the binary operators found in C. It does not define shorthand assignment operators, like >>=. The following binary operators are defined (in order of ascending precedence):
<<: Left Bitshift
>>: Right Bitshift
&: Bitwise And
|: Bitwise Or
&&: Boolean And
||: Boolean Or
An expression (Expr) in Data Desk is defined as being one of the following:
Types are used in declarations. They are defined as being the following:
- A group of 0 or more * characters, representing the number of layers of indirection
- A type name, which can be:
  - An Identifier referring to a type name
  - A Struct definition (refer to next section)
- A group of 0 or more array size specifiers, being defined as a
Expression, and a
Declarations are defined as follows:
Structs are groups of zero or more declarations. They are defined as:
Zero or more
Enums are groups of one or more identifiers. They are defined as:
One or more
Identifiers, each followed by
When transpiled to C, these will be defined as a normal C enum; that is, the first one will be defined as a constant that evaluates to 0, the next to 1, and so on.
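As an illustration of what such output might look like, the C below shows a normal enum with three members. The names Fruit, FRUIT_APPLE, and so on are hypothetical; the actual naming depends on the input file and on how the custom layer emits code.

```c
/* Hypothetical C output for an Enum that lists three identifiers. */
enum Fruit
{
    FRUIT_APPLE,  /* first identifier: evaluates to 0 */
    FRUIT_BANANA, /* second identifier: evaluates to 1 */
    FRUIT_CHERRY  /* third identifier: evaluates to 2 */
};
```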
Flagss are groups of one or more identifiers. They are defined as:
One or more
Identifiers, each followed by
When transpiled to C, these will be defined as several C preprocessor macros that evaluate to unique bits inside of an integral value. These are similar to Enums, but their purpose is to define unique bits instead of unique integral values for a set of constants.
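A sketch of what the resulting macros could look like, and why unique bits are useful, is given below. The macro names and the has_flag helper are hypothetical and only serve the example.

```c
/* Hypothetical C output for a Flags group with three identifiers:
   each macro evaluates to a different single bit. */
#define RENDER_FLAG_VISIBLE     (1 << 0)
#define RENDER_FLAG_SHADOWED    (1 << 1)
#define RENDER_FLAG_TRANSPARENT (1 << 2)

/* Returns nonzero when `flag` is set in `flags`. Because every flag owns
   its own bit, several flags can be combined in one value with |. */
int has_flag(unsigned int flags, unsigned int flag)
{
    return (flags & flag) != 0;
}
```

For example, a value built as RENDER_FLAG_VISIBLE | RENDER_FLAG_TRANSPARENT reports both of those flags as set and RENDER_FLAG_SHADOWED as clear.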
Constant expressions (Consts) are defined as:
Comments are ignored by the parser. They can be single-line or multi-line.
Single-line comments begin with two / characters (//). They are terminated by a newline character.
Multi-line comments begin with a /* pattern and are terminated by a */ pattern. They can also be nested. For example, a comment opened with /*/* requires */*/ to terminate.
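Nesting of this kind is commonly handled with a depth counter. The sketch below shows that technique under the rule just described; it is not taken from Data Desk's source, and scan_multiline_comment is a hypothetical name.

```c
#include <stddef.h>

/* Returns the length of the multi-line comment at the start of `text`,
   including any nested comments, or 0 if no comment starts here or the
   comment is never fully closed. */
size_t scan_multiline_comment(const char *text)
{
    size_t i = 0;
    int depth = 0;

    if (text[0] != '/' || text[1] != '*') {
        return 0;
    }
    while (text[i] != '\0') {
        if (text[i] == '/' && text[i + 1] == '*') {
            ++depth; /* another comment opens */
            i += 2;
        } else if (text[i] == '*' && text[i + 1] == '/') {
            --depth; /* the innermost open comment closes */
            i += 2;
            if (depth == 0) {
                return i;
            }
        } else {
            ++i;
        }
    }
    return 0; /* ran out of input while still inside a comment */
}
```

A comment opened twice and closed only once leaves the counter at 1, so the scanner keeps consuming input, which matches the requirement that /*/* needs */*/ to terminate.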
Declarations (including those within Structs) and Consts can be preceded with one or more Tags. A Tag is defined as beginning with a @ character, and ending with whitespace. These are used to annotate meta-information about various things. They will be passed to custom-layer code.
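The tag rule is simple enough to sketch directly. The function below is an illustration only, with a hypothetical name, and it assumes that the end of the input also ends a tag.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the length of the tag at the start of `text`, including the
   leading '@', or 0 if `text` does not start with a tag. The tag runs
   until the first whitespace character (or the end of the input). */
size_t scan_tag(const char *text)
{
    size_t length;

    if (text[0] != '@') {
        return 0;
    }
    length = 1;
    while (text[length] != '\0' && !isspace((unsigned char)text[length])) {
        ++length;
    }
    return length;
}
```

Several tags in front of one item could be collected by calling this repeatedly and skipping the whitespace between them.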