Java has some built-in features for parsing binary data (for instance ByteBuffer), but I wanted to work with separate bits and describe binary structures in a strong DSL (domain specific language). I was very impressed by the Python struct package, so I decided to make something similar. So JBBP was born.
P.S. I have been actively using the framework in the ZX-Poly emulator to parse snapshot files and save results.
2.0.0 (20-nov-2019)
- removed DslBinCustom annotation, use @Bin annotation instead
- renamed attributes of @Bin annotation to their correct form
- reworked object mapping system, removed hacks to instantiate classes, now only mapping to objects allowed, support of private fields mapping is removed
- minimal JDK version now 1.8+
- minimal Android API now 3.0+
- added support of getters and setters into mapping
- added Object newInstance(Class) method support of mapped classes to generate local class member instances
- added generating of makeFIELD() method for structure types in Java class converter
- refactoring
1.4.1 (20-aug-2018)
- fixed incompatibility in tokenizer regex syntax for Android SDK #23
- added DslBinCustom annotation to provide way to mark custom type fields for JBBPDslBuilder
- fixed NPE in JBBPDslBuilder for not-provided outBitNumber attribute in annotated field marked as type BIT or BIT_ARRAY #20
- naming of fields has been made more tolerant, now it is allowed to have field names with names similar to data types
- improved check of field names in JBBPDslBuilder #21
1.4.0 (29-jul-2018)
- added type val which allows to create virtual field with calculated value, can play role of variable in scripts
- val and var have been added into reserved words and can't be used as field names
- added field outByteOrder attribute to Bin annotation, it affects logic of JBBPOut#Bin for output of annotated objects which fields should be saved with different byte order
- removed deprecated method JBBPFinderException#getNameOrPath
- added auxiliary class to build JBBP script
- added flag JBBPParser#FLAG_NEGATIVE_EXPRESSION_RESULT_AS_ZERO to recognize negative expression result as zero
- improved Java 6 class source generator to process FLAG_SKIP_REMAINING_FIELDS_IF_EOF for structure fields
- added stable automatic module name igormaznitsa.jbbp into manifest file
- added support of float, double and string java types, as floatj, doublej and stringj
1.3.0 (02-sep-2017)
- Fixed issue #16 "NullPointerException when referencing a JBBPCustomFieldTypeProcessor parsed field", many thanks to @use-sparingly for the bug report
- added Maven plugin to generate sources from JBBP scripts
- added Gradle plugin to generate sources from JBBP scripts
- added extra byte array reading writing methods with byte order support into JBBPBitInputStream and JBBPBitOutputStream
- added converter of compiled parser data into Java class sources (1.6+)
- added method to read unsigned short values as char [] into JBBPBitInputStream
- Class version target has been changed to Java 1.6
- fixed compatibility of tests with Java 1.6
- Minor refactoring
1.2.1 (28-JUL-2016)
- Fixed issue #10 "assertArrayLength throws exception in multi-thread", many thanks to @sky4star for the bug report.
- minor refactoring
The framework has been published to Maven Central and can easily be added as a dependency:
<dependency>
  <groupId>com.igormaznitsa</groupId>
  <artifactId>jbbp</artifactId>
  <version>2.0.0</version>
</dependency>
The precompiled library jar, javadoc and sources can also be downloaded directly from Maven Central.
The framework is very easy to use because its functionality is exposed through only two main classes: com.igormaznitsa.jbbp.JBBPParser (for data parsing) and com.igormaznitsa.jbbp.io.JBBPOut (for binary block writing). Both of them work over the low-level IO classes com.igormaznitsa.jbbp.io.JBBPBitInputStream and com.igormaznitsa.jbbp.io.JBBPBitOutputStream, which are the core of the framework.
The simplest case below shows how to parse a byte array into bits.
byte [] parsedBits = JBBPParser.prepare("bit:1 [_];").parse(new byte[]{1,2,3,4,5}).
findFieldForType(JBBPFieldArrayBit.class).getArray();
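To make it clearer what the script above asks for, here is a minimal plain-Java sketch of the same idea without JBBP. It assumes that the default bit order takes the least significant bit of each byte first; the class and method names are hypothetical and are not part of the framework.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of JBBP.
final class BitSplitSketch {

    // Splits every byte into eight one-bit values.
    // Assumption: the least significant bit of each byte comes first.
    static byte[] toBits(byte[] data) {
        byte[] bits = new byte[data.length * 8];
        for (int i = 0; i < data.length; i++) {
            for (int bit = 0; bit < 8; bit++) {
                bits[i * 8 + bit] = (byte) ((data[i] >>> bit) & 1);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toBits(new byte[]{1, 2, 3, 4, 5})));
    }
}
```

The result always has eight times as many elements as the input array, one element per bit.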
Of course, looking up parsed fields in the result is not always convenient, so you can map parsed data to class fields instead.
class Parsed {@Bin(type = BinType.BIT_ARRAY)byte[] parsed;}
Parsed parsedBits = JBBPParser.prepare("bit:1 [_] parsed;").parse(new byte[]{1,2,3,4,5}).mapTo(new Parsed());
Initially the framework was created to provide a comfortable way to parse data; it was not developed for high speed. Since version 1.3.0 it can generate Java class sources from JBBP parsers, which increases parsing speed dramatically (keep in mind that JBBP generates Java class sources but does not compile them automatically). I have run microbenchmarks of all parsing approaches to show the relative performance of each one.
The chart shows three standard ways to parse data with JBBP:
- Dynamic - parsing into internal structures through interpretation of a script written in the DSL. It is not very fast, but you can generate parsers on the fly, even from dynamically formed strings.
- Dynamic + map to class - parsing into internal structures through interpretation of a script and mapping of the parsed data directly into class instances. This way is very slow (because it uses reflection to fill fields) and is recommended only if convenient parsing is much preferred over speed.
- Static class - parsing with Java sources generated from a JBBP parser. It is the fastest way because the Java compiler and JIT can make optimizations. The approach can be used in high-load systems. It is possible to compile generated Java sources on the fly; you can take a look at the auxiliary class which I use in tests.
Since version 1.3.0, the framework can convert JBBP scripts into sources (the generated sources still need the JBBP framework to work). For instance, the simple snippet below generates Java classes from a JBBP script; potentially it can generate many classes, but usually only one:
JBBPParser parser = JBBPParser.prepare("byte a; byte b; byte c;");
List<ResultSrcItem> generated = parser.convertToSrc(TargetSources.JAVA, "com.test.jbbp.gen.SomeClazz");
for (ResultSrcItem i : generated) {
  for (Map.Entry<String, String> j : i.getResult().entrySet()) {
    System.out.println("Class file name " + j.getKey());
    System.out.println("Class file content " + j.getValue());
  }
}
There are also special plugins for Maven and Gradle that generate sources from JBBP scripts during the source generation phase.
In Maven you just add the following plugin execution:
<plugin>
  <groupId>com.igormaznitsa</groupId>
  <artifactId>jbbp-maven-plugin</artifactId>
  <version>2.0.0</version>
  <executions>
    <execution>
      <id>gen-jbbp-src</id>
      <goals>
        <goal>generate</goal>
      </goals>
    </execution>
  </executions>
</plugin>
By default the Maven plugin looks for files with the jbbp extension in the src/jbbp folder of the project (this can be changed in the options) and produces the resulting Java classes in the target/generated-sources/jbbp folder. I use this approach in the ZX-Poly emulator.
The example shows how to parse a byte written in non-standard MSB0 order (Java has LSB0 bit order) into bit fields, print their values and pack the fields back:
class Flags {
  @Bin(outOrder = 1, name = "f1", type = BinType.BIT, outBitNumber = JBBPBitNumber.BITS_1, comment = "It's flag one") byte flag1;
  @Bin(outOrder = 2, name = "f2", type = BinType.BIT, outBitNumber = JBBPBitNumber.BITS_2, comment = "It's second flag") byte flag2;
  @Bin(outOrder = 3, name = "f3", type = BinType.BIT, outBitNumber = JBBPBitNumber.BITS_1, comment = "It's 3th flag") byte flag3;
  @Bin(outOrder = 4, name = "f4", type = BinType.BIT, outBitNumber = JBBPBitNumber.BITS_4, comment = "It's 4th flag") byte flag4;
}
final int data = 0b10101010;
Flags parsed = JBBPParser.prepare("bit:1 f1; bit:2 f2; bit:1 f3; bit:4 f4;", JBBPBitOrder.MSB0).parse(new byte[]{(byte)data}).mapTo(new Flags());
assertEquals(1,parsed.flag1);
assertEquals(2,parsed.flag2);
assertEquals(0,parsed.flag3);
assertEquals(5,parsed.flag4);
System.out.println(new JBBPTextWriter().Bin(parsed).Close().toString());
assertEquals(data, JBBPOut.BeginBin(JBBPBitOrder.MSB0).Bin(parsed).End().toByteArray()[0] & 0xFF);
The example prints the text below to the console:
;--------------------------------------------------------------------------------
; START : Flags
;--------------------------------------------------------------------------------
01; f1, It's flag one
02; f2, It's second flag
00; f3, It's 3th flag
05; f4, It's 4th flag
;--------------------------------------------------------------------------------
; END : Flags
;--------------------------------------------------------------------------------
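The field values asserted in the example (1, 2, 0, 5 for the byte 0b10101010) can be reproduced by hand. The plain-Java sketch below is only an illustration and makes one assumption that the text above does not state: MSB0 input is modeled by reversing the bits of the byte and then cutting the fields from the low bits upward. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of JBBP.
final class Msb0FlagsSketch {

    // Reverses the bit order inside one byte (assumed model of MSB0 input).
    static int reverseBits(int value) {
        return Integer.reverse(value & 0xFF) >>> 24;
    }

    // Cuts fields of the given bit widths, starting from the lowest bits.
    static int[] split(int value, int... widths) {
        int[] result = new int[widths.length];
        int rest = value;
        for (int i = 0; i < widths.length; i++) {
            result[i] = rest & ((1 << widths[i]) - 1);
            rest >>>= widths[i];
        }
        return result;
    }

    // Packs the fields back in the same order; the inverse of split().
    static int pack(int[] fields, int... widths) {
        int value = 0;
        int shift = 0;
        for (int i = 0; i < widths.length; i++) {
            value |= (fields[i] & ((1 << widths[i]) - 1)) << shift;
            shift += widths[i];
        }
        return value;
    }

    public static void main(String[] args) {
        int[] flags = split(reverseBits(0b10101010), 1, 2, 1, 4);
        System.out.println(Arrays.toString(flags)); // [1, 2, 0, 5]
        System.out.println(Integer.toBinaryString(reverseBits(pack(flags, 1, 2, 1, 4)))); // 10101010
    }
}
```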
Every field can have a case-insensitive name, which must not contain '.' (reserved for links to structure field values) or '#' (reserved for internal use). A field name must not start with a digit or with the characters '$' and '_'.
int someNamedField;
byte field1;
byte field2;
byte field3;
The framework supports the full set of Java numeric primitives, plus extra types like ubyte and bit.
The framework provides support for arrays and structures. Just keep in mind that an expression can refer only to field values defined before the expression.
It is possible to define processors for your own custom data types; for instance, you can take a look at the case that processes three-byte unsigned integer types.
The parser does not support Java float and double types out of the box, but they can be implemented through a custom type processor. There is a written example and test, and the code can be copied and pasted.
If you have data whose structure is variable, you can use the var type for a field and process reading of the data manually with a custom JBBPVarFieldProcessor instance.
final JBBPParser parser = JBBPParser.prepare("short k; var; int;");
final JBBPIntCounter counter = new JBBPIntCounter();
final JBBPFieldStruct struct = parser.parse(new byte[]{9, 8, 33, 1, 2, 3, 4}, new JBBPVarFieldProcessor() {
  public JBBPAbstractArrayField<? extends JBBPAbstractField> readVarArray(final JBBPBitInputStream inStream, final int arraySize, final JBBPNamedFieldInfo fieldName, final int extraValue, final JBBPByteOrder byteOrder, final JBBPNamedNumericFieldMap numericFieldMap) throws IOException {
    fail("Must not be called");
    return null;
  }

  public JBBPAbstractField readVarField(final JBBPBitInputStream inStream, final JBBPNamedFieldInfo fieldName, final int extraValue, final JBBPByteOrder byteOrder, final JBBPNamedNumericFieldMap numericFieldMap) throws IOException {
    final int value = inStream.readByte();
    return new JBBPFieldByte(fieldName, (byte) value);
  }
}, null);
NB! Some programmers try to use a single parser for complex data, which is a mistake. In that case it is much better to have several simple parsers working with the same JBBPBitInputStream instance; this keeps the decision points on the Java level and makes the solution easier.
Special types perform actions such as skipping data in the input stream.
Every multi-byte type can be read with different byte order.
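As a reminder of what a different byte order means for a multi-byte value, here is a small sketch in plain Java using the standard ByteBuffer rather than JBBP. The class name is hypothetical; the same four bytes produce two different int values depending on the order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration, not part of JBBP.
final class ByteOrderSketch {

    // Reads the first four bytes as an int in the requested byte order.
    static int readInt(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, 0x03, 0x04};
        // Big-endian: the first byte is the most significant one.
        System.out.printf("%08X%n", readInt(data, ByteOrder.BIG_ENDIAN));    // 01020304
        // Little-endian: the first byte is the least significant one.
        System.out.printf("%08X%n", readInt(data, ByteOrder.LITTLE_ENDIAN)); // 04030201
    }
}
```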
Expressions are used for calculation of length of arrays and allow brackets and integer operators which work similar to Java operators:
- Arithmetic operators: +,-,*,/,%
- Bit operators: &,|,^,~
- Shift operators: <<,>>,>>>
- Brackets: (, )
Inside an expression you can use integer numbers and named field values, referenced by their names (if the fields are in the same structure) or by their paths. Keep in mind that you can't use array fields or fields placed inside structure arrays.
int field1;
struct1 {
int field2;
}
byte [field1+struct1.field2] data;
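For comparison, the script above corresponds roughly to the following hand-written reader in plain Java. This is a sketch that assumes big-endian int values, which is what ByteBuffer uses by default; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not part of JBBP.
final class ArrayLengthSketch {

    // Reads two ints and then a byte array whose length is their sum.
    static byte[] readData(byte[] input) {
        ByteBuffer in = ByteBuffer.wrap(input); // big-endian by default
        int field1 = in.getInt();               // int field1;
        int field2 = in.getInt();               // struct1 { int field2; }
        byte[] data = new byte[field1 + field2]; // byte [field1+struct1.field2] data;
        in.get(data);
        return data;
    }

    public static void main(String[] args) {
        byte[] input = {0, 0, 0, 1, 0, 0, 0, 2, 10, 20, 30};
        System.out.println(readData(input).length); // 3
    }
}
```

The length expression can only use values that have already been read, which is why fields must be defined before the expression that refers to them.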
You can use comments inside a parser script. The parser supports only one comment format: all text after '//' till the end of the line is recognized as a comment.
int;
// hello commentaries
byte field;
Inside an expression you can use field names and field paths. You can also use the special macro '$$', which represents the current input stream byte counter. All fields starting with '$' are recognized by the parser as special user-defined variables, and it requests their values from a special user-defined provider. If the array size contains only the '_' symbol, the field or structure has no defined size and the whole stream will be read.
The result of parsing is an instance of the com.igormaznitsa.jbbp.model.JBBPFieldStruct class, which represents the invisible root structure of the parsed data; you can use its methods to find the desired fields by their names, paths or classes. All fields are successors of the com.igormaznitsa.jbbp.model.JBBPAbstractField class. For more convenience, use mapping to classes, where the mapper automatically places values into the fields of a Java class.
The example below shows how to parse a PNG file with the JBBP parser (the example is taken from the tests):
final InputStream pngStream = getResourceAsInputStream("picture.png");
try {
  final JBBPParser pngParser = JBBPParser.prepare(
      "long header;"
      + "// chunks\n"
      + "chunk [_]{"
      + " int length; "
      + " int type; "
      + " byte[length] data; "
      + " int crc;"
      + "}"
  );
  final JBBPFieldStruct result = pngParser.parse(pngStream);
  assertEquals(0x89504E470D0A1A0AL, result.findFieldForNameAndType("header", JBBPFieldLong.class).getAsLong());
  final JBBPFieldArrayStruct chunks = result.findFieldForNameAndType("chunk", JBBPFieldArrayStruct.class);
  final String[] chunkNames = new String[]{"IHDR", "gAMA", "bKGD", "pHYs", "tIME", "tEXt", "IDAT", "IEND"};
  final int[] chunkSizes = new int[]{0x0D, 0x04, 0x06, 0x09, 0x07, 0x19, 0x0E5F, 0x00};
  assertEquals(chunkNames.length, chunks.size());
  for (int i = 0; i < chunks.size(); i++) {
    assertChunk(chunkNames[i], chunkSizes[i], (JBBPFieldStruct) chunks.getElementAt(i));
  }
} finally {
  closeResource(pngStream);
}
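To show what the PNG script describes, the sketch below reads the same layout by hand in plain Java: an 8-byte header followed by chunks of length, type, data and CRC until the input ends. It is only an illustration with hypothetical class names, it does not validate CRC values, and it is not how JBBP works internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of JBBP.
final class PngChunkSketch {

    static final class Chunk {
        final String type;
        final byte[] data;
        final int crc;

        Chunk(String type, byte[] data, int crc) {
            this.type = type;
            this.data = data;
            this.crc = crc;
        }
    }

    // Reads "long header; chunk [_] { int length; int type; byte[length] data; int crc; }".
    static List<Chunk> readChunks(byte[] png) {
        ByteBuffer in = ByteBuffer.wrap(png); // big-endian by default
        if (in.getLong() != 0x89504E470D0A1A0AL) {
            throw new IllegalArgumentException("Not a PNG signature");
        }
        List<Chunk> chunks = new ArrayList<>();
        while (in.hasRemaining()) {           // "[_]" means: read until the stream ends
            int length = in.getInt();
            byte[] type = new byte[4];
            in.get(type);
            byte[] data = new byte[length];   // the array size comes from the field read before it
            in.get(data);
            int crc = in.getInt();
            chunks.add(new Chunk(new String(type, StandardCharsets.US_ASCII), data, crc));
        }
        return chunks;
    }
}
```

Compared with this hand-written loop, the JBBP script expresses the same structure in seven lines and leaves the stream handling to the framework.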
It is also possible to map the parsed packet to class fields:
final JBBPParser pngParser = JBBPParser.prepare(
    "long header;"
    + "chunk [_]{"
    + " int length; "
    + " int type; "
    + " byte[length] data; "
    + " int crc;"
    + "}"
);

class Chunk {
  @Bin int length;
  @Bin int type;
  @Bin byte[] data;
  @Bin int crc;
}

@Bin
class Png {
  long header;
  Chunk[] chunk;

  public Object newInstance(Class<?> klazz) {
    return klazz == Chunk.class ? new Chunk() : null;
  }
}

final Png png = pngParser.parse(pngStream).mapTo(new Png());
The example from the tests shows how to parse a TCP frame wrapped in a network frame:
final JBBPParser tcpParser = JBBPParser.prepare(
    "skip:34; // skip bytes till the frame\n"
    + "ushort SourcePort;"
    + "ushort DestinationPort;"
    + "int SequenceNumber;"
    + "int AcknowledgementNumber;"
    + "bit:1 NONCE;"
    + "bit:3 RESERVED;"
    + "bit:4 HLEN;"
    + "bit:1 FIN;"
    + "bit:1 SYN;"
    + "bit:1 RST;"
    + "bit:1 PSH;"
    + "bit:1 ACK;"
    + "bit:1 URG;"
    + "bit:1 ECNECHO;"
    + "bit:1 CWR;"
    + "ushort WindowSize;"
    + "ushort TCPCheckSum;"
    + "ushort UrgentPointer;"
    + "byte [$$-34-HLEN*4] Option;"
    + "byte [_] Data;"
);
// Parse with the TCP parser defined above
final JBBPFieldStruct result = tcpParser.parse(tcpFrameStream);
No, @Bin annotations in classes are used only for mapping and data writing, but there is a code snippet that allows generating JBBP DSL based on the @Bin annotations detected in a class.
No problem! The parser works over the com.igormaznitsa.jbbp.io.JBBPBitInputStream class, which can be used directly and allows you to read bits and bytes, count bytes, and align data from a stream (for output there is a similar class, JBBPBitOutputStream).
The framework contains a special helper class, com.igormaznitsa.jbbp.io.JBBPOut, which allows building binary blocks with a kind of DSL:
import static com.igormaznitsa.jbbp.io.JBBPOut.*;
...
final byte[] array =
    BeginBin().
        Bit(1, 2, 3, 0).
        Bit(true, false, true).
        Align().
        Byte(5).
        Short(1, 2, 3, 4, 5).
        Bool(true, false, true, true).
        Int(0xABCDEF23, 0xCAFEBABE).
        Long(0x123456789ABCDEF1L, 0x212356239091AB32L).
        End().toByteArray();
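The idea behind the bit-writing and alignment steps of such a builder can be sketched in plain Java. The class below is a hypothetical illustration, not the JBBPOut implementation: it assumes that only the lowest bit of each value is written, that bits fill a byte from bit 0 upward, and that aligning pads the current byte with zero bits. The real class may differ in details.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration, not part of JBBP.
final class BitPackSketch {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int buffer; // bits collected for the current byte
    private int count;  // number of bits already collected

    // Appends the lowest bit of every value, filling each byte from bit 0 upward (assumption).
    BitPackSketch bit(int... values) {
        for (int value : values) {
            buffer |= (value & 1) << count;
            if (++count == 8) {
                flush();
            }
        }
        return this;
    }

    // Pads the current byte with zero bits so the next write starts on a byte boundary.
    BitPackSketch align() {
        if (count > 0) {
            flush();
        }
        return this;
    }

    // Returns everything written so far, aligning any pending bits first.
    byte[] toByteArray() {
        align();
        return out.toByteArray();
    }

    private void flush() {
        out.write(buffer);
        buffer = 0;
        count = 0;
    }

    public static void main(String[] args) {
        // Seven bits 1,0,1,0,1,0,1 followed by alignment give one byte in this sketch.
        byte[] packed = new BitPackSketch().bit(1, 2, 3, 0).bit(1, 0, 1).align().toByteArray();
        System.out.printf("%02X%n", packed[0] & 0xFF); // 55
    }
}
```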