/
doc.go
163 lines (162 loc) · 6.5 KB
/
doc.go
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
// Copyright ©2018 The go-hep Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package groot provides a pure-go read/write-access to ROOT files.
//
// A typical usage is as follows:
//
// f, err := groot.Open("ntup.root")
// if err != nil {
// log.Fatal(err)
// }
// defer f.Close()
//
// obj, err := f.Get("tree")
// if err != nil {
// log.Fatal(err)
// }
// tree := obj.(rtree.Tree)
// fmt.Printf("entries= %v\n", tree.Entries())
//
// More complete examples on how to iterate over the content of a Tree can
// be found in the examples attached to groot.TreeScanner and groot.Scanner:
// https://godoc.org/go-hep.org/x/hep/groot/rtree#pkg-examples
//
// Another possibility is to look at:
// https://godoc.org/go-hep.org/x/hep/groot/cmd/root-ls,
// a command that inspects the content of ROOT files.
//
//
// File layout
//
// ROOT files are a suite of consecutive data records.
// Each data record consists of a header part, called a TKey, and a payload
// whose content, length and meaning are described by the header.
// The current ROOT file format encodes all data in big endian.
//
// ROOT files initially only supported 32b addressing.
// Large files support (>4Gb) was added later on by migrating to a 64b addressing.
//
// The on-disk binary layout of a ROOT file header looks like this:
//
// Type | Record Name | Description
// =================+=============+===========================================
// [4]byte | "root" | Root file identifier
// int32 | fVersion | File format version
// int32 | fBEGIN | Pointer to first data record
// int32 [int64] | fEND | Pointer to first free word at the EOF
// int32 [int64] | fSeekFree | Pointer to FREE data record
// int32 | fNbytesFree | Number of bytes in FREE data record
// int32 | nfree | Number of free data records
// int32 | fNbytesName | Number of bytes in TNamed at creation time
// byte | fUnits | Number of bytes for file pointers
// int32 | fCompress | Compression level and algorithm
// int32 [int64] | fSeekInfo | Pointer to TStreamerInfo record
// int32 | fNbytesInfo | Number of bytes in TStreamerInfo record
// [18]byte | fUUID | Universal Unique ID
// =================+=============+===========================================
//
// This is followed by a sequence of data records, starting at the fBEGIN
// offset from the beginning of the file.
//
// The on-disk binary layout of a data record is:
//
// Type | Member Name | Description
// ===============+=============+===========================================
// int32 | Nbytes | Length of compressed object (in bytes)
// int16 | Version | TKey version identifier
// int32 | ObjLen | Length of uncompressed object
// int32 | Datime | Date and time when object was written to file
// int16 | KeyLen | Length of the key structure (in bytes)
// int16 | Cycle | Cycle of key
// int32 [int64] | SeekKey | Pointer to record itself (consistency check)
// int32 [int64] | SeekPdir | Pointer to directory header
// byte | lname | Number of bytes in the class name
// []byte | ClassName | Object Class Name
// byte | lname | Number of bytes in the object name
// []byte | Name | Name of the object
// byte | lTitle | Number of bytes in the object title
// []byte | Title | Title of the object
// []byte | DATA | Data bytes associated to the object
// ===============+=============+===========================================
//
// The high-level on-disk representation of a ROOT file is thus:
//
// +===============+ -- 0
// | |
// | File Header |
// | |
// +===============+ -- fBEGIN offset
// | |
// | Record Header | -->-+
// | | |
// +---------------+ |
// | | |
// | Record Data | | Reference to next Record
// | Payload | |
// | | |
// +===============+ <---+
// | |
// | Record Header | -->-+
// | | |
// +---------------+ |
// | | |
// | Record Data | | Reference to next Record
// | Payload | |
// | | |
// +===============+ <---+
// | |
// ...
//
// | |
// +===============+ -- fSeekInfo
// | |
// | Record Header | -->-+
// | | |
// +---------------+ |
// | | |
// | Record Data | | Reference to next Record
// | Payload | |
// | | |
// +===============+ <---+ -- fEND offset
//
// Data records payloads and how to deserialize them are described by a TStreamerInfo.
// The list of all TStreamerInfos that are used to interpret the content of
// a ROOT file is stored at the end of that ROOT file, at offset fSeekInfo.
//
//
// Data records
//
// Data records' payloads may be compressed.
// Detecting whether a payload is compressed is usually done by comparing
// the object length (ObjLen) field of the record header with the length
// of the compressed object (Nbytes) field.
// If they differ after having subtracted the record header length, then
// the payload has been compressed.
//
// A record data payload is itself split into multiple chunks of maximum
// size 16*1024*1024 bytes.
// Each chunk consists of:
//
// - the chunk header,
// - the chunk compressed payload.
//
// The chunk header:
//
// - 3 bytes to identify the compression algorithm and version,
// - 3 bytes to identify the deflated buffer size,
// - 3 bytes to identify the inflated buffer size.
//
// Streamer informations
//
// Streamers describe how a given type, for a given version of that type, is
// written on disk.
// In C++/ROOT, a streamer is represented as a TStreamerInfo class that can
// give metadata about the type it's describing (version, name).
// When reading a file, all the streamer infos are read back in memory, from
// disk, by reading the data record at offset fSeekInfo.
// A streamer info is actually a list of streamer elements, one for each field
// and, in C++, base class (in Go, this is emulated as an embedded field.)
//
// TODO: groot can not write trees yet.
package groot // import "go-hep.org/x/hep/groot"