fleetd_logs table #19489
Conversation
roperzh: very nice idea of using MultiLevelWriter, I took a quick look and left a couple of questions!
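For context, a minimal sketch of the MultiLevelWriter idea being referenced, assuming zerolog's MultiLevelWriter; the in-memory buffer destination is illustrative, not the PR's actual writer:

package main

import (
	"bytes"
	"os"

	"github.com/rs/zerolog"
)

func main() {
	// Duplicate every log line to an in-memory buffer in addition to
	// stderr, so buffered entries can later be served from a table.
	var mem bytes.Buffer
	multi := zerolog.MultiLevelWriter(os.Stderr, &mem)
	logger := zerolog.New(multi).With().Timestamp().Logger()
	logger.Info().Msg("hello")
}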
// empty string when the error is empty
row["time"] = entry.Time
row["level"] = entry.Level.String()
row["payload"] = string(entry.Payload)
roperzh: do you know if the compiler is smart enough to not do a full copy for each row here? I wonder if it's worth it to store it as a string in the first place.
Author: This makes sense to me, I can store it as a string. It will also stop it from re-interpreting the byte array of each record every time this table is queried.
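A sketch of that change, with a hypothetical entry type; the field names are illustrative, only the string(payload) conversion point is the substance:

package logs

// logEntry stores the payload as a string, so the []byte is copied once at
// write time instead of being re-converted on every fleetd_logs query.
type logEntry struct {
	Time    int64
	Level   string
	Payload string
}

func newLogEntry(t int64, level string, payload []byte) logEntry {
	return logEntry{Time: t, Level: level, Payload: string(payload)}
}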
l.logs = append(l.logs, msg)

if MaxEntries > 0 && len(l.logs) > int(MaxEntries) {
	l.logs = l.logs[len(l.logs)-int(MaxEntries):]
}
roperzh: can you sanity check my logic here?
- Once we reach MaxEntries, won't this keep allocating and copying 10_000 elements to new arrays for the slices to point to?
- Minor: can we initialize the slice capacity to avoid reallocation while the slice is filling in for the first time? (see the sketch after this list)
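A sketch of the preallocation the second bullet asks about; the writer and element types are illustrative, only MaxEntries comes from the diff:

package logs

const MaxEntries = 10_000

type logWriter struct {
	logs []string
}

// newLogWriter allocates the full backing array up front, so the buffer
// never reallocates while it fills for the first time.
func newLogWriter() *logWriter {
	return &logWriter{logs: make([]string, 0, MaxEntries)}
}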
Author: Yeah, so this is a good question. What should happen is that the underlying array remains and we only reference items 1-10_000 (len=9_999, cap=). The array only gets reallocated once we exceed the cap. That being said, if we keep sliding the slice over, at some point it will hit the cap and get reallocated. I'm actually not sure if it would reallocate a larger array, or notice that even though we're at the end of the array we only have len=10_000 and keep the allocation small.
Author: Sorry, I got the numbers wrong; we append to the array before shifting, so it should be 10_001 and 10_000.
Author: If you run this you'll see that the capacity keeps shrinking as we shift the slice, and the array only gets reallocated once len and cap match:
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	max := 25
	arr := []bool{}
	for range 1000 {
		arr = append(arr, true)
		if len(arr) > max {
			arr = arr[len(arr)-max:]
		}
		fmt.Printf("ptr=%p len=%d, cap=%d\n", unsafe.SliceData(arr), len(arr), cap(arr))
	}
}
Author: That being said, a ring buffer would probably be a better data structure for this, since then we would never need to reallocate anything.
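A minimal sketch of that ring-buffer idea; the entry type and names are illustrative, not from the PR:

package main

import "fmt"

type ring struct {
	buf   []string // fixed-size backing array, allocated once
	start int      // index of the oldest entry
	n     int      // number of entries currently stored
}

func newRing(capacity int) *ring {
	return &ring{buf: make([]string, capacity)}
}

// add overwrites the oldest entry once the buffer is full, so no
// reallocation or copying ever happens.
func (r *ring) add(msg string) {
	i := (r.start + r.n) % len(r.buf)
	r.buf[i] = msg
	if r.n < len(r.buf) {
		r.n++
	} else {
		r.start = (r.start + 1) % len(r.buf)
	}
}

// entries returns the stored entries in insertion order.
func (r *ring) entries() []string {
	out := make([]string, 0, r.n)
	for i := 0; i < r.n; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	return out
}

func main() {
	r := newRing(3)
	for _, m := range []string{"a", "b", "c", "d", "e"} {
		r.add(m)
	}
	fmt.Println(r.entries()) // [c d e]
}

The trade-off is a second index field and the modular arithmetic; in exchange the backing array is allocated exactly once.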
roperzh: yes, that's what I'm worried about (I don't know of a quick and easy way to sanity check). I'm concerned that the underlying array will keep on growing (the copies are amortized constant time, so they're not that bad), but I think all the logs will be carried over.
Check out this section: https://go.dev/blog/slices-intro#a-possible-gotcha
roperzh: I commented that ^ without seeing your code example, taking a look now.
Author: If we add 1 million entries to an array capped at 10,000, it only reallocates fewer than 300 times:
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	max := 10_000
	arr := []bool{}
	reallocs := 0
	for range 1_000_000 {
		origCap := cap(arr)
		arr = append(arr, true)
		if len(arr) > max {
			arr = arr[len(arr)-max:]
		}
		newCap := cap(arr)
		if origCap < newCap {
			println("reallocated")
			reallocs++
		}
		fmt.Printf("ptr=%p len=%d, cap=%d\n", unsafe.SliceData(arr), len(arr), cap(arr))
	}
	println("Reallocs:", reallocs)
}

Output: Reallocs: 293
Author: Still more than we want, though, perhaps.
roperzh: I'm okay with reallocations since they're amortized; I was worried about us keeping all the logs in memory. I think we proved that we're good, thank you!
Just for kicks I also did this contrived example: https://go.dev/play/p/0qzpJ54-NR_j
dec := json.NewDecoder(bytes.NewReader(event))
dec.UseNumber()
if err := dec.Decode(&evt); err != nil {
	return Event{}, fmt.Errorf("cannot decode: %w", err)
}
roperzh: this is a tricky API, I learned this from Martin:
The problem with json.Decoder is that it consumes a stream of JSON values, so when it's done decoding one value, it is not guaranteed that it consumed the whole input (e.g. https://pkg.go.dev/encoding/json#example-Decoder). What we should do to enforce that is check that it returns io.EOF (and eat that error as a success).
Author: Their console writer logger only processes one message at a time, so I suspect they only send one per write call. I'll add the code to process multiple entries though, it won't be hard.
roperzh: I was thinking of just checking for io.EOF, to prevent malformed input.
Author: Malformed input should return a non-EOF error, I think? io.EOF will only get returned if we read the input in a loop and there is no more data after the first object is parsed.
roperzh: not always, take this for example:

{"foo": "bar"}{

here the decoder will decode the first token even though the JSON as a whole is invalid. It's a footgun of the decoder: if you're decoding a single token, you should keep consuming tokens until you reach io.EOF.
In this case, since you're expecting a single token, we can fail if the next one is not io.EOF.
Author: Understood, I have added support for io.EOF 🙂
roperzh: (you can see in the official docs I linked an elegant way to do it)
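For reference, a sketch of the io.EOF check being discussed, extending the decode snippet from the diff; the Event type is a stand-in for the real one:

package events

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// Event is a stand-in for the real type in the PR.
type Event struct{}

func decodeEvent(event []byte) (Event, error) {
	var evt Event
	dec := json.NewDecoder(bytes.NewReader(event))
	dec.UseNumber()
	if err := dec.Decode(&evt); err != nil {
		return Event{}, fmt.Errorf("cannot decode: %w", err)
	}
	// json.Decoder reads a stream of values and may stop before the end of
	// the input; requiring io.EOF here rejects trailing garbage such as
	// `{"foo": "bar"}{`.
	if _, err := dec.Token(); err != io.EOF {
		return Event{}, fmt.Errorf("unexpected data after JSON value")
	}
	return evt, nil
}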
#18234
Changes file added in changes/, orbit/changes/ or ee/fleetd-chrome/changes. See Changes files for more information.