Add support for compressed QMDL by wgreenberg · Pull Request #970 · EFForg/rayhunter

wgreenberg · 2026-03-31T02:59:48Z

this is prerequisite work for #81, since the diag logs for things like RSSI massively increase the size of QMDL files. in my experience, simply gzipping the qmdls reduces their size by 4-5x, which i think should be sufficient for our purposes.

this PR reworks QmdlWriter to output gzipped QMDL files by default, and allows QmdlReader to operate on either compressed or uncompressed QMDLs.

QmdlReader has been significantly rewritten to expose a single AsyncRead interface to both compressed and uncompressed QMDL sources.

i'd still like to do some more in-depth testing of this, but in the meantime i'd love a review on it


bmw · 2026-04-03T20:04:00Z

-            self.total_written += msg.data.len();
+            // for a gzipped file, we can't use `msg.data.len()` to
+            // determine the number of bytes written, so we have to
+            // manually do a `write_all()` type loop


i'm not understanding this and i'm hoping you can help me

if what we're tracking here is the number of uncompressed bytes, why can't we still use write_all and msg.data.len()?

ah you're totally right, this is an outdated comment/implementation from when i was trying to track total_bytes_written rather than total_uncompressed_bytes

ah ok. that makes sense

i do want to flag tho that when i try changing this back to use write_all, the tests fail so there may be another reason to use this approach which i also don't understand right now lol

i wager it's due to a missing .flush() call -- i'll push a commit w/ working tests in a sec
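To make the point in this thread concrete, here is a minimal sketch of counting uncompressed bytes with `write_all` plus an explicit `flush()`. It is not the PR's code: `CountingWriter` and its fields are hypothetical names, and `BufWriter` stands in for the gzip encoder, since both can hold bytes internally until flushed.

```rust
use std::io::{BufWriter, Write};

/// Hypothetical stand-in for QmdlWriter: wraps any buffering `Write`
/// and tracks how many uncompressed bytes the caller handed over.
struct CountingWriter<W: Write> {
    inner: W,
    total_uncompressed_bytes: usize,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, total_uncompressed_bytes: 0 }
    }

    fn write_container(&mut self, data: &[u8]) -> std::io::Result<()> {
        // write_all either accepts all of `data` or returns an error,
        // so data.len() is the uncompressed byte count on success.
        self.inner.write_all(data)?;
        self.total_uncompressed_bytes += data.len();
        // Without this flush, the bytes may still be sitting in the
        // inner writer's buffer rather than in the underlying sink.
        self.inner.flush()
    }
}

/// Small usage example: returns (bytes counted, bytes in the sink).
fn demo() -> std::io::Result<(usize, usize)> {
    let mut writer = CountingWriter::new(BufWriter::new(Vec::new()));
    writer.write_container(b"\x7e\x01\x02\x7e")?;
    writer.write_container(b"\x7e\x03\x7e")?;
    // get_ref() exposes the Vec behind the BufWriter without flushing.
    Ok((writer.total_uncompressed_bytes, writer.inner.get_ref().len()))
}
```

Flushing after every container is a choice made for this sketch only. With a real gzip encoder, frequent flushes may cost some compression ratio, so the right flush point is a judgment call for the actual implementation.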

bmw

i didn't make the time to really test this (at least not yet), but i wanted to share the comments i had after reading the code and playing with things just a little bit so you could continue working on this

for my benefit at least as much as yours for when i come back to this, my comment above about continuing to use write_all in write_container has not been fully addressed yet

bmw · 2026-04-07T16:56:41Z

+    let compressed = qmdl_path.ends_with(".gz");
    let qmdl_file_size = qmdl_file.metadata().await.unwrap().len();
-    let mut qmdl_reader = QmdlReader::new(qmdl_file, Some(qmdl_file_size as usize));
+    let mut qmdl_reader = QmdlReader::new(qmdl_file, compressed, Some(qmdl_file_size as usize));


the qmdl_file_size value won't work here if compressed is true right? it should probably be omitted like it was above

bmw · 2026-04-07T18:22:46Z

                start_time: start_time.into(),
                last_message_time: Some(last_message_time.into()),
-                qmdl_size_bytes: metadata.size() as usize,
+                uncompressed_qmdl_size_bytes: metadata.size() as usize,


isn't metadata.size() going to be the compressed size for qmdl.gz files?

fixing this seems tricky. it seems to me like we either have to

1. read each file in its entirety to determine the real uncompressed file size
2. do the refactoring i believe you were talking about of removing the need for tracking max file size entirely by working at the level of HDLC containers

the latter seems much cleaner, but idk how much work it is

what do you think?

bmw · 2026-04-07T19:23:54Z

-            reader: BufReader::new(reader),
-            bytes_read: 0,
-            max_bytes,
+            buf_reader: BufReader::new(QmdlAsyncReader::new(


if we're going to continue accepting max_uncompressed_bytes in this function, what about using an approach something like this and dropping the handling of max bytes inside QmdlAsyncReader entirely? i think it'd allow us to simplify the code significantly
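One possible reading of this suggestion, sketched with the synchronous standard library rather than the async types the PR uses: cap the reader with `Read::take` before buffering it, so the inner reader carries no limit logic. `limited_reader` is a hypothetical helper, not a function from the PR. tokio's `AsyncReadExt` offers a similar `take` adapter for async readers.

```rust
use std::io::{BufReader, Read};

/// Hypothetical helper: cap any reader at `max_uncompressed_bytes`
/// (when given) before buffering it, so the inner reader needs no
/// limit-tracking of its own.
fn limited_reader<R: Read + 'static>(
    reader: R,
    max_uncompressed_bytes: Option<u64>,
) -> BufReader<Box<dyn Read>> {
    let inner: Box<dyn Read> = match max_uncompressed_bytes {
        // take(max) reports EOF once `max` bytes have been read.
        Some(max) => Box::new(reader.take(max)),
        None => Box::new(reader),
    };
    BufReader::new(inner)
}
```

In this sketch the limit applies to whatever bytes the wrapped reader yields, so wrapping a decompressing reader would bound the uncompressed byte count.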

bmw · 2026-04-07T19:46:22Z

            {
-                let entry =
-                    ZipEntryBuilder::new(format!("{qmdl_idx}.qmdl").into(), Compression::Stored);
+                let extension = if compressed { "qmdl.gz" } else { "qmdl" };


i didn't test this, but won't the resulting file here always be a qmdl file as reading using QmdlReader below will decompress it?

i personally think only including qmdl files in the zip is the nicest behavior for users anyway, but we should make sure the file extension is right regardless. if i'm correct and it should always just be qmdl, it'll simplify the code and diff here

yup, i ran into this while refactoring the write_all implementation, and am currently stuck rooting out the empty zip bug you mentioned below

ah ok. let me know if you'd like a 2nd set of eyes on the empty zip problem. i hit it when trying to verify my understanding of the code here, but otherwise didn't really dig into it

bmw · 2026-04-07T20:30:08Z

        let body_bytes = axum::body::to_bytes(body, usize::MAX).await.unwrap();

        let zip_reader = ZipFileReader::new(body_bytes.to_vec()).await.unwrap();



i think something is going wrong somewhere in here as the qmdl.gz file added to the zip here is empty. if i add code like this, tests pass on main but fail on this branch

    for entry in zip_reader.file().entries() {
        assert_ne!(entry.uncompressed_size(), 0);
    }

untitaker · 2026-04-08T17:18:14Z

+            if self.uncompressed_bytes_read > max_bytes {
+                error!(
+                    "warning: {} bytes read, but max_bytes was {}",
+                    self.uncompressed_bytes_read, max_bytes


uncompressed_bytes_read never gets incremented afaict
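For illustration, a minimal synchronous sketch of a reader wrapper that does keep such a counter up to date. `CountingReader` is a hypothetical type, not the PR's `QmdlAsyncReader`, and an async version would do the same bookkeeping inside `poll_read`.

```rust
use std::io::{self, Read};

/// Hypothetical wrapper that counts every byte its inner reader yields.
struct CountingReader<R> {
    inner: R,
    uncompressed_bytes_read: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        // Increment on every successful read; without this line the
        // counter stays at zero and any max-bytes check never fires.
        self.uncompressed_bytes_read += n;
        Ok(n)
    }
}
```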

untitaker · 2026-04-08T17:22:02Z

+}
+
+#[derive(Debug)]
+struct QmdlAsyncReader<T> {


I struggle to understand why we have both QmdlReader and QmdlAsyncReader. can you rename one of them? I assume we need this layering, but i'm not sure why.

untitaker · 2026-04-08T17:23:17Z

-        .expect("failed to get QMDL file metadata")
-        .len();
-    let mut qmdl_reader = QmdlReader::new(qmdl_file, Some(file_size as usize));
+    let compressed = qmdl_path.ends_with(".gz");


I believe it would be easier to sniff gzip magic bytes. Then the distinction between gzip vs non-gzip can happen within the reader entirely, it doesn't need extra params, and no other code needs to be touched.
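A minimal sketch of the sniffing idea, using the synchronous standard library for brevity. `looks_gzipped` is a hypothetical helper name. The two magic bytes `0x1f 0x8b` come from the gzip format (RFC 1952), and the sketch assumes an uncompressed QMDL never begins with that pair.

```rust
use std::io::{self, BufRead};

/// gzip streams start with the two-byte magic 0x1f 0x8b (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Peek at the buffered bytes without consuming them, so the caller
/// can still hand the full stream to a decoder or a plain reader.
/// Caveat: fill_buf may return fewer than two bytes on a slow source;
/// this sketch treats that case as "not gzip".
fn looks_gzipped<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    let buf = reader.fill_buf()?;
    Ok(buf.starts_with(&GZIP_MAGIC))
}
```

Because the check only peeks, the reader could pick its decoding path internally and callers would not need a `compressed` flag or a file-extension test.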

untitaker mentioned this pull request Apr 1, 2026

GPS logging capability #971

Open

7 tasks

wgreenberg added 4 commits April 1, 2026 11:40

qmdl_store: maintain backwards compatibility

9191540

daemon: fix zip test

e6a3a43

run cargo fmt

e0ae8a0

daemon: put QmdlWriter in a Box

58c60c2

This'll balance the enum size given QmdlWriter's larger size

bmw reviewed Apr 3, 2026

bmw reviewed Apr 7, 2026

untitaker reviewed Apr 8, 2026

cooperq assigned bmw Apr 17, 2026

cooperq mentioned this pull request Apr 24, 2026

ROADMAP #990

Open

24 tasks
