Store orbit determination results in parquet #134
Labels
Kind: Improvement
This is a proposed improvement
Priority: high
Status: Design
Issue at Design phase of the quality assurance process
Topic: Orbit Determination
Coauthors: Claude by Anthropic and GPT-4
High level description
Storing the OD results (estimates and residuals) in CSV files is inefficient and cumbersome. We propose implementing a to_parquet function for the ODProcess structure. The Apache Parquet format is well suited to this type of data. With the data stored in Parquet, users can query and analyze the estimates and residuals much more efficiently. The data should be stored in base units: it is currently in kilometers, so error plots end up in "micro kilometers", which is confusing.
Requirements
ODProcess structure must have a to_parquet method to export its data to Parquet.
to_parquet method must take:
  EventEvaluator to evaluate events. If provided, the events data is also exported.
  ExportCfg configuration object to configure the export (consider reuse of the current one used in trajectory export).
FloatType is an Option<f64>.)
Test plans
Unit tests:
Integration tests:
Edge cases:
  An OD result set with only estimates and no residuals. Not possible.
  A malformed ExportCfg object. Not possible.
  Invalid path provided. Handled by the path object.
  Lack of write permissions to the path. Handled by the path object.
  Corrupted Parquet file as input. Not possible since I create a new Parquet file.
  Out of memory issues when exporting very large result sets. This would require failures to be handled gracefully. This would be a dyn Error, which is supported. Not sure how to test this on any of the machines I have since they have several GBs of RAM.
Benchmark tests
Documentation and examples:
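The invalid-path and write-permission edge cases listed above could be exercised with a small helper along these lines. This is only a sketch: it assumes the export creates its output with std::fs::File::create, and create_export_file is a hypothetical name, not an existing function in the code base. It shows how an I/O failure would surface as a boxed dyn Error rather than a panic.

```rust
use std::error::Error;
use std::fs::File;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of the current code base): creates the
/// output file for an export and returns the path that was written.
pub fn create_export_file<P: AsRef<Path>>(path: P) -> Result<PathBuf, Box<dyn Error>> {
    let path = path.as_ref();
    // File::create returns an io::Error if the parent directory does not
    // exist or the process cannot write there; `?` boxes it into dyn Error.
    let _file = File::create(path)?;
    Ok(path.to_path_buf())
}
```

A test can then point the helper at a path inside a directory that does not exist and assert that the result is an Err, without needing to provoke a real permission failure.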
Design
Here is a Mermaid JS diagram showing the proposed implementation:
Consider using the builder pattern from https://github.com/apache/arrow-rs/blob/master/arrow/examples/builders.rs, or parquet_derive directly (this might not work because it needs a custom struct for each variation), cf. https://github.com/apache/arrow-rs/blob/master/parquet_derive/README.md.
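Whichever writer is chosen, the row-oriented results first need to be turned into columns. The following dependency-free sketch illustrates that step only; it does not write Parquet. OdRow, OdColumns, to_columns and all field names are hypothetical, and it assumes that "base units" means meters for lengths. Missing residuals are kept as None so that they can map to null entries in a nullable float column.

```rust
/// Hypothetical stand-in for one row of OD output; not the real ODProcess data.
#[derive(Debug, Clone, PartialEq)]
pub struct OdRow {
    /// Assumed: seconds past some reference epoch.
    pub epoch_s: f64,
    /// Position error as currently stored, in kilometers.
    pub pos_err_km: f64,
    /// Range residual in kilometers; None when there is no measurement.
    pub range_resid_km: Option<f64>,
}

/// Column-oriented view of the rows: one vector per future Parquet column.
#[derive(Debug, Default, PartialEq)]
pub struct OdColumns {
    pub epoch_s: Vec<f64>,
    pub pos_err_m: Vec<f64>,
    pub range_resid_m: Vec<Option<f64>>,
}

/// Assumption: the base unit for lengths is the meter.
const M_PER_KM: f64 = 1_000.0;

/// Transposes rows into columns and converts kilometers to meters.
pub fn to_columns(rows: &[OdRow]) -> OdColumns {
    let mut cols = OdColumns::default();
    for row in rows {
        cols.epoch_s.push(row.epoch_s);
        cols.pos_err_m.push(row.pos_err_km * M_PER_KM);
        // Keep missing residuals as None instead of inventing a sentinel value.
        cols.range_resid_m.push(row.range_resid_km.map(|r| r * M_PER_KM));
    }
    cols
}
```

Each vector in OdColumns would then be handed to the corresponding column builder of the chosen writer. Putting the unit in the column name (for example pos_err_m) is one possible way to avoid the "micro kilometers" confusion on the reader's side.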