Benchmark obs parsing + faster parser #91

Merged
merged 15 commits into from
Jul 18, 2024

jtec (Owner) commented Jul 10, 2024

This PR addresses the current bottleneck in prx's speed, which is observation file parsing, by adding

  • a benchmark for parsing speed that splits a 24 h observation file with a 30 s interval into a sequence of smaller files and records the parsing time for each
  • a faster RINEX 3.05 obs parser that makes heavy use of pandas
  • a unit test comparing the georinex parser's outputs to those of the new prx parser

Note that the PR adds the new parser, but does not enable it, so prx still uses georinex as its parsing backend.
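
For readers who want to reproduce the idea of the benchmark, here is a minimal sketch. It is not the benchmark code added in this PR: the function name `benchmark_parser`, its arguments, and the choice to build each smaller file from the first n epochs of the long file are all assumptions made for illustration. Any parser that accepts a file path can be passed in as `parse_fn`.

```python
import tempfile
import time
from pathlib import Path


def benchmark_parser(parse_fn, header, epoch_blocks, n_epochs_per_file):
    """Time `parse_fn` on files that hold the first n epochs of one long file.

    `header` is a list of header lines and `epoch_blocks` is a list of lists
    of lines, one inner list per epoch. Both are hypothetical inputs that a
    real benchmark would cut out of a 24 h, 30 s observation file.
    """
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        for n in n_epochs_per_file:
            path = Path(tmp) / f"first_{n}_epochs.rnx"
            # Header plus the lines of the first n epoch blocks.
            lines = header + [line for block in epoch_blocks[:n] for line in block]
            path.write_text("\n".join(lines) + "\n")
            # Only the parser call is timed, not the file creation.
            start = time.perf_counter()
            parse_fn(path)
            timings[n] = time.perf_counter() - start
    return timings
```

Plotting `timings` for two parsers over the same file sizes gives a comparison along the lines of the one described under Testing.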

What about the loss-of-lock indicator?

The loss-of-lock indicators are not extracted yet; I will deal with that in a follow-up PR.
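
For context, in RINEX 3 each observation occupies a 16-character field after the 3-character satellite ID: a 14-character value, then the loss-of-lock indicator, then the signal strength. The sketch below shows one way vectorised pandas string operations could pull out the values while skipping the indicator character. It is not the parser added in this PR: `parse_obs_records` and the positional `obs_i` column names are hypothetical, the header is assumed to be stripped already, and the per-constellation observation types from the header are ignored.

```python
import pandas as pd


def parse_obs_records(lines, n_obs):
    """Parse RINEX 3 observation records (no header) into a DataFrame.

    Reads up to `n_obs` values per satellite line. The loss-of-lock and
    signal strength characters that follow each value are not extracted.
    """
    s = pd.Series(list(lines), dtype=object)
    is_epoch = s.str.startswith(">")
    epoch_lines = s[is_epoch]
    # Epoch line layout: "> yyyy mm dd hh mm" followed by seconds in 11 chars.
    epochs = pd.to_datetime(
        epoch_lines.str[2:18], format="%Y %m %d %H %M"
    ) + pd.to_timedelta(epoch_lines.str[18:29].str.strip().astype(float), unit="s")
    # Propagate each epoch's timestamp to the satellite lines below it.
    time = epochs.reindex(s.index).ffill()
    obs_lines = s[~is_epoch]
    records = pd.DataFrame({"time": time[~is_epoch], "sv": obs_lines.str[0:3]})
    for i in range(n_obs):
        start = 3 + 16 * i
        # 14-character value; blank or missing fields become NaN.
        value = obs_lines.str[start : start + 14].str.strip()
        records[f"obs_{i}"] = pd.to_numeric(value, errors="coerce")
    return records.reset_index(drop=True)
```

In this layout the loss-of-lock indicator of field `i` would sit at character `start + 14`, so extracting it later would be one more slice per field.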

What about nav files?

We can continue to rely on georinex's nav file parser: parsing a 1-day file, for example, takes only about 45 seconds (M2 Mac), and most web-prx users will benefit from caching, since a previous user has likely already caused prx to download and parse the same nav files.
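
To illustrate why caching takes the sting out of a slow nav parser, here is a minimal sketch of an on-disk cache keyed by the file's content hash. It does not describe how prx actually caches parsed files: `cached_parse`, `parse_fn`, and `cache_dir` are hypothetical names, and pickle is used only for brevity (it should not be used to load files from untrusted sources).

```python
import hashlib
import pickle
from pathlib import Path


def cached_parse(file_path, parse_fn, cache_dir):
    """Return parse_fn(file_path), reusing a pickled result if one exists."""
    file_path, cache_dir = Path(file_path), Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache on file content, so identical downloads share one entry.
    digest = hashlib.sha256(file_path.read_bytes()).hexdigest()
    cache_file = cache_dir / f"{digest}.pkl"
    if cache_file.exists():
        return pickle.loads(cache_file.read_bytes())
    result = parse_fn(file_path)
    cache_file.write_bytes(pickle.dumps(result))
    return result
```

With such a cache, only the first request for a given nav file pays the parsing cost; later requests for the same file read the stored result.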

Testing

Running the benchmark yields

[Plot: benchmark results (image not available)]

Note that the pandas-based parser's speed is pretty much independent of file size, at least in this benchmark.

@@ -1,7 +1,7 @@
 [tool.poetry]
 name = "prx"
 version = "0.0.1"
-description = "Making GNSS positioning a bit more accessible"
+description = "Making raw GNSS data more accessible"
jtec (Owner, Author) commented:

Hmmm not sure this is an improvement ;)

plutonheaven (Collaborator) commented:

It makes sense!! 👍

@@ -215,15 +215,6 @@ def build_records(
check_assumptions(rinex_3_obs_file)
obs = helpers.parse_rinex_file(rinex_3_obs_file)

# Flatten the xarray DataSet into a pandas DataFrame:
jtec (Owner, Author) commented:

Moves to helpers.py

@jtec jtec requested a review from plutonheaven July 10, 2024 18:45
@jtec jtec marked this pull request as ready for review July 12, 2024 22:55
jtec (Owner, Author) commented Jul 16, 2024

@plutonheaven OK if I merge this?

plutonheaven (Collaborator) left a comment

Great!!

@jtec jtec merged commit 2e33a06 into main Jul 18, 2024
3 checks passed
2 participants