Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NPI-3072 additional comments explaining why we read Sinex with read_fwf() and that field names do not map #9

Merged
merged 1 commit into from
Jan 18, 2024

Conversation

treefern
Copy link
Collaborator

No description provided.

…ng, and the fact names used in the dataframe do not correlate with names in the Sinex comment above the data
@treefern treefern self-assigned this Jan 18, 2024
Copy link
Collaborator

@ronaldmaj ronaldmaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No functional code changes but really helpful comments so I think this can go in as is 👍
Not sure if you want to wait on @jashlearn or @bmatv to provide additional commentary around this, but looks good to me

@bmatv
Copy link
Collaborator

bmatv commented Jan 18, 2024

@treefern @ronaldmaj I love it! We might dump the slow read_fwf in the future for a faster parsing method but that method is to be discovered

@treefern
Copy link
Collaborator Author

treefern commented Jan 18, 2024

I assume read_fwf would be a bit faster (and more reliable) if we explicitly set column boundaries, but I haven't tested this.

I think we could also improve robustness and maybe speed, by adding logic to first trim to the data structure needed. Eg +SOLUTION/APRIORI through -SOLUTION/APRIORI.
Forget that, I overlooked an earlier section of the code that already does this (I was wondering how on earth it was working) 😄

@bmatv
Copy link
Collaborator

bmatv commented Jan 18, 2024

@treefern you could definitely measure this but from experience it won't help. One could make it marginally faster by not asking for STD column if it's not present but there is no way of telling from a header of the file or the block if STD is present or not... SINEX is a pain to read efficiently in python

@treefern treefern merged commit 6abad62 into main Jan 18, 2024
@treefern treefern deleted the NPI-3072-improve-sinex-reader-fwf-commenting branch January 18, 2024 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants