Calculate XPT row_count using metadata and final chunk #315

Open
gerrycampion opened this issue Apr 30, 2024 · 0 comments

Describe the issue

For XPT files, I understand that row_count cannot be read from the metadata alone, but I think it can be calculated from the metadata plus the final 80-byte chunk of the file.

Expected behavior

  • Read the header information to find variable_sizes and records_start_offset (the byte offset where record data begins)
  • Calculate record_size as the sum of variable_sizes
  • Read the last 80-byte chunk of the file to find padding, the number of trailing ASCII blank bytes
  • Calculate the number of records using:
    (total_file_size - records_start_offset - padding) / record_size
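
The steps above could be sketched roughly as follows. This is only an illustration in Python, not ReadStat code: the function names are hypothetical, the caller is assumed to already know records_start_offset and variable_sizes from the header, and the file is assumed to contain a single dataset. One deviation from the formula as written: the division is rounded up, because trailing blanks in the last chunk may belong to a character variable in the final record rather than to padding. Records that consist entirely of blanks at the very end of the file cannot be told apart from padding with this approach, so the result is a lower bound in that case.

```python
import os

CHUNK = 80  # XPT files are laid out in 80-byte records, blank-padded at the end


def count_trailing_blanks(chunk: bytes) -> int:
    """Return the number of ASCII blank (0x20) bytes at the end of chunk."""
    return len(chunk) - len(chunk.rstrip(b" "))


def estimate_row_count(total_file_size, records_start_offset,
                       variable_sizes, last_chunk):
    """Estimate the row count from header metadata and the final chunk.

    All arguments are assumed to be supplied by the caller; nothing here
    parses the XPT header itself.
    """
    record_size = sum(variable_sizes)
    if record_size <= 0:
        raise ValueError("record_size must be positive")

    data_len = total_file_size - records_start_offset
    if data_len <= 0:
        return 0

    # Trailing blanks may be padding or the tail of the last record.
    trailing = min(count_trailing_blanks(last_chunk[-CHUNK:]), data_len)
    used = data_len - trailing

    # Ceiling division: a partially blank last record still counts as a row.
    return -(-used // record_size)


def row_count_from_file(path, records_start_offset, variable_sizes):
    """Read only the final chunk of the file and estimate the row count."""
    total = os.path.getsize(path)
    with open(path, "rb") as f:
        # Never read back into the header region.
        f.seek(max(records_start_offset, total - CHUNK))
        last_chunk = f.read(CHUNK)
    return estimate_row_count(total, records_start_offset,
                              variable_sizes, last_chunk)
```

Only the header and the last 80 bytes are touched, so the cost does not depend on the number of records.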

It would be helpful if readstat could expose either:

  • row_count

or these, if not already exposed:

  • total_file_size
  • records_start_offset
  • records_end_offset