Problem with the HES-APC-2021 data set #26
Hello! I am having trouble working with the HES-APC-2021 dataset. The 2021 file includes a different set of columns from previous and subsequent years, which makes me wonder whether it is a different dataset rather than HES-APC (Hospital Episode Statistics - Admitted Patient Care). I would appreciate your guidance. Thanks
Hi Giovana, we will look into this and get back to you! Thanks
Hi Giovana,

Thank you for your query. When reading a Microsoft SQL Server table into a Python DataFrame, e.g. using pandas or SQLAlchemy, you may occasionally encounter 'operational metadata' columns that appear unexpectedly in a table. Such columns are used for processing, ingestion, monitoring, or debugging purposes, e.g. run_id, ingestion_status, is_processed, created_at.

I have investigated this issue and found that table APC_2021 contained 30 additional columns, but only when read with a Python script. I did not come across this issue when using R or SQL scripts. I would therefore suggest modifying your scripts to ignore these additional columns. Let me know how you get on.

Best wishes,
Zan
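One possible way to ignore the additional columns in a Python script is sketched below. It is only an illustration: the metadata names are the examples quoted in the reply above, not the actual 30 extra columns in APC_2021 (which are not listed in this thread), and the helper `columns_to_keep` and the sample column names are hypothetical.

```python
# Hypothetical names: the thread gives these only as examples of operational
# metadata; the real extra columns in APC_2021 are not listed here.
EXAMPLE_METADATA_COLUMNS = {"run_id", "ingestion_status", "is_processed", "created_at"}


def columns_to_keep(all_columns, metadata_columns=EXAMPLE_METADATA_COLUMNS):
    """Return the column names that are not operational metadata.

    Matching ignores case and surrounding whitespace; original order is kept.
    """
    unwanted = {name.strip().lower() for name in metadata_columns}
    return [col for col in all_columns if col.strip().lower() not in unwanted]


# Illustrative column names only (not the real APC_2021 schema).
example_columns = ["EPIKEY", "ADMIDATE", "run_id", "Created_At", "DIAG_01"]
kept = columns_to_keep(example_columns)
print(kept)

# With pandas, the same list can be used to subset a DataFrame:
#     df = df[columns_to_keep(df.columns)]
```

Once the real names of the extra columns are known, they can be passed as `metadata_columns`. An alternative would be to select only the expected HES-APC columns explicitly when reading the table, so that any unexpected column is never loaded.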
I think this may be an issue with how the ONS team have provisioned the data for your project.
I'm sure @z2an will be in contact.
I'm going to re-label this question as "ONS SRS" related rather than a "data-quality" issue.