DOC: xml.etree.ElementTree.ParseError due to healthkit version 12 #24

Open
mmngreco opened this issue Jan 1, 2023 · 2 comments

mmngreco commented Jan 1, 2023

Hi @simonw

I hope this issue is OK. The idea is to provide some documentation for other users like me on how to solve this problem, and to save them some time.

Following the instructions in the README.md, I ran into this error:

(venv) mgreco@pop-os apple-health master* (23:44|0s)
$ healthkit-to-sqlite apple_health_export/export.xml healthkit.db --xml
Importing from HealthKit  [------------------------------------]    0%
Traceback (most recent call last):
  File "/home/mgreco/github/mmngreco/apple-health/venv/bin/healthkit-to-sqlite", line 33, in <module>
    sys.exit(load_entry_point('healthkit-to-sqlite', 'console_scripts', 'healthkit-to-sqlite')())
  File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/mgreco/github/mmngreco/apple-health/.deps/healthkit-to-sqlite/healthkit_to_sqlite/cli.py", line 57, in cli
    convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf)
  File "/home/mgreco/github/mmngreco/apple-health/.deps/healthkit-to-sqlite/healthkit_to_sqlite/utils.py", line 25, in convert_xml_to_sqlite
    for tag, el in find_all_tags(
  File "/home/mgreco/github/mmngreco/apple-health/.deps/healthkit-to-sqlite/healthkit_to_sqlite/utils.py", line 12, in find_all_tags
    for event, el in parser.read_events():
  File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/xml/etree/ElementTree.py", line 1324, in read_events
    raise event
  File "/home/mgreco/github/mmngreco/apple-health/venv/lib/python3.10/xml/etree/ElementTree.py", line 1296, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: syntax error: line 156, column 0
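
To see what the parser is objecting to, it can help to print the line named in the error message. The following is a minimal sketch, not part of healthkit-to-sqlite: the helper name `find_parse_error` is made up for illustration, and it assumes the export is a UTF-8 text file on disk. It relies on the `position` attribute of `ParseError`, which holds the line and column reported in the traceback.

```python
import xml.etree.ElementTree as ET


def find_parse_error(path):
    """Return (line, column, text) for the first XML syntax error, or None.

    Hypothetical helper for illustration; not part of healthkit-to-sqlite.
    """
    try:
        # iterparse streams the file instead of loading it all at once.
        for _event, element in ET.iterparse(path):
            element.clear()  # drop the contents of elements already seen
    except ET.ParseError as exc:
        line_no, column = exc.position  # same numbers as in the error message
        # Re-read the file as text to fetch the offending line.
        with open(path, encoding="utf-8", errors="replace") as fh:
            for current, text in enumerate(fh, start=1):
                if current == line_no:
                    return line_no, column, text.rstrip("\n")
        return line_no, column, ""
    return None
```

Calling `find_parse_error("apple_health_export/export.xml")` would return the line number, column, and text of the first line the parser rejects, or `None` if the file parses cleanly.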

So, after debugging and searching the internet, I found this useful link: https://discussions.apple.com/thread/254202523 (etresoft, the real hero). It basically says that the XML produced by the Health app (healthkit version 12) has some bugs but, fortunately, they can be fixed with a couple of commands:

  1. Unzip the export and change into the new folder where export.xml is.

  2. Create a patch.txt with the following content:

    --- export.xml	2022-09-18 15:17:09.000000000 -0400
    +++ export-fixed.xml	2022-09-18 16:37:08.000000000 -0400
    @@ -15,6 +15,7 @@
       HKCharacteristicTypeIdentifierBiologicalSex       CDATA #REQUIRED
       HKCharacteristicTypeIdentifierBloodType           CDATA #REQUIRED
       HKCharacteristicTypeIdentifierFitzpatrickSkinType CDATA #REQUIRED
    +  HKCharacteristicTypeIdentifierCardioFitnessMedicationsUse CDATA #IMPLIED
     >
     <!ELEMENT Record ((MetadataEntry|HeartRateVariabilityMetadataList)*)>
     <!ATTLIST Record
    @@ -39,7 +40,7 @@
       startDate     CDATA #REQUIRED
       endDate       CDATA #REQUIRED
     >
    -<!ELEMENT Workout ((MetadataEntry|WorkoutEvent|WorkoutRoute)*)>
    +<!ELEMENT Workout ((MetadataEntry|WorkoutEvent|WorkoutRoute|WorkoutStatistics)*)>
     <!ATTLIST Workout
       workoutActivityType   CDATA #REQUIRED
       duration              CDATA #IMPLIED
    @@ -63,7 +64,7 @@
       duration             CDATA #IMPLIED
       durationUnit         CDATA #IMPLIED
     >
    -<!ELEMENT WorkoutEvent EMPTY>
    +<!ELEMENT WorkoutEvent (MetadataEntry?)>
     <!ATTLIST WorkoutEvent
       type                 CDATA #REQUIRED
       date                 CDATA #REQUIRED
    @@ -79,6 +80,7 @@
       minimum              CDATA #IMPLIED
       maximum              CDATA #IMPLIED
       sum                  CDATA #IMPLIED
    +  unit                 CDATA #IMPLIED
     >
     <!ELEMENT WorkoutRoute ((MetadataEntry|FileReference)*)>
     <!ATTLIST WorkoutRoute
    @@ -153,6 +155,7 @@
       dateIssued       CDATA #REQUIRED
       expirationDate   CDATA #REQUIRED
       brand            CDATA #IMPLIED
    +>
     <!ELEMENT RightEye EMPTY>
     <!ATTLIST RightEye
       sphere           CDATA #IMPLIED
    @@ -203,13 +206,6 @@
       diameter         CDATA #IMPLIED
       diameterUnit     CDATA #IMPLIED
     >
    -  device           CDATA #IMPLIED
    -<!ELEMENT MetadataEntry EMPTY>
    -<!ATTLIST MetadataEntry
    -  key              CDATA #IMPLIED
    -  value            CDATA #IMPLIED
    ->
    ->
     ]>
     <HealthData>
      <ExportDate/>
  3. Apply the patch with the command: patch < patch.txt

  4. Fix the endDate attributes with the command: sed 's/startDate/endDate/2' export.xml > export-fixed.xml (this replaces the second startDate on each line with endDate)

  5. Try again: healthkit-to-sqlite export-fixed.xml healthkit.db --xml
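
For anyone without sed (for example on Windows), step 4 can be approximated in Python. This is only a sketch of what the sed expression does: it replaces the second occurrence of `startDate` on each line with `endDate` and leaves every other line untouched. The function names `fix_second_start_date` and `fix_export` are made up for illustration, and the sketch assumes the file is UTF-8.

```python
def fix_second_start_date(line):
    """Replace the second 'startDate' on a line with 'endDate'.

    Mirrors sed's s/startDate/endDate/2; lines with fewer than two
    occurrences are returned unchanged.
    """
    needle = "startDate"
    first = line.find(needle)
    if first == -1:
        return line
    second = line.find(needle, first + len(needle))
    if second == -1:
        return line
    return line[:second] + "endDate" + line[second + len(needle):]


def fix_export(src_path, dst_path):
    """Write a copy of src_path with the fix applied line by line."""
    # newline="" keeps the original line endings as they are.
    with open(src_path, encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(fix_second_start_date(line))
```

Used as `fix_export("export.xml", "export-fixed.xml")` after the patch has been applied, this should produce the same kind of output as the sed command in step 4.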

@Mjboothaus

Thanks for reporting this and providing a solution. I was puzzled by this error when I revisited my walking data and ran into this issue. I haven't tried the fix yet.

@Mjboothaus

@simonw - maybe add some error handling to trap poorly formed XML (from Apple engineers), so that the tool reports a problem with export.zip rather than an odd-looking Python traceback :)
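
One possible shape for that error handling, as a rough sketch rather than the project's actual code: the traceback above shows that the tool reads events from a pull parser, so a wrapper could catch `ParseError` there and re-raise it with a friendlier message. `ExportFormatError` and `iter_elements` are hypothetical names introduced here, and the chunk size is arbitrary.

```python
import xml.etree.ElementTree as ET


class ExportFormatError(Exception):
    """Hypothetical exception for an export that is not well-formed XML."""


def iter_elements(fp, chunk_size=1024 * 1024):
    """Yield each element as it is completed, translating parse errors.

    Illustrative wrapper only; not the healthkit-to-sqlite implementation.
    """
    parser = ET.XMLPullParser(events=("end",))
    try:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:
                break
            parser.feed(chunk)
            for _event, element in parser.read_events():
                yield element
        parser.close()  # may also raise if the document is truncated
        for _event, element in parser.read_events():
            yield element
    except ET.ParseError as exc:
        line, column = exc.position
        raise ExportFormatError(
            f"export.xml is not well-formed XML (line {line}, column {column}). "
            "The file produced by the Health app may need repairing before import."
        ) from exc
```

A command-line entry point could then catch `ExportFormatError` and print its message instead of the full traceback.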
