Skip to content

Latest commit

 

History

History
 
 

bigquery-to-xml

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 

BigQuery to XML Conversion

This simple Python script allows you to take the output of a BigQuery query and convert the output into an XML file. This works best with nested and repeated fields and contains some customizations like custom root nodes for your XML reports.

User Guide

This tool assumes it will be run by an entity with the appropriate permissions to query BigQuery and that the default project has been configured.

Necessary Python packages are included in 'requirements.txt' file.

In v1 the tool only returns the XML report as a string, leaving it up to the user what to do with it. A common usecase would be to port the output into a .txt file, e.g. 'bigquery_to_xml.py > results.txt'

Tips for Report Formatting

There are two tags that are generated by default; the root tag '' and the row tags ''. These are used when the structure of your report cannot be inferred from the nested and repeated structure of your table; in other words, the report does not know how to tag the individual table rows or the overall report, so it uses overrideable defaults.

For example, if your BigQuery table was a single table with only "FName" and "LName" columns, your output might look something like this (assuming three rows): Sam Smith James Blake Justin Vernon

There are two optional parameters for the script that allow you to define your own root and row tags; 'custom_root_node' and 'custom_row_tag'.

Nested and Repeated Fields

To take advantage of XML's nested and repeated structure, the tool takes advantage of BigQuery's nested and repeated capabilities. Any nested or repeated field will be reflected in the XML output.

For an example of this in action, refer to the 'test_nested_repeated.py' script and the resulting report in 'tests/results.txt'.

Example XML Schema:

<header/>
  <transaction>
    <item1/>
    <item2/>
  </transaction>
  <transaction>
    <item1/>
    <item2/>
  </transaction>
<footer/>

Example BQ Schema:

header | STRING | NULLABLE transaction | RECORD | REPEATED transaction.item1 | STRING | NULLABLE transaction.item2 | STRING | NULLABLE footer | STRING | NULLABLE

Example Configuration

 _QUERY = """
 SELECT * EXCEPT (item)
 FROM `bigquery-public-data`.wikipedia.wikidata
 LIMIT 1"""

 bigquery_to_xml(_QUERY,_CUSTOM_ROOT_NODE,_CUSTOM_ROW_TAG)