## Using XULE to extract the presentation structure from an XBRL taxonomy 
This interactive Python notebook updates, compiles, and runs a XULE expression that outputs the presentation hierarchy of a filing or taxonomy to either a spreadsheet or a JSON file.

Click the first run button below to copy the [free, open-source XULE plugin](https://github.com/xbrlus/xule/) into an [Arelle](https://pypi.org/project/arelle-release/) installation. This step only needs to be completed **once for the session** unless different versions of Arelle or XULE are required. Click the `Show code` link to review/revise the setup script.

In [None]:
# @title
import os, shutil, sys, site, platform
print('Please wait while Arelle, XULE and some helper packages are installed. \nA XULE version message and input field appear below when the environment is ready.')

# In this example, Arelle and aniso8601 are required to use XULE - get Arelle release details from GitHub (https://github.com/Arelle/arelle/releases).
# Use %pip -q install git+https://git@github.com/Arelle/arelle.git@master to use Arelle's development release
%pip -q install Arelle-release==2.30.11
%pip -q install aniso8601==9.0.1

# 1) locate Arelle's plugin directory (do not modify this location); remove temp and xuledir if they exist
plugindir = site.getsitepackages()[0] + '/arelle/plugin/'
xuledir = plugindir + 'xule/'
xodeldir = plugindir + 'xodel/'
serializerdir = plugindir + 'serializer/'
SimpleXBRLModeldir = plugindir + 'SimpleXBRLModel/'
temp = plugindir + 'temp/'
tempxule = temp + 'plugin/xule/'
tempxodel = temp + 'plugin/xodel/'
tempserializer = temp + 'plugin/serializer/'
tempSimpleXBRLModel = temp + 'plugin/SimpleXBRLModel/'
if os.path.isdir(temp):
    # A previous run left files behind; clear them so the moves below start clean
    if os.path.exists(plugindir + 'semanticHash.py'):
        os.remove(plugindir + 'semanticHash.py')
    for stale_dir in (xuledir, xodeldir, serializerdir, SimpleXBRLModeldir, temp):
        shutil.rmtree(stale_dir, ignore_errors=True)
os.chdir(plugindir)

# 2) copy a XULE release from GitHub (https://github.com/xbrlus/xule/releases) to plugin directory
!git clone --quiet --depth=1 --branch 30041 --single-branch https://github.com/xbrlus/xule.git temp &> /dev/null
shutil.move(temp + 'plugin/semanticHash.py', plugindir)
shutil.move(tempxule, xuledir)
shutil.move(tempxodel, xodeldir)
shutil.move(tempserializer, serializerdir)
shutil.move(tempSimpleXBRLModel, SimpleXBRLModeldir)

# 3) confirm XULE (change -v to -h and re-run to see help contents for Arelle and XULE)
!arelleCmdLine --plugins 'xule' -v

Run the cell below to choose the ``constants`` used in the XULE expression in section 1. Click the ``Show code`` link to review/revise this script.

In [None]:
# @title
def get_user_url(options):
    while True:
        print("\n\nType a number from the list or enter a valid report or taxonomy URI to generate its presentation hierarchy.\nLeave blank to use the first report in the list: ")
        for i, option in enumerate(options, start=1):
            print(f"{i}. {option}")
        choice = input("> ").strip()
        # Blank input selects the first report in the list
        if choice == '':
            return options[0]
        # A number must match an entry in the list; otherwise ask again
        if choice.isdigit():
            if 1 <= int(choice) <= len(options):
                return options[int(choice) - 1]
            print(f"{choice} is not a number from the list.")
            continue
        # Anything else is taken as a custom report or taxonomy URI
        return choice

options = ["https://www.sec.gov/Archives/edgar/data/720762/000149315224042532/form10-k.htm",
           "https://www.sec.gov/Archives/edgar/data/700764/000107997324001419/vyey_10q-033124.htm",
           "https://xbrl.fasb.org/us-gaap/2024/entire/us-gaap-entryPoint-std-2024.xsd"]

taxonomy = get_user_url(options)

print("\nSelect the number of the file format for output.\nLeave blank to use the spreadsheet option: ")
data = {
    "1": "spreadsheet",
    "2": "json"
    }

for key, value in data.items():
    print(f"{key}. {value}")

choice = input("> ")

if choice in data:
    format = data[choice]
else:
    print("No valid entry - using spreadsheet option.")
    format = "spreadsheet"

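location = input("\nEnter a path to the output location or leave the field blank \nto use the default value /content/ (Colab location)> ") or '/content/'
# Later cells build file paths by concatenation, so the location must end with a slash
if not location.endswith('/'):
    location += '/'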

name = input("\nEnter a name for the file without spaces or leave the field blank \nto use the default value 'taxonomy-hierarchy'> ") or 'taxonomy-hierarchy'

# Delete any prior XULE file whose name matches the one provided above.
if os.path.exists(name + '.xule'):
    os.remove(name + '.xule')

### 1. Define a XULE Expression
The data array created below maps each named section of the presentation linkbase (i.e., the financial statements and their disclosures) to a tab in the spreadsheet. The first item in each row is the first six characters of the section's role description, and the function ``.agg-to-dict(1)`` groups the rows by that first item to create the section arrays.
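
The grouping step can be pictured with a rough Python analogue. This is only a sketch of the idea, not XULE's implementation: the helper `agg_to_dict` and the sample rows are hypothetical, and it assumes each dictionary value keeps the full row, including the key.

```python
from collections import defaultdict

def agg_to_dict(rows, key_position=1):
    """Group rows by the item at a 1-based position, keeping each full row."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_position - 1]].append(row)
    return dict(grouped)

# Hypothetical role descriptions and concepts, for illustration only
raw = [
    ("000001 - Document - Cover", "dei:EntityRegistrantName"),
    ("000002 - Statement - Balance Sheets", "us-gaap:Assets"),
    ("000002 - Statement - Balance Sheets", "us-gaap:Liabilities"),
]

# Put the first six characters of the role description in the first position
rows = [(description[:6], description, concept) for description, concept in raw]

sections = agg_to_dict(rows)
for key, section_rows in sections.items():
    print(key, len(section_rows))
```

Each key would then correspond to one worksheet tab, with the grouped rows as that tab's contents.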

Details from the presentation linkbase are printed in each section of the data array (columns 1-9 in each worksheet). To create a visual hierarchy, an `if/else` condition checks each concept's _preferred-label_ and uses its _navigation-depth_ to prepend blank spaces to the label text (the _target.label.text_ string when no preferred label is set) within the statement or disclosure's structure.
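
The indentation logic can be sketched in Python as follows. The function `indented_label` and the sample entries are hypothetical; the sketch only mirrors the shape of the `if/else` in the XULE expression, which repeats a two-space string once per level of depth.

```python
def indented_label(depth, preferred_label_text, standard_label_text):
    """Pick a label, then prefix two spaces for each level of navigation depth."""
    if preferred_label_text is None:
        label = standard_label_text
    else:
        label = preferred_label_text
    return "  " * depth + label

# Hypothetical (depth, preferred label text, standard label text) entries
entries = [
    (1, None, "Statement of Financial Position [Abstract]"),
    (2, None, "Assets [Abstract]"),
    (3, "Total assets", "Assets"),
]

for depth, preferred, standard in entries:
    print(indented_label(depth, preferred, standard))
```

Deeper concepts end up further to the right, which is what produces the outline-like view in each worksheet.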

The code cell below can be edited. Run the cell to write and view the XULE file that will be compiled in the next step. To change the ``constants`` later, re-run the cell above and then re-run this cell so the new values are written into the XULE file.

In [None]:
xule_file = '''
constant $schema = taxonomy('%s')
constant $fileformat = '%s'
constant $filedir = '%s'
constant $filename = '%s'

output-attribute file-location
output-attribute file-content

output taxonomy

$presentation = (navigate parent-child descendants taxonomy $schema
    returns  (
        role-description,
        source-name,
        target-name,
        order,
        navigation-order,
        preferred-label,
        result-order,
        navigation-depth,
        target
    )
)

$updatedlist = filter $presentation returns list(
    $item[1].substring(1,6),
    $item[1],
    $item[2],
    $item[3],
    $item[4],
    $item[5],
    if
      $item[6].text == none '  '.repeat($item[8]) + $item[9].label.text
    else
      '  '.repeat($item[8]) + $item[6].text)

$presentation_updated = $updatedlist.agg-to-dict(1)

if $fileformat != 'spreadsheet'
  $filelocation = $filedir + $filename + '.json'
  $presentation_updated.to-json 
else
  $filelocation = $filedir + $filename + '.xlsx'
  $presentation_updated.to-spreadsheet

/** The output attributes below are written to the log. To use an output attribute in the logic of a XULE expression **/
/** - as with the conditional choosing file format above - it must be defined as a variable (eg. file-location $filelocation) **/

file-content $rule-value
file-location $filelocation
'''
with open(name + '.xule', mode='w') as file:
    file.write(xule_file % (taxonomy, format, location, name))
!cat $name'.xule'

### 2. Run XULE against an XBRL report
The terminal commands below take several steps. After variables for files and locations are defined from the earlier inputs, the first Arelle command compiles the XULE into a .zip rule set. Next, the XULE and compiled .zip files are copied from the default save location (the plugin directory) to the output location. A second Arelle command then runs the .zip to create the output. (NB: When the plugin configuration is identical for both steps, ``--xule-compile`` and ``--xule-rule-set -v -f``, which compile the .zip and use it to validate an entry point or file, can be run as a single command.)
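
As a rough sketch of the same two-step flow outside of Colab's `!` shell syntax, the commands can be assembled as argument lists. The helper `build_xule_commands` is hypothetical; it uses only flags that appear in this notebook and prints the commands rather than running them.

```python
import shlex

def build_xule_commands(xule_file, rule_set_zip, report_uri, log_file,
                        run_plugins="xule|transforms/SEC|validate/EFM|inlineXbrlDocumentSet"):
    """Return (compile command, run command) as argument lists."""
    compile_cmd = [
        "arelleCmdLine", "--plugins", "xule",
        "--xule-compile", xule_file,
        "--xule-rule-set", rule_set_zip,
    ]
    run_cmd = [
        "arelleCmdLine", "--plugins", run_plugins,
        "--xule-rule-set", rule_set_zip,
        "-v", "-f", report_uri,
        "--logFile", log_file,
    ]
    return compile_cmd, run_cmd

compile_cmd, run_cmd = build_xule_commands(
    "taxonomy-hierarchy.xule",
    "taxonomy-hierarchy.zip",
    "https://www.sec.gov/Archives/edgar/data/720762/000149315224042532/form10-k.htm",
    "/content/log.xml",
)

# shlex.join quotes each argument so the line is safe to paste into a shell
print(shlex.join(compile_cmd))
print(shlex.join(run_cmd))
```

Either list could be passed to `subprocess.run(..., check=True)` in an environment where `arelleCmdLine` is on the path.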

In [None]:
FILE_NAME = name + '.xule'
ZIP_NAME = name + '.zip'
FILE_LOCATION = location
LOG_LOCATION = location + 'log.xml'
print('Start compiling XULE.\n\n') 

# compile XULE into .zip 
!arelleCmdLine --plugins "xule" --xule-compile $FILE_NAME \
--xule-rule-set $ZIP_NAME --logFormat="[%(messageCode)s] %(message)s"

# copy XULE and .zip to the output location chosen earlier (default /content/, a Colab-specific location)
shutil.copy(FILE_NAME, FILE_LOCATION+FILE_NAME)
shutil.copy(ZIP_NAME, FILE_LOCATION+ZIP_NAME)
print('\n\nEnd compiling and start analyzing ' + taxonomy + \
'\nThere may be different URIs below because the -f argument is a placeholder report URI matching the base taxonomy year \nof the report or taxonomy URI input above (ie. 2024 US GAAP).\n\n')

# run .zip to create output
!arelleCmdLine --plugins "xule|transforms/SEC|validate/EFM|inlineXbrlDocumentSet" \
--xule-rule-set $ZIP_NAME \
-v -f "https://www.sec.gov/Archives/edgar/data/720762/000149315224042532/form10-k.htm" \
--xule-time .000 --xule-debug --noCertificateCheck \
--httpUserAgent "XULE-Arelle (xbrl.us; support@xbrl.us)" --logFile $LOG_LOCATION
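
If the JSON option was chosen, the output can be inspected with a few lines of Python. This is a minimal sketch that assumes the file holds a single JSON object mapping each section key to a list of rows; that layout is an assumption, so check the actual file first. The function `summarize_sections` and the sample data are hypothetical.

```python
import json
import tempfile
from pathlib import Path

def summarize_sections(json_path):
    """Return {section key: number of rows} for a JSON object of row lists."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return {key: len(rows) for key, rows in data.items()}

# Hypothetical stand-in for the notebook's output, written to a temporary folder
sample = {
    "000001": [["000001", "Cover"]],
    "000002": [["000002", "Assets"], ["000002", "Liabilities"]],
}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "taxonomy-hierarchy.json"
    sample_path.write_text(json.dumps(sample), encoding="utf-8")
    summary = summarize_sections(sample_path)

print(summary)
```

To use it on real output, point `summarize_sections` at the file written to the output location, for example `/content/taxonomy-hierarchy.json` with the default inputs.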

#### Get help with Arelle and XULE commands by running the cell below.

In [None]:
!arelleCmdLine --plugins "xule" -h