**Run the posgen cluster demo and load the output .pos file**

First, make sure your working directory is the one containing posgen, e.g.

```python

import os
os.chdir("your_working_directory")  # replace with the directory containing posgen

```
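
A quick way to confirm the directory is right is to check that the files used in this walkthrough are present. The following is a minimal sketch: the helper name `missing_files` is hypothetical, and the paths `posgen` and `examples/clusterOp.xml` are assumed to be relative to the working directory.

```python
import os

def missing_files(paths, base_dir="."):
    """Return the entries of `paths` that do not exist under `base_dir`."""
    return [p for p in paths if not os.path.exists(os.path.join(base_dir, p))]

# Paths assumed for this walkthrough; adjust them to your own layout
missing = missing_files(["posgen", "examples/clusterOp.xml"])
if missing:
    print("Not found in", os.getcwd(), ":", missing)
else:
    print("posgen and the example input were found")
```

An empty list means both files were found and the remaining cells should be able to locate them.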

Once in the correct directory, you can call posgen and run the clusterOp.xml example.

In [70]:
#import required packages
import os
import subprocess
import numpy as np
import struct

The first step is to run the posgen clustering example from Python using the subprocess module. This outputs a clustered .pos file (the output filename is set in the XML file that `xml_input` points to).

In [71]:
xml_input = "examples/clusterOp.xml"

In [72]:
process = subprocess.run(["./posgen", xml_input])
if process.returncode == 0:
    print("process completed successfully")

process completed successfully
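
If the run does not succeed, it helps to see the return code and any error text rather than getting no message at all. The sketch below shows one possible way to do that with `subprocess.run`; the helper name `run_and_report` is hypothetical, and the Python interpreter is used as a stand-in command so the example runs without posgen.

```python
import subprocess
import sys

def run_and_report(cmd):
    """Run `cmd` (a list of strings) and return (returncode, stdout, stderr)."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised when the executable itself cannot be found
        return None, "", "executable not found: " + cmd[0]
    return result.returncode, result.stdout, result.stderr

# Stand-in command; for posgen this would be
# run_and_report(["./posgen", xml_input])
code, out, err = run_and_report([sys.executable, "-c", "print('hello')"])
if code == 0:
    print("process completed successfully")
else:
    print("process failed with return code", code)
    print(err)
```

A return code of `None` here means the executable was not found, which usually points to a wrong working directory.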


The generated clustered .pos file (in this case cluster.pos) can now be loaded into Python variables.

In [73]:
def chunk_read_pos(filename, sample = 10):

    """

    A Python reader for the .pos APT data file.
    Reads the file in chunks (one 16-byte row at a time) to handle large file sizes.
    Also samples ions (sample rate defined by `sample`).
    Each row holds four 32-bit floats, so once unpacked the 1D list d
    contains the following elements:

    'x': d[0::4], reconstructed x-coordinate of ion
    'y': d[1::4], reconstructed y-coordinate of ion
    'z': d[2::4], reconstructed z-coordinate of ion
    'Da': d[3::4], list of Da (m/q) values

    :param filename: The .pos filename
    :param sample: The sample rate (defaults to one in every 10)
    :returns:
        sx
        sy
        sz
        Da
        returncode (0 on success, 1 for a wrong file extension, 2 if the file could not be opened)

    """

    if filename.split(".")[-1] != "pos":
        print("Incorrect filetype detected (based off name). Method only supports .pos file types. Please call appropriate reader.")
        print("exiting")
        return [], [], [], [], 1

    # open the file for binary reading
    try:
        f = open(filename, "rb")
    except OSError:
        print("Could not open or read file: " + filename + "\n Please make sure the correct file (.pos) and location has been passed.")
        print("exiting")
        return [], [], [], [], 2

    d = []

    with f: # ensures the file is closed when reading finishes
        # get number of bytes within the .pos
        f.seek(0, 2) # move the cursor to the end of the file
        n = f.tell()
        f.seek(0) # move the cursor to the beginning of the file

        rows = n // 16 # number of complete rows (4 floats x 4 bytes per row)

        # read 16 bytes at a time, i.e. row by row (while sampling rows)
        for a in range(0, rows, sample):
            f.seek(a * 16)
            byte = f.read(16)
            if len(byte) < 16:
                break # incomplete row: stop reading
            d.extend(struct.unpack('>ffff', byte))
                        # '>' denotes 'big-endian' byte order

    sx = d[::4]
    sy = d[1::4]
    sz = d[2::4]
    Da = d[3::4]
    
    return sx, sy, sz, Da, 0
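
For files that fit comfortably in memory, the same row layout (four big-endian 32-bit floats per ion) can also be read in one call with NumPy. This is an alternative sketch and not part of posgen; the function name `read_pos_numpy` is hypothetical, and unlike the chunked reader it loads the whole file before sampling.

```python
import numpy as np

def read_pos_numpy(filename, sample=1):
    """Read a .pos file into four 1D arrays, keeping every `sample`-th ion."""
    # '>f4' = big-endian 32-bit float, matching the '>' + 'ffff' struct format
    data = np.fromfile(filename, dtype=">f4")
    n_rows = data.size // 4  # ignore any incomplete trailing row
    rows = data[: n_rows * 4].reshape(n_rows, 4)[::sample]
    rows = rows.astype(np.float32)  # convert to the machine's native byte order
    return rows[:, 0], rows[:, 1], rows[:, 2], rows[:, 3]
```

It could be called as `sx, sy, sz, Da = read_pos_numpy("uncluster.pos", sample=1)`. The result is NumPy arrays rather than lists, and there is no return code, so a missing file raises an exception instead.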


Finally, the function can be called to load a .pos file.

In [77]:
pos_file = "uncluster.pos"
sample = 1 # sampling step: keep one ion in every `sample` rows (1 reads in all ions)

In [78]:
sx, sy, sz, Da, returncode = chunk_read_pos(pos_file, sample = sample)

In [80]:
print(sx[:10], sy[:10], sz[:10], Da[:10])

[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] [0.0, 0.4050000011920929, 0.8100000023841858, 1.215000033378601, 1.6200000047683716, 2.0250000953674316, 2.430000066757202, 2.8350000381469727, 3.240000009536743, 3.6449999809265137] [1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
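
The returned values are plain Python lists. As a quick sanity check, they can be stacked into a single array and the ions counted per m/q value. The sketch below uses a few made-up values rather than the contents of a real file, and `summarise_ions` is a hypothetical helper name.

```python
import numpy as np

def summarise_ions(sx, sy, sz, Da):
    """Return an (N, 4) array of ions and a dict of ion counts per m/q value."""
    ions = np.column_stack([sx, sy, sz, Da])
    values, counts = np.unique(ions[:, 3], return_counts=True)
    return ions, dict(zip(values.tolist(), counts.tolist()))

# Illustrative values only (not read from a file)
ions, counts = summarise_ions([0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                              [0.0, 0.405, 0.81], [1.0, 2.0, 1.0])
print(ions.shape, counts)
```

Counting by exact value is only meaningful when the m/q column holds a small set of discrete values, as it does in the output above. Measured mass spectra would need binning instead.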
