(how-to-list_files)=
# List dat files 
ApRES data is stored in binary files with the extension `.dat`. 
One of the main purposes of xapres is to load these files into a format that is easy to work with in Python: xarray datasets. 

A class called `from_dats` is included in the `xapres.load` module to handle loading .dat files. 

First we load the module:

In [19]:
import sys
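# Make a local checkout of xapres importable; adjust or drop this line if xapres is already on your Python path
sys.path.append("../../../xapres") 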
import xapres as xa

A `from_dats` object is initialized with one optional parameter, `loglevel`, which can be set to `"DEBUG"` to print more information about the processing, or omitted to suppress this output.

In [20]:
fd = xa.load.from_dats()


The method `list_files` lists all the dat files in a given directory or cloud-based file-like location. It searches recursively through the file structure beneath the directory you supply to it and produces a list of all files it finds with the extension `.dat` or `.DAT`. 

This method is useful for checking which files, and how many, you are dealing with. It is also used internally by other methods when loading and concatenating data to produce an xarray dataset. 
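
To make the idea of a recursive search concrete, here is a minimal sketch of the same behaviour for local directories, using only the standard library. It is not how xapres implements `list_files`, and the helper name `find_dat_files` is hypothetical.

```python
from pathlib import Path


def find_dat_files(directory="."):
    """Recursively collect files ending in .dat or .DAT.

    Illustrative helper only; it is not part of xapres.
    """
    root = Path(directory)
    matches = [
        str(path)
        for path in root.rglob("*")  # walk every level beneath `root`
        if path.is_file() and path.suffix in (".dat", ".DAT")
    ]
    return sorted(matches)
```

Calling `find_dat_files('../data/')` would return a sorted list of path strings, similar in spirit to the list returned by `fd.list_files(directory='../data/')`.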

## Local files

If you have dat files in your current directory, you can simply run `filepaths = fd.list_files()` to produce a list of the dat files.

Or, to find the dat files in a specific local directory, you can run:

In [21]:
filepaths = fd.list_files(directory = '../data/')
filepaths 

['../data/thwaites/DATA2023-02-12-0437.DAT']

We found 1 dat file in the directory `../data/`.

## Remote files
We can also list dat files stored in Google Cloud Storage (i.e., a Google bucket).

```{note}
We will add the capability to use AWS S3 storage in the future.
```

To list files in a Google bucket, provide the path to the bucket directory (specifically, its 'gsutil URI') as the first argument, `directory`. The following cell produces a list of all the dat files in the bucket directory and prints the first five.

In [22]:
filepaths = fd.list_files(directory='gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG')
filepaths[0:5]

['gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG/SD1/DIR2000-01-04-2210/DATA2000-01-04-2210.DAT',
 'gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG/SD1/DIR2000-01-04-2221/DATA2000-01-04-2221.DAT',
 'gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG/SD1/DIR2023-01-15-2304/DATA2023-01-15-2304.DAT',
 'gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG/SD1/DIR2023-01-15-2330/DATA2023-01-15-2330.DAT',
 'gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG/SD1/DIR2023-01-16-0051/DATA2023-01-16-0051.DAT']
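
Because the result is a plain list of path strings, it can be summarized with ordinary Python. The sketch below counts files by the year that appears in each file name. It assumes that names follow the `DATAYYYY-MM-DD-HHMM.DAT` pattern seen in the output above; the helper `count_files_by_year` is hypothetical and not part of xapres.

```python
import posixpath
import re
from collections import Counter

# The five paths printed above, reused here as sample input.
base = "gs://ldeo-glaciology/apres/thwaites/continuous/ApRES_LTG/SD1"
sample_paths = [
    f"{base}/DIR2000-01-04-2210/DATA2000-01-04-2210.DAT",
    f"{base}/DIR2000-01-04-2221/DATA2000-01-04-2221.DAT",
    f"{base}/DIR2023-01-15-2304/DATA2023-01-15-2304.DAT",
    f"{base}/DIR2023-01-15-2330/DATA2023-01-15-2330.DAT",
    f"{base}/DIR2023-01-16-0051/DATA2023-01-16-0051.DAT",
]


def count_files_by_year(paths):
    """Count paths by the four-digit year that follows 'DATA' in the file name."""
    # Assumed naming pattern: DATAYYYY-MM-DD-HHMM.DAT (case-insensitive)
    pattern = re.compile(r"DATA(\d{4})-\d{2}-\d{2}-\d{4}\.dat$", re.IGNORECASE)
    counts = Counter()
    for path in paths:
        match = pattern.search(posixpath.basename(path))
        counts[match.group(1) if match else "unknown"] += 1
    return dict(counts)


print(count_files_by_year(sample_paths))  # {'2000': 2, '2023': 3}
```

`posixpath` is used rather than `os.path` so that the forward-slash bucket URIs are split the same way on every operating system.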

## Search suffix
`list_files` also takes an optional argument, `search_suffix`, which can be used to search for files with a specific suffix. This is useful when the .dat files were collected as part of a polarimetric radar survey, where the antennas are rotated to different orientations for each measurement. The user typically names the dat files with HH, HV, VH, or VV to signify the orientation of the antennas used. 

In [None]:
directory = "gs://ldeo-glaciology/apres/thwaites/2022-2023/Polarmetric"
all_polarimetric_files = fd.list_files(directory=directory)
just_HV_files = fd.list_files(directory=directory, search_suffix='HV')
print("")
print(f"We found {len(all_polarimetric_files)} files in {directory}; only {len(just_HV_files)} had an HV suffix")


We found 186 files in gs://ldeo-glaciology/apres/thwaites/2022-2023/Polarmetric; only 45 had an HV suffix
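
The effect of a suffix search can be approximated on any list of paths. The sketch below assumes that the suffix sits immediately before the `.dat` extension; how xapres matches the suffix internally is not shown here and may differ. The helper `filter_by_suffix` and the example file names are hypothetical.

```python
import posixpath


def filter_by_suffix(paths, suffix):
    """Keep .dat/.DAT paths whose file name ends with `suffix` before the extension.

    Illustrative helper only; it is not part of xapres.
    """
    kept = []
    for path in paths:
        stem, ext = posixpath.splitext(posixpath.basename(path))
        if ext.lower() == ".dat" and stem.endswith(suffix):
            kept.append(path)
    return kept


# Hypothetical file names following the HH/HV/VH/VV naming convention
example_paths = [
    "gs://example-bucket/survey/site1_HH.DAT",
    "gs://example-bucket/survey/site1_HV.DAT",
    "gs://example-bucket/survey/site1_VH.dat",
    "gs://example-bucket/survey/site1_VV.dat",
]

hv_only = filter_by_suffix(example_paths, "HV")
```

Here `hv_only` contains only the `site1_HV.DAT` path, since the other stems end in HH, VH, or VV.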


## Summary
Loading .dat files is handled by a `from_dats` object in the `xapres.load` module. Its `list_files` method lists all the dat files in a local directory or a Google bucket. 
