# Example for OUT File Reader


In [1]:
from swmm_api import read_out_file, SwmmOutput

In [2]:
out = read_out_file('epaswmm5_apps_manual/Example7-Final.out')

This is equivalent to:

In [3]:
out = SwmmOutput('epaswmm5_apps_manual/Example7-Final.out')

# Available Data in the file

## Which variables are present in the out file for each object type?
The `variables` attribute returns a dictionary with the object type as the key and a list of variable names as the value.

The structure of this dictionary will be consistent across all out files, unless pollutants are introduced. In that case, the pollutants will be appended to the link and node variables.

In [4]:
out.variables

{'subcatchment': ['rainfall',
  'snow_depth',
  'evaporation',
  'infiltration',
  'runoff',
  'groundwater_outflow',
  'groundwater_elevation',
  'soil_moisture'],
 'node': ['depth',
  'head',
  'volume',
  'lateral_inflow',
  'total_inflow',
  'flooding'],
 'link': ['flow', 'depth', 'velocity', 'volume', 'capacity'],
 'pollutant': [],
 'system': ['air_temperature',
  'rainfall',
  'snow_depth',
  'infiltration',
  'runoff',
  'dry_weather_inflow',
  'groundwater_inflow',
  'RDII_inflow',
  'direct_inflow',
  'lateral_inflow',
  'flooding',
  'outflow',
  'volume',
  'evaporation',
  'PET']}

## What are the object labels in the out file for each object type?
The `labels` attribute returns a dictionary with the object type as the key and a list of object labels as the value.

In [5]:
out.labels

{'subcatchment': ['S1', 'S2', 'S3', 'S4', 'S5', 'S6', 'S7'],
 'node': ['J1',
  'J2a',
  'J2',
  'J3',
  'J4',
  'J5',
  'J6',
  'J7',
  'J8',
  'J9',
  'J10',
  'J11',
  'Aux1',
  'Aux2',
  'Aux3',
  'O1'],
 'link': ['C2a',
  'C2',
  'C3',
  'C4',
  'C5',
  'C6',
  'C7',
  'C8',
  'C9',
  'C10',
  'C11',
  'C_Aux1',
  'C_Aux2',
  'C_Aux1to2',
  'C_Aux3',
  'P1',
  'P2',
  'P3',
  'P4',
  'P5',
  'P6',
  'P7',
  'P8'],
 'pollutant': [],
 'system': ['']}

In [6]:
out.flow_unit

'CFS'

In [7]:
out.pollutant_units

{}

In [8]:
out.n_periods

720

In [9]:
out.report_interval

datetime.timedelta(seconds=60)

In [10]:
out.run_failed

False

In [11]:
out.swmm_version

51015

In [12]:
out.start_date

datetime.datetime(2007, 1, 1, 0, 1)
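
The attributes `start_date`, `report_interval`, and `n_periods` together describe the time axis of the results. The following is a minimal sketch, assuming (as the outputs in this notebook suggest) that `start_date` is the timestamp of the first reporting period. The values are copied from the cells above rather than read from the file, and the name `timestamps` is only illustrative.

```python
import datetime

# Values copied from the outputs above (not read from the out file here).
start_date = datetime.datetime(2007, 1, 1, 0, 1)
report_interval = datetime.timedelta(seconds=60)
n_periods = 720

# One timestamp per reporting period, starting at start_date.
timestamps = [start_date + i * report_interval for i in range(n_periods)]

print(timestamps[0])   # first reporting period
print(timestamps[-1])  # last reporting period
```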

To extract a subset of the objects' model parameters, use the `model_properties` attribute, which returns a nested dictionary.
The object type is the first key, the label is the second key, and the parameter is the third key, which maps to the parameter value.

In [13]:
out.model_properties

{'subcatchment': {'S1': {'area': 4.550000190734863},
  'S2': {'area': 4.739999771118164},
  'S3': {'area': 3.740000009536743},
  'S4': {'area': 6.789999961853027},
  'S5': {'area': 4.789999961853027},
  'S6': {'area': 1.9800000190734863},
  'S7': {'area': 2.3299999237060547}},
 'node': {'J1': {'type': 'JUNCTION',
   'invert': 4969.0,
   'max_depth': 5.300000190734863},
  'J2a': {'type': 'JUNCTION',
   'invert': 4966.7001953125,
   'max_depth': 5.300000190734863},
  'J2': {'type': 'JUNCTION', 'invert': 4965.0, 'max_depth': 5.300000190734863},
  'J3': {'type': 'JUNCTION', 'invert': 4973.0, 'max_depth': 3.0},
  'J4': {'type': 'JUNCTION', 'invert': 4965.0, 'max_depth': 9.0},
  'J5': {'type': 'JUNCTION', 'invert': 4965.7998046875, 'max_depth': 7.0},
  'J6': {'type': 'JUNCTION', 'invert': 4969.0, 'max_depth': 3.5},
  'J7': {'type': 'JUNCTION', 'invert': 4963.5, 'max_depth': 11.0},
  'J8': {'type': 'JUNCTION', 'invert': 4966.5, 'max_depth': 3.5},
  'J9': {'type': 'JUNCTION', 'invert': 4964.79

## What is the total number of columns present in the out file?
The total is the sum, over all object types, of the number of objects of that type multiplied by the number of variables of that type. For this file: 7 × 8 (subcatchments) + 16 × 6 (nodes) + 23 × 5 (links) + 1 × 15 (system) = 282.

In [14]:
out.number_columns

282
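
As a cross-check, the column count can be recomputed from the two dictionaries shown earlier. This is only a sketch: the dictionaries below are copied from the `out.labels` and `out.variables` outputs above, and it assumes each object contributes one column per variable of its type.

```python
# Copied from the out.labels output above.
labels = {
    'subcatchment': ['S1', 'S2', 'S3', 'S4', 'S5', 'S6', 'S7'],
    'node': ['J1', 'J2a', 'J2', 'J3', 'J4', 'J5', 'J6', 'J7', 'J8', 'J9',
             'J10', 'J11', 'Aux1', 'Aux2', 'Aux3', 'O1'],
    'link': ['C2a', 'C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'C8', 'C9', 'C10',
             'C11', 'C_Aux1', 'C_Aux2', 'C_Aux1to2', 'C_Aux3',
             'P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'],
    'pollutant': [],
    'system': [''],
}

# Copied from the out.variables output above.
variables = {
    'subcatchment': ['rainfall', 'snow_depth', 'evaporation', 'infiltration',
                     'runoff', 'groundwater_outflow', 'groundwater_elevation',
                     'soil_moisture'],
    'node': ['depth', 'head', 'volume', 'lateral_inflow', 'total_inflow',
             'flooding'],
    'link': ['flow', 'depth', 'velocity', 'volume', 'capacity'],
    'pollutant': [],
    'system': ['air_temperature', 'rainfall', 'snow_depth', 'infiltration',
               'runoff', 'dry_weather_inflow', 'groundwater_inflow',
               'RDII_inflow', 'direct_inflow', 'lateral_inflow', 'flooding',
               'outflow', 'volume', 'evaporation', 'PET'],
}

# Sum of (objects per type) x (variables per type) over all object types.
number_columns = sum(len(labels[kind]) * len(variables[kind]) for kind in labels)
print(number_columns)  # 282
```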

# Retrieving the data
## Retrieving the Data in Numpy Array Format

This method involves reading the entire file and storing the data in memory as a numpy array. This numpy array will later be used to create a pandas dataframe.

In [15]:
type(out.to_numpy())

numpy.ndarray

## Retrieving the data in Pandas Dataframe Format

This method creates a pandas dataframe with all the data in the file. The columns are organized as a multi-index with three levels: object type, object label, and variable (see [pandas-docs/stable/user_guide/advanced indexing](https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html)).

Use this method when most of the columns are needed, or when the out file contains only a few columns. If the out file has a large number of columns and only a small subset is required, the `get_part` function introduced below is the better choice.

In [16]:
out.to_frame()

,subcatchment,subcatchment,subcatchment,subcatchment,subcatchment,subcatchment,subcatchment,subcatchment,subcatchment,subcatchment,...,system,system,system,system,system,system,system,system,system,system
,S1,S1,S1,S1,S1,S1,S1,S1,S2,S2,...,,,,,,,,,,
,rainfall,snow_depth,evaporation,infiltration,runoff,groundwater_outflow,groundwater_elevation,soil_moisture,rainfall,snow_depth,...,dry_weather_inflow,groundwater_inflow,RDII_inflow,direct_inflow,lateral_inflow,flooding,outflow,volume,evaporation,PET
2007-01-01 00:01:00,1.00,0.0,0.0,0.43200,0.088076,0.0,0.0,0.0,1.00,0.0,...,0.0,0.0,0.0,0.0,0.563977,0.0,0.003971,14.105279,0.0,0.0
2007-01-01 00:02:00,1.00,0.0,0.0,0.43200,0.233792,0.0,0.0,0.0,1.00,0.0,...,0.0,0.0,0.0,0.0,1.466747,0.0,0.022646,57.129456,0.0,0.0
2007-01-01 00:03:00,1.00,0.0,0.0,0.43200,0.368483,0.0,0.0,0.0,1.00,0.0,...,0.0,0.0,0.0,0.0,2.281808,0.0,0.085892,158.937866,0.0,0.0
2007-01-01 00:04:00,1.00,0.0,0.0,0.43200,0.531651,0.0,0.0,0.0,1.00,0.0,...,0.0,0.0,0.0,0.0,3.297160,0.0,0.287618,309.448669,0.0,0.0
2007-01-01 00:05:00,1.14,0.0,0.0,0.49248,0.973276,0.0,0.0,0.0,1.14,0.0,...,0.0,0.0,0.0,0.0,6.085905,0.0,0.730127,536.387329,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2007-01-01 11:56:00,0.00,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.00,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.0,0.000612,8.957918,0.0,0.0
2007-01-01 11:57:00,0.00,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.00,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.0,0.000610,8.935231,0.0,0.0
2007-01-01 11:58:00,0.00,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.00,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.0,0.000608,8.912636,0.0,0.0
2007-01-01 11:59:00,0.00,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.00,0.0,...,0.0,0.0,0.0,0.0,0.000000,0.0,0.000605,8.890132,0.0,0.0
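
Selecting from a frame with three-level columns uses standard pandas indexing. The sketch below builds a small stand-in frame with toy numbers (not values from the out file) so that it runs without the example file. The level names `kind`, `label`, and `variable` are assumptions made for illustration and may differ from the names in the real frame.

```python
import pandas as pd

# Toy stand-in for out.to_frame(): columns are (object type, label, variable).
index = pd.date_range("2007-01-01 00:01", periods=3, freq="min")
columns = pd.MultiIndex.from_tuples(
    [
        ("link", "C2", "flow"),
        ("node", "J1", "depth"),
        ("node", "J1", "head"),
        ("node", "J2", "head"),
    ],
    names=["kind", "label", "variable"],  # hypothetical level names
)
df = pd.DataFrame(
    [
        [0.0, 0.0, 10.0, 20.0],
        [0.1, 0.5, 10.5, 20.1],
        [0.2, 1.0, 11.0, 20.2],
    ],
    index=index,
    columns=columns,
)

# A full three-part key selects a single column as a Series.
head_j1 = df[("node", "J1", "head")]

# Selecting level by level returns all variables of one object.
node_j1 = df["node"]["J1"]

# xs picks one variable across all objects on the last level.
all_heads = df.xs("head", axis=1, level="variable")

print(list(node_j1.columns))
print(list(all_heads.columns))
```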


## Retrieving a Subset of Data in Pandas Dataframe or Series Format

To retrieve a specific subset of the data and save computation time, use `get_part`.

When a specific object type, label, and variable are specified, the function returns a pandas series.
On the other hand, when a list of labels or a list of variables is provided or when the label or variable is set to `None`, a pandas dataframe will be returned.

In [17]:
out.get_part('node', 'J1', 'head').to_frame()

,node/J1/head
2007-01-01 00:01:00,4969.000000
2007-01-01 00:02:00,4969.001953
2007-01-01 00:03:00,4969.026855
2007-01-01 00:04:00,4969.064941
2007-01-01 00:05:00,4969.108887
...,...
2007-01-01 11:56:00,4969.001465
2007-01-01 11:57:00,4969.001465
2007-01-01 11:58:00,4969.001465
2007-01-01 11:59:00,4969.001465


The object types and their variables are predefined as objects so that IDEs can offer auto-completion.

Pollutants are the exception, because they differ from model to model.

In [18]:
from swmm_api.output_file import VARIABLES, OBJECTS
out.get_part(OBJECTS.NODE, 'J1', VARIABLES.NODE.HEAD).to_frame()

,node/J1/head
2007-01-01 00:01:00,4969.000000
2007-01-01 00:02:00,4969.001953
2007-01-01 00:03:00,4969.026855
2007-01-01 00:04:00,4969.064941
2007-01-01 00:05:00,4969.108887
...,...
2007-01-01 11:56:00,4969.001465
2007-01-01 11:57:00,4969.001465
2007-01-01 11:58:00,4969.001465
2007-01-01 11:59:00,4969.001465


In [19]:
# to get all data of a node, just remove the variable part
out.get_part(OBJECTS.NODE, 'J1')

,depth,head,volume,lateral_inflow,total_inflow,flooding
2007-01-01 00:01:00,0.000000,4969.000000,0.0,0.0,0.000000,0.0
2007-01-01 00:02:00,0.002062,4969.001953,0.0,0.0,0.001879,0.0
2007-01-01 00:03:00,0.026991,4969.026855,0.0,0.0,0.019385,0.0
2007-01-01 00:04:00,0.064753,4969.064941,0.0,0.0,0.063515,0.0
2007-01-01 00:05:00,0.109040,4969.108887,0.0,0.0,0.138886,0.0
...,...,...,...,...,...,...
2007-01-01 11:56:00,0.001290,4969.001465,0.0,0.0,0.000057,0.0
2007-01-01 11:57:00,0.001286,4969.001465,0.0,0.0,0.000057,0.0
2007-01-01 11:58:00,0.001282,4969.001465,0.0,0.0,0.000057,0.0
2007-01-01 11:59:00,0.001278,4969.001465,0.0,0.0,0.000056,0.0


# How to Convert Out File to a Different File Format?

To convert the out file to a different file format, you can first create a pandas dataframe using one of the functions mentioned above.
Then, you can utilize one of the built-in functions of pandas to save the dataframe to the desired file format.

Pandas provides various functions for reading and writing data, documented at the following links:
- [pandas-docs/stable/reference/io](https://pandas.pydata.org/pandas-docs/stable/reference/io.html)
- [pandas-docs/stable/user_guide/io#writing-out-data](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#writing-out-data)

For this type of data, we recommend using the parquet format, which not only saves storage space but is also faster to read.
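
A possible round trip is sketched below. The frame is a small stand-in holding the first three `depth` and `head` values of node `J1` shown above, so the snippet runs without the example file. `save_frame` and `load_frame` are hypothetical helper names, not part of swmm_api. `DataFrame.to_parquet` needs an installed engine (pyarrow or fastparquet), so the sketch falls back to CSV when none is available.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for a frame such as the one returned by out.get_part(OBJECTS.NODE, 'J1').
index = pd.date_range("2007-01-01 00:01", periods=3, freq="min")
df = pd.DataFrame(
    {"depth": [0.0, 0.002062, 0.026991],
     "head": [4969.0, 4969.001953, 4969.026855]},
    index=index,
)


def save_frame(frame, folder, stem):
    """Write to parquet if an engine is installed, otherwise to CSV."""
    try:
        path = Path(folder) / f"{stem}.parquet"
        frame.to_parquet(path)
    except ImportError:
        # Raised by pandas when neither pyarrow nor fastparquet is available.
        path = Path(folder) / f"{stem}.csv"
        frame.to_csv(path)
    return path


def load_frame(path):
    """Read back a file written by save_frame."""
    if path.suffix == ".parquet":
        return pd.read_parquet(path)
    return pd.read_csv(path, index_col=0, parse_dates=True)


with tempfile.TemporaryDirectory() as tmp:
    path = save_frame(df, tmp, "J1")
    restored = load_frame(path)

print(path.suffix)
print(restored)
```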

# Reading just a few columns of a huge out file

If you have a large out file, several GB in size, and need only a few columns from it, you can save time and memory by passing `slim=True` to `get_part`. This reads only the requested columns instead of the entire dataset.

Note that, because of how the function is implemented, this approach is faster only when the out file is very large. Try both variants to find out which works better in your case.

If you haven't installed tqdm or don't want to see the progress bar, pass `show_progress=False`.

In [20]:
out.get_part(OBJECTS.NODE, 'J1', VARIABLES.NODE.HEAD, slim=True)

2007-01-01 00:01:00    4969.000000
2007-01-01 00:02:00    4969.001953
2007-01-01 00:03:00    4969.026855
2007-01-01 00:04:00    4969.064941
2007-01-01 00:05:00    4969.108887
                          ...     
2007-01-01 11:56:00    4969.001465
2007-01-01 11:57:00    4969.001465
2007-01-01 11:58:00    4969.001465
2007-01-01 11:59:00    4969.001465
2007-01-01 12:00:00    4969.001465
Freq: T, Name: node/J1/head, Length: 720, dtype: float64
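
The general idea of reading a single column from a binary file of fixed-size records can be sketched as follows. This is not swmm_api's implementation and does not use the real out-file layout: the toy format (rows of little-endian float32 values with no header) and the helpers `write_toy_file` and `read_column` are assumptions made for illustration. It shows one plausible reason for the trade-off described above: a selective read performs one small seek and read per time step, which only pays off when the alternative is loading a very large file in full.

```python
import struct
import tempfile
from pathlib import Path


def write_toy_file(path, rows):
    """Write rows of equal length as consecutive little-endian float32 values."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{len(row)}f", *row))


def read_column(path, n_cols, col):
    """Read one column by seeking to its 4-byte slot in every record."""
    record_size = 4 * n_cols
    n_rows = Path(path).stat().st_size // record_size
    values = []
    with open(path, "rb") as f:
        for row in range(n_rows):
            f.seek(row * record_size + 4 * col)
            values.append(struct.unpack("<f", f.read(4))[0])
    return values


rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy.bin"
    write_toy_file(path, rows)
    column = read_column(path, n_cols=3, col=1)

print(column)  # [2.0, 5.0, 8.0]
```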