# AllInOne Files

### Faking nc=0 for some spaces:

Faking is done at the **ens** and **ensAvg** levels (and directories), not at the **segment** and **whole** levels, so the **segment stamps** are not modified, but the **whole** and **ens** ones are.

### To-do list
- [x] **space_tseries** function for all the timeseries in one **space**.
- [ ] **space** function for all the distributions in one **space**.
- [ ] **all-in-one** function for all the timeseries in all **space**s in a **project**.
- [ ] **all-in-one** function for all the distributions in all **space**s in a **project**.

### Naming convention:

This is the pattern of file or directory names:

1. **whole** files: whole-group-property_[-measure][-stage][.ext]
2. **ensemble** files: ensemble-group-property_[-measure][-stage][.ext]
3. **ensemble_long** files: ensemble_long-group-property_[-measure][-stage][.ext]
4. **space** files: space-group-property_[-measure][-stage][.ext]
5. **all in one** files: space-group-**species**-**allInOne**-property_[-measure][-stage][.ext]

[keyword] means that the keyword in the file name is optional. [-measure] is a physical measurement, such as the autocorrelation function (ACF), done on the physical 'property_'.
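
As a rough illustration of the pattern above, the sketch below assembles a file name from its parts and recovers the leading part again. The helper `build_name` and the example values (`N200D10.0ac1.0`, `bug`, `gyrTMon`) are hypothetical and are not part of polyphys; the sketch only mirrors the `space-group-property_[-measure][-stage][.ext]` layout.

```python
def build_name(level, group, property_, measure=None, stage=None, ext=None):
    """Join the name parts with '-' and skip the optional ones when absent."""
    parts = [level, group, property_]
    if measure is not None:
        parts.append(measure)
    if stage is not None:
        parts.append(stage)
    name = "-".join(parts)
    if ext is not None:
        name += "." + ext
    return name


# Hypothetical example values, chosen only to show the layout
full_name = build_name(
    "N200D10.0ac1.0", "bug", "gyrTMon", measure="acf", stage="ensAvg", ext="csv"
)
short_name = build_name("N200D10.0ac1.0", "bug", "gyrTMon")
# The leading part (here, the space) is everything before the first '-'
space = full_name.split("-")[0]
print(full_name)
print(short_name)
print(space)
```

Because the parts are joined with `-`, this layout only parses unambiguously when the individual parts contain no `-` themselves.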

## CAUTION: How to do:

1. Choose the project setting
2. Choose the space
3. Check the unique properties, and the ACF measures per unique properties.
4. Check the name, format, and extension of the Pandas writer.

#### Imports

In [None]:
from glob import glob
import pandas as pd
import numpy as np
from polyphys.manage import organizer
from polyphys.manage.parser import SumRule, TransFoci
from polyphys.analyze import measurer
import polyphys.api as api
import warnings
warnings.filterwarnings('ignore')

# List of physical properties: Set the project hierarchy
project = 'SumRule'
#project = 'TransFoci'
project_details ={
    'SumRule':{
        'parser': SumRule,
        'space_pat': 'N*D*ac*',
        'hierarchy': 'N*',
        'space_hierarchy': 'N*',
        'attributes': ['space', 'ensemble_long', 'ensemble', 'nmon', 'dcyl',
                       'dcrowd','phi_c_bulk'
                      ],
        'time_varying_props': [ 'asphericityTMon', 'fsdTMon', 'gyrTMon',
                               'rfloryTMon','shapeTMon'],
        'equil_measures': [np.mean, np.var, measurer.sem],
        'equil_attributes': ['space', 'ensemble_long', 'ensemble', 'nmon',
                             'dcyl','dcrowd', 'phi_c_bulk', 
                             'phi_c_bulk_round'
                            ],
        'equil_properties': ['asphericityMon-mean', 'asphericityMon-var',
                             'asphericityMon-sem', 'fsdMon-mean',
                             'fsdMon-var', 'fsdMon-sem', 'gyrMon-mean',
                             'gyrMon-var', 'gyrMon-sem', 'rfloryMon-mean',
                             'rfloryMon-var', 'rfloryMon-sem',
                             'shapeMon-mean', 'shapeMon-var', 'shapeMon-sem']
    },
    'TransFoci':{
        'parser': TransFoci,
        'space_pat': 'ns*nl*al*D*ac*',
        'hierarchy': 'eps*',
        'space_hierarchy': 'ns*',
        'attributes': ['space', 'ensemble_long', 'ensemble', 'nmon_small',
                       'nmon_large','dmon_large', 'dcyl', 'dcrowd',
                       'phi_c_bulk'
                      ],
        'time_varying_props': ['asphericityTMon', 'fsdTMon', 'gyrTMon',
                               'shapeTMon'],
        'equil_measures': [np.mean, np.var, measurer.sem],
        'equil_attributes': ['ensemble_long', 'ensemble', 'space', 'dcyl',
                             'dmon_large', 'nmon_large', 'nmon_small',
                             'dcrowd', 'phi_c_bulk', 'phi_c_bulk_round'],
        'equil_properties': ['asphericityMon-mean', 'asphericityMon-var',
                             'asphericityMon-sem', 'fsdMon-mean',
                             'fsdMon-var', 'fsdMon-sem', 'gyrMon-mean',
                             'gyrMon-var', 'gyrMon-sem', 'shapeMon-mean',
                             'shapeMon-var', 'shapeMon-sem']
    }
}

## allInOne *whole* and *ensAvg* stamps per project

### **ensemble-averaged** stamps per project:

In [None]:
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
phase="ensAvg"
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
allInOne_stamps = []
for space_db in ens_avg_space_dbs:
    stamp_path = project_details[project]['space_hierarchy'] + 'stamps*' 
    stamp_path = glob(space_db + "/" + stamp_path + '.csv')[0]
    space_stamps = pd.read_csv(stamp_path)
    allInOne_stamps.append(space_stamps)
allInOne_stamps = pd.concat(allInOne_stamps, axis=0)
allInOne_stamps.reset_index(inplace=True, drop=True)
output = analysis_db + "allInOne-" + project + "-stamps-" + phase + ".csv"
allInOne_stamps.to_csv(output, index=False)
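
The directory selection used in these cells relies only on the suffix `group-phase` of each space directory. A minimal sketch of that filter follows, with made-up directory names (the paths below are assumptions for illustration, not actual analysis directories):

```python
group, phase = "bug", "ensAvg"
# Hypothetical space directories, as glob might return them
space_dbs = [
    "/data/N200D10.0ac1.0-bug-ensAvg",
    "/data/N200D10.0ac1.0-bug-ens",
    "/data/N200D10.0ac1.0-all-ensAvg",
]
# Keep only the directories whose names end with '<group>-<phase>'
selected = [db for db in space_dbs if db.endswith(group + "-" + phase)]
# The space name is the part of the directory name before the first '-'
spaces = [db.split("/")[-1].split("-")[0] for db in selected]
print(selected)
print(spaces)
```

Note that a directory ending in `bug-ens` is not picked up when `phase` is `ensAvg`, since `endswith` compares the full suffix.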

### **whole** stamps per project

In [None]:
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
phase='ens'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
phase_space_dbs = [
    space_db for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
allInOne_stamps = []
for space_db in phase_space_dbs:
    stamp_path = project_details[project]['space_hierarchy'] + 'stamps*'
    stamp_path = glob(space_db + "/" + stamp_path + '.csv')[0]
    space_stamps = pd.read_csv(stamp_path)
    allInOne_stamps.append(space_stamps)
allInOne_stamps = pd.concat(allInOne_stamps, axis=0)
allInOne_stamps.reset_index(inplace=True, drop=True)
output = analysis_db + "allInOne-" + project + "-stamps-" + phase +".csv"
allInOne_stamps.to_csv(output, index=False)

## **ensAvg** timeseries and their associated measures 

### **Measures** of chainsize timeseries properties per space

#### **allInOne** ACFs of the chain-size properties per **space**

In [None]:
%%time
# Wall time: 11.8 s for TransFoci
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
geometry = 'biaxial'
phase = 'ensAvg'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
print(ens_avg_space_dbs)
# list of unique property_measures:
filepath = ens_avg_space_dbs[0] + '*' + project_details[project]['hierarchy'] + '.csv'  # physical properties in all the 
_, uniq_props_measures = organizer.unique_property(
    filepath, 2, ["-" + phase], drop_properties=['stamps'])
print(uniq_props_measures)
for ens_avg_space_db in ens_avg_space_dbs:
    ens_avgs = list()
    space = ens_avg_space_db.split('/')[-2].split('-')[0]
    for property_ in uniq_props_measures:
        ens_avg = organizer.space_tseries(
            ens_avg_space_db,
            property_,
            project_details[project]['parser'],
            project_details[project]['hierarchy'],
            project_details[project]['attributes'],
            group,
            geometry,
            is_save = False  # if True, save per property per space
        )
        ens_avgs.append(ens_avg)
    ens_avgs = pd.concat(ens_avgs,axis=1)
    # drop duplicated columns:
    ens_avgs = ens_avgs.loc[:,~ens_avgs.columns.duplicated()]
    output_name = analysis_db +  "-".join(
        [space,  group, "chainSize-acf.parquet.brotli"]
    )
    ens_avgs.to_parquet(output_name, index=False, compression='brotli')
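
The column-wise concatenation followed by dropping duplicated columns can be seen on a toy example. The two tables below are hypothetical stand-ins for what is returned per property; the real column names may differ.

```python
import pandas as pd

# Two hypothetical per-property tables that share the attribute columns
gyr = pd.DataFrame(
    {"space": ["s1", "s1"], "t": [0, 1], "gyrTMon-mean": [2.0, 2.1]}
)
fsd = pd.DataFrame(
    {"space": ["s1", "s1"], "t": [0, 1], "fsdTMon-mean": [5.0, 5.2]}
)

merged = pd.concat([gyr, fsd], axis=1)
# columns.duplicated() flags the second and later occurrences of a label,
# so the negated mask keeps only the first copy of each column
merged = merged.loc[:, ~merged.columns.duplicated()]
print(list(merged.columns))
```

This keeps the first copy of each repeated column without comparing values, so it is only safe when the repeated attribute columns are identical and row-aligned across the per-property tables.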

#### **allInOne** the chain-size properties per **space**

In [None]:
%%time
# Wall time: 14.7 s for TransFoci
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
geometry = 'biaxial'
phase = 'ensAvg'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
print(ens_avg_space_dbs)
# list of unique property_measures:
filepath = ens_avg_space_dbs[0] + '*' + project_details[project]['hierarchy'] + '.csv'  # physical properties in all the 
_, uniq_props_measures = organizer.unique_property(
    filepath, 2, ["-" + phase], drop_properties=['stamps'])
props_tseries = list(
    set(
        [prop.split("-acf")[0] for prop in uniq_props_measures]
    )
)
print(props_tseries)
for ens_avg_space_db in ens_avg_space_dbs:
    ens_avgs = list()
    space = ens_avg_space_db.split('/')[-2].split('-')[0]
    for property_ in props_tseries:
        ens_avg = organizer.space_tseries(
            ens_avg_space_db,
            property_,
            project_details[project]['parser'],
            project_details[project]['hierarchy'],
            project_details[project]['attributes'],
            group,
            geometry,
            is_save = False  # if True, save per property per space
        )
        ens_avgs.append(ens_avg)
    ens_avgs = pd.concat(ens_avgs,axis=1)
    # drop duplicated columns:
    ens_avgs = ens_avgs.loc[:,~ens_avgs.columns.duplicated()]
    output_name = analysis_db +  "-".join(
        [space,  group, "chainSize.parquet.brotli"]
    )
    ens_avgs.to_parquet(output_name, index=False, compression='brotli')

### Pair distance time-series per project

In [None]:
%%time
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
geometry = 'biaxial'
phase = 'ensAvg'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
print(ens_avg_space_dbs)
tseries_foci_props = ['pairDistTFoci']
project_ens_avgs = []
for prop in tseries_foci_props:
    prop_ens_avgs = list()
    for ens_avg_space_db in ens_avg_space_dbs:
        space = ens_avg_space_db.split('/')[-2].split('-')[0]
        ens_avg = organizer.space_tseries(
            ens_avg_space_db,
            prop,
            project_details[project]['parser'],
            project_details[project]['hierarchy'],
            project_details[project]['attributes'],
            group,
            geometry,
            is_save = False  # if True, save per property per space
        )
        prop_ens_avgs.append(ens_avg)
    prop_ens_avgs = pd.concat(prop_ens_avgs,axis=0)
    # drop duplicated columns:
    prop_ens_avgs = prop_ens_avgs.loc[:, ~prop_ens_avgs.columns.duplicated()]
    prop_ens_avgs.reset_index(inplace=True, drop=True)
    project_ens_avgs.append(prop_ens_avgs)
project_ens_avgs = pd.concat(project_ens_avgs,axis=1)
project_ens_avgs = \
    project_ens_avgs.loc[:, ~project_ens_avgs.columns.duplicated()]
project_ens_avgs.reset_index(inplace=True, drop=True)
output ='-'.join(['allInOne', project, group, 'pairDistT.parquet.brotli'])
output = analysis_db + output
project_ens_avgs.to_parquet(output, index=False, compression='brotli')

## Equilibrium timeseries properties per space AND per project

### Whole equilibrium properties allInOne

In [None]:
%%time
# Wall time: 14.7 s for TransFoci
# Wall time: 2min and 22 s for SumRule
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
spaces = glob(analysis_db + project_details[project]['space_pat'])
spaces = list(set([space.split('/')[-1].split('-')[0] for space in spaces]))
save_space = True
equili_props_wholes = api.allInOne_equil_tseries(
    project,
    analysis_db,
    group,
    spaces,
    project_details[project]['time_varying_props'],
    project_details[project]['equil_measures'],
    save_space=save_space,
    divisor = 0.025,
    round_to = 3,
    save_to=analysis_db
)

ens_avg = api.allInOne_equil_tseries_ensAvg(
    project,
    equili_props_wholes,
    group,
    project_details[project]['equil_properties'],
    project_details[project]['equil_attributes'],
    save_to=analysis_db
)

## Distributions

### Clusters and bonds per project

- The histograms of **Clusters** and **bonds** can **not** be combined in **one** dataset.
- Since **per project** datasets are small, we create **one** per project dataset for each property.

In [None]:
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
geometry = 'biaxial'
phase = 'ensAvg'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
print(ens_avg_space_dbs)

nmon_large = 5
hist_t_foci_bin_centers = {
   'bondsHistFoci': np.arange(nmon_large),
   'clustersHistFoci': np.arange(1, nmon_large + 1)
}
# Separate dataset for bonds and clusters per 
for prop, bin_center in hist_t_foci_bin_centers.items():
    ens_avgs = list()
    for ens_avg_space_db in ens_avg_space_dbs:
        space = ens_avg_space_db.split('/')[-2].split('-')[0]
        ens_avg = organizer.space_hists(
            ens_avg_space_db,
            prop,
            project_details[project]['parser'],
            project_details[project]['hierarchy'],
            project_details[project]['attributes'],
            group,
            geometry,
            bin_center=bin_center,
            normalize=True,
            is_save = False
        )
        ens_avgs.append(ens_avg)
    ens_avgs = pd.concat(ens_avgs,axis=0)
    # drop duplicated columns:
    ens_avgs = ens_avgs.loc[:, ~ens_avgs.columns.duplicated()]
    ens_avgs.reset_index(inplace=True, drop=True)
    output =  "-".join(['allInOne', project, group, prop + ".parquet.brotli"])
    output = analysis_db + output
    ens_avgs.to_parquet(output, index=False, compression='brotli')
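
The two histograms use different bin centers, which is consistent with keeping them in separate datasets. The sketch below reproduces the bin centers from the cell above and normalizes some made-up counts; that `normalize=True` means dividing by the total count is an assumption here, not something confirmed from `organizer.space_hists`.

```python
import numpy as np

nmon_large = 5  # same value as in the cell above
bonds_centers = np.arange(nmon_large)            # 0, 1, ..., nmon_large - 1
clusters_centers = np.arange(1, nmon_large + 1)  # 1, 2, ..., nmon_large

# Hypothetical counts, normalized so that the frequencies sum to one
counts = np.array([4, 3, 2, 1, 0])
freqs = counts / counts.sum()
print(bonds_centers, clusters_centers, freqs)
```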

### Pair Distance Statistics per project

- These **properties** can be **combined** in one file per project.
- Since **per project** datasets are small, we create **one** per project dataset for **all** properties.

In [None]:
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'bug'
geometry = 'biaxial'
phase = 'ensAvg'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
print(ens_avg_space_dbs)

hist_foci_props = ['pairDistHistFoci', 'pairDistRdfFoci']
# One per-project database for both properties since they are small and related
project_ens_avgs = []
for prop in hist_foci_props:
    prop_ens_avgs = list()
    for ens_avg_space_db in ens_avg_space_dbs:
        space = ens_avg_space_db.split('/')[-2].split('-')[0]
        ens_avg = organizer.space_hists(
            ens_avg_space_db,
            prop,
            project_details[project]['parser'],
            project_details[project]['hierarchy'],
            project_details[project]['attributes'],
            group,
            geometry,
            bin_center=None,
            normalize=False,
            is_save=False
        )
        prop_ens_avgs.append(ens_avg)
    prop_ens_avgs = pd.concat(prop_ens_avgs,axis=0)
    # drop duplicated columns:
    prop_ens_avgs = prop_ens_avgs.loc[:, ~prop_ens_avgs.columns.duplicated()]
    prop_ens_avgs.reset_index(inplace=True, drop=True)
    project_ens_avgs.append(prop_ens_avgs)
project_ens_avgs = pd.concat(project_ens_avgs,axis=1)
# drop duplicated columns:
project_ens_avgs = project_ens_avgs.loc[:, ~project_ens_avgs.columns.duplicated()]
project_ens_avgs.reset_index(inplace=True, drop=True)
output = '-'.join(
    ['allInOne', project, group, 'pairDistStats.parquet.brotli']
)
output = analysis_db + output
project_ens_avgs.to_parquet(output, index=False, compression='brotli')


### Local radial and longitudinal distributions

In [None]:
analysis_db = '/Users/amirhsi_mini/research_data/analysis/'
group = 'all'
geometry = 'biaxial'
phase = 'ensAvg'
space_dbs = glob(analysis_db + project_details[project]['space_pat'])
ens_avg_space_dbs = [
    space_db + "/" for space_db in space_dbs if space_db.endswith(
        group + '-' + phase
    )
]
print(ens_avg_space_dbs)
# list of unique properties and property_measures:
# Local distributions do not have any property_measures:
uniq_props, _ = organizer.unique_property(
    ens_avg_space_dbs[0] + '*' + \
        project_details[project]['hierarchy'] + '.csv',
    2,
    ["-" + phase],
    drop_properties=["stamps"])
print(uniq_props)

In [None]:
# species = ['Mon', 'Crd']
species = ['Mon', 'Crd', 'Foci']
normalizing_indexes = {
    'r': {
        'Mon': 0,
        'Crd': 0,
    },
    'z': {
        'Mon': 0,
        'Crd': 0,
    },
    'theta': {
        'Mon': 0,
        'Crd': 0,
    }
}

In [None]:
directions = ['r', 'z', 'theta']
for direction in directions:
    props_by_dir = [prop for prop in uniq_props if prop.startswith(direction)]
    dir_ens_avgs = list()
    for prop in props_by_dir:
        prop_ens_avgs = list()
        for ens_avg_space_db in ens_avg_space_dbs:
            space = ens_avg_space_db.split('/')[-2].split('-')[0]
            ens_avg = organizer.space_hists(
                ens_avg_space_db,
                prop,
                project_details[project]['parser'],
                project_details[project]['hierarchy'],
                project_details[project]['attributes'],
                species,
                group,
                geometry,
                bin_center=None,
                normalize=direction,
                is_save=False
            )
            prop_ens_avgs.append(ens_avg)
        prop_ens_avgs = pd.concat(prop_ens_avgs,axis=0)
        # drop duplicated columns:
        prop_ens_avgs = \
            prop_ens_avgs.loc[:, ~prop_ens_avgs.columns.duplicated()]
        prop_ens_avgs.reset_index(inplace=True, drop=True)
        dir_ens_avgs.append(prop_ens_avgs)
    dir_ens_avgs = pd.concat(dir_ens_avgs,axis=1)
    # drop duplicated columns:
    dir_ens_avgs = dir_ens_avgs.loc[:, ~dir_ens_avgs.columns.duplicated()]
    dir_ens_avgs.reset_index(inplace=True, drop=True)
    print(dir_ens_avgs.isna().sum().sum())
    output = analysis_db +  "-".join([
        'allInOne', project,  group,  direction + "LocalDist.parquet.brotli"
    ])
    dir_ens_avgs.to_parquet(output, index=False, compression='brotli')

#### Sum-Rule

In [None]:
def all_in_one_distributions(
                dist_tuples, properties, geometry, direction, save_to=None,
                round_to=4):
    """takes ensemble-averaged distributions and performs three operations
    on them. First, it concatenates the ensemble-averaged distributions of
    all the species in an ensemble into one dataframe.

    In this dataframe, bin centers are indices and the number of columns
    is equal to the number of species. Next, it adds the properties of
    the ensemble to the merged distributions. Finally, it combines all
    the ensembles from all the groups into one dataframe.

    Cautions:
    For each species, a distribution is a dataframe with bin centers as
    indices and a column of frequencies.
    A simulation group usually results in a graph or curve for the project
    and refers to a collection of simulations that all have the same values
    for one or several input parameters of the project.
    An ensemble is a collection of thermodynamically-equivalent simulations
    that differ only in their random number seeds, initial conditions, or
    boundary conditions but have the same input parameters. In the standard
    statistical mechanical approach, an ensemble is equivalent to a
    simulation, but here we use it to refer to all the
    thermodynamically-equivalent simulations.
    An ensemble-averaged group is an average over all the simulations in an
    ensemble and usually gives a data point.
    If there are N ensembles, each with M simulations, then there are N
    ensemble-average groups and N*M simulations in the simulation group.
    By default, the indexes of an ensemble (dataframe) are timesteps.
    For some properties such as histograms, however, the indexes are bin
    centers. As a result, simulations are tuples where each tuple can have
    one or more members. For histogram-like properties, each tuple has two
    members where the second member is the bin edges.

    Issue:
    1. The current implementation only works for a pair of species
    (e.g., a monomer type and a crowder type).
    2. This function only works when we have all the three distributions for
    a species: local histograms, densities, and volume fractions.

    Parameters:
    dist_tuples (list of tuples): A list of tuples in which each tuple has
    all the A*B ensemble-averaged distributions in one ensemble;
    here, A is the number of distribution types in this direction
    (histogram of particle numbers, number density, and volume fraction),
    and B is the number of different types of particles in that ensemble.
    properties (Pandas dataframe): The database of ensemble-average properties.
    geometry (str): the geometry of the simulation box.
    direction (str): Direction along which the distributions are computed.
    save_to (str, optional): path of the output csv file; if None, nothing
    is written.
    round_to (int): rounding the whole dataframe to the given round-off number.

    Return:
    A pandas dataframe in which all the distributions of all the simulation
    groups are merged.
    """
    group_species = []
    properties[['ens_name', 'garbage']] = properties.filename.str.split(
        pat='-', expand=True)  # find the ensemble names.
    selected_cols = [
        'filename', 'nmon', 'dcyl', 'lcyl', 'phi_m_bulk', 'rho_m_bulk',
        'dcrowd', 'phi_c_bulk', 'rho_c_bulk', 'phi_c_bulk_normalized',
        'phi_c_bulk_eff', 'phi_c_bulk_eff_normalized', 'ens_name']
    for hist_crd, hist_mon, phi_crd, phi_mon, rho_crd, rho_mon in dist_tuples:
        hist_mon_df = pd.read_csv(hist_mon, index_col=0)
        ens_name = list(hist_mon_df.columns)[0].split('-')[0]
        hist_crd_df = pd.read_csv(
            hist_crd, names=['hist_crd_' + direction], skiprows=1, index_col=0)
        hist_mon_df = pd.read_csv(
            hist_mon, names=['hist_mon_' + direction], skiprows=1, index_col=0)
        rho_crd_df = pd.read_csv(
            rho_crd, names=['rho_crd_' + direction], skiprows=1, index_col=0)
        rho_mon_df = pd.read_csv(
            rho_mon, names=['rho_mon_' + direction], skiprows=1, index_col=0)
        phi_crd_df = pd.read_csv(
            phi_crd, names=['phi_crd_' + direction], skiprows=1, index_col=0)
        phi_mon_df = pd.read_csv(
            phi_mon, names=['phi_mon_' + direction], skiprows=1, index_col=0)
        ens_species = pd.concat([
            hist_mon_df, hist_crd_df, rho_mon_df, rho_crd_df, phi_mon_df,
            phi_crd_df], axis=1)
        ens_species.reset_index(inplace=True)
        ens_species.rename(columns={'index': direction}, inplace=True)
        # add the properties of the ensemble to merged distributions
        for col in selected_cols:
            cond = properties['ens_name'] == ens_name
            ens_species[col] = properties[cond][col].values[0]
        cell_attrs = SumRule(ens_name, geometry, warning=False)
        # normalizing the bin centers (or directions) in three different ways:
        bin_center_norm_box = {
            'r': cell_attrs.dcyl,
            'z': cell_attrs.lcyl
            }
        ens_species[direction+'_norm'] = \
            2 * ens_species[direction] / bin_center_norm_box[direction]
        # The following should be divided by 'dmon' but this operation is not
        # done since dmon = 1.0:
        ens_species[direction+'_norm_mon'] = ens_species[direction]
        ens_species[direction+'_norm_crd'] = \
            ens_species[direction] / cell_attrs.dcrowd
        # normalizing local volume fractions in different directions
        if direction == 'r':
            # The crowder reference values are actually taken at the origin;
            # they are named 'infinity' for the sake of similarity with the
            # z direction.
            phi_mon_origin = ens_species.loc[0, 'phi_mon_r']
            phi_crd_infinity = ens_species.loc[0, 'phi_crd_r']
            rho_mon_origin = ens_species.loc[0, 'rho_mon_r']
            rho_crd_infinity = ens_species.loc[0, 'rho_crd_r']
        if direction == 'z':
            around_max = 20
            non_zero_phi_mon_z_max_idx = ens_species[
                ens_species['phi_mon_z'] != 0]['phi_mon_z'].idxmax()
            phi_mon_origin = ens_species.loc[
                non_zero_phi_mon_z_max_idx - around_max:
                non_zero_phi_mon_z_max_idx + around_max, 'phi_mon_z'].mean()
            phi_crd_infinity = ens_species[
                ens_species['phi_mon_z'] == 0.0]['phi_crd_z'].mean()
            non_zero_rho_mon_z_max_idx = ens_species[
                ens_species['rho_mon_z'] != 0]['rho_mon_z'].idxmax()
            rho_mon_origin = ens_species.loc[
                non_zero_rho_mon_z_max_idx - around_max:
                non_zero_rho_mon_z_max_idx + around_max, 'rho_mon_z'].mean()
            rho_crd_infinity = ens_species[
                ens_species['rho_mon_z'] == 0.0]['rho_crd_z'].mean()
        # The monomer quantities with the '_size' suffix should be divided
        # by 'dmon', but this operation is not done since dmon = 1.0.
        ens_species['phi_mon_' + direction + '_uniform'] = phi_mon_origin
        ens_species['phi_mon_' + direction + '_uniform_size'] = \
            phi_mon_origin
        ens_species['phi_mon_' + direction + '_uniform_size_norm'] = 1.0
        ens_species['phi_mon_' + direction + '_norm'] = \
            ens_species['phi_mon_' + direction] / phi_mon_origin
        ens_species['phi_mon_' + direction + '_norm_size'] = \
            ens_species['phi_mon_' + direction + '_norm']
        ens_species['phi_mon_' + direction + '_size'] = \
            ens_species['phi_mon_' + direction]
        ens_species['phi_crd_' + direction + '_uniform'] = phi_crd_infinity
        ens_species['phi_crd_' + direction + '_uniform_size'] = \
            phi_crd_infinity / cell_attrs.dcrowd
        ens_species['phi_crd_' + direction + '_uniform_size_norm'] = 1.0
        ens_species['phi_crd_' + direction + '_norm'] = \
            ens_species['phi_crd_' + direction] / phi_crd_infinity
        ens_species['phi_crd_' + direction + '_norm_size'] = \
            ens_species['phi_crd_' + direction + '_norm'] / cell_attrs.dcrowd
        ens_species['phi_crd_' + direction + '_size'] = \
            ens_species['phi_crd_' + direction] / cell_attrs.dcrowd
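        # The parentheses keep both terms in one statement; without them the
        # second term is a separate expression and is never added.
        ens_species['phi_' + direction + '_uniform_sum'] = \
            (ens_species['phi_mon_' + direction + '_uniform_size']
                + ens_species['phi_crd_' + direction + '_uniform_size'])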
        ens_species['phi_' + direction + '_uniform_sum_norm'] = 1.0
        # The monomer quantities with the '_size' suffix should be multiplied
        # by 'dmon'**2, but this operation is not done since dmon = 1.0.
        ens_species['rho_mon_' + direction + '_uniform'] = rho_mon_origin
        ens_species['rho_mon_' + direction + '_uniform_size'] = \
            rho_mon_origin
        ens_species['rho_mon_' + direction + '_norm'] = \
            ens_species['rho_mon_' + direction] / rho_mon_origin
        ens_species['rho_mon_' + direction + '_norm_size'] = \
            ens_species['rho_mon_' + direction + '_norm']
        ens_species['rho_mon_' + direction + '_size'] = \
            ens_species['rho_mon_' + direction]
        ens_species['rho_crd_' + direction + '_uniform'] = rho_crd_infinity
        ens_species['rho_crd_' + direction + '_uniform_size'] = \
            (rho_crd_infinity * cell_attrs.dcrowd**2)
        ens_species['rho_crd_' + direction + '_norm'] = \
            (ens_species['rho_crd_' + direction] / rho_crd_infinity)
        ens_species['rho_crd_' + direction + '_norm_size'] = \
            (ens_species['rho_crd_' + direction + '_norm']
                * cell_attrs.dcrowd**2)
        ens_species['rho_crd_' + direction + '_size'] = \
            (ens_species['rho_crd_' + direction] * cell_attrs.dcrowd**2)
        ens_species['rho_' + direction + '_uniform_sum'] = \
            (ens_species['rho_mon_' + direction + '_uniform_size']
                + ens_species['rho_crd_' + direction + '_uniform_size'])
        # sum rule: The monomer and crowder volume fractions are normalized
        # by their particle sizes and then added to create the sum rule.
        # The parentheses keep both terms in one statement.
        ens_species['phi_sumrule_' + direction] = \
            (ens_species['phi_mon_' + direction + '_size']
                + ens_species['phi_crd_' + direction + '_size'])
        # The first term in the denominator should be divided by 'dmon',
        # but this operation is not done since dmon = 1.0
        ens_species['phi_sumrule_' + direction + '_norm'] = \
            (ens_species['phi_sumrule_' + direction]
                / (phi_mon_origin
                    + (phi_crd_infinity / cell_attrs.dcrowd)))
        ens_species['rho_sumrule_' + direction] = \
            (ens_species['rho_mon_' + direction + '_size']
                + ens_species['rho_crd_' + direction + '_size'])
        # The first term in the denominator should be multiplied by
        # 'dmon'**2, but this operation is not done since dmon = 1.0
        ens_species['rho_sumrule_' + direction + '_norm'] = \
            (ens_species['rho_sumrule_' + direction]
                / (rho_mon_origin
                    + (rho_crd_infinity * cell_attrs.dcrowd**2)))
        # sum rule phi: Amir's way
        if direction == 'z':
            ens_species['phi_sumrule_' + direction] = \
                (ens_species['phi_mon_' + direction + '_size']
                    + ens_species['phi_crd_' + direction + '_size'])
            ens_species['phi_sumrule_' + direction + '_norm_crowd_only'] = \
                (ens_species['phi_sumrule_' + direction]
                    / (phi_crd_infinity / cell_attrs.dcrowd))
        if ens_species['phi_c_bulk'].any() == 0:
            ens_species['phi_crd_' + direction + '_norm'] = 0.0
            ens_species['phi_crd_' + direction + '_norm_size'] = 0.0
            ens_species['rho_crd_' + direction + '_norm'] = 0.0
            ens_species['rho_crd_' + direction + '_norm_size'] = 0.0
            if direction == 'z':
                ens_species['phi_sumrule_'
                            + direction + '_norm_crowd_only'] = 0.0
        # Defining concise names for ensembles and groups. Adjacent f-strings
        # are used so that no indentation whitespace ends up in the names.
        ens_species['ens_name'] = (
            f"N{cell_attrs.nmon}D{cell_attrs.dcyl}"
            f"ac{cell_attrs.dcrowd}nc{cell_attrs.ncrowd}"
        )
        ens_species['group_name'] = (
            f"N{cell_attrs.nmon}D{cell_attrs.dcyl}ac{cell_attrs.dcrowd}"
        )
        ens_species.drop(['filename'], axis=1, inplace=True)
        group_species.append(ens_species)
    species_database = pd.concat(group_species)
    species_database = species_database.round(round_to)
    species_database.reset_index(inplace=True, drop=True)
    if save_to is not None:
        species_database.to_csv(save_to)
    return species_database
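
To make the size-scaled sum rule concrete, here is a minimal sketch on synthetic radial data. The numbers and the crowder diameter are invented for illustration; only the arithmetic (monomer term left as is because dmon = 1.0, crowder term divided by `dcrowd`, reference values taken at the first bin for direction `r`) follows the function above.

```python
import pandas as pd

dcrowd = 2.0  # hypothetical crowder diameter; monomer diameter dmon = 1.0
df = pd.DataFrame({
    "r": [0.5, 1.5, 2.5],
    "phi_mon_r": [0.30, 0.20, 0.10],
    "phi_crd_r": [0.10, 0.20, 0.30],
})
# Size-scaled local volume fractions
df["phi_mon_r_size"] = df["phi_mon_r"]  # would be / dmon, with dmon = 1.0
df["phi_crd_r_size"] = df["phi_crd_r"] / dcrowd
# Parentheses keep the whole sum inside a single assignment
df["phi_sumrule_r"] = (
    df["phi_mon_r_size"] + df["phi_crd_r_size"]
)
# Reference values at the first bin, as done for direction 'r'
phi_mon_origin = df.loc[0, "phi_mon_r"]
phi_crd_infinity = df.loc[0, "phi_crd_r"]
df["phi_sumrule_r_norm"] = df["phi_sumrule_r"] / (
    phi_mon_origin + phi_crd_infinity / dcrowd
)
print(df[["r", "phi_sumrule_r", "phi_sumrule_r_norm"]])
```

With these reference values the normalized sum equals one at the first bin by construction, which gives a quick sanity check on real data as well.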

