This document explains the strategy used to cluster free-text logs by their constant part. First, import the code available at https://bitbucket.org/jpgil_cl/procdelays.

# How to color
The colors below were generated with the `paintedForAlmaAntennas` function, which replaces numbers with placeholders but keeps specific pieces of equipment that must be uniquely distinguished.

In [1]:
from src.models.AlmaClasses import paintedForAlmaAntennas

In [2]:
paintedForAlmaAntennas("Example")

'Example'

In [3]:
paintedForAlmaAntennas("Example with 1 number")

'Example with ${N} number'

In [4]:
paintedForAlmaAntennas("Specific 2 antennas: DV01 and CM12")

'Specific ${N} antennas: ${ANT} and ${ANT}'

In [5]:
paintedForAlmaAntennas("Specific hardware: IFProc0 and IFProc1. Compare with others like DTX0, DTX1, and so on.")

'Specific hardware: IFProc_A and IFProc_B. Compare with others like DTX${N}, DTX${N}, and so on.'
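
The behaviour shown in the examples above can be approximated with a few regular expressions. The following is only a minimal sketch, not the actual implementation of `paintedForAlmaAntennas`: the name `paint_sketch`, the antenna prefixes (`DV`, `DA`, `PM`, `CM`) and the `KEEP` table are assumptions chosen to reproduce the four outputs above.

```python
import re

# Assumption: antenna names are a two-letter prefix followed by two digits.
ANTENNA = re.compile(r"\b(?:DV|DA|PM|CM)\d{2}\b")
# Assumption: equipment whose index must stay distinguishable gets a letter label.
KEEP = {"IFProc0": "IFProc_A", "IFProc1": "IFProc_B"}
NUMBER = re.compile(r"\d+")


def paint_sketch(line):
    """Return the 'color' of a log line: its constant part with placeholders."""
    # Protect the special equipment first, so its digits are not stripped later.
    for name, label in KEEP.items():
        line = line.replace(name, label)
    # Replace antenna names before the generic number rule consumes their digits.
    line = ANTENNA.sub("${ANT}", line)
    # Every remaining run of digits becomes a generic placeholder.
    return NUMBER.sub("${N}", line)


print(paint_sketch("Specific 2 antennas: DV01 and CM12"))
```

The order of the three steps matters: if the generic number rule ran first, `DV01` and `IFProc0` would lose the digits that identify them.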

# Discovered Palette
Some statistics and counts over the colors. The palette holds a dictionary that persists across executions (it merges the analysis of all files), but it does not count colors or instances per case. For that, use a dedicated class called CaseStats.

In [6]:
from src import *
from src.models.AlmaClasses import *
palette = PaletteFileDB(
    filename='../data/processed/colors-almaAntenna.pkl', 
    colorFunction=paintedForAlmaAntennas )

colors=palette.getColors()
len(colors)

1468

To see one color, you can query by index or by the color itself:

In [7]:
colors[952]

'[CONTROL/${ANT}/cppContainer-GL - void Control::AntennaImpl::resynchroniseLORR()] Antenna ID Error (type=${N}, code=${N}) Detail="The LORR reports an unsynchronised TE signal. Please check that the LORR is in good shape and that the incoming TE signal is alive."'

In [8]:
colors[568]

'[CONTROL/${ANT}/cppContainer-GL - MonitorComponent::propertyArchivingInterval] Because archive_max_int (${N}) is smaller than min_timer_trig(${N}), the values of property CONTROL/${ANT}/FrontEnd/PowerDist${N}:ESNS_FOUND (IDL:alma/ACS/ROuLong:${N}) will be collected with time trigger: ${N} .'

In [9]:
palette.index("[CONTROL/${ANT}/FrontEnd/IFSwitch - ] ContainerServices::getComponentNonSticky(CONTROL/${ANT}/IFProc_B)")

511
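
To illustrate how a palette can persist across executions and answer both kinds of query, here is a minimal sketch. It is not the real `PaletteFileDB`: the class `PaletteSketch`, its `add` and `save` methods, the helper `strip_numbers` and the pickle layout (a plain list of colors) are all assumptions made for the example. Only the method names `getColors` and `index` are taken from the cells above.

```python
import os
import pickle
import re
import tempfile


class PaletteSketch:
    """Hypothetical pickle-backed palette: an ordered list of unique colors."""

    def __init__(self, filename, colorFunction):
        self.filename = filename
        self.colorFunction = colorFunction
        self.colors = []
        # Reload colors discovered in earlier executions, if any.
        if os.path.exists(filename):
            with open(filename, "rb") as fh:
                self.colors = pickle.load(fh)
        self._index = {color: i for i, color in enumerate(self.colors)}

    def add(self, line):
        """Color a raw log line and register the color if it is new."""
        color = self.colorFunction(line)
        if color not in self._index:
            self._index[color] = len(self.colors)
            self.colors.append(color)
        return self._index[color]

    def save(self):
        with open(self.filename, "wb") as fh:
            pickle.dump(self.colors, fh)

    def getColors(self):
        return list(self.colors)

    def index(self, color):
        return self._index[color]


def strip_numbers(line):
    # Hypothetical color function: replace every run of digits.
    return re.sub(r"\d+", "${N}", line)


path = os.path.join(tempfile.mkdtemp(), "colors-demo.pkl")
first = PaletteSketch(path, strip_numbers)
first.add("retry 1 of 3")
first.add("retry 2 of 3")  # same color as the previous line
first.add("timeout after 30 s")
first.save()

second = PaletteSketch(path, strip_numbers)  # simulates a later execution
print(len(second.getColors()))
print(second.index("timeout after ${N} s"))
```

Keeping the colors in a list preserves stable indices between runs, while the auxiliary dictionary makes the lookup by color a constant-time operation.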

A subset of the colors is shown below:

In [10]:
for i in range(621,650):
    print ("Color %d: %s" % (i, colors[i][:150]))

Color 621: [CONTROL/${ANT}/cppContainer-GL - void Control::ControlDeviceImpl<componentInterfaceType, acscomponentImpl>::getSubdeviceReference(const Control::Devi
Color 622: [CONTROL/${ANT}/cppContainer-GL - void Control::ControlDeviceImpl<componentInterfaceType, acscomponentImpl>::getSubdeviceReference(const Control::Devi
Color 623: [CONTROL/${ANT}/cppContainer-GL - void Control::ControlDeviceImpl<componentInterfaceType, acscomponentImpl>::getSubdeviceReference(const Control::Devi
Color 624: [CONTROL/${ANT}/cppContainer-GL - void Control::ControlDeviceImpl<componentInterfaceType, acscomponentImpl>::getSubdeviceReference(const Control::Devi
Color 625: [CONTROL/${ANT}/cppContainer-GL - void Control::ControlDeviceImpl<componentInterfaceType, acscomponentImpl>::getSubdeviceReference(const Control::Devi
Color 626: [CONTROL/${ANT}/cppContainer-GL - void Control::ControlDeviceImpl<componentInterfaceType, acscomponentImpl>::getSubdeviceReference(const Control::Devi
Color 627: [CONTROL/${ANT}/c

These colors were obtained in about 10 minutes from 6 files. Note that 270 files are available for processing:

In [11]:
!ls ../data/interim/ | tail

dv25-acsStartContainer_cppContainer_2017-07-10_17.03.32.841_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-10_17.12.59.636_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-10_19.30.47.674_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-10_20.43.06.773_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-10_20.56.06.754_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-11_19.55.26.410_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-11_20.41.40.861_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-11_20.55.42.275_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-12_00.14.04.823_STRIPPED
dv25-acsStartContainer_cppContainer_2017-07-12_00.40.12.586_STRIPPED


# Statistics on pairs

In [27]:
from src import *
from src.models.AlmaClasses import *

* Instances per pair: how many individual cases contain the pair (A, B)
* Delays per pair: the sum of all (A, B) delays measured across all instances
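
The two quantities defined above can be sketched as follows. This is not the `DelaysFileDB` implementation. It assumes, for illustration only, that a case is a list of `(timestamp, color)` events and that a pair (A, B) is any two events of the same case in which A occurs before B. The function name `pair_stats` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_stats(cases):
    """Return (instances per pair, summed delay per pair) over all cases."""
    instances = Counter()
    delays = Counter()
    for events in cases:
        ordered = sorted(events)  # tuples sort by timestamp first
        seen = set()
        for (t_a, a), (t_b, b) in combinations(ordered, 2):
            delays[(a, b)] += t_b - t_a
            seen.add((a, b))
        # A case counts once per pair, however often the pair occurs in it.
        for pair in seen:
            instances[pair] += 1
    return instances, delays


# Hypothetical cases with timestamps in seconds.
cases = [
    [(0, "A"), (3, "B")],
    [(10, "A"), (11, "B"), (15, "C")],
]
instances, delays = pair_stats(cases)
print(instances[("A", "B")], delays[("A", "B")])
```

With this definition the number of pairs can grow quadratically with the number of events in a case, so a real implementation may restrict which pairs are kept.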

## CaseAntennaObserving

In [28]:
db = DelaysFileDB( caseName="CaseAntennaObserving", path= '../' + config.FILEPATH_DB+"/delays")  
db.caseName

'CaseAntennaObserving'

In [29]:
#db.instances_per_pair()[:5]

In [30]:
len(db.unique_colors())

125

In [31]:
db.total_pairs()

4025

In [17]:
some_pair, value = db.instances_per_pair()[10]
db.delays_per_pair()[some_pair]

4

In [18]:
db.total_cases()

351

## CaseRadioSetup

In [19]:
db = DelaysFileDB( caseName="CaseRadioSetup", path= '../' + config.FILEPATH_DB+"/delays")  

In [20]:
len(db.unique_colors())

9

In [21]:
db.total_pairs()

18

In [22]:
db.total_cases()

599

## CaseAntennaInArray

In [23]:
db = DelaysFileDB( caseName="CaseAntennaInArray", path= '../' + config.FILEPATH_DB+"/delays")  

In [24]:
len(db.unique_colors())

228

In [25]:
db.total_pairs()

12066

In [26]:
db.total_cases()

158