# Frequency

To use the data transformation script `Frequency.pl`, we provide it with one or more input files, followed by the name we want for each output file it creates, a column number, and two binning parameters:

`$ perl ./perl/Frequency.pl [inputFile1 inputFile2 ...] [outputFile1 outputFile2 ...] [column] [binType switch] [binValue]`

The last two values work differently than in the other transformation scripts. Here, `binType` is a switch that can be either `0` or `1` to tell the script how you want to divide the data into bins; this choice then determines what the `binValue` parameter means. The choices are:

    0: Divide the data into a number of bins equal to `binValue`
    1: Divide the data into bins of width `binValue` (in nanoseconds)
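
To make the two modes concrete, here is a minimal Python sketch of how bin edges could be computed in each case. It is not the code inside `Frequency.pl`; the function name `bin_edges` and the choice to start the first bin at the smallest value are assumptions made for illustration. The sample values are the time-over-threshold entries visible in the test file.

```python
import math

def bin_edges(values, bin_type, bin_value):
    """Return bin edges for the two modes described above.

    bin_type 0: bin_value is the number of equal-width bins.
    bin_type 1: bin_value is the width of each bin.
    Assumption: the first bin starts at the smallest value.
    """
    lo, hi = min(values), max(values)
    if bin_type == 0:
        n_bins = int(bin_value)
        width = (hi - lo) / n_bins
    elif bin_type == 1:
        width = bin_value
        # Use enough bins of the requested width to cover the range.
        n_bins = max(1, math.ceil((hi - lo) / width))
    else:
        raise ValueError("bin_type must be 0 or 1")
    return [lo + i * width for i in range(n_bins + 1)]

# Time-over-threshold values (ns) taken from the test file's first rows.
times = [7.49, 15.00, 16.25, 17.51, 18.75, 22.49, 23.75, 25.00, 42.49]

print(bin_edges(times, 0, 5))   # five bins, width derived from the data range
print(bin_edges(times, 1, 10))  # bins 10 units wide, count derived from the range
```

With `bin_type` 0 the width follows from the data range, while with `bin_type` 1 the number of bins does.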

It isn't immediately obvious what this means, though, or what the `column` parameter does. We'll try it out on the test data in the `test_data` directory. Use the UNIX shell command `$ ls test_data` to see what's there:

In [1]:
!ls test_data

6119.2016.0104.1.test.thresh  combineOut  sortOut15
6148.2016.0109.0.test.thresh  sortOut	  sortOut51
6203.2016.0104.1.test.thresh  sortOut11


Let's start simple, using a single input file and a single output file.  We'll run

`$ perl ./perl/Frequency.pl test_data/6148.2016.0109.0.test.thresh test_data/freqOut01 1 1 2`

to see what happens. The `binType` switch is set to the e-Lab default of `1`, "bin by fixed width," and the value of that fixed width is set to the e-Lab default of `2` ns. Notice that we've named the output file `freqOut01`; we may have to do lots of experimentation to figure out what exactly the transformation `Frequency.pl` does, so we'll increment that number each time to keep a record of our progress. The `column` parameter is `1`.

Before we begin, we'll make sure we know what the input file looks like.  The UNIX `wc` (word count) utility tells us that `6148.2016.0109.0.test.thresh` has over a thousand lines:

In [1]:
!wc -l test_data/6148.2016.0109.0.test.thresh

1003 test_data/6148.2016.0109.0.test.thresh


(The `-l` flag tells `wc` to count lines instead of words. The number in the output, before the filename, is the line count, in this case 1003.)

The UNIX `head` utility will show us the beginning of the file:

In [3]:
!head -25 test_data/6148.2016.0109.0.test.thresh

#$md5
#md5_hex(0)
#ID.CHANNEL, Julian Day, RISING EDGE(sec), FALLING EDGE(sec), TIME OVER THRESHOLD (nanosec), RISING EDGE(INT), FALLING EDGE(INT)
6148.4	2457396	0.5006992493422453	0.5006992493424479	17.51	4326041514317000	4326041514318750
6148.3	2457396	0.5006992493422887	0.5006992493424768	16.25	4326041514317375	4326041514319000
6148.2	2457396	0.5007005963399161	0.5007005963400029	7.49	4326053152376876	4326053152377625
6148.3	2457396	0.5007005963401910	0.5007005963404514	22.49	4326053152379250	4326053152381500
6148.4	2457396	0.5007005963401765	0.5007005963404658	25.00	4326053152379125	4326053152381624
6148.1	2457396	0.5014987243978154	0.5014987243980903	23.75	4332948978797125	4332948978799500
6148.2	2457396	0.5014987243980759	0.5014987243982495	15.00	4332948978799376	4332948978800875
6148.1	2457396	0.5020062862072049	0.5020062862076967	42.49	4337334312830250	4337334312834500
6148.2	2457396	0.5020062862074218	0.5020062862076389	18.75	4337334312832125	4337334312834000
6148.

Now, we'll execute

`$ perl ./perl/Frequency.pl test_data/6148.2016.0109.0.test.thresh test_data/freqOut01 1 1 2`

from the command line and see what changes.  After doing so, we can see that `freqOut01` was created in the `test_data/` folder, so we must be on the right track:

In [2]:
!ls test_data

6119.2016.0104.1.test.thresh  freqOut01		 singleChannelOut4  sortOut51
6148.2016.0109.0.test.thresh  singleChannelOut1  sortOut
6203.2016.0104.1.test.thresh  singleChannelOut2  sortOut11
combineOut		      singleChannelOut3  sortOut15


In [4]:
!wc -l test_data/freqOut01

1 test_data/freqOut01


It only has one line, though! Better investigate further:

In [5]:
!cat test_data/freqOut01

6149.000000	1000	4
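
One possible reading of this line, offered as a guess rather than documented behavior: if `column` set to `1` selects the first field (`ID.CHANNEL`), every value lies between 6148.1 and 6148.4, so a single bin of width 2 spanning 6148 to 6150 would hold all 1000 data lines and have its center at 6149. The sketch below tests that idea on the nine complete rows shown earlier. The function name and the assumption that bins start at multiples of the width are hypothetical, and the meaning of the third field (`4`) is not explained by this sketch.

```python
import math
from collections import Counter

def histogram_fixed_width(values, width):
    """Count values per fixed-width bin.

    Assumption: bins start at multiples of `width`; each bin is
    reported by its center, in ascending order.
    """
    counts = Counter(math.floor(v / width) * width for v in values)
    return [(start + width / 2, n) for start, n in sorted(counts.items())]

# First field of the nine complete data rows in the `head` output above.
ids = [6148.4, 6148.3, 6148.2, 6148.3, 6148.4, 6148.1, 6148.2, 6148.1, 6148.2]

print(histogram_fixed_width(ids, 2))  # → [(6149.0, 9)]
```

Under these assumptions all nine sample rows land in one bin centered at 6149, which is at least consistent with the single output line above.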


It turns out that `SingleChannel` has a little bit more power, though.  It can actually handle multiple single channels at a time, as odd as that might sound.  We'll try specifying additional channels while adding additional respective output names for them:

`$ perl ./perl/SingleChannel.pl test_data/6148.2016.0109.0.test.thresh "test_data/singleChannelOut1 test_data/singleChannelOut2 test_data/singleChannelOut3 test_data/singleChannelOut4" "1 2 3 4"`

(For multiple channels/outputs, we have to wrap each group in quotes `"` so that `SingleChannel` knows which arguments are the output filenames and which are the channel numbers.)
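
To see what the quotes do, the following Python sketch uses the standard library's `shlex.split`, which follows POSIX shell quoting rules, to show how the command line above is broken into arguments. It only illustrates the shell's tokenization; how `SingleChannel.pl` then splits each quoted group internally is not shown here.

```python
import shlex

command = (
    'perl ./perl/SingleChannel.pl test_data/6148.2016.0109.0.test.thresh '
    '"test_data/singleChannelOut1 test_data/singleChannelOut2 '
    'test_data/singleChannelOut3 test_data/singleChannelOut4" "1 2 3 4"'
)

# Each quoted group arrives at the script as ONE argument.
argv = shlex.split(command)
for i, arg in enumerate(argv):
    print(i, repr(arg))
```

The four output names reach the script as a single argument, and so do the four channel numbers, which keeps the two groups unambiguous.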

If we run this from the command line, we do in fact get four separate output files:

In [6]:
!ls -1 test_data/

6119.2016.0104.1.test.thresh
6148.2016.0109.0.test.thresh
6203.2016.0104.1.test.thresh
combineOut
singleChannelOut1
singleChannelOut2
singleChannelOut3
singleChannelOut4
sortOut
sortOut11
sortOut15
sortOut51


Out of curiosity, let's line-count them using the UNIX `wc` utility:

In [12]:
!wc -l test_data/singleChannelOut1

258 test_data/singleChannelOut1


In [13]:
!wc -l test_data/singleChannelOut2

265 test_data/singleChannelOut2


In [14]:
!wc -l test_data/singleChannelOut3

239 test_data/singleChannelOut3


In [15]:
!wc -l test_data/singleChannelOut4

238 test_data/singleChannelOut4


Recall that the original input threshold file `6148.2016.0109.0.test.thresh` had 1003 lines: three header lines and 1000 data lines.
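
The header/data split can be checked programmatically. The sketch below assumes, based on the `head` output shown earlier, that header lines are exactly those beginning with `#`; the helper name `count_header_and_data` is hypothetical. It runs on a small in-memory sample rather than the real file.

```python
def count_header_and_data(lines):
    """Return (header_count, data_count), treating '#' lines as header."""
    header = sum(1 for line in lines if line.startswith("#"))
    data = sum(1 for line in lines if line.strip() and not line.startswith("#"))
    return header, data

# The three header lines and the first two data rows from the test file.
sample = [
    "#$md5",
    "#md5_hex(0)",
    "#ID.CHANNEL, Julian Day, RISING EDGE(sec), FALLING EDGE(sec), "
    "TIME OVER THRESHOLD (nanosec), RISING EDGE(INT), FALLING EDGE(INT)",
    "6148.4\t2457396\t0.5006992493422453\t0.5006992493424479\t17.51\t"
    "4326041514317000\t4326041514318750",
    "6148.3\t2457396\t0.5006992493422887\t0.5006992493424768\t16.25\t"
    "4326041514317375\t4326041514319000",
]

print(count_header_and_data(sample))  # → (3, 2)
```

To apply it to the real file, pass the lines read from `open("test_data/6148.2016.0109.0.test.thresh")` instead of `sample`.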

**Exercise 1**

Add the line counts of the four output files above.  Do you get what you expect?

**Exercise 2**

In a well-functioning cosmic ray muon detector using 4 channels, what percentage of the total number of counts do you expect each channel to record?  Using the example above of a file with 1000 counts, how many counts would you expect each channel to have?  If the actual results differ from what you would have expected, try to explain why.

**Exercise 3**

Find a file with a much larger number of counts (that is, lines) than `6148.2016.0109.0.test.thresh` has, perhaps in the `files/` directory.  Repeat the above process of using `SingleChannel` to separate the file into individual-channel files, naming the outputs `test_data/singleChannelOut-Big1`, `test_data/singleChannelOut-Big2`, etc.  

Calculate what percentage of the total number of counts each output file has.  How do these compare to your expectations?  How do they compare to the 1000 counts of `6148.2016.0109.0.test.thresh`?
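
For the percentage calculation, a small helper like the one below may save some arithmetic. It is a sketch with a hypothetical function name; the counts used to demonstrate it are the four `wc -l` results shown above, and you would substitute the counts from your own larger file.

```python
def channel_percentages(counts):
    """Map each channel to its share of the total count, in percent."""
    total = sum(counts.values())
    return {channel: 100.0 * n / total for channel, n in counts.items()}

# Line counts reported by `wc -l` for singleChannelOut1 through singleChannelOut4.
counts = {1: 258, 2: 265, 3: 239, 4: 238}

for channel, pct in channel_percentages(counts).items():
    print(f"channel {channel}: {pct:.1f}%")
```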

**A Word of Warning**

If you've been playing around with word counts for a bit, you may have noticed that `SingleChannel` has a quirk: if you specify an output file that already exists, `SingleChannel` will *add to* the existing file rather than replacing it with the new output.  Most of the other e-Lab data transformations will replace the existing file, so this may represent a bug in this particular script.

*Be aware of this when running similar commands multiple times!*
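
The difference between appending and replacing can be reproduced in a few lines of Python. This is only an analogy for the behavior described above, not the script's own code: the filename is a stand-in, and `"a"` and `"w"` are Python's append and overwrite file modes.

```python
import os
import tempfile

def write_lines(path, lines, mode):
    """Write lines to path; mode 'a' appends, mode 'w' replaces."""
    with open(path, mode) as f:
        for line in lines:
            f.write(line + "\n")

def line_count(path):
    with open(path) as f:
        return sum(1 for _ in f)

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "singleChannelOut1")  # stand-in output name

    # Two runs in append mode: the second run adds to the first.
    write_lines(out, ["row1", "row2"], "a")
    write_lines(out, ["row1", "row2"], "a")
    appended = line_count(out)

    # One run in overwrite mode: earlier contents are discarded.
    write_lines(out, ["row1", "row2"], "w")
    overwritten = line_count(out)

print(appended, overwritten)  # → 4 2
```

A simple safeguard when rerunning a command is to delete or rename the old output files first, so that line counts reflect only the latest run.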