V:2021/10/08

# Program `Reading_Fourier`

In this notebook the data from the Sierra Nevada ELF station are read and the amplitude spectrum is obtained using the Welch method. A 10 s time window is used for the FFT, and the squared spectrum is averaged over 10 min intervals. The amplitude spectrum is the square root of that average. The 10 s windows that present saturation (voltage outside the range of the measurement system) are discarded, and a saturation index is defined as the fraction of discarded windows in each 10 min averaging interval.

As the final result of this step, the amplitude spectra of the 10 min intervals of measurements in each month are obtained.


## 1. Sierra Nevada ELF station data

In [1]:
# INPUTS
year='2015'
month='1503'
pathin='S_N_Data/'           # Have to be specified      
pathout='S_N_DF/'       # for each computer
pin=pathin+year+'/'+month+'/'
pou=pathout+year+'/'+month+'/'

The data collected at the Sierra Nevada ELF station are grouped by year and, within each year, by month: `pathin + '2015/1503/'`, where 2015 and 1503 correspond to the year 2015 and the month of March. The last two digits of the year are repeated in the month code to avoid mistakes during processing.
The station data are in the directory 'S_N_Data' (indicated by the variable `pathin`) and the outputs of the processing are written to 'S_N_DF' (indicated by the variable `pathout`). The directories for each year and each month must be created beforehand. Both the raw data (in 'S_N_Data') and the processed data (in 'S_N_DF') are available in the data location.

Each month contains one data file per sensor and per hour. Each hourly data file is accompanied by an information file with the start time, the time increment used and the number of samples.

In the directory for each month there are, in addition, two files, `ficheros0` and `ficheros1`, that contain the name of the file for each hour and for each sensor, so that no system commands are needed to identify the file names.
Each file is named, for example, `smplGRTU1_sensor_0_1604301629`, where `smplGRTU1_sensor_` is common to all files; it is followed by 0 or 1, for the sensor with NS or EW orientation, respectively, and then by pairs of digits for the year (16), month (04), day (30), hour (16) and starting minute (29). The information file has the same name but ends in `_info.txt`. Some months are incomplete because of problems at the station or maintenance work.
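The naming rule above can be illustrated with a small parser. This is only a sketch: the helper `parse_data_filename` is hypothetical and not part of the notebook, and it assumes every name follows the pattern of the example exactly.

```python
from datetime import datetime

def parse_data_filename(name):
    """Split a data file name into sensor index and start time (hypothetical helper)."""
    prefix = "smplGRTU1_sensor_"
    if not name.startswith(prefix):
        raise ValueError("unexpected file name: " + name)
    # What remains is '<sensor>_<yymmddHHMM>'
    sensor_text, stamp = name[len(prefix):].split("_")
    # Two digits each for year, month, day, hour and starting minute
    start = datetime.strptime(stamp, "%y%m%d%H%M")
    return int(sensor_text), start

sensor, start = parse_data_filename("smplGRTU1_sensor_0_1604301629")
print(sensor, start.isoformat())
```

Sensor 0 is the NS-oriented sensor and sensor 1 the EW-oriented one, as described above.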

The A/D system has an approximate sample rate of $f_m = 256$ Hz, although the sampling time interval is 3906.000000 $\mu$s, which is slightly less than $1/f_m$ (3906.25 $\mu$s). In each hourly file a total of 921600 samples are stored, so the time length of each file is 3599.7696 s, which is 0.2304 s short of one hour. For this reason, in a month of uninterrupted measurements the last file ends a few minutes earlier, in minutes past the hour, than the first file began.
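The arithmetic behind these figures can be checked in a few lines. The 31-day month is an assumption chosen for illustration; the other numbers are those quoted above.

```python
dt_us = 3906                     # sampling interval in microseconds
samples_per_hour = 256 * 3600    # 921600 samples stored per hourly file

file_length_s = samples_per_hour * dt_us / 1e6   # duration of one file in seconds
shortfall_s = 3600 - file_length_s               # time missing from each hour

hours_in_month = 31 * 24                         # assumption: 31 days without gaps
drift_min = shortfall_s * hours_in_month / 60    # accumulated drift in minutes

print(file_length_s, shortfall_s, drift_min)
```

Under that assumption the accumulated drift is a little under three minutes, consistent with the "few minutes" mentioned in the text.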


In the code above, the year and month to be read are identified first, and the input and output paths are then built from them.

The article by Rodríguez‐Camacho et al. (2018) studies possible methodologies for analyzing the station data and their impact on the results obtained, namely the Schumann resonances. As the final result of that study, a methodology was established; its software is presented in this notebook and in the supplementary material of that paper.
As a consequence of this process, the code contains options whose values are now fixed. These options can be modified, but doing so would require systematic monitoring of the outputs of each part of the analysis, which has been removed from the code presented here, although some discussion remains.

## 2. Several values definition

In [2]:
import numpy as np
divi=6          # Other options: 1, 2, 3, 4
nleeventa=2560  # 10 s of signal
nventa=2**13    # FFT window length (sets df)
nventa2=nventa//2+1

The numpy package is used and imported with the alias `np`.
The 10 min interval over which the amplitude spectrum of the signal is averaged is defined as a fraction of an hour (1 hour/`divi`). Different intervals have been studied (`divi` could take the values 1, 2, 3, 4 or 6). Finally, an interval of 10 min has been chosen (`divi = 6`).

Each 10 min interval is divided into 10 s long windows, corresponding to `nleeventa = 2560` samples. That is, we work with two intervals: one of 10 s, where the Fast Fourier Transform (FFT) is carried out, and another of 10 min, where the (squared) spectra of the 10 s intervals are averaged.

To compute the FFT, the time window is defined by the variable `nventa` = $2^{13}$ = 8192 samples (a little more than 30 s; 30\*256 = 7680 samples). The `nleeventa` data samples are used and padded with zeros up to `nventa`.
This defines the frequency increment of the resulting spectrum. Since $df=1/(N\,dt)=f_m/N$, where $f_m$ = 256 Hz is the sampling frequency and $N$ = `nventa`, the result is $df = 256/2^{13} = 1/32 = 0.03125$ Hz.

The real discrete Fourier transform returns `nventa2 = nventa//2+1` samples, whose modulus constitutes the amplitude spectrum.
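The effect of the zero padding on the frequency grid can be seen with a synthetic signal. The 8 Hz sine below is an assumption used only for illustration, and `np.fft.rfftfreq` is used here in place of the hand-built frequency list of the notebook.

```python
import numpy as np

fs = 256          # sampling frequency in Hz
n_data = 2560     # 10 s of signal
n_fft = 2**13     # zero-padded FFT length

t = np.arange(n_data) / fs
x = np.sin(2 * np.pi * 8.0 * t)      # synthetic 8 Hz test tone

padded = np.zeros(n_fft)
padded[:n_data] = x                  # data first, zeros afterwards

spectrum = np.abs(np.fft.rfft(padded))
freqs = np.fft.rfftfreq(n_fft, d=1 / fs)

df = freqs[1] - freqs[0]             # frequency increment of the padded transform
peak = freqs[np.argmax(spectrum)]    # frequency of the largest spectral line
print(df, len(spectrum), peak)
```

Zero padding refines the frequency grid to 1/32 Hz, but the width of the spectral peak is still set by the 10 s of actual data.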

The general process of the program for each month performs the following steps:
- determines the number of files, which corresponds to the number of hours in that month;
- reads the names of all the files and, for each one, divides the data according to the variable `divi`;
- generates the time windows according to the value of the `nleeventa` variable;
- determines whether a window has saturated values and, in that case, discards it (if every window of a 10 min interval is saturated, the transform is left at zero and a message is issued);
- if there are no saturations, adds the number of zeros needed for the FFT and returns the transform;
- carries out the calibration in the frequency band from 6 to 25 Hz.

In [3]:
fmues=256       # Hz sampling freq
facdat=10.0/2**15   # Sampling factor
nhora=fmues*3600  # 921600   # Data per hour
df=float(fmues)/float(nventa)
limsup=9.990        # Saturation superior bound
liminf=-9.990        # Saturation lower bound
(f1,f2)=(6,25)   # Calibrated band limits 

This cell defines the parameters of the measurement system: the sampling frequency, `fmues`; the sampling factor, `facdat`, which is the maximum voltage, 10 V, divided by $2^{15}$ (a 16-bit sampling system with one bit used for the sign); the number of samples in an hour, `nhora`; the upper and lower voltage limits (in V) beyond which a sample is considered saturated, `limsup` and `liminf`; and finally the limits of the operating band of the system under calibration conditions, `f1` and `f2`, in Hz.
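As an illustration of the sampling factor and the saturation limits, the following sketch converts a few made-up 16-bit counts to volts and flags those outside the limits. The count values are assumptions chosen to hit both bounds.

```python
import numpy as np

facdat = 10.0 / 2**15            # volts per count
limsup, liminf = 9.990, -9.990   # saturation bounds in volts

# Made-up raw samples: zero, half scale, both extremes, a small negative value
counts = np.array([0, 16384, 32767, -32768, -100], dtype=np.int16)
volts = counts.astype(float) * facdat

saturated = (volts > limsup) | (volts < liminf)   # boolean mask of saturated samples
n_sat = int(np.count_nonzero(saturated))
print(volts, n_sat)
```

A single boolean mask gives the same count as the two separate `len(...)` terms used in the notebook.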

In [4]:
datosventa=np.zeros(nventa,dtype=float)     # For fft
datosDF=np.zeros(nventa2,dtype=float)
def pofre(f,df,fi):
    return int(round((f-fi)/df))
listafrec=np.arange(0,nventa2)*fmues/nventa   # frequency list
listafreccal=listafrec[pofre(f1,df,0.):pofre(f2,df,0.)+1]       # 

Variables for the FFT are defined. The array `datosventa` stores the measurements and the added zeros, while the transformed data are stored in `datosDF`. The array `listafrec` contains the list of frequencies and `listafreccal` the calibrated part of them.
The `pofre` function determines the position of a frequency `f` within a frequency array that begins at `fi` and has `df` as frequency increment.
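A short check of `pofre` with the values used in this notebook shows which bins delimit the calibrated band. The function is copied from the cell above; the remaining names are local to this sketch.

```python
import numpy as np

def pofre(f, df, fi):
    # Index of frequency f in a grid that starts at fi with step df
    return int(round((f - fi) / df))

fs, n_fft = 256, 2**13
df = fs / n_fft                              # 0.03125 Hz
freqs = np.arange(n_fft // 2 + 1) * df       # full frequency list

i1, i2 = pofre(6, df, 0.0), pofre(25, df, 0.0)
band = freqs[i1:i2 + 1]                      # both band limits are included
print(i1, i2, len(band), band[0], band[-1])
```

The `+1` in the slice keeps the upper limit, so the calibrated band holds 609 frequencies from 6 Hz to 25 Hz.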

## 3. Welch method and Hann window

In [5]:
nhoradivi=nhora//divi  # Number of data in a 10 min interval (divi=6)
nw=nhoradivi          # Welch's method: number of data
mw=nleeventa          # Welch's method: window length
pw=nleeventa//2       # Welch's method: shift between windows
ninter=int(np.round((nw-mw)/pw+1))  # Number of intervals
def hann(m):                          # Hann window
    val=np.arange(0,m)
    hannv=1.0-np.cos(val*np.pi/m)**2
    nor=np.sqrt(np.sum(hannv**2)/m)
    return hannv/nor
hannvs=hann(mw)

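To apply Welch's method, a set of `n` data (10 min) is split into groups of `m` data (10 s) with a shift of `p` data between consecutive groups, and the FFT of each group is computed. In the code these quantities are `nw`, `mw` and `pw`. Each `m`-sample window is padded with zeros to `nventa` samples.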

The number of intervals generated from `n`, `m` and `p` is $\left[(n-m)/p \right] +1$, where `[ ]` is the integer part.  
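With the values used in this notebook, the formula can be evaluated directly. The sketch also checks that the last window ends exactly at the end of the 10 min interval.

```python
n = 256 * 3600 // 6      # samples in a 10 min interval
m = 2560                 # samples in a 10 s window
p = m // 2               # shift between windows (50 % overlap)

n_windows = (n - m) // p + 1          # integer part of (n-m)/p, plus one
last_start = (n_windows - 1) * p      # first sample of the last window
print(n, n_windows, last_start + m)
```

Each 10 min interval therefore yields 119 overlapping windows, and no sample at the end of the interval is left unused.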

The Hann window with `m` samples is defined by:
$$
h(i)=1-\cos^2\big(\pi \cdot (i-1)/m\big)
$$
for $i=1,\dots,m$, with the normalizing factor:
$$
\sqrt{\sum_{i=1}^{m} h(i)^2/m }
$$
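The normalization can be verified numerically. The sketch below mirrors the `hann` function of the cell above, except that it also returns the normalizing factor; the comparison with `np.hanning` is an addition for illustration.

```python
import numpy as np

def hann(m):
    i = np.arange(m)
    h = 1.0 - np.cos(np.pi * i / m) ** 2     # same as sin^2(pi*i/m)
    norm = np.sqrt(np.sum(h ** 2) / m)       # root mean square of the window
    return h / norm, norm

w, norm = hann(2560)

# The unnormalized window coincides with the periodic form of NumPy's Hann window
periodic = np.hanning(2560 + 1)[:-1]
print(norm, np.mean(w ** 2), np.allclose(w * norm, periodic))
```

After normalization the mean square of the window is 1, so windowing does not change the average power of the signal. The normalizing factor equals $\sqrt{3/8}$.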




## 4. Calibration function

In [6]:
Lmag= 298E3
Cs= 40E-12
Cc= 0.65*55E-12    #35.75E-12
Ce= 2E-12
Ctot= Cs+Cc+Ce
Rmag= 320E3
facAmp= 2500.0
facSen= 1.9E6
f0= 1/(2.0*np.pi*np.sqrt(Ctot*Lmag))
delta= Rmag/2.0*np.sqrt(Ctot/Lmag)
def Scal(f):
    fac1= 1.E12/(facAmp*facSen)
    fac2= np.sqrt((1-(f/f0)**2)**2+(2.*delta*f/f0)**2)
    fac3= 1/f
    return fac1*fac2*fac3
funcal= Scal(listafreccal)

The calibrated magnetic field is given by:

$$B (f) = S_c (f) V (f)$$

with $f$ in the \[6 Hz, 25 Hz\] interval. $B(f)$ is the magnetic field (in T/$\sqrt{\text{Hz}}$), $V(f)$ is the transformed voltage and $S_c(f)$ is the calibration coefficient for each frequency $f$.

The function $S_c (f)$ is given by:
$$S_c (f) = \frac{1}{2500}  \frac{1}{1.9 \times 10^{6}}  \frac{1}{f} 
\sqrt{\left( 1 - \left( \frac{f}{f_0} \right)^2 \right)^2 + \left( 2 \delta
\frac{f}{f_0} \right)^2}$$

where $1/2500$ is the amplification factor, $1/(1.9 \times 10^6)$ is the magnetometer sensitivity and $1/f$ comes from Faraday's induction law. $f_0$ and $\delta$ are the resonance frequency and damping factor, respectively, of the equivalent RLC circuit of the measurement system, and they can be calculated from:

$$ f_0=\frac{1}{2 \pi \sqrt{L C_T}}$$

$$ \delta=\frac{R}{2}\sqrt{\frac{C_T}{L}} $$

where $R$ is the magnetometer resistance, $L$ is the magnetometer inductance and 

$$C_T = C_{sensor} + C_{cable} + C_e$$

where $C_{sensor} = 40$ pF is the sensor capacitance,
$C_{cable} = (0.65 \text{ m}) \times (55 \times 10^{- 12} \text{ F/m}) = 35.75$ pF is the capacitance of the wire that connects the magnetometer to the pre-amplification stage, evaluated as the product of its length by the capacitance per unit length, and $C_e = 2$ pF is the capacitance of the operational amplifier located in the pre-amplification stage.

The factor $10^{12}$ converts the magnetic field units to $\text{pT}/\sqrt{\text{Hz}}$.
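To get a feeling for the numbers, the resonance frequency, the damping factor and a few calibration coefficients can be evaluated from the values given above. The four sample frequencies are arbitrary choices for illustration, and the variable names are local to this sketch.

```python
import numpy as np

L_mag = 298e3                              # magnetometer inductance
R_mag = 320e3                              # magnetometer resistance
C_tot = 40e-12 + 0.65 * 55e-12 + 2e-12     # sensor + cable + amplifier capacitance

f0 = 1 / (2 * np.pi * np.sqrt(L_mag * C_tot))   # resonance frequency in Hz
delta = R_mag / 2 * np.sqrt(C_tot / L_mag)      # damping factor

def s_cal(f):
    gain = 1e12 / (2500.0 * 1.9e6)              # amplifier, sensitivity and pT factor
    resonance = np.sqrt((1 - (f / f0) ** 2) ** 2 + (2 * delta * f / f0) ** 2)
    return gain * resonance / f

band = np.array([6.0, 8.0, 14.0, 25.0])         # arbitrary frequencies in the band
coeff = s_cal(band)
print(f0, delta, coeff)
```

With these values the resonance falls at roughly 33 Hz, above the calibrated band, and the coefficient decreases monotonically between 6 and 25 Hz.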

## 5. Reading the names of the data files

In [7]:
# Reading files with the name of files
ficheros0=open(pin+'ficheros0')
sensor0=[]
for line in ficheros0:
    sensor0.append(line[:-1])
ficheros1=open(pin+'ficheros1')
sensor1=[]
for line in ficheros1:
    sensor1.append(line[:-1])
if (len(sensor0) != len(sensor1)):
    print("Alert in files")
nhoras=len(sensor0)
print("Processing {0} hours".format(nhoras))

Processing 24 hours


Since the file names contain the time of their creation, they cannot be derived from a rule defined a priori and must be read directly. To do this independently of the operating system, two files, `ficheros0` and `ficheros1`, have been generated with the specific names of the data files for each month and for each sensor. These files are read by the statements above.
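A slightly more defensive way of reading such a list is sketched below. The helper `read_names` and the two file names in the demonstration are hypothetical. Unlike `line[:-1]`, `strip()` does not cut the last character of a final line that lacks a newline, and the `with` block closes the file.

```python
import os
import tempfile

def read_names(path):
    """Return the non-empty lines of a listing file (hypothetical helper)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]

# Demonstration with a temporary listing; the names are made up for the example
with tempfile.TemporaryDirectory() as folder:
    listing = os.path.join(folder, "ficheros0")
    with open(listing, "w") as handle:
        # Note: no newline after the last name
        handle.write("smplGRTU1_sensor_0_1503010000\nsmplGRTU1_sensor_0_1503010100")
    names = read_names(listing)

print(len(names), names[-1])
```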

## 6. Data processing

In [9]:
# Sensor loop
sensorbreak=-1                           # Figures
daybreak=-1                              # Figures
hourbreak=-1
timebreak=(daybreak-1)*24+hourbreak     # Figures
for ise in range(2):
    print("Sensor {0}".format(ise))
    if (ise == 0):
        leesensor=sensor0
    else:
        leesensor=sensor1
# Opening output files
    saturados=open(pou+'SR'+month+'_saturados_'+str(ise),'w')
    satper=open(pou+'SR'+month+'_satper_'+str(ise),'w')
    media=open(pou+'SR'+month+'_media_'+str(ise),'w')

    countbreak=1
    for ihora in range(nhoras):
        print("Hour {0}".format(ihora))
# Reading hours
        hora= np.array(np.fromfile(pin+leesensor[ihora],\
                           dtype='int16'),dtype='float')*facdat      
## 
        nsath=len(hora[(liminf>hora)]) + len(hora[(limsup<hora)])
        
        print("It has been found {0} saturations".format(nsath))
        if (nsath != 0):
            print(ihora+1, nsath, file=saturados)   # Hours are counted from 1
            for i,val in enumerate(hora):
                if (limsup<val or liminf>val):
#                    print(i,val)
                    print(i+1, file=saturados)
# Each hour is divided in 6 parts
        for idivi in range(divi):
            horadivi=hora[idivi*nhoradivi:(idivi+1)*nhoradivi]
            nsat=0
            datosDF[:]=0.
# Each 10s
            for iinter in range(ninter):
                ninf=iinter*pw
                nsup=ninf+mw
                horainter=horadivi[ninf:nsup]
                nsatinter=len(horainter[(liminf>horainter)]) +\
                          len(horainter[(limsup<horainter)])
                if (nsatinter==0):
                    datosventa[0:mw]=horainter*hannvs
                    datosDF += np.abs(np.fft.rfft(datosventa))**2
                else:
                    nsat += 1
# 10 min interval finished
            if (nsat == ninter):
                print("ALERT: ALL INTERVALS ARE SATURATED")
                print('Sensor {0} hour {1}'.format(ise, ihora+1))
            else:
                datosDF = np.sqrt(datosDF*2./((ninter-nsat)*fmues*mw))
                datosDF[0] /= np.sqrt(2.)
                datosDF[-1] /= np.sqrt(2.)

            datos_Bfield= datosDF[pofre(f1,df,0.):pofre(f2,df,0.)+1]*funcal
            print(*datos_Bfield, sep='\n', file=media)
            print(nsat/ninter, file=satper)
        countbreak += 1                         # Figures
        if(countbreak == timebreak ): break     # Figures            
# Hours loop finished
    saturados.close()
    satper.close()
    media.close()
    if (ise == sensorbreak):  break             # Figures    
# Sensors loop finished

Sensor 0
Hour 0
It has been found 3 saturations
Hour 1
It has been found 0 saturations
Hour 2
It has been found 0 saturations
Hour 3
It has been found 0 saturations
Hour 4
It has been found 4 saturations
Hour 5
It has been found 0 saturations
Hour 6
It has been found 0 saturations
Hour 7
It has been found 7 saturations
Hour 8
It has been found 3 saturations
Hour 9
It has been found 4 saturations
Hour 10
It has been found 18 saturations
Hour 11
It has been found 0 saturations
Hour 12
It has been found 0 saturations
Hour 13
It has been found 0 saturations
Hour 14
It has been found 0 saturations
Hour 15
It has been found 3 saturations
Hour 16
It has been found 3 saturations
Hour 17
It has been found 0 saturations
Hour 18
It has been found 0 saturations
Hour 19
It has been found 9 saturations
Hour 20
It has been found 2 saturations
Hour 21
It has been found 0 saturations
Hour 22
It has been found 1 saturations
Hour 23
It has been found 2 saturations
Sensor 1
Hour 0
It has been found 0 satu

This is the code that collects all the processing through the following nested loops:
1. Sensor loop: 0 and 1
2. Loop for each hour of data
3. Loop for each 10 min intervals into which each hour is divided
4. Loop to process each 10 s signal intervals into which each 10 min interval is divided.

The above code will be broken down below.

Warning: do not run the following cells, because they are excerpts of the previous one, cut for ease of explanation. In the full code, the loops can be stopped early by giving different values to the variables marked with the comment `Figures`. They correspond to the sensor (0 or 1), the day of the month and the hour of the day at which the process should stop. With out-of-range values the loops do not stop. These statements can be removed to simplify the code.

### 6.1 Sensor loop: 0 and 1

In [None]:
# Sensors loop
for ise in range(2):
    print("Sensor {0}".format(ise))
    if (ise == 0):
        leesensor=sensor0
    else:
        leesensor=sensor1
# Output files
    saturados=open(pou+'SR'+month+'_saturados_'+str(ise),'w')
    satper=open(pou+'SR'+month+'_satper_'+str(ise),'w')
    media=open(pou+'SR'+month+'_media_'+str(ise),'w')

    for ihora in range(nhoras): 
    # ....
# Hours loop finished
    saturados.close()
    satper.close()
    media.close()
# Sensors loop finished

The processing is carried out for each sensor, and this loop is closed at the end of the program.

The output files for each sensor (0, NS oriented, or 1, EW oriented) are opened: `saturados_0,1`, `satper_0,1` and `media_0,1`, in the output path and in the directory for each month. For these files to be opened without an error, the output directories for the year and month being analyzed must already exist.

### 6.2 Loop for each hour of data

In [None]:
for ihora in range(nhoras):
        print("Hour {0}".format(ihora))
# Reading hours
        hora= np.array(np.fromfile(pin+leesensor[ihora],\
                           dtype='int16'),dtype='float')*facdat      
## 
        nsath=len(hora[(liminf>hora)]) + len(hora[(limsup<hora)])
        
        print("{0} saturations have been found".format(nsath))
        if (nsath != 0):
            print(ihora+1, nsath, file=saturados)   # Hours are counted from 1
            for i,val in enumerate(hora):
                if (limsup<val or liminf>val):
#                    print(i,val)
                    print(i+1, file=saturados)
# Each hour is divided in 6 parts
        for idivi in range(divi):
            # ...

# Hours loop finished

The loop that runs through all the hours of the month is then started. The data for each hour are read and the number of saturated samples in that hour is determined.
If there are any, the hour and the number of saturated samples are written to the file `saturados`, followed by the index of each saturated sample. Both the hours and the sample indices are numbered from 1.

### 6.3 Loop for 10 min intervals into which each hour is divided

In [None]:
# Each hour is divided in 6 parts
        for idivi in range(divi):
            horadivi=hora[idivi*nhoradivi:(idivi+1)*nhoradivi]
            nsat=0
            datosDF[:]=0.
# Each 10 s
            for iinter in range(ninter):
                # ...
# 10 min interval finished
            if (nsat == ninter):
                print("ALERT: ALL INTERVALS ARE SATURATED")
                print('Sensor {0} hour {1}'.format(ise, ihora+1))
            else:
                datosDF = np.sqrt(datosDF*2./((ninter-nsat)*fmues*mw))
                datosDF[0] /= np.sqrt(2.)
                datosDF[-1] /= np.sqrt(2.)

            datos_Bfield= datosDF[pofre(f1,df,0.):pofre(f2,df,0.)+1]*funcal
            print(*datos_Bfield, sep='\n', file=media)
            print(nsat/ninter, file=satper)

The processing of each 10 min interval begins here.
For each interval, the amplitude spectrum is calculated from the average of the squared spectra of its 10 s windows.
After reading the data, the array `datosDF` is set to zero.
The variable `nsat` counts how many 10 s intervals are saturated and are therefore excluded from the average.

If all the 10 s intervals of a 10 min interval are saturated, a warning is issued and the transform is kept at zero.
If not all the intervals are saturated (`nsat` differs from `ninter`, the total number of overlapping 10 s intervals), the power spectrum is multiplied by 2 to account for the negative frequencies (except at frequency 0 and at the final sample, the Nyquist frequency). The spectrum is divided by the number of non-saturated intervals, `ninter-nsat`, to form the average, and also by the factor `fmues * mw`, the product of the sampling frequency and the number of effective samples (without the added zeros) used for the FFT. The square root of the final result gives the amplitude spectrum.
The calibrated part of the spectrum of each 10 min interval is written to the output file `media`, with `\n` as separator.

Finally, the saturation fraction of the interval, `nsat/ninter`, is written to the output file `satper`.
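The normalization described above can be checked on synthetic data. The sketch below is not the notebook's code: it applies the same scaling to unit-variance white noise (an assumption chosen because its one-sided amplitude spectral density is known to be $\sqrt{2/f_m}$) with no saturated windows.

```python
import numpy as np

fs, m, n_fft = 256, 2560, 2**13
p = m // 2
rng = np.random.default_rng(0)
x = rng.standard_normal(fs * 600)        # 10 min of synthetic white noise

i = np.arange(m)
w = np.sin(np.pi * i / m) ** 2           # Hann window
w /= np.sqrt(np.mean(w ** 2))            # normalized to unit mean square

n_windows = (len(x) - m) // p + 1
acc = np.zeros(n_fft // 2 + 1)           # running sum of squared spectra
buf = np.zeros(n_fft)                    # samples beyond m remain zero
for k in range(n_windows):
    buf[:m] = x[k * p:k * p + m] * w
    acc += np.abs(np.fft.rfft(buf)) ** 2

# Same scaling as in the notebook, with every window valid
asd = np.sqrt(2.0 * acc / (n_windows * fs * m))
asd[0] /= np.sqrt(2.0)
asd[-1] /= np.sqrt(2.0)

expected = np.sqrt(2.0 / fs)                  # theoretical level for unit variance
measured = np.sqrt(np.mean(asd[1:-1] ** 2))   # RMS level over the interior bins
print(measured, expected)
```

The two printed levels should agree to within a few percent, which suggests that dividing by `fmues * mw` (and not by the padded length) gives a properly scaled amplitude spectral density.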

As a result of the processing, the following files are generated:

1. `SR1503_saturados_0,1` (for the inputs above, `month='1503'`): if a file (one hour of data) contains saturated samples, the index of the file (hour) and the number of saturated samples are written, followed by the index of each saturated sample.
2. `SR1503_satper_0,1`: for each 10 min interval, the quotient between saturated and total windows is written. This file is often used to determine the total number of 10 min intervals for each month.
3. `SR1503_media_0,1`: amplitude spectrum for each 10 min interval.

The calibrated frequency limits are 6-25 Hz.
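Since each spectrum is written one value per line, a `media` file can be read back and reshaped into one row per 10 min interval. The sketch below uses a temporary file with made-up values; the file name is hypothetical, and the 609 values per spectrum follow from the 6-25 Hz band at 1/32 Hz resolution.

```python
import os
import tempfile
import numpy as np

n_band = 609      # bins from 6 Hz to 25 Hz at df = 1/32 Hz

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "SR_media_demo")      # hypothetical file name
    fake = np.arange(3 * n_band, dtype=float)         # three made-up spectra
    with open(path, "w") as handle:
        print(*fake, sep="\n", file=handle)           # same layout as the notebook output
    spectra = np.loadtxt(path).reshape(-1, n_band)    # one row per 10 min interval

print(spectra.shape)
```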

During the analysis, the index of the hour being analyzed is displayed on the screen. If all the windows are saturated, the message "ALERT: ALL INTERVALS ARE SATURATED" is shown. This message can occur for each 10 min interval into which each hour is divided.

### 6.4 Loop to process the 10 s signal intervals

In [None]:
# Each 10s
            for iinter in range(ninter):
                ninf=iinter*pw
                nsup=ninf+mw
                horainter=horadivi[ninf:nsup]
                nsatinter=len(horainter[(liminf>horainter)]) +\
                          len(horainter[(limsup<horainter)])
                if (nsatinter==0):
                    datosventa[0:mw]=horainter*hannvs
                    datosDF += np.abs(np.fft.rfft(datosventa))**2
                else:
                    nsat += 1

In the 10 s interval loop the FFT is performed using the function `np.fft.rfft`.

Before computing the transform, we must check that there are no saturated data in the window.
If there are none:

* The data are multiplied by the Hann window and written into the first `mw` positions of the `datosventa` array, which has a length of `nventa` = $2^{13}$ samples; the remaining zeros are kept.
* Numpy's real Fourier transform subroutine is called.
* The modulus of the transform is calculated and squared.
* These values are added for all the windows to make the average in the 10 min interval.

If the 10 s interval has any saturated samples, the counter of saturated windows in the interval is incremented.
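One consequence of the 50 % overlap is that a single saturated sample usually discards two windows. The sketch below illustrates this with an artificial 10 min interval of zeros containing one out-of-range value; the position and value of that sample are assumptions.

```python
import numpy as np

m, p = 2560, 1280
limsup, liminf = 9.990, -9.990

interval = np.zeros(256 * 600)       # artificial 10 min interval
interval[2000] = 10.0                # one made-up saturated sample

n_windows = (len(interval) - m) // p + 1
discarded = []                       # indices of the windows that would be skipped
for k in range(n_windows):
    window = interval[k * p:k * p + m]
    if np.any((window > limsup) | (window < liminf)):
        discarded.append(k)

print(discarded, len(discarded) / n_windows)
```

Sample 2000 lies in the windows starting at samples 0 and 1280, so both are counted in `nsat`.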

## References

Fornieles-Callejón, J., Salinas, A., Toledo-Redondo, S., Portí, J., Méndez, A., Navarro, E. A., Morente-Molinera, J. A., Soto-Aranaz, C., & Ortega-Cayuela, J. S. (2015). Extremely low frequency band station for natural electromagnetic noise measurement. Radio Science, 50, 191–201.

Rodríguez‐Camacho, J., Fornieles, J., Carrión, M. C., Portí, J. A., Toledo‐Redondo, S., & Salinas, A. (2018). On the Need of a Unified Methodology for Processing Schumann Resonance Measurements. Journal of Geophysical Research: Atmospheres, 123(23), 13,277-13,290. https://doi.org/10.1029/2018JD029462

