# Identifying Hard Coded Single Precision Variables

In looking to merge MARCS with DSEP, it is immediately clear that the two codes are incompatible when it comes to passing variables. MARCS is written with single precision declarations for real variables and DSEP is written with double precision declarations. Type conversions can be carried out in several different ways, but perhaps it is worth upgrading MARCS to double precision. However, while it is simple enough to redefine real declarations to `real(dp)`, where `dp` is a double precision type declaration, there are many instances of hard coded single precision variables. That is, variables defined with `X.XeYYY` instead of `X.XdYYY`. Finding such declarations is non-trivial.

In [1]:
import fileinput as fi

An initial attempt to extract all instances of hard coded single precision real variables was done using `grep` and a regular expression,
```bash
grep -i "[0-9].e.[0-9]" *.f > marcs_single_precision.txt
```
where `-i` flags the result to be case insensitive. Let's read in the result.

In [2]:
single_precision_vars = [line.rstrip('\n') for line in fi.input('marcs_single_precision.txt')]

Take a quick look at the first few lines,

In [3]:
print single_precision_vars[:5]

['CIAh2h.f:75:        propac=1.e-30', 'CIAh2h2.f:75:        propac=1.e-30', 'CIAh2he.f:75:        propac=1.e-30', 'CIAhhe.f:75:        propac=1.e-30', 'archiv.f:156:cc        print 215, k,log10(max(1.e-30,xmettryck(k,1))),']


Everything looks reasonble, but I know for a fact that scientific notation can also be declared in `FORMAT` statements. Therefore, it's necessary to remove these from the entry list.

In [4]:
single_precision_vars = [line for line in single_precision_vars if line.lower().find('format') == -1]

Make sure we haven't removed valid entries, as compared to the initial output.

In [5]:
print single_precision_vars[:5]

['CIAh2h.f:75:        propac=1.e-30', 'CIAh2h2.f:75:        propac=1.e-30', 'CIAh2he.f:75:        propac=1.e-30', 'CIAhhe.f:75:        propac=1.e-30', 'archiv.f:156:cc        print 215, k,log10(max(1.e-30,xmettryck(k,1))),']


We can now organize the data by subroutine and output the lines on which a single precision real is expcted to occur. There will be false positives, but those can be checked by eye.

In [6]:
file_stream = open('single_precision_index.txt', 'w')
routine = ''
for line in single_precision_vars:
    line = line.split(':')
    if line[0] == routine:
        file_stream.write('\t{:4s} :: {:50s}\n'.format(line[1].strip(), line[2].strip()))
    else:
        routine = line[0]
        file_stream.write('\n\n{:s}\n\n'.format(line[0].rstrip('.f').upper()))
        file_stream.write('\t{:4s} :: {:70s}\n'.format(line[1].strip(), line[2].strip()))
file_stream.close()

Markdown is also a helpful format.

In [7]:
file_stream = open('single_precision_index.md', 'w')
routine = ''
for line in single_precision_vars:
    line = line.split(':')
    if line[0] == routine:
        file_stream.write('\t{:4s} :: {:50s}\n'.format(line[1].strip(), line[2].strip()))
    else:
        routine = line[0]
        file_stream.write('\n\n# {:s}\n\n'.format(line[0].rstrip('.f').upper()))
        file_stream.write('\t{:4s} :: {:70s}\n'.format(line[1].strip(), line[2].strip()))
file_stream.close()

---

Changing these can be done by a single person, but would perhaps benefit from multiple editors working on a subset of the declarations.

In [9]:
print "There are {:.0f} lines that need editing.".format(len(single_precision_vars))

There are 287 lines that need editing.


Which means that, split among 3 people, it's approximately 96 lines of code per person. There are some routines that contain significantly more lines that require editing, as compared to others. We can create a histogram showing number of lines per subroutine.

In [34]:
N_lines = []
routine = ''
n = 0
for line in single_precision_vars:
    line = line.split(':')
    if line[0] == routine:
        n += 1
    else:
        # output previous routine count
        N_lines.append([routine, n])
        n = 0
        
        # start a new routine
        routine = line[0]
        n += 1
N_lines.pop(0)

['', 0]

Now we should aim to sort and then distribute routines to be modified.

In [35]:
N_lines = sorted(N_lines, key = lambda routine: routine[1])

In [40]:
print N_lines
print "\n \t There are {:.0f} subroutines that need editing.".format(len(N_lines))

[['CIAh2h.f', 1], ['CIAh2h2.f', 1], ['CIAh2he.f', 1], ['CIAhhe.f', 1], ['bpl.f', 1], ['checkpartf.f', 1], ['gausi.f', 1], ['momeqcheck.f', 1], ['scale.f', 1], ['setdis.f', 1], ['oslistmo.f', 2], ['osmet_35.f', 2], ['osmet_separate_35.f', 2], ['pemake.f', 2], ['injon.f', 3], ['molfys.f', 3], ['startm.f', 3], ['inabs.f', 4], ['tranfr.f', 4], ['archiv.f', 5], ['osopac_35.f', 5], ['die_pe.f', 7], ['die_pe_lu.f', 7], ['jon.f', 7], ['osmainb.f', 8], ['eqmol_pe.f', 10], ['eqmol_pe_lu.f', 10], ['ossolve.f', 10], ['hydropacmodif.f', 32], ['detabs.f', 43], ['hlinopbpz.f', 53], ['hlinopmodif.f', 53]]

 	 There are 32 subroutines that need editing.


This is not too unreasonable for a single person, but additional checking for single precision declarations and functions would be beneficial. This requires that we also consider the total number of lines in each file that needs to be edited.

In [60]:
lines_in_routines = [line.split() for line in fi.input('marcs_routines_nlines.txt')]

In [61]:
lines_in_routines = [[x[1], int(x[0])] for x in lines_in_routines]

We now have a record for number of single precision statements to be changed and the number of lines in the entire routine.

In [62]:
total_lines = []
for routine in lines_in_routines:
    n_edits = [x[1] for x in N_lines if x[0] == routine[0]]
    if n_edits == []:
        n_edits = 0
    else:
        n_edits = int(n_edits[0])
    total_lines.append([routine[0], routine[1] + n_edits])

In [65]:
total_lines = sorted(total_lines, key = lambda routine: routine[1])

We've now combined the total number of lines in each file with the number of edits to fixed single precision constants that need to be made. Of course, there are instances where function calls need to be verified, but one expects this is somewhat proportional to the number of lines in the entire file.

In [66]:
print "\t There are {:.0f} subroutines that need checking.".format(len(total_lines))

	 There are 77 subroutines that need checking.


If divided equally, that means each of 3 people will need to check 26 separate subroutines. However, these should we weighted so that each person searches roughly the same number of routines and the same number of lines.

In [67]:
person_1 = []
person_2 = []
person_3 = []
for i, routine in enumerate(total_lines):
    if i % 3 == 0:
        person_1.append(routine)
    elif i % 3 == 1:
        person_2.append(routine)
    elif i % 3 == 2: 
        person_3.append(routine)
    else:
        print "Whoops, misplaced", routine

Now, let's confirm that everything looks about equal.

In [69]:
print len(person_1), len(person_2), len(person_3)

26 26 25
