## Goal:
Create a function that reorganizes a block-formatted text file into a new text file that can be loaded and plotted directly in Python.

## Test:
* Time increments are equally spaced
* Same number of coefficients at every time step
* Data information not included
* The correct number of l and m coefficients at each interval
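
Three of the checks above (equal spacing, equal row length, expected coefficient count) can be sketched as a small helper. This is only an illustration: `check_rows` and its `max_degree` argument are hypothetical names, and it assumes each row is one year followed by its coefficients.

```python
import numpy as np

def check_rows(rows, max_degree):
    """Return pass/fail flags for rows of the form [year, coeff, coeff, ...].

    rows       : list of sequences, one per time step (hypothetical input)
    max_degree : expected maximum degree L (hypothetical parameter)
    """
    lengths = {len(r) for r in rows}
    years = np.array([r[0] for r in rows], dtype=float)
    steps = np.diff(years)
    expected = max_degree * (max_degree + 2)   # sum of (2l + 1) for l = 1..L
    return {
        # all steps between consecutive years are identical
        "equal_spacing": len(steps) == 0 or bool(np.allclose(steps, steps[0])),
        # every row has the same number of entries
        "same_length": len(lengths) == 1,
        # each row holds the year plus the expected number of coefficients
        "coefficient_count": lengths == {expected + 1},
    }

# Toy rows: 3 time steps, degree 2, so 8 coefficients per row
demo_rows = [[1900.0] + [0.0] * 8, [1910.0] + [0.0] * 8, [1920.0] + [0.0] * 8]
report = check_rows(demo_rows, max_degree=2)
print(report)
```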


## Text file format:
Block format: each block starts with the year, followed by the coefficients in the correct order

### Example: ggf100k
$ year \quad g_{1}^{0} ... h_{4}^{4} $ &emsp; 25 elements <br>
$ g_{5}^{0} ... h_{6}^{6}   $ &emsp;  24 elements <br>
$ g_{7}^{0} ... h_{8}^{4}   $ &emsp; 24 elements <br>
$ g_{8}^{5} ... g_{9}^{8}   $ &emsp; 24 elements <br>
$ h_{9}^{8} ... h_{10}^{10} $ &emsp; 24 elements

$ year+1 \quad g_{1}^{0} ... h_{4}^{4} $ &emsp; 25 elements <br>
$ g_{5}^{0} ... h_{6}^{6}   $ &emsp;  24 elements <br>
$ g_{7}^{0} ... h_{8}^{4}   $ &emsp; 24 elements <br>
$ g_{8}^{5} ... g_{9}^{8}   $ &emsp; 24 elements <br>
$ h_{9}^{8} ... h_{10}^{10} $ &emsp; 24 elements <br>
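
To see which coefficients open and close each line of a year block, the layout can be reproduced with a short sketch. It assumes the conventional ordering $g_l^0, g_l^1, h_l^1, ..., g_l^l, h_l^l$ for each degree and the 25/24/24/24/24 split shown above. `gauss_labels` is a hypothetical helper, not part of the notebook.

```python
def gauss_labels(max_degree):
    """List coefficient labels in the assumed order g_l^0, g_l^1, h_l^1, ..., g_l^l, h_l^l."""
    labels = []
    for l in range(1, max_degree + 1):
        labels.append(f"g_{l}^0")          # m = 0 has no h term
        for m in range(1, l + 1):
            labels.append(f"g_{l}^{m}")
            labels.append(f"h_{l}^{m}")
    return labels

entries = ["year"] + gauss_labels(10)      # 1 + 120 entries per year block
line_sizes = [25, 24, 24, 24, 24]          # elements per line, as in the example

# cut the flat list of entries into the lines of one block
block_lines = []
start = 0
for size in line_sizes:
    block_lines.append(entries[start:start + size])
    start += size

for line in block_lines:
    print(line[0], "...", line[-1], f"({len(line)} elements)")
```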

### main function:
Imports libraries <br>
Calls all functions as needed and in order

In [8]:
import numpy as np

blocklinelen = 5   # lines per year block; assumed value, taken from the ggf100k layout above
lines, linecount = get_linecount('testPass.txt')


The total number of lines in the text document are: 12


### get_linecount:
with open(): context manager, closes the file automatically when the indented block ends <br>
readlines: reads the file into a list with one element per line <br>

In [7]:
def get_linecount(textdoc):
    """Reads in a text document and counts the number of lines.
    Input : textdoc is the path to the text document, as a string
    Output: lines is a list, each element in the list is a line
            in the text document
          : linecount is the total number of lines in the document
    """
    assert isinstance(textdoc, str), 'textdoc is not a string'
    
    with open(textdoc, mode='r') as f:
        lines = f.readlines()
        linecount = len(lines)
        
    print('The total number of lines in the text document are:', linecount)
    return lines, linecount

### Cell block below: Gets the number of elements per line in a year block and the total number of years
np.fromstring: converts a line (a string of numbers) to a float array <br>
for loop: gets the number of elements on each line of the first year block <br>

In [2]:
# count the elements on each line of a block to size the coefficient array
eachLine = np.zeros(blocklinelen)
for index in range(blocklinelen):
    # parse one line of the first year block and record how many values it holds
    blockLine = np.fromstring(lines[index], dtype=float, sep=' ')
    eachLine[index] = len(blockLine)

totalYears = linecount/blocklinelen
assert totalYears.is_integer(), 'Line count is not a whole multiple of the block length'

print('The length of each line is:', eachLine)
print('The total years:', totalYears)

The length of each line is: [25. 24. 24. 24. 24.]
The total years: 100041.0


### Cell block below: Extracts the years
Similar to the cell above <br>
for loop: collects the year from the first line of each block (lines 0, 5, 10, 15, ...) <br>

In [3]:
year = np.zeros(int(totalYears))
j = 0
for index, line in enumerate(lines):
    # the first line of each block starts with the year
    if (index % blocklinelen) == 0:
        yearList = np.fromstring(line, dtype=float, sep=' ')
        year[j] = yearList[0]
        j += 1

assert year[0] < year[-1], 'Years not in order'
assert totalYears == len(year)
# every second difference must vanish; a plain sum could cancel to zero
assert np.allclose(np.diff(year, n=2), 0), 'The years are not equally spaced'
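
As an alternative to the modulo test, the first line of every block can be picked out with a slice step. A minimal sketch, assuming every block has the same number of lines and the year is the first token on its first line. `extract_years` and `demo_lines` are hypothetical names with made-up values.

```python
import numpy as np

def extract_years(lines, block_len):
    """Return the first number on the first line of every block."""
    first_lines = lines[::block_len]       # lines 0, block_len, 2*block_len, ...
    return np.array([float(line.split()[0]) for line in first_lines])

# Toy input: two blocks of two lines each
demo_lines = ["1900 1.0 2.0\n", "3.0 4.0\n", "1910 5.0 6.0\n", "7.0 8.0\n"]
demo_years = extract_years(demo_lines, block_len=2)
print(demo_years)
```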

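### Cell block below: Calculates the maximum degree and order
* Gauss coefficients per degree $l$: $2l + 1$, so the total per year up to degree $L$ is $\sum_{l=1}^{L} (2l + 1) = L(L + 2)$
* While loop: subtracts $2l + 1$ from the total for $l = 1, 2, 3, ...$ until the remainder reaches zero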

In [4]:
totalGC = sum(eachLine)-1                        # subtract year
print('Total coefficients per year:', totalGC)

countGC = totalGC
l=1
while countGC > 0:
    countGC = countGC - (2*l+1)
    l+=1
l = l-1
assert countGC == 0
print('The degree and order is', l)

Total coefficients per year: 120.0
The degree and order is 10
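
The loop result can be cross-checked with a closed form. Summing $2l + 1$ from $l = 1$ to $L$ gives $L(L + 2)$, so $L = \sqrt{N + 1} - 1$ for $N$ coefficients per year. A small sketch, where `max_degree_from_count` is a hypothetical helper name:

```python
import math

def max_degree_from_count(n_coeff):
    """Invert N = L * (L + 2) for the maximum degree L; fail if N does not fit."""
    n_coeff = int(n_coeff)
    degree = math.isqrt(n_coeff + 1) - 1   # integer square root, Python 3.8+
    if degree * (degree + 2) != n_coeff:
        raise ValueError(f"{n_coeff} coefficients do not match a complete degree")
    return degree

print(max_degree_from_count(120))
```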


### Put together the matrix
for loop: similar to the loops above, but creates a 2-D array with every row containing one year of coefficients

### Example: new 2-D array of ggf100k
$ year \quad g_{1}^{0} ... g_{4}^{4} h_{4}^{4} ...  h_{10}^{10} \quad \quad $ 121 elements  <br>
$ year + 1 \quad g_{1}^{0} ... g_{4}^{4} h_{4}^{4} ...  h_{10}^{10} \quad $ 121 elements <br>



In [5]:
coeff = np.zeros((int(len(lines)/blocklinelen), int(totalGC+1)))
row = 0
col = 0
for index, line in enumerate(lines):
    yearList = np.fromstring(line, dtype=float, sep=' ')
    elementLen = len(yearList)       # number of values on this line

    # a new year block starts every blocklinelen lines
    if (index % blocklinelen) == 0:
        col = 0
        if index != 0:
            row += 1

    # append this line's values to the current year's row
    coeff[row, col:(elementLen+col)] = yearList
    col += elementLen
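
The same matrix can also be built without tracking `row` and `col` by hand: flatten every number in the file and reshape to one row per block. A minimal sketch, assuming every block holds the same number of values. `blocks_to_matrix` and `toy_lines` are hypothetical names with made-up values.

```python
import numpy as np

def blocks_to_matrix(lines, block_len):
    """Flatten all numbers in `lines` and reshape to one row per block."""
    n_blocks, remainder = divmod(len(lines), block_len)
    if remainder:
        raise ValueError("line count is not a whole multiple of the block length")
    # split() breaks on any whitespace, including the newline at the end of each line
    values = np.array([float(tok) for tok in " ".join(lines).split()])
    return values.reshape(n_blocks, -1)    # raises if blocks differ in size

# Toy input: two blocks of two lines, five values per block
toy_lines = ["1900 1.0 2.0\n", "3.0 4.0\n", "1910 5.0 6.0\n", "7.0 8.0\n"]
toy_matrix = blocks_to_matrix(toy_lines, block_len=2)
print(toy_matrix)
```

The reshape fails loudly when the number of values is not divisible by the number of blocks, though it would not catch two blocks whose sizes differ in ways that cancel out.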

In [6]:
np.savetxt('ggf100k_test.txt', coeff)
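
Since the goal is a file that Python can plot, a round trip through `np.savetxt` and `np.loadtxt` makes a quick sanity check. A minimal sketch with a made-up two-row table; the values and the file name `toy_coeff.txt` are placeholders, not ggf100k data.

```python
import os
import tempfile

import numpy as np

# Placeholder table: column 0 is the year, the rest stand in for coefficients
toy_table = np.array([[1900.0, 1.0, 2.0, 3.0],
                      [1910.0, 4.0, 5.0, 6.0]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_coeff.txt")
    np.savetxt(path, toy_table)            # same call as in the cell above
    loaded = np.loadtxt(path)              # reads the file back as a 2-D float array

years = loaded[:, 0]                       # first column: time axis
first_coeff = loaded[:, 1]                 # second column: first coefficient
print(years, first_coeff)
```

With the real file, `years` and any coefficient column can be passed straight to a plotting call such as matplotlib's `plt.plot(years, first_coeff)`.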