# Section 2 

Start looking into a more realistic case -- evolving the full set of coefficients.

For more latitude for improvement, and a case where more than bias is involved, run:


ln -sf testin.2621.csv testin.csv 


in your directory. This station has very large RMS error and bias, and the errors depend on the GFS variables themselves.


In [None]:
# Boilerplate python imports
import sys
import csv
from math import *
import matplotlib
import matplotlib.pyplot as plt
import numpy as np


The code shown here leans heavily toward the evolutionary side.

The key physical science and mathematics are in the imported modules (evolution2 and scores).
The keys are:
- how to translate the parameters into a prediction
- how to score a prediction

In this case, the full set of linear coefficients is being evolved: the bias and the coefficients for the GFS t2m, td, thickness (1000-850 mb), rh, and wind speed.
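
As a rough illustration of those two keys, a linear correction and its RMS score might look like the sketch below. This is not the code in the imported modules. The function names `predict_error` and `rms_score`, the column order, and the numbers are all assumptions made for this example.

```python
import numpy as np

def predict_error(weights, predictors):
    """Linear estimate of the forecast error: bias plus one coefficient per predictor.

    weights    : length-6 array, [bias, t2m, td, thickness, rh, speed] (assumed order)
    predictors : (nobs, 5) array, columns in the same order without the bias
    """
    weights = np.asarray(weights, dtype=float)
    predictors = np.asarray(predictors, dtype=float)
    return weights[0] + predictors @ weights[1:]

def rms_score(weights, predictors, target):
    """Root mean square difference between the predicted and the actual error."""
    residual = predict_error(weights, predictors) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(residual ** 2)))

# Made-up values, only to show the shapes involved.
predictors = np.array([[280.0, 275.0, 1300.0, 70.0, 3.0],
                       [285.0, 278.0, 1320.0, 60.0, 5.0]])
target = np.array([14.0, 12.0])                        # stand-in for terr
zero_weights = np.zeros(6)                             # no correction at all
bias_only = np.array([13.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # constant correction

print("no correction:", rms_score(zero_weights, predictors, target))
print("bias only:    ", rms_score(bias_only, predictors, target))
```

With all weights zero the score is just the RMS of the target itself, which is how the notebook obtains its raw GFS reference score.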

Though not shown yet, a good thing to do is to plot the predictions against their targets. You can add that yourself; matplotlib is already imported.
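
One possible shape for such a plot is sketched here, using synthetic data because the layout of the real predictions depends on the imported modules. The helper name `plot_pred_vs_target` and the axis labels are assumptions for this example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")            # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt

def plot_pred_vs_target(predicted, target):
    """Scatter predicted against target values, with a 1:1 reference line."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    fig, ax = plt.subplots()
    ax.scatter(target, predicted, s=10)
    lo = min(target.min(), predicted.min())
    hi = max(target.max(), predicted.max())
    ax.plot([lo, hi], [lo, hi], color="black", linewidth=1)   # perfect-prediction line
    ax.set_xlabel("target (terr)")
    ax.set_ylabel("predicted")
    return fig, ax

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
target = rng.normal(10.0, 5.0, size=100)
predicted = target + rng.normal(0.0, 2.0, size=100)
fig, ax = plot_pred_vs_target(predicted, target)
```

Points close to the 1:1 line are well predicted; a systematic tilt or offset suggests a remaining dependence on the predictors or a remaining bias.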

In [None]:
# basic1 from the github

# Some global parameters:
nobs = 579
nparameters = 6

npopulation = 10
per_second = 60     # estimate of number of generations per second
genmax = int(60*per_second)

train_start = int(0)
train_end   = int(364)
np.random.seed(0)      # for reproducibility

from scores import *
from evolution2 import *

######################## ######################## ########################
# Now bring in the data for real work:
matchup_set = []

with open('testin.csv') as csvfile:
    k = 0
    sreader = csv.reader(csvfile, delimiter=",")
    for line in sreader:
        day     = float(line[0])
        t2m_gfs = float(line[1])
        td_gfs  = float(line[2])
        thick_gfs = float(line[3])
        rh_gfs  = float(line[4])
        speed   = float(line[5])
        obs_t2m = float(line[6])
        obs_td  = float(line[7])
        terr    = float(line[8])
        tderr  = float(line[9])

        #Note that obs_td, obs_t2m, tderr are being ignored. They can be
        #       added to the list.
        #  n.b.: note that it is terr that is used, not t2m itself.
        #Model and observation are well-enough correlated that it is the increment
        #which makes more sense to predict [Krasnopolsky,20NNN]
        m = matchup((day,t2m_gfs,td_gfs,thick_gfs,rh_gfs,speed,terr))
        matchup_set.append(m)
        k += 1

# No explicit close is needed: the with-block has already closed csvfile
######################## ######################## ########################

Initialize and seed the population

Note the Python structure used for initializing and adding to a list of things. Population and bests can be added to at will via the .append operation. We'll use this later (section 3) to collect all the parameter suites that are good in some respect (we'll decide what constitutes 'good').
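
The next cell also selects a scoring metric by name. As a rough guide to what those names mean, here is a sketch of each one applied to an array of errors. These are not the functions in the scores file. The function names are hypothetical, and treating sub-tolerance errors as zero is my reading of the VICKIE description.

```python
import numpy as np

def mean_error(err):
    return float(np.mean(err))

def rms(err):
    return float(np.sqrt(np.mean(err ** 2)))

def rm3(err):
    # cube root of the mean cubed error; np.cbrt keeps the sign
    return float(np.cbrt(np.mean(err ** 3)))

def rm4(err):
    return float(np.mean(err ** 4) ** 0.25)

def mae(err):
    return float(np.mean(np.abs(err)))

def mae_with_tolerance(err, tolerance=3.0):
    # errors smaller than the tolerance count as zero; the rest count in full
    abs_err = np.abs(err)
    return float(np.mean(np.where(abs_err < tolerance, 0.0, abs_err)))

# Made-up errors in degrees C.
err = np.array([-4.0, -1.0, 0.0, 2.0, 5.0])
for metric in (mean_error, rms, rm3, rm4, mae, mae_with_tolerance):
    print(metric.__name__, metric(err))
```

Higher powers weight the largest errors more heavily, while the tolerance version rewards a correction only for removing errors a person would notice.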


In [None]:
# basic1 

#You can change this and re-run to see the effects of changes to the metric used:
#  Other options are:
#  MEAN
#  RM3 (root mean cubed error, will be related to skew)
#  RM4 (root mean quartic error, will be related to kurtosis)
#  MAE (Mean absolute error)
#  NLOSS (count the number of times the corrected output is worse than the original); 
#      not currently working, will fall back to RMS
#  VICKIE (MAE with a 3 C tolerance -- errors less than 3 C are ignored, otherwise, MAE)
#    -- a representation of a real person's (my wife) interest in T2M accuracy.
#The 'scores' file imported above contains the code and it is easy to add or modify

measure = RMS

#Initialize and seed the population
population = []
bests      = []       # Save all then-best versions
for k in range (0,npopulation):
    population.append(critter(nparameters))

weights = np.zeros((nparameters))
sdevs   = np.zeros((nparameters))
bests.append(critter(nparameters))
bests[0].init(weights, sdevs, 99.)
nbests = 1

#for reference, take the raw gfs output's score:
population[0].init(weights, sdevs, 99.)
score_gfs = population[0].skill(matchup_set, train_start, train_end, metric = measure)

print("uncorrected score in training period: ",
         population[0].skill(matchup_set, train_start, train_end, metric = measure) )
print("uncorrected score in evaluation period: ",
         population[0].skill(matchup_set, train_end+1, nobs, metric = measure), flush=True )
population[0].show_fcst(matchup_set, train_start, train_end)

population[0].weights[0] = 0.0

print("\n",flush=True)

Initialize the population and find our first best. 

In [None]:
#Initializing the standard deviations for evolution ----------
#For the bias
sdevs[0] = 1.0
#For linear terms
for k in range (1,int(6)):
    sdevs[k] = 1.0

#For quadratic terms
#for k in range (int(6), nparameters):
#  sdevs[k] = 0.0125

#Initialize the population itself now -------------------------
for k in range (0,npopulation):
  weights[0] = np.random.normal(0,sdevs[0])
  for l in range (1, int(6) ):     #initialize only the linear part
    weights[l] = np.random.normal(0,sdevs[l])
  population[k].init(weights,sdevs, 99.)

#recall that the matchup_set is holding the matchups
#Find our first 'best' -- noting that we aren't saving raw gfs as an example
smin = 9999.
kbest = int(npopulation)
for k in range (0,npopulation):
    population[k].skill(matchup_set, train_start, train_end, measure)
    if (population[k].score < smin):
        kbest = k
        smin = population[k].score

#Start accumulating our best critters
bests.append(critter(nparameters))
bests[nbests].init(population[kbest].weights, population[kbest].sdevs, population[kbest].score)
nbests += 1

population[kbest].show()
print("initial kbest, smin = ",kbest, smin, flush=True)


## Type of evolution

For this evolution, we are using only mutation -- as would happen with bacteria (haploid).

By analogy with diploids (plants, animals, people), one could also use 'crossover': select two parents and take the first M genes from the first parent and the remainder from the second.
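
The two operators can be sketched as follows. This is a minimal illustration, not the `evolve` method of the imported module: the functions `mutate` and `crossover` are hypothetical, and the Gaussian form of the mutation is an assumption.

```python
import numpy as np

def mutate(weights, sdevs, rng):
    """Gaussian mutation: perturb each gene by a normal draw scaled by its own sdev."""
    return weights + rng.normal(0.0, 1.0, size=weights.shape) * sdevs

def crossover(parent_a, parent_b, m):
    """Single-point crossover: first m genes from parent_a, the rest from parent_b."""
    return np.concatenate([parent_a[:m], parent_b[m:]])

rng = np.random.default_rng(0)
parent_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
parent_b = np.array([-1.0, -2.0, -3.0, -4.0, -5.0, -6.0])

child = crossover(parent_a, parent_b, m=2)      # [1, 2, -3, -4, -5, -6]
mutant = mutate(child, sdevs=np.full(6, 0.1), rng=rng)
print(child)
print(mutant)
```

Mutation explores the neighborhood of one parent, while crossover can combine good genes found separately by two parents.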

In [None]:
######################## ######################## ########################
#      Now carry out the (mutation-only) evolution
#swap best in to all slots
#then evolve a new raft of critters from that
#evaluate them
#repeat until limit of generations or happy

for gen in range(0,genmax):
    #print("generation ", gen, flush=True)

    population[0].copy(population[kbest])
    population[0].score = population[kbest].score
    score_best = float(population[0].score)
    smin = score_best
    kbest = 0
    for k in range (1, npopulation):
        population[k].copy(population[0])
        population[k].evolve()
        population[k].skill(matchup_set, train_start, train_end, metric = measure)
        if (population[k].score < smin):    # must beat the best seen so far in this generation
            kbest = k
            smin = population[k].score
            bests.append(critter(nparameters))
            bests[nbests].init(population[kbest].weights, population[kbest].sdevs, population[kbest].score)
            nbests += 1
    if (kbest != 0):
        if (score_gfs != 0):
          print("new best ",gen, kbest, smin, score_best, smin/score_gfs, flush=True)
        else:
          print("new best ",gen, kbest, smin, score_best, flush=True)
        population[kbest].show()


Now consider what we found along the way

In [None]:
######################## ######################## ########################
if (score_gfs != 0):
  print("best score in training period ",gen, kbest, smin, score_best, smin/score_gfs, flush=True)
else:
  print("best score in training period ",gen, kbest, smin, score_best, flush=True)
print("score in the untrained period: ",population[kbest].skill(matchup_set, train_end+1, nobs, measure))

print("found ",nbests,"new bests along the way\n")
for k in range (0, nbests):
  bests[k].show()
  print("\n")



In [None]:
print("Forecasts in the training period:")
population[kbest].show_fcst(matchup_set, train_start, train_end)   # kbest indexes the current best critter

#My standard run with RMS on 2621 finishes with 
# mean rms  0.30498389859902936 7.73783451221378

#Uncorrected GFS: 
# mean rms  13.939368131868132 16.79408375517474

In [None]:
print("Untrained forecasts:")
population[kbest].show_fcst(matchup_set, train_end, nobs)   # kbest indexes the current best critter

# My standard run finishes with
#mean rms  -2.1955708992208987 5.308750022304633

# Uncorrected GFS: 
#mean rms  15.939368131868132 18.487799266091592


## Notes:

If you rerun cells without first rerunning the cell that calls np.random.seed(0), you'll get different results. Or, of course, you can change the seed and get different results.

This particular station has such large errors that almost anything (even a random guess) improves on the raw GFS output. That's why the first population gives a best that has only 56% of the RMS of the raw output. A bit of further evolution reduced that to 44%.

We say 'a bit' of evolution because we let the evolution run for only about 1 minute. The only reason for that limit is convenience in presentation. We could instead let it run for something equivalent to an hour, or until the RMS is below 3.5 C (better than 90% of uncorrected stations).
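
Those alternative stopping rules could be combined along the lines of the sketch below. The function `run_evolution` and the callable `step` are hypothetical: `step` stands in for one generation of the loop above and is assumed to return the current best score.

```python
import time

def run_evolution(step, max_seconds=60.0, target_score=None, max_generations=None):
    """Call step() once per generation until one of the stopping rules fires.

    step() is a hypothetical callable that advances one generation and
    returns the current best score (lower is better).
    """
    start = time.monotonic()
    generation = 0
    best = float("inf")
    while True:
        if max_generations is not None and generation >= max_generations:
            break                                   # generation budget used up
        if time.monotonic() - start >= max_seconds:
            break                                   # wall-clock budget used up
        best = min(best, step())
        generation += 1
        if target_score is not None and best <= target_score:
            break                                   # good enough
    return generation, best

# Toy stand-in for a generation: the score shrinks by 10% on each call.
state = {"score": 10.0}
def toy_step():
    state["score"] *= 0.9
    return state["score"]

generations, best = run_evolution(toy_step, max_seconds=5.0, target_score=3.5)
print(generations, best)
```

A wall-clock limit makes the run time predictable regardless of machine speed, while a target score makes the quality predictable; using both guards against a run that never reaches the target.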

Note, too, that this particular Python code is not especially efficient. The same algorithm in Fortran or C/C++ is 10-100 times faster. Optimizing the use of Python could also make it much faster than this.

Notice that there can be many generations (a few hundred, in the example) with no improvement in the best score. There are techniques for dealing with this; we'll look at some in section 3. One option is to introduce a hill-climbing step after every 100 generations of no improvement via evolution. More generally, one can apply standard optimizers after evolution has produced good candidates.
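
A hill-climbing step of that kind might look like the following sketch. It is one simple variant (coordinate-wise search with a shrinking step), chosen for illustration only. The function `hill_climb` and the toy score are assumptions, with the score function standing in for the RMS of a critter's weights.

```python
import numpy as np

def hill_climb(score, weights, step=0.1, shrink=0.5, min_step=1e-6):
    """Coordinate-wise hill climbing that minimizes score(weights).

    Try +step and -step on each parameter in turn, keep any change that
    lowers the score, and shrink the step when a full pass finds nothing.
    """
    best_w = np.array(weights, dtype=float)
    best_s = score(best_w)
    while step > min_step:
        improved = False
        for i in range(best_w.size):
            for delta in (step, -step):
                trial = best_w.copy()
                trial[i] += delta
                s = score(trial)
                if s < best_s:
                    best_w, best_s = trial, s
                    improved = True
                    break               # move on to the next parameter
        if not improved:
            step *= shrink
    return best_w, best_s

# Toy score with its minimum at (1, -2).
def toy_score(w):
    return float((w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2)

w, s = hill_climb(toy_score, [0.0, 0.0])
print(w, s)
```

Unlike mutation, every accepted move here is a guaranteed improvement, which makes it useful for polishing a candidate once evolution has stalled.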

### Experiment: 
Increase the maximum number of generations (genmax) to, say, 100,000 and see what you get -- both in terms of the quality of the result and the set of coefficients.
